Myth: CPU Requests Guarantee Reserved CPU for Containers
During a performance investigation, I once saw a pod with a cpu: 500m request getting throttled heavily under load.
The immediate reaction from the team was:
“How can this be throttled? We requested half a core!”
This misunderstanding comes up frequently in interviews, production incidents, and even architectural reviews.
Why This Myth Exists
- Memory is an incompressible resource, so its requests and limits feel like hard boundaries (exceed a memory limit and the container is OOM-killed), and people assume CPU works the same way.
- The word "request" sounds like a hard reservation.
- Many blog posts oversimplify CPU scheduling behavior and ignore Linux cgroups and CFS mechanics.
The Reality
CPU in Kubernetes is a compressible resource, not a strictly reserved one.
What actually happens:
- CPU requests are used for scheduling and set a pod's relative CPU weight; they do not reserve cycles.
- The Linux Completely Fair Scheduler (CFS) shares CPU time across all runnable processes.
- Your pod gets CPU time proportional to its request, but not exclusive access.
- If your pod is idle, other pods can use that CPU.
- Under contention, the kernel throttles your pod based on its CPU limit, not its request.
In short: CPU requests influence fairness, not exclusivity. You can see this directly in the container's cgroup, as shown below.
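A minimal way to verify the mapping on the node, assuming cgroup v2 and a containerd runtime; the slice path below is illustrative and varies by runtime, QoS class, and pod UID:

# For a container with requests: cpu: 500m and limits: cpu: 500m,
# the request becomes a relative weight (cgroup v1: cpu.shares=512, v2: cpu.weight~20)
# and the limit becomes a quota per period (v2: "50000 100000", i.e. 50ms per 100ms)
cat /sys/fs/cgroup/kubepods.slice/<pod-slice>/cpu.weight   # relative share, not a reservation
cat /sys/fs/cgroup/kubepods.slice/<pod-slice>/cpu.max      # hard cap that triggers throttling

Neither file sets CPU aside; the weight only decides who wins when runnable threads outnumber cores.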
Experiment & Validate
Pod Spec
resources:
  requests:
    cpu: "100m"
  limits:
    cpu: "100m"
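To actually run the experiment, that fragment needs a full manifest around it. A minimal sketch for Pod A; the pod name, busybox image, and busy loop are illustrative choices, not from the original investigation:

apiVersion: v1
kind: Pod
metadata:
  name: pod-a
spec:
  containers:
  - name: burner
    image: busybox
    # Busy loop: the container always wants more CPU than its 100m quota allows
    command: ["sh", "-c", "while :; do :; done"]
    resources:
      requests:
        cpu: "100m"
      limits:
        cpu: "100m"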
Scenario
- Pod A requests 100m CPU but uses only ~50m.
- Pod B on the same node consumes the idle CPU.
- When Pod A suddenly needs the full 100m (see the Pod B sketch below):
  - If the node is busy, Pod A may get throttled.
  - There is no "reserved" 50m waiting for it.
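For the contention side, Pod B just needs to burn CPU without a limit. A sketch assuming a multi-core node; the name pod-b and the four parallel loops are illustrative:

apiVersion: v1
kind: Pod
metadata:
  name: pod-b
spec:
  containers:
  - name: noisy
    image: busybox
    # Four background busy loops; no CPU limit, so CFS lets it soak up idle cycles
    command: ["sh", "-c", "for i in 1 2 3 4; do (while :; do :; done) & done; wait"]
    resources:
      requests:
        cpu: "100m"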
Observe Throttling
kubectl top pod <pod-name>
Prometheus metric:
container_cpu_cfs_throttled_seconds_total
You'll clearly see throttling even though the pod never exceeded its request.
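If you scrape cAdvisor metrics, a PromQL sketch like this shows the fraction of CFS periods in which pod-a was throttled (the pod label value and the standard cAdvisor metric names are assumed):

rate(container_cpu_cfs_throttled_periods_total{pod="pod-a"}[5m])
/
rate(container_cpu_cfs_periods_total{pod="pod-a"}[5m])

You can also read the kernel's counters directly; the path assumes cgroup v2 (on v1 it is /sys/fs/cgroup/cpu/cpu.stat):

kubectl exec pod-a -- cat /sys/fs/cgroup/cpu.stat
# nr_throttled and throttled_usec climbing while Pod B runs is the proof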
Key Takeaways
- CPU requests do not reserve CPU cores.
- Requests affect scheduling priority, not guaranteed CPU availability.
- CPU is shared dynamically using the Linux CFS.
- Limits control throttling; requests control placement.
- Overestimating CPU requests can hurt cluster efficiency.
- Underestimating CPU limits can cause latency spikes.
- Understanding this distinction is critical for:
  - Performance tuning
  - Capacity planning
  - Avoiding false assumptions during incidents