Kubernetes Resource Management for the CKAD Exam: Requests, Limits, ResourceQuota & LimitRange

Resource management is the quiet workhorse of the Certified Kubernetes Application Developer (CKAD) exam. It rarely gets a flashy dedicated question, but requests, limits, and quotas thread through the entire “Application Environment, Configuration and Security” domain — the single largest slice of the exam at roughly 25% of your score. Get them wrong and your Pods stay Pending, get OOMKilled, or throttle to a crawl. Get them right and you unlock predictable scheduling, stable workloads, and a handful of fast, deterministic points.

This guide turns resource management into a mechanical skill. By the end you will be able to read a question, decide whether it needs a request, a limit, a ResourceQuota, or a LimitRange, write the YAML from memory, and diagnose the three failure modes the exam loves to test: a Pod that won’t schedule, a container that gets killed, and a namespace that rejects new workloads.

Why Resource Management Matters on the CKAD

Kubernetes is a scheduler before it is anything else. When you submit a Pod, the scheduler has to find a node with enough free capacity to run it. It can only do that if you tell it how much the Pod needs. That declaration is the request. Separately, the kubelet enforces a ceiling so one container can’t starve its neighbors — that ceiling is the limit.

The CKAD tests this from the developer’s seat. You are not tuning the cluster; you are shipping an application that behaves correctly under the resource constraints an administrator has set. That means three recurring tasks:

Add CPU and memory requests and limits to a container.
Work inside a namespace governed by a ResourceQuota without tripping it.
Understand how a LimitRange silently injects defaults into your Pods.

Each is small. Together they are worth real points, and they reappear inside troubleshooting and deployment questions where a missing request is the hidden reason a Pod never starts.

Requests vs. Limits: The Core Distinction

This is the one concept everything else builds on, so internalise it precisely:

Concept	What it does	Who enforces it	When it bites
Request	Reserves capacity; used by the scheduler to place the Pod	kube-scheduler	At scheduling time — too-large requests leave the Pod `Pending`
Limit	Hard ceiling the container may consume	kubelet / container runtime	At runtime (exceeding it throttles CPU or gets the container OOM-killed)

A request is a promise the scheduler makes to your container: “this much will always be available to you.” A limit is a promise your container makes to the cluster: “I will never use more than this.” The gap between them is where bursting happens.

Here is a container with all four values set:

apiVersion: v1
kind: Pod
metadata:
  name: web
spec:
  containers:
    - name: web
      image: nginx
      resources:
        requests:
          cpu: "250m"
          memory: "64Mi"
        limits:
          cpu: "500m"
          memory: "128Mi"

This Pod is guaranteed 250 millicores of CPU and 64 MiB of memory the moment it schedules. It may burst up to 500m CPU and 128 MiB; past those ceilings, CPU is throttled and a memory overrun gets the container killed.

Understanding the Units

The units trip people up under time pressure, so commit them to memory.

CPU is measured in cores. 1 means one full vCPU. The suffix m means millicores — thousandths of a core:

Value	Meaning
`1` or `1000m`	One full CPU core
`500m`	Half a core
`250m`	A quarter of a core
`100m`	One tenth of a core

Memory is measured in bytes, but you always write a suffix. Know the difference between the power-of-two and power-of-ten suffixes:

Suffix	Meaning	Example
`Mi`	Mebibyte (1024² bytes)	`128Mi` = 134,217,728 bytes
`Gi`	Gibibyte (1024³ bytes)	`1Gi` = 1,073,741,824 bytes
`M`	Megabyte (1000² bytes)	`128M` = 128,000,000 bytes
`G`	Gigabyte (1000³ bytes)	`1G` = 1,000,000,000 bytes

On the exam, stick with Mi and Gi unless told otherwise — they are the convention in nearly every official manifest and avoid the off-by-a-few-percent confusion of the decimal suffixes.
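
To make the unit tables concrete, here is a minimal Python sketch that converts the quantities above into millicores and bytes. The function names cpu_to_millicores and memory_to_bytes are hypothetical helpers written for this illustration, not part of any Kubernetes client, and only the four memory suffixes from the table are handled.

```python
from decimal import Decimal

# Multipliers for the four memory suffixes listed in the table above.
MEMORY_SUFFIXES = {
    "Mi": 1024**2,
    "Gi": 1024**3,
    "M": 1000**2,
    "G": 1000**3,
}


def cpu_to_millicores(value: str) -> int:
    """Convert a CPU quantity such as "250m", "1" or "0.5" to millicores."""
    if value.endswith("m"):
        return int(value[:-1])
    return int(Decimal(value) * 1000)


def memory_to_bytes(value: str) -> int:
    """Convert a memory quantity such as "128Mi" or "1G" to bytes."""
    for suffix, multiplier in MEMORY_SUFFIXES.items():
        if value.endswith(suffix):
            return int(Decimal(value[: -len(suffix)]) * multiplier)
    return int(value)  # a bare number is already a byte count


print(cpu_to_millicores("1"))    # 1000
print(memory_to_bytes("128Mi"))  # 134217728
print(memory_to_bytes("128M"))   # 128000000
```

The last two lines show why the suffix matters: 128Mi is roughly 4.9% larger than 128M.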

CPU vs. Memory: Two Very Different Enforcement Models

This distinction is a favourite source of “why did my Pod die?” questions, and it is the single most important runtime nuance in the whole topic.

CPU is compressible. If a container hits its CPU limit, the kernel simply throttles it — the process runs slower but keeps living. Nothing is killed. A CPU-throttled container shows up as high latency, not a crash.
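
As a rough illustration of what throttling means in numbers, the sketch below assumes the node enforces CPU limits through Linux CFS bandwidth control with the common default period of 100 ms. Both the period value and the helper name cfs_quota_us are assumptions made for this example; the article itself does not specify the mechanism.

```python
CFS_PERIOD_US = 100_000  # assumed default scheduling period: 100 ms


def cfs_quota_us(cpu_limit_millicores: int, period_us: int = CFS_PERIOD_US) -> int:
    """CPU time in microseconds the container may consume per period."""
    return cpu_limit_millicores * period_us // 1000


# A 500m limit allows 50 ms of CPU time in every 100 ms window.
print(cfs_quota_us(500))  # 50000
```

Once the container has used its share of a window it waits for the next one, which is why throttling shows up as latency rather than as a restart.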

Memory is incompressible. You cannot “slow down” memory usage. If a container tries to allocate beyond its memory limit, the kernel’s OOM (out-of-memory) killer terminates the process. The container’s last state becomes OOMKilled and, depending on the restart policy, it restarts — often into a CrashLoopBackOff if the workload immediately tries to grab the same memory again.

# A container that was OOMKilled shows it in the last state:
kubectl describe pod web | grep -A5 "Last State"
# Last State:     Terminated
#   Reason:       OOMKilled
#   Exit Code:    137

Exit code 137 (128 + signal 9, SIGKILL) is the fingerprint of an OOM kill. When you see it on the exam or in real life, the fix is almost always: raise the memory limit, or fix the application’s memory leak. Raising the request alone does nothing here — the limit is the ceiling that triggered the kill.
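
The 128 + signal arithmetic can be checked with a few lines of Python. This is a small sketch that assumes a POSIX system (where signal 9 is SIGKILL); describe_exit_code is a hypothetical helper name used only here.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Explain a container exit code using the 128 + signal convention."""
    if code > 128:
        try:
            sig = signal.Signals(code - 128)
        except ValueError:
            return f"exit code {code} does not map to a known signal"
        return f"killed by {sig.name} (signal {sig.value})"
    return f"exited on its own with status {code}"


print(describe_exit_code(137))  # killed by SIGKILL (signal 9)
```

Note that 137 only tells you the process received SIGKILL; it is the OOMKilled reason in the Pod status that confirms the memory limit was the cause.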

Quality of Service (QoS) Classes

Kubernetes derives a QoS class for every Pod from how you set requests and limits. The class decides which Pods get evicted first when a node runs out of memory. You don’t set QoS directly — you earn it through your resource spec. The CKAD expects you to recognise all three.

QoS Class	How a Pod earns it	Eviction priority
Guaranteed	Every container has requests and limits set, and request == limit for both CPU and memory	Evicted last (most protected)
Burstable	At least one container has a request or limit, but they don’t meet the Guaranteed bar	Evicted after BestEffort, before Guaranteed
BestEffort	No requests or limits set on any container	Evicted first (least protected)

To get Guaranteed, requests must equal limits for both resources:

resources:
  requests:
    cpu: "500m"
    memory: "256Mi"
  limits:
    cpu: "500m"      # equal to request
    memory: "256Mi"  # equal to request

Check the class Kubernetes assigned with:

kubectl get pod web -o jsonpath='{.status.qosClass}'
# Guaranteed

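A likely exam phrasing is “ensure this Pod is the last to be evicted” or “give this Pod the strongest resource guarantee.” That is code for make it Guaranteed: set requests equal to limits for both CPU and memory on every container. Note that QoS governs eviction order, not scheduling priority; scheduling priority is set separately through a PriorityClass.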
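
The three QoS rules in the table can be written down as a short decision procedure. The Python sketch below is an illustration only: qos_class is a hypothetical function, each container is represented by a plain dict shaped like its resources block, and quantities are compared as written (so "500m" and "0.5" would count as different here, which the real API does not do).

```python
def qos_class(containers: list[dict]) -> str:
    """Derive the QoS class from a list of container `resources` dicts."""
    any_set = False
    guaranteed = True
    for resources in containers:
        requests = resources.get("requests", {})
        limits = resources.get("limits", {})
        if requests or limits:
            any_set = True
        for name in ("cpu", "memory"):
            limit = limits.get(name)
            # An omitted request defaults to the limit when a limit is set.
            request = requests.get(name, limit)
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


equal = {"requests": {"cpu": "500m", "memory": "256Mi"},
         "limits": {"cpu": "500m", "memory": "256Mi"}}
print(qos_class([equal]))      # Guaranteed
print(qos_class([equal, {}]))  # Burstable: the second container sets nothing
print(qos_class([{}]))         # BestEffort
```

The second call is the case to remember: one container without resources is enough to pull the whole Pod out of Guaranteed.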

ResourceQuota: Capping a Namespace

A ResourceQuota is an administrator’s tool to cap the aggregate resource consumption of an entire namespace. As a developer sitting the CKAD, your job is usually to either create a quota or — more commonly — to ship Pods that don’t violate one.

apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
  namespace: dev
spec:
  hard:
    requests.cpu: "2"
    requests.memory: 2Gi
    limits.cpu: "4"
    limits.memory: 4Gi
    pods: "10"

This quota says the dev namespace may have at most 10 Pods, and the sum of all container requests may not exceed 2 cores and 2Gi, with limits summing to no more than 4 cores and 4Gi.

Inspect current usage versus the hard cap with:

kubectl describe resourcequota team-quota -n dev
# Resource         Used   Hard
# --------         ----   ----
# limits.cpu       500m   4
# limits.memory    256Mi  4Gi
# pods             1      10
# requests.cpu     250m   2
# requests.memory  64Mi   2Gi
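
The Used and Hard columns boil down to simple addition: a new Pod is admitted only if Used plus the Pod's own values stays at or below Hard for every tracked key. The Python sketch below reuses the numbers from the output above, expressed in millicores and MiB. The helper exceeded_keys and the plain-integer representation are assumptions made for this illustration.

```python
def exceeded_keys(used: dict, hard: dict, pod: dict) -> list[str]:
    """Return the quota keys a new Pod would push past the hard cap."""
    return [
        key for key, cap in hard.items()
        if used.get(key, 0) + pod.get(key, 0) > cap
    ]


# Values from the describe output above (CPU in millicores, memory in MiB).
hard = {"limits.cpu": 4000, "limits.memory": 4096, "pods": 10,
        "requests.cpu": 2000, "requests.memory": 2048}
used = {"limits.cpu": 500, "limits.memory": 256, "pods": 1,
        "requests.cpu": 250, "requests.memory": 64}

small = {"requests.cpu": 200, "requests.memory": 128,
         "limits.cpu": 400, "limits.memory": 256, "pods": 1}
greedy = dict(small, **{"requests.cpu": 1800, "limits.cpu": 1800})

print(exceeded_keys(used, hard, small))   # []
print(exceeded_keys(used, hard, greedy))  # ['requests.cpu']
```

The greedy Pod fails because 250m already in use plus 1800m requested is 2050m, which is over the 2-core cap on requests.cpu.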

The Quota Gotcha That Costs People Points

Here is the rule that catches everyone: once a namespace has a ResourceQuota that constrains requests.cpu, requests.memory, limits.cpu, or limits.memory, every new Pod in that namespace MUST specify the corresponding requests and limits. A Pod with no resource spec will be rejected outright:

Error from server (Forbidden): error when creating "pod.yaml":
pods "test" is forbidden: failed quota: team-quota:
must specify limits.cpu,limits.memory,requests.cpu,requests.memory

If you see this error on the exam, you have two ways out: add the missing requests and limits to the Pod spec, or rely on a LimitRange to supply them automatically — which brings us to the final piece.

You can create a quota imperatively too, which is faster on the exam:

kubectl create quota team-quota \
  --hard=requests.cpu=2,requests.memory=2Gi,limits.cpu=4,limits.memory=4Gi,pods=10 \
  -n dev

LimitRange: Defaults and Boundaries Per Container

A LimitRange operates at the individual Pod or container level within a namespace. It does two distinct jobs that the exam tests separately:

Inject defaults into containers that don’t specify requests/limits.
Enforce min/max boundaries, rejecting containers that ask for too little or too much.

apiVersion: v1
kind: LimitRange
metadata:
  name: container-limits
  namespace: dev
spec:
  limits:
    - type: Container
      default:            # applied as the LIMIT if none is set
        cpu: "500m"
        memory: "256Mi"
      defaultRequest:     # applied as the REQUEST if none is set
        cpu: "100m"
        memory: "128Mi"
      min:                # smallest a container may request
        cpu: "50m"
        memory: "64Mi"
      max:                # largest a container may request
        cpu: "1"
        memory: "1Gi"

The default block sets the limit when a container omits one; defaultRequest sets the request. This is the elegant fix for the quota gotcha above: with a LimitRange in place, a bare Pod that specifies no resources still gets requests and limits injected, so it satisfies the ResourceQuota automatically.

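Watch the precedence so a question can’t trick you: values you set explicitly in the Pod always win. The LimitRange only fills in what you leave blank. One case to remember: a container that sets a limit but no request gets a request equal to its own limit, not the defaultRequest value. If your container requests 2 CPU but the LimitRange max is 1, the Pod is rejected for exceeding the maximum.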

kubectl describe limitrange container-limits -n dev
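
The defaulting and boundary rules can be sketched as one small function. This is a simplified Python model, not the admission controller's actual code: apply_limit_range is a hypothetical name, quantities are plain integers (millicores and MiB), and the values mirror the container-limits manifest above.

```python
LIMIT_RANGE = {
    "default": {"cpu": 500, "memory": 256},         # limit if none is set
    "defaultRequest": {"cpu": 100, "memory": 128},  # request if none is set
    "min": {"cpu": 50, "memory": 64},
    "max": {"cpu": 1000, "memory": 1024},
}


def apply_limit_range(container: dict, limit_range: dict = LIMIT_RANGE) -> dict:
    """Fill in missing requests/limits, then enforce the min/max band."""
    requests = dict(container.get("requests", {}))
    limits = dict(container.get("limits", {}))
    for name in ("cpu", "memory"):
        # A limit without a request: the request follows the explicit limit.
        if name not in requests and name in limits:
            requests[name] = limits[name]
        requests.setdefault(name, limit_range["defaultRequest"][name])
        limits.setdefault(name, limit_range["default"][name])
        for kind, value in (("request", requests[name]), ("limit", limits[name])):
            if value < limit_range["min"][name]:
                raise ValueError(f"{name} {kind} {value} is below the minimum")
            if value > limit_range["max"][name]:
                raise ValueError(f"{name} {kind} {value} exceeds the maximum")
    return {"requests": requests, "limits": limits}


# A bare container picks up both sets of defaults.
print(apply_limit_range({}))
# Explicit values win; only the blanks are filled in.
print(apply_limit_range({"limits": {"cpu": 300}}))
```

Calling apply_limit_range({"requests": {"cpu": 2000}}) raises ValueError in this model, matching the rejection described above for a 2-CPU request against a max of 1.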

A Complete Exam-Style Walkthrough

Here is the kind of layered task the CKAD strings together. Practise it end-to-end until each step is reflex.

In the dev namespace, create a Pod named cache running redis with a CPU request of 200m, a CPU limit of 400m, a memory request of 128Mi, and a memory limit of 256Mi. Then confirm its QoS class.

The fastest route is to generate the skeleton imperatively, then add the resources block:

# 1. Scaffold the Pod YAML
kubectl run cache --image=redis -n dev --dry-run=client -o yaml > cache.yaml

Edit cache.yaml to add the resources under the container:

    resources:
      requests:
        cpu: "200m"
        memory: "128Mi"
      limits:
        cpu: "400m"
        memory: "256Mi"

# 2. Apply and verify
kubectl apply -f cache.yaml -n dev
kubectl get pod cache -n dev -o jsonpath='{.status.qosClass}'
# Burstable  (requests != limits, so not Guaranteed)

Because the requests don’t equal the limits, this Pod is Burstable — exactly what you’d expect. If the question instead demanded a Guaranteed Pod, you would set the limits equal to the requests.

For a quick edit of an existing Deployment’s resources without hand-editing YAML, kubectl set resources is a huge time-saver:

kubectl set resources deployment web \
  --requests=cpu=200m,memory=128Mi \
  --limits=cpu=500m,memory=256Mi

Troubleshooting Resource Problems: A Decision Tree

When a workload misbehaves around resources, work through these in order. kubectl describe pod and the Pod’s events are your primary tools, and kubectl top shows live usage.

Symptom	Likely cause	Fix
Pod stuck `Pending`, event “Insufficient cpu/memory”	No node has enough free capacity for the request	Lower the request, or free/add node capacity
Pod rejected at creation, “must specify limits/requests”	A `ResourceQuota` requires them and the Pod omits them	Add requests/limits or add a `LimitRange` with defaults
Pod rejected, “maximum … exceeded” / “minimum … required”	Request violates a `LimitRange` min/max	Adjust the request within the allowed band
Container `OOMKilled`, exit code 137	Process exceeded its memory limit	Raise the memory limit or fix the leak
App slow / high latency, no crashes	CPU throttling at the CPU limit	Raise the CPU limit
Quota rejects a new Pod, “exceeded quota”	Namespace totals would breach the `ResourceQuota`	Reduce the Pod’s footprint or delete unused Pods

The two commands you reach for most:

# Live CPU/memory usage per Pod (needs metrics-server)
kubectl top pod -n dev

# How much of each node is already requested/limited
kubectl describe node <node> | grep -A8 "Allocated resources"

The single most common gotcha: a Pod stuck Pending with “Insufficient memory” is a request problem, not a limit problem. The scheduler only ever looks at requests when placing a Pod. Lower the request (or make room) and it schedules.
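
The request-only nature of scheduling can be shown in a few lines. The Python sketch below is a simplified model of the capacity check (the real scheduler applies many more filters); fits_on_node and the sample numbers are hypothetical, with CPU in millicores and memory in MiB.

```python
def fits_on_node(allocatable: dict, scheduled_requests: list[dict],
                 pod_requests: dict) -> bool:
    """True if the node's unreserved capacity covers the Pod's requests.

    Limits are deliberately absent: they play no part in placement.
    """
    for name, wanted in pod_requests.items():
        reserved = sum(r.get(name, 0) for r in scheduled_requests)
        if wanted > allocatable[name] - reserved:
            return False
    return True


node = {"cpu": 2000, "memory": 4096}
running = [{"cpu": 1500, "memory": 1024}, {"cpu": 300, "memory": 512}]

print(fits_on_node(node, running, {"cpu": 250, "memory": 128}))  # False
print(fits_on_node(node, running, {"cpu": 200, "memory": 128}))  # True
```

Only 200m of CPU is unreserved on this node, so a 250m request stays Pending while a 200m request schedules, regardless of what limit either Pod declares.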

Speed Tips for Exam Day

Scaffold, don’t type. kubectl run ... --dry-run=client -o yaml gives you a valid Pod skeleton in seconds; add the resources block by hand. Typing full Pod YAML from scratch wastes minutes you don’t have.
Use kubectl set resources for any task that modifies an existing Deployment’s CPU/memory — it’s faster and less error-prone than editing YAML.
Bookmark the docs. The official Kubernetes page Resource Management for Pods and Containers (formerly titled Managing Resources for Containers) has copy-paste-ready blocks for requests and limits. Searching the official docs during the exam is allowed.
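Read the QoS requirement carefully. “Evicted last” or “most protected” means Guaranteed, which means request == limit for both CPU and memory on every container. A single missing limit drops the whole Pod to Burstable. QoS is about eviction order, not scheduling priority, which is a separate PriorityClass setting.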
Verify with jsonpath. kubectl get pod <name> -o jsonpath='{.status.qosClass}' confirms the class instantly without scrolling through describe.

Where This Fits in Your CKAD Prep

Resource management interlocks with almost everything else on the exam. A misconfigured request is a common hidden cause behind the failures covered in the CKAD application troubleshooting guide, and requests and limits sit right alongside the environment and security concepts in the CKAD ConfigMaps and Secrets guide. To see how this topic is weighted against the others, review the CKAD exam domains breakdown, and for the imperative commands referenced throughout, keep the CKAD kubectl cheat sheet close. If you’re still mapping your timeline, the CKAD study plan slots resource management into a realistic week-by-week schedule.

The fastest way to make these patterns automatic is repetition under timed conditions — set requests and limits, push a Pod into and out of Guaranteed, trip a ResourceQuota on purpose and recover, and watch a deliberately under-provisioned container get OOMKilled. Working through full performance-style scenarios is exactly the kind of practice that turns hesitation into reflex. The Certified Kubernetes Application Developer (CKAD) Mock Exam Bundle is built around these hands-on, exam-style tasks with detailed explanations, so you walk into the real exam having already solved each resource pattern several times.

Frequently Asked Questions

What is the difference between a request and a limit in Kubernetes?

A request is the amount of CPU or memory the scheduler reserves for a container — it guarantees that much capacity and is used to decide which node the Pod lands on. A limit is the hard ceiling the container may consume at runtime, enforced by the kubelet. Requests affect scheduling; limits affect runtime behaviour. A Pod can use more than its request (up to its limit) if the node has spare capacity.

What happens when a container exceeds its memory limit?

The Linux kernel’s out-of-memory (OOM) killer terminates the process. The container’s last state becomes OOMKilled with exit code 137, and it restarts according to the Pod’s restart policy. Because CPU is compressible and memory is not, exceeding a CPU limit only throttles the container (it runs slower), while exceeding a memory limit kills it.

How do QoS classes work in Kubernetes?

Kubernetes assigns each Pod a Quality of Service class based on its resource spec. A Pod is Guaranteed if every container sets requests equal to limits for both CPU and memory, Burstable if it has some requests or limits but doesn’t meet that bar, and BestEffort if no requests or limits are set at all. Under memory pressure, BestEffort Pods are evicted first and Guaranteed Pods last.

What is the difference between a ResourceQuota and a LimitRange?

A ResourceQuota caps the aggregate resource usage of an entire namespace — the total CPU, memory, or object count across all Pods. A LimitRange operates per container or per Pod within a namespace, injecting default requests/limits and enforcing minimum and maximum values. They’re complementary: a LimitRange’s defaults often exist precisely so Pods automatically satisfy a ResourceQuota.

Why is my Pod rejected with “must specify limits.cpu, requests.cpu”?

The namespace has a ResourceQuota that constrains CPU and memory requests/limits, which makes those fields mandatory for every new Pod. Either add explicit requests and limits to the Pod spec, or create a LimitRange in the namespace that provides default and defaultRequest values so bare Pods get them injected automatically.

Why is my Pod stuck in Pending with “Insufficient cpu”?

The scheduler cannot find a node whose unreserved capacity satisfies the Pod’s CPU request. This is a scheduling problem, not a runtime limit problem. Lower the request to fit available capacity, free up resources by removing other Pods, or add a node. Remember the scheduler only ever considers requests — never limits — when placing a Pod.

Should I always set both requests and limits?

For exam tasks, set whatever the question asks. In production, setting requests is essential for reliable scheduling, and setting memory limits protects nodes from runaway containers. CPU limits are more nuanced — some teams omit them to allow bursting — but for the CKAD you should be comfortable specifying all four values and understand the QoS class each combination produces.