Resource management is the quiet workhorse of the Certified Kubernetes Application Developer (CKAD) exam. It rarely gets a flashy dedicated question, but requests, limits, and quotas thread through the entire “Application Environment, Configuration and Security” domain — the single largest slice of the exam at roughly 25% of your score. Get them wrong and your Pods stay Pending, get OOMKilled, or throttle to a crawl. Get them right and you unlock predictable scheduling, stable workloads, and a handful of fast, deterministic points.
This guide turns resource management into a mechanical skill. By the end you will be able to read a question, decide whether it needs a request, a limit, a ResourceQuota, or a LimitRange, write the YAML from memory, and diagnose the three failure modes the exam loves to test: a Pod that won’t schedule, a container that gets killed, and a namespace that rejects new workloads.
Why Resource Management Matters on the CKAD
Kubernetes is a scheduler before it is anything else. When you submit a Pod, the scheduler has to find a node with enough free capacity to run it. It can only do that if you tell it how much the Pod needs. That declaration is the request. Separately, the kubelet enforces a ceiling so one container can’t starve its neighbors — that ceiling is the limit.
The CKAD tests this from the developer’s seat. You are not tuning the cluster; you are shipping an application that behaves correctly under the resource constraints an administrator has set. That means three recurring tasks:
- Add CPU and memory requests and limits to a container.
- Work inside a namespace governed by a ResourceQuota without tripping it.
- Understand how a LimitRange silently injects defaults into your Pods.
Each is small. Together they are worth real points, and they reappear inside troubleshooting and deployment questions where a missing request is the hidden reason a Pod never starts.
Requests vs. Limits: The Core Distinction
This is the one concept everything else builds on, so internalise it precisely:
| Concept | What it does | Who enforces it | When it bites |
|---|---|---|---|
| Request | Reserves capacity; used by the scheduler to place the Pod | kube-scheduler | At scheduling time — too-large requests leave the Pod Pending |
| Limit | Hard ceiling the container may consume | kubelet / container runtime | At runtime: exceeding a CPU limit throttles the container; exceeding a memory limit gets it OOMKilled |
A request is a promise the scheduler makes to your container: “this much will always be available to you.” A limit is a promise your container makes to the cluster: “I will never use more than this.” The gap between them is where bursting happens.
Here is a container with all four values set:
apiVersion: v1
kind: Pod
metadata:
  name: web
spec:
  containers:
  - name: web
    image: nginx
    resources:
      requests:
        cpu: "250m"
        memory: "64Mi"
      limits:
        cpu: "500m"
        memory: "128Mi"
This Pod is guaranteed 250 millicores of CPU and 64 MiB of memory the moment it schedules, and it may burst up to 500m CPU and 128 MiB. Beyond those ceilings, CPU is throttled and a memory overrun gets the container killed.
Understanding the Units
The units trip people up under time pressure, so commit them to memory.
CPU is measured in cores. 1 means one full vCPU. The suffix m means millicores — thousandths of a core:
| Value | Meaning |
|---|---|
| 1 or 1000m | One full CPU core |
| 500m | Half a core |
| 250m | A quarter of a core |
| 100m | One tenth of a core |
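To make the table concrete, here is a minimal shell sketch that normalises a CPU quantity to millicores. The function name `cpu_to_millicores` is a hypothetical helper for illustration, not part of kubectl, and it assumes `awk` is available for the decimal case.

```shell
#!/bin/sh
# Hypothetical helper: convert a CPU quantity ("250m", "1", "0.5") to whole millicores.
cpu_to_millicores() {
  case "$1" in
    *m)
      # Already in millicores: strip the trailing "m".
      printf '%s\n' "${1%m}"
      ;;
    *)
      # Whole or fractional cores: multiply by 1000 and round to the nearest integer.
      awk -v cores="$1" 'BEGIN { printf "%d\n", cores * 1000 + 0.5 }'
      ;;
  esac
}

cpu_to_millicores 1      # prints 1000
cpu_to_millicores 250m   # prints 250
cpu_to_millicores 0.5    # prints 500
```

Normalising everything to millicores first makes it easy to compare a request against a limit or a quota without worrying about which notation a manifest happens to use.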
Memory is measured in bytes, but you always write a suffix. Know the difference between the power-of-two and power-of-ten suffixes:
| Suffix | Meaning | Example |
|---|---|---|
| Mi | Mebibyte (1024² bytes) | 128Mi = 134,217,728 bytes |
| Gi | Gibibyte (1024³ bytes) | 1Gi = 1,073,741,824 bytes |
| M | Megabyte (1000² bytes) | 128M = 128,000,000 bytes |
| G | Gigabyte (1000³ bytes) | 1G = 1,000,000,000 bytes |
On the exam, stick with Mi and Gi unless told otherwise — they are the convention in nearly every official manifest and avoid the off-by-a-few-percent confusion of the decimal suffixes.
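The gap between the binary and decimal suffixes is easy to see with a little arithmetic. The sketch below uses a hypothetical helper, `mem_to_bytes`, and assumes whole-number quantities only (no values such as 1.5Gi) and a shell with 64-bit integer arithmetic.

```shell
#!/bin/sh
# Hypothetical helper: convert a memory quantity ("128Mi", "1G", "4096") to bytes.
mem_to_bytes() {
  num=${1%%[A-Za-z]*}    # numeric part, e.g. "128"
  suffix=${1#"$num"}     # suffix part, e.g. "Mi" (empty for plain bytes)
  case "$suffix" in
    Ki) factor=1024 ;;
    Mi) factor=1048576 ;;
    Gi) factor=1073741824 ;;
    k)  factor=1000 ;;
    M)  factor=1000000 ;;
    G)  factor=1000000000 ;;
    "") factor=1 ;;
    *)  echo "unsupported suffix: $suffix" >&2; return 1 ;;
  esac
  echo $((num * factor))
}

mem_to_bytes 128Mi   # prints 134217728
mem_to_bytes 128M    # prints 128000000
# The decimal form is 6217728 bytes smaller, a bit under 5% of the binary value.
echo $(( $(mem_to_bytes 128Mi) - $(mem_to_bytes 128M) ))   # prints 6217728
```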
CPU vs. Memory: Two Very Different Enforcement Models
This distinction is a favourite source of “why did my Pod die?” questions, and it is the single most important runtime nuance in the whole topic.
CPU is compressible. If a container hits its CPU limit, the kernel simply throttles it — the process runs slower but keeps living. Nothing is killed. A CPU-throttled container shows up as high latency, not a crash.
Memory is incompressible. You cannot “slow down” memory usage. If a container tries to allocate beyond its memory limit, the kernel’s OOM (out-of-memory) killer terminates the process. The container’s last state becomes OOMKilled and, depending on the restart policy, it restarts — often into a CrashLoopBackOff if the workload immediately tries to grab the same memory again.
# A container that was OOMKilled shows it in the last state:
kubectl describe pod web | grep -A5 "Last State"
# Last State: Terminated
# Reason: OOMKilled
# Exit Code: 137
Exit code 137 (128 + signal 9, SIGKILL) is the fingerprint of an OOM kill. When you see it on the exam or in real life, the fix is almost always: raise the memory limit, or fix the application’s memory leak. Raising the request alone does nothing here — the limit is the ceiling that triggered the kill.
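The 128 + signal arithmetic can be sketched as a small shell function. `describe_exit_code` is a hypothetical name used only for illustration, and it covers just the two signals most likely to show up on a terminated container.

```shell
#!/bin/sh
# Hypothetical helper: explain a container exit code using the 128 + signal convention.
describe_exit_code() {
  code=$1
  if [ "$code" -le 128 ]; then
    echo "exit $code: process exited on its own"
    return
  fi
  sig=$((code - 128))   # codes above 128 encode the terminating signal
  case "$sig" in
    9)  echo "exit $code: signal 9 (SIGKILL), consistent with an OOM kill" ;;
    15) echo "exit $code: signal 15 (SIGTERM), a graceful stop request" ;;
    *)  echo "exit $code: signal $sig" ;;
  esac
}

describe_exit_code 137   # prints: exit 137: signal 9 (SIGKILL), consistent with an OOM kill
describe_exit_code 1     # prints: exit 1: process exited on its own
```

Exit code 137 only tells you the process received SIGKILL; the Reason field showing OOMKilled is what confirms the memory limit was the cause.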
Quality of Service (QoS) Classes
Kubernetes derives a QoS class for every Pod from how you set requests and limits. The class decides which Pods get evicted first when a node runs out of memory. You don’t set QoS directly — you earn it through your resource spec. The CKAD expects you to recognise all three.
| QoS Class | How a Pod earns it | Eviction priority |
|---|---|---|
| Guaranteed | Every container has requests and limits set, and request == limit for both CPU and memory | Evicted last (most protected) |
| Burstable | At least one container has a request or limit, but they don’t meet the Guaranteed bar | Evicted in the middle |
| BestEffort | No requests or limits set on any container | Evicted first (least protected) |
To get Guaranteed, requests must equal limits for both resources:
resources:
  requests:
    cpu: "500m"
    memory: "256Mi"
  limits:
    cpu: "500m"      # equal to request
    memory: "256Mi"  # equal to request
Check the class Kubernetes assigned with:
kubectl get pod web -o jsonpath='{.status.qosClass}'
# Guaranteed
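A likely exam phrasing is “ensure this Pod is the last to be evicted when the node runs short of resources.” That is code for make it Guaranteed: set requests equal to limits for both CPU and memory on every container. Note that QoS governs eviction order, not scheduling priority, which is a separate mechanism (PriorityClass).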
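The three-way rule in the table can be sketched as a shell function for the single-container case. `qos_class` is a hypothetical helper, not a kubectl command, and it rests on two stated simplifications: it compares quantities as literal strings (so "0.5" and "500m" would count as different), and it mirrors the Kubernetes behaviour of copying a limit into a request that was left unset.

```shell
#!/bin/sh
# Hypothetical helper for a single-container Pod.
# Usage: qos_class REQ_CPU REQ_MEM LIM_CPU LIM_MEM   (pass "" for an unset value)
qos_class() {
  req_cpu=$1; req_mem=$2; lim_cpu=$3; lim_mem=$4

  # Nothing set at all: BestEffort.
  if [ -z "$req_cpu$req_mem$lim_cpu$lim_mem" ]; then
    echo BestEffort
    return
  fi

  # A limit with no matching request: the request defaults to the limit.
  [ -z "$req_cpu" ] && req_cpu=$lim_cpu
  [ -z "$req_mem" ] && req_mem=$lim_mem

  # Guaranteed needs both limits set and each request equal to its limit.
  if [ -n "$lim_cpu" ] && [ -n "$lim_mem" ] &&
     [ "$req_cpu" = "$lim_cpu" ] && [ "$req_mem" = "$lim_mem" ]; then
    echo Guaranteed
  else
    echo Burstable
  fi
}

qos_class 500m 256Mi 500m 256Mi   # prints Guaranteed
qos_class 250m 64Mi 500m 128Mi    # prints Burstable
qos_class "" "" "" ""             # prints BestEffort
```

For a multi-container Pod the same test applies to every container, and one container that misses the Guaranteed bar makes the whole Pod Burstable.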
ResourceQuota: Capping a Namespace
A ResourceQuota is an administrator’s tool to cap the aggregate resource consumption of an entire namespace. As a developer sitting the CKAD, your job is usually to either create a quota or — more commonly — to ship Pods that don’t violate one.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
  namespace: dev
spec:
  hard:
    requests.cpu: "2"
    requests.memory: 2Gi
    limits.cpu: "4"
    limits.memory: 4Gi
    pods: "10"
This quota says the dev namespace may have at most 10 Pods, and the sum of all container requests may not exceed 2 cores and 2Gi, with limits summing to no more than 4 cores and 4Gi.
Inspect current usage versus the hard cap with:
kubectl describe resourcequota team-quota -n dev
# Resource         Used   Hard
# --------         ----   ----
# limits.cpu       500m   4
# limits.memory    256Mi  4Gi
# pods             1      10
# requests.cpu     250m   2
# requests.memory  64Mi   2Gi
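The Used and Hard columns translate directly into the check the quota performs on each new Pod: current usage plus the new Pod's value must not exceed the hard cap. A minimal sketch for one resource, with all values expressed in millicores and a hypothetical function name `fits_quota`:

```shell
#!/bin/sh
# Hypothetical helper. Usage: fits_quota USED NEW HARD
# Succeeds (exit status 0) when USED + NEW stays within HARD.
fits_quota() {
  [ $(( $1 + $2 )) -le "$3" ]
}

# requests.cpu from the output above: 250m used, hard cap of 2 cores = 2000m.
if fits_quota 250 200 2000; then echo "admitted"; else echo "exceeded quota"; fi
# prints: admitted   (250 + 200 = 450, within 2000)

if fits_quota 250 1800 2000; then echo "admitted"; else echo "exceeded quota"; fi
# prints: exceeded quota   (250 + 1800 = 2050, over 2000)
```

The same comparison runs for every resource named in the quota, and a Pod is rejected if any one of them would go over.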
The Quota Gotcha That Costs People Points
Here is the rule that catches everyone: once a namespace has a ResourceQuota that constrains requests.cpu, requests.memory, limits.cpu, or limits.memory, every new Pod in that namespace MUST specify the corresponding requests and limits. A Pod with no resource spec will be rejected outright:
Error from server (Forbidden): error when creating "pod.yaml":
pods "test" is forbidden: failed quota: team-quota:
must specify limits.cpu,limits.memory,requests.cpu,requests.memory
If you see this error on the exam, you have two ways out: add the missing requests and limits to the Pod spec, or rely on a LimitRange to supply them automatically — which brings us to the final piece.
You can create a quota imperatively too, which is faster on the exam:
kubectl create quota team-quota \
  --hard=requests.cpu=2,requests.memory=2Gi,limits.cpu=4,limits.memory=4Gi,pods=10 \
  -n dev
LimitRange: Defaults and Boundaries Per Container
A LimitRange operates at the individual Pod or container level within a namespace. It does two distinct jobs that the exam tests separately:
- Inject defaults into containers that don’t specify requests/limits.
- Enforce min/max boundaries, rejecting containers that ask for too little or too much.
apiVersion: v1
kind: LimitRange
metadata:
  name: container-limits
  namespace: dev
spec:
  limits:
  - type: Container
    default:            # applied as the LIMIT if none is set
      cpu: "500m"
      memory: "256Mi"
    defaultRequest:     # applied as the REQUEST if none is set
      cpu: "100m"
      memory: "128Mi"
    min:                # smallest a container may request
      cpu: "50m"
      memory: "64Mi"
    max:                # largest a container may request
      cpu: "1"
      memory: "1Gi"
The default block sets the limit when a container omits one; defaultRequest sets the request. This is the elegant fix for the quota gotcha above: with a LimitRange in place, a bare Pod that specifies no resources still gets requests and limits injected, so it passes the ResourceQuota's “must specify” check automatically. The injected values still count toward the namespace totals, so the Pod can be rejected if the quota is already exhausted.
Watch the precedence so a question can’t trick you: values you set explicitly in the Pod always win. The LimitRange only fills in what you leave blank. If your container requests 2 CPU but the LimitRange max is 1, the Pod is rejected for exceeding the maximum.
kubectl describe limitrange container-limits -n dev
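The precedence described above (explicit values win, defaults fill blanks, min and max reject) can be sketched for the CPU request alone. This is a simplified illustration, not the admission controller's actual logic: the function name `check_cpu_request` is hypothetical, values are in millicores, and it ignores limits entirely.

```shell
#!/bin/sh
# CPU values taken from the LimitRange in this section, in millicores.
DEFAULT_REQUEST=100
MIN=50
MAX=1000

# Hypothetical helper. Usage: check_cpu_request REQUEST   (pass "" when the container sets none)
check_cpu_request() {
  request=${1:-$DEFAULT_REQUEST}   # an explicit value wins; the default only fills a blank
  if [ "$request" -lt "$MIN" ]; then
    echo "rejected: ${request}m is below min ${MIN}m"
  elif [ "$request" -gt "$MAX" ]; then
    echo "rejected: ${request}m is above max ${MAX}m"
  else
    echo "admitted with request ${request}m"
  fi
}

check_cpu_request ""     # prints: admitted with request 100m
check_cpu_request 200    # prints: admitted with request 200m
check_cpu_request 2000   # prints: rejected: 2000m is above max 1000m
```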
A Complete Exam-Style Walkthrough
Here is the kind of layered task the CKAD strings together. Practise it end-to-end until each step is reflex.
In the dev namespace, create a Pod named cache running redis with a CPU request of 200m, a CPU limit of 400m, a memory request of 128Mi, and a memory limit of 256Mi. Then confirm its QoS class.
The fastest route is to generate the skeleton imperatively, then add the resources block:
# 1. Scaffold the Pod YAML
kubectl run cache --image=redis -n dev --dry-run=client -o yaml > cache.yaml
Edit cache.yaml to add the resources under the container:
    resources:
      requests:
        cpu: "200m"
        memory: "128Mi"
      limits:
        cpu: "400m"
        memory: "256Mi"
# 2. Apply and verify
kubectl apply -f cache.yaml
kubectl get pod cache -n dev -o jsonpath='{.status.qosClass}'
# Burstable (requests != limits, so not Guaranteed)
Because the requests don’t equal the limits, this Pod is Burstable — exactly what you’d expect. If the question instead demanded a Guaranteed Pod, you would set the limits equal to the requests.
For a quick edit of an existing Deployment’s resources without hand-editing YAML, kubectl set resources is a huge time-saver:
kubectl set resources deployment web \
  --requests=cpu=200m,memory=128Mi \
  --limits=cpu=500m,memory=256Mi
Troubleshooting Resource Problems: A Decision Tree
When a workload misbehaves around resources, work through these in order. kubectl describe pod and the Pod’s events are your primary tools, and kubectl top shows live usage.
| Symptom | Likely cause | Fix |
|---|---|---|
| Pod stuck Pending, event “Insufficient cpu/memory” | No node has enough free capacity for the request | Lower the request, or free/add node capacity |
| Pod rejected at creation, “must specify limits/requests” | A ResourceQuota requires them and the Pod omits them | Add requests/limits or add a LimitRange with defaults |
| Pod rejected, “maximum … exceeded” / “minimum … required” | Request or limit violates a LimitRange min/max | Adjust the value to fall within the allowed band |
| Container OOMKilled, exit code 137 | Process exceeded its memory limit | Raise the memory limit or fix the leak |
| App slow / high latency, no crashes | CPU throttling at the CPU limit | Raise the CPU limit |
| Quota rejects a new Pod, “exceeded quota” | Namespace totals would breach the ResourceQuota | Reduce the Pod’s footprint or delete unused Pods |
The two commands you reach for most:
# Live CPU/memory usage per Pod (needs metrics-server)
kubectl top pod -n dev
# How much of each node is already requested/limited
kubectl describe node <node> | grep -A8 "Allocated resources"
The single most common gotcha: a Pod stuck Pending with “Insufficient memory” is a request problem, not a limit problem. The scheduler only ever looks at requests when placing a Pod. Lower the request (or make room) and it schedules.
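The scheduler's view can be reduced to one comparison per resource: the Pod's request against the node's capacity that has not already been requested. The sketch below uses a hypothetical function, `fits_node`, and made-up node numbers purely for illustration; limits do not appear anywhere in it, which is the point.

```shell
#!/bin/sh
# Hypothetical helper. Usage: fits_node ALLOCATABLE ALREADY_REQUESTED POD_REQUEST
# All three values use the same unit (millicores here).
fits_node() {
  free=$(( $1 - $2 ))    # capacity not yet reserved by other Pods' requests
  [ "$3" -le "$free" ]
}

# Illustrative node: 2000m allocatable, 1900m already requested by other Pods.
if fits_node 2000 1900 250; then echo "schedulable"; else echo "Pending: Insufficient cpu"; fi
# prints: Pending: Insufficient cpu   (only 100m is free)

if fits_node 2000 1900 100; then echo "schedulable"; else echo "Pending: Insufficient cpu"; fi
# prints: schedulable   (the lowered request fits the free 100m)
```

The "Allocated resources" section of kubectl describe node gives you the already-requested figure to plug into this comparison.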
Speed Tips for Exam Day
- Scaffold, don’t type. kubectl run ... --dry-run=client -o yaml gives you a valid Pod skeleton in seconds; add the resources block by hand. Typing full Pod YAML from scratch wastes minutes you don’t have.
- Use kubectl set resources for any task that modifies an existing Deployment’s CPU/memory; it’s faster and less error-prone than editing YAML.
- Bookmark the docs. The official Kubernetes Managing Resources for Containers page has copy-paste-ready blocks for requests, limits, and QoS. Searching it during the exam is allowed.
- Read the QoS requirement carefully. “Evicted last” or “most protected from eviction” means Guaranteed, which means request == limit for CPU and memory on every container; a single missing limit drops the whole Pod to Burstable.
- Verify with jsonpath. kubectl get pod <name> -o jsonpath='{.status.qosClass}' confirms the class instantly without scrolling through describe.
Where This Fits in Your CKAD Prep
Resource management interlocks with almost everything else on the exam. A misconfigured request is a common hidden cause behind the failures covered in the CKAD application troubleshooting guide, and requests and limits sit right alongside the environment and security concepts in the CKAD ConfigMaps and Secrets guide. To see how this topic is weighted against the others, review the CKAD exam domains breakdown, and for the imperative commands referenced throughout, keep the CKAD kubectl cheat sheet close. If you’re still mapping your timeline, the CKAD study plan slots resource management into a realistic week-by-week schedule.
The fastest way to make these patterns automatic is repetition under timed conditions — set requests and limits, push a Pod into and out of Guaranteed, trip a ResourceQuota on purpose and recover, and watch a deliberately under-provisioned container get OOMKilled. Working through full performance-style scenarios is exactly the kind of practice that turns hesitation into reflex. The Certified Kubernetes Application Developer (CKAD) Mock Exam Bundle is built around these hands-on, exam-style tasks with detailed explanations, so you walk into the real exam having already solved each resource pattern several times.
Frequently Asked Questions
What is the difference between a request and a limit in Kubernetes?
A request is the amount of CPU or memory the scheduler reserves for a container — it guarantees that much capacity and is used to decide which node the Pod lands on. A limit is the hard ceiling the container may consume at runtime, enforced by the kubelet. Requests affect scheduling; limits affect runtime behaviour. A Pod can use more than its request (up to its limit) if the node has spare capacity.
What happens when a container exceeds its memory limit?
The Linux kernel’s out-of-memory (OOM) killer terminates the process. The container’s last state becomes OOMKilled with exit code 137, and it restarts according to the Pod’s restart policy. Because CPU is compressible and memory is not, exceeding a CPU limit only throttles the container (it runs slower), while exceeding a memory limit kills it.
How do QoS classes work in Kubernetes?
Kubernetes assigns each Pod a Quality of Service class based on its resource spec. A Pod is Guaranteed if every container sets requests equal to limits for both CPU and memory, Burstable if it has some requests or limits but doesn’t meet that bar, and BestEffort if no requests or limits are set at all. Under memory pressure, BestEffort Pods are evicted first and Guaranteed Pods last.
What is the difference between a ResourceQuota and a LimitRange?
A ResourceQuota caps the aggregate resource usage of an entire namespace — the total CPU, memory, or object count across all Pods. A LimitRange operates per container or per Pod within a namespace, injecting default requests/limits and enforcing minimum and maximum values. They’re complementary: a LimitRange’s defaults often exist precisely so Pods automatically satisfy a ResourceQuota.
Why is my Pod rejected with “must specify limits.cpu, requests.cpu”?
The namespace has a ResourceQuota that constrains CPU and memory requests/limits, which makes those fields mandatory for every new Pod. Either add explicit requests and limits to the Pod spec, or create a LimitRange in the namespace that provides default and defaultRequest values so bare Pods get them injected automatically.
Why is my Pod stuck in Pending with “Insufficient cpu”?
The scheduler cannot find a node whose unreserved capacity satisfies the Pod’s CPU request. This is a scheduling problem, not a runtime limit problem. Lower the request to fit available capacity, free up resources by removing other Pods, or add a node. Remember the scheduler only ever considers requests — never limits — when placing a Pod.
Should I always set both requests and limits?
For exam tasks, set whatever the question asks. In production, setting requests is essential for reliable scheduling, and setting memory limits protects nodes from runaway containers. CPU limits are more nuanced — some teams omit them to allow bursting — but for the CKAD you should be comfortable specifying all four values and understand the QoS class each combination produces.