If there’s one question almost every CKA candidate sees, it’s etcd backup and restore. The exam tests it because etcd is the source of truth for every Kubernetes cluster — losing it means losing the cluster. The good news: this question has a deterministic recipe. Get it right and you bank 7-10 easy points. Get it wrong and you lose them outright (there’s rarely partial credit).
This guide gives you the exact commands, certificate paths, and manifest edits to perform an etcd backup and restore in under 6 minutes. Learn it once, drill it twice, and treat it as guaranteed points on exam day.
Why etcd Matters for the CKA
etcd is the consistent, distributed key-value store that holds every piece of cluster state — every Pod, ConfigMap, Secret, Deployment, and RBAC binding. When the API server starts, it queries etcd. When a Pod is created, the spec is written to etcd. If etcd is gone, your cluster is gone.
A kubeadm-installed cluster runs etcd as a static pod on each control plane node, with TLS certificates stored in /etc/kubernetes/pki/etcd/. The CKA exam expects you to:
- Take a snapshot of a running etcd cluster.
- Verify the snapshot.
- Restore from a snapshot to recover state.
You’ll be given root access to the control plane node. Plan to spend 5-7 minutes on this question.
Prerequisites Before You Run a Single Command
Before running any etcdctl command, locate three pieces of information. They’re all in the etcd static pod manifest:
sudo cat /etc/kubernetes/manifests/etcd.yaml
Find these flags in the spec:
- --listen-client-urls=https://127.0.0.1:2379,https://10.0.0.10:2379
- --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
- --cert-file=/etc/kubernetes/pki/etcd/server.crt
- --key-file=/etc/kubernetes/pki/etcd/server.key
- --data-dir=/var/lib/etcd
You’ll need:
- An endpoint (use https://127.0.0.1:2379 from the control plane node)
- The CA certificate (--cacert)
- The client cert and key (--cert and --key)
These paths are standard on a kubeadm cluster but the exam may use slightly different paths — always check the manifest first.
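To pull out just those flags without scrolling the whole manifest, a grep one-liner works. A minimal sketch (the MANIFEST variable is my own naming; the path is the kubeadm default):

```shell
# Extract only the backup/restore-relevant flags from the etcd static pod manifest.
# MANIFEST is the kubeadm default path; adjust it if the question says otherwise.
MANIFEST=/etc/kubernetes/manifests/etcd.yaml
sudo grep -E -- '--(listen-client-urls|trusted-ca-file|cert-file|key-file|data-dir)=' "$MANIFEST"
```

The pattern anchors on the leading double dash, so peer-facing flags like --peer-cert-file don't clutter the output.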
Install etcdctl (If It’s Not Already Installed)
The exam usually has etcdctl pre-installed. If not:
ETCD_VER=v3.5.10
curl -L https://github.com/etcd-io/etcd/releases/download/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz -o /tmp/etcd.tar.gz
tar xzf /tmp/etcd.tar.gz -C /tmp
sudo mv /tmp/etcd-${ETCD_VER}-linux-amd64/etcdctl /usr/local/bin/
etcdctl version
For exam day, assume etcdctl is on the path. If it’s not, the question will tell you where to find it.
Take an etcd Snapshot
This is the backup command. Memorize it.
sudo ETCDCTL_API=3 etcdctl \
--endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
snapshot save /opt/etcd-backup.db
Expected output:
{"level":"info","ts":...,"caller":"snapshot/v3_snapshot.go:65","msg":"created temporary db file","path":"/opt/etcd-backup.db.part"}
{"level":"info","ts":...,"caller":"snapshot/v3_snapshot.go:79","msg":"fetched snapshot"}
{"level":"info","ts":...,"caller":"snapshot/v3_snapshot.go:88","msg":"saved","path":"/opt/etcd-backup.db"}
Snapshot saved at /opt/etcd-backup.db
If the question specifies a backup location like /var/lib/backup/etcd.db, use that exact path. Save points are easy to lose to small detail mismatches.
Common Mistakes During Backup
- Forgetting sudo → "permission denied" on the certificate files.
- Forgetting ETCDCTL_API=3 → older etcdctl builds default to the v2 API, where the snapshot command doesn't exist. On 3.4+ the v3 API is the default, but setting it costs nothing. Always set it.
- Wrong cert paths → check etcd.yaml first and copy the exact paths.
- Wrong endpoint → use 127.0.0.1:2379 from the etcd node itself, not the API server's address.
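One way to sidestep the flag-related mistakes entirely: etcdctl also reads its connection settings from ETCDCTL_* environment variables. A sketch using the kubeadm default paths (verify them against etcd.yaml first):

```shell
# etcdctl reads these environment variables, so later commands need no TLS flags.
# Paths are kubeadm defaults; confirm them in /etc/kubernetes/manifests/etcd.yaml.
export ETCDCTL_API=3
export ETCDCTL_ENDPOINTS=https://127.0.0.1:2379
export ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt
export ETCDCTL_CERT=/etc/kubernetes/pki/etcd/server.crt
export ETCDCTL_KEY=/etc/kubernetes/pki/etcd/server.key
# Note: plain sudo strips these variables; run: sudo -E etcdctl snapshot save ...
```

Remember the -E: without it, sudo scrubs the environment and you're back to permission or certificate errors.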
Verify the Snapshot
Always verify before moving on. This is a free 30 seconds that confirms a valid backup.
sudo ETCDCTL_API=3 etcdctl --write-out=table snapshot status /opt/etcd-backup.db
Expected output:
+---------+----------+------------+------------+
| HASH | REVISION | TOTAL KEYS | TOTAL SIZE |
+---------+----------+------------+------------+
| abcd123 | 12345 | 1234 | 2.0MB |
+---------+----------+------------+------------+
If TOTAL KEYS is 0 or the command errors, your snapshot is bad. Re-run the backup before declaring the question complete.
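If you prefer a scripted check, the JSON output can be parsed for the key count. A sketch: STATUS_JSON below is a hard-coded stand-in for the real etcdctl call shown in the comment, and the totalKey field name is taken from etcdctl v3.5's JSON output, so verify it on your build.

```shell
# Stand-in for: sudo ETCDCTL_API=3 etcdctl --write-out=json snapshot status /opt/etcd-backup.db
STATUS_JSON='{"hash":2882400001,"revision":12345,"totalKey":1234,"totalSize":2097152}'

# Pull the key count out of the JSON and fail loudly if it is zero.
KEYS=$(echo "$STATUS_JSON" | grep -o '"totalKey":[0-9]*' | cut -d: -f2)
if [ "${KEYS:-0}" -gt 0 ]; then
  echo "snapshot OK: $KEYS keys"
else
  echo "snapshot EMPTY: re-run the backup" >&2
fi
```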
Restore from an etcd Snapshot
Restore is more involved than backup because you must:
- Restore the snapshot to a new data directory.
- Update the etcd static pod manifest to point at the new data directory.
- Wait for kubelet to restart etcd with the restored data.
Step 1: Restore the Snapshot
sudo ETCDCTL_API=3 etcdctl snapshot restore /opt/etcd-backup.db \
--data-dir=/var/lib/etcd-restore
Expected output:
{"level":"info","ts":...,"caller":"snapshot/v3_snapshot.go:296","msg":"restoring snapshot"...}
{"level":"info","ts":...,"caller":"snapshot/v3_snapshot.go:309","msg":"restored snapshot"...}
Don’t restore over /var/lib/etcd directly — etcd is currently running against that directory, and overwriting live data is asking for corruption. Always restore to a fresh directory. (On etcd v3.5+, etcdctl prints a deprecation warning suggesting etcdutl snapshot restore; the restore still works, so don’t let the warning rattle you.)
Step 2: Update the etcd Static Pod Manifest
Edit /etc/kubernetes/manifests/etcd.yaml and change two things:
spec:
containers:
- command:
- etcd
# ... other flags ...
- --data-dir=/var/lib/etcd-restore # was /var/lib/etcd
volumes:
- hostPath:
path: /var/lib/etcd-restore # was /var/lib/etcd
type: DirectoryOrCreate
name: etcd-data
Both edits are required: the --data-dir flag AND the hostPath for the etcd-data volume mount. Miss either one and etcd will start with the wrong data.
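A quick way to confirm both edits landed before moving on (the manifest path is the kubeadm default; the count check is a habit of mine, not an exam requirement):

```shell
# Count references to the new directory in the manifest: the --data-dir flag
# plus the hostPath volume should give exactly 2 matching lines.
MANIFEST=/etc/kubernetes/manifests/etcd.yaml
sudo grep -c 'etcd-restore' "$MANIFEST"    # expect: 2
```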
Step 3: Wait for Etcd to Restart
When you save the manifest, kubelet detects the change and restarts the etcd static pod. The API server will briefly become unreachable.
# Wait for etcd container to be running
watch 'sudo crictl ps | grep etcd'
# Once etcd is up, the API server will recover within ~30s
kubectl get nodes
If kubectl hangs for more than a minute, something’s wrong. Diagnose:
sudo crictl ps -a | grep etcd
sudo crictl logs $(sudo crictl ps -a | grep etcd | head -1 | awk '{print $1}')
The most common error in the logs is “data dir not empty” — meaning you tried to restore on top of an existing data directory. Delete the data dir and re-run the restore.
Step 4: Verify the Restore
Check that resources from the snapshot are visible:
kubectl get pods -A
kubectl get configmaps -A
If you took the snapshot before creating a specific deployment and that deployment is now gone, the restore worked. If everything looks identical to the live state, you may have restored to the wrong directory or skipped the manifest edit.
Restoring on a Different Node (Disaster Recovery)
The exam sometimes asks you to restore an etcd backup onto a fresh control plane. The steps are similar:
- SCP the snapshot to the new control plane node.
- Run the restore command with --data-dir=/var/lib/etcd (safe here, since no etcd is running on the fresh node yet).
- Ensure etcd.yaml is the kubeadm default (--data-dir=/var/lib/etcd).
- Restart kubelet or the etcd container.
For a multi-node etcd cluster, use the --initial-cluster and --name flags during restore:
sudo ETCDCTL_API=3 etcdctl snapshot restore /opt/etcd-backup.db \
--data-dir=/var/lib/etcd-restore \
--name=master-1 \
--initial-cluster=master-1=https://10.0.0.10:2380 \
--initial-advertise-peer-urls=https://10.0.0.10:2380
The CKA primarily tests single-node etcd restore. Multi-node restore appears occasionally but is rarely the primary task — focus on the single-node flow.
Common etcd Question Variants on the Exam
You’ll see one of these flavors:
Variant 1: Take a Snapshot at a Specific Path
“Save an etcd snapshot to /opt/etcd-backup.db using the etcd certificates.”
This is the simplest version. Run the snapshot save command. Verify with snapshot status. Done.
Variant 2: Restore from an Existing Snapshot
“An etcd snapshot exists at /opt/etcd-backup.db. Restore it to a new data directory /var/lib/etcd-restore and update the etcd static pod manifest.”
The full three-step flow. The grader checks both that the data directory exists with restored data AND that the manifest points at it.
Variant 3: Both Backup AND Restore
“Take a snapshot, then restore from it after a simulated failure.”
Combines both flows. Allocate 8-10 minutes for this version.
Variant 4: Multi-Cluster Switching
“On cluster foo, take a snapshot. On cluster bar, restore from a different snapshot.”
The trap here is forgetting to switch context. Always run kubectl config use-context <cluster> and SSH to the correct control plane before running etcdctl.
etcd Backup Best Practices for Production (Bonus Knowledge)
The CKA tests the mechanics, but understanding the production picture deepens your answers:
- Schedule snapshots every 30 minutes via cron or a systemd timer in real environments.
- Store snapshots offsite (S3, GCS) — local snapshots don’t help if the host dies.
- Encrypt snapshots at rest — they contain every Secret in plaintext (unless encryption at rest via an API server EncryptionConfiguration is enabled).
- Test restores quarterly in a staging cluster. An untested backup is an unverified backup.
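The 30-minute schedule above can be wired up with a plain cron entry. A hypothetical sketch (the backup directory and filename pattern are my own assumptions; cert paths are kubeadm defaults):

```
# /etc/cron.d/etcd-backup (hypothetical): one line, every 30 minutes.
# Cron treats % specially, hence the backslash escapes in the date format.
*/30 * * * * root ETCDCTL_API=3 /usr/local/bin/etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key snapshot save /var/backups/etcd/snapshot-$(date +\%Y\%m\%d-\%H\%M).db
```

In a real setup you would pair this with a retention job and an upload to offsite storage, per the list above.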
Drill This Until It’s Automatic
Build a kubeadm cluster (see our Kubernetes lab setup for CKA guide) and practice this flow until you can complete a full backup-and-restore in under 6 minutes without consulting notes:
- Create some workloads (kubectl create deploy practice --image=nginx --replicas=3).
- Take a snapshot.
- Delete the workloads.
- Restore from the snapshot.
- Verify the workloads are back.
Time yourself. If you can’t consistently beat the 6-minute target, drill again. This is one of the easiest places to bank exam points.
Validate Your Speed Under Exam Conditions
The only way to know whether your etcd workflow holds up under exam pressure is to take a full-length scored mock. Our CKA Mock Exam Bundle includes etcd questions in every simulator, with the same UI and certificates layout as the real exam. You’ll find out exactly how long this question takes you under timed conditions — and that’s the number that matters.
Frequently Asked Questions
Q: Do I need to memorize the certificate paths?
A: No. Always read them from /etc/kubernetes/manifests/etcd.yaml first. The kubeadm defaults are standard, but the exam can use custom paths.
Q: What’s the difference between etcdctl v2 and v3?
A: The CKA only uses v3. Always set ETCDCTL_API=3. The v3 API supports snapshot save and snapshot restore; v2 doesn’t.
Q: Can I run etcdctl from my own laptop?
A: No. You need access to etcd’s TLS certs and endpoint, both of which are on the control plane node. Always SSH to the control plane first.
Q: What if etcd is running as a systemd service, not a static pod?
A: Some bare-metal clusters run etcd as a systemd unit. The backup command is identical; only the restore step differs (you edit /etc/etcd/etcd.conf instead of a static pod manifest, then restart with systemctl restart etcd). The CKA primarily tests the kubeadm static pod flow.
Q: How long should an etcd backup take?
A: A few seconds for a typical exam cluster. If it takes more than 30 seconds, your endpoint or certificates are wrong.
Q: Will I lose points for restoring to the original /var/lib/etcd?
A: It will likely fail because the directory isn’t empty, costing you the question. Always restore to a fresh path.
Q: Can I script this for the exam?
A: You can — but only after you’ve practiced it manually enough to know exactly what each step does. Pre-typing a script in a ~/notes file and adapting it on exam day is a legitimate strategy.
Ready to make etcd backup-and-restore an automatic 7 points on your CKA? Drill it on a real cluster, then validate your speed with our CKA Mock Exam Bundle.