
Top 50 Kubernetes Interview Questions for Certified Professionals

50 essential Kubernetes interview questions for certified professionals. Organized by difficulty with detailed answers covering architecture, networking, and more.

By Sailor Team, April 5, 2026

Landing a Kubernetes role requires more than passing certification exams—you need to demonstrate deep understanding in technical interviews. This comprehensive guide covers 50 essential interview questions organized by difficulty level, with detailed answers that explain the “why” behind key concepts.

Whether you’re a CKAD, CKA, or CKS certified professional, these questions prepare you for technical discussions with hiring teams.

Beginner-Level Questions (1-15)

These questions assess foundational Kubernetes knowledge and are typically asked early in interviews.

1. What is Kubernetes and what problems does it solve?

Answer: Kubernetes is an open-source container orchestration platform that automates deployment, scaling, and management of containerized applications. It solves several critical problems:

  • Scalability: Automatically scale applications based on demand
  • High availability: Distribute applications across nodes and restart failed containers
  • Resource efficiency: Optimize resource utilization across infrastructure
  • Rolling updates: Deploy new versions with zero downtime
  • Self-healing: Automatically restart failed containers and replace nodes
  • Service discovery: Enable container-to-container communication dynamically
  • Storage orchestration: Manage persistent storage across distributed systems

2. What is a Pod and why is it the smallest deployable unit?

Answer: A Pod is the smallest deployable unit in Kubernetes. It’s a wrapper around one or more containers (typically one) that share:

  • Network namespace (single IP address)
  • Storage volumes
  • Lifecycle (containers are co-scheduled and co-located on the same node)

Pods are ephemeral—they’re created and destroyed dynamically. Using Pods as the smallest unit (rather than containers) allows Kubernetes to:

  • Manage container lifecycles efficiently
  • Support sidecar containers for logging/monitoring
  • Ensure containers in a Pod can communicate via localhost
  • Provide consistent network identity
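These properties are easiest to see in a manifest. A minimal sketch (names and images are illustrative): both containers mount the same emptyDir volume, and because they share one network namespace, the sidecar could also reach the web server at localhost.

```yaml
# Hypothetical Pod: main container plus a logging sidecar
apiVersion: v1
kind: Pod
metadata:
  name: web-with-sidecar
spec:
  volumes:
  - name: logs
    emptyDir: {}        # shared scratch space, lives as long as the Pod
  containers:
  - name: web
    image: nginx:1.25
    volumeMounts:
    - name: logs
      mountPath: /var/log/nginx
  - name: log-shipper   # sidecar reads what the main container writes
    image: busybox:1.36
    command: ["sh", "-c", "tail -F /var/log/nginx/access.log"]
    volumeMounts:
    - name: logs
      mountPath: /var/log/nginx
```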

3. Explain the difference between Deployments, StatefulSets, and DaemonSets

Answer:

| Resource | Purpose | Use Case |
| --- | --- | --- |
| Deployment | Manage stateless replicated applications | Web servers, APIs, stateless services |
| StatefulSet | Manage stateful applications with persistent identity | Databases, message queues, applications needing stable network identity |
| DaemonSet | Run a Pod on every node | Node monitoring, log collection, network daemons |

Key differences:

  • Deployments: Replicas are interchangeable; order of startup doesn’t matter
  • StatefulSets: Each replica has stable identity (pod-0, pod-1, pod-2); persistent storage; ordered deployment
  • DaemonSets: One Pod per node; automatically scale when nodes added/removed
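Of the three, DaemonSet is the only one not shown elsewhere in this guide. A minimal sketch (illustrative names; the toleration lets the agent also run on control-plane nodes):

```yaml
# Hypothetical DaemonSet: one monitoring agent Pod per node
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-agent
spec:
  selector:
    matchLabels:
      app: node-agent
  template:
    metadata:
      labels:
        app: node-agent
    spec:
      tolerations:
      - key: node-role.kubernetes.io/control-plane
        operator: Exists
        effect: NoSchedule
      containers:
      - name: agent
        image: prom/node-exporter:v1.7.0
```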

4. What is a Service and why is it needed?

Answer: A Service is an abstraction that defines a logical set of Pods and a policy for accessing them. Services are needed because:

  • Stable endpoint: Pods are ephemeral; Services provide stable IPs/DNS names
  • Load balancing: Distribute traffic across multiple Pods
  • Service discovery: Other Pods discover services via DNS (service-name.namespace.svc.cluster.local)
  • Abstraction: Decouple clients from specific Pod instances

Service types:

  • ClusterIP: Internal cluster communication only
  • NodePort: Expose on node ports (external access)
  • LoadBalancer: Use cloud provider’s load balancer
  • ExternalName: Map to external DNS name
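A minimal ClusterIP Service (labels and ports are illustrative): it load-balances across every Pod labeled app: web and is reachable in-cluster as web.&lt;namespace&gt;.svc.cluster.local.

```yaml
# Sketch of the simplest Service type
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: ClusterIP      # the default; internal access only
  selector:
    app: web           # any Pod with this label becomes an endpoint
  ports:
  - port: 80           # Service port clients connect to
    targetPort: 8080   # container port traffic is forwarded to
```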

5. What is a Namespace and when would you use multiple Namespaces?

Answer: A Namespace is a virtual cluster within a physical Kubernetes cluster. It provides:

  • Resource isolation: Resources (Pods, Services, etc.) are scoped to namespaces
  • RBAC enforcement: Control which users/service accounts can access which namespaces
  • Resource quotas: Limit resource consumption per namespace
  • Multi-tenancy: Run multiple applications without interference

Use multiple namespaces for:

  • Multi-team environments: Separate teams work in separate namespaces
  • Environment separation: Dev, staging, production in different namespaces
  • Application isolation: Isolate different applications
  • Multi-tenancy: Isolate customers in SaaS applications
  • Resource quotas: Limit resource usage per namespace

6. What is RBAC and why is it important?

Answer: RBAC (Role-Based Access Control) is Kubernetes’ built-in authorization system. It controls what authenticated users and service accounts can do.

Key components:

  • Role: Set of permissions (e.g., “can create/read/update/delete pods”)
  • RoleBinding: Grants a Role to a user/service account
  • ClusterRole: Cluster-wide version of Role
  • ClusterRoleBinding: Cluster-wide version of RoleBinding

Why important:

  • Security: Implement principle of least privilege
  • Multi-tenancy: Isolate teams/applications
  • Compliance: Meet security requirements
  • Auditability: Track who can do what

7. Explain the Kubernetes architecture and key components

Answer: Kubernetes follows a master-worker (control plane-node) architecture:

Control Plane components:

  • API Server: Exposes Kubernetes API; manages cluster state
  • etcd: Key-value database storing cluster state
  • Scheduler: Assigns Pods to nodes based on resource requirements
  • Controller Manager: Runs controller processes (Deployment, StatefulSet, etc.)
  • Cloud Controller Manager: Interfaces with cloud provider (optional)

Node components:

  • kubelet: Ensures Pods are running on node; reports node status
  • Container runtime: Runs containers (Docker, containerd, etc.)
  • kube-proxy: Maintains network rules for Services

Communication flow: API Server is the central hub; all components communicate through it.

8. What is the difference between requests and limits in resource management?

Answer:

  • Requests: Amount of CPU/memory Kubernetes guarantees for the container

    • Used by scheduler to find suitable nodes
    • Container can use less than requested
    • Example: 100m CPU means guaranteed 100 millicores
  • Limits: Maximum CPU/memory container can use

    • Container is throttled at CPU limits
    • Container is killed if exceeding memory limits
    • Prevents noisy neighbors affecting other containers

Best practice: Set both requests and limits; requests = expected usage, limits = maximum acceptable usage.
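In a container spec this looks like the fragment below (values are illustrative): the scheduler reserves the requests, the kernel throttles CPU at the limit, and the container is OOM-killed if it exceeds the memory limit.

```yaml
# Fragment of a Pod spec showing requests vs limits
containers:
- name: app
  image: myapp:1.0    # hypothetical image
  resources:
    requests:
      cpu: 100m       # guaranteed 0.1 CPU; used for scheduling
      memory: 128Mi
    limits:
      cpu: 500m       # CPU is throttled here
      memory: 256Mi   # exceeding this triggers an OOM kill
```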

9. What is a ConfigMap and how does it differ from Secrets?

Answer: ConfigMap: Stores non-sensitive configuration data (key-value pairs)

  • Used for application configuration
  • Data is not encrypted
  • Can contain small files or large configurations
  • Example: Database host, log level, feature flags

Secret: Stores sensitive data (passwords, tokens, certificates)

  • Intended for confidential information
  • Base64 encoded (not encrypted by default; requires additional security)
  • Can contain TLS certificates, OAuth tokens, SSH keys
  • Should be encrypted at rest using Kubernetes encryption provider

Key difference: ConfigMaps for non-sensitive, Secrets for sensitive data.
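Side by side, the two resources look almost identical (illustrative keys); the difference is intent and handling, not shape. Note the stringData field on the Secret, which lets you write plain values and have the API server base64-encode them.

```yaml
# Non-sensitive configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  DB_HOST: postgres.default.svc.cluster.local
  LOG_LEVEL: info
---
# Sensitive data (base64-encoded at rest; encrypt etcd for real protection)
apiVersion: v1
kind: Secret
metadata:
  name: app-secret
type: Opaque
stringData:
  DB_PASSWORD: change-me
```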

10. Explain the Pod lifecycle

Answer: Pods go through several phases:

  1. Pending: Pod created but not yet running

    • Waiting for resources/scheduling
    • Pulling container images
    • Starting containers
  2. Running: Pod and containers running

    • At least one container running
    • May have other containers starting/restarting
  3. Succeeded: All containers terminated with exit code 0

    • Used for Jobs and one-time tasks
    • Pod doesn’t restart
  4. Failed: At least one container terminated with non-zero exit code

    • Pod remains in Failed state
    • Suitable for debugging
  5. Unknown: Pod status cannot be determined

    • Usually indicates communication problem with kubelet

Container states within Pod:

  • Waiting: Container not yet started (pulling image, creating volume, etc.)
  • Running: Container running
  • Terminated: Container finished or crashed
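The Succeeded and Failed phases are most visible with a Job. A minimal sketch (illustrative names): the Pod runs to completion, exits 0, and lands in Succeeded rather than being restarted.

```yaml
# Hypothetical one-shot Job
apiVersion: batch/v1
kind: Job
metadata:
  name: one-off-task
spec:
  backoffLimit: 2          # retries before the Job is marked Failed
  template:
    spec:
      restartPolicy: Never # required for Jobs; lets the Pod reach Succeeded/Failed
      containers:
      - name: task
        image: busybox:1.36
        command: ["sh", "-c", "echo done"]
```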

11. What is a liveness probe and how does it differ from a readiness probe?

Answer: Liveness Probe: Determines if container is alive

  • Kubernetes restarts container if liveness probe fails
  • Example: Check if application process is still running
  • Restart unhealthy containers automatically
  • Use when: Container can hang without crashing

Readiness Probe: Determines if container is ready to receive traffic

  • Kubernetes removes Pod from Service if readiness probe fails
  • Example: Check if application is fully initialized and ready to handle requests
  • Traffic only sent to “ready” Pods
  • Use when: Application needs startup time before accepting traffic

Key difference:

  • Liveness: Is the container alive? If no → restart
  • Readiness: Is the container ready? If no → don’t send traffic

12. What is a PersistentVolume and PersistentVolumeClaim?

Answer: PersistentVolume (PV): Cluster-wide storage resource

  • Provisioned by administrator
  • Has lifecycle independent of Pods
  • Represents actual storage (NFS, block storage, cloud storage, etc.)
  • Example: 100Gi of NFS storage

PersistentVolumeClaim (PVC): Storage request by application

  • Pod requests storage via PVC
  • Kubernetes matches PVC with suitable PV
  • If no matching PV exists, storage can be dynamically provisioned
  • Example: Application requests 50Gi storage with ReadWriteOnce access mode

Access modes:

  • ReadWriteOnce (RWO): Single node read/write
  • ReadOnlyMany (ROX): Multiple nodes read-only
  • ReadWriteMany (RWX): Multiple nodes read/write
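The application side of this is just a claim (names are illustrative; the StorageClass is an assumption about the cluster). A Pod then references it via spec.volumes[].persistentVolumeClaim.claimName.

```yaml
# Sketch of a PVC requesting 50Gi of single-node read/write storage;
# with a StorageClass set, a matching PV is provisioned dynamically if none exists
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: standard   # assumed StorageClass name
  resources:
    requests:
      storage: 50Gi
```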

13. Explain how Kubernetes networking works for inter-pod communication

Answer: Kubernetes networking model assumes:

  • Every Pod has its own IP address
  • All Pods can communicate directly (no NAT)
  • Containers in Pod communicate via localhost
  • Pods on different nodes communicate via overlay network

Implementation:

  • CNI plugin: Container Network Interface plugin creates pod network
  • Overlay network: Virtual network overlay on physical network (Flannel, Weave, Calico)
  • Service mesh: Optional layer for advanced networking (Istio, Linkerd)

Communication flow:

  1. Container A in Pod1 sends traffic to Pod2’s IP
  2. CNI plugin routes traffic across nodes
  3. Container B in Pod2 receives traffic
  4. Return traffic flows back similarly

Important: Pod IPs are not persistent; Pods are ephemeral. Use Services for stable endpoints.

14. What is the purpose of a NetworkPolicy?

Answer: A NetworkPolicy is a Kubernetes resource that controls network traffic between Pods and external systems.

Functions:

  • Ingress rules: Control traffic entering Pods
  • Egress rules: Control traffic leaving Pods
  • Pod selectors: Specify which Pods the policy applies to
  • Namespace selectors: Allow/deny traffic from specific namespaces

Example use case:

# Only allow traffic from frontend Pods on port 5432
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: database-policy
spec:
  podSelector:
    matchLabels:
      app: database
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 5432

Important: With no NetworkPolicies, all traffic is allowed. Once any policy selects a Pod, that Pod only accepts (or, for egress rules, sends) traffic a policy explicitly allows.

15. Explain the difference between a Rolling Update and Blue-Green Deployment

Answer: Rolling Update (Kubernetes default):

  • Gradually replace old Pods with new ones
  • Maintains service availability throughout
  • Can take longer (old and new Pods run simultaneously)
  • Less resource-intensive
  • Rollback possible by re-deploying previous version

Blue-Green Deployment:

  • Maintain two identical environments (blue=current, green=new)
  • Deploy new version to green environment
  • Test thoroughly
  • Switch traffic to green instantly
  • Faster switchover but requires double resources temporarily

Kubernetes implementation:

  • Rolling update: Built into Deployment resource via rolling update strategy
  • Blue-green: Manually managed by creating second Deployment and switching Service selector

Intermediate-Level Questions (16-33)

These questions assess practical experience and deeper understanding.

16. How would you troubleshoot a Pod stuck in Pending state?

Answer: Diagnosis steps:

  1. Check Pod status: kubectl describe pod <pod-name>
  2. Look for events indicating the issue
  3. Common causes:
    • No available nodes: Insufficient resources (CPU/memory)
    • PVC not bound: Pod waiting for PersistentVolume
    • Image pull error: Container image cannot be pulled
    • Admission controller rejection: Security/quota policy violation

Troubleshooting commands:

# Check Pod events and status
kubectl describe pod <pod-name> -n <namespace>

# Check node resources
kubectl top nodes
kubectl describe node <node-name>

# Check resource requests vs available resources
kubectl get nodes -o yaml | grep -A 5 "allocatable"

# Check PVC status if using persistent storage
kubectl describe pvc <pvc-name> -n <namespace>

# Check for admission issues in cluster events
kubectl get events -n <namespace> --sort-by='.lastTimestamp'

Solutions:

  • Add more nodes or increase node capacity
  • Create PersistentVolume matching PVC requirements
  • Fix image pull issue (image name, registry credentials)
  • Adjust pod resource requests/limits

17. What is horizontal pod autoscaling (HPA) and how does it work?

Answer: HPA automatically scales the number of Pods based on metrics (CPU, memory, custom metrics).

How it works:

  1. Metrics server collects Pod metrics from kubelets
  2. HPA controller queries metrics regularly (default: 15 seconds)
  3. If metric exceeds target, HPA scales up Pods
  4. If metric falls below target, HPA scales down Pods

Example HPA:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80

Important considerations:

  • Requires metrics server running in cluster
  • Takes time to scale; stabilization windows damp rapid flapping (default scale-down stabilization: 300 seconds)
  • Should be used with resource requests/limits for effective scaling

18. How would you perform a zero-downtime deployment?

Answer: Strategy:

  1. Use rolling update strategy (Kubernetes default)
  2. Configure proper health checks (liveness and readiness probes)
  3. Set appropriate grace period for shutdown
  4. Use Pod Disruption Budgets to minimize disruption

Deployment configuration:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # One extra Pod during update
      maxUnavailable: 0  # No Pods unavailable
  template:
    spec:
      terminationGracePeriodSeconds: 30  # Time for graceful shutdown
      containers:
      - name: app
        readinessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 5
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10

Pod Disruption Budget:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: app-pdb
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: app

Key considerations:

  • Ensure application handles graceful shutdown
  • Implement proper health checks
  • Use readiness probes to prevent traffic to initializing Pods
  • Allow enough time for graceful shutdown
  • Test in staging environment first

19. Explain how Kubernetes handles node failures

Answer: When a node fails, Kubernetes automatically recovers:

Detection:

  1. kubelet on node stops sending heartbeat
  2. Kubernetes waits for node-monitor-grace-period (default: 40 seconds)
  3. Node marked as NotReady

Recovery:

  1. Node controller marks node as NotReady/SchedulingDisabled
  2. Pods on failed node move to terminating state
  3. Pod Eviction: Pods are evicted after roughly 5 minutes (the default node.kubernetes.io/unreachable toleration; historically pod-eviction-timeout)
  4. Scheduler creates replacement Pods on healthy nodes

Important behaviors:

  • StatefulSet Pods are not rescheduled until the old Pod is confirmed deleted (may require manual force-deletion on an unreachable node)
  • Deployment Pods are recreated (via Deployment controller)
  • Persistent data must be on separate volumes (not node-local)
  • Services route around failed Pods automatically

Best practices:

  • Use Deployments for stateless applications
  • Use StatefulSets for stateful applications with separate persistent storage
  • Use Pod Disruption Budgets to control eviction
  • Configure appropriate grace periods for shutdown

20. How would you implement role-based access control (RBAC) for multiple teams?

Answer: Architecture:

  1. Create namespace per team
  2. Create Role/RoleBinding for team resources
  3. Create ServiceAccount for team automation
  4. Use ClusterRole for shared resources (read-only)

Example implementation:

# Namespace for team-a
apiVersion: v1
kind: Namespace
metadata:
  name: team-a
---
# ServiceAccount for team-a
apiVersion: v1
kind: ServiceAccount
metadata:
  name: team-a-sa
  namespace: team-a
---
# Role allowing team-a to manage deployments/pods in their namespace
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: team-a-role
  namespace: team-a
rules:
- apiGroups: ["apps"]
  resources: ["deployments", "statefulsets"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: [""]
  resources: ["pods", "services", "configmaps"]
  verbs: ["get", "list", "watch", "create", "delete"]
---
# RoleBinding connects Role to ServiceAccount
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: team-a-rolebinding
  namespace: team-a
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: team-a-role
subjects:
- kind: ServiceAccount
  name: team-a-sa
  namespace: team-a
---
# ClusterRole for read-only access to cluster resources
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: read-only-role
rules:
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources: ["namespaces"]
  verbs: ["get", "list", "watch"]
---
# ClusterRoleBinding for read-only access
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: team-a-read-only
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: read-only-role
subjects:
- kind: ServiceAccount
  name: team-a-sa
  namespace: team-a

21. How would you secure secrets in Kubernetes?

Answer: Built-in security (insufficient alone):

  • Secrets are base64 encoded by default (not encrypted)
  • Only accessible via RBAC

Encryption at rest:

# Enable encryption provider in API server
--encryption-provider-config=/etc/kubernetes/encryption.yaml

External secret management:

  • Use Sealed Secrets (encrypts Secret data)
  • Use HashiCorp Vault with Kubernetes auth
  • Use cloud provider secret management (AWS Secrets Manager, Azure Key Vault)

Example with Sealed Secrets:

# Install sealed-secrets controller
helm repo add sealed-secrets https://bitnami-labs.github.io/sealed-secrets
helm install sealed-secrets sealed-secrets/sealed-secrets -n kube-system

# Seal a secret
echo -n "my-secret" | kubectl create secret generic mysecret --dry-run=client --from-file=password=/dev/stdin -o yaml | kubeseal --format yaml > sealed-secret.yaml

# Deploy sealed secret (can be committed to git)
kubectl apply -f sealed-secret.yaml

Best practices:

  1. Enable encryption at rest on API server
  2. Use external secret management for sensitive data
  3. Implement RBAC to restrict Secret access
  4. Audit access to Secrets
  5. Rotate secrets regularly
  6. Never commit Secrets to Git (use Sealed Secrets or external management)

22. Explain service mesh and when to use it

Answer: A service mesh is a dedicated infrastructure layer that manages service-to-service communication. It typically uses sidecar proxies (such as Envoy) to intercept and manage traffic.

Popular service meshes:

  • Istio: Most feature-rich, complex
  • Linkerd: Lightweight, Kubernetes-native
  • Consul Connect: HashiCorp's service mesh, built on Consul service discovery

Features:

  • Traffic management: Routing, circuit breaking, retry logic
  • Security: mTLS, authorization policies
  • Observability: Distributed tracing, metrics, logging
  • Resilience: Timeout, rate limiting, load balancing

When to use:

  • Complex microservices architectures
  • Need for sophisticated traffic routing
  • Security requirements (mTLS between services)
  • Need for detailed observability
  • Polyglot environments (multiple languages/frameworks)

When NOT to use:

  • Simple architectures (few services)
  • Overhead not justified by requirements
  • Performance critical applications (adds latency)
  • Team unfamiliar with service mesh concepts

23. How would you implement multi-tenancy in Kubernetes?

Answer: Levels of isolation (from least to most secure):

  1. Namespace isolation: Logical separation

    • Different namespaces per tenant
    • Network policies isolate traffic
    • RBAC controls access
    • Resource quotas limit per-tenant usage
  2. Network isolation: Add virtual network boundaries

    • Network policies enforce isolation
    • Dedicated VPC per tenant (in cloud)
    • Service mesh mTLS for encryption
  3. Storage isolation: Separate persistent storage

    • Different storage classes per tenant
    • PersistentVolumes not shared
    • Encryption keys per tenant
  4. Compute isolation: Physical separation

    • Dedicated nodes per tenant (node affinity)
    • Pod Security Standards restrict pod capabilities
    • Separate clusters for highly sensitive tenants

Example namespace-based multi-tenancy:

# Tenant 1
apiVersion: v1
kind: Namespace
metadata:
  name: tenant-1
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: tenant-1-quota
  namespace: tenant-1
spec:
  hard:
    requests.cpu: "10"
    requests.memory: "20Gi"
    limits.cpu: "20"
    limits.memory: "40Gi"
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: tenant-1-isolation
  namespace: tenant-1
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - podSelector: {}
  egress:
  - to:
    - podSelector: {}
  - to:
    - namespaceSelector:
        matchLabels:
          name: kube-system

24. How would you monitor and troubleshoot cluster performance?

Answer: Tools and metrics:

  • Prometheus: Scrape and store metrics
  • Grafana: Visualize metrics
  • Kubernetes metrics server: Provides CPU/memory metrics
  • ELK Stack: Centralize logs (Elasticsearch, Logstash, Kibana)
  • Jaeger: Distributed tracing

Key metrics to monitor:

  • Node metrics: CPU, memory, disk usage, network I/O
  • Pod metrics: CPU, memory consumption
  • API server latency: Request processing time
  • etcd performance: Key operations latency
  • Network metrics: Pod-to-pod latency, packet loss

Monitoring setup:

# Prometheus ServiceMonitor example
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: kubernetes-metrics
spec:
  selector:
    matchLabels:
      app: app
  endpoints:
  - port: metrics
    interval: 30s

Troubleshooting commands:

# Node metrics
kubectl top nodes
kubectl top pods --all-namespaces

# Cluster events
kubectl get events --all-namespaces --sort-by='.lastTimestamp'

# API server logs
kubectl logs -n kube-system -l component=kube-apiserver

# etcd health
kubectl exec -n kube-system -it <etcd-pod> -- etcdctl endpoint health

# Check control plane components (ComponentStatus is deprecated since v1.19; prefer probing the API server's /readyz endpoint)
kubectl get cs

# Check kubelet logs (on node)
journalctl -u kubelet -f

25. Explain how to implement GitOps in Kubernetes

Answer: GitOps makes Git the single source of truth for cluster state. Tools automatically sync cluster with Git repository.

Tools:

  • ArgoCD: Most popular, declarative GitOps
  • Flux: Progressive delivery with GitOps
  • Jenkins X: CI/CD with GitOps for Kubernetes

Implementation:

  1. Store all manifests in Git repository
  2. Deploy GitOps controller (ArgoCD)
  3. Controller watches Git repository
  4. Any cluster changes revert to Git state (reconciliation)
  5. All changes made through Git pull requests

Example ArgoCD application:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/myorg/myrepo
    targetRevision: main
    path: k8s-manifests/
  destination:
    server: https://kubernetes.default.svc
    namespace: myapp
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
    - CreateNamespace=true

Benefits:

  • Auditability: Full Git history of changes
  • Rollback: Easy revert to previous state
  • Declarative: Desired state in Git
  • Collaboration: Pull requests for changes
  • Automation: Automatic reconciliation

26. How would you handle disaster recovery in Kubernetes?

Answer: Backup and restore strategy:

  1. Backup all Kubernetes objects:
# Using Velero (backup and restore tool)
velero install --provider aws --bucket velero-backups

# Create backup
velero backup create my-backup

# List backups
velero backup get
  2. Backup persistent data:

    • Use storage snapshots (AWS EBS, GCP persistent disks)
    • Export data to external storage
    • Version control manifests
  3. Test disaster recovery regularly:

    • Perform restoration drills
    • Measure RTO (Recovery Time Objective)
    • Measure RPO (Recovery Point Objective)

High availability setup:

  • Multi-master control plane
  • etcd cluster with 3+ nodes
  • Distributed across availability zones
  • Load balancer for API server access

Example Velero restore:

# Restore from backup
velero restore create --from-backup my-backup

# Restore specific namespace
velero restore create --from-backup my-backup --include-namespaces prod

27. How would you implement continuous deployment with Kubernetes?

Answer: CI/CD pipeline:

  1. Developer commits code
  2. CI pipeline builds container image
  3. Push image to registry
  4. Deployment manifests updated (usually automatically via GitOps)
  5. CD tool (ArgoCD, Flux) detects changes
  6. CD tool applies manifests to cluster
  7. Kubernetes deploys new version (rolling update)
  8. Monitoring detects issues, triggers rollback if needed

Tools:

  • CI: Jenkins, GitLab CI, GitHub Actions, CircleCI
  • CD: ArgoCD, Flux, Spinnaker, Harness
  • Registry: Docker Hub, ECR, GCR, Artifactory

Safety mechanisms:

  • Readiness/liveness probes prevent bad deployments
  • Rolling updates maintain availability
  • Pod Disruption Budgets minimize impact
  • Automated rollback on failed health checks
  • Smoke tests verify new version works
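The build-and-push half of that pipeline can be sketched as a CI job (GitHub Actions syntax; the registry, image name, and secret name are assumptions). A GitOps controller, as in question 25, then detects the resulting manifest change and applies it.

```yaml
# Hypothetical CI workflow: build an image tagged with the commit SHA and push it
name: build-and-push
on:
  push:
    branches: [main]
jobs:
  image:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v4
    - name: Build image
      run: docker build -t registry.example.com/myapp:${{ github.sha }} .
    - name: Push image
      run: |
        echo "${{ secrets.REGISTRY_TOKEN }}" | docker login registry.example.com -u ci --password-stdin
        docker push registry.example.com/myapp:${{ github.sha }}
```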

28. Explain container security best practices in Kubernetes

Answer: Security layers:

  1. Image security:

    • Scan images for vulnerabilities (Trivy, Snyk)
    • Use minimal base images (Alpine, distroless)
    • Sign images (Docker Content Trust)
    • Run as non-root user
  2. Pod security:

    • Use Pod Security Standards
    • Restrict containers to read-only filesystem
    • Drop unnecessary capabilities
    • Run with resource limits
  3. Network security:

    • Implement NetworkPolicies
    • Use mTLS between services
    • Encrypt traffic in transit
  4. Runtime security:

    • Monitor system calls (Falco)
    • Restrict system calls (seccomp)
    • AppArmor/SELinux profiles

Example secure pod:

apiVersion: v1
kind: Pod
metadata:
  name: secure-pod
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    fsGroup: 1000
  containers:
  - name: app
    image: myapp:1.2.3  # pin image tags; avoid :latest
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop:
        - ALL
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
      limits:
        cpu: 500m
        memory: 512Mi

29. How would you troubleshoot a CrashLoopBackOff Pod?

Answer: Diagnosis:

  1. Check Pod logs: kubectl logs <pod-name>
  2. Check previous logs: kubectl logs <pod-name> --previous
  3. Describe Pod: kubectl describe pod <pod-name>
  4. Check events for specific errors

Common causes:

  • Application error: Check logs for exception/error
  • Missing dependencies: Volume not mounted, config not found
  • Resource limits: Insufficient memory causes OOM kill
  • Health check failure: Liveness probe killing healthy container
  • Image pull error: Container image not accessible

Troubleshooting steps:

# Check logs
kubectl logs deployment/myapp

# Check previous logs before crash
kubectl logs deployment/myapp --previous

# Describe for events and status
kubectl describe pod <pod-name>

# Check resource usage
kubectl top pod <pod-name>

# Check events
kubectl get events -n default --sort-by='.lastTimestamp'

# Increase log level for debugging
kubectl set env deployment/myapp LOG_LEVEL=debug

Solutions:

  • Fix application error (update code/image)
  • Mount missing volumes
  • Adjust resource limits
  • Fix health check configuration
  • Verify container image is accessible

30. Explain Kubernetes admission controllers and when you’d use them

Answer: Admission controllers intercept requests to API server before persisting them. They can mutate or validate requests.

Types:

  • Mutating: Modify request (add sidecar, set defaults)
  • Validating: Accept/reject request based on rules

Common controllers:

  • Pod Security Admission: Enforces Pod Security Standards (replaces PodSecurityPolicy, which was removed in v1.25)
  • ResourceQuota: Prevent exceeding quotas
  • LimitRanger: Set default resource limits
  • ValidatingWebhook: Custom validation rules
  • MutatingWebhook: Custom mutation rules

Example ValidatingWebhook:

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: image-policy
webhooks:
- name: image-policy.example.com
  clientConfig:
    service:
      name: image-policy-webhook
      namespace: default
      path: "/validate"
    caBundle: LS0t...
  rules:
  - operations: ["CREATE", "UPDATE"]
    apiGroups: [""]
    apiVersions: ["v1"]
    resources: ["pods"]
  admissionReviewVersions: ["v1"]
  sideEffects: None

31. How would you implement a blue-green deployment in Kubernetes?

Answer: Process:

  1. Deploy blue version (current):

    • Service selector points to blue Deployment
  2. Deploy green version (new):

    • Create separate Deployment
    • Deploy new version, verify it’s working
  3. Test green version:

    • Port-forward to green Pods for testing
    • Run smoke tests
    • Verify functionality
  4. Switch traffic:

    • Update Service selector to point to green Deployment
    • All traffic immediately goes to green
  5. Keep blue for rollback:

    • Blue Deployment remains running
    • Can switch back if issues arise
    • Delete blue after stability confirmed

Example manifests:

# Blue Deployment (current)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-blue
spec:
  selector:
    matchLabels:
      app: myapp
      version: blue
  template:
    metadata:
      labels:
        app: myapp
        version: blue
    spec:
      containers:
      - name: myapp
        image: myapp:v1
---
# Service initially points to blue
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    app: myapp
    version: blue  # Points to blue
  ports:
  - port: 80
    targetPort: 8080
---
# Green Deployment (new version ready to deploy)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-green
spec:
  selector:
    matchLabels:
      app: myapp
      version: green
  template:
    metadata:
      labels:
        app: myapp
        version: green
    spec:
      containers:
      - name: myapp
        image: myapp:v2
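Step 4's cut-over is simply the same Service re-applied with its selector changed to green:

```yaml
# Switch traffic: update the Service selector and re-apply
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    app: myapp
    version: green  # Now points to green
  ports:
  - port: 80
    targetPort: 8080
```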

32. How would you implement request rate limiting in Kubernetes?

Answer: Methods:

  1. API server rate limiting:

    • Built-in rate limiting for API requests
    • Prevents API server overload
  2. Service mesh rate limiting:

    • Istio/Linkerd provide rate limiting
    • Per-service configuration
  3. Application-level rate limiting:

    • Implement in application code
    • Most fine-grained control
  4. Ingress rate limiting:

    • Rate limit at ingress controller level
    • Per-IP or per-request limit

Example Nginx Ingress rate limiting:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: rate-limited-ingress
  annotations:
    nginx.ingress.kubernetes.io/limit-rps: "10"
    nginx.ingress.kubernetes.io/limit-connections: "5"
spec:
  ingressClassName: nginx
  rules:
  - host: api.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: myapi
            port:
              number: 80

33. Explain how to debug networking issues in Kubernetes

Answer: Diagnosis process:

  1. Check service connectivity:
# Get service endpoints
kubectl get endpoints <service-name>

# Verify service exists and is accessible
kubectl get svc <service-name>

# Try to connect
kubectl exec <pod-name> -- curl http://<service-name>:80
  2. Check DNS resolution:
# Test DNS from pod
kubectl exec <pod-name> -- nslookup <service-name>
kubectl exec <pod-name> -- nslookup <service-name>.default.svc.cluster.local

# Check CoreDNS logs
kubectl logs -n kube-system -l k8s-app=kube-dns
  3. Check network policies:
# Get network policies
kubectl get networkpolicies -n <namespace>

# Describe policy
kubectl describe networkpolicy <policy-name>
  4. Check pod-to-pod connectivity:
# Test from one pod to another
kubectl exec <pod1-name> -- ping <pod2-ip>

# Use netcat to test specific port
kubectl exec <pod1-name> -- nc -zv <pod2-ip> 8080
  5. Check node networking:
# Verify node routes
kubectl debug node/<node-name> -it --image=ubuntu

# Check iptables rules (inside node-debug container)
iptables -L -n | grep <service-ip>
  6. Use network debugging tools:
# Deploy debugging pod
kubectl run netshoot --image=nicolaka/netshoot --stdin --tty

# From debugging pod, test connectivity
nc -zv <service-ip> <port>
curl http://<service-name>
traceroute <service-ip>

Advanced-Level Questions (34-50)

These questions test expert-level knowledge suitable for senior roles.

34. How would you implement custom resource definitions (CRDs) and operators?

Answer: CRDs extend Kubernetes with custom resource types. Operators are applications managing other applications using CRDs.

CRD example:

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databases.example.com
spec:
  names:
    kind: Database
    plural: databases
  scope: Namespaced
  group: example.com
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              size:
                type: integer
              engine:
                type: string
                enum: ["postgres", "mysql"]

Custom resource example:

apiVersion: example.com/v1
kind: Database
metadata:
  name: my-database
spec:
  size: 10
  engine: postgres

Operator components:

  1. CRD: Defines custom resource schema
  2. Controller: Watches custom resources, reconciles desired state
  3. Business logic: Implements actual functionality
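At the heart of the controller is a reconcile loop: compare the desired state declared in the custom resource spec against the observed state in the cluster, and act on the difference. A minimal sketch in plain Python (not a real Kubernetes client; the names are illustrative):

```python
def reconcile(desired, observed):
    """Return the actions needed to drive observed state toward desired state."""
    to_create = desired - observed  # declared in the spec but not running
    to_delete = observed - desired  # running but no longer declared
    return sorted(f"create {name}" for name in to_create) + \
           sorted(f"delete {name}" for name in to_delete)

# A real controller runs this on every watch event, executes the actions
# through the API server, and requeues the object on failure.
actions = reconcile(desired={"db-0", "db-1"}, observed={"db-1", "db-2"})
print(actions)  # ['create db-0', 'delete db-2']
```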

Popular operators:

  • Prometheus Operator: Manages Prometheus monitoring
  • Etcd Operator: Manages etcd clusters
  • Kafka Operator: Manages Kafka clusters

Benefits:

  • Encapsulate domain knowledge
  • Declarative infrastructure
  • Automated operational tasks

35. How would you implement cross-cluster communication and federation?

Answer: Approaches:

  1. Service meshes (Istio):

    • Mesh across clusters
    • Automatic service discovery
    • Traffic management between clusters
  2. KubeFed (Kubernetes Federation, now archived):

    • Federate resources across clusters
    • Single control plane managing multiple clusters
    • Automatic failover between clusters
  3. Manual federation:

    • Export services from one cluster
    • Import into another cluster
    • Manual DNS or routing configuration
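For the manual approach, an ExternalName Service is one simple way to give a local DNS name to a service exposed by another cluster (the hostname below is illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: remote-api
spec:
  type: ExternalName
  # DNS name where the other cluster exposes the service (LB or ingress)
  externalName: api.cluster-west.example.com
```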

KubeFed example:

# In federation control plane
apiVersion: types.kubefed.io/v1beta1
kind: FederatedDeployment
metadata:
  name: myapp
  namespace: default
spec:
  template:
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: myapp
      template:
        # Deployment template
  placement:
    clusterSelector:
      matchLabels:
        region: us-west
  overrides:
  - clusterName: cluster-1
    clusterOverrides:
    - path: /spec/replicas
      value: 3
  - clusterName: cluster-2
    clusterOverrides:
    - path: /spec/replicas
      value: 1

36. Explain etcd backup and recovery strategies

Answer: Backup strategies:

  1. Snapshot backup:
# Backup etcd snapshot
ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  snapshot save /backup/etcd.snap

# Verify snapshot
ETCDCTL_API=3 etcdctl snapshot status /backup/etcd.snap
  2. Automated backups:
    • Use Velero with etcd plugin
    • Scheduled snapshots
    • Off-cluster storage
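Scheduled snapshots can be as simple as a CronJob running etcdctl (the image and host paths are illustrative; the same TLS flags as in the snapshot command above apply in real clusters):

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: etcd-backup
  namespace: kube-system
spec:
  schedule: "0 */6 * * *"  # every 6 hours
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: backup
            image: bitnami/etcd:latest  # any image that ships etcdctl
            command:
            - /bin/sh
            - -c
            - etcdctl --endpoints=https://127.0.0.1:2379 snapshot save /backup/etcd-$(date +%s).snap
            volumeMounts:
            - name: backup
              mountPath: /backup
          volumes:
          - name: backup
            hostPath:
              path: /var/backups/etcd
```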

Recovery:

# Restore from snapshot
ETCDCTL_API=3 etcdctl snapshot restore /backup/etcd.snap \
  --data-dir=/var/lib/etcd-restored

# Restart etcd with restored data
systemctl stop etcd
rm -rf /var/lib/etcd
mv /var/lib/etcd-restored /var/lib/etcd
systemctl start etcd

Best practices:

  • Regular automated backups
  • Store backups off-cluster
  • Test recovery procedures
  • Monitor backup success
  • Document recovery process

37. How would you optimize Kubernetes for cost?

Answer: Strategies:

  1. Right-sizing:

    • Monitor actual resource usage
    • Adjust requests/limits based on real usage
    • Use VPA (Vertical Pod Autoscaler) to recommend sizing
  2. Bin-packing:

    • Use HPA to scale replicas based on load
    • Use Cluster Autoscaler to scale nodes
    • Pack workloads efficiently on nodes
  3. Spot instances:

    • Use cloud provider spot/preemptible instances
    • Handle pod evictions gracefully
    • Mix on-demand and spot instances
  4. Resource consolidation:

    • Identify unused resources
    • Remove idle workloads
    • Consolidate small workloads
  5. Storage optimization:

    • Use appropriate storage tiers
    • Remove unused volumes
    • Enable storage reclamation

Cost monitoring tools:

  • Kubecost: Kubernetes-native cost monitoring
  • Cloud provider cost management
  • Custom metrics and dashboards

Example VPA:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: myapp
  updatePolicy:
    updateMode: "Auto"

38. How would you implement finalization logic in Kubernetes?

Answer: Finalizers block an object's deletion until cleanup operations have completed, which also guards against losing external resources the object manages.

How they work:

  1. When object deletion requested, object marked for deletion but not deleted
  2. Finalizer logic runs (remove external resources, cleanup)
  3. Finalizer removed from object
  4. Object fully deleted

Example with finalizer:

apiVersion: v1
kind: ConfigMap
metadata:
  name: my-config
  finalizers:
  - example.com/cleanup
data:
  key: value

In an operator controller (pseudocode sketch):

# When delete requested
if obj.metadata.deletionTimestamp:
    if 'example.com/cleanup' in obj.metadata.finalizers:
        # Perform cleanup (delete external resources)
        cleanup_external_resources(obj)
        # Remove finalizer
        obj.metadata.finalizers.remove('example.com/cleanup')
        obj.update()
else:
    # Add finalizer on creation
    obj.metadata.finalizers.append('example.com/cleanup')
    obj.update()

39. Explain Kubernetes authentication and authorization mechanisms

Answer: Authentication (is user who they claim?):

  • X.509 certificates
  • Static token file
  • Service account tokens
  • OIDC provider integration
  • Webhook authentication

Authorization (what can user do?):

  • RBAC (Role-Based Access Control) - most common
  • ABAC (Attribute-Based Access Control)
  • Webhook authorization
  • Node authorization
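Since RBAC is the mechanism interviewers probe most, a minimal Role and RoleBinding granting read-only Pod access:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: default
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: default
subjects:
- kind: User
  name: jane@example.com  # matches the OIDC username claim
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```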

Example OIDC integration:

# Configure API server with OIDC
--oidc-issuer-url=https://example.com
--oidc-client-id=kubernetes
--oidc-username-claim=email
--oidc-groups-claim=groups

Best practices:

  • Use OIDC for user authentication
  • Use service accounts for application authentication
  • Implement RBAC for fine-grained access control
  • Audit authentication attempts
  • Rotate certificates regularly

40. How would you troubleshoot API server issues?

Answer: Common issues:

  1. API server not responding: Check logs, verify etcd health
  2. High latency: Check etcd performance, API server load
  3. Request failures: Check RBAC, admission controllers, resource quotas

Debugging steps:

# Check API server health endpoints (kubectl get cs is deprecated)
kubectl get --raw='/readyz?verbose'

# Check API server logs
kubectl logs -n kube-system -l component=kube-apiserver

# Re-read the logs with timestamps for correlation
# (audit logs live at the path set by --audit-log-path)
kubectl logs -n kube-system -l component=kube-apiserver --timestamps=true

# Monitor API server metrics
# Check:
# - apiserver_request_duration_seconds
# - apiserver_client_certificate_expiration_seconds
# - etcd_request_duration_seconds

# Check resource quotas/limits
kubectl describe resourcequota -A

# Test API connectivity
curl -k https://kubernetes.default/api/v1/namespaces

# Check certificate expiration
kubeadm certs check-expiration

41. Explain how webhook-based extension works in Kubernetes

Answer: Webhooks allow external services to extend Kubernetes functionality by intercepting API calls.

Types:

  • MutatingWebhook: Modifies objects before persistence
  • ValidatingWebhook: Validates objects before persistence
  • Audit webhook: Sends audit events to external service

Example ValidatingWebhook:

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: example-webhook
webhooks:
- name: validate.example.com  # must be a fully qualified domain name
  clientConfig:
    service:
      name: webhook-service
      namespace: default
      path: /validate
    caBundle: LS0t...
  rules:
  - operations: ["CREATE", "UPDATE"]
    apiGroups: [""]
    apiVersions: ["v1"]
    resources: ["pods"]
  admissionReviewVersions: ["v1"]
  sideEffects: None
  timeoutSeconds: 5
  failurePolicy: Fail  # Fail if webhook unreachable

Webhook implementation (example in Python):

from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route('/validate', methods=['POST'])
def validate():
    admission_review = request.get_json()
    pod = admission_review['request']['object']

    # Validation logic: require imagePullPolicy: Always on every container
    containers = pod.get('spec', {}).get('containers', [])
    if any(c.get('imagePullPolicy') != 'Always' for c in containers):
        return jsonify({
            'apiVersion': 'admission.k8s.io/v1',
            'kind': 'AdmissionReview',
            'response': {
                'uid': admission_review['request']['uid'],
                'allowed': False,
                'status': {'message': 'imagePullPolicy must be Always'}
            }
        })

    return jsonify({
        'apiVersion': 'admission.k8s.io/v1',
        'kind': 'AdmissionReview',
        'response': {
            'uid': admission_review['request']['uid'],
            'allowed': True
        }
    })

42. How would you implement pod security at scale?

Answer: Pod Security Standards (PSS) define three levels:

  • Privileged: Unrestricted; allows known privilege escalations
  • Baseline: Minimally restrictive; blocks known privilege escalations
  • Restricted: Most restrictive; follows current Pod hardening best practices

Each level can be applied in enforce, warn, or audit mode via namespace labels.

Implementation:

# The PodSecurity admission plugin enforces PSS and is enabled by default
# on Kubernetes v1.23+; no API server flag changes are needed.

# Label namespace with PSS policy
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted

Pod Security Policies (deprecated in v1.21, removed in v1.25; shown here for legacy clusters):

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted-psp
spec:
  privileged: false
  allowPrivilegeEscalation: false
  requiredDropCapabilities:
  - ALL
  volumes:
  - 'configMap'
  - 'emptyDir'
  - 'projected'
  - 'secret'
  - 'downwardAPI'
  - 'persistentVolumeClaim'
  hostNetwork: false
  hostIPC: false
  hostPID: false
  runAsUser:
    rule: 'MustRunAsNonRoot'
  seLinux:
    rule: 'MustRunAs'
  supplementalGroups:
    rule: 'RunAsAny'
  fsGroup:
    rule: 'RunAsAny'
  readOnlyRootFilesystem: false

43. Explain container runtime and container shim architecture

Answer: Container runtime stack:

  1. Container Runtime Interface (CRI): kubelet talks to the runtime over gRPC
  2. High-level runtime (containerd, CRI-O): implements CRI; manages images and container lifecycle
  3. Shim (e.g., containerd-shim): holds each container's process open so containers survive runtime daemon restarts
  4. Low-level runtime (runc, crun): actually creates and runs containers
  5. Linux kernel: provides cgroups and namespaces for isolation

Popular runtimes:

  • containerd: CNCF graduated project, lightweight
  • CRI-O: Kubernetes-native, minimal dependencies
  • Docker Engine: Full-featured but heavier; requires the cri-dockerd adapter since dockershim was removed in v1.24
  • gVisor: Sandboxed containers for security

Runtime selection considerations:

  • Performance: containerd and CRI-O faster
  • Security: gVisor more secure but slower
  • Compatibility: Docker most compatible
  • Simplicity: containerd good balance

Changing runtime:

# Edit kubelet configuration
# Change runtime socket: --container-runtime-endpoint=unix:///var/run/crio/crio.sock

# Restart kubelet
systemctl restart kubelet

# Verify runtime
kubectl get nodes -o wide

44. How would you implement eBPF-based observability in Kubernetes?

Answer: eBPF (Extended Berkeley Packet Filter) provides kernel-level observability without modifying application code.

Tools:

  • Cilium: eBPF-based networking and observability
  • Tetragon: eBPF-based security observability
  • Pixie: Kubernetes-native eBPF observability

Benefits:

  • Kernel-level visibility into system behavior
  • No application instrumentation needed
  • Low overhead
  • Real-time analysis

Example Tetragon (for security monitoring):

apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: "k8s-api-calls"
spec:
  rules:
  - rule: "Monitor execve syscalls"
    event: "ProcessExec"
    matchPolicies:
    - matchNamespaces:
      - namespace: "default"
    matchArgs:
    - index: 0
      operator: "Equal"
      values: ["curl", "wget"]
    actions:
    - action: "Notify"

45. Explain how kubelet works and kubelet node heartbeat

Answer: kubelet responsibilities:

  • Ensure Pods are running on node
  • Report node status to API server
  • Run health checks (liveness/readiness probes)
  • Mount volumes and manage container storage
  • Execute container pre/post start hooks

Node heartbeat mechanism:

  1. Every 10 seconds (default): kubelet renews its Lease object in the kube-node-lease namespace and periodically reports NodeStatus
  2. The node controller waits 40 seconds (default): for a heartbeat before marking the node NotReady
  3. If no heartbeat: Node marked NotReady; its Pods are evicted after the eviction timeout

NodeStatus includes:

  • Node conditions (Ready, MemoryPressure, DiskPressure, etc.)
  • Node capacity (CPU, memory, storage)
  • Running Pods
  • Container runtime status

Configuration:

# kubelet flag affecting heartbeat frequency
--node-status-update-frequency=10s  # How often to report node status

# kube-controller-manager flags
--node-monitor-grace-period=40s     # How long before marking NotReady
--pod-eviction-timeout=5m           # Pod eviction timeout (superseded by taint-based eviction)

46. How would you optimize application startup time in Kubernetes?

Answer: Strategies:

  1. Reduce image size:

    • Use distroless images
    • Multi-stage builds
    • Remove unnecessary layers
  2. Parallel initialization:

    • Use init containers efficiently
    • Parallel container startup
    • Async initialization in app
  3. Readiness probe tuning:

    • Quick initial checks
    • Aggressive probing during startup
    • Startup probes for slow apps
  4. Resource allocation:

    • Adequate CPU for startup
    • Proper memory allocation
    • Disk speed considerations

Example startup optimization:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: fast-startup-app
spec:
  template:
    spec:
      initContainers:
      - name: init
        image: init:latest
        # Initialization
      containers:
      - name: app
        image: myapp:latest
        startupProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
          failureThreshold: 30  # 150 seconds max
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 0  # Start immediately after startup
          periodSeconds: 2
        resources:
          requests:
            cpu: 500m  # Sufficient for startup
            memory: 256Mi
          limits:
            cpu: 1000m
            memory: 512Mi

47. Explain how Kubernetes scheduler works and how to influence scheduling decisions

Answer: Scheduler process:

  1. Filter: Find nodes meeting Pod requirements
  2. Score: Rank remaining nodes by fitness
  3. Select: Choose highest-scoring node

Filtering criteria:

  • Resource requests (CPU, memory)
  • Node selectors/affinity
  • Taints/tolerations
  • PVC availability
  • Kubelet ready status

Scoring plugins:

  • Resource balance (spread across nodes)
  • Pod affinity (place pods together/apart)
  • Inter-pod affinity/anti-affinity
  • Priority classes

Ways to influence scheduling:

  1. NodeSelector (simple):
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  nodeSelector:
    disktype: ssd  # Node must have this label
  2. Affinity (complex):
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: node-type
            operator: In
            values: ["worker"]
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values: ["database"]
        topologyKey: kubernetes.io/hostname  # Place with database
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: app
              operator: In
              values: ["app"]
          topologyKey: kubernetes.io/hostname  # Spread across nodes
  3. Taints and Tolerations:
# Taint node
kubectl taint nodes node1 gpu=true:NoSchedule

# Pod tolerates taint
apiVersion: v1
kind: Pod
metadata:
  name: gpu-app
spec:
  tolerations:
  - key: "gpu"
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"
  containers:
  - name: gpu
    image: gpu-app:latest

48. How would you implement secure communication between Kubernetes components?

Answer: Communication types and security:

  1. API server to kubelet:
# Configure kubelet to use client certificates
--kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt
--kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key
--kubelet-certificate-authority=/etc/kubernetes/pki/ca.crt

# Configure kubelet to verify API server
--kubeconfig=/etc/kubernetes/kubelet.conf
  2. API server to etcd:
# Configure API server etcd client certificates
--etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt
--etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key
--etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt
  3. Service-to-service communication:
# Use NetworkPolicy to restrict
# Use mTLS via service mesh (Istio, Linkerd)
# Or use sidecar proxies for encryption

Best practices:

  • Use TLS for all communication
  • Verify certificates regularly
  • Rotate certificates before expiration
  • Use strong key sizes (2048-bit minimum, 4096-bit preferred)
  • Enable audit logging for sensitive operations

49. Explain how Kubernetes handles container restart policies and Pod restart behavior

Answer: RestartPolicy (pod-level):

  • Always: Always restart (default)
  • OnFailure: Restart only on non-zero exit code
  • Never: Never restart

Restart behavior:

  • Exponential backoff: restart delay doubles (10s, 20s, 40s, …) up to a 5-minute cap, resetting after the container runs cleanly for 10 minutes
  • Restarted on same node (unless node fails)
  • RestartCount incremented each restart

Example:

apiVersion: v1
kind: Pod
metadata:
  name: app-with-restart-policy
spec:
  restartPolicy: OnFailure
  containers:
  - name: app
    image: myapp:latest
    lifecycle:
      preStop:
        exec:
          command: ["/bin/sh", "-c", "sleep 15"]  # Graceful shutdown

Job-specific:

  • Backoff limit: Max restarts before marking failed (default: 6)
  • Suspend: Pause job execution
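Both knobs live in the Job spec, for example:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: batch-task
spec:
  backoffLimit: 4   # mark the Job failed after 4 retries (default: 6)
  suspend: false    # set true to pause execution
  template:
    spec:
      restartPolicy: OnFailure  # Jobs allow only OnFailure or Never
      containers:
      - name: task
        image: myjob:latest
```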

Considerations:

  • Always restart: Suitable for long-running services
  • OnFailure: Suitable for batch jobs
  • Never: Suitable for one-time tasks

50. How would you handle schema-less Kubernetes API responses?

Answer: Dynamic client implementation (handling unknown fields):

# Python example using the official Kubernetes client
from kubernetes import client

# Fields this handler understands; anything else is treated as unknown
KNOWN_FIELDS = {'replicas', 'selector', 'template'}

def handle_dynamic_response(response_obj):
    """Handle Kubernetes API response with unknown fields"""
    # Convert to dict for flexible access
    obj_dict = response_obj.to_dict()

    # Access known fields with fallback
    name = obj_dict.get('metadata', {}).get('name', 'unknown')
    namespace = obj_dict.get('metadata', {}).get('namespace', 'default')

    # Separate potentially unknown fields
    spec = obj_dict.get('spec', {})
    unknown_fields = {k: v for k, v in spec.items()
                      if k not in KNOWN_FIELDS}

    return name, namespace, unknown_fields

Best practices:

  • Use dynamic clients for flexibility with CRDs
  • Implement proper error handling for missing fields
  • Validate schema using OpenAPI schemas
  • Document expected fields for operators/controllers
  • Handle API version differences gracefully

FAQ for Interview Preparation

Q: Should I memorize all these answers? A: No. Understand the concepts. Interviewers want to see your thinking process, not rote memorization. Use these as study guides.

Q: How should I practice for Kubernetes interviews? A: Combine strategies:

  • Study these questions deeply
  • Practice hands-on in actual Kubernetes clusters
  • Build projects and troubleshoot real problems
  • Take certification exams (practical exam experience helps)
  • Participate in code reviews and discussions

Q: What if I don’t know an answer in the interview? A: Best approach:

  • Be honest: “I don’t know, but here’s what I’d do to find out”
  • Show troubleshooting approach
  • Ask clarifying questions
  • Connect to related knowledge you do have
  • Interviewers respect honest uncertainty over incorrect confidence

Preparing for Your Kubernetes Interviews

Use Sailor.sh to prepare with exam simulators covering all Kubernetes certifications:

  • Practice questions matching interview difficulty levels
  • Performance analytics show your weak areas
  • Comprehensive coverage across all Kubernetes domains
  • Start with free questions at Sailor.sh Practice Tests
  • Upgrade to full access at Sailor.sh Full Platform

Your certification plus deep knowledge from these questions positions you for top Kubernetes roles.

Limited Time Offer: Get 80% off all Mock Exam Bundles | Sale ends in 7 days. Start learning today.
