📌 Key Takeaways
- 82% of cloud-native organisations use Kubernetes in production — K8s skills are now essential.
- CKA certification delivers a 25-35% salary premium for Kubernetes engineers.
- Kubernetes engineers in Bangalore earn ₹8-14 LPA at entry level, ₹22-38 LPA+ for senior roles.
- Thick Brain Technology offers the Best Kubernetes Course with real cluster labs, CKA prep, and placement support.
Kubernetes has become the operating system of the cloud-native world. In 2026, 82% of cloud-native organisations run Kubernetes in production. For DevOps engineers and cloud architects, Kubernetes proficiency is no longer a differentiator — it is a baseline requirement. This guide covers Kubernetes architecture, CKA certification preparation, career paths, and salary trends for Kubernetes practitioners looking for top-tier training.
📊 Kubernetes Market Snapshot — 2026
What is Kubernetes & Core Concepts?
Kubernetes (K8s) is an open-source container orchestration platform originally developed by Google, now maintained by the Cloud Native Computing Foundation (CNCF). It automates the deployment, scaling, and management of containerised applications across clusters of machines.
At its core, Kubernetes solves the problem of running hundreds or thousands of containers at scale. Instead of manually starting and connecting containers, you declare the desired state of your system in YAML manifests — and Kubernetes continuously works to maintain that state. Pods get scheduled, restarted, scaled, load-balanced, and updated automatically.
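To make "declaring desired state" concrete, here is a minimal sketch that builds a Deployment manifest as a plain Python dict and prints it as JSON (kubectl accepts JSON manifests as well as YAML). The application name `web`, the image tag `nginx:1.27`, and the helper `deployment_manifest` are illustrative assumptions, not part of any course material.

```python
import json


def deployment_manifest(name: str, image: str, replicas: int) -> dict:
    """Build a minimal apps/v1 Deployment as a plain dict (illustrative helper)."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            # Desired state: Kubernetes keeps this many pods running.
            "replicas": replicas,
            # The selector must match the labels on the pod template.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


# Hypothetical application name and image, chosen only for the example.
manifest = deployment_manifest("web", "nginx:1.27", replicas=3)
print(json.dumps(manifest, indent=2))
```

Saving the output to a file such as web.json and running kubectl apply -f web.json would hand this desired state to the cluster. Changing replicas and re-applying is how the state is updated declaratively.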
Kubernetes Control Plane Architecture
A Kubernetes cluster is divided logically into two parts: the Control Plane (which manages the state of the cluster) and the Worker Nodes (which run the application containers).
1. API Server (kube-apiserver)
The API Server is the entry point for all administrative tasks. It exposes a REST API over HTTP(S) that accepts JSON or YAML payloads and is used by kubectl, users, and internal control plane components. The API Server is stateless and persists all cluster data in etcd. It handles authentication, authorization, admission control, and request validation.
2. etcd (Distributed Key-Value Store)
etcd is a distributed, consistent, key-value store that acts as the cluster's single source of truth. It stores details such as active node count, running pods, secrets, and configuration states. etcd uses the Raft consensus algorithm to maintain data consistency. In production, etcd is deployed in HA mode with odd numbers of nodes (typically 3 or 5) to tolerate failures.
Production environments typically deploy etcd in one of two topologies:
- Stacked Control Plane Nodes — where etcd runs on the same control plane VMs alongside other components (simple, lower resource overhead).
- External etcd Nodes — where etcd is run on dedicated VMs separated from the control plane (highly secure, limits blast radius, recommended for large clusters).
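The odd-member recommendation for etcd follows from Raft's majority quorum. The short sketch below computes, for a given cluster size, how many members must agree and how many failures the cluster can survive. The function names are illustrative.

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with the given number of members."""
    return members // 2 + 1


def failures_tolerated(members: int) -> int:
    """Members that can fail while a majority is still reachable."""
    return members - quorum(members)


for n in range(1, 8):
    print(f"{n} members: quorum={quorum(n)}, tolerates {failures_tolerated(n)} failure(s)")
```

A 4-member cluster tolerates one failure, the same as a 3-member cluster, which is why adding an even member adds cost without adding resilience.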
3. Controller Manager (kube-controller-manager)
The Controller Manager runs various background controller processes that continuously monitor the actual state of the cluster and make changes to align it with the declared desired state. Examples include the Node Controller (detecting offline nodes), Replication Controller (maintaining the correct pod replica counts), and EndpointSlice Controller (linking services to pods).
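The reconciliation idea behind every controller can be sketched in a few lines. This is a toy model under simplifying assumptions (a single replica counter instead of real API objects). `ReplicaState` and `reconcile` are hypothetical names, not Kubernetes APIs.

```python
from dataclasses import dataclass


@dataclass
class ReplicaState:
    desired: int  # what the manifest declares
    actual: int   # what is currently running


def reconcile(state: ReplicaState) -> str:
    """One pass of a control loop: compare desired with actual and close the gap."""
    diff = state.desired - state.actual
    if diff > 0:
        state.actual += diff
        return f"created {diff} pod(s)"
    if diff < 0:
        state.actual = state.desired
        return f"deleted {-diff} pod(s)"
    return "in sync"


state = ReplicaState(desired=3, actual=1)
print(reconcile(state))   # created 2 pod(s)
state.actual -= 1         # simulate one pod crashing
print(reconcile(state))   # created 1 pod(s)
print(reconcile(state))   # in sync
```

Real controllers watch the API Server for changes rather than being called by hand, but the compare-and-correct step has the same shape.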
4. Scheduler (kube-scheduler)
The Scheduler watches for newly created pods that have no assigned node and selects the optimal node for them to run on. It determines the assignment through a two-phase pipeline:
- Filtering (Predicates) — Eliminates nodes that do not meet the pod's resource requests, node selectors, or tolerations.
- Scoring (Priorities) — Ranks the remaining nodes based on factors like image locality, node affinity rules, and balanced resource utilization to select the winner.
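The two phases above can be sketched as a filter followed by a ranking. This is a simplified model, not the kube-scheduler implementation: the `Node` and `Pod` classes, the node names, the image name, and the scoring weights are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str
    free_cpu_m: int    # free CPU in millicores
    free_mem_mi: int   # free memory in MiB
    labels: dict = field(default_factory=dict)
    cached_images: set = field(default_factory=set)


@dataclass
class Pod:
    cpu_m: int
    mem_mi: int
    image: str
    node_selector: dict = field(default_factory=dict)


def feasible(node: Node, pod: Pod) -> bool:
    """Filtering: the node must fit the requests and match the node selector."""
    fits = node.free_cpu_m >= pod.cpu_m and node.free_mem_mi >= pod.mem_mi
    selected = all(node.labels.get(k) == v for k, v in pod.node_selector.items())
    return fits and selected


def score(node: Node, pod: Pod) -> float:
    """Scoring: toy weights that favour image locality, then CPU headroom."""
    locality = 100 if pod.image in node.cached_images else 0
    headroom = (node.free_cpu_m - pod.cpu_m) / 100
    return locality + headroom


def schedule(nodes: List[Node], pod: Pod) -> Optional[str]:
    candidates = [n for n in nodes if feasible(n, pod)]
    if not candidates:
        return None  # no feasible node: the pod would stay Pending
    return max(candidates, key=lambda n: score(n, pod)).name


nodes = [
    Node("node-a", 500, 1024, {"disk": "ssd"}, {"shop/checkout:1.4"}),
    Node("node-b", 4000, 8192, {"disk": "ssd"}),
    Node("node-c", 8000, 16384, {"disk": "hdd"}),
]
pod = Pod(cpu_m=250, mem_mi=512, image="shop/checkout:1.4", node_selector={"disk": "ssd"})
print(schedule(nodes, pod))
```

Here node-c is filtered out by the node selector, and node-a outranks node-b because it already has the image cached. The real scheduler combines many more plugins with configurable weights.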
Worker Node Architecture
Worker nodes do the heavy lifting of running your application containers. Each worker node contains three essential components:
1. Kubelet
The Kubelet is a system agent running on each worker node. It registers the node with the API Server and watches for PodSpecs assigned to it. It instructs the local Container Runtime to start, stop, or update containers, and reports node and pod status back to the API Server. It runs a continuous synchronization loop to ensure the containers are healthy.
2. Kube-proxy
Kube-proxy is a network agent running on each node that implements the Kubernetes Service abstraction. It manages network rules on the host node (using iptables or the higher-performance IPVS mode) to load-balance traffic directed to Services across the backing pods. Some clusters replace kube-proxy with an eBPF-based data plane such as Cilium to reduce per-packet overhead.
3. Container Runtime
The Container Runtime is the software responsible for executing the actual containers. Kubernetes supports runtimes that comply with the Container Runtime Interface (CRI) standard. Common OCI-compliant runtimes include containerd (used by Docker and managed services) and CRI-O.
Kubernetes Learning Roadmap: 6-Stage Path
This roadmap is used in Thick Brain Technology's DevOps and Kubernetes Certification program — 60 hours of live training, real cluster labs, and CKA preparation.
- Stage 1 (Beginner): Container Fundamentals. Docker, images, container networking, registries, multi-stage builds.
- Stage 2 (Beginner): Kubernetes Core Concepts. Pods, Deployments, Services, Namespaces, ConfigMaps, Secrets.
- Stage 3 (Intermediate): Networking & Storage. Ingress, Network Policies, CSI, PersistentVolumes, StorageClasses.
- Stage 4 (Intermediate): Advanced Workloads. StatefulSets, DaemonSets, Jobs, CronJobs, Operators, Helm.
- Stage 5 (Advanced): Managed Kubernetes (EKS/AKS/GKE). Control plane management, node groups, scaling, cloud provider integration.
- Stage 6 (Advanced): GitOps & CKA Prep. ArgoCD, Flux, monitoring, troubleshooting, CKA exam preparation.
EKS vs AKS vs GKE: Choosing the Right Managed Kubernetes
If you are deploying K8s in a public cloud, utilizing managed services reduces cluster maintenance. Below is a structural comparison:
| Feature | Amazon EKS | Azure AKS | Google GKE |
|---|---|---|---|
| Cloud Provider | AWS | Microsoft Azure | Google Cloud |
| Control Plane Cost | $0.10/hr | Free tier; $0.10/hr on Standard tier | $0.10/hr, offset by a free-tier credit for one zonal or Autopilot cluster |
| Autopilot Mode | Fargate | Virtual Nodes | GKE Autopilot |
| Best For | AWS-centric orgs | Microsoft/Azure orgs | Advanced K8s features |
| AI Integration | SageMaker, Bedrock | Azure OpenAI | Vertex AI |
| K8s Version Support | Latest + 3 | Latest + 2 | Latest + 3 |
CKA Certification: Everything You Need to Know
The Certified Kubernetes Administrator (CKA) is the gold standard Kubernetes certification. Unlike multiple-choice exams, CKA is 100% performance-based — you solve real Kubernetes tasks in a live cluster environment over 2 hours.
| Exam Detail | Value |
|---|---|
| Exam Duration | 2 hours |
| Format | Performance-based (hands-on in live clusters) |
| Passing Score | 66% |
| Cost | USD 395 (includes one free retake) |
| Validity | 2 years for certifications earned on or after April 1, 2024 (previously 3 years) |
| Open-book | Yes — kubernetes.io documentation is accessible |
CKA Domain Weightings
- Cluster Architecture, Installation & Configuration — 25%
- Workloads & Scheduling — 15%
- Services & Networking — 20%
- Storage — 10%
- Troubleshooting — 30% ← heaviest domain
Top Kubernetes Certifications 2026
These are the certifications that appear most frequently in senior Kubernetes job descriptions across Bengaluru, Hyderabad and Pune.
Kubernetes Engineer Salary Guide 2026
Salary data based on Bangalore market rates, job postings, and Thick Brain placement data (2025–2026).
| Role | Experience | Bangalore Salary (2026) |
|---|---|---|
| Junior DevOps / K8s Engineer | 0-2 years | ₹6 – 10 LPA |
| DevOps Engineer (K8s specialist) | 2-5 years | ₹12 – 20 LPA |
| Senior K8s / Cloud Architect | 5-8 years | ₹20 – 32 LPA |
| Platform Engineer / SRE | 4-8 years | ₹18 – 30 LPA |
| CKA Premium (any level) | — | +25-35% above base |
Source: Naukri.com, LinkedIn Jobs, Thick Brain placement data, June 2026
Production Deployment Practices
Running Kubernetes in production requires adhering to security, resource, and continuous integration guidelines:
- GitOps & Continuous Delivery: Store the desired cluster state in Git. Continuous delivery tools like ArgoCD or Flux pull configuration from the repository and reconcile the cluster against it, eliminating manual kubectl apply commands.
- Resource Limits and Quotas: Unrestricted containers can exhaust a node's resources and cause memory pressure. Always declare CPU/memory requests (the amount reserved at scheduling time) and limits (the maximum allowed). Use LimitRanges in each namespace to enforce default values.
- Namespace Isolation: Separate teams into logical namespaces and restrict each team's access to its own. Enforce NetworkPolicies to define strict pod-to-pod microsegmentation.
- Security Contexts: Do not run containers as root. Configure securityContext.runAsNonRoot: true and disable privilege escalation.
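The resource and security guidelines above can be checked mechanically before a manifest reaches the cluster. The following is a minimal sketch of such a check over a pod spec held as a Python dict. `lint_pod_spec` is a hypothetical helper, and the sample specs are invented for the example. In practice this kind of policy is usually enforced in-cluster by admission controls.

```python
from typing import List


def lint_pod_spec(spec: dict) -> List[str]:
    """Return human-readable findings for a pod spec (illustrative checks only)."""
    findings = []
    pod_sc = spec.get("securityContext", {})
    for container in spec.get("containers", []):
        name = container.get("name", "<unnamed>")

        # Every container should declare CPU and memory requests and limits.
        resources = container.get("resources", {})
        for section in ("requests", "limits"):
            for res in ("cpu", "memory"):
                if res not in resources.get(section, {}):
                    findings.append(f"{name}: missing resources.{section}.{res}")

        # runAsNonRoot may be set on the pod or overridden per container.
        sc = container.get("securityContext", {})
        if sc.get("runAsNonRoot", pod_sc.get("runAsNonRoot", False)) is not True:
            findings.append(f"{name}: runAsNonRoot is not true")

        # allowPrivilegeEscalation is a container-level field.
        if sc.get("allowPrivilegeEscalation") is not False:
            findings.append(f"{name}: allowPrivilegeEscalation is not false")
    return findings


risky = {"containers": [{"name": "web", "image": "nginx:1.27"}]}
hardened = {
    "securityContext": {"runAsNonRoot": True},
    "containers": [{
        "name": "web",
        "image": "nginx:1.27",
        "resources": {
            "requests": {"cpu": "250m", "memory": "256Mi"},
            "limits": {"cpu": "500m", "memory": "512Mi"},
        },
        "securityContext": {"allowPrivilegeEscalation": False},
    }],
}

for finding in lint_pod_spec(risky):
    print(finding)
print(lint_pod_spec(hardened))  # []
```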
Real-World Use Cases
Use Case 1: Scaling Stateless Microservices
A high-traffic e-commerce store sees sudden load surges during flash sales. The checkout microservice runs as a stateless Deployment with a Horizontal Pod Autoscaler (HPA) attached, which tracks CPU utilization against a target threshold. As load rises, the HPA increases the replica count, and the Service spreads traffic across the new pods.
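The scaling decision in this scenario follows the formula given in the Kubernetes HPA documentation: desired replicas = ceil(current replicas × current metric / target metric), clamped to the configured minimum and maximum. The sketch below implements only that core formula. It leaves out the tolerance band and stabilization window that the real controller applies, and the sample numbers are invented.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, min_replicas: int, max_replicas: int) -> int:
    """Core HPA formula, clamped to the configured replica bounds."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# Flash sale: 4 replicas averaging 90% CPU against a 60% target.
print(desired_replicas(4, 90, 60, min_replicas=2, max_replicas=20))   # 6
# Quiet period: 10 replicas averaging 20% CPU.
print(desired_replicas(10, 20, 60, min_replicas=2, max_replicas=20))  # 4
# Extreme spike: the result is capped at max_replicas.
print(desired_replicas(4, 600, 60, min_replicas=2, max_replicas=20))  # 20
```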
Use Case 2: Deploying Stateful Applications
Databases like PostgreSQL require stable network identities and persistent storage. With a StatefulSet instead of a Deployment, pods get sequential, stable names (e.g. db-0, db-1), and each one is bound to its own PersistentVolume (PV), provisioned through a StorageClass. When a pod crashes or is rescheduled to another node, it reattaches to the same volume, so its data persists.
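The stable identities a StatefulSet provides follow predictable naming rules, sketched below. The sketch assumes a headless Service governing the StatefulSet, the default cluster domain cluster.local, and a volumeClaimTemplate named data. The namespace `prod` and the helper name are invented for the example.

```python
from typing import Dict, List


def statefulset_identities(name: str, replicas: int, service: str,
                           namespace: str, claim_template: str) -> List[Dict[str, str]]:
    """Derive the pod name, per-pod DNS name, and PVC name for each replica."""
    identities = []
    for ordinal in range(replicas):
        pod = f"{name}-{ordinal}"
        identities.append({
            "pod": pod,
            # Per-pod DNS record served through the governing headless Service.
            "dns": f"{pod}.{service}.{namespace}.svc.cluster.local",
            # PVC name is <volumeClaimTemplate name>-<pod name>.
            "pvc": f"{claim_template}-{pod}",
        })
    return identities


for identity in statefulset_identities("db", 2, service="db", namespace="prod", claim_template="data"):
    print(identity)
```

Because the PVC name is derived from the pod name, a recreated db-0 binds to the same claim (data-db-0) and therefore to the same volume.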
Kubernetes Troubleshooting Guide
DevOps engineers must know how to diagnose common cluster errors:
1. Pod Stuck in Pending State
A pod stays in Pending state when the scheduler cannot assign it to a node. Running kubectl describe pod <pod-name> exposes scheduling events. Common causes include insufficient CPU/Memory resources on existing nodes, failed node selectors, or unfulfilled tolerations for node taints.
2. Pod Stuck in CrashLoopBackOff
This signifies the container is continually starting and crashing. Check application logs via kubectl logs <pod-name> --previous to inspect the failure logs of the crashed container instance. Common causes include configuration errors, unhandled exceptions, missing database connectivity credentials, or an exit code 137 (OOMKilled, meaning the container exceeded its allocated memory limits).
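Exit codes above 128 conventionally mean the process was killed by a signal (code minus 128), which is how 137 maps to SIGKILL. The small helper below decodes the codes most often seen during CrashLoopBackOff triage. It is a sketch with a deliberately short signal table, and `explain_exit_code` is a hypothetical name.

```python
# Short, hand-written table of common Linux signal numbers.
SIGNALS = {9: "SIGKILL", 11: "SIGSEGV", 15: "SIGTERM"}


def explain_exit_code(code: int) -> str:
    """Translate a container exit code into a first troubleshooting hint."""
    if code == 0:
        return "exited successfully"
    if code > 128:
        sig = code - 128
        name = SIGNALS.get(sig, f"signal {sig}")
        hint = " (often the OOM killer; look for reason OOMKilled)" if sig == 9 else ""
        return f"terminated by {name}{hint}"
    if code == 126:
        return "command found but not executable"
    if code == 127:
        return "command not found"
    return "application exited with an error"


for code in (0, 1, 127, 137, 143):
    print(code, "->", explain_exit_code(code))
```

An exit code of 137 alone does not prove an out-of-memory kill. The container's last state in kubectl describe pod shows the reason OOMKilled when the memory limit was the cause.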
3. ImagePullBackOff
This indicates the container runtime failed to download the image. Verify the image repository URL, tag references, and ensure secret credentials (like imagePullSecrets) are mapped correctly for private container registries.
100 Kubernetes Interview Questions & Answers (2026)
The most comprehensive Kubernetes interview question bank for Bangalore tech companies, covering core concepts, networking, storage, security, troubleshooting, and CKA prep.
- … mysql-0, mysql-1) and a dedicated PersistentVolume. Pods start/stop in order. Use for databases (MySQL, PostgreSQL, MongoDB), message queues (Kafka), and any app requiring stable network identity or persistent per-pod storage.
- … localhost. Pods are ephemeral: they are created and destroyed frequently. In practice, you rarely create Pods directly; you use higher-level controllers like Deployments.
- … api.example.com → api-service, app.example.com → web-service. Requires an Ingress Controller (nginx-ingress, AWS ALB Ingress, Traefik).
- … dev, staging, production, or by team. Use ResourceQuotas to limit CPU/memory per namespace. Use LimitRanges to set default requests/limits for pods. Note: namespaces do NOT provide network isolation by default; add NetworkPolicies for that. Cluster-scoped resources (nodes, PVs, ClusterRoles) are not namespaced.
- … kubectl rollout status deployment/app.
- … kubectl top pods to measure actual usage before setting values.
- … */5 * * * * (every 5 minutes). Important: CronJobs may skip executions if the controller is down; for critical tasks, use a dedicated job scheduler with retry logic.
- … Pending. Check: kubectl describe pod <name> shows scheduler events explaining why a pod is Pending.
- … <service-name>.<namespace>.svc.cluster.local. Pods within the same namespace can use just the service name. Cross-namespace: use the full FQDN. For StatefulSet pods, each gets its own DNS: pod-0.service.namespace.svc.cluster.local. This is how microservices discover each other, with no hardcoded IPs. Headless services (ClusterIP: None) return individual pod IPs directly, used by StatefulSets and service meshes.
- … api pod to access the db pod on port 5432, deny all other ingress. Requires a CNI that supports NetworkPolicy (Calico, Cilium; Flannel alone does not). Start with a default-deny policy in each namespace, then add explicit allow rules. Essential for PCI-DSS and SOC2 compliance.
- … ClusterIP: None. It returns the DNS of individual pod IPs instead of a single service IP. Use cases: (1) StatefulSets: each pod needs a stable network identity. (2) Service discovery: when you need to directly address pods (e.g., for database clusters). (3) Custom load balancing: when you want to implement your own load balancing logic.
- … Gateway (proxies), HTTPRoute (routing rules). Gateway API is vendor-neutral and supported by all major Ingress controllers. It is the successor to Ingress and recommended for new projects.
- … kubernetes.io/ingress.class: nginx and cert-manager.io/cluster-issuer: letsencrypt-prod. (2) Service with LoadBalancer: upload TLS certificate to the cloud provider's load balancer (AWS ALB, Azure LB). cert-manager is the preferred approach for production: it's automated, free, and integrates with most Ingress controllers.
- … volumes and volumeMounts.
- … gp2 for AWS EBS, azure-disk for Azure). It includes provisioner (the driver that creates the PV), parameters (size, IOPS, encryption), and reclaimPolicy (Retain, Delete). With dynamic provisioning, when a PVC requests a StorageClass, Kubernetes automatically creates a PV matching the class. This is the standard way to manage storage in production.
- … Released state, so you can manually recover the data. Delete: when the PVC is deleted, the PV and underlying storage are automatically deleted. Use Retain for production databases (you want to prevent accidental data loss). Use Delete for ephemeral storage (caches, temporary data).
- … gp3). (3) Create a PVC referencing the StorageClass. (4) The EBS CSI Driver automatically creates an EBS volume and attaches it to the pod. (5) The pod mounts the volume via volumeMounts. For production, use EFS for shared storage (ReadWriteMany) and EBS for single-pod storage (ReadWriteOnce).
- … mysqldump or pg_dump in a Job. (3) Velero (formerly Heptio Ark): backs up entire cluster state (resources + volumes). Velero supports CSI snapshots and is the standard backup solution for Kubernetes. For production, run automated backups with Velero and store them in S3 or Azure Blob.
- … VolumeSnapshot (the resource), VolumeSnapshotClass (configuration). Restore by creating a new PVC from the snapshot. Benefits: (1) Fast recovery: minutes instead of hours. (2) Cost-effective: incremental snapshots. (3) Consistent backups: for database volumes, use pre-stop hooks to flush data before snapshot. Use Velero to automate snapshot creation and retention.
- … pvc.spec.resources.requests.storage update). (4) GCE PD: resizable in place. Best practice: Use dynamic provisioning with a StorageClass that supports expansion (allowVolumeExpansion: true). For EBS, migrate to a larger volume using snapshot restore.
- … Role with get, list, watch verbs on pods resource, create a ServiceAccount, bind them with a RoleBinding. View effective permissions with kubectl auth can-i list pods --as=system:serviceaccount:namespace:sa-name. Always use namespace-scoped Roles over ClusterRoles unless cluster-wide access is genuinely needed.
- … cluster-admin). (3) Permissions used with a ClusterRoleBinding (binding applies to all namespaces).
- … privileged (allow all), baseline (restrict known escalations), restricted (strict). PSA is enabled by default in Kubernetes 1.24+.
- … cluster-admin for service accounts. (3) Use Pod Security Admission to restrict privileged containers. (4) Set network policies to limit pod-to-pod communication. (5) Use secrets management (Vault, External Secrets). Least privilege reduces attack surface and limits blast radius of security incidents.
- … --anonymous-auth=false. (3) Enable RBAC: --authorization-mode=RBAC. (4) Use webhook authentication: integrate with OIDC (Azure AD, Google). (5) Restrict access to etcd: etcd holds secrets and cluster state, so use TLS and a firewall. (6) Enable audit logging: track API access. (7) Use a network policy to restrict API server access to trusted sources.
- … kubectl describe pod <name>, then check the Events section for scheduling failures. Step 2: Check for insufficient resources (CPU/memory). Step 3: Check node selector or affinity (no nodes match the criteria). Step 4: Check taints and tolerations: if nodes are tainted, the pod needs a toleration. Step 5: Check PVC binding: if the pod requires a PVC that is not bound. Step 6: If all else fails, check for node capacity using kubectl get nodes --show-labels.
- … kubectl describe pod <name>, then check the Events section for OOMKilled, failed probes, or image pull errors. Step 2: kubectl logs <pod> --previous to see logs from the crashed container. Step 3: Check exit code in describe output: exit code 1 is application error, 137 is OOMKilled, 126/127 is missing executable. Common fixes: increase memory limits (OOMKilled), fix liveness probe timing, correct the ENTRYPOINT command, or fix application startup errors visible in logs.
- … kubectl describe pod/<name> shows detailed information about a pod: (1) Events: the most useful part (scheduling failures, image pull issues, OOM). (2) Status: pod phase (Running, Pending, Failed). (3) Node: where the pod is scheduled. (4) Containers: image, ports, volume mounts, probe settings. (5) QoS: Guaranteed, Burstable, BestEffort. Always start troubleshooting with kubectl describe.
- … -c flag: kubectl logs <pod-name> -c <container-name>. To see logs from all containers: kubectl logs <pod-name> --all-containers=true. For a continuous stream: kubectl logs -f <pod-name> -c <container-name>. For multi-container pods (e.g., sidecar patterns), this is essential. Also, kubectl logs --previous shows logs from a previous (crashed) container instance.
- … kubectl exec -it <pod-name> -- /bin/sh (or /bin/bash). This opens an interactive shell inside the container. If the container doesn't have a shell, you can use kubectl debug -it <pod-name> --image=busybox --share-processes to spin up a debug container sharing the target pod's process namespace.
- … kubectl get service --show-labels and verify the selector matches pods. Step 2: kubectl describe service, then check the Endpoints list. If empty, no pods match the selector. Step 3: kubectl get pods --show-labels to confirm pods have the labels the Service expects. Step 4: Check port mapping: service.spec.ports.targetPort must match containerPort. Step 5: Check network policy: if a NetworkPolicy blocks ingress to the pods.
- … kubectl get nodes, then check status. If NotReady, Step 2: kubectl describe node <node-name> and look at the Conditions section (OutOfDisk, MemoryPressure, DiskPressure, PIDPressure). Step 3: SSH into the node and check: systemctl status kubelet, journalctl -u kubelet -f. Step 4: Check disk space: df -h. Step 5: Check Docker/containerd: systemctl status docker. Common causes: kubelet not running, out of disk space, network issues.
- … initialDelaySeconds for both to avoid premature failures.
- … kubectl port-forward pod/<pod-name> 8080:80 forwards local port 8080 to pod port 80. This bypasses Services and NetworkPolicies, which is useful for debugging a single pod directly. For Services: kubectl port-forward service/<service-name> 8080:80. Use port-forward for: (1) Testing an unreleased version. (2) Accessing a debug endpoint. (3) Debugging a specific pod behind a load balancer. Never use port-forward in production.
- … minAvailable: 2 ensures at least 2 pods are running at all times during a drain. Without a PDB, draining a node could evict all replicas of a 3-replica deployment. Set PDBs for every production Deployment: it prevents accidental downtime and is required for passing most security audits.
- … Database, MongoDBCluster). CRDs are simple to create: you define the schema (OpenAPI). API Extension (Aggregated API) is a more complex approach where you build a separate API server and register it with the main API server. Use CRD for most extensions (95% of cases). Use API Aggregation for advanced authentication/authorisation, or when you need a separate API server.
- … values.yaml for configuration. Benefits: (1) Install complex applications with one command (helm install my-nginx ingress-nginx/ingress-nginx). (2) Manage environment-specific config via values files. (3) Upgrade and rollback releases. (4) Share reusable charts via Artifact Hub. In CI/CD, Helm is used to parameterise deployments: update the image tag in values.yaml and run helm upgrade.
- … kubectl autoscale deployment web --cpu-percent=70 --min=2 --max=20. The HPA controller checks metrics every 15 seconds. Requires metrics-server installed in the cluster. Advanced HPA can scale on custom metrics (e.g., requests per second from Prometheus via the custom.metrics.k8s.io API). Set meaningful minimum replicas to avoid cold-start latency, and ensure resource requests are set (HPA needs them for percentage calculation).
- … MutatingWebhookConfiguration or ValidatingWebhookConfiguration.
- … tolerations: [{"key": "gpu", "operator": "Equal", "value": "true", "effect": "NoSchedule"}].
- … kubectl run --image, kubectl create deployment). (4) Know the documentation: the exam is open-book, but you need to know what to search for. (5) Time management: don't spend too long on one question. Thick Brain's CKA preparation course includes full practice exams.
- … kubectl run, kubectl expose are faster than writing YAML. (3) Keep a cheat sheet: common commands and YAML snippets (allowed). (4) Practice with a timer: simulate 2-hour exam with 15-20 questions. (5) Don't overthink: most questions have a straightforward solution. (6) Use aliases: alias k=kubectl saves time.
- … --image vs --image-pull-policy. (3) Forgetting to switch to the correct namespace: many questions require -n or --namespace. (4) Not testing after creating a resource: check pod status before moving on. (5) Spending too much time on one question: flag and revisit. (6) Not using the documentation: you have access to kubernetes.io, so use it.
- … https://kubernetes.io/docs. Tips: (1) Bookmark key pages: kubernetes.io/docs/reference/kubectl/, kubernetes.io/docs/tasks/. (2) Use search: Ctrl+F is faster than navigating. (3) Look for examples: most pages have YAML examples you can copy and modify. (4) Don't rely on memory: know what to search for, not the exact syntax. (5) Practice with the documentation: during your preparation, use the docs to solve problems.
Frequently Asked Questions
Conclusion: Master Kubernetes in 2026
Kubernetes proficiency is no longer optional for DevOps engineers and cloud architects in 2026. Whether you specialise in AWS EKS, Azure AKS, or Google GKE, the core Kubernetes skills transfer across all platforms. Pursuing the CKA certification validates these skills in a way that employers trust.
The market rewards engineers who combine strong Kubernetes fundamentals with real cluster experience. CKA-certified engineers command 25-35% salary premiums and are in high demand across Bangalore's top tech companies.
Thick Brain Technology offers advanced Kubernetes training on all three managed platforms — EKS, AKS and GKE — with real cluster environments and CKA-aligned practice scenarios. Book a free demo class to deploy your first Kubernetes application live.
