📌 Key Takeaways

  • 82% of cloud-native organisations use Kubernetes in production — K8s skills are now essential.
  • CKA certification delivers 25-35% salary premium for Kubernetes engineers.
  • Kubernetes engineers in Bangalore earn ₹6-10 LPA at entry level and ₹20-32 LPA in senior roles.
  • Thick Brain Technology offers the Best Kubernetes Course with real cluster labs, CKA prep, and placement support.

Kubernetes has become the operating system of the cloud-native world. In 2026, 82% of cloud-native organisations run Kubernetes in production. For DevOps engineers and cloud architects, Kubernetes proficiency is no longer a differentiator — it is a baseline requirement. This guide covers Kubernetes architecture, CKA certification preparation, career paths, and salary trends for Kubernetes practitioners looking for top-tier training.

📊 Kubernetes Market Snapshot — 2026

  • 82%: cloud-native orgs use Kubernetes in production
  • Top 3: most-hired DevOps profile in Bangalore
  • 25-35%: salary premium for CKA certification
  • 70%+: managed K8s workloads via EKS/AKS/GKE

What is Kubernetes & Core Concepts?

Kubernetes (K8s) is an open-source container orchestration platform originally developed by Google, now maintained by the Cloud Native Computing Foundation (CNCF). It automates the deployment, scaling, and management of containerised applications across clusters of machines.

At its core, Kubernetes solves the problem of running hundreds or thousands of containers at scale. Instead of manually starting and connecting containers, you declare the desired state of your system in YAML manifests — and Kubernetes continuously works to maintain that state. Pods get scheduled, restarted, scaled, load-balanced, and updated automatically.

💡 Why Kubernetes in 2026? Kubernetes is the industry standard for container orchestration. Every major cloud provider offers a managed K8s service (EKS, AKS, GKE), and the job market for Kubernetes-skilled engineers continues to grow faster than supply. To excel, enrollment in a structured, industry-level Kubernetes Course in Bangalore is highly recommended.

Kubernetes Control Plane Architecture

A Kubernetes cluster is divided logically into two parts: the Control Plane (which manages the state of the cluster) and the Worker Nodes (which run the application containers).

1. API Server (kube-apiserver)

The API Server is the entry point for all administrative tasks. It exposes a REST API over HTTPS that kubectl, users, and the other control plane components all use; manifests written in YAML are converted to JSON by the client before they are sent. The API Server itself is stateless and persists all cluster data in etcd. It handles authentication, authorization, admission control, and request validation.

2. etcd (Distributed Key-Value Store)

etcd is a distributed, consistent, key-value store that acts as the cluster's single source of truth. It stores details such as active node count, running pods, secrets, and configuration states. etcd uses the Raft consensus algorithm to maintain data consistency. In production, etcd is deployed in HA mode with odd numbers of nodes (typically 3 or 5) to tolerate failures.
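The preference for odd member counts follows from majority-quorum arithmetic. The sketch below is illustrative only; the function names are hypothetical and it models nothing beyond the majority rule that Raft relies on.

```python
def etcd_quorum(members: int) -> int:
    """Smallest majority of a cluster with `members` voting members."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """How many members can fail while a majority is still reachable."""
    return members - etcd_quorum(members)


if __name__ == "__main__":
    for n in range(1, 8):
        print(f"{n} members: quorum={etcd_quorum(n)}, "
              f"tolerates {fault_tolerance(n)} failure(s)")
```

Going from 3 to 4 members raises the quorum from 2 to 3 without raising the number of tolerated failures, which is why an even-sized cluster adds cost but no resilience.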

Production environments typically deploy etcd in one of two topologies:

  • Stacked Control Plane Nodes — where etcd runs on the same control plane VMs alongside other components (simple, lower resource overhead).
  • External etcd Nodes — where etcd is run on dedicated VMs separated from the control plane (highly secure, limits blast radius, recommended for large clusters).

3. Controller Manager (kube-controller-manager)

The Controller Manager runs background controller processes that continuously compare the actual state of the cluster with the declared desired state and make changes to close the gap. Examples include the Node Controller (detecting offline nodes), the ReplicaSet controller (maintaining the correct pod replica counts), and the EndpointSlice Controller (linking Services to pods).
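The "observe, compare, act" pattern that controllers follow can be sketched in a few lines. This is a toy in-memory model, not how kube-controller-manager is implemented: the class, the `web-N` pod names, and the action strings are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ReplicaState:
    """Hypothetical stand-in for a replica-count controller's view of the world."""
    desired_replicas: int
    pods: List[str] = field(default_factory=list)
    next_id: int = 0


def reconcile(state: ReplicaState) -> List[str]:
    """Run one reconciliation pass and return the actions taken."""
    actions = []
    # Too few pods: create until the actual count matches the desired count.
    while len(state.pods) < state.desired_replicas:
        name = f"web-{state.next_id}"
        state.next_id += 1
        state.pods.append(name)
        actions.append(f"create {name}")
    # Too many pods: delete the surplus.
    while len(state.pods) > state.desired_replicas:
        actions.append(f"delete {state.pods.pop()}")
    return actions


if __name__ == "__main__":
    state = ReplicaState(desired_replicas=3)
    print(reconcile(state))        # initial creation
    state.pods.remove("web-1")     # simulate a crashed pod
    print(reconcile(state))        # the controller replaces it
    print(reconcile(state))        # nothing to do once states match
```

The loop is idempotent: once actual and desired state agree, another pass produces no actions, so it is safe to run continuously.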

4. Scheduler (kube-scheduler)

The Scheduler watches for newly created pods that have no assigned node and selects the optimal node for them to run on. It determines the assignment through a two-phase pipeline:

  • Filtering (Predicates) — Eliminates nodes that do not meet the pod's resource requests, node selectors, or tolerations.
  • Scoring (Priorities) — Ranks the remaining nodes based on factors like image locality, node affinity rules, and balanced resource utilization to select the winner.
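The two phases can be sketched as a filter followed by a ranking. The data classes and the scoring weights below are invented for illustration; the real scheduler runs a configurable set of plugins, so treat this as a model of the idea rather than of kube-scheduler itself.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class Node:
    name: str
    free_cpu_m: int            # free CPU in millicores
    free_mem_mi: int           # free memory in MiB
    labels: Dict[str, str]
    cached_images: Set[str]


@dataclass
class Pod:
    image: str
    cpu_request_m: int
    mem_request_mi: int
    node_selector: Dict[str, str]


def feasible(node: Node, pod: Pod) -> bool:
    """Filtering: the node must fit the requests and match every selector label."""
    fits = (node.free_cpu_m >= pod.cpu_request_m
            and node.free_mem_mi >= pod.mem_request_mi)
    matches = all(node.labels.get(k) == v for k, v in pod.node_selector.items())
    return fits and matches


def score(node: Node, pod: Pod) -> int:
    """Scoring (toy weights): prefer a cached image, then more CPU headroom."""
    image_bonus = 50 if pod.image in node.cached_images else 0
    headroom = (node.free_cpu_m - pod.cpu_request_m) // 100
    return image_bonus + headroom


def schedule(nodes: List[Node], pod: Pod) -> Optional[str]:
    """Return the chosen node name, or None if the pod would stay Pending."""
    candidates = [n for n in nodes if feasible(n, pod)]
    if not candidates:
        return None
    return max(candidates, key=lambda n: score(n, pod)).name
```

A `None` result corresponds to a pod that stays Pending because no node survived filtering.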

Worker Node Architecture

Worker nodes do the heavy lifting of running your application containers. Each worker node contains three essential components:

1. Kubelet

The Kubelet is a system agent running on each worker node. It registers the node with the API Server and watches for PodSpecs assigned to it. It instructs the local Container Runtime to start, stop, or update containers, and reports node and pod status back to the API Server. It runs a continuous synchronization loop to ensure the containers are healthy.

2. Kube-proxy

Kube-proxy is a network agent running on each node that implements the Kubernetes Service abstraction. It manages network rules on the host node (using iptables or high-performance IPVS mode) to load-balance traffic directed to Services across backing pods. In modern setups, kube-proxy is increasingly replaced by eBPF-based technologies like Cilium for extreme performance.

3. Container Runtime

The Container Runtime is the software responsible for executing the actual containers. Kubernetes supports runtimes that comply with the Container Runtime Interface (CRI) standard. Common OCI-compliant runtimes include containerd (used by Docker and managed services) and CRI-O.

Kubernetes Learning Roadmap: 6-Stage Path

This roadmap is used in Thick Brain Technology's DevOps and Kubernetes Certification program — 60 hours of live training, real cluster labs, and CKA preparation.

  • 🐳 Stage 1 (Beginner): Container Fundamentals. Docker, images, container networking, registries, multi-stage builds.
  • Stage 2 (Beginner): Kubernetes Core Concepts. Pods, Deployments, Services, Namespaces, ConfigMaps, Secrets.
  • 🌐 Stage 3 (Intermediate): Networking & Storage. Ingress, Network Policies, CSI, PersistentVolumes, StorageClasses.
  • 🔄 Stage 4 (Intermediate): Advanced Workloads. StatefulSets, DaemonSets, Jobs, CronJobs, Operators, Helm.
  • ☁️ Stage 5 (Advanced): Managed Kubernetes (EKS/AKS/GKE). Control plane management, node groups, scaling, cloud provider integration.
  • 📊 Stage 6 (Advanced): GitOps & CKA Prep. ArgoCD, Flux, monitoring, troubleshooting, CKA exam preparation.

EKS vs AKS vs GKE: Choosing the Right Managed Kubernetes

If you are deploying K8s in a public cloud, utilizing managed services reduces cluster maintenance. Below is a structural comparison:

Feature | Amazon EKS | Azure AKS | Google GKE
Cloud Provider | AWS | Microsoft Azure | Google Cloud
Control Plane Cost | $0.10/hr | Free | Free (Standard) / $0.10/hr (Autopilot)
Autopilot Mode | Fargate | Virtual Nodes | GKE Autopilot
Best For | AWS-centric orgs | Microsoft/Azure orgs | Advanced K8s features
AI Integration | SageMaker, Bedrock | Azure OpenAI | Vertex AI
K8s Version Support | Latest + 3 | Latest + 2 | Latest + 3

CKA Certification: Everything You Need to Know

The Certified Kubernetes Administrator (CKA) is the gold standard Kubernetes certification. Unlike multiple-choice exams, CKA is 100% performance-based — you solve real Kubernetes tasks in a live cluster environment over 2 hours.

Exam Detail | Value
Exam Duration | 2 hours
Format | Performance-based (hands-on in live clusters)
Passing Score | 66%
Cost | USD 395 (includes one free retake)
Validity | 2 years (certifications earned before April 2024 were valid for 3 years)
Open-book | Yes, kubernetes.io documentation is accessible

CKA Domain Weightings

  • Cluster Architecture, Installation & Configuration — 25%
  • Workloads & Scheduling — 15%
  • Services & Networking — 20%
  • Storage — 10%
  • Troubleshooting — 30% ← heaviest domain

Top Kubernetes Certifications 2026

These are the certifications that appear most frequently in senior Kubernetes job descriptions across Bengaluru, Hyderabad and Pune.

⚓ Most Respected
CKA — Certified Kubernetes Administrator
By CNCF. 100% performance-based in a live K8s cluster. Universally recognised by Indian and global employers. Best first certification for K8s engineers.
☁️ AWS
AWS Certified Kubernetes (EKS)
For engineers focused on AWS EKS — covers control plane management, node groups, IAM integration, and EKS best practices.
☁️ Azure
Azure Kubernetes (AKS) Certification
Validates AKS skills — cluster setup, networking, security, monitoring, and CI/CD integration with Azure DevOps.
☁️ GCP
GKE Professional Certification
For Google Cloud-focused engineers — covers GKE architecture, autopilot, security, and advanced networking features.
🎓 Practitioner
Thick Brain Advanced Kubernetes (EKS)
60 hours live training, real EKS cluster labs, CKA preparation, Helm, GitOps, ArgoCD, and placement support until hired.

Kubernetes Engineer Salary Guide 2026

Salary data based on Bangalore market rates, job postings, and Thick Brain placement data (2025–2026).

Role | Experience | Bangalore Salary (2026)
Junior DevOps / K8s Engineer | 0-2 years | ₹6 – 10 LPA
DevOps Engineer (K8s specialist) | 2-5 years | ₹12 – 20 LPA
Senior K8s / Cloud Architect | 5-8 years | ₹20 – 32 LPA
Platform Engineer / SRE | 4-8 years | ₹18 – 30 LPA
CKA Premium | any level | +25-35% above base

Source: Naukri.com, LinkedIn Jobs, Thick Brain placement data, June 2026

Production Deployment Practices

Running Kubernetes in production requires adhering to security, resource, and continuous integration guidelines:

  • GitOps & Continuous Delivery: keep the desired cluster state in Git. Continuous delivery tools like ArgoCD or Flux pull the configuration and reconcile the cluster against it, eliminating manual kubectl apply commands.
  • Resource Limits and Quotas: containers without limits can exhaust a node's resources and cause memory pressure. Always declare CPU/memory requests (the amount the scheduler reserves) and limits (the maximum allowed). Use LimitRanges in each namespace to enforce default values.
  • Namespace Isolation: restrict each team's access by giving it its own namespace. Enforce NetworkPolicies to define strict pod-to-pod network microsegmentation.
  • Security Contexts: do not run containers as root. Restrict process privileges by setting securityContext.runAsNonRoot: true and securityContext.allowPrivilegeEscalation: false.
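Two of these guidelines, resource requests/limits and security contexts, are easy to check mechanically. The sketch below lints one container spec represented as a plain dict whose keys mirror the manifest fields; the function name and the wording of the findings are assumptions, and in a real cluster an admission policy would normally do this job.

```python
from typing import List


def lint_container(container: dict) -> List[str]:
    """Return human-readable findings for one container spec (a plain dict)."""
    findings = []
    resources = container.get("resources", {})
    # Every container should declare CPU and memory requests and limits.
    for section in ("requests", "limits"):
        for resource in ("cpu", "memory"):
            if resource not in resources.get(section, {}):
                findings.append(f"missing resources.{section}.{resource}")
    security = container.get("securityContext", {})
    # Both settings must be set explicitly; an absent key counts as a finding.
    if security.get("runAsNonRoot") is not True:
        findings.append("securityContext.runAsNonRoot is not true")
    if security.get("allowPrivilegeEscalation") is not False:
        findings.append("securityContext.allowPrivilegeEscalation is not false")
    return findings


if __name__ == "__main__":
    hardened = {
        "name": "checkout",
        "resources": {
            "requests": {"cpu": "250m", "memory": "256Mi"},
            "limits": {"cpu": "500m", "memory": "512Mi"},
        },
        "securityContext": {"runAsNonRoot": True, "allowPrivilegeEscalation": False},
    }
    print(lint_container(hardened))            # no findings
    print(lint_container({"name": "legacy"}))  # every check fails
```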

Real-World Use Cases

Use Case 1: Scaling Stateless Microservices

A high-traffic e-commerce store sees sudden load surges during flash sales. Running the checkout microservice as a stateless Deployment with a Horizontal Pod Autoscaler (HPA) attached lets Kubernetes track CPU utilisation against a target. As load rises, the HPA raises the replica count, and the Service spreads traffic across the new pods.
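The scaling decision follows the formula given in the Kubernetes documentation: desired replicas = ceil(current replicas × current metric / target metric), skipped when the ratio is within a tolerance of 1.0 and clamped to the configured bounds. The sketch below reproduces only that arithmetic; the function signature is hypothetical, and the real controller also handles unready pods, missing metrics, and stabilisation windows.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     min_replicas: int = 1,
                     max_replicas: int = 10,
                     tolerance: float = 0.1) -> int:
    """Replica count an HPA-style controller would aim for."""
    ratio = current_metric / target_metric
    # Within tolerance of the target: leave the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    # Never scale outside the configured bounds.
    return max(min_replicas, min(max_replicas, wanted))


if __name__ == "__main__":
    # 4 replicas averaging 90% CPU against a 60% target.
    print(desired_replicas(4, current_metric=90, target_metric=60))
```

With 4 replicas at 90% average CPU and a 60% target, the ratio is 1.5 and the controller aims for 6 replicas.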

Use Case 2: Deploying Stateful Applications

Databases like PostgreSQL require stable network identities and persistent storage. With a StatefulSet instead of a Deployment, pods get sequential names (e.g. db-0, db-1), and each one is bound to its own PersistentVolume (PV), provisioned through a StorageClass. When a pod crashes or is rescheduled to another node, it comes back with the same name and reattaches to the same volume, so its data is preserved.
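The naming scheme is deterministic, which is what makes the identities stable. As a small illustration (the helper names are hypothetical, and `data` is an assumed volumeClaimTemplate name), pods are named `<statefulset>-<ordinal>` and their claims `<claim-template>-<statefulset>-<ordinal>`:

```python
from typing import List


def statefulset_pod_names(name: str, replicas: int) -> List[str]:
    """Pod names are the StatefulSet name plus a zero-based ordinal."""
    return [f"{name}-{i}" for i in range(replicas)]


def statefulset_pvc_names(name: str, replicas: int, claim_template: str) -> List[str]:
    """Each pod gets its own claim, named after the template and the pod."""
    return [f"{claim_template}-{pod}" for pod in statefulset_pod_names(name, replicas)]


if __name__ == "__main__":
    print(statefulset_pod_names("db", 3))
    print(statefulset_pvc_names("db", 3, claim_template="data"))
```

Because a replacement pod reuses the same ordinal, it also reuses the same claim name and therefore the same volume.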

Kubernetes Troubleshooting Guide

DevOps engineers must know how to diagnose common cluster errors:

1. Pod Stuck in Pending State

A pod stays in the Pending state when the scheduler cannot assign it to a node. Running kubectl describe pod <pod-name> shows the scheduling events. Common causes include insufficient CPU/memory on existing nodes, node selectors or affinity rules that match no node, and node taints that the pod does not tolerate.

2. Pod Stuck in CrashLoopBackOff

This signifies the container is continually starting and crashing. Check application logs via kubectl logs <pod-name> --previous to inspect the failure logs of the crashed container instance. Common causes include configuration errors, unhandled exceptions, missing database connectivity credentials, or an exit code 137 (OOMKilled, meaning the container exceeded its allocated memory limits).
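Exit code 137 is not arbitrary: codes above 128 conventionally mean "killed by signal (code minus 128)", and signal 9 is SIGKILL, which is what the kernel's OOM killer sends. The helper below is a hypothetical triage aid based on that shell convention; it is a heuristic, so confirm an OOM kill from the reason shown by kubectl describe pod.

```python
# Standard Linux signal numbers; only a few common ones are listed.
SIGNAL_NAMES = {9: "SIGKILL", 11: "SIGSEGV", 15: "SIGTERM"}


def explain_exit_code(code: int) -> str:
    """Map a container exit code to a likely cause (a heuristic, not a diagnosis)."""
    if code == 0:
        return "completed successfully"
    if code == 126:
        return "command found but not executable"
    if code == 127:
        return "command not found"
    if code > 128:
        signum = code - 128
        name = SIGNAL_NAMES.get(signum, f"signal {signum}")
        return f"killed by {name}"
    return "application error"


if __name__ == "__main__":
    for code in (1, 127, 137, 143):
        print(code, "->", explain_exit_code(code))
```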

3. ImagePullBackOff

This indicates the container runtime failed to download the image. Verify the image repository URL, tag references, and ensure secret credentials (like imagePullSecrets) are mapped correctly for private container registries.

100 Kubernetes Interview Questions & Answers (2026)

The most comprehensive Kubernetes interview question bank for Bangalore tech companies, covering core concepts, networking, storage, security, troubleshooting, and CKA prep.

Deployment manages stateless pods — pods are interchangeable, get random names, and can be replaced in any order. For web servers, APIs, microservices. StatefulSet manages stateful applications — each pod gets a stable, ordered name (e.g., mysql-0, mysql-1) and a dedicated PersistentVolume. Pods start/stop in order. Use for databases (MySQL, PostgreSQL, MongoDB), message queues (Kafka), and any app requiring stable network identity or persistent per-pod storage.
A Pod is the smallest deployable unit in Kubernetes. It contains one or more containers that share the same network namespace and storage volume. Containers inside a Pod can communicate via localhost. Pods are ephemeral — they are created and destroyed frequently. In practice, you rarely create Pods directly — you use higher-level controllers like Deployments.
A Service exposes a set of pods internally or externally via ClusterIP, NodePort, or LoadBalancer. A LoadBalancer service creates a cloud load balancer (e.g., AWS ELB) for each service — expensive. An Ingress is a single entry point that routes HTTP/HTTPS traffic to multiple services based on hostname or path rules, using one load balancer. Example: api.example.com → api-service, app.example.com → web-service. Requires an Ingress Controller (nginx-ingress, AWS ALB Ingress, Traefik).
A Namespace is a virtual cluster within a physical cluster — scoping names, RBAC, network policies, and resource quotas. Common pattern: separate namespaces for dev, staging, production, or by team. Use ResourceQuotas to limit CPU/memory per namespace. Use LimitRanges to set default requests/limits for pods. Note: namespaces do NOT provide network isolation by default — add NetworkPolicies for that. Cluster-scoped resources (nodes, PVs, ClusterRoles) are not namespaced.
ConfigMap stores non-sensitive configuration (app settings, feature flags) as key-value pairs. Secret stores sensitive data (passwords, tokens, TLS certificates) encoded in base64. Security limitation: Kubernetes Secrets are base64-encoded, not encrypted — anyone with etcd access or the right RBAC permissions can decode them. Best practice: enable encryption at rest for etcd, restrict Secret access with RBAC, or use an external secrets manager (HashiCorp Vault, AWS Secrets Manager) with the External Secrets Operator.
A rolling update gradually replaces old pods with new ones. maxUnavailable: maximum number of pods that can be unavailable during the update (default 25%). maxSurge: maximum number of pods that can be created above the desired count (default 25%). Example with 4 replicas: maxUnavailable=1 means at least 3 pods are always running; maxSurge=1 allows up to 5 pods temporarily. Set maxUnavailable=0 and maxSurge=1 for zero-downtime deployments. Monitor with kubectl rollout status deployment/app.
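The pod-count bounds during a rollout can be worked out by hand. The sketch below assumes the documented rounding rules (a percentage maxSurge rounds up, a percentage maxUnavailable rounds down); the function names are hypothetical.

```python
from typing import Tuple, Union


def resolve(value: Union[int, str], replicas: int, round_up: bool) -> int:
    """Turn an absolute number or an 'N%' string into a pod count."""
    if isinstance(value, str) and value.endswith("%"):
        numerator = int(value[:-1]) * replicas
        # Integer arithmetic avoids floating-point rounding surprises.
        return -(-numerator // 100) if round_up else numerator // 100
    return int(value)


def rollout_bounds(replicas: int,
                   max_surge: Union[int, str] = "25%",
                   max_unavailable: Union[int, str] = "25%") -> Tuple[int, int]:
    """Return (minimum available pods, maximum total pods) during the update."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    return replicas - unavailable, replicas + surge


if __name__ == "__main__":
    print(rollout_bounds(4, max_surge=1, max_unavailable=1))
    print(rollout_bounds(10))
    print(rollout_bounds(4, max_surge=1, max_unavailable=0))
```

For 4 replicas with both values set to 1 this yields a floor of 3 available pods and a ceiling of 5 total, and with the 25% defaults on 10 replicas it yields 8 and 13.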
Requests: the guaranteed minimum resources — used by the scheduler to place pods on nodes. Limits: the maximum allowed. CPU limit: the container is throttled (slowed down) when it hits the limit. Memory limit: the container is OOMKilled (exit code 137) and restarted — this causes CrashLoopBackOff if the OOM is consistent. Best practice: set requests equal to typical usage, limits at 1.5–2x requests. Use kubectl top pods to measure actual usage before setting values.
A DaemonSet ensures exactly one pod runs on every node (or a subset matching a node selector). Use cases: log collection (Fluentd, Fluent Bit), monitoring agents (Prometheus node-exporter, Datadog agent), network plugins (Calico, Weave), storage drivers. When a new node joins the cluster, the DaemonSet automatically schedules a pod on it. When a node is removed, the pod is garbage collected.
Job runs a task to completion — when all pods successfully terminate, the Job is considered complete. Used for batch processing, database migrations, backup jobs. CronJob runs a Job on a schedule (like cron). Example: */5 * * * * — every 5 minutes. Important: CronJobs may skip executions if the controller is down — for critical tasks, use a dedicated job scheduler with retry logic.
The scheduler runs in two phases: Filtering — eliminates nodes that cannot run the pod (insufficient CPU/memory, failed taints, node affinity mismatch, unmet volume requirements). Scoring — ranks remaining nodes by factors including resource availability, pod affinity, image locality, and inter-pod spreading. The highest-scoring node wins. If no node passes filtering, the pod stays Pending. Check: kubectl describe pod <name> shows scheduler events explaining why a pod is Pending.
Kubernetes runs CoreDNS as a cluster DNS server. Every Service gets a DNS record: <service-name>.<namespace>.svc.cluster.local. Pods within the same namespace can use just the service name. Cross-namespace: use the full FQDN. For StatefulSet pods, each gets its own DNS: pod-0.service.namespace.svc.cluster.local. This is how microservices discover each other — no hardcoded IPs. Headless services (ClusterIP: None) return individual pod IPs directly, used by StatefulSets and service meshes.
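The DNS naming pattern is mechanical enough to build in code. A minimal sketch, assuming the default cluster domain `cluster.local` (clusters can be configured with a different one) and hypothetical helper names:

```python
def service_fqdn(service: str, namespace: str,
                 cluster_domain: str = "cluster.local") -> str:
    """Fully qualified DNS name of a Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def statefulset_pod_fqdn(statefulset: str, ordinal: int, headless_service: str,
                         namespace: str, cluster_domain: str = "cluster.local") -> str:
    """Per-pod DNS name provided by a StatefulSet's headless Service."""
    pod = f"{statefulset}-{ordinal}"
    return f"{pod}.{service_fqdn(headless_service, namespace, cluster_domain)}"


if __name__ == "__main__":
    print(service_fqdn("api-service", "prod"))
    print(statefulset_pod_fqdn("mysql", 0, "mysql", "prod"))
```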
By default, all pods in a cluster can communicate with each other. A NetworkPolicy restricts which pods can talk to which, acting like a firewall at layer 3/4. Example: allow only the api pod to access the db pod on port 5432, deny all other ingress. Requires a CNI that supports NetworkPolicy (Calico, Cilium — Flannel alone does not). Start with a default-deny policy in each namespace, then add explicit allow rules. Essential for PCI-DSS and SOC2 compliance.
NodePort exposes the service on a static port (30000-32767) on every node's IP. Use for local development or testing. LoadBalancer provisions a cloud load balancer (AWS ELB, Azure LB) and routes traffic to the service. LoadBalancer is expensive ($20+/month). For production, use Ingress instead of LoadBalancer for HTTP/HTTPS services. NodePort is rarely used in production (except for legacy apps).
A Headless Service has clusterIP: None. DNS lookups for it return the IPs of the individual pods instead of a single service IP. Use cases: (1) StatefulSets, where each pod needs a stable network identity. (2) Service discovery, when you need to address pods directly (e.g., for database clusters). (3) Custom load balancing, when you want to implement your own load balancing logic.
CNI is a standard for configuring network interfaces in containers. Common plugins: (1) Calico — supports NetworkPolicy, BGP routing, most popular in production. (2) Cilium — uses eBPF for high performance, supports NetworkPolicy, service mesh. (3) Flannel — simple, no NetworkPolicy support, easy to set up. (4) Weave — easy to use, supports encryption. Choose Calico or Cilium for production clusters that need NetworkPolicy.
Best practice for exposing an application to the internet: (1) Use Ingress with an Ingress Controller (nginx-ingress, AWS ALB Ingress). (2) Configure TLS termination on the Ingress using cert-manager (automatic Let's Encrypt). (3) Set up a NetworkPolicy so that only the Ingress controller can reach the backend pods. (4) Use Cloudflare or AWS CloudFront as a CDN/DDoS protection layer. (5) Enable a WAF (AWS WAF, ModSecurity) for HTTP inspection.
iptables mode (default) creates iptables rules for each service — the kernel handles packet forwarding. Good for small clusters, but linear scaling (O(n) rules). IPVS mode (IP Virtual Server) uses Linux's native load balancing — supports higher throughput, O(1) lookup, and more sophisticated algorithms (RR, LC, DH). IPVS is recommended for large clusters (1000+ services) due to performance gains.
A Service Mesh adds a dedicated infrastructure layer for service-to-service communication. Key features: Traffic management (canary, blue-green), Observability (metrics, traces, logs), Security (mTLS, RBAC). Popular mesh options: Istio (most powerful, complex), Linkerd (lightweight, simpler), Consul. Service meshes are optional but recommended for microservices with complex traffic patterns. Kubernetes does not include a service mesh by default — you add it as a separate layer.
Gateway API is a newer, more expressive Kubernetes API for traffic routing. Unlike Ingress (focused on HTTP/HTTPS), Gateway API supports L4 and L7, and is protocol-agnostic (TCP, UDP, HTTP, gRPC). Key resources: Gateway (proxies), HTTPRoute (routing rules). Gateway API is vendor-neutral and supported by all major Ingress controllers. It is the successor to Ingress and recommended for new projects.
Two approaches to TLS: (1) Ingress with cert-manager: cert-manager automatically obtains and renews TLS certificates from Let's Encrypt. Set spec.ingressClassName: nginx on the Ingress (the older kubernetes.io/ingress.class: nginx annotation is deprecated) and add the annotation cert-manager.io/cluster-issuer: letsencrypt-prod. (2) Service of type LoadBalancer: attach the TLS certificate to the cloud provider's load balancer (for example, an AWS NLB with an ACM certificate). cert-manager is the preferred approach for production: it is automated, free, and integrates with most Ingress controllers.
A PersistentVolume (PV) is a cluster-level storage resource (e.g., an EBS volume, Azure Disk, NFS share). A PersistentVolumeClaim (PVC) is a request for storage by a pod. With dynamic provisioning (preferred), a StorageClass automatically creates the PV when a PVC is submitted — no manual PV creation needed. The pod mounts the PVC via volumes and volumeMounts.
A StorageClass defines the type of storage (e.g., gp2 for AWS EBS, azure-disk for Azure). It includes a provisioner (the driver that creates the PV), parameters (such as volume type, IOPS, and encryption), and a reclaimPolicy (Retain or Delete). The requested size comes from the PVC, not the StorageClass. With dynamic provisioning, when a PVC requests a StorageClass, Kubernetes automatically creates a PV matching the class. This is the standard way to manage storage in production.
Retain — when the PVC is deleted, the PV and its underlying storage are NOT automatically deleted. The PV remains in Released state — you can manually recover the data. Delete — when the PVC is deleted, the PV and underlying storage are automatically deleted. Use Retain for production databases (you want to prevent accidental data loss). Use Delete for ephemeral storage (caches, temporary data).
(1) Install EBS CSI Driver (Amazon's driver for EBS volumes). (2) Create a StorageClass for EBS (e.g., gp3). (3) Create a PVC referencing the StorageClass. (4) The EBS CSI Driver automatically creates an EBS volume and attaches it to the pod. (5) The pod mounts the volume via volumeMounts. For production, use EFS for shared storage (ReadWriteMany) and EBS for single-pod storage (ReadWriteOnce).
emptyDir is a temporary volume that exists as long as the pod exists. It is created when the pod is scheduled and deleted when the pod is removed. Use for caches, scratch space, or temporary data. hostPath mounts a file or directory from the host node's filesystem into the pod. Use for accessing node-level data (e.g., logs, Docker socket). hostPath is not portable across nodes — avoid in production unless necessary.
CSI is a standard for exposing storage systems to Kubernetes. It defines a set of gRPC APIs that storage vendors implement. Kubernetes communicates with the CSI driver to: CreateVolume, DeleteVolume, ControllerPublishVolume (attach), NodePublishVolume (mount). All cloud providers have CSI drivers (EBS CSI, Azure Disk CSI, GCE PD CSI). CSI is the modern way to integrate storage into Kubernetes.
Backup strategies: (1) Volume snapshots: use the CSI snapshot APIs (e.g., EBS snapshots, Azure Disk snapshots). (2) Database-specific backups: run mysqldump or pg_dump in a Job. (3) Velero (formerly Heptio Ark): backs up entire cluster state (resources + volumes). Velero supports CSI snapshots and is the standard backup solution for Kubernetes. For production, run automated backups with Velero and store them in S3 or Azure Blob.
EBS (Elastic Block Store) — block storage, ReadWriteOnce (attached to a single node). High performance, low latency. Best for databases (MySQL, PostgreSQL). EFS (Elastic File System) — file storage, ReadWriteMany (multiple pods can read/write simultaneously). Lower performance, higher latency. Best for shared storage (WordPress, content management, log aggregation). Use EBS for stateful databases. Use EFS for shared storage across multiple pods.
A volume snapshot is a point-in-time copy of a PersistentVolume. CSI drivers support snapshot operations through two resources: VolumeSnapshot (the snapshot itself) and VolumeSnapshotClass (its configuration). Restore by creating a new PVC from the snapshot. Benefits: (1) Fast recovery: minutes instead of hours. (2) Cost-effective: snapshots are incremental. (3) Consistent backups: for database volumes, use pre-backup hooks (for example, Velero backup hooks) to flush or freeze writes before the snapshot. Use Velero to automate snapshot creation and retention.
Storage scaling depends on the backend: (1) EBS: resizable in place through the EBS CSI driver when the StorageClass sets allowVolumeExpansion: true; volumes can grow but not shrink. (2) EFS: scales automatically to petabyte size. (3) Azure Disk: resizable in place (update pvc.spec.resources.requests.storage). (4) GCE PD: resizable in place. Best practice: use dynamic provisioning with a StorageClass that supports expansion (allowVolumeExpansion: true), then raise the storage request on the PVC.
RBAC has three objects: Role/ClusterRole (permissions), ServiceAccount/User/Group (who), RoleBinding/ClusterRoleBinding (links them). To grant read-only pod access: create a Role with get, list, watch verbs on pods resource, create a ServiceAccount, bind them with a RoleBinding. View effective permissions with kubectl auth can-i list pods --as=system:serviceaccount:namespace:sa-name. Always use namespace-scoped Roles over ClusterRoles unless cluster-wide access is genuinely needed.
Role is namespace-scoped — it applies only to resources in a specific namespace. ClusterRole is cluster-scoped — it applies to all namespaces (and cluster-scoped resources like nodes, PVs, CRDs). Use a Role for namespace-specific permissions. Use a ClusterRole for: (1) Permissions to cluster-scoped resources. (2) Permissions that should apply across all namespaces (e.g., cluster-admin). (3) Permissions used with a ClusterRoleBinding (binding applies to all namespaces).
PodSecurityPolicy (PSP) was a Kubernetes feature for controlling pod security (privileged containers, volume types, host network access). It was deprecated in 1.21 and removed in 1.25 due to complexity and usability issues. Replacement: Pod Security Admission (PSA), which is simpler and enforces three built-in levels: privileged (allow all), baseline (block known privilege escalations), restricted (strict). PSA has been enabled by default since Kubernetes 1.23 (beta) and stable since 1.25.
A ServiceAccount represents an identity for a pod. Each namespace has a default ServiceAccount. Pods use ServiceAccounts to authenticate with the Kubernetes API server. A User represents a human identity (authenticated via certs, tokens, or OIDC). ServiceAccounts are used for programmatic access; Users for human access. Best practice: create a dedicated ServiceAccount for each deployment or application, and grant only the permissions it needs.
The principle of least privilege means each pod or user should have only the permissions necessary to perform its function. In Kubernetes: (1) Use RBAC to grant minimal permissions. (2) Avoid cluster-admin for service accounts. (3) Use Pod Security Admission to restrict privileged containers. (4) Set network policies to limit pod-to-pod communication. (5) Use secrets management (Vault, External Secrets). Least privilege reduces attack surface and limits blast radius of security incidents.
Securing the API server: (1) Use TLS and enable client certificate authentication. (2) Disable anonymous access: set --anonymous-auth=false. (3) Enable RBAC: set --authorization-mode=RBAC. (4) Use OIDC or webhook token authentication to integrate with an identity provider (Azure AD, Google). (5) Restrict access to etcd, which holds secrets and cluster state: use TLS and firewall rules. (6) Enable audit logging to track API access. (7) Restrict network access to the API server to trusted sources, for example with firewall rules.
An admission controller intercepts API requests after authentication and authorisation, before persistence. It can modify or reject requests. Common controllers: (1) MutatingAdmissionWebhook — modify resources (e.g., Istio adds sidecars). (2) ValidatingAdmissionWebhook — validate resources (e.g., OPA Gatekeeper). (3) PodSecurity — enforce pod security standards (PSA). (4) ResourceQuota — enforce namespace quotas. Admission controllers are critical for security and policy enforcement.
Secret management strategies: (1) External Secrets Operator: fetches secrets from AWS Secrets Manager, Vault, or Azure Key Vault and syncs them as Kubernetes Secrets. (2) HashiCorp Vault Agent Injector: a sidecar injects secrets directly into pods as files. (3) Sealed Secrets: encrypts secrets for storage in Git; they are decrypted only in the cluster. (4) Secrets Store CSI Driver: mounts secrets from external stores as volumes. Best practice: use External Secrets Operator (ESO) with automatic rotation, so that when the secret changes in the external store, ESO updates the Kubernetes Secret.
kube-bench is a tool that checks Kubernetes clusters against the CIS Kubernetes Benchmark — hundreds of security best practices (RBAC, API server config, etcd, kubelet). kube-hunter is a penetration testing tool that hunts for security weaknesses in Kubernetes clusters (exposed API, insecure secrets, unauthenticated etcd). Run kube-bench regularly (e.g., weekly) to assess compliance. Run kube-hunter after each major cluster change to detect new vulnerabilities.
OPA Gatekeeper is an admission controller that uses Open Policy Agent (OPA) to enforce policies on Kubernetes resources. You define ConstraintTemplates (reusable policy logic) and Constraints (actual policies). Examples: enforce labels on all resources, restrict container images to approved registries, require resource limits on every pod. Gatekeeper is the most popular policy engine for Kubernetes, enabling shift-left security and compliance.
Troubleshooting a Pending pod: Step 1: kubectl describe pod <name> and check the Events section for scheduling failures. Step 2: Check for insufficient resources (CPU/memory). Step 3: Check the node selector or affinity rules; possibly no node matches them. Step 4: Check taints and tolerations; if nodes are tainted, the pod needs a matching toleration. Step 5: Check PVC binding; a pod that requires an unbound PVC cannot be scheduled. Step 6: If all else fails, check node capacity: kubectl describe nodes shows the allocatable and already allocated resources of each node.
Troubleshooting CrashLoopBackOff: Step 1: kubectl describe pod <name> and check the Events section for OOMKilled, failed probes, or image pull errors. Step 2: kubectl logs <pod> --previous to see logs from the crashed container. Step 3: Check the exit code in the describe output: 1 is an application error, 137 is OOMKilled, 126 means the command could not be executed (often a permissions problem), and 127 means the command was not found. Common fixes: increase memory limits (OOMKilled), fix liveness probe timing, correct the ENTRYPOINT command, or fix application startup errors visible in the logs.
kubectl describe pod/<name> shows detailed information about a pod: (1) Events — the most useful part (scheduling failures, image pull issues, OOM). (2) Status — pod phase (Running, Pending, Failed). (3) Node — where the pod is scheduled. (4) Containers — image, ports, volume mounts, probe settings. (5) QoS — Guaranteed, Burstable, BestEffort. Always start troubleshooting with kubectl describe.
Use the -c flag: kubectl logs <pod-name> -c <container-name>. To see logs from all containers: kubectl logs <pod-name> --all-containers=true. For a continuous stream: kubectl logs -f <pod-name> -c <container-name>. For multi-container pods (e.g., sidecar patterns), this is essential. Also, kubectl logs --previous shows logs from a previous (crashed) container instance.
Use kubectl exec -it <pod-name> -- /bin/sh (or /bin/bash). This opens an interactive shell inside the container. If the container doesn't have a shell, you can use kubectl debug -it <pod-name> --image=busybox --target=<container-name> to attach an ephemeral debug container that shares the target container's process namespace.
Step 1: Check the label selector: kubectl get service <service-name> -o wide shows the selector; verify that it matches your pods' labels. Step 2: kubectl describe service <service-name>: check the Endpoints list. If it is empty, no ready pods match the selector. Step 3: kubectl get pods --show-labels: confirm pods have the labels the Service expects. Step 4: Check the port mapping: service.spec.ports.targetPort must match the port the container actually listens on (normally its containerPort). Step 5: Check NetworkPolicies: a policy may be blocking ingress to the pods.
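The selector check in the steps above can be pictured with a small simulation. This is only a sketch of equality-based label matching; the pod names and labels are made up for illustration, and real endpoint selection also considers pod readiness.

```python
# Sketch of how a Service's equality-based selector picks pods (hypothetical data).

def selector_matches(selector: dict, labels: dict) -> bool:
    """Every key/value pair in the selector must be present on the pod."""
    return all(labels.get(key) == value for key, value in selector.items())


def endpoints_for(selector: dict, pods: dict) -> list:
    """Return the names of pods the selector would match."""
    if not selector:
        # A Service without a selector gets no automatically managed endpoints.
        return []
    return [name for name, labels in pods.items() if selector_matches(selector, labels)]


pods = {
    "web-1": {"app": "web", "tier": "frontend"},
    "web-2": {"app": "Web"},  # label values are case-sensitive, so this pod is skipped
    "db-1": {"app": "db"},
}

print(endpoints_for({"app": "web"}, pods))
```

An empty result here corresponds to the empty Endpoints list described in Step 2.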
Step 1: kubectl get nodes: check status. If NotReady, Step 2: kubectl describe node <node-name>: look at the Conditions section (Ready, MemoryPressure, DiskPressure, PIDPressure, NetworkUnavailable). Step 3: SSH into the node and check: systemctl status kubelet, journalctl -u kubelet -f. Step 4: Check disk space: df -h. Step 5: Check the container runtime: systemctl status containerd (or systemctl status docker on older clusters that still use Docker). Common causes: kubelet not running, out of disk space, network issues.
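A liveness probe checks whether the container is alive: if it fails, Kubernetes restarts the container. A readiness probe checks whether the container is ready to serve traffic: if it fails, Kubernetes removes the pod from the Service endpoints without restarting it. Use liveness for deadlock detection. Use readiness for temporary unavailability, such as a warm-up period or an overloaded dependency. For slow-starting containers, a startup probe holds off the other two probes until the application is up. Best practice: set initialDelaySeconds (or a startup probe) so that neither probe fails prematurely.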
kubectl port-forward pod/<pod-name> 8080:80 forwards local port 8080 to pod port 80. This bypasses Services and NetworkPolicies, which makes it useful for debugging a single pod directly. For Services: kubectl port-forward service/<service-name> 8080:80 (the traffic still goes to a single pod behind the Service). Use port-forward for: (1) Testing an unreleased version. (2) Accessing a debug endpoint. (3) Debugging a specific pod behind a load balancer. Treat port-forward as a debugging tool: never rely on it to serve production traffic.
A PodDisruptionBudget (PDB) limits how many pods of a deployment can be unavailable during voluntary disruptions (node drains, cluster upgrades). Example: minAvailable: 2 ensures at least 2 pods are running at all times during a drain. Without a PDB, draining a node could evict all replicas of a 3-replica deployment that happen to run on it. Set PDBs for every production Deployment: they prevent accidental downtime and are commonly expected in production-readiness reviews.
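To make the minAvailable arithmetic concrete, here is a minimal sketch of how many voluntary evictions a budget would permit at a given moment. It assumes an integer minAvailable; percentage values and maxUnavailable are left out, and the function name is invented for this example.

```python
# Sketch: evictions a PDB with an integer minAvailable would allow right now.

def allowed_disruptions(healthy_pods: int, min_available: int) -> int:
    """Voluntary evictions are permitted only while healthy pods exceed minAvailable."""
    return max(0, healthy_pods - min_available)


# A 3-replica Deployment with minAvailable: 2
print(allowed_disruptions(healthy_pods=3, min_available=2))  # one pod may be evicted
print(allowed_disruptions(healthy_pods=2, min_available=2))  # the drain must wait
```

During a drain, the second case is why an eviction is retried until a replacement pod becomes healthy elsewhere.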
An Operator is a Kubernetes controller that extends the API with custom resources (CRDs) and automates the management of complex applications (databases, message queues, monitoring). It encapsulates domain-specific knowledge (backup, restore, scaling, upgrade). Use the Operator pattern when: (1) You need to automate complex application lifecycle management. (2) You want to provide a Kubernetes-native API for your application. (3) You need to manage stateful applications with custom logic.
A CustomResourceDefinition (CRD) is a way to extend the Kubernetes API by defining a new resource type (e.g., Database, MongoDBCluster). CRDs are simple to create — you define the schema (OpenAPI). API Extension (Aggregated API) is a more complex approach where you build a separate API server and register it with the main API server. Use CRD for most extensions (95% of cases). Use API Aggregation for advanced authentication/authorisation, or when you need a separate API server.
Helm is the package manager for Kubernetes. A chart is a collection of templated Kubernetes manifests with a values.yaml for configuration. Benefits: (1) Install complex applications with one command (helm install my-nginx ingress-nginx/ingress-nginx). (2) Manage environment-specific config via values files. (3) Upgrade and rollback releases. (4) Share reusable charts via Artifact Hub. In CI/CD, Helm is used to parameterise deployments — update the image tag in values.yaml and run helm upgrade.
GitOps uses a Git repository as the single source of truth for infrastructure and application state. Any change to production goes through a Git commit — the system continuously reconciles actual state with desired state. ArgoCD watches a Git repo containing Kubernetes manifests or Helm charts; when a diff is detected between the repo and the cluster, it automatically syncs (or notifies). Benefits: full audit trail, easy rollback (git revert), and no kubectl access needed for developers.
HPA automatically scales the number of pods based on observed metrics. Basic CPU-based example: kubectl autoscale deployment web --cpu-percent=70 --min=2 --max=20. The HPA controller checks metrics every 15 seconds. Requires metrics-server installed in the cluster. Advanced HPA can scale on custom metrics (e.g., requests per second from Prometheus via the custom.metrics.k8s.io API). Set meaningful minimum replicas to avoid cold-start latency, and ensure resource requests are set (HPA needs them for percentage calculation).
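The percentage calculation mentioned above follows the formula given in the Kubernetes documentation: desired replicas = ceil(current replicas × current metric / target metric). The sketch below applies that formula with the default 10% tolerance and the min/max bounds; the function and parameter names are assumptions for illustration, and details such as stabilization windows and unready pods are ignored.

```python
import math

# Sketch of the documented HPA scaling formula (simplified).

def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas, max_replicas, tolerance=0.1):
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds (--min / --max).
    return max(min_replicas, min(max_replicas, desired))


# 4 pods averaging 90% CPU against a 70% target, bounds 2..20
print(desired_replicas(4, 90, 70, min_replicas=2, max_replicas=20))  # → 6
```

Because the ratio is computed against resource requests, a pod without CPU requests gives the controller nothing to divide by, which is why requests must be set.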
HPA (Horizontal Pod Autoscaler) scales the number of pod replicas. VPA (Vertical Pod Autoscaler) scales the resource requests/limits of a pod: it adjusts CPU and memory based on usage. Use HPA for stateless workloads that can scale horizontally. Use VPA for stateful workloads where scaling replicas is difficult (databases). VPA can run alongside HPA, but not when both act on the same CPU or memory metric, because the two controllers will work against each other. For most applications, HPA is the better choice.
KEDA (Kubernetes Event-Driven Autoscaling) extends HPA to scale based on external events (SQS queue depth, Kafka lag, Prometheus metrics). It is an operator that works alongside HPA. Example: scale a deployment based on the number of messages in an SQS queue. KEDA is ideal for event-driven applications and can scale to zero (no replicas when no events). Use KEDA when your application is driven by external event sources and you need dynamic scaling.
An Admission Webhook is an HTTP callback that receives a Kubernetes API request before it is persisted. Two types: (1) MutatingAdmissionWebhook — can modify the request (e.g., add labels, inject sidecars). (2) ValidatingAdmissionWebhook — can accept or reject the request (e.g., enforce policies). Webhooks are the foundation of policy engines like OPA Gatekeeper. Implementation: deploy a webhook server (HTTPS), register with the API server via a MutatingWebhookConfiguration or ValidatingWebhookConfiguration.
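As a sketch of what a validating webhook returns, the function below takes an AdmissionReview request body (already parsed from JSON) and builds the AdmissionReview response. The required label name "team" is a hypothetical policy chosen for the example, and the HTTPS server and TLS setup around it are omitted.

```python
# Sketch of a validating admission decision (policy and label name are hypothetical).

def review(admission_review: dict, required_labels=("team",)) -> dict:
    """Reject objects that lack any of the required labels."""
    request = admission_review["request"]
    labels = request["object"].get("metadata", {}).get("labels") or {}
    missing = [name for name in required_labels if name not in labels]

    # The response must echo the request uid and state whether the object is allowed.
    response = {"uid": request["uid"], "allowed": not missing}
    if missing:
        response["status"] = {
            "code": 403,
            "message": "missing required labels: " + ", ".join(missing),
        }
    return {"apiVersion": "admission.k8s.io/v1", "kind": "AdmissionReview", "response": response}


incoming = {
    "apiVersion": "admission.k8s.io/v1",
    "kind": "AdmissionReview",
    "request": {"uid": "abc-123", "object": {"metadata": {"labels": {"app": "web"}}}},
}
print(review(incoming)["response"])
```

A mutating webhook would return the same envelope with an additional base64-encoded JSON patch instead of only an allow/deny decision.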
Initializers were an alpha mechanism that let a controller modify an object before it became visible to other clients. The feature never graduated from alpha and was removed in Kubernetes 1.14. Replacement: admission webhooks (MutatingAdmissionWebhook) for changing objects at creation time, and Operators (CRDs with finalizers) for lifecycle logic. The initializer pattern was complex and error-prone, whereas admission webhooks provide the same functionality with better control and observability.
Taints are applied to nodes — they prevent pods from scheduling on that node unless the pod has a matching Toleration. Use taints for: (1) Dedicated nodes — only certain pods can run. (2) Node maintenance — taint node before draining. (3) GPU nodes — only pods with GPU tolerations can schedule. Tolerations are added to pod specs. Example: tolerations: [{"key": "gpu", "operator": "Equal", "value": "true", "effect": "NoSchedule"}].
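The matching rule between taints and tolerations can be sketched as follows. This is a simplified model written for illustration (the function names are invented): it treats NoSchedule and NoExecute taints as blocking and ignores PreferNoSchedule, which is only a soft preference.

```python
# Simplified model of taint/toleration matching (illustrative, not the scheduler's code).

def tolerates(toleration: dict, taint: dict) -> bool:
    # An empty effect on the toleration matches every effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    operator = toleration.get("operator", "Equal")
    key = toleration.get("key")
    if operator == "Exists":
        # Exists with no key tolerates every taint; with a key, only that key.
        return not key or key == taint["key"]
    return key == taint["key"] and toleration.get("value", "") == taint.get("value", "")


def can_schedule(tolerations: list, taints: list) -> bool:
    blocking = [t for t in taints if t["effect"] in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, taint) for tol in tolerations) for taint in blocking)


gpu_node_taints = [{"key": "gpu", "value": "true", "effect": "NoSchedule"}]
gpu_toleration = {"key": "gpu", "operator": "Equal", "value": "true", "effect": "NoSchedule"}

print(can_schedule([gpu_toleration], gpu_node_taints))  # → True
print(can_schedule([], gpu_node_taints))                # → False
```

A toleration only permits scheduling on the tainted node; to actually steer GPU pods onto those nodes, it is usually paired with a nodeSelector or node affinity.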
CKA is a performance-based exam (2 hours, live cluster). Preparation: (1) Use Kubernetes the Hard Way by Kelsey Hightower — hands-on cluster setup. (2) CKA practice questions — use killer.sh, CKA practice labs. (3) Master kubectl — efficient use of imperative commands (kubectl run --image, kubectl create deployment). (4) Know the documentation — exam is open-book, but you need to know what to search for. (5) Time management — don't spend too long on one question. Thick Brain's CKA preparation course includes full practice exams.
CKA (Certified Kubernetes Administrator) focuses on cluster administration, installation, configuration, troubleshooting, and networking. CKAD (Certified Kubernetes Application Developer) focuses on developing, deploying, and troubleshooting applications. CKA is more operations-focused (SRE, DevOps). CKAD is more development-focused (backend, microservices). CKA is generally considered harder and more valued by employers. Start with CKA.
(1) Flag and skip: if a question takes more than 5 minutes, flag it and move on. (2) Use imperative commands: kubectl run and kubectl expose are faster than writing YAML. (3) Keep a cheat sheet of common commands and YAML snippets while practising; in the exam itself, personal notes are not permitted and only the official documentation may be consulted. (4) Practice with a timer: simulate a 2-hour exam with 15-20 questions. (5) Don't overthink: most questions have a straightforward solution. (6) Use aliases: alias k=kubectl saves time.
(1) Not reading the question carefully — missing resource names, namespaces, or port numbers. (2) Using the wrong syntax — e.g., --image vs --image-pull-policy. (3) Forgetting to switch to the correct namespace — many questions require -n or --namespace. (4) Not testing after creating a resource — check pod status before moving on. (5) Spending too much time on one question — flag and revisit. (6) Not using the documentation — you have access to kubernetes.io — use it.
killer.sh is the official CKA exam simulator. Steps: (1) Purchase the CKA exam (includes 2 free killer.sh sessions). (2) Access killer.sh through the Linux Foundation training portal where you booked the exam. (3) Once activated, each session gives you 36 hours of access to the simulated exam environment. (4) Questions are similar to the actual exam and cover the same domains. (5) After completion, you can review your answers and see the correct solutions. killer.sh is the most effective CKA preparation tool, so use both sessions fully before taking the real exam.
CKS (Certified Kubernetes Security Specialist) is an advanced certification focusing on Kubernetes security (RBAC, NetworkPolicy, admission control, image scanning, runtime security). Prerequisite: CKA. CKS builds on CKA by adding security-specific topics. If you work in a security-sensitive environment or want to specialise in K8s security, pursue CKS after CKA.
The CKA exam is open-book: you can access https://kubernetes.io/docs. Tips: (1) Learn where key pages live, for example kubernetes.io/docs/reference/kubectl/ and kubernetes.io/docs/tasks/, since the exam's remote desktop does not carry over your own browser bookmarks. (2) Use search: Ctrl+F is faster than navigating. (3) Look for examples: most pages have YAML examples you can copy and modify. (4) Don't rely on memory: know what to search for, not the exact syntax. (5) Practice with the documentation: during your preparation, use the docs to solve problems.
CKA domains: (1) Cluster Architecture, Installation & Configuration (25%) — cluster setup, etcd, API server, kubelet. (2) Workloads & Scheduling (15%) — Deployments, StatefulSets, Jobs, scheduling. (3) Services & Networking (20%) — Service, Ingress, NetworkPolicy, DNS. (4) Storage (10%) — PV, PVC, StorageClass, CSI. (5) Troubleshooting (30%) — diagnosis of pods, nodes, services, networking issues. The heaviest domain is troubleshooting — focus on that during preparation.
CKA exam cost: USD 395, which covers two attempts (the first sitting plus one free retake). Register via the Linux Foundation Training & Certification portal. Steps: (1) Create a Linux Foundation account. (2) Select the CKA certification. (3) Pay USD 395. (4) Schedule the exam within 12 months. The exam is proctored online and lasts 2 hours. The retake must also be completed within 12 months of the original purchase.
Thick Brain Technology's Advanced Kubernetes (EKS) course includes comprehensive CKA preparation: (1) 60 hours of live training covering all CKA domains. (2) Real EKS cluster labs — practice with actual K8s clusters. (3) CKA practice questions — simulated exam experiences. (4) killer.sh integration — practice with the official CKA exam simulator. (5) Instructor-led CKA review sessions — targeted preparation for the exam. (6) Placement support — help with job applications after certification. Book a free demo to start your CKA journey.

Frequently Asked Questions

The CKA (Certified Kubernetes Administrator) is a performance-based exam by CNCF. You solve real Kubernetes tasks in a live cluster over 2 hours — no multiple choice. It is the most credible K8s certification and widely recognised by Bangalore's top tech employers.
Kubernetes engineers in Bangalore earn ₹8-14 LPA at entry level (1-3 years), ₹14-24 LPA at mid-level (3-6 years), and ₹22-38 LPA for senior architects. CKA holders earn 25-35% more than non-certified peers at every experience level.
Yes — Docker fundamentals are a prerequisite for Kubernetes. You should understand container images, Dockerfiles, container networking and registries before starting Kubernetes. Thick Brain Technology's DevOps courses cover Docker before progressing to Kubernetes.
EKS (AWS) is best for AWS-centric organisations. AKS (Azure) is best for Microsoft/Azure orgs. GKE (Google) is best for advanced K8s features and is the most mature managed Kubernetes. All three abstract away control plane management — choose based on your cloud provider.
With structured training, most learners become proficient in 10-12 weeks. Thick Brain Technology's Kubernetes course covers EKS, AKS, GKE, CKA preparation and real cluster labs — you'll be job-ready in 3-4 months.
Thick Brain Technology offers comprehensive live online Kubernetes training with real cluster labs, CKA preparation, and placement support. The course covers EKS, AKS, GKE, Helm, GitOps and advanced Kubernetes concepts. Book a free demo to see our live cluster labs.

Conclusion: Master Kubernetes in 2026

Kubernetes proficiency is no longer optional for DevOps engineers and cloud architects in 2026. Whether you specialise in AWS EKS, Azure AKS, or Google GKE, the core Kubernetes skills transfer across all platforms. Pursuing the CKA certification validates these skills in a way that employers trust.

The market rewards engineers who combine strong Kubernetes fundamentals with real cluster experience. CKA-certified engineers command 25-35% salary premiums and are in high demand across Bangalore's top tech companies.

Thick Brain Technology offers advanced Kubernetes training on all three managed platforms — EKS, AKS and GKE — with real cluster environments and CKA-aligned practice scenarios. Book a free demo class to deploy your first Kubernetes application live.

