82% of cloud-native organisations use Kubernetes in production — K8s skills are now essential
Managed Kubernetes (EKS, AKS, GKE) handles 70%+ of production workloads
CKA certification delivers 25-35% salary premium for Kubernetes engineers
Kubernetes engineers in Bangalore earn ₹8-14 LPA at entry level, ₹22-38 LPA+ for senior roles
Thick Brain Technology offers real cluster Kubernetes training with CKA preparation and placement support
Kubernetes has become the operating system of the cloud-native world. In 2026, 82% of cloud-native organisations run Kubernetes in production, and managed Kubernetes services like Amazon EKS, Azure AKS and Google GKE have made it accessible to organisations of all sizes. For DevOps engineers and cloud architects, Kubernetes proficiency is no longer a differentiator — it is a baseline requirement. This guide covers Kubernetes architecture, the three major managed services, CKA certification preparation, career paths and the AI capabilities that modern Kubernetes practitioners need.
📊 Kubernetes Market Snapshot: 2026
82%: Cloud-native orgs use Kubernetes in production
Top 3: Most-hired DevOps profile in Bangalore
25-35%: Salary premium for CKA certification
70%+: Managed K8s workloads via EKS/AKS/GKE
What is Kubernetes?
Kubernetes (K8s) is an open-source container orchestration platform originally developed by Google, now maintained by the Cloud Native Computing Foundation (CNCF). It automates the deployment, scaling and management of containerised applications across clusters of machines.
At its core, Kubernetes solves the problem of running hundreds or thousands of containers at scale. Instead of manually starting and connecting containers, you declare the desired state of your system in YAML manifests — and Kubernetes continuously works to maintain that state. Pods get scheduled, restarted, scaled, load-balanced and updated automatically.
💡 Why Kubernetes in 2026? Kubernetes is the industry standard for container orchestration. Every major cloud provider offers a managed K8s service (EKS, AKS, GKE), and the job market for Kubernetes-skilled engineers continues to grow faster than supply.
Kubernetes Architecture
Control Plane Components
API Server — The frontend of the Kubernetes control plane; all commands go through here
etcd — Distributed key-value store; the source of truth for all cluster state
Scheduler — Assigns newly created pods to nodes based on resource requirements
Controller Manager — Runs controllers that maintain desired state (ReplicaSets, Deployments, etc.)
Worker Node Components
kubelet — Agent running on every node; ensures containers are running as specified
kube-proxy — Handles network proxying and load balancing for services
Container Runtime: containerd or CRI-O (Docker Engine needs the cri-dockerd adapter since dockershim was removed in Kubernetes 1.24); runs the actual containers
Kubernetes Learning Roadmap: 6-Stage Path
This roadmap is used in Thick Brain Technology's Advanced Kubernetes training program — 60 hours of live training, real cluster labs, and CKA preparation.
The Certified Kubernetes Administrator (CKA) is the gold standard Kubernetes certification. Unlike multiple-choice exams, CKA is 100% performance-based — you solve real Kubernetes tasks in a live cluster environment over 2 hours.
These are the certifications that appear most frequently in senior Kubernetes job descriptions across Bengaluru, Hyderabad and Pune.
⚓ Most Respected
CKA — Certified Kubernetes Administrator
By CNCF. 100% performance-based in a live K8s cluster. Universally recognised by Indian and global employers. Best first certification for K8s engineers.
☁️ AWS
AWS Certified Kubernetes (EKS)
For engineers focused on AWS EKS — covers control plane management, node groups, IAM integration, and EKS best practices.
☁️ Azure
Azure Kubernetes (AKS) Certification
Validates AKS skills — cluster setup, networking, security, monitoring, and CI/CD integration with Azure DevOps.
☁️ GCP
GKE Professional Certification
For Google Cloud-focused engineers — covers GKE architecture, autopilot, security, and advanced networking features.
🎓 Practitioner
Thick Brain Advanced Kubernetes (EKS)
60 hours live training, real EKS cluster labs, CKA preparation, Helm, GitOps, ArgoCD, and placement support until hired.
Kubernetes Engineer Salary Guide 2026
Salary data based on Bangalore market rates, job postings, and Thick Brain placement data (2025–2026).
Role | Experience | Bangalore Salary (2026)
--- | --- | ---
Junior DevOps / K8s Engineer | 0-2 years | ₹6 – 10 LPA
DevOps Engineer (K8s specialist) | 2-5 years | ₹12 – 20 LPA
Senior K8s / Cloud Architect | 5-8 years | ₹20 – 32 LPA
Platform Engineer / SRE | 4-8 years | ₹18 – 30 LPA
CKA Premium (any level) | n/a | +25-35% above base
Source: Naukri.com, LinkedIn Jobs, Thick Brain placement data, June 2026
The most comprehensive Kubernetes interview question bank for Bangalore tech companies, covering core concepts, networking, storage, security, troubleshooting, and managed services (EKS, AKS, GKE).
Deployment manages stateless pods — pods are interchangeable, get random names, and can be replaced in any order. For web servers, APIs, microservices. StatefulSet manages stateful applications — each pod gets a stable, ordered name (e.g., mysql-0, mysql-1) and a dedicated PersistentVolume. Pods start/stop in order. Use for databases (MySQL, PostgreSQL, MongoDB), message queues (Kafka), and any app requiring stable network identity or persistent per-pod storage.
A Pod is the smallest deployable unit in Kubernetes. It contains one or more containers that share the same network namespace and can share storage volumes. Containers inside a Pod can communicate via localhost. Pods are ephemeral: they are created and destroyed frequently. In practice, you rarely create Pods directly; you use higher-level controllers like Deployments.
A Service exposes a set of pods internally or externally via ClusterIP, NodePort, or LoadBalancer. A LoadBalancer service creates a cloud load balancer (e.g., AWS ELB) for each service — expensive. An Ingress is a single entry point that routes HTTP/HTTPS traffic to multiple services based on hostname or path rules, using one load balancer. Example: api.example.com → api-service, app.example.com → web-service. Requires an Ingress Controller (nginx-ingress, AWS ALB Ingress, Traefik).
A Namespace is a virtual cluster within a physical cluster — scoping names, RBAC, network policies, and resource quotas. Common pattern: separate namespaces for dev, staging, production, or by team. Use ResourceQuotas to limit CPU/memory per namespace. Use LimitRanges to set default requests/limits for pods. Note: namespaces do NOT provide network isolation by default — add NetworkPolicies for that. Cluster-scoped resources (nodes, PVs, ClusterRoles) are not namespaced.
ConfigMap stores non-sensitive configuration (app settings, feature flags) as key-value pairs. Secret stores sensitive data (passwords, tokens, TLS certificates) encoded in base64. Security limitation: Kubernetes Secrets are base64-encoded, not encrypted — anyone with etcd access or the right RBAC permissions can decode them. Best practice: enable encryption at rest for etcd, restrict Secret access with RBAC, or use an external secrets manager (HashiCorp Vault, AWS Secrets Manager) with the External Secrets Operator.
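To make the "encoded, not encrypted" point concrete, here is a minimal Python sketch. The function names and the sample values are hypothetical and only illustrate that anyone who can read a Secret's data field can recover the plaintext without any key.

```python
import base64


def encode_secret_value(plaintext: str) -> str:
    """Encode a value the way it appears under `data:` in a Secret manifest."""
    return base64.b64encode(plaintext.encode("utf-8")).decode("ascii")


def decode_secret_value(encoded: str) -> str:
    """Reverse the encoding; note that no key or password is required."""
    return base64.b64decode(encoded).decode("utf-8")


if __name__ == "__main__":
    stored = encode_secret_value("s3cr3t-password")  # illustrative value
    print("stored in the Secret:", stored)
    print("recovered plaintext: ", decode_secret_value(stored))
```

This is why encryption at rest and tight RBAC on Secrets matter: base64 only makes binary data safe to embed in YAML.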
A rolling update gradually replaces old pods with new ones. maxUnavailable: maximum number of pods that can be unavailable during the update (default 25%). maxSurge: maximum number of pods that can be created above the desired count (default 25%). Example with 4 replicas: maxUnavailable=1 means at least 3 pods are always running; maxSurge=1 allows up to 5 pods temporarily. Set maxUnavailable=0 and maxSurge=1 for zero-downtime deployments. Monitor with kubectl rollout status deployment/app.
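The arithmetic behind maxUnavailable and maxSurge can be sketched in a few lines of Python. This is a simplified model with hypothetical function names, not the Deployment controller's code; it assumes the documented rounding rules (percentages round down for maxUnavailable and up for maxSurge) and does not model validation such as the rule that both values cannot be zero.

```python
def resolve(value, replicas, round_up):
    """Resolve an int or a percentage string such as '25%' against replicas."""
    if isinstance(value, str) and value.endswith("%"):
        percent = int(value[:-1])
        if round_up:
            # Ceiling division using integers only, to avoid float rounding.
            return -(-percent * replicas // 100)
        return percent * replicas // 100
    return int(value)


def rolling_update_bounds(replicas, max_unavailable="25%", max_surge="25%"):
    """Return the pod-count envelope a rolling update stays within."""
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    surge = resolve(max_surge, replicas, round_up=True)
    return {
        "min_available": replicas - unavailable,
        "max_total": replicas + surge,
    }


if __name__ == "__main__":
    print(rolling_update_bounds(4, max_unavailable=1, max_surge=1))
    print(rolling_update_bounds(10))  # defaults of 25% each
```

With 4 replicas and both values set to 1, the sketch reproduces the figures above: at least 3 pods available and at most 5 pods in total.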
Requests: the guaranteed minimum resources — used by the scheduler to place pods on nodes. Limits: the maximum allowed. CPU limit: the container is throttled (slowed down) when it hits the limit. Memory limit: the container is OOMKilled (exit code 137) and restarted — this causes CrashLoopBackOff if the OOM is consistent. Best practice: set requests equal to typical usage, limits at 1.5–2x requests. Use kubectl top pods to measure actual usage before setting values.
A DaemonSet ensures exactly one pod runs on every node (or a subset matching a node selector). Use cases: log collection (Fluentd, Fluent Bit), monitoring agents (Prometheus node-exporter, Datadog agent), network plugins (Calico, Weave), storage drivers. When a new node joins the cluster, the DaemonSet automatically schedules a pod on it. When a node is removed, the pod is garbage collected.
Job runs a task to completion — when all pods successfully terminate, the Job is considered complete. Used for batch processing, database migrations, backup jobs. CronJob runs a Job on a schedule (like cron). Example: */5 * * * * — every 5 minutes. Important: CronJobs may skip executions if the controller is down — for critical tasks, use a dedicated job scheduler with retry logic.
The scheduler runs in two phases. Filtering eliminates nodes that cannot run the pod (insufficient CPU/memory, untolerated taints, node affinity mismatch, unmet volume requirements). Scoring ranks the remaining nodes by factors including resource availability, pod affinity, image locality, and inter-pod spreading. The highest-scoring node wins. If no node passes filtering, the pod stays Pending. Check: kubectl describe pod <name> shows scheduler events explaining why a pod is Pending.
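The two-phase idea can be illustrated with a toy model in Python. Everything here (the Node and Pod classes, the scoring rule that simply prefers the node with the most free resources left) is a hypothetical simplification for illustration; the real scheduler uses a plugin framework with many more filters and weighted scores.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free_cpu_m: int       # free CPU in millicores
    free_mem_mi: int      # free memory in MiB
    taints: set = field(default_factory=set)


@dataclass
class Pod:
    cpu_m: int            # CPU request in millicores
    mem_mi: int           # memory request in MiB
    tolerations: set = field(default_factory=set)


def feasible(node, pod):
    """Filtering phase: enough resources and every taint is tolerated."""
    fits = node.free_cpu_m >= pod.cpu_m and node.free_mem_mi >= pod.mem_mi
    return fits and node.taints <= pod.tolerations


def score(node, pod):
    """Scoring phase (toy rule): prefer the most headroom after placement."""
    return (node.free_cpu_m - pod.cpu_m) + (node.free_mem_mi - pod.mem_mi)


def schedule(nodes, pod):
    """Return the chosen node name, or None if the pod would stay Pending."""
    candidates = [n for n in nodes if feasible(n, pod)]
    if not candidates:
        return None
    return max(candidates, key=lambda n: score(n, pod)).name


if __name__ == "__main__":
    nodes = [
        Node("node-a", 500, 1024),
        Node("node-b", 2000, 4096, taints={"gpu"}),
        Node("node-c", 1000, 2048),
    ]
    print(schedule(nodes, Pod(400, 512)))
    print(schedule(nodes, Pod(5000, 512)))
```

In this sketch node-b is filtered out for a pod without the matching toleration even though it has the most free capacity, which mirrors why a pod can sit Pending on a cluster that looks underused.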
Kubernetes runs CoreDNS as a cluster DNS server. Every Service gets a DNS record: <service-name>.<namespace>.svc.cluster.local. Pods within the same namespace can use just the service name. Cross-namespace: use the full FQDN. For StatefulSet pods, each gets its own DNS: pod-0.service.namespace.svc.cluster.local. This is how microservices discover each other — no hardcoded IPs. Headless services (ClusterIP: None) return individual pod IPs directly, used by StatefulSets and service meshes.
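The naming scheme is mechanical enough to express as two small Python helpers. The function names are hypothetical, and the sketch assumes the default cluster domain cluster.local, which an administrator can change.

```python
def service_fqdn(service, namespace, cluster_domain="cluster.local"):
    """DNS name of a Service: <service>.<namespace>.svc.<cluster-domain>."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def statefulset_pod_fqdn(statefulset, ordinal, headless_service, namespace,
                         cluster_domain="cluster.local"):
    """DNS name of one StatefulSet pod behind its headless Service."""
    pod_name = f"{statefulset}-{ordinal}"
    return f"{pod_name}.{service_fqdn(headless_service, namespace, cluster_domain)}"


if __name__ == "__main__":
    print(service_fqdn("api", "prod"))
    print(statefulset_pod_fqdn("mysql", 0, "mysql", "prod"))
```

A client in the prod namespace could reach the first name as just api, while a client in another namespace needs at least api.prod.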
By default, all pods in a cluster can communicate with each other. A NetworkPolicy restricts which pods can talk to which, acting like a firewall at layer 3/4. Example: allow only the api pod to access the db pod on port 5432, deny all other ingress. Requires a CNI that supports NetworkPolicy (Calico, Cilium — Flannel alone does not). Start with a default-deny policy in each namespace, then add explicit allow rules. Essential for PCI-DSS and SOC2 compliance.
NodePort exposes the service on a static port (30000-32767) on every node's IP. Use for local development or testing. LoadBalancer provisions a cloud load balancer (AWS ELB, Azure LB) and routes traffic to the service. LoadBalancer is expensive ($20+/month). For production, use Ingress instead of LoadBalancer for HTTP/HTTPS services. NodePort is rarely used in production (except for legacy apps).
A Headless Service has clusterIP: None. A DNS lookup for it returns the IPs of the individual pods instead of a single service IP. Use cases: (1) StatefulSets: each pod needs a stable network identity. (2) Service discovery: when you need to address pods directly (e.g., for database clusters). (3) Custom load balancing: when you want to implement your own load balancing logic.
CNI is a standard for configuring network interfaces in containers. Common plugins: (1) Calico — supports NetworkPolicy, BGP routing, most popular in production. (2) Cilium — uses eBPF for high performance, supports NetworkPolicy, service mesh. (3) Flannel — simple, no NetworkPolicy support, easy to set up. (4) Weave — easy to use, supports encryption. Choose Calico or Cilium for production clusters that need NetworkPolicy.
Best practice: (1) Use Ingress with an Ingress Controller (nginx-ingress, AWS ALB Ingress). (2) Configure TLS termination on the Ingress using cert-manager (automatic Let's Encrypt). (3) Set up NetworkPolicy so that application pods accept ingress traffic only from the Ingress controller. (4) Use Cloudflare or AWS CloudFront as a CDN/DDoS protection layer. (5) Enable a WAF (AWS WAF, ModSecurity) for HTTP inspection.
iptables mode (default) creates iptables rules for each service — the kernel handles packet forwarding. Good for small clusters, but linear scaling (O(n) rules). IPVS mode (IP Virtual Server) uses Linux's native load balancing — supports higher throughput, O(1) lookup, and more sophisticated algorithms (RR, LC, DH). IPVS is recommended for large clusters (1000+ services) due to performance gains.
A Service Mesh adds a dedicated infrastructure layer for service-to-service communication. Key features: Traffic management (canary, blue-green), Observability (metrics, traces, logs), Security (mTLS, RBAC). Popular mesh options: Istio (most powerful, complex), Linkerd (lightweight, simpler), Consul. Service meshes are optional but recommended for microservices with complex traffic patterns. Kubernetes does not include a service mesh by default — you add it as a separate layer.
Gateway API is a newer, more expressive Kubernetes API for traffic routing. Unlike Ingress (focused on HTTP/HTTPS), Gateway API covers L4 and L7 and supports multiple protocols (TCP, UDP, HTTP, gRPC). Key resources: GatewayClass (which controller implements the gateways), Gateway (listeners on a proxy or load balancer), HTTPRoute (routing rules). Gateway API is vendor-neutral and implemented by many major ingress controllers and service meshes. It is the successor to Ingress and recommended for new projects.
Two approaches: (1) Ingress with cert-manager: cert-manager automatically obtains and renews TLS certificates from Let's Encrypt. Set spec.ingressClassName: nginx on the Ingress (the older kubernetes.io/ingress.class: nginx annotation is deprecated) and add the annotation cert-manager.io/cluster-issuer: letsencrypt-prod. (2) Service of type LoadBalancer: terminate TLS on the cloud provider's load balancer with a certificate stored in the provider (for example, an AWS NLB with an ACM certificate). cert-manager is the preferred approach for production: it is automated, free, and integrates with most Ingress controllers.
A PersistentVolume (PV) is a cluster-level storage resource (e.g., an EBS volume, Azure Disk, NFS share). A PersistentVolumeClaim (PVC) is a request for storage by a pod. With dynamic provisioning (preferred), a StorageClass automatically creates the PV when a PVC is submitted — no manual PV creation needed. The pod mounts the PVC via volumes and volumeMounts.
A StorageClass defines the type of storage (e.g., gp2 for AWS EBS, azure-disk for Azure). It includes provisioner (the driver that creates the PV), parameters (volume type, IOPS, encryption), and reclaimPolicy (Retain, Delete). The requested size comes from the PVC, not the StorageClass. With dynamic provisioning, when a PVC requests a StorageClass, Kubernetes automatically creates a PV matching the class. This is the standard way to manage storage in production.
Retain — when the PVC is deleted, the PV and its underlying storage are NOT automatically deleted. The PV remains in Released state — you can manually recover the data. Delete — when the PVC is deleted, the PV and underlying storage are automatically deleted. Use Retain for production databases (you want to prevent accidental data loss). Use Delete for ephemeral storage (caches, temporary data).
(1) Install the EBS CSI Driver (Amazon's driver for EBS volumes). (2) Create a StorageClass for EBS (e.g., gp3). (3) Create a PVC referencing the StorageClass. (4) The EBS CSI Driver automatically creates an EBS volume and attaches it to the node where the pod is scheduled. (5) The pod mounts the volume via volumeMounts. For production, use EFS for shared storage (ReadWriteMany) and EBS for single-node storage (ReadWriteOnce).
emptyDir is a temporary volume that exists as long as the pod exists. It is created when the pod is scheduled and deleted when the pod is removed. Use for caches, scratch space, or temporary data. hostPath mounts a file or directory from the host node's filesystem into the pod. Use for accessing node-level data (e.g., logs, Docker socket). hostPath is not portable across nodes — avoid in production unless necessary.
CSI is a standard for exposing storage systems to Kubernetes. It defines a set of gRPC APIs that storage vendors implement. Kubernetes communicates with the CSI driver to: CreateVolume, DeleteVolume, ControllerPublishVolume (attach), NodePublishVolume (mount). All cloud providers have CSI drivers (EBS CSI, Azure Disk CSI, GCE PD CSI). CSI is the modern way to integrate storage into Kubernetes.
Strategies: (1) Volume snapshots: use CSI snapshot APIs (e.g., EBS snapshots, Azure Disk snapshots). (2) Database-specific backups: use mysqldump or pg_dump in a Job. (3) Velero (formerly Heptio Ark): backs up entire cluster state (resources + volumes). Velero supports CSI snapshots and is a widely used backup solution for Kubernetes. For production, run automated backups with Velero and store them in S3 or Azure Blob.
EBS (Elastic Block Store) — block storage, ReadWriteOnce (attached to a single node). High performance, low latency. Best for databases (MySQL, PostgreSQL). EFS (Elastic File System) — file storage, ReadWriteMany (multiple pods can read/write simultaneously). Lower performance, higher latency. Best for shared storage (WordPress, content management, log aggregation). Use EBS for stateful databases. Use EFS for shared storage across multiple pods.
A volume snapshot is a point-in-time copy of a PersistentVolume. CSI drivers support snapshot operations through two resources: VolumeSnapshot (the snapshot itself) and VolumeSnapshotClass (its configuration). Restore by creating a new PVC from the snapshot. Benefits: (1) Fast recovery: minutes instead of hours. (2) Cost-effective: incremental snapshots. (3) Consistent backups: for database volumes, flush or quiesce the database before the snapshot (for example, with a Velero pre-backup hook). Use Velero to automate snapshot creation and retention.
Storage scaling depends on the backend: (1) EBS: resizable in place through the EBS CSI driver when the StorageClass sets allowVolumeExpansion: true; increase pvc.spec.resources.requests.storage and the volume and filesystem grow online. (2) EFS: automatically scales to PB+ size. (3) Azure Disk: resizable in place (update pvc.spec.resources.requests.storage). (4) GCE PD: resizable in place. Best practice: use dynamic provisioning with a StorageClass that supports expansion (allowVolumeExpansion: true). Volumes can only grow; to shrink one, copy the data into a new, smaller volume.
RBAC has three objects: Role/ClusterRole (permissions), ServiceAccount/User/Group (who), RoleBinding/ClusterRoleBinding (links them). To grant read-only pod access: create a Role with get, list, watch verbs on pods resource, create a ServiceAccount, bind them with a RoleBinding. View effective permissions with kubectl auth can-i list pods --as=system:serviceaccount:namespace:sa-name. Always use namespace-scoped Roles over ClusterRoles unless cluster-wide access is genuinely needed.
Role is namespace-scoped — it applies only to resources in a specific namespace. ClusterRole is cluster-scoped — it applies to all namespaces (and cluster-scoped resources like nodes, PVs, CRDs). Use a Role for namespace-specific permissions. Use a ClusterRole for: (1) Permissions to cluster-scoped resources. (2) Permissions that should apply across all namespaces (e.g., cluster-admin). (3) Permissions used with a ClusterRoleBinding (binding applies to all namespaces).
PodSecurityPolicy (PSP) was a Kubernetes feature for controlling pod security (privileged containers, volume types, host network access). It was deprecated in 1.21 and removed in 1.25 due to complexity and usability issues. Replacement: Pod Security Admission (PSA), which is simpler and enforces the built-in Pod Security Standards: privileged (allow all), baseline (block known privilege escalations), restricted (strict hardening). PSA has been enabled by default since Kubernetes 1.23 (beta) and became stable in 1.25.
A ServiceAccount represents an identity for a pod. Each namespace has a default ServiceAccount. Pods use ServiceAccounts to authenticate with the Kubernetes API server. A User represents a human identity (authenticated via certs, tokens, or OIDC). ServiceAccounts are used for programmatic access; Users for human access. Best practice: create a dedicated ServiceAccount for each deployment or application, and grant only the permissions it needs.
The principle of least privilege means each pod or user should have only the permissions necessary to perform its function. In Kubernetes: (1) Use RBAC to grant minimal permissions. (2) Avoid cluster-admin for service accounts. (3) Use Pod Security Admission to restrict privileged containers. (4) Set network policies to limit pod-to-pod communication. (5) Use secrets management (Vault, External Secrets). Least privilege reduces attack surface and limits blast radius of security incidents.
Securing the API server: (1) Use TLS and enable client certificate authentication. (2) Disable anonymous access: set --anonymous-auth=false. (3) Enable RBAC: --authorization-mode=Node,RBAC. (4) Use OIDC or webhook token authentication to integrate with an identity provider (Azure AD, Google). (5) Restrict access to etcd, which holds secrets and cluster state: use TLS and a firewall. (6) Enable audit logging to track API access. (7) Restrict network access to the API server to trusted sources with firewall rules, security groups, or a private endpoint.
An admission controller intercepts API requests after authentication and authorisation, before persistence. It can modify or reject requests. Common controllers: (1) MutatingAdmissionWebhook — modify resources (e.g., Istio adds sidecars). (2) ValidatingAdmissionWebhook — validate resources (e.g., OPA Gatekeeper). (3) PodSecurity — enforce pod security standards (PSA). (4) ResourceQuota — enforce namespace quotas. Admission controllers are critical for security and policy enforcement.
Strategies: (1) External Secrets Operator: fetches secrets from AWS Secrets Manager, Vault, Azure Key Vault, and syncs them as Kubernetes Secrets. (2) HashiCorp Vault Agent Injector: a sidecar injects secrets directly into pods as files. (3) Sealed Secrets: encrypts secrets in Git, decrypted only in the cluster. (4) Secrets Store CSI Driver: mounts secrets from external stores as volumes. Best practice: use External Secrets Operator (ESO) with automatic rotation, so that when the secret changes in the external store, ESO updates the Kubernetes Secret.
kube-bench is a tool that checks Kubernetes clusters against the CIS Kubernetes Benchmark — hundreds of security best practices (RBAC, API server config, etcd, kubelet). kube-hunter is a penetration testing tool that hunts for security weaknesses in Kubernetes clusters (exposed API, insecure secrets, unauthenticated etcd). Run kube-bench regularly (e.g., weekly) to assess compliance. Run kube-hunter after each major cluster change to detect new vulnerabilities.
OPA Gatekeeper is an admission controller that uses Open Policy Agent (OPA) to enforce policies on Kubernetes resources. You define ConstraintTemplates (reusable policy logic) and Constraints (actual policies). Examples: enforce labels on all resources, restrict container images to approved registries, require resource limits on every pod. Gatekeeper is the most popular policy engine for Kubernetes, enabling shift-left security and compliance.
Step 1: Run kubectl describe pod <name> and check the Events section for scheduling failures. Step 2: Check for insufficient resources (CPU/memory). Step 3: Check node selector or affinity: no nodes may match the criteria. Step 4: Check taints and tolerations: if nodes are tainted, the pod needs a toleration. Step 5: Check PVC binding: the pod stays Pending if it requires a PVC that is not bound. Step 6: If all else fails, check node capacity: kubectl describe node <node-name> shows Allocatable and Allocated resources.
Step 1: Run kubectl describe pod <name> and check the Events section for OOMKilled, failed probes, or image pull errors. Step 2: Run kubectl logs <pod> --previous to see logs from the crashed container. Step 3: Check the exit code in the describe output: exit code 1 is a generic application error, 137 means the container was killed with SIGKILL (128 + 9), typically OOMKilled, 126 means the command was found but is not executable, and 127 means the command was not found. Common fixes: increase memory limits (OOMKilled), fix liveness probe timing, correct the ENTRYPOINT command, or fix application startup errors visible in logs.
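The exit-code triage can be captured in a small lookup helper. This is an illustrative Python sketch with a hypothetical function name; it follows the usual shell convention that codes above 128 mean "terminated by signal (code minus 128)" and only names a few common signals.

```python
# Signal numbers are the common Linux values; only a few are listed here.
COMMON_SIGNALS = {9: "SIGKILL", 11: "SIGSEGV", 15: "SIGTERM"}


def explain_exit_code(code):
    """Map a container exit code to a likely cause for first-pass triage."""
    if code == 0:
        return "completed successfully"
    if code == 126:
        return "command found but not executable"
    if code == 127:
        return "command not found"
    if code > 128:
        signal_number = code - 128
        name = COMMON_SIGNALS.get(signal_number, f"signal {signal_number}")
        return f"terminated by {name}"
    return "application error"


if __name__ == "__main__":
    for code in (1, 126, 127, 137, 143):
        print(code, "->", explain_exit_code(code))
```

A 137 only says the process received SIGKILL; confirm an out-of-memory kill by looking for the OOMKilled reason in the describe output.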
kubectl describe pod/<name> shows detailed information about a pod: (1) Events — the most useful part (scheduling failures, image pull issues, OOM). (2) Status — pod phase (Running, Pending, Failed). (3) Node — where the pod is scheduled. (4) Containers — image, ports, volume mounts, probe settings. (5) QoS — Guaranteed, Burstable, BestEffort. Always start troubleshooting with kubectl describe.
Use the -c flag: kubectl logs <pod-name> -c <container-name>. To see logs from all containers: kubectl logs <pod-name> --all-containers=true. For a continuous stream: kubectl logs -f <pod-name> -c <container-name>. For multi-container pods (e.g., sidecar patterns), this is essential. Also, kubectl logs --previous shows logs from a previous (crashed) container instance.
Use kubectl exec -it <pod-name> -- /bin/sh (or /bin/bash). This opens a shell inside the running container. For distroless images (no shell), use kubectl debug to start an ephemeral debug container. Inside the container, you can check environment variables, network connectivity, file system, and process state. Always use kubectl exec as a last resort — prefer logs and describe for initial diagnosis.
Step 1: Check the label selector: kubectl get service <name> -o wide shows the selector; verify that it matches the pod labels. Step 2: kubectl describe service: check the Endpoints list. If it is empty, no ready pods match the selector. Step 3: kubectl get pods --show-labels: confirm pods have the labels the Service expects. Step 4: Check port mapping: service.spec.ports.targetPort must match the containerPort. Step 5: Check network policy: a NetworkPolicy may block ingress to the pods.
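Most "Service has no endpoints" cases come down to a selector that does not match the pod labels. The matching rule (every selector key/value must be present on the pod) can be sketched in Python as follows; the function names and the sample labels are hypothetical, and the sketch ignores pod readiness.

```python
def selector_matches(selector, pod_labels):
    """True if every key/value pair in the selector is present on the pod.

    A Service without a selector gets no endpoints created automatically,
    so an empty selector is treated as matching nothing here.
    """
    if not selector:
        return False
    return all(pod_labels.get(key) == value for key, value in selector.items())


def endpoints_for(selector, pods):
    """Return the names of pods the Service would select."""
    return [name for name, labels in pods.items()
            if selector_matches(selector, labels)]


if __name__ == "__main__":
    pods = {
        "web-1": {"app": "web", "tier": "frontend"},
        "web-2": {"app": "web"},
        "db-1": {"app": "db"},
    }
    print(endpoints_for({"app": "web"}, pods))
    print(endpoints_for({"app": "wbe"}, pods))  # typo in selector: no endpoints
```

Extra labels on a pod never prevent a match; only missing or different values do.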
Step 1: kubectl get nodes: check status. If a node is NotReady, Step 2: kubectl describe node <node-name>: look at the Conditions section (Ready, MemoryPressure, DiskPressure, PIDPressure, NetworkUnavailable). Step 3: SSH into the node and check: systemctl status kubelet, journalctl -u kubelet -f. Step 4: Check disk space: df -h. Step 5: Check the container runtime: systemctl status containerd (or docker on older nodes). Common causes: kubelet not running, out of disk space, network issues.
Liveness probe checks if the container is alive — if it fails, Kubernetes restarts the container. Readiness probe checks if the container is ready to serve traffic — if it fails, Kubernetes removes the pod from service endpoints. Use liveness for deadlock detection. Use readiness for slow startup or temporary unavailability. Best practice: set initialDelaySeconds for both to avoid premature failures.
kubectl port-forward pod/<pod-name> 8080:80 forwards local port 8080 to pod port 80. This bypasses Services and NetworkPolicies — useful for debugging a single pod directly. For Services: kubectl port-forward service/<service-name> 8080:80. Use port-forward for: (1) Testing an unreleased version. (2) Accessing a debug endpoint. (3) Debugging a specific pod behind a load balancer. Never use port-forward in production.
A PodDisruptionBudget (PDB) limits how many pods of a deployment can be unavailable during voluntary disruptions (node drains, cluster upgrades). Example: minAvailable: 2 ensures at least 2 pods are running at all times during a drain. Without a PDB, draining a node could evict all replicas of a 3-replica deployment. Set PDBs for every production Deployment — it prevents accidental downtime and is required for passing most security audits.
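How many pods a drain may evict at a given moment follows directly from the budget. Below is a minimal Python sketch of that calculation, assuming integer values only (percentages are not modeled) and using a hypothetical function name.

```python
def allowed_disruptions(healthy, desired, min_available=None, max_unavailable=None):
    """Number of pods that may be evicted right now under a PDB.

    healthy: pods currently ready; desired: replicas the workload wants.
    Exactly one of min_available or max_unavailable must be given.
    """
    if (min_available is None) == (max_unavailable is None):
        raise ValueError("set exactly one of min_available or max_unavailable")
    if min_available is not None:
        required_healthy = min_available
    else:
        required_healthy = desired - max_unavailable
    return max(0, healthy - required_healthy)


if __name__ == "__main__":
    # 3 replicas, minAvailable: 2 -> one eviction at a time.
    print(allowed_disruptions(healthy=3, desired=3, min_available=2))
    # One pod is already down -> the drain has to wait.
    print(allowed_disruptions(healthy=2, desired=3, min_available=2))
```

When the result is 0, eviction requests are refused and kubectl drain keeps retrying until a replacement pod becomes ready.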
An Operator is a Kubernetes controller that extends the API with custom resources (CRDs) and automates the management of complex applications (databases, message queues, monitoring). It encapsulates domain-specific knowledge (backup, restore, scaling, upgrade). Use the Operator pattern when: (1) You need to automate complex application lifecycle management. (2) You want to provide a Kubernetes-native API for your application. (3) You need to manage stateful applications with custom logic.
A CustomResourceDefinition (CRD) is a way to extend the Kubernetes API by defining a new resource type (e.g., Database, MongoDBCluster). CRDs are simple to create — you define the schema (OpenAPI). API Extension (Aggregated API) is a more complex approach where you build a separate API server and register it with the main API server. Use CRD for most extensions (95% of cases). Use API Aggregation for advanced authentication/authorisation, or when you need a separate API server.
Helm is the package manager for Kubernetes. A chart is a collection of templated Kubernetes manifests with a values.yaml for configuration. Benefits: (1) Install complex applications with one command (helm install my-nginx ingress-nginx/ingress-nginx). (2) Manage environment-specific config via values files. (3) Upgrade and rollback releases. (4) Share reusable charts via Artifact Hub. In CI/CD, Helm is used to parameterise deployments — update the image tag in values.yaml and run helm upgrade.
GitOps uses a Git repository as the single source of truth for infrastructure and application state. Any change to production goes through a Git commit — the system continuously reconciles actual state with desired state. ArgoCD watches a Git repo containing Kubernetes manifests or Helm charts; when a diff is detected between the repo and the cluster, it automatically syncs (or notifies). Benefits: full audit trail, easy rollback (git revert), and no kubectl access needed for developers.
HPA automatically scales the number of pods based on observed metrics. Basic CPU-based example: kubectl autoscale deployment web --cpu-percent=70 --min=2 --max=20. The HPA controller checks metrics every 15 seconds. Requires metrics-server installed in the cluster. Advanced HPA can scale on custom metrics (e.g., requests per second from Prometheus via the custom.metrics.k8s.io API). Set meaningful minimum replicas to avoid cold-start latency, and ensure resource requests are set (HPA needs them for percentage calculation).
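The core HPA calculation is desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), clamped to the configured minimum and maximum. The Python sketch below is a simplified model with a hypothetical function name; it includes the default 10% tolerance band but leaves out stabilization windows, scaling policies, and the handling of unready pods.

```python
import math


def desired_replicas(current, current_util, target_util,
                     min_replicas, max_replicas, tolerance=0.1):
    """Simplified HPA replica calculation for a single utilization metric."""
    ratio = current_util / target_util
    if abs(ratio - 1.0) <= tolerance:
        desired = current          # close enough to target: no scaling
    else:
        desired = math.ceil(current * ratio)
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 4 pods at 90% CPU against a 70% target.
    print(desired_replicas(4, 90, 70, min_replicas=2, max_replicas=20))
    # 4 pods at 35% CPU against a 70% target.
    print(desired_replicas(4, 35, 70, min_replicas=2, max_replicas=20))
```

Because utilization is measured against each pod's CPU request, a missing request leaves the ratio undefined, which is why the HPA needs requests to be set.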
HPA (Horizontal Pod Autoscaler) scales the number of pod replicas. VPA (Vertical Pod Autoscaler) scales the resource requests/limits of a pod: it adjusts CPU and memory based on usage. Use HPA for stateless workloads that can scale horizontally. Use VPA for stateful workloads where scaling replicas is difficult (databases). VPA can run alongside HPA, but the two should not act on the same metric (CPU or memory) for the same workload, or they will interfere with each other. For most applications, HPA is the better choice.
KEDA (Kubernetes Event-Driven Autoscaling) extends HPA to scale based on external events (SQS queue depth, Kafka lag, Prometheus metrics). It is an operator that works alongside HPA. Example: scale a deployment based on the number of messages in an SQS queue. KEDA is ideal for event-driven applications and can scale to zero (no replicas when no events). Use KEDA when your application is driven by external event sources and you need dynamic scaling.
An Admission Webhook is an HTTP callback that receives a Kubernetes API request before it is persisted. Two types: (1) MutatingAdmissionWebhook — can modify the request (e.g., add labels, inject sidecars). (2) ValidatingAdmissionWebhook — can accept or reject the request (e.g., enforce policies). Webhooks are the foundation of policy engines like OPA Gatekeeper. Implementation: deploy a webhook server (HTTPS), register with the API server via a MutatingWebhookConfiguration or ValidatingWebhookConfiguration.
Initializers were an alpha mechanism that let a controller modify an object before it became visible to other clients. The feature never left alpha and was removed in Kubernetes 1.14. Replacement: mutating admission webhooks (MutatingAdmissionWebhook) for changes at creation time, and operators that use CRDs with finalizers for lifecycle logic. The initializer pattern was complex and error-prone; admission webhooks provide the same functionality with better control and observability.
Taints are applied to nodes — they prevent pods from scheduling on that node unless the pod has a matching Toleration. Use taints for: (1) Dedicated nodes — only certain pods can run. (2) Node maintenance — taint node before draining. (3) GPU nodes — only pods with GPU tolerations can schedule. Tolerations are added to pod specs. Example: tolerations: [{"key": "gpu", "operator": "Equal", "value": "true", "effect": "NoSchedule"}].
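The matching rule between tolerations and taints can be sketched in Python. This is a simplified model with hypothetical function names: it covers the Equal and Exists operators and treats only NoSchedule and NoExecute taints as blocking, since PreferNoSchedule is a soft preference.

```python
def tolerates(toleration, taint):
    """True if a single toleration matches a single taint."""
    # An empty or missing effect on the toleration matches every effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    if toleration.get("operator", "Equal") == "Exists":
        # Exists with no key tolerates every taint key.
        return not toleration.get("key") or toleration["key"] == taint["key"]
    return (toleration.get("key") == taint["key"]
            and toleration.get("value", "") == taint.get("value", ""))


def can_schedule(pod_tolerations, node_taints):
    """True if every blocking taint on the node is tolerated by the pod."""
    blocking = [t for t in node_taints
                if t["effect"] in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, taint) for tol in pod_tolerations)
               for taint in blocking)


if __name__ == "__main__":
    gpu_taint = {"key": "gpu", "value": "true", "effect": "NoSchedule"}
    gpu_toleration = {"key": "gpu", "operator": "Equal",
                      "value": "true", "effect": "NoSchedule"}
    print(can_schedule([gpu_toleration], [gpu_taint]))  # tolerated
    print(can_schedule([], [gpu_taint]))                # blocked
```

A toleration only permits scheduling on the tainted node; to make GPU pods land there, pair it with a node selector or node affinity.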
Amazon Elastic Kubernetes Service (EKS) is a managed Kubernetes service on AWS. EKS runs the control plane (API server, etcd, scheduler) across multiple AZs for high availability. The control plane is managed by AWS — you pay $0.10/hour. The worker nodes run in your AWS account (EC2). EKS integrates with IAM, VPC, CloudWatch, and ALB (via AWS Load Balancer Controller). EKS is the most popular managed Kubernetes in production.
eksctl is the official CLI for EKS. Install: brew install eksctl. Create a cluster: eksctl create cluster --name=prod --version=1.28 --nodegroup-name=standard-workers --node-type=t3.medium --nodes=3 --nodes-min=1 --nodes-max=5 --region=ap-south-1. This creates the control plane and worker nodes, and configures kubectl automatically. eksctl is one of the fastest ways to create an EKS cluster for development and production; substitute a currently supported Kubernetes version for 1.28.
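The same cluster can also be described declaratively. The following is a minimal sketch of an eksctl config file that mirrors the flags above, assuming a managed node group; the file name cluster.yaml is a hypothetical choice.

```yaml
# cluster.yaml (hypothetical file name)
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: prod
  region: ap-south-1
  version: "1.28"                     # quoted so YAML does not read it as a number
managedNodeGroups:
  - name: standard-workers
    instanceType: t3.medium
    desiredCapacity: 3
    minSize: 1
    maxSize: 5
```

It would be applied with eksctl create cluster -f cluster.yaml. Keeping the file in version control makes the cluster definition reviewable and repeatable.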
The AWS Load Balancer Controller is a Kubernetes controller that manages AWS load balancers (ALB, NLB) based on Ingress and Service resources. It watches for Ingress resources that use the alb ingress class (spec.ingressClassName: alb, or the older kubernetes.io/ingress.class: alb annotation) and automatically provisions an Application Load Balancer (ALB) and configures routing. Benefits: (1) Integration with ACM for TLS. (2) Path-based routing. (3) Sticky sessions. (4) WAF integration. The controller is not installed by default on a standard EKS cluster; most production setups add it, typically with Helm.
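A minimal sketch of an Ingress that the controller would turn into an ALB with path-based routing. The Ingress name, Service name and port are hypothetical, and the example uses spec.ingressClassName, the newer alternative to the kubernetes.io/ingress.class annotation; it assumes an IngressClass named alb exists in the cluster.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web                           # hypothetical name
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip     # route straight to pod IPs
spec:
  ingressClassName: alb
  rules:
    - http:
        paths:
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: api             # hypothetical Service
                port:
                  number: 80
```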
Managed node groups: AWS handles the lifecycle of EC2 nodes (creation, scaling, termination, AMI updates). You specify instance type and size; AWS manages the rest. Self-managed nodes: you create and manage the EC2 instances yourself, bootstrap them, and join them to the cluster. Use managed node groups for most workloads (less operational overhead). Use self-managed nodes when you need control that managed node groups do not expose, for example: (1) Heavily customised AMIs or bootstrap processes. (2) Specialised instance configurations (such as some GPU setups). (3) Bin-packing optimisations.
EKS Fargate is a serverless compute engine for Kubernetes — you define pods, AWS runs them on Fargate without managing any nodes. Benefits: (1) No node management. (2) Pay per pod (not per node). (3) Automatic scaling. Limitations: (1) No DaemonSets. (2) No privileged containers. (3) Pods only (no node-level access). Use Fargate for serverless workloads, batch jobs, and teams that want to avoid node management. Use EC2 for high-performance or stateful workloads.
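Which pods run on Fargate is decided by Fargate profiles. As an illustrative sketch in eksctl's config format, the profile below would send pods in a hypothetical batch namespace carrying a hypothetical compute=fargate label to Fargate; the cluster name and region are placeholders.

```yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: prod                          # placeholder cluster name
  region: ap-south-1
fargateProfiles:
  - name: batch                       # hypothetical profile name
    selectors:
      - namespace: batch              # pods in this namespace...
        labels:
          compute: fargate            # ...with this label are scheduled onto Fargate
```

Pods that match no profile selector continue to need EC2 nodes.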
IRSA (IAM Roles for Service Accounts) allows Kubernetes ServiceAccounts to assume IAM roles. Steps: (1) Create an IAM role with a trust policy that allows identities issued by the EKS cluster's OIDC provider to assume the role. (2) Annotate the ServiceAccount with the IAM role ARN. (3) Run the pod with that ServiceAccount; the EKS pod identity webhook injects the role ARN and a projected service account token, which the AWS SDK exchanges for temporary credentials. Benefits: (1) No static AWS keys. (2) Fine-grained IAM permissions per pod. (3) Security best practice. IRSA is a standard way to access AWS services from EKS.
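Steps (2) and (3) can be sketched as follows. The account ID, role name, ServiceAccount name and image are hypothetical, and the IAM role with its OIDC trust policy is assumed to exist already.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: s3-reader                     # hypothetical name
  namespace: default
  annotations:
    # Placeholder ARN of the IAM role this ServiceAccount may assume
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/s3-reader-role
---
apiVersion: v1
kind: Pod
metadata:
  name: reader
  namespace: default
spec:
  serviceAccountName: s3-reader       # links the pod to the annotated ServiceAccount
  containers:
    - name: app
      image: registry.example.com/reader:1.0    # placeholder image
```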
The EKS cluster autoscaler scales the number of worker nodes based on pod resource demands. It watches for unschedulable pods (due to insufficient node capacity) and adds nodes. It also removes idle nodes. Configuration: set the cluster-autoscaler deployment with the correct AWS region, auto-scaling group names, and scale-up/down policies. Best practice: enable cluster autoscaler for all production EKS clusters to handle workload spikes automatically.
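As an illustration of that configuration, the fragment below shows the container section of a cluster-autoscaler Deployment using tag-based auto-discovery of Auto Scaling groups. It is not a complete manifest: the cluster name prod, the region and the image version are placeholders, and the ServiceAccount, RBAC rules and IAM permissions are assumed to be set up separately.

```yaml
# Fragment of the cluster-autoscaler Deployment's pod spec (not a complete manifest)
containers:
  - name: cluster-autoscaler
    # Placeholder tag: pick the release matching the cluster's Kubernetes minor version
    image: registry.k8s.io/autoscaling/cluster-autoscaler:<version>
    command:
      - ./cluster-autoscaler
      - --cloud-provider=aws
      # Discover ASGs by tag instead of listing group names; "prod" is a placeholder cluster name
      - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/prod
      - --balance-similar-node-groups
      - --skip-nodes-with-system-pods=false
    env:
      - name: AWS_REGION
        value: ap-south-1
```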
There are two approaches to logging and monitoring on EKS: (1) EKS control plane logging: enable logs for the API server, scheduler, controller manager, audit and authenticator in the EKS console. (2) CloudWatch Container Insights: collects metrics (CPU, memory, network) and logs from pods and nodes. To enable it, download the quickstart manifest from https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/quickstart/cwagent-fluentd-quickstart.yaml, replace its {{cluster_name}} and {{region_name}} placeholders with your own values, and apply the result with kubectl apply -f.
EKS is managed Kubernetes — you get the full Kubernetes API, ecosystem (Helm, ArgoCD, Prometheus), and portability across clouds. ECS is AWS's native container orchestrator — simpler to set up and use, with deep AWS integration (CloudWatch, Service Discovery, IAM). Use EKS for: (1) Multi-cloud or hybrid deployments. (2) Kubernetes-specific tooling. (3) Complex workloads. Use ECS for: (1) AWS-only deployments. (2) Simpler needs. (3) Cost sensitivity (ECS control plane is free).
EKS security best practices: (1) Enable RBAC: use AWS IAM for authentication and map identities to Kubernetes RBAC. (2) Use IRSA: no static keys. (3) NetworkPolicy: use Calico or Cilium. (4) Encrypt secrets: enable envelope encryption of Kubernetes secrets with AWS KMS. (5) Enable control plane logging: audit logs, API server logs. (6) Use security groups: restrict node group communication. (7) Run kube-bench: monthly compliance scans. (8) Enable Pod Security Admission (PSA): enforce pod security standards.
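Pod Security Admission is configured through namespace labels. A minimal sketch, assuming a hypothetical payments namespace that should enforce the restricted standard:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: payments                      # hypothetical namespace
  labels:
    pod-security.kubernetes.io/enforce: restricted   # reject non-compliant pods
    pod-security.kubernetes.io/warn: restricted      # also warn the client on violations
    pod-security.kubernetes.io/audit: restricted     # and record violations in audit logs
```

Starting with only the warn and audit labels is a common way to gauge impact before switching on enforce.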
Azure Kubernetes Service (AKS) is Microsoft Azure's managed Kubernetes offering. Compared to EKS: (1) Free control plane tier: the Free tier has no per-hour cluster charge, while the Standard tier is paid and adds an uptime SLA. (2) Azure AD (now Microsoft Entra ID) integration: native authentication. (3) Azure Monitor: integrated monitoring. (4) Virtual Nodes: similar to Fargate (serverless pods). AKS is ideal for Azure-centric organisations and integrates deeply with Azure DevOps, Container Registry, and Azure Policy.
Prerequisites: install the Azure CLI (az) and log in: az login. Create a resource group: az group create --name rg-aks --location eastus. Create the AKS cluster: az aks create --resource-group rg-aks --name myAKSCluster --node-count 3 --enable-addons monitoring --generate-ssh-keys. Get credentials: az aks get-credentials --resource-group rg-aks --name myAKSCluster. AKS is quick to set up: a basic cluster is often running in under 5 minutes.
AKS Virtual Nodes is a serverless pod experience using Azure Container Instances (ACI). It enables pods to be scheduled on ACI instead of worker nodes. Benefits: (1) No node management. (2) Rapid scaling. (3) Pay-per-second. Use cases: (1) Batch processing. (2) CI/CD pipelines. (3) Burst workloads. Limitations: (1) No DaemonSets. (2) Networking constraints (virtual nodes require Azure CNI networking). (3) No privileged containers. Virtual Nodes are a good fit for event-driven workloads.
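Pods are steered onto a virtual node with a nodeSelector and tolerations. The sketch below follows the label and taint keys used in Azure's published examples, which are an assumption here and may differ between AKS versions; the pod name and image are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: burst-worker                  # hypothetical name
spec:
  containers:
    - name: worker
      image: registry.example.com/worker:1.0    # placeholder image
  nodeSelector:
    kubernetes.io/role: agent
    kubernetes.io/os: linux
    type: virtual-kubelet             # select the virtual node backed by ACI
  tolerations:
    - key: virtual-kubelet.io/provider
      operator: Exists
    - key: azure.com/aci
      effect: NoSchedule
```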
Integration steps: (1) Enable AAD integration when creating the cluster: --enable-aad. (2) Bind Kubernetes roles to AAD groups, for example with kubectl create clusterrolebinding, using each group's object ID. (3) Kubernetes RBAC then authorises each user according to their AAD group membership. (4) Run az aks get-credentials; the first kubectl command prompts for an Azure AD sign-in. Benefits: (1) Centralised identity management. (2) Multi-factor authentication. (3) Conditional access policies. AAD integration is recommended for production AKS clusters.
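The role binding step can also be written declaratively. In this sketch the binding name is hypothetical and the all-zero GUID is a placeholder for the object ID of a real AAD group.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: aks-platform-admins           # hypothetical name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin                 # built-in role; prefer a narrower role where possible
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: Group
    name: "00000000-0000-0000-0000-000000000000"   # placeholder: AAD group object ID
```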
Azure Policy for Kubernetes is a policy engine that extends OPA Gatekeeper to enforce governance on AKS clusters. Use cases: (1) Enforce namespaced resources. (2) Require specific labels. (3) Block privileged containers. (4) Ensure images come from approved registries. Azure Policy integrates with the Azure Portal — define policies in the portal, and Gatekeeper enforces them in the cluster. It's the recommended policy engine for AKS.
AKS is a full Kubernetes platform — you get the entire Kubernetes API, orchestration, and ecosystem. ACI is a serverless container runtime — simple, quick (seconds), but no Kubernetes features (pods, services, deployments). Use AKS for: (1) Full Kubernetes workloads. (2) Complex orchestration. (3) Stateful apps. Use ACI for: (1) Simple tasks (e.g., batch processing). (2) Quick prototyping. (3) Burst workloads (via Virtual Nodes).
Enable Azure Monitor for containers when creating the cluster (--enable-addons monitoring). It collects CPU, memory, network, and disk metrics from nodes and pods. View in the Azure Portal under "Insights". Features: (1) Workloads dashboard — pod health, replicas. (2) Metrics — CPU/memory per namespace. (3) Logs — container logs with Log Analytics. (4) Alerting — create alerts for high CPU, pod failures. Azure Monitor is the primary monitoring tool for AKS.
AKS security best practices: (1) Enable RBAC with AKS-managed Azure AD integration: use Azure AD for identity. (2) Use network policies: enable Calico or Azure Network Policy. (3) Enable Azure Policy for Kubernetes: enforce governance. (4) Use Azure Key Vault with the Secrets Store CSI Driver: secrets management. (5) Enable container image scanning with Microsoft Defender for Cloud (formerly Azure Security Center). (6) Restrict API server access with authorized IP ranges or a private cluster.
The AKS cluster autoscaler (similar to EKS cluster autoscaler) scales the number of worker nodes based on pod resource demands. It watches for unschedulable pods and adds nodes. It also removes idle nodes. Enable via CLI: az aks update --resource-group rg-aks --name myAKSCluster --enable-cluster-autoscaler --min-count 3 --max-count 10. Cluster autoscaler is recommended for all production AKS clusters.
Each AKS cluster is a single Kubernetes cluster. Azure Kubernetes Fleet Manager is a service for managing multiple AKS clusters across regions (multi-cluster management). Fleet Manager provides: (1) Centralised API for managing many clusters. (2) Propagation of resources across clusters. (3) Multi-cluster load balancing. Use Fleet Manager when you run many AKS clusters (for example, 10 or more) and need centralised governance.
Google Kubernetes Engine (GKE) is Google Cloud's managed Kubernetes service. Kubernetes was originally developed at Google, and GKE reflects this heritage. GKE is often considered the most mature managed Kubernetes because: (1) Autopilot: fully automated cluster management. (2) Node auto-repair: automatic node replacement. (3) Fast updates: control plane upgrades in minutes. (4) Advanced features: Workload Identity and Traffic Director.
GKE Autopilot is a fully managed Kubernetes mode where Google handles the entire cluster: you only define pods (including resource requests). There is no node management and no manual cluster upgrades. Standard mode gives you control over nodes, node pools, and upgrades. Use Autopilot for: (1) Serverless workloads. (2) Teams that want to avoid infrastructure management. (3) Simple applications. Use Standard mode for: (1) Custom node configurations. (2) Full control over GPU and other specialised node pools. (3) Advanced networking.
Workload Identity is GKE's equivalent of AWS IRSA: it allows Kubernetes ServiceAccounts to act as GCP IAM service accounts. Steps: (1) Create a GCP IAM service account and grant it the roles it needs. (2) Create a Kubernetes ServiceAccount. (3) Allow the Kubernetes ServiceAccount to impersonate the GCP service account (roles/iam.workloadIdentityUser), then annotate the ServiceAccount with the GCP service account's email. (4) The pod uses the ServiceAccount, and GKE supplies short-lived GCP credentials through its metadata server. Benefits: (1) No static keys. (2) Fine-grained permissions. (3) Secure access to GCP services (Cloud Storage, BigQuery, etc.).
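The annotation step can be sketched as below. The ServiceAccount name, project ID and GCP service account email are hypothetical placeholders.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: gcs-reader                    # hypothetical Kubernetes ServiceAccount
  namespace: default
  annotations:
    # Placeholder email of the GCP service account to act as
    iam.gke.io/gcp-service-account: gcs-reader@my-project.iam.gserviceaccount.com
```

The annotation alone is not sufficient: the GCP service account must also grant roles/iam.workloadIdentityUser to the member serviceAccount:my-project.svc.id.goog[default/gcs-reader], where my-project is again a placeholder.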
Install the gcloud CLI and log in: gcloud auth login. Create a cluster: gcloud container clusters create my-cluster --zone us-central1-a --num-nodes 3 --machine-type e2-standard-2. Get credentials: gcloud container clusters get-credentials my-cluster --zone us-central1-a. For Autopilot: gcloud container clusters create-auto my-autopilot-cluster --region us-central1. GKE is one of the easiest managed Kubernetes services to get started with, which makes it a good choice for learning.
Node auto-repair in GKE automatically monitors node health. If a node keeps failing health checks (e.g., disk pressure, network issues), GKE drains and recreates the node. This reduces the operational burden of node management. Auto-repair is enabled by default for GKE clusters. Together with node auto-upgrade (automatic Kubernetes version upgrades), GKE provides a highly automated cluster management experience.
GKE integrates natively with Google Cloud Observability (formerly Stackdriver). Logging and monitoring are enabled by default on new clusters; the older --enable-stackdriver-kubernetes flag has been superseded by the --logging and --monitoring flags, for example: gcloud container clusters create my-cluster --logging=SYSTEM,WORKLOAD --monitoring=SYSTEM. Features: (1) Kubernetes metrics: pod, node, container metrics. (2) Logging: container logs, control plane logs. (3) Prometheus integration: native Prometheus metrics collection. (4) SLO monitoring: set up service level objectives. Cloud Observability is the primary monitoring tool for GKE.
The GKE cluster autoscaler scales the number of worker nodes based on pod resource demands. It works similarly to EKS and AKS autoscalers: (1) Watches for unschedulable pods. (2) Adds nodes from a node pool. (3) Removes idle nodes after a cooldown period (typically 10 minutes). Enable via gcloud container clusters create my-cluster --enable-autoscaling --min-nodes=3 --max-nodes=10. Cluster autoscaler is recommended for production GKE clusters.
Autopilot is fully managed — Google controls the cluster, node pools, and upgrades. You only define pods and their resource requests. Standard mode gives you control over node pools, instance types, and upgrade strategies. Use Autopilot for: (1) Serverless workloads. (2) Teams that want to avoid infrastructure management. (3) Simple applications. Use Standard mode for: (1) Custom node configurations (GPUs, high memory). (2) Advanced networking. (3) Fine-grained control over cluster resources.
GKE security best practices: (1) Enable RBAC: use Google IAM for authentication. (2) Use Workload Identity: avoid static credentials. (3) Enable NetworkPolicy: use Calico or GKE Dataplane V2. (4) Enable Binary Authorization: only allow trusted images. (5) Use Security Command Center: vulnerability scanning. (6) Enable Pod Security Admission (PSA): enforce pod security standards. (7) Restrict API server access: use private clusters with VPC.
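For the NetworkPolicy item, a common starting point is a default-deny policy per namespace, with explicit allow rules added afterwards. A minimal sketch, assuming a hypothetical payments namespace and a cluster where network policy enforcement is enabled:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: payments                 # hypothetical namespace
spec:
  podSelector: {}                     # empty selector matches every pod in the namespace
  policyTypes:
    - Ingress                         # no ingress rules listed, so all inbound traffic is denied
```

The manifest is plain Kubernetes, so the same policy applies unchanged on EKS and AKS clusters that enforce NetworkPolicy.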
GKE is the standard managed Kubernetes service. GKE Enterprise (formerly Anthos) is a platform for multi-cluster and hybrid Kubernetes management. It adds: (1) Multi-cluster networking (MCS). (2) Service mesh (Anthos Service Mesh). (3) Policy management (Config Sync). (4) Multi-cloud support (AWS, Azure, on-prem). Use GKE for single-cluster workloads. Use GKE Enterprise for large-scale, multi-cluster, or hybrid deployments.
CKA is a performance-based exam (2 hours, live cluster). Preparation: (1) Use Kubernetes the Hard Way by Kelsey Hightower — hands-on cluster setup. (2) CKA practice questions — use killer.sh, CKA practice labs. (3) Master kubectl — efficient use of imperative commands (kubectl run --image, kubectl create deployment). (4) Know the documentation — exam is open-book, but you need to know what to search for. (5) Time management — don't spend too long on one question. Thick Brain's CKA preparation course includes full practice exams.
CKA (Certified Kubernetes Administrator) focuses on cluster administration, installation, configuration, troubleshooting, and networking. CKAD (Certified Kubernetes Application Developer) focuses on developing, deploying, and troubleshooting applications. CKA is more operations-focused (SRE, DevOps). CKAD is more development-focused (backend, microservices). CKA is generally considered harder and more valued by employers. Start with CKA.
(1) Flag and skip: if a question takes more than 5 minutes, flag it and move on. (2) Use imperative commands: kubectl run and kubectl expose are faster than writing YAML. (3) Lean on the kubectl cheat sheet in the official documentation for common commands and YAML snippets; personal notes are not permitted in the exam. (4) Practice with a timer: simulate the 2-hour exam with 15-20 questions. (5) Don't overthink: most questions have a straightforward solution. (6) Use aliases: alias k=kubectl saves time.
(1) Not reading the question carefully — missing resource names, namespaces, or port numbers. (2) Using the wrong syntax — e.g., --image vs --image-pull-policy. (3) Forgetting to switch to the correct namespace — many questions require -n or --namespace. (4) Not testing after creating a resource — check pod status before moving on. (5) Spending too much time on one question — flag and revisit. (6) Not using the documentation — you have access to kubernetes.io — use it.
killer.sh is the exam simulator bundled with the CKA. Steps: (1) Purchase the CKA exam (includes 2 free killer.sh sessions). (2) Access killer.sh through the Linux Foundation portal where you manage your exam. (3) Each session stays available for 36 hours once activated. (4) Questions are similar to the actual exam and cover the same domains. (5) After completion, you can review your answers and see the correct solutions. killer.sh is one of the most effective CKA preparation tools, so use both sessions fully before taking the real exam.
CKS (Certified Kubernetes Security Specialist) is an advanced certification focusing on Kubernetes security (RBAC, NetworkPolicy, admission control, image scanning, runtime security). Prerequisite: CKA. CKS builds on CKA by adding security-specific topics. If you work in a security-sensitive environment or want to specialise in K8s security, pursue CKS after CKA.
The CKA exam is open-book: you can access https://kubernetes.io/docs. Tips: (1) Know the key pages, such as kubernetes.io/docs/reference/kubectl/ and kubernetes.io/docs/tasks/; do not count on personal browser bookmarks being available in the exam environment. (2) Use search: the site search and Ctrl+F are faster than navigating. (3) Look for examples: most pages have YAML examples you can copy and modify. (4) Don't rely on memory: know what to search for, not the exact syntax. (5) Practice with the documentation: during your preparation, use the docs to solve problems.
CKA domains: (1) Cluster Architecture, Installation & Configuration (25%) — cluster setup, etcd, API server, kubelet. (2) Workloads & Scheduling (15%) — Deployments, StatefulSets, Jobs, scheduling. (3) Services & Networking (20%) — Service, Ingress, NetworkPolicy, DNS. (4) Storage (10%) — PV, PVC, StorageClass, CSI. (5) Troubleshooting (30%) — diagnosis of pods, nodes, services, networking issues. The heaviest domain is troubleshooting — focus on that during preparation.
CKA exam cost: USD 395, which includes one free retake (confirm the current price before you register, as it changes periodically). Register via the Linux Foundation training and certification portal. Steps: (1) Create a Linux Foundation account. (2) Select the CKA certification. (3) Pay the exam fee. (4) Schedule the exam within 12 months of purchase. (5) If you do not pass, use the free retake within the same 12-month window. The exam is proctored online and lasts 2 hours.
Thick Brain Technology's Advanced Kubernetes (EKS) course includes comprehensive CKA preparation: (1) 60 hours of live training covering all CKA domains. (2) Real EKS cluster labs — practice with actual K8s clusters. (3) CKA practice questions — simulated exam experiences. (4) killer.sh integration — practice with the official CKA exam simulator. (5) Instructor-led CKA review sessions — targeted preparation for the exam. (6) Placement support — help with job applications after certification. Book a free demo to start your CKA journey.
Frequently Asked Questions
The CKA (Certified Kubernetes Administrator) is a performance-based exam by CNCF. You solve real Kubernetes tasks in a live cluster over 2 hours — no multiple choice. It is the most credible K8s certification and widely recognised by Bangalore's top tech employers.
Kubernetes engineers in Bangalore earn ₹8-14 LPA at entry level (1-3 years), ₹14-24 LPA at mid-level (3-6 years), and ₹22-38 LPA for senior architects. CKA holders earn 25-35% more than non-certified peers at every experience level.
Yes — Docker fundamentals are a prerequisite for Kubernetes. You should understand container images, Dockerfiles, container networking and registries before starting Kubernetes. Thick Brain Technology's DevOps courses cover Docker before progressing to Kubernetes.
EKS (AWS) is best for AWS-centric organisations. AKS (Azure) is best for Microsoft/Azure orgs. GKE (Google) is best for advanced K8s features and is the most mature managed Kubernetes. All three abstract away control plane management — choose based on your cloud provider.
With structured training, most learners become proficient in 10-12 weeks. Thick Brain Technology's Kubernetes course covers EKS, AKS, GKE, CKA preparation and real cluster labs — you'll be job-ready in 3-4 months.
Thick Brain Technology offers comprehensive live online Kubernetes training with real cluster labs, CKA preparation, and placement support. The course covers EKS, AKS, GKE, Helm, GitOps and advanced Kubernetes concepts. Book a free demo to see our live cluster labs.
Conclusion: Master Kubernetes in 2026
Kubernetes proficiency is no longer optional for DevOps engineers and cloud architects in 2026. Whether you specialise in AWS EKS, Azure AKS, or Google GKE, the core Kubernetes skills transfer across all platforms. Pursuing the CKA certification validates these skills in a way that employers trust.
The market rewards engineers who combine strong Kubernetes fundamentals with real cluster experience. CKA-certified engineers command 25-35% salary premiums and are in high demand across Bangalore's top tech companies.
Thick Brain Technology offers advanced Kubernetes training on all three managed platforms — EKS, AKS and GKE — with real cluster environments and CKA-aligned practice scenarios. Book a free demo class to deploy your first Kubernetes application live.
Cloud & DevOps Curriculum Experts · Bengaluru, India
The Thick Brain Technology editorial team comprises certified cloud architects, active DevOps practitioners, and career coaches who have collectively trained 10,000+ IT professionals across India. Our content is written by engineers who work with these technologies in production environments daily — not generalist content writers.
10,000+ Students Trained · CKA Certified Trainers · EKS Certified Practitioners
Student Success
Real Students. Real Outcomes.
Our Kubernetes graduates are placed at top tech companies across Bengaluru and India.
★★★★★
I was a DevOps engineer for 3 years but struggled with Kubernetes. After Thick Brain's Advanced Kubernetes course, I passed CKA on my first attempt and got a 65% salary hike at a product startup. The real EKS cluster labs were the game-changer.
Ravi Kulkarni
Senior DevOps Engineer, Bengaluru
★★★★★
Coming from a support background, I was intimidated. But the course starts with Docker fundamentals and builds up. I now lead the K8s practice at a FinTech company. The CKA preparation was exactly what I needed.
Sneha Nair
Platform Engineer, FinTech · Bangalore
★★★★★
The real cluster labs are what made the difference — we used actual EKS clusters, not simulators. After passing my CKA, I negotiated a 70% salary increase at my current company. The Helm and GitOps modules were incredibly valuable.