Kubernetes Tutorial: From Zero to Production
A hands-on Kubernetes tutorial that takes you from zero to deploying and managing containerized applications on a local cluster. Learn core concepts, write YAML manifests, use Helm, explore advanced topics, and understand how Kubernetes works in production on AWS EKS, Google GKE, and Azure AKS.
Complete Tutorial Code
Follow along with the complete source code for this Kubernetes tutorial. Includes setup, deployments, services, Helm charts, advanced topics, and production guidance across six chapters.
What is Kubernetes?
Kubernetes (often abbreviated as K8s) is an open-source container orchestration platform originally developed by Google. It automates the deployment, scaling, and management of containerized applications across a cluster of machines.
At its core, Kubernetes answers the question: “I have many containers — how do I run them reliably, at scale, across multiple machines?”
New to containers or Docker? Check out the Core Concepts reference guide first.
Key capabilities:
Self-Healing
Automatically restarts failed containers, replaces and reschedules them when nodes die — keeping your application running without manual intervention.
Horizontal Scaling
Scale your application up or down with a single command or automatically based on CPU/memory usage using the Horizontal Pod Autoscaler.
Load Balancing & Service Discovery
Distributes network traffic so deployments are stable. Containers can find each other by name without hardcoded IPs — built-in service discovery.
Secret & Config Management
Store and manage sensitive information separately from your container image using ConfigMaps & Secrets — keeping credentials out of your codebase.
When Should You Use Kubernetes?
Kubernetes is a powerful tool, but it's not always the right one. Here's how it compares to other deployment strategies:
✅ Use Kubernetes when:
- You have multiple microservices that need to communicate, scale independently, and be deployed separately.
- You need fine-grained control over resource allocation (CPU, memory limits per service).
- You're running stateful workloads (databases, queues) alongside stateless ones.
- You need multi-cloud or hybrid-cloud portability — your app runs the same way on AWS, GCP, Azure, or on-prem.
- Your team needs advanced deployment strategies like canary releases, blue/green deployments, or A/B testing.
- You're operating at significant scale (dozens of services, hundreds of instances).
⚡ Use Serverless (e.g., Vercel / Next.js) when:
- You're building a frontend-heavy or full-stack web app where the platform handles all infrastructure.
- You want zero infrastructure management — just push code and it's live.
- Your traffic is unpredictable or low — serverless scales to zero and you only pay for what you use.
- You don't need persistent connections, long-running processes, or custom networking.
Example: A Next.js marketing site or SaaS frontend deployed on Vercel is a perfect serverless use case. You don't need Kubernetes for that.
🖥️ Use Provisioned Resources (e.g., AWS EC2) when:
- You need a specific OS environment or custom kernel configuration.
- You're running legacy applications that aren't containerized.
- You're doing GPU-intensive workloads (ML training) where you need direct hardware access.
Summary Table
| Scenario | Best Choice |
|---|---|
| Microservices at scale | ✅ Kubernetes |
| Simple web app / frontend | ⚡ Serverless (Vercel, Netlify) |
| Full control over OS/hardware | 🖥️ EC2 / VMs |
| Multi-service backend with APIs | ✅ Kubernetes |
| Unpredictable traffic, pay-per-use | ⚡ Serverless |
| Legacy app, not containerized | 🖥️ EC2 / VMs |
| Portable, cloud-agnostic deployment | ✅ Kubernetes |
Prerequisites
Before starting, make sure you have:
- A computer running macOS, Linux, or Windows (WSL2 recommended on Windows)
- Basic familiarity with the command line / terminal
- Docker installed and running
Tutorial Chapters
The tutorial is organized into six chapters plus a core concepts reference guide. Work through them in order for the best learning experience.
🛠️ Chapter 01 — Setup
Install the required tools, create your first k3d cluster, and explore it with kubectl. By the end of this chapter you'll have a fully functional local Kubernetes cluster running on your machine.
🚀 Chapter 02 — Deployment
Write YAML manifests, create namespaces, deploy an application, and verify it with BusyBox. Learn the fundamental building blocks of Kubernetes workloads — Pods, ReplicaSets, and Deployments.
🌐 Chapter 03 — Services and Beyond
Expose your app with a ClusterIP Service and an Ingress, add resource requests and limits, and learn the Kubernetes architecture. Understand how traffic flows from the outside world into your cluster and between services.
⛵ Chapter 04 — Helm
Use Helm — the Kubernetes package manager — to install, upgrade, and roll back applications on a k3d cluster with podinfo. Learn how Helm charts simplify complex deployments and enable repeatable releases.
🔬 Chapter 05 — Advanced Topics
Pod controllers (Deployment, DaemonSet, Job), stateful workloads, security contexts, Snyk scanning, Prometheus & Grafana monitoring, and Horizontal Pod Autoscaler. Everything you need to run production-grade workloads.
☁️ Chapter 06 — Kubernetes in Production
How this tutorial compares to on-prem and managed cloud Kubernetes. Deep dive on AWS EKS, Google GKE, and Azure AKS — understand the trade-offs and operational considerations for each managed Kubernetes offering.
🛠️ Chapter 01 — Setup: Your First Kubernetes Cluster
In this chapter, we'll install all the tools you need and spin up a local Kubernetes cluster using k3d. By the end, you'll have a running cluster and know how to inspect it with kubectl.
Prerequisites
Before we begin, make sure you have:
- A computer running macOS, Linux, or Windows (WSL2 strongly recommended on Windows)
- A terminal / command-line interface
- An internet connection (to pull Docker images)
Install Docker
k3d runs Kubernetes nodes as Docker containers, so Docker is required.
macOS
Download and install Docker Desktop from the official site:
docs.docker.com/desktop/install/mac-install/
Linux
Verify Docker is working
Install kubectl
kubectl is the command-line tool for interacting with any Kubernetes cluster.
macOS (Homebrew)
Linux
Verify
Install k3d
k3d is a lightweight wrapper that runs k3s (a minimal Kubernetes distribution) inside Docker containers. It makes creating local Kubernetes clusters incredibly fast and easy.
macOS (Homebrew)
macOS / Linux (install script)
Verify
Create Your First Cluster
Now for the fun part — let's create a Kubernetes cluster! With k3d, this is a single command:
k3d cluster create mycluster
What's happening here? k3d pulls the k3s Docker image, starts a Docker container that acts as your Kubernetes server node (control plane), and automatically configures kubectl to connect to this new cluster by updating your ~/.kube/config file.
Explore the Cluster
Now let's use kubectl to inspect what's running in our new cluster.
1. kubectl cluster-info
   Displays the addresses of the Kubernetes control plane and core services (like CoreDNS). Confirms your kubectl is connected to the right cluster.
2. kubectl get nodes
   Lists all the nodes in your cluster. A node is a machine (in our case, a Docker container) that runs your workloads. STATUS: Ready means the node is healthy.
3. kubectl get namespaces
   Lists all namespaces in the cluster. Namespaces are virtual sub-clusters; default, kube-system, kube-public, and kube-node-lease are created automatically.
4. kubectl get pods -A
   Lists all pods across all namespaces (-A = --all-namespaces). You'll see system pods: CoreDNS, local-path-provisioner, metrics-server, and Traefik.
5. kubectl get services -A
   Lists all services across all namespaces. Services provide stable network endpoints to access sets of pods.
Stop the Cluster
When you're done with this chapter, stop and delete the cluster so the next chapter starts fresh:
k3d cluster stop mycluster
k3d cluster delete mycluster
Summary
- ✅ Installed Docker, kubectl, and k3d
- ✅ Created a local Kubernetes cluster with k3d cluster create mycluster
- ✅ Explored the cluster using kubectl cluster-info, get nodes, get namespaces, get pods -A, and get services -A
- ✅ Learned how to stop and delete your cluster
🚀 Chapter 02 — Deployment: Running Your First Application
In this chapter, we'll deploy a real application to Kubernetes. You'll learn how to write YAML manifests, create a namespace, deploy an app with a Deployment, and verify it's running — all using kubectl.
Concepts: YAML & Infrastructure as Code
Infrastructure as Code (IaC) means defining your infrastructure in files that can be version-controlled, reviewed, and applied automatically. In Kubernetes, every resource — namespaces, deployments, services — is defined in YAML files. This is IaC in practice.
GitOps takes this further: your Git repository becomes the single source of truth. When you push a change to a YAML file, an automated system applies it to the cluster — giving you a full audit trail, easy rollbacks, and consistent deployments.
YAML Basics
YAML is the format Kubernetes uses for all its resource definitions. Key rules to remember:
- Use spaces, not tabs (YAML is whitespace-sensitive)
- Files use .yaml or .yml extension
- --- marks the beginning of a document
- # starts a comment
- Indentation creates hierarchy — more indented = nested inside the parent
- Lists use - as a bullet
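To make the indentation rule concrete, here is a small Python sketch (an illustration only, not part of the tutorial repository) that builds the same structure as a Namespace manifest from nested dictionaries and prints it as JSON. YAML is a superset of JSON, so the printed output is itself a manifest kubectl can read. The helper `nesting_depth` is a made-up name used only for this example.

```python
import json

# Each level of YAML indentation corresponds to one level of nesting here.
manifest = {
    "apiVersion": "v1",
    "kind": "Namespace",
    "metadata": {  # in YAML, "name" is indented under "metadata"
        "name": "development",
    },
}


def nesting_depth(value):
    """Return how many levels of mappings or lists are nested inside value."""
    if isinstance(value, dict):
        return 1 + max((nesting_depth(v) for v in value.values()), default=0)
    if isinstance(value, list):
        return 1 + max((nesting_depth(v) for v in value), default=0)
    return 0


print(json.dumps(manifest, indent=2))
print("depth:", nesting_depth(manifest))
```

The two levels reported by `nesting_depth` match the two indentation levels you would see in the YAML form: top-level keys, then the keys under `metadata`.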
Set Up a 4-Node Cluster
For this chapter, we'll create a cluster with 1 server (control plane) node and 3 agent (worker) nodes — 4 nodes total. This lets us see how Kubernetes schedules pods across multiple nodes.
k3d cluster create mycluster --agents 3
Create a Namespace
A Namespace is a virtual partition inside your cluster. We'll create a development namespace to isolate our workloads from the system components.
# namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: development

kubectl apply -f namespace.yaml
Deploy an Application
Now let's deploy an application. We'll use pod-info-app — a simple Node.js app that displays information about the pod it's running in (name, namespace, IP address).
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pod-info-deployment
  namespace: development
  labels:
    app: pod-info
spec:
  replicas: 3
  selector:
    matchLabels:
      app: pod-info
  template:
    metadata:
      labels:
        app: pod-info
    spec:
      containers:
        - name: pod-info-container
          image: kimschles/pod-info-app:latest
          ports:
            - containerPort: 3000
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
Inspect Deployments and Pods
# Apply the deployment
kubectl apply -f deployment.yaml
# Check the deployment
kubectl get deployments -n development
# Check the pods
kubectl get pods -n development
# Get more details about a pod
kubectl describe pod <pod-name> -n development
Self-Healing in Action
The Deployment keeps three replicas running. If a pod fails or is deleted, Kubernetes creates a replacement automatically. Try deleting a pod and watch Kubernetes recreate it:
# Delete a pod (replace with actual pod name)
kubectl delete pod pod-info-deployment-xxxxx -n development
# Watch Kubernetes recreate it immediately
kubectl get pods -n development --watch
Test with BusyBox
Deploy a BusyBox pod to test connectivity from inside the cluster:
# busybox.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: busybox
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: busybox
  template:
    metadata:
      labels:
        app: busybox
    spec:
      containers:
        - name: busybox
          image: busybox:latest
          command: ["sleep", "3600"]

# Apply the BusyBox deployment
kubectl apply -f busybox.yaml
# Get pod IPs
kubectl get pods -n development -o wide
# Exec into BusyBox
kubectl exec -it busybox-<id> -- /bin/sh
# Inside BusyBox: make an HTTP request to your app
wget <pod-ip>:3000
cat index.html
View Application Logs
# View logs from a pod
kubectl logs <pod-name> -n development
# Follow logs in real time
kubectl logs -f <pod-name> -n development
# Show logs from all pods in the deployment
kubectl logs -l app=pod-info -n development
Scale Up: Add Nodes and Increase Replicas
Add 4 more agent nodes to the running cluster and scale the deployment to 16 replicas:
# Add 4 more agent nodes
k3d node create mycluster-extra --cluster mycluster --role agent --replicas 4
# Verify all 8 nodes are ready
kubectl get nodes
# Update deployment.yaml: set replicas: 16
# Then apply the change
kubectl apply -f deployment.yaml
# Watch pods roll out across all nodes
kubectl get pods -n development -o wide
Summary
- ✅ Learned YAML syntax and why it's used for IaC and GitOps
- ✅ Created a 4-node k3d cluster (1 server + 3 agents)
- ✅ Created a development namespace to isolate workloads
- ✅ Wrote and applied a Deployment YAML to run 3 replicas of an app
- ✅ Witnessed Kubernetes self-healing by deleting a pod and watching it recover
- ✅ Used BusyBox to test your app from inside the cluster
- ✅ Viewed application logs with kubectl logs
- ✅ Scaled the cluster from 4 to 8 nodes and the deployment from 3 to 16 replicas
🌐 Chapter 03 — Services, Resource Limits, and Kubernetes Architecture
In this chapter, we'll expose our application outside the cluster using a Service and an Ingress, add resource limits to our pods, and then step back to understand how Kubernetes works under the hood: the control plane, the worker nodes, and how they coordinate.
Set Up the Cluster
Create a k3d cluster with a port mapping so that traffic from your computer reaches the cluster's built-in load balancer:
k3d cluster create mycluster --agents 2 -p "8080:80@loadbalancer"
This maps port 8080 on your computer to port 80 on the k3d load balancer container. k3d comes with Traefik pre-installed as an ingress controller; it listens on port 80 inside the cluster and routes incoming HTTP requests to the right service.
Expose Your App with a Service and Ingress
Right now, our pods are running but they're only reachable from inside the cluster. To access the app from your browser, we need two things:
- A Service — gives the pods a stable internal IP address inside the cluster
- An Ingress — routes HTTP traffic from outside the cluster to the Service
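As a rough mental model (not how kube-proxy is actually implemented), a Service can be pictured as a label selector plus a stable name: traffic sent to the Service is forwarded to whichever pods currently match the selector. In the Python sketch below, the pod names and IP addresses are invented for illustration, and round-robin is a simple stand-in for the real load-balancing behavior. The `app: pod-info` label is the one set by the Deployment from Chapter 02.

```python
from itertools import cycle

# Hypothetical pods; a pod's IP changes whenever the pod is recreated.
pods = [
    {"name": "pod-info-a", "ip": "10.42.0.5", "labels": {"app": "pod-info"}},
    {"name": "pod-info-b", "ip": "10.42.1.7", "labels": {"app": "pod-info"}},
    {"name": "busybox-x", "ip": "10.42.2.9", "labels": {"app": "busybox"}},
]


def select_endpoints(selector, pods):
    """Return the IPs of pods whose labels contain every key/value in selector."""
    return [
        p["ip"]
        for p in pods
        if all(p["labels"].get(k) == v for k, v in selector.items())
    ]


endpoints = select_endpoints({"app": "pod-info"}, pods)
backend = cycle(endpoints)  # round-robin over the matching pods

first_three = [next(backend) for _ in range(3)]
print(endpoints)
print(first_three)
```

Clients only ever address the Service, so pods can come and go without callers needing to track individual pod IPs.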
Service Types
ClusterIP (default)
Only reachable inside the cluster. Used for internal service-to-service communication. The Ingress controller will route external traffic to it.
NodePort
Exposes the service on each node's IP at a static port. Accessible from outside the cluster using NodeIP:NodePort. Useful for development and testing.
LoadBalancer
Provisions an external load balancer (in cloud environments). k3d simulates this locally. The standard way to expose services in production.
The Service YAML
# service.yaml
apiVersion: v1
kind: Service
metadata:
  name: pod-info-service
  namespace: development
spec:
  type: ClusterIP
  selector:
    app: pod-info
  ports:
    - port: 80
      targetPort: 3000
The Ingress YAML
# ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: pod-info-ingress
  namespace: development
  annotations:
    ingress.kubernetes.io/ssl-redirect: "false"
spec:
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: pod-info-service
                port:
                  number: 80

kubectl apply -f service.yaml
kubectl apply -f ingress.yaml
# Open in your browser (macOS; on Linux, use xdg-open or visit the URL directly)
open http://localhost:8080
Add Resource Requests and Limits
Resource requests and limits tell Kubernetes how much CPU and memory each container needs and is allowed to use. This enables the scheduler to make intelligent placement decisions.
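Kubernetes writes these amounts in its own quantity notation: CPU in millicores (250m is a quarter of one core) and memory with binary suffixes (64Mi is 64 × 1024² bytes). The Python sketch below converts a few such strings to plain numbers so the units are easier to compare. It is a simplified illustration: the function names are made up, and only the `m`, `Ki`, `Mi`, and `Gi` suffixes are handled, while Kubernetes accepts more.

```python
def cpu_to_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as "250m" or "0.5" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(round(float(quantity) * 1000))


# Binary suffixes only; an assumption made to keep the sketch short.
_BINARY_SUFFIXES = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}


def memory_to_bytes(quantity: str) -> int:
    """Convert a memory quantity such as "64Mi" to bytes."""
    for suffix, factor in _BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # a bare number is already in bytes


print(cpu_to_millicores("250m"), cpu_to_millicores("0.5"))
print(memory_to_bytes("64Mi"), memory_to_bytes("128Mi"))
```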
# Add to the container spec in deployment.yaml
resources:
  requests:
    memory: "64Mi"
    cpu: "250m"
  limits:
    memory: "128Mi"
    cpu: "500m"
After you re-apply deployment.yaml, Kubernetes performs a rolling update, replacing pods gradually so the app stays available throughout the update.
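To see why the app stays available, it helps to work out how many pods may exist at each moment of a rollout. A Deployment's RollingUpdate strategy is governed by maxUnavailable and maxSurge, which both default to 25%. Percentages are converted to pod counts by rounding down for maxUnavailable and up for maxSurge. The Python sketch below does that arithmetic; `rollout_bounds` is a made-up helper name, not a Kubernetes API.

```python
def resolve(value, replicas, round_up):
    """Turn an int or a percentage string such as "25%" into a pod count."""
    if isinstance(value, str) and value.endswith("%"):
        scaled = replicas * int(value[:-1])
        # Integer arithmetic avoids floating-point surprises.
        return -(-scaled // 100) if round_up else scaled // 100
    return int(value)


def rollout_bounds(replicas, max_unavailable="25%", max_surge="25%"):
    """Return (minimum available pods, maximum total pods) during a rollout."""
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    surge = resolve(max_surge, replicas, round_up=True)
    return replicas - unavailable, replicas + surge


print(rollout_bounds(3))        # defaults with 3 replicas
print(rollout_bounds(3, 1, 1))  # explicit counts of 1 and 1
```

With 3 replicas and the defaults, the result is (3, 4): Kubernetes may add one extra pod but never drops below three available, so pods are replaced one at a time.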
Kubernetes Architecture
A Kubernetes cluster has two main parts: the control plane and the worker nodes.
The Control Plane
Kube API Server
The front door to Kubernetes. Every interaction with the cluster goes through it — kubectl, CI/CD pipelines, and other Kubernetes components all talk to the API server.
etcd
A distributed key-value store that holds the entire state of the cluster. If etcd is lost, the cluster loses its memory — this is why etcd backups are critical in production.
Kube Scheduler
Watches for newly created pods that haven't been assigned to a node yet, and decides which node they should run on based on available resources and constraints.
Controller Manager
Runs control loops that continuously watch the cluster state and take action to reconcile it with the desired state. Powers Kubernetes' self-healing behavior.
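The control-loop idea can be sketched in a few lines of Python. This is a toy model under stated assumptions: the real controllers watch the API server rather than receiving a list, and the pod names here are generated by a hypothetical counter. What it does show is the core pattern of comparing desired state with observed state and correcting the difference.

```python
import itertools

_ids = itertools.count(1)  # hypothetical source of unique pod names


def reconcile(desired_replicas, running_pods):
    """One pass of a control loop: return the pod list after corrections."""
    pods = list(running_pods)
    while len(pods) < desired_replicas:  # too few: create replacements
        pods.append(f"pod-{next(_ids)}")
    while len(pods) > desired_replicas:  # too many: remove the extras
        pods.pop()
    return pods


pods = reconcile(3, [])    # initial rollout creates pod-1, pod-2, pod-3
pods.remove(pods[0])       # simulate deleting one pod by hand
pods = reconcile(3, pods)  # the next pass notices the gap and fills it
print(pods)
```

The replacement gets a new name rather than reusing the old one, which mirrors what you see when a Deployment recreates a deleted pod.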
Worker Nodes
kubelet
An agent that runs on every worker node. Watches the API server for pods assigned to its node, starts and stops containers, and reports node health back to the control plane.
Container Runtime
When the kubelet needs to start a container, it uses the Container Runtime Interface (CRI) to talk to the container runtime (containerd). The runtime pulls the image and starts the container.
kube-proxy
Runs on every node and maintains network rules that allow pods and services to communicate. Handles load balancing across pod replicas at the network level.
How They Work Together
Here's the sequence of events when you run kubectl apply -f deployment.yaml:
1. You run kubectl apply → the API Server validates the manifest and stores the desired state in etcd
2. The Controller Manager detects the new Deployment → creates a ReplicaSet, which in turn creates 3 Pending pods (recorded in etcd through the API Server)
3. The Scheduler assigns each pod to a node based on available resources
4. The kubelet on each assigned node detects the pod assignment
5. The kubelet tells the container runtime to pull the image and start the container
6. The pod is Running ✅ and the kubelet reports status back to the API server
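The scheduling step in this sequence can be illustrated with a toy Python model. The real scheduler filters and scores nodes with many more criteria. This sketch assumes only one resource (CPU in millicores) and a single rule, "pick the fitting node with the most free CPU". The node names and capacities are invented, and the 250m request matches the CPU request used earlier in this chapter.

```python
def schedule(pods, nodes):
    """Assign each pending pod to the fitting node with the most free CPU.

    pods:  list of (pod_name, cpu_request_millicores)
    nodes: dict of node_name -> free CPU in millicores (updated as pods land)
    Returns a dict pod_name -> node_name, or None if the pod stays Pending.
    """
    placement = {}
    for name, request in pods:
        # Filter: keep only nodes with enough free CPU for this pod.
        candidates = [n for n, free in nodes.items() if free >= request]
        if not candidates:
            placement[name] = None  # nothing fits, so the pod stays Pending
            continue
        # Score: prefer the node with the most free CPU.
        best = max(candidates, key=lambda n: nodes[n])
        nodes[best] -= request
        placement[name] = best
    return placement


nodes = {"agent-0": 600, "agent-1": 400}
pending = [("pod-a", 250), ("pod-b", 250), ("pod-c", 250), ("pod-d", 250)]
placement = schedule(pending, nodes)
print(placement)
```

The fourth pod has nowhere to go and stays Pending, which is the situation you would see in kubectl get pods when requests exceed the cluster's free capacity.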
Summary
- ✅ Created a k3d cluster with a port mapping (-p "8080:80@loadbalancer") to expose services locally
- ✅ Created a ClusterIP Service to give the pods a stable internal endpoint
- ✅ Created an Ingress to route external traffic through Traefik to the Service
- ✅ Opened the app at http://localhost:8080 and saw the load balancer distribute requests across all 3 pods
- ✅ Added CPU and memory resource requests and limits to the deployment
- ✅ Learned how the Kubernetes control plane and worker nodes work together
⛵ Chapter 04 — Helm: The Kubernetes Package Manager
In this chapter, we'll use Helm — the package manager for Kubernetes — to deploy a real application to our k3d cluster. Instead of writing and managing individual YAML files for every resource, Helm lets you install a complete application with a single command and customize it with a simple values file.
We'll deploy podinfo — a small Go web application purpose-built for demonstrating Kubernetes features.
Install Helm
Helm is a separate CLI tool you install alongside kubectl.
macOS (Homebrew)
Linux
Verify
Set Up the Cluster
k3d cluster create mycluster --agents 1 -p "8080:80@loadbalancer"
Add the podinfo Helm Repository
A Helm repository is a collection of charts hosted at a URL — similar to how apt or brew have package repositories.
# Add the podinfo chart repository
helm repo add podinfo https://stefanprodan.github.io/podinfo
# Update your local cache
helm repo update
# List your configured repositories
helm repo list
Explore the Chart
Before installing anything, inspect the chart to understand what it will create and what you can configure:
# View chart metadata
helm show chart podinfo/podinfo
# View the default values (all configurable options)
helm show values podinfo/podinfo
# Search for charts on Artifact Hub
helm search hub podinfo
# Search within your added repositories
helm search repo podinfo
Install podinfo with a Values File
Rather than passing every option on the command line, create a values.yaml file to override the defaults:
# values.yaml
replicaCount: 2
ui:
  color: "#4CAF50"
  message: "Hello from Helm + k3d!"
ingress:
  enabled: true
  className: ""
  annotations:
    ingress.kubernetes.io/ssl-redirect: "false"
  hosts:
    - host: ""
      paths:
        - path: /
          pathType: Prefix
resources:
  requests:
    cpu: 10m
    memory: 32Mi
  limits:
    memory: 64Mi

# Create the namespace
kubectl create namespace podinfo
# Install the chart
helm install podinfo podinfo/podinfo --namespace podinfo -f values.yaml
💡 Release vs. Chart
A chart is the package (like a recipe). A release is a specific installation of that chart in your cluster (like a meal you cooked from the recipe). You can install the same chart multiple times with different release names and different values — for example, a podinfo-staging release and a podinfo-production release.
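One way to picture "same chart, different values" is as a merge: each release starts from the chart's defaults and layers its own values file on top. The Python sketch below approximates that behavior under simplifying assumptions: nested maps are merged key by key, and anything else (scalars, lists) is replaced outright. The default values shown are placeholders for illustration, not the real podinfo defaults, and `merge_values` is a made-up name rather than a Helm function.

```python
import copy


def merge_values(defaults, overrides):
    """Return defaults with overrides applied: maps merge, other values are replaced."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Placeholder chart defaults (hypothetical values).
chart_defaults = {"replicaCount": 1, "ui": {"color": "#34577c", "message": ""}}

staging = merge_values(chart_defaults, {"replicaCount": 2})
production = merge_values(
    chart_defaults, {"replicaCount": 5, "ui": {"color": "#4CAF50"}}
)
print(staging)
print(production)
```

Both results come from the same defaults, yet each release ends up with its own configuration, and the chart's defaults remain unchanged.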
Verify the Deployment
# Check the pods
kubectl get pods -n podinfo
# Check the Helm release
helm list -n podinfo
# Open in your browser (macOS; on Linux, use xdg-open or visit the URL directly)
open http://localhost:8080
Upgrade a Release
Update the values file (e.g., change replicaCount: 3) and upgrade the release:
helm upgrade podinfo podinfo/podinfo --namespace podinfo -f values.yaml
# Check the release history
helm history podinfo -n podinfo
Roll Back a Release
If something goes wrong, roll back to a previous revision:
# Roll back to revision 1
helm rollback podinfo 1 -n podinfo
# Verify the rollback
helm history podinfo -n podinfo
Uninstall a Release
helm uninstall podinfo -n podinfo
# Delete the cluster when done
k3d cluster delete mycluster
Summary
- ✅ Installed Helm and added the podinfo chart repository
- ✅ Explored the chart's default values with helm show values
- ✅ Created a values.yaml file to customize the deployment
- ✅ Installed podinfo with helm install and verified it in the browser
- ✅ Upgraded the release with helm upgrade and checked the history
- ✅ Rolled back to a previous revision with helm rollback
- ✅ Uninstalled the release with helm uninstall
🔬 Chapter 05 — Advanced Topics
In this chapter, we'll explore several advanced Kubernetes topics by running real commands against a k3d cluster. We'll cover different ways to manage pods, persistent storage, security hardening, logging and monitoring with Prometheus and Grafana, and automatic scaling with the Horizontal Pod Autoscaler.
Set Up the Cluster
k3d cluster create mycluster --agents 2 -p "8080:80@loadbalancer"
kubectl create namespace advanced
Ways to Manage Pods
Kubernetes provides several different controllers for managing pods. Each is designed for a specific use case.
Deployment: Rolling Updates
Deployments support rolling updates — Kubernetes replaces pods one at a time with zero downtime. Configure the strategy in your deployment YAML:
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1  # At most 1 pod can be unavailable during an update
      maxSurge: 1        # At most 1 extra pod can be created during an update

# Watch the rolling update in real time
kubectl get pods -n advanced --watch
# Check rollout history
kubectl rollout history deployment/pod-info-deployment -n advanced
# Roll back to the previous version
kubectl rollout undo deployment/pod-info-deployment -n advanced
DaemonSet: One Pod Per Node
A DaemonSet ensures that one copy of a pod runs on every eligible node (nodes can be excluded with taints or node selectors). Useful for log collectors, monitoring agents, and other node-level services.
# daemonset.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-collector
  namespace: advanced
spec:
  selector:
    matchLabels:
      app: log-collector
  template:
    metadata:
      labels:
        app: log-collector
    spec:
      containers:
        - name: log-collector
          image: busybox:latest
          command: ["sh", "-c", "while true; do echo 'Collecting logs from node'; sleep 30; done"]
          volumeMounts:
            - name: varlog
              mountPath: /var/log
              readOnly: true
          resources:
            requests:
              memory: "16Mi"
              cpu: "10m"
            limits:
              memory: "32Mi"
              cpu: "50m"
      volumes:
        - name: varlog
          hostPath:
            path: /var/log

kubectl apply -f daemonset.yaml
# Verify one pod per node
kubectl get pods -n advanced -l app=log-collector -o wide
Job: Run to Completion
A Job creates pods that run until a task completes successfully. Use parallelism and completions for parallel processing.
# job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-calculator
  namespace: advanced
spec:
  template:
    spec:
      containers:
        - name: pi
          image: perl:slim
          command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
          resources:
            requests:
              memory: "64Mi"
              cpu: "100m"
            limits:
              memory: "128Mi"
              cpu: "250m"
      restartPolicy: Never
Persistent Volumes
Pods are ephemeral — when they restart, their data is lost. A PersistentVolumeClaim (PVC) requests durable storage that survives pod restarts. We'll deploy PostgreSQL with a PVC to demonstrate stateful workloads.
# postgres-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pvc
  namespace: advanced
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
Kubernetes Security
Security Context in Action
A Security Context defines privilege and access control settings for Pods and containers. Compare an insecure deployment (running as root) with a hardened one:
# Hardened security context
securityContext:
  runAsNonRoot: true
  runAsUser: 1000
  readOnlyRootFilesystem: true
  allowPrivilegeEscalation: false
  capabilities:
    drop:
      - ALL
Scanning with Snyk
Snyk scans your YAML files for security misconfigurations before they reach production:
# Install Snyk CLI
npm install -g snyk
# Authenticate
snyk auth
# Scan your YAML files
snyk iac test deployment.yaml
Logging and Monitoring with Prometheus and Grafana
Install the kube-prometheus-stack Helm chart to get Prometheus (metrics collection) and Grafana (visualization) running in your cluster in minutes:
# Add the prometheus-community Helm repository
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
# Create the monitoring namespace
kubectl create namespace monitoring
# Install the kube-prometheus-stack
helm install prometheus prometheus-community/kube-prometheus-stack --namespace monitoring --set grafana.adminPassword=admin
# Port-forward Grafana to your browser
kubectl port-forward svc/prometheus-grafana 3000:80 -n monitoring
Open http://localhost:3000 in your browser (username: admin, password: admin). Pre-built dashboards give you instant visibility into cluster health, node resource usage, and application metrics.
Horizontal Pod Autoscaler (HPA)
The HPA automatically scales the number of Pod replicas based on observed CPU utilization or custom metrics. Set a target CPU percentage and min/max replica counts — Kubernetes handles the rest.
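The core of the HPA's decision is a ratio: desired replicas = ceil(current replicas × current metric / target metric), clamped to the configured minimum and maximum. CPU utilization is measured relative to each pod's CPU request, so the target pods need a request set. The Python sketch below applies that formula and deliberately leaves out the tolerance band and stabilization windows that the real controller also uses. The function name and sample numbers are illustrative.

```python
import math


def desired_replicas(current_replicas, current_cpu_percent, target_cpu_percent,
                     min_replicas, max_replicas):
    """Apply the basic HPA ratio, then clamp to the allowed range."""
    raw = math.ceil(current_replicas * current_cpu_percent / target_cpu_percent)
    return max(min_replicas, min(max_replicas, raw))


# Target of 30% CPU with 2 to 8 replicas, as in the command in this section.
print(desired_replicas(2, 90, 30, min_replicas=2, max_replicas=8))   # under load
print(desired_replicas(2, 300, 30, min_replicas=2, max_replicas=8))  # capped at max
print(desired_replicas(6, 5, 30, min_replicas=2, max_replicas=8))    # idle again
```

At 90% average utilization with 2 replicas, the ratio asks for 6 pods. Very high load is capped at the maximum of 8, and once load drops the count falls back to the minimum of 2.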
# Create an HPA targeting 30% CPU utilization
kubectl autoscale deployment pod-info-deployment --namespace advanced --cpu-percent=30 --min=2 --max=8
# Watch the HPA
kubectl get hpa -n advanced --watch
# Generate load to trigger scaling
kubectl run load-generator --image=busybox -n advanced -- /bin/sh -c "while true; do wget -q -O- http://pod-info-service; done"
# Stop the load generator
kubectl delete pod load-generator -n advanced
💡 HPA + Cluster Autoscaler
The HPA scales pods. The Cluster Autoscaler scales nodes. Together they give you fully elastic infrastructure.
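A back-of-the-envelope Python calculation shows how the two fit together: as the HPA raises the replica count, the pods' combined CPU requests eventually exceed what the existing nodes can hold, and that is when more nodes are needed. The sketch considers CPU requests only, and the node size is a made-up number. The real Cluster Autoscaler simulates scheduling rather than using this arithmetic.

```python
import math


def nodes_needed(replicas, pod_cpu_m, node_allocatable_cpu_m):
    """Estimate node count from CPU requests alone (millicores)."""
    pods_per_node = node_allocatable_cpu_m // pod_cpu_m
    if pods_per_node == 0:
        raise ValueError("a single pod does not fit on one node")
    return math.ceil(replicas / pods_per_node)


# Hypothetical nodes with 1000m allocatable CPU; pods request 250m each.
print(nodes_needed(2, 250, 1000))  # HPA minimum of 2 replicas
print(nodes_needed(8, 250, 1000))  # HPA maximum of 8 replicas
```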
Summary
- ✅ Deployed a Deployment and triggered a rolling update, watching pods replaced one at a time with zero downtime
- ✅ Created a DaemonSet and verified one pod ran on every node
- ✅ Ran a Job to completion and used a parallel Job to process multiple tasks simultaneously
- ✅ Created a PersistentVolumeClaim, deployed PostgreSQL with it, and verified data survived a pod restart
- ✅ Compared an insecure deployment (running as root) with a hardened one using a Security Context
- ✅ Scanned YAML files with Snyk and fixed all reported issues
- ✅ Installed Prometheus + Grafana with Helm and explored pre-built Kubernetes dashboards
- ✅ Created an HPA, generated load, and watched it automatically scale pods up and back down
☁️ Chapter 06 — Kubernetes in Production: Local, On-Prem, and the Cloud
Throughout this tutorial, you've been running Kubernetes locally using k3d — a lightweight tool that runs a full Kubernetes cluster inside Docker containers on your computer. That's great for learning, but production Kubernetes looks very different.
In this chapter, we'll compare three ways to run Kubernetes in production, then deep-dive into the three major cloud providers' managed Kubernetes offerings: Amazon EKS, Google GKE, and Microsoft AKS.
What You've Been Running: k3d
✅ What k3d gives you
- A full Kubernetes API — everything transfers to production
- Fast cluster creation — a new cluster in seconds
- Multi-node simulation with Docker containers
- Zero cost — runs entirely on your computer
❌ What k3d doesn't give you
- Real hardware isolation
- High availability
- Production-grade networking
- Persistent enterprise storage
- Real cloud load balancers
- Scalability beyond your laptop
On-Premises Kubernetes
On-premises Kubernetes means running Kubernetes on hardware that you own and operate — either in your own data center or in a co-location facility.
Popular on-prem distributions:
When you run Kubernetes on-prem, you own the entire stack: hardware procurement, OS patching, Kubernetes upgrades, etcd backups, high availability, networking (CNI plugin), storage (CSI driver), load balancing, certificate management, and security.
On-prem makes sense for: data sovereignty/compliance, existing hardware investment, air-gapped environments, and predictable high-volume workloads.
Managed Kubernetes on the Cloud
A managed Kubernetes service means the cloud provider handles the control plane for you. You focus on deploying your applications; they handle the infrastructure.
Key advantages over on-prem:
Side-by-Side Comparison
| | k3d (this tutorial) | On-Premises | Managed Cloud |
|---|---|---|---|
| Purpose | Local dev & learning | Full control production | Managed production |
| Control plane | Docker on your computer | You install & manage | Cloud provider manages |
| Worker nodes | Docker containers | Physical servers / VMs | Cloud VMs |
| High availability | No | You configure | Built-in (provider SLA) |
| K8s upgrades | Recreate cluster | Manual, complex | One-click or automated |
| Cost | Free | Hardware + ops team | Pay-per-use |
| Ops burden | None | Very high | Low to medium |
Cloud Provider Comparison: EKS vs. GKE vs. AKS
All three major cloud providers offer managed Kubernetes. They all run standard Kubernetes under the hood — your YAML manifests work on all three. The differences are in the surrounding ecosystem, pricing, and operational experience.
Amazon EKS (Elastic Kubernetes Service)
AWS's managed Kubernetes service and one of the most widely adopted managed Kubernetes offerings. Deep integration with IAM, the AWS Load Balancer Controller (formerly the ALB Ingress Controller), EBS/EFS storage, and CloudWatch. Fargate support lets you run Pods without managing EC2 nodes.
Control plane cost: $0.10/hour per cluster (~$73/month)
Networking: AWS VPC CNI — pods get real VPC IP addresses
IAM integration: IRSA (IAM Roles for Service Accounts)
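The monthly figure follows from the hourly rate. The short Python sketch below reproduces it, assuming an average month of 730 hours (8,760 hours per year divided by 12). The function name is made up for illustration, and worker-node costs are not included.

```python
HOURS_PER_MONTH = 730  # assumption: 8,760 hours per year / 12 months


def monthly_control_plane_cost(hourly_rate_usd, clusters=1):
    """Return the control plane fee per month in USD, rounded to cents."""
    return round(hourly_rate_usd * HOURS_PER_MONTH * clusters, 2)


print(monthly_control_plane_cost(0.10))               # one cluster
print(monthly_control_plane_cost(0.10, clusters=10))  # ten small clusters
```

The per-cluster fee is small on its own but scales linearly with the number of clusters, which matters if you run many small ones.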
Google GKE (Google Kubernetes Engine)
Google invented Kubernetes, and GKE shows it. Autopilot mode fully manages the node infrastructure for you. Best-in-class auto-upgrade, auto-repair, and Workload Identity for secure GCP service access. Often considered the most “Kubernetes-native” experience.
Control plane cost: $0.10/hour per cluster; the GKE free tier provides a monthly credit that covers one zonal or Autopilot cluster
Networking: VPC-native (alias IP) — pods get real VPC IPs
IAM integration: Workload Identity
Microsoft AKS (Azure Kubernetes Service)
Microsoft's managed Kubernetes offering. Strong integration with Azure Active Directory (now Microsoft Entra ID), Azure Container Registry, and Azure Monitor. Virtual Nodes (backed by Azure Container Instances) allow burst scaling without pre-provisioned VMs. The natural choice for Microsoft-centric organizations.
Control plane cost: Free
Networking: Azure CNI — pods get real VNet IPs
IAM integration: Azure AD Workload Identity
Which Should You Choose?
| Scenario | Recommendation |
|---|---|
| Already on AWS (RDS, S3, Lambda, etc.) | EKS — native integration with your existing AWS services |
| Best Kubernetes experience / greenfield | GKE — most mature, fastest upgrades, Autopilot for hands-off ops |
| Microsoft / Azure ecosystem (.NET, Azure AD, Windows) | AKS — free control plane, Windows node support, AD integration |
| Serverless / no node management | GKE Autopilot or EKS Fargate |
| Multi-cloud / cloud-agnostic | Any — Kubernetes abstracts the cloud; use the same YAML everywhere |
| Cost-sensitive, many small clusters | AKS (control plane is free on the Free tier); GKE's free-tier credit covers only one zonal or Autopilot cluster per billing account |
Key Concepts for Production Kubernetes
RBAC (Role-Based Access Control)
Controls who can do what in your cluster. In production, you never give everyone cluster-admin access. Define fine-grained roles and bind them to users or service accounts.
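As a minimal sketch of a fine-grained role, the manifest below grants read-only access to pods in a single namespace and binds it to one user. The namespace `staging`, the role name, and the user `jane@example.com` are illustrative assumptions.

```yaml
# Role: read-only access to pods and their logs, scoped to the "staging" namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader            # hypothetical name
  namespace: staging          # hypothetical namespace
rules:
  - apiGroups: [""]           # "" is the core API group, where pods live
    resources: ["pods", "pods/log"]
    verbs: ["get", "list", "watch"]
---
# RoleBinding: grants the Role above to a single user in the same namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: staging
subjects:
  - kind: User
    name: jane@example.com    # placeholder user, as reported by your authentication provider
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```

You can check the result with `kubectl auth can-i list pods --namespace staging --as jane@example.com`, which should answer `yes`, while the same query with `delete` should answer `no`.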
Network Policies
Kubernetes firewall rules for pods. By default, all pods can communicate with each other. Network Policies let you restrict this — e.g., only allow the 'api' pods to receive traffic from 'frontend' pods.
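The 'api' and 'frontend' example above could look roughly like this. The labels `app: api` and `app: frontend` and port 8080 are assumptions about how the workloads are labeled and exposed.

```yaml
# Only pods labeled app=frontend may connect to pods labeled app=api, and only on TCP 8080.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-allow-frontend    # hypothetical name
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: api                # the pods this policy protects
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend   # the only permitted source pods (same namespace)
      ports:
        - protocol: TCP
          port: 8080          # assumed container port of the api pods
```

Once a pod is selected by any Ingress policy, all inbound traffic not explicitly allowed is dropped. The policy only takes effect if the cluster's CNI plugin enforces Network Policies.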
Pod Disruption Budgets (PDB)
Limits how many pods of an application (selected by label, for example the pods of a Deployment) can be unavailable at the same time during voluntary disruptions such as node upgrades and cluster maintenance. Without a PDB, draining nodes during an upgrade could evict all of your pods at once.
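A minimal PDB sketch, assuming the application's pods carry the label `app: api` and run with at least three replicas:

```yaml
# Keep at least 2 api pods running during voluntary disruptions such as node drains.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-pdb               # hypothetical name
spec:
  minAvailable: 2             # alternatively, set maxUnavailable instead
  selector:
    matchLabels:
      app: api                # assumed pod label
```

With three replicas, a node drain can evict only one of these pods at a time and waits until a replacement is ready before evicting the next. A PDB does not protect against involuntary disruptions such as a node crash.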
Resource Quotas
Limits the total resources a namespace can consume. Prevents one team's workload from starving another: set limits on total CPU, memory, and pod count per namespace.
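As an illustration, the quota below caps CPU, memory, and pod count for one namespace. The namespace `team-a` and all numbers are arbitrary example values.

```yaml
# Caps the combined resource requests, limits, and pod count of everything in "team-a".
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota          # hypothetical name
  namespace: team-a           # hypothetical namespace
spec:
  hard:
    requests.cpu: "4"         # sum of CPU requests across all pods
    requests.memory: 8Gi      # sum of memory requests
    limits.cpu: "8"           # sum of CPU limits
    limits.memory: 16Gi       # sum of memory limits
    pods: "20"                # maximum number of pods
```

Once a quota covers CPU or memory, new pods in that namespace must declare the corresponding requests and limits or they are rejected, unless a LimitRange supplies defaults.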
Cluster Autoscaler
Automatically adjusts the number of worker nodes based on pending pods and node utilization. Scale up when pods can't be scheduled; scale down when nodes are underutilized.
Summary
Every concept in this tutorial — Pods, Deployments, Services, Ingress, ConfigMaps, Secrets, PVCs, DaemonSets, Jobs, HPA, Security Contexts, Helm — works identically on EKS, GKE, AKS, and any on-prem cluster. The Kubernetes API is the same everywhere.
The cloud-specific differences are in the infrastructure layer beneath Kubernetes: how load balancers are provisioned, how storage is allocated, and how IAM is integrated. Once you understand those integrations for your chosen provider, you're ready to run production workloads.
📚 Core Concepts Reference
New to containers or Docker? This reference guide explains the foundational concepts that Kubernetes is built on — the “what” and “why” behind containers, Docker, and the key Kubernetes objects. Refer back to it whenever you encounter an unfamiliar term in the tutorial.
Containers & Docker
What is a Container?
A lightweight, standalone, executable package that includes everything needed to run a piece of software: code, runtime, system libraries, and configuration. Containers solve the 'it works on my machine' problem by bundling the application with its entire environment.
What is Docker?
The most popular platform for building, running, and sharing containers. Consists of Docker Engine (the runtime), Docker CLI (the docker command), Docker Hub (public registry), and Docker Desktop (GUI for macOS/Windows).
Containers vs. Virtual Machines
Containers share the host OS kernel, so they carry much less overhead than a full VM. They typically start in seconds or less (a VM can take minutes), use less memory, and are portable across machines that have a container runtime and a compatible OS and CPU architecture.
Container Image
A read-only template used to create containers. Built from a Dockerfile using docker build. Stored in a container registry (Docker Hub, Amazon ECR, Google Artifact Registry, Azure ACR). Immutable: you replace it with a new version rather than modifying it.
Kubernetes Building Blocks
Pod
The smallest deployable unit in Kubernetes. Wraps one or more containers that share the same network namespace and storage. Pods are ephemeral — they can be created and destroyed at any time.
Node
A machine (physical or virtual) that runs your workloads. Each node runs a kubelet (agent), a container runtime (typically containerd), and kube-proxy. A cluster has one or more nodes.
Cluster
A set of nodes managed by Kubernetes. Has a control plane (API server, etcd, scheduler, controller manager) and worker nodes where your pods run.
Namespace
A virtual partition inside your cluster. Isolates resources between teams or environments. Default namespaces: default, kube-system, kube-public, kube-node-lease.
Deployment
Manages a set of identical pods. Ensures the desired number of replicas are always running. Handles rolling updates and rollbacks. The most common way to run stateless applications.
Service
A stable network endpoint that exposes a set of pods. Since pods are ephemeral (their IPs change), a Service provides a consistent way to reach them. Types: ClusterIP, NodePort, LoadBalancer, ExternalName.
ConfigMap & Secret
ConfigMaps store non-sensitive configuration data (env vars, config files). Secrets store sensitive data (passwords, API keys). Secret values are base64-encoded, which is an encoding and not encryption; encryption at rest is optional and must be enabled on the cluster.
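To make the base64 point concrete, here is a minimal Secret sketch. The name `db-credentials`, the key `password`, and the value are placeholders.

```yaml
# A Secret holding one key. Values under "data" must be base64-encoded.
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials        # hypothetical name
type: Opaque
data:
  password: czNjcjN0          # base64 of the placeholder value "s3cr3t"
```

Anyone who can read the Secret can decode the value, so access to Secrets should be restricted with RBAC. If you prefer to write plain text in the manifest, the `stringData` field accepts unencoded values and Kubernetes encodes them on write.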
Ingress
A set of routing rules that tells the ingress controller (e.g., Traefik, nginx) how to route incoming HTTP/HTTPS requests to services. Like a smart router sitting in front of your services.
Volume & PersistentVolume
A Volume is a directory accessible to containers in a pod. A PersistentVolume (PV) is a piece of storage provisioned by an admin or created dynamically through a StorageClass. A PersistentVolumeClaim (PVC) is a request for storage by a user.
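A short sketch of how these pieces connect: a PVC requests storage, and a pod mounts the claim as a volume. All names, the 1Gi size, and the busybox image tag are illustrative, and the claim relies on the cluster having a default StorageClass.

```yaml
# The claim: ask for 1Gi of storage that one node at a time can mount read-write.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc              # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
# The consumer: a pod that mounts the claim at /data and writes a file to it.
apiVersion: v1
kind: Pod
metadata:
  name: pvc-demo              # hypothetical name
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "echo hello > /data/hello.txt && sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: data-pvc   # must match the PVC name above
```

Deleting and recreating the pod leaves `/data/hello.txt` in place, because the data lives in the PersistentVolume bound to the claim and not in the pod.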
DaemonSet
Ensures that exactly one copy of a pod runs on every node. Used for log collectors (Fluent Bit), monitoring agents (Prometheus node exporter), and other node-level services.
Job & CronJob
A Job creates pods that run until a task completes successfully. A CronJob runs a Job on a schedule (like cron). Use for batch processing, database migrations, and periodic tasks.
StatefulSet
Like a Deployment, but for stateful workloads (databases, message queues). Provides stable network identities, ordered deployment/scaling, and persistent storage per pod.
HorizontalPodAutoscaler
Automatically scales the number of pod replicas based on observed CPU utilization or custom metrics. Set a target CPU percentage and min/max replica counts — Kubernetes handles the rest.
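A minimal sketch of such a target, assuming a Deployment named `api` already exists. The 70% target and the replica bounds are example values.

```yaml
# Scale the "api" Deployment between 2 and 10 replicas, aiming for 70% average CPU utilization.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa               # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api                 # assumed Deployment name
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # percent of each pod's CPU request
```

Utilization is measured relative to the containers' CPU requests, so the target pods must declare `resources.requests.cpu`, and the cluster needs a metrics source such as metrics-server.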
Security Context
Defines privilege and access control settings for Pods and containers. Key settings: runAsNonRoot, runAsUser, readOnlyRootFilesystem, allowPrivilegeEscalation, capabilities.drop.
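The key settings above can be combined as in the following sketch. The pod name, the UID 10001, and the busybox image tag are arbitrary choices for illustration.

```yaml
# A pod that runs as a non-root user with a read-only root filesystem and no Linux capabilities.
apiVersion: v1
kind: Pod
metadata:
  name: hardened-demo         # hypothetical name
spec:
  securityContext:            # pod-level settings apply to every container
    runAsNonRoot: true
    runAsUser: 10001          # arbitrary non-root UID
  containers:
    - name: app
      image: busybox:1.36
      command: ["sleep", "3600"]
      securityContext:        # container-level settings
        readOnlyRootFilesystem: true
        allowPrivilegeEscalation: false
        capabilities:
          drop: ["ALL"]
```

Applications that need to write temporary files under a read-only root filesystem can mount an `emptyDir` volume at the path they write to.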
Production Kubernetes Concepts
RBAC
Role-Based Access Control. Controls who can do what in your cluster. Define Roles (namespace-scoped) or ClusterRoles (cluster-wide), then bind them to users or service accounts.
Network Policy
Kubernetes firewall rules for pods. By default, all pods can communicate with each other. Network Policies let you restrict this — requires a CNI plugin that supports them (Calico, Cilium).
Pod Disruption Budget
Limits how many pods of an application (selected by label, for example the pods of a Deployment) can be unavailable at the same time during voluntary disruptions (node upgrades). Prevents downtime during cluster maintenance.
Resource Quota
Limits the total resources a namespace can consume. Prevents one team's workload from starving another — set limits on total CPU, memory, and pod count per namespace.
Cluster Autoscaler
Automatically adjusts the number of worker nodes based on pending pods and node utilization. Scale up when pods can't be scheduled; scale down when nodes are underutilized.
CNI Plugin
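Container Network Interface plugin that implements pod networking. Examples: Calico, Cilium, Flannel (on-prem), AWS VPC CNI (EKS), Azure CNI (AKS). Network Policies are only enforced if the CNI plugin implements them (for example Calico or Cilium; Flannel on its own does not).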
Conclusion
This tutorial has taken you from zero to a solid understanding of Kubernetes. You spun up your first local cluster with k3d, wrote YAML manifests, deployed applications, exposed them with Services and Ingress, and managed them with Helm. You then explored advanced topics such as DaemonSets, StatefulSets, security contexts, and the HPA, and finally saw how production Kubernetes works on AWS EKS, Google GKE, and Azure AKS.
The key insight is that the Kubernetes API is the same everywhere. Every concept you've learned — Pods, Deployments, Services, Ingress, ConfigMaps, Secrets, PVCs, Jobs, HPA, Helm — works identically on any Kubernetes cluster, whether it's running on your laptop with k3d or in production on a managed cloud service. The cloud-specific differences are only in the infrastructure layer beneath Kubernetes.
About the Author
Wayne Cheng is the founder and AI app developer at Audoir, LLC. Prior to founding Audoir, he worked as a hardware design engineer for Silicon Valley startups and an audio engineer for creative organizations. He holds an MSEE from UC Davis and a Music Technology degree from Foothill College.
Learn More
Further Exploration
Explore the complete tutorial repository and experiment with extending the examples. Consider deploying a real multi-service application, setting up a CI/CD pipeline that deploys to your cluster, or migrating one of the chapters to a managed cloud provider to deepen your understanding of production Kubernetes.
For more AI-powered development tools and tutorials, visit Audoir.