Kubernetes Tutorial: From Zero to Production

A hands-on Kubernetes tutorial that takes you from zero to deploying and managing containerized applications on a local cluster. Learn core concepts, write YAML manifests, use Helm, explore advanced topics, and understand how Kubernetes works in production on AWS EKS, Google GKE, and Azure AKS.

Complete Tutorial Code

Follow along with the complete source code for this Kubernetes tutorial. Includes setup, deployments, services, Helm charts, advanced topics, and production guidance across six chapters.

What is Kubernetes?

Kubernetes (often abbreviated as K8s) is an open-source container orchestration platform originally developed by Google. It automates the deployment, scaling, and management of containerized applications across a cluster of machines.

At its core, Kubernetes answers the question: “I have many containers — how do I run them reliably, at scale, across multiple machines?”

New to containers or Docker? Check out the Core Concepts reference guide first.

Key capabilities:

Self-Healing

Automatically restarts failed containers and reschedules pods onto healthy nodes when a node dies, keeping your application running without manual intervention.

Horizontal Scaling

Scale your application up or down with a single command or automatically based on CPU/memory usage using the Horizontal Pod Autoscaler.

Load Balancing & Service Discovery

Distributes network traffic so deployments are stable. Containers can find each other by name without hardcoded IPs — built-in service discovery.

Secret & Config Management

Store and manage sensitive information separately from your container image using ConfigMaps & Secrets — keeping credentials out of your codebase.

When Should You Use Kubernetes?

Kubernetes is a powerful tool, but it's not always the right one. Here's how it compares to other deployment strategies:

✅ Use Kubernetes when:

  • You have multiple microservices that need to communicate, scale independently, and be deployed separately.
  • You need fine-grained control over resource allocation (CPU, memory limits per service).
  • You're running stateful workloads (databases, queues) alongside stateless ones.
  • You need multi-cloud or hybrid-cloud portability — your app runs the same way on AWS, GCP, Azure, or on-prem.
  • Your team needs advanced deployment strategies like canary releases, blue/green deployments, or A/B testing.
  • You're operating at significant scale (dozens of services, hundreds of instances).

⚡ Use Serverless (e.g., Vercel / Next.js) when:

  • You're building a frontend-heavy or full-stack web app where the platform handles all infrastructure.
  • You want zero infrastructure management — just push code and it's live.
  • Your traffic is unpredictable or low — serverless scales to zero and you only pay for what you use.
  • You don't need persistent connections, long-running processes, or custom networking.

Example: A Next.js marketing site or SaaS frontend deployed on Vercel is a perfect serverless use case. You don't need Kubernetes for that.

🖥️ Use Provisioned Resources (e.g., AWS EC2) when:

  • You need a specific OS environment or custom kernel configuration.
  • You're running legacy applications that aren't containerized.
  • You're doing GPU-intensive workloads (ML training) where you need direct hardware access.

Summary Table

Scenario | Best Choice
Microservices at scale | ✅ Kubernetes
Simple web app / frontend | ⚡ Serverless (Vercel, Netlify)
Full control over OS/hardware | 🖥️ EC2 / VMs
Multi-service backend with APIs | ✅ Kubernetes
Unpredictable traffic, pay-per-use | ⚡ Serverless
Legacy app, not containerized | 🖥️ EC2 / VMs
Portable, cloud-agnostic deployment | ✅ Kubernetes

Prerequisites

Before starting, make sure you have:

  • A computer running macOS, Linux, or Windows (WSL2 recommended on Windows)
  • Basic familiarity with the command line / terminal
  • Docker installed and running

Tutorial Chapters

The tutorial is organized into six chapters plus a core concepts reference guide. Work through them in order for the best learning experience.

🛠️ Chapter 01 — Setup

Install the required tools, create your first k3d cluster, and explore it with kubectl. By the end of this chapter you'll have a fully functional local Kubernetes cluster running on your machine.

Topics: k3d, kubectl, Docker, Local Cluster

🚀 Chapter 02 — Deployment

Write YAML manifests, create namespaces, deploy an application, and verify it with BusyBox. Learn the fundamental building blocks of Kubernetes workloads — Pods, ReplicaSets, and Deployments.

Topics: YAML Manifests, Namespaces, Pods, Deployments, BusyBox

🌐 Chapter 03 — Services and Beyond

Expose your app with a Service and an Ingress, add resource limits, and learn Kubernetes architecture. Understand how traffic flows from the outside world into your cluster and between services.

Topics: Services, LoadBalancer, ClusterIP, Resource Limits, Architecture

⛵ Chapter 04 — Helm

Use Helm — the Kubernetes package manager — to install, upgrade, and roll back applications on a k3d cluster with podinfo. Learn how Helm charts simplify complex deployments and enable repeatable releases.

Topics: Helm, Helm Charts, podinfo, Upgrades, Rollbacks

🔬 Chapter 05 — Advanced Topics

Pod controllers (Deployment, DaemonSet, Job), stateful workloads, security contexts, Snyk scanning, Prometheus & Grafana monitoring, and Horizontal Pod Autoscaler. Everything you need to run production-grade workloads.

Topics: DaemonSet, StatefulSet, Jobs, HPA, Prometheus, Grafana, Snyk

☁️ Chapter 06 — Kubernetes in Production

How this tutorial compares to on-prem and managed cloud Kubernetes. Deep dive on AWS EKS, Google GKE, and Azure AKS — understand the trade-offs and operational considerations for each managed Kubernetes offering.

Topics: AWS EKS, Google GKE, Azure AKS, On-Prem, Production

🛠️ Chapter 01 — Setup: Your First Kubernetes Cluster

In this chapter, we'll install all the tools you need and spin up a local Kubernetes cluster using k3d. By the end, you'll have a running cluster and know how to inspect it with kubectl.

Prerequisites

Before we begin, make sure you have:

  • A computer running macOS, Linux, or Windows (WSL2 strongly recommended on Windows)
  • A terminal / command-line interface
  • An internet connection (to pull Docker images)

Install Docker

k3d runs Kubernetes nodes as Docker containers, so Docker is required.

macOS

Download and install Docker Desktop from the official site:

docs.docker.com/desktop/install/mac-install/

Linux

curl -fsSL https://get.docker.com | sh

Verify Docker is working

docker --version
docker run hello-world

Install kubectl

kubectl is the command-line tool for interacting with any Kubernetes cluster.

macOS (Homebrew)

brew install kubectl

Linux

curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
chmod +x kubectl
sudo mv kubectl /usr/local/bin/
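The download URL above hardcodes amd64. On ARM machines the path segment would be arm64 instead. A minimal sketch, assuming only those two architectures matter; k8s_arch and kubectl_url are illustrative helper names, not part of any tool:

```shell
# Map `uname -m` output to the architecture names used in dl.k8s.io URLs.
k8s_arch() {
  case "$1" in
    x86_64|amd64)  echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Build the kubectl download URL for a release tag and a machine type.
kubectl_url() {
  echo "https://dl.k8s.io/release/$1/bin/linux/$(k8s_arch "$2")/kubectl"
}

# Usage (assumes curl and network access):
#   version="$(curl -L -s https://dl.k8s.io/release/stable.txt)"
#   curl -LO "$(kubectl_url "$version" "$(uname -m)")"
```

The chmod and mv steps stay the same regardless of architecture.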

Verify

kubectl version --client

Install k3d

k3d is a lightweight wrapper that runs k3s (a minimal Kubernetes distribution) inside Docker containers. It makes creating local Kubernetes clusters incredibly fast and easy.

macOS (Homebrew)

brew install k3d

macOS / Linux (install script)

curl -s https://raw.githubusercontent.com/k3d-io/k3d/main/install.sh | bash

Verify

k3d --version

Create Your First Cluster

Now for the fun part — let's create a Kubernetes cluster! With k3d, this is a single command:

k3d cluster create mycluster

What's happening here? k3d pulls the k3s Docker image, starts a Docker container that acts as your Kubernetes server node (control plane), and automatically configures kubectl to connect to this new cluster by updating your ~/.kube/config file.

Explore the Cluster

Now let's use kubectl to inspect what's running in our new cluster.

  1. kubectl cluster-info

    Displays the addresses of the Kubernetes control plane and core services (like CoreDNS). Confirms your kubectl is connected to the right cluster.

  2. kubectl get nodes

    Lists all the nodes in your cluster. A node is a machine (in our case, a Docker container) that runs your workloads. STATUS: Ready means the node is healthy.

  3. kubectl get namespaces

    Lists all namespaces in the cluster. Namespaces are virtual sub-clusters — default, kube-system, kube-public, and kube-node-lease are created automatically.

  4. kubectl get pods -A

    Lists all pods across all namespaces (-A = --all-namespaces). You'll see system pods: CoreDNS, local-path-provisioner, metrics-server, and Traefik.

  5. kubectl get services -A

    Lists all services across all namespaces. Services provide stable network endpoints to access sets of pods.

Stop the Cluster

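When you're done with this chapter, delete the cluster so the next chapter starts fresh. Stopping first is optional: k3d cluster stop only pauses a cluster that you intend to resume later with k3d cluster start.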

k3d cluster stop mycluster
k3d cluster delete mycluster

Summary

  • Installed Docker, kubectl, and k3d
  • Created a local Kubernetes cluster with k3d cluster create mycluster
  • Explored the cluster using kubectl cluster-info, get nodes, get namespaces, get pods -A, and get services -A
  • Learned how to stop and manage your cluster

🚀 Chapter 02 — Deployment: Running Your First Application

In this chapter, we'll deploy a real application to Kubernetes. You'll learn how to write YAML manifests, create a namespace, deploy an app with a Deployment, and verify it's running — all using kubectl.

Concepts: YAML & Infrastructure as Code

Infrastructure as Code (IaC) means defining your infrastructure in files that can be version-controlled, reviewed, and applied automatically. In Kubernetes, every resource — namespaces, deployments, services — is defined in YAML files. This is IaC in practice.

GitOps takes this further: your Git repository becomes the single source of truth. When you push a change to a YAML file, an automated system applies it to the cluster — giving you a full audit trail, easy rollbacks, and consistent deployments.

YAML Basics

YAML is the format Kubernetes uses for all its resource definitions. Key rules to remember:

  • Use spaces, not tabs (YAML is whitespace-sensitive)
  • Files use .yaml or .yml extension
  • --- marks the start of a document and separates multiple documents in one file
  • # starts a comment
  • Indentation creates hierarchy — more indented = nested inside the parent
  • Lists use - as a bullet
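Since tabs are an easy mistake to miss, a quick check before applying a manifest can help. This is a sketch assuming a POSIX shell with awk; find_tabs is a hypothetical helper name, not a kubectl feature:

```shell
# Print "file:line" for every line that contains a tab character.
# Returns non-zero if at least one tab was found.
find_tabs() {
  awk -v f="$1" '/\t/ { print f ":" NR; bad = 1 } END { exit bad }' "$1"
}

# Usage:
#   find_tabs deployment.yaml && kubectl apply -f deployment.yaml
```

The apply step only runs when the file is free of tabs.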

Set Up a 4-Node Cluster

For this chapter, we'll create a cluster with 1 server (control plane) node and 3 agent (worker) nodes — 4 nodes total. This lets us see how Kubernetes schedules pods across multiple nodes.

k3d cluster create mycluster --agents 3

Create a Namespace

A Namespace is a virtual partition inside your cluster. We'll create a development namespace to isolate our workloads from the system components.

# namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: development
# Apply the manifest (shell command, not part of namespace.yaml)
kubectl apply -f namespace.yaml

Deploy an Application

Now let's deploy an application. We'll use pod-info-app — a simple Node.js app that displays information about the pod it's running in (name, namespace, IP address).

# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pod-info-deployment
  namespace: development
  labels:
    app: pod-info
spec:
  replicas: 3
  selector:
    matchLabels:
      app: pod-info
  template:
    metadata:
      labels:
        app: pod-info
    spec:
      containers:
        - name: pod-info-container
          image: kimschles/pod-info-app:latest
          ports:
            - containerPort: 3000
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP

Inspect Deployments and Pods

# Apply the deployment
kubectl apply -f deployment.yaml

# Check the deployment
kubectl get deployments -n development

# Check the pods
kubectl get pods -n development

# Get more details about a pod
kubectl describe pod <pod-name> -n development

Self-Healing in Action

When a pod disappears, the Deployment's ReplicaSet notices that fewer than 3 replicas exist and creates a replacement. Try deleting a pod and watch Kubernetes recreate it:

# Delete a pod (replace with actual pod name)
kubectl delete pod pod-info-deployment-xxxxx -n development

# Watch Kubernetes recreate it immediately
kubectl get pods -n development --watch

Test with BusyBox

Deploy a BusyBox pod to test connectivity from inside the cluster:

# busybox.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: busybox
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: busybox
  template:
    metadata:
      labels:
        app: busybox
    spec:
      containers:
        - name: busybox
          image: busybox:latest
          command: ["sleep", "3600"]
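# Apply the BusyBox Deployment (shell commands from here on, not part of busybox.yaml)
kubectl apply -f busybox.yaml

# Get pod IPs
kubectl get pods -n development -o wide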

# Exec into BusyBox
kubectl exec -it busybox-<id> -- /bin/sh

# Inside BusyBox: make an HTTP request to your app
wget <pod-ip>:3000
cat index.html

View Application Logs

# View logs from a pod
kubectl logs <pod-name> -n development

# Follow logs in real time
kubectl logs -f <pod-name> -n development

# Show logs from all pods in the deployment
kubectl logs -l app=pod-info -n development

Scale Up: Add Nodes and Increase Replicas

Add 4 more agent nodes to the running cluster and scale the deployment to 16 replicas:

# Add 4 more agent nodes
k3d node create mycluster-extra --cluster mycluster --role agent --replicas 4

# Verify all 8 nodes are ready
kubectl get nodes

# Update deployment.yaml: set replicas: 16
# Then apply the change
kubectl apply -f deployment.yaml

# See which node each pod landed on
kubectl get pods -n development -o wide
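With 16 pods the wide listing gets long, so a per-node count makes the spread easier to read. A small sketch; pods_per_node is an illustrative helper that only counts the node names it receives on stdin:

```shell
# Count pods per node from a list of node names on stdin (one per line).
pods_per_node() {
  sort | uniq -c | awk '{ print $2 ": " $1 }'
}

# Usage against the cluster (assumes the development namespace from this chapter):
#   kubectl get pods -n development --no-headers \
#     -o custom-columns=NODE:.spec.nodeName | pods_per_node
```

The custom-columns output prints only each pod's spec.nodeName, which avoids parsing the multi-column table.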

Summary

  • Learned YAML syntax and why it's used for IaC and GitOps
  • Created a 4-node k3d cluster (1 server + 3 agents)
  • Created a development namespace to isolate workloads
  • Wrote and applied a Deployment YAML to run 3 replicas of an app
  • Witnessed Kubernetes self-healing by deleting a pod and watching it recover
  • Used BusyBox to test your app from inside the cluster
  • Viewed application logs with kubectl logs
  • Scaled the cluster from 4 to 8 nodes and the deployment from 3 to 16 replicas

🌐 Chapter 03 — Services, Resource Limits, and Kubernetes Architecture

In this chapter, we'll expose our application outside the cluster using a Service and an Ingress, add resource limits to our pods, and then step back to understand how Kubernetes works under the hood: the control plane, the worker nodes, and how they coordinate.

Set Up the Cluster

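Create a k3d cluster with a port mapping so that traffic from your computer reaches the cluster's built-in load balancer. If a cluster named mycluster is still running from Chapter 02, delete it first with k3d cluster delete mycluster: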

k3d cluster create mycluster --agents 2 -p "8080:80@loadbalancer"

This maps port 8080 on your computer to port 80 on the k3d load balancer container. k3d comes with Traefik pre-installed as an ingress controller — it listens on port 80 inside the cluster and routes incoming HTTP requests to the right service.

Expose Your App with a Service and Ingress

Right now, our pods are running but they're only reachable from inside the cluster. To access the app from your browser, we need two things:

  • A Service — gives the pods a stable internal IP address inside the cluster
  • An Ingress — routes HTTP traffic from outside the cluster to the Service

Service Types

ClusterIP (default)

Only reachable inside the cluster. Used for internal service-to-service communication. The Ingress controller will route external traffic to it.

NodePort

Exposes the service on each node's IP at a static port. Accessible from outside the cluster using NodeIP:NodePort. Useful for development and testing.

LoadBalancer

Provisions an external load balancer in cloud environments. k3d emulates this locally through the service load balancer bundled with k3s. A common way to expose a single service in production.

The Service YAML

# service.yaml
apiVersion: v1
kind: Service
metadata:
  name: pod-info-service
  namespace: development
spec:
  type: ClusterIP
  selector:
    app: pod-info
  ports:
    - port: 80
      targetPort: 3000
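Once this Service exists, other pods can reach it by DNS name instead of a pod IP. The sketch below builds the fully qualified name; service_fqdn is an illustrative helper, and cluster.local is the default cluster domain, which a cluster can be configured to change:

```shell
# Build the cluster-internal DNS name for a Service:
#   <service>.<namespace>.svc.<cluster-domain>
service_fqdn() {
  echo "$1.$2.svc.${3:-cluster.local}"
}

service_fqdn pod-info-service development
# prints: pod-info-service.development.svc.cluster.local
```

From a pod in another namespace, such as the BusyBox pod in default, the shorter form pod-info-service.development normally resolves as well. Pods inside the development namespace can use just pod-info-service.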

The Ingress YAML

# ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: pod-info-ingress
  namespace: development
  annotations:
    ingress.kubernetes.io/ssl-redirect: "false"
spec:
  rules:
  - http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: pod-info-service
            port:
              number: 80
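# Shell commands from here on (not part of ingress.yaml).
# The cluster is new, so re-apply the namespace and Deployment from Chapter 02 first
# (with replicas set back to 3)
kubectl apply -f namespace.yaml
kubectl apply -f deployment.yaml

kubectl apply -f service.yaml
kubectl apply -f ingress.yaml

# Open in your browser (open is macOS-only; on Linux use xdg-open or paste the URL)
open http://localhost:8080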

Add Resource Requests and Limits

Resource requests and limits tell Kubernetes how much CPU and memory each container needs and is allowed to use. This enables the scheduler to make intelligent placement decisions.

# Add to the container spec in deployment.yaml
resources:
  requests:
    memory: "64Mi"
    cpu: "250m"
  limits:
    memory: "128Mi"
    cpu: "500m"

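After you add this block and re-apply deployment.yaml, Kubernetes performs a rolling update, replacing pods gradually so the app stays available throughout. Requests are what the scheduler uses to place pods; limits cap what a running container may consume.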
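To get a feel for how CPU requests limit placement, the arithmetic can be sketched in a few lines. This is a simplification: it assumes whole-number or millicore quantities (no decimals such as 0.5), ignores memory and system pods, and the 2-core node is a made-up example:

```shell
# Convert a Kubernetes CPU quantity ("250m" or "2") to millicores.
to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;
    *)  echo $(( $1 * 1000 )) ;;
  esac
}

# How many pods fit on a node if only CPU requests are considered.
pods_that_fit() {
  echo $(( $(to_millicores "$1") / $(to_millicores "$2") ))
}

pods_that_fit 2 250m   # node with 2 allocatable cores, pods requesting 250m -> 8
```

A pod whose request does not fit on any node stays Pending until capacity frees up.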

Kubernetes Architecture

A Kubernetes cluster has two main parts: the control plane and the worker nodes.

The Control Plane

Kube API Server

The front door to Kubernetes. Every interaction with the cluster goes through it — kubectl, CI/CD pipelines, and other Kubernetes components all talk to the API server.

etcd

A distributed key-value store that holds the entire state of the cluster. If etcd is lost, the cluster loses its memory — this is why etcd backups are critical in production.

Kube Scheduler

Watches for newly created pods that haven't been assigned to a node yet, and decides which node they should run on based on available resources and constraints.

Controller Manager

Runs control loops that continuously watch the cluster state and take action to reconcile it with the desired state. Powers Kubernetes' self-healing behavior.

Worker Nodes

kubelet

An agent that runs on every worker node. Watches the API server for pods assigned to its node, starts and stops containers, and reports node health back to the control plane.

Container Runtime

When the kubelet needs to start a container, it uses the Container Runtime Interface (CRI) to talk to the container runtime (such as containerd, which k3s uses). The runtime pulls the image and starts the container.

kube-proxy

Runs on every node and maintains network rules that allow pods and services to communicate. Handles load balancing across pod replicas at the network level.

How They Work Together

Here's the sequence of events when you run kubectl apply -f deployment.yaml:

  1. You run kubectl apply → the API server validates the manifest and stores the desired state in etcd
  2. The Controller Manager detects the new Deployment → it creates a ReplicaSet, which creates 3 Pod objects in the Pending state
  3. The Scheduler assigns each pod to a node based on available resources
  4. The kubelet on each assigned node detects the pod assignment
  5. The kubelet tells the container runtime to pull the image and start the container
  6. The pod is Running ✅ and the kubelet reports its status back to the API server
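The Controller Manager's part in this sequence boils down to a loop that compares desired state with actual state. Here is a deliberately simplified sketch of one pass. Real controllers watch the API server and act on objects rather than plain numbers, and reconcile is an illustrative name:

```shell
# One pass of a simplified control loop: compare desired and actual replica
# counts and print the action a controller would take.
reconcile() {
  desired="$1"; actual="$2"
  if [ "$actual" -lt "$desired" ]; then
    echo "create $(( desired - actual ))"
  elif [ "$actual" -gt "$desired" ]; then
    echo "delete $(( actual - desired ))"
  else
    echo "in sync"
  fi
}

reconcile 3 3   # in sync
reconcile 3 2   # create 1  (a pod was deleted or crashed)
reconcile 3 5   # delete 2  (replicas was lowered)
```

Running this comparison continuously is what produces the self-healing behavior seen in Chapter 02.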

Summary

  • Created a k3d cluster with a port mapping (-p "8080:80@loadbalancer") to expose services locally
  • Created a ClusterIP Service to give the pods a stable internal endpoint
  • Created an Ingress to route external traffic through Traefik to the Service
  • Opened the app at http://localhost:8080 and saw the load balancer distribute requests across all 3 pods
  • Added CPU and memory resource requests and limits to the deployment
  • Learned how the Kubernetes control plane and worker nodes work together

⛵ Chapter 04 — Helm: The Kubernetes Package Manager

In this chapter, we'll use Helm — the package manager for Kubernetes — to deploy a real application to our k3d cluster. Instead of writing and managing individual YAML files for every resource, Helm lets you install a complete application with a single command and customize it with a simple values file.

We'll deploy podinfo — a small Go web application purpose-built for demonstrating Kubernetes features.

Install Helm

Helm is a separate CLI tool you install alongside kubectl.

macOS (Homebrew)

brew install helm

Linux

curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash

Verify

helm version

Set Up the Cluster

# If mycluster still exists from Chapter 03, delete it first: k3d cluster delete mycluster
k3d cluster create mycluster --agents 1 -p "8080:80@loadbalancer"

Add the podinfo Helm Repository

A Helm repository is a collection of charts hosted at a URL — similar to how apt or brew have package repositories.

# Add the podinfo chart repository
helm repo add podinfo https://stefanprodan.github.io/podinfo

# Update your local cache
helm repo update

# List your configured repositories
helm repo list

Explore the Chart

Before installing anything, inspect the chart to understand what it will create and what you can configure:

# View chart metadata
helm show chart podinfo/podinfo

# View the default values (all configurable options)
helm show values podinfo/podinfo

# Search for charts on Artifact Hub
helm search hub podinfo

# Search within your added repositories
helm search repo podinfo

Install podinfo with a Values File

Rather than passing every option on the command line, create a values.yaml file to override the defaults:

# values.yaml
replicaCount: 2

ui:
  color: "#4CAF50"
  message: "Hello from Helm + k3d!"

ingress:
  enabled: true
  className: ""
  annotations:
    ingress.kubernetes.io/ssl-redirect: "false"
  hosts:
    - host: ""
      paths:
        - path: /
          pathType: Prefix

resources:
  requests:
    cpu: 10m
    memory: 32Mi
  limits:
    memory: 64Mi
# Create the namespace (shell commands from here on, not part of values.yaml)
kubectl create namespace podinfo

# Install the chart
helm install podinfo podinfo/podinfo --namespace podinfo -f values.yaml

💡 Release vs. Chart

A chart is the package (like a recipe). A release is a specific installation of that chart in your cluster (like a meal you cooked from the recipe). You can install the same chart multiple times with different release names and different values — for example, a podinfo-staging release and a podinfo-production release.
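As a sketch of the one-chart, many-releases idea, the helper below installs one release per environment. The function name and the podinfo-<env> naming scheme are assumptions for illustration; helm upgrade --install and --create-namespace are standard Helm flags:

```shell
# Install (or upgrade) one release of the podinfo chart per environment.
# Release and namespace names are illustrative, not part of the tutorial repo.
deploy_podinfo() {
  env_name="$1"
  helm upgrade --install "podinfo-$env_name" podinfo/podinfo \
    --namespace "podinfo-$env_name" --create-namespace \
    -f values.yaml
}

# Usage:
#   deploy_podinfo staging
#   deploy_podinfo production
#   helm list --all-namespaces
```

With the values.yaml shown earlier, both releases would claim the same Ingress path, so a second release would likely need its own host or path to be reachable separately.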

Verify the Deployment

# Check the pods
kubectl get pods -n podinfo

# Check the Helm release
helm list -n podinfo

# Open in your browser (open is macOS-only; on Linux use xdg-open or paste the URL)
open http://localhost:8080

Upgrade a Release

Update the values file (e.g., change replicaCount: 3) and upgrade the release:

helm upgrade podinfo podinfo/podinfo --namespace podinfo -f values.yaml

# Check the release history
helm history podinfo -n podinfo

Roll Back a Release

If something goes wrong, roll back to a previous revision:

# Roll back to revision 1
helm rollback podinfo 1 -n podinfo

# Verify the rollback
helm history podinfo -n podinfo

Uninstall a Release

helm uninstall podinfo -n podinfo

# Delete the cluster when done
k3d cluster delete mycluster

Summary

  • Installed Helm and added the podinfo chart repository
  • Explored the chart's default values with helm show values
  • Created a values.yaml file to customize the deployment
  • Installed podinfo with helm install and verified it in the browser
  • Upgraded the release with helm upgrade and checked the history
  • Rolled back to a previous revision with helm rollback
  • Uninstalled the release with helm uninstall

🔬 Chapter 05 — Advanced Topics

In this chapter, we'll explore several advanced Kubernetes topics by running real commands against a k3d cluster. We'll cover different ways to manage pods, persistent storage, security hardening, logging and monitoring with Prometheus and Grafana, and automatic scaling with the Horizontal Pod Autoscaler.

Set Up the Cluster

k3d cluster create mycluster --agents 2 -p "8080:80@loadbalancer"
kubectl create namespace advanced

Ways to Manage Pods

Kubernetes provides several different controllers for managing pods. Each is designed for a specific use case.

Deployment: Rolling Updates

Deployments support rolling updates — Kubernetes replaces pods one at a time with zero downtime. Configure the strategy in your deployment YAML:

spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1   # At most 1 pod can be unavailable during an update
      maxSurge: 1         # At most 1 extra pod can be created during an update
# Watch the rolling update in real time (shell commands from here on, not part of the YAML above)
kubectl get pods -n advanced --watch

# Check rollout history
kubectl rollout history deployment/pod-info-deployment -n advanced

# Roll back to the previous version
kubectl rollout undo deployment/pod-info-deployment -n advanced
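The two strategy settings translate directly into bounds on the pod count during an update. A small sketch of that arithmetic, assuming absolute numbers rather than percentages; rollout_bounds is an illustrative name:

```shell
# Pod-count bounds during a rolling update with absolute maxUnavailable/maxSurge.
rollout_bounds() {
  replicas="$1"; max_unavailable="$2"; max_surge="$3"
  echo "min available: $(( replicas - max_unavailable ))"
  echo "max total: $(( replicas + max_surge ))"
}

rollout_bounds 3 1 1
# min available: 2
# max total: 4
```

So with 3 replicas and the settings above, at least 2 pods keep serving traffic and at most 4 pods exist at any moment.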

DaemonSet: One Pod Per Node

A DaemonSet ensures that one copy of a pod runs on every eligible node, including nodes added later. Useful for log collectors, monitoring agents, and other node-level services.

# daemonset.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-collector
  namespace: advanced
spec:
  selector:
    matchLabels:
      app: log-collector
  template:
    metadata:
      labels:
        app: log-collector
    spec:
      containers:
        - name: log-collector
          image: busybox:latest
          command: ["sh", "-c", "while true; do echo 'Collecting logs from node'; sleep 30; done"]
          volumeMounts:
            - name: varlog
              mountPath: /var/log
              readOnly: true
          resources:
            requests:
              memory: "16Mi"
              cpu: "10m"
            limits:
              memory: "32Mi"
              cpu: "50m"
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
# Apply the DaemonSet (shell commands from here on, not part of daemonset.yaml)
kubectl apply -f daemonset.yaml

# Verify one pod per node
kubectl get pods -n advanced -l app=log-collector -o wide

Job: Run to Completion

A Job creates pods that run until a task completes successfully. For parallel processing, set spec.parallelism (pods running at once) and spec.completions (total successful runs required).

# job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-calculator
  namespace: advanced
spec:
  template:
    spec:
      containers:
        - name: pi
          image: perl:slim
          command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
          resources:
            requests:
              memory: "64Mi"
              cpu: "100m"
            limits:
              memory: "128Mi"
              cpu: "250m"
      restartPolicy: Never
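One way to run this Job and collect its result is sketched below. The wrapper name run_pi_job and the 120s default timeout are assumptions; the kubectl subcommands are standard:

```shell
# Apply the Job, block until it completes (or the timeout expires),
# then print the output of its pod.
run_pi_job() {
  kubectl apply -f job.yaml &&
  kubectl wait --for=condition=complete job/pi-calculator \
    -n advanced --timeout="${1:-120s}" &&
  kubectl logs job/pi-calculator -n advanced
}

# Usage:
#   run_pi_job        # wait up to 120s
#   run_pi_job 300s   # wait longer on a slow machine
```

Because the steps are chained with &&, the logs are only fetched if the Job actually reached the complete condition.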

Persistent Volumes

Pods are ephemeral: anything written to a container's filesystem is lost when the pod is deleted or rescheduled. A PersistentVolumeClaim (PVC) requests durable storage that survives pod restarts. We'll deploy PostgreSQL with a PVC to demonstrate stateful workloads.

# postgres-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pvc
  namespace: advanced
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

Kubernetes Security

Security Context in Action

A Security Context defines privilege and access control settings for Pods and containers. Compare an insecure deployment (running as root) with a hardened one:

# Hardened security context (container-level: goes under each entry in containers)
securityContext:
  runAsNonRoot: true
  runAsUser: 1000
  readOnlyRootFilesystem: true
  allowPrivilegeEscalation: false
  capabilities:
    drop:
      - ALL

Scanning with Snyk

Snyk scans your YAML files for security misconfigurations before they reach production:

# Install Snyk CLI
npm install -g snyk

# Authenticate
snyk auth

# Scan your YAML files
snyk iac test deployment.yaml

Logging and Monitoring with Prometheus and Grafana

Install the kube-prometheus-stack Helm chart to get Prometheus (metrics collection) and Grafana (visualization) running in your cluster in minutes:

# Add the prometheus-community Helm repository
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

# Create the monitoring namespace
kubectl create namespace monitoring

# Install the kube-prometheus-stack
helm install prometheus prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --set grafana.adminPassword=admin

# Port-forward Grafana to your browser
kubectl port-forward svc/prometheus-grafana 3000:80 -n monitoring

Open http://localhost:3000 in your browser (username: admin, password: admin). Pre-built dashboards give you instant visibility into cluster health, node resource usage, and application metrics.

Horizontal Pod Autoscaler (HPA)

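The HPA automatically scales the number of Pod replicas based on observed CPU utilization or custom metrics. Set a target CPU percentage and min/max replica counts, and Kubernetes handles the rest. CPU-based scaling needs two things: a metrics source (the metrics-server pod you saw in Chapter 01) and CPU requests on the target containers, because utilization is measured as a percentage of the request.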

# Create an HPA targeting 30% CPU utilization
kubectl autoscale deployment pod-info-deployment \
  --namespace advanced \
  --cpu-percent=30 \
  --min=2 \
  --max=8

# Watch the HPA
kubectl get hpa -n advanced --watch

# Generate load to trigger scaling
kubectl run load-generator --image=busybox -n advanced -- \
  /bin/sh -c "while true; do wget -q -O- http://pod-info-service; done"

# Stop the load generator
kubectl delete pod load-generator -n advanced
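The replica count the HPA aims for follows a documented formula: desired = ceil(current × currentUtilization / targetUtilization), clamped to the min and max. A sketch using integer shell arithmetic with the 30% target and 2 to 8 range from above; it leaves out the tolerance band and stabilization window the real controller applies, and hpa_desired is an illustrative name:

```shell
# desired = ceil(current * utilization / target), clamped to [min, max]
hpa_desired() {
  current="$1"; utilization="$2"; target="$3"; min="$4"; max="$5"
  desired=$(( (current * utilization + target - 1) / target ))
  [ "$desired" -lt "$min" ] && desired="$min"
  [ "$desired" -gt "$max" ] && desired="$max"
  echo "$desired"
}

hpa_desired 2 90 30 2 8    # 2 pods at 90% CPU, target 30% -> 6
hpa_desired 6 10 30 2 8    # load drops to 10%            -> 2
hpa_desired 4 100 30 2 8   # ceil(13.33) = 14, capped     -> 8
```

This is why a low target such as 30% makes the deployment scale out quickly under the load generator.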

💡 HPA + Cluster Autoscaler

The HPA scales pods. The Cluster Autoscaler scales nodes. Together they give you fully elastic infrastructure. A local k3d cluster has no Cluster Autoscaler, so here only the pod count changes.

Summary

  • Deployed a Deployment and triggered a rolling update — watching pods replaced one at a time with zero downtime
  • Created a DaemonSet and verified one pod ran on every node
  • Ran a Job to completion and used a parallel Job to process multiple tasks simultaneously
  • Created a PersistentVolumeClaim, deployed PostgreSQL with it, and verified data survived a pod restart
  • Compared an insecure deployment (running as root) with a hardened one using a Security Context
  • Scanned YAML files with Snyk and fixed all reported issues
  • Installed Prometheus + Grafana with Helm and explored pre-built Kubernetes dashboards
  • Created an HPA, generated load, and watched it automatically scale pods up and back down

☁️ Chapter 06 — Kubernetes in Production: Local, On-Prem, and the Cloud

Throughout this tutorial, you've been running Kubernetes locally using k3d — a lightweight tool that runs a full Kubernetes cluster inside Docker containers on your computer. That's great for learning, but production Kubernetes looks very different.

In this chapter, we'll compare three ways to run Kubernetes (local k3d, on-premises, and managed cloud), then deep-dive into the three major cloud providers' managed Kubernetes offerings: Amazon EKS, Google GKE, and Microsoft AKS.

What You've Been Running: k3d

✅ What k3d gives you

  • A full Kubernetes API — everything transfers to production
  • Fast cluster creation — a new cluster in seconds
  • Multi-node simulation with Docker containers
  • Zero cost — runs entirely on your computer

❌ What k3d doesn't give you

  • Real hardware isolation
  • High availability
  • Production-grade networking
  • Persistent enterprise storage
  • Real cloud load balancers
  • Scalability beyond your laptop

On-Premises Kubernetes

On-premises Kubernetes means running Kubernetes on hardware that you own and operate — either in your own data center or in a co-location facility.

Popular on-prem distributions:

kubeadm: The official Kubernetes bootstrapping tool. Gives you a vanilla upstream Kubernetes cluster.
k3s: The same lightweight distribution behind k3d, but deployed on real servers. Great for edge computing.
RKE2: Rancher's hardened Kubernetes distribution, focused on security and compliance.
OpenShift: Red Hat's enterprise Kubernetes platform with additional developer tooling and stricter security defaults.
Talos Linux: An immutable, API-driven OS designed specifically for running Kubernetes.

When you run Kubernetes on-prem, you own the entire stack: hardware procurement, OS patching, Kubernetes upgrades, etcd backups, high availability, networking (CNI plugin), storage (CSI driver), load balancing, certificate management, and security.

On-prem makes sense for: data sovereignty/compliance, existing hardware investment, air-gapped environments, and predictable high-volume workloads.

Managed Kubernetes on the Cloud

A managed Kubernetes service means the cloud provider handles the control plane for you. You focus on deploying your applications; they handle the infrastructure.

Key advantages over on-prem:

  • No control plane ops: The provider handles etcd backups, API server upgrades, and control plane certificate rotation.
  • Cloud-native integrations: LoadBalancer Services automatically provision real cloud load balancers; PVCs provision real cloud disks.
  • Cluster Autoscaler: Automatically adds or removes worker nodes based on pending pods.
  • Pay-as-you-go: Pay only for the worker nodes you use (the control plane is either free or a small flat hourly fee, depending on provider and tier).
  • Managed upgrades: Upgrade Kubernetes versions with a single API call or click.
  • Global availability: Deploy clusters in any region the provider offers within minutes.
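
To make the cloud-native integrations point concrete, here is a minimal sketch of the two manifests involved. The names, labels, ports, and storage size are illustrative assumptions, not values from the tutorial repository. On a managed cluster, applying them would typically cause the provider to create a cloud load balancer and a cloud disk.

```yaml
# Sketch only: names, labels, ports, and sizes are illustrative assumptions.
apiVersion: v1
kind: Service
metadata:
  name: web              # hypothetical Service name
spec:
  type: LoadBalancer     # on a managed cluster, the provider provisions a cloud load balancer
  selector:
    app: web             # assumes Pods labeled app=web exist
  ports:
    - port: 80           # port exposed by the load balancer
      targetPort: 8080   # assumed container port
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: web-data         # hypothetical claim name
spec:
  # storageClassName is omitted, so the cluster's default StorageClass is used
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```

The same two manifests also work on k3d, where k3s's built-in service load balancer and local-path storage stand in for the cloud resources.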

Side-by-Side Comparison

|                   | k3d (this tutorial)     | On-Premises             | Managed Cloud           |
|-------------------|-------------------------|-------------------------|-------------------------|
| Purpose           | Local dev & learning    | Full control production | Managed production      |
| Control plane     | Docker on your computer | You install & manage    | Cloud provider manages  |
| Worker nodes      | Docker containers       | Physical servers / VMs  | Cloud VMs               |
| High availability | No                      | You configure           | Built-in (provider SLA) |
| K8s upgrades      | Recreate cluster        | Manual, complex         | One-click or automated  |
| Cost              | Free                    | Hardware + ops team     | Pay-per-use             |
| Ops burden        | None                    | Very high               | Low to medium           |

Cloud Provider Comparison: EKS vs. GKE vs. AKS

All three major cloud providers offer managed Kubernetes. They all run standard Kubernetes under the hood — your YAML manifests work on all three. The differences are in the surrounding ecosystem, pricing, and operational experience.

Amazon EKS (Elastic Kubernetes Service)

06-kubernetes-in-production/README.md#amazon-eks

AWS's managed Kubernetes service, and one of the most widely adopted managed Kubernetes offerings. Deep integration with IAM, the AWS Load Balancer Controller (formerly the ALB Ingress Controller), EBS/EFS storage, and CloudWatch. Fargate support lets you run Pods without managing EC2 nodes.

Control plane cost: $0.10/hour per cluster (~$73/month)

Networking: AWS VPC CNI — pods get real VPC IP addresses

IAM integration: IRSA (IAM Roles for Service Accounts)

# Create an EKS cluster with eksctl
eksctl create cluster \
  --name my-cluster \
  --region us-east-1 \
  --nodegroup-name standard-workers \
  --node-type t3.medium \
  --nodes 3
IAM Integration · Fargate · ALB Ingress · eksctl
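
As a rough illustration of how IRSA surfaces inside the cluster, the sketch below annotates a ServiceAccount with an IAM role ARN. The ServiceAccount name, the placeholder account ID, and the role name are assumptions; the IAM role, its trust policy, and the cluster's OIDC provider would have to be set up separately in AWS.

```yaml
# Sketch only: the account ID and role name are placeholders, not real resources.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: s3-reader          # hypothetical ServiceAccount name
  namespace: default
  annotations:
    # Pods that use this ServiceAccount receive temporary credentials for this IAM role
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/s3-reader-role
```

A Pod opts in by setting `serviceAccountName: s3-reader` in its spec; no static AWS keys need to be stored in a Secret.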

Google GKE (Google Kubernetes Engine)

06-kubernetes-in-production/README.md#google-gke

Google invented Kubernetes, and GKE shows it. Autopilot mode fully manages the node infrastructure for you. Best-in-class auto-upgrade, auto-repair, and Workload Identity for secure GCP service access. Often considered the most “Kubernetes-native” experience.

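Control plane cost: $0.10/hour per cluster; the GKE free tier's monthly credit offsets this fee for one zonal Standard or Autopilot cluster per billing account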

Networking: VPC-native (alias IP) — pods get real VPC IPs

IAM integration: Workload Identity

# Create a GKE Autopilot cluster
gcloud container clusters create-auto my-autopilot-cluster \
  --region us-central1
Autopilot · Workload Identity · Auto-Upgrade · gcloud
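
One common way to wire up Workload Identity is to annotate a Kubernetes ServiceAccount with the Google service account it should act as. The sketch below assumes a hypothetical project `my-project` and Google service account `app-gsa`; the IAM binding that lets the Kubernetes ServiceAccount impersonate it is created separately in GCP.

```yaml
# Sketch only: project ID and service account names are hypothetical.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-ksa            # hypothetical Kubernetes ServiceAccount
  namespace: default
  annotations:
    # Maps this ServiceAccount to a Google service account
    iam.gke.io/gcp-service-account: app-gsa@my-project.iam.gserviceaccount.com
```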

Microsoft AKS (Azure Kubernetes Service)

06-kubernetes-in-production/README.md#microsoft-aks

Microsoft's managed Kubernetes offering. Strong integration with Microsoft Entra ID (formerly Azure Active Directory), Azure Container Registry, and Azure Monitor. Virtual Nodes (backed by Azure Container Instances) allow burst scaling without pre-provisioned VMs. The natural choice for Microsoft-centric organizations.

Control plane cost: Free on the Free tier; the Standard tier, which adds an uptime SLA, is billed per cluster-hour

Networking: Azure CNI — pods get real VNet IPs

IAM integration: Microsoft Entra Workload ID (formerly Azure AD Workload Identity)

# Create an AKS cluster
az aks create \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --node-count 3 \
  --node-vm-size Standard_D2s_v3 \
  --generate-ssh-keys
Azure AD · Virtual Nodes · ACR Integration · az cli
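
On the Kubernetes side, workload identity on AKS is typically expressed as an annotation on the ServiceAccount plus an opt-in label on the Pod. The sketch below uses a placeholder client ID and hypothetical names and image; it also assumes workload identity is enabled on the cluster and that a federated credential has been configured in Azure.

```yaml
# Sketch only: the client ID is a placeholder; names and image are hypothetical.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-sa
  namespace: default
  annotations:
    # Client ID of the managed identity or app registration to use
    azure.workload.identity/client-id: "00000000-0000-0000-0000-000000000000"
---
apiVersion: v1
kind: Pod
metadata:
  name: app
  namespace: default
  labels:
    azure.workload.identity/use: "true"   # opts this Pod in to token injection
spec:
  serviceAccountName: app-sa
  containers:
    - name: app
      image: registry.example.com/app:1.0.0   # hypothetical image
```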

Which Should You Choose?

| Scenario                                              | Recommendation                                                      |
|-------------------------------------------------------|---------------------------------------------------------------------|
| Already on AWS (RDS, S3, Lambda, etc.)                | EKS: native integration with your existing AWS services             |
| Best Kubernetes experience / greenfield               | GKE: most mature, fastest upgrades, Autopilot for hands-off ops     |
| Microsoft / Azure ecosystem (.NET, Azure AD, Windows) | AKS: free control plane, Windows node support, AD integration       |
| Serverless / no node management                       | GKE Autopilot or EKS Fargate                                        |
| Multi-cloud / cloud-agnostic                          | Any: Kubernetes abstracts the cloud; use the same YAML everywhere   |
| Cost-sensitive, many small clusters                   | AKS (Free tier has no control plane charge); GKE's free-tier credit covers only one zonal or Autopilot cluster per billing account |

Key Concepts for Production Kubernetes

RBAC (Role-Based Access Control)

Controls who can do what in your cluster. In production, you never give everyone cluster-admin access. Define fine-grained roles and bind them to users or service accounts.
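
A fine-grained role might look like the following sketch, which grants read-only access to Pods in a single namespace and binds it to a service account. The namespace `staging` and the account name `ci-bot` are assumptions chosen for illustration.

```yaml
# Sketch only: namespace and subject names are illustrative assumptions.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: staging
rules:
  - apiGroups: [""]                    # "" is the core API group
    resources: ["pods", "pods/log"]
    verbs: ["get", "list", "watch"]    # read-only: no create, update, or delete
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: staging
subjects:
  - kind: ServiceAccount
    name: ci-bot                       # hypothetical service account
    namespace: staging
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: pod-reader
```

You could check the result with `kubectl auth can-i list pods --as=system:serviceaccount:staging:ci-bot -n staging`.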

Network Policies

Kubernetes firewall rules for pods. By default, all pods can communicate with each other. Network Policies let you restrict this — e.g., only allow the 'api' pods to receive traffic from 'frontend' pods.
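
The 'api' and 'frontend' example above could be written roughly as follows. The `app` label keys and port 8080 are assumptions, and the policy only takes effect if the cluster's CNI plugin enforces Network Policies.

```yaml
# Sketch only: labels and port are illustrative assumptions.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-allow-frontend
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: api               # the policy applies to the 'api' pods
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend  # only 'frontend' pods in the same namespace may connect
      ports:
        - protocol: TCP
          port: 8080
```

Once a pod is selected by any Ingress policy, all inbound traffic not explicitly allowed is dropped, so this single manifest both permits 'frontend' traffic and blocks everything else.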

Pod Disruption Budgets (PDB)

Limits how many pods of a Deployment can be unavailable at the same time during voluntary disruptions (node upgrades, cluster maintenance). Without a PDB, a node upgrade could evict all your pods simultaneously.
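
A minimal PDB sketch, assuming a Deployment whose pods carry the label `app: api` and run with three replicas:

```yaml
# Sketch only: the label and replica assumption are illustrative.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-pdb
spec:
  minAvailable: 2        # with 3 replicas, at most 1 pod may be evicted at a time
  selector:
    matchLabels:
      app: api
```

During a node drain, evictions that would drop the number of available pods below `minAvailable` are refused until replacement pods become ready elsewhere.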

Resource Quotas

Limit the total resources a namespace can consume. Prevents one team's workload from starving another — set limits on total CPU, memory, and pod count per namespace.
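
As an illustration, the following sketch caps a hypothetical `team-a` namespace; the specific numbers are assumptions and would be tuned to the cluster's capacity.

```yaml
# Sketch only: namespace name and limits are illustrative assumptions.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a
spec:
  hard:
    requests.cpu: "4"        # sum of CPU requests across all pods
    requests.memory: 8Gi
    limits.cpu: "8"          # sum of CPU limits across all pods
    limits.memory: 16Gi
    pods: "20"               # maximum number of pods in the namespace
```

Note that once a quota constrains CPU or memory, new pods in that namespace are rejected unless they declare the corresponding requests and limits (or a LimitRange supplies defaults).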

Cluster Autoscaler

Automatically adjusts the number of worker nodes based on pending pods and node utilization. Scale up when pods can't be scheduled; scale down when nodes are underutilized.

Summary

Every concept in this tutorial — Pods, Deployments, Services, Ingress, ConfigMaps, Secrets, PVCs, DaemonSets, Jobs, HPA, Security Contexts, Helm — works identically on EKS, GKE, AKS, and any on-prem cluster. The Kubernetes API is the same everywhere.

The cloud-specific differences are in the infrastructure layer beneath Kubernetes: how load balancers are provisioned, how storage is allocated, and how IAM is integrated. Once you understand those integrations for your chosen provider, you're ready to run production workloads.

📚 Core Concepts Reference

New to containers or Docker? This reference guide explains the foundational concepts that Kubernetes is built on — the “what” and “why” behind containers, Docker, and the key Kubernetes objects. Refer back to it whenever you encounter an unfamiliar term in the tutorial.

Containers & Docker

What is a Container?

A lightweight, standalone, executable package that includes everything needed to run a piece of software: code, runtime, system libraries, and configuration. Containers solve the 'it works on my machine' problem by bundling the application with its entire environment.

What is Docker?

The most popular platform for building, running, and sharing containers. Consists of Docker Engine (the runtime), Docker CLI (the docker command), Docker Hub (public registry), and Docker Desktop (GUI for macOS/Windows).

Containers vs. Virtual Machines

Containers share the host OS kernel, so they carry much less overhead than a full VM. They typically start in a second or less rather than minutes, use less memory, and are portable across any machine with a compatible container runtime.

Container Image

A read-only template used to create containers. Built from a Dockerfile using docker build. Stored in a container registry (Docker Hub, ECR, GCR, ACR). Immutable — you replace it with a new version rather than modifying it.

Kubernetes Building Blocks

Pod

The smallest deployable unit in Kubernetes. Wraps one or more containers that share the same network namespace and storage. Pods are ephemeral — they can be created and destroyed at any time.

Node

A machine (physical or virtual) that runs your workloads. Each node runs a kubelet (agent), a container runtime (such as containerd), and kube-proxy. A cluster has one or more nodes.

Cluster

A set of nodes managed by Kubernetes. Has a control plane (API server, etcd, scheduler, controller manager) and worker nodes where your pods run.

Namespace

A virtual partition inside your cluster. Isolates resources between teams or environments. Default namespaces: default, kube-system, kube-public, kube-node-lease.

Deployment

Manages a set of identical pods. Ensures the desired number of replicas are always running. Handles rolling updates and rollbacks. The most common way to run stateless applications.

Service

A stable network endpoint that exposes a set of pods. Since pods are ephemeral (their IPs change), a Service provides a consistent way to reach them. Types: ClusterIP, NodePort, LoadBalancer, ExternalName.

ConfigMap & Secret

ConfigMaps store non-sensitive configuration data (env vars, config files). Secrets store sensitive data (passwords, API keys) — base64-encoded and optionally encrypted at rest.
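
The sketch below shows the two objects side by side, using hypothetical names and values. The Secret's value is the base64 encoding of the string `password`, which illustrates that base64 is an encoding, not encryption: anyone who can read the Secret can decode it.

```yaml
# Sketch only: names, keys, and values are illustrative assumptions.
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  LOG_LEVEL: info              # plain-text, non-sensitive setting
---
apiVersion: v1
kind: Secret
metadata:
  name: app-secret
type: Opaque
data:
  DB_PASSWORD: cGFzc3dvcmQ=    # base64 of "password"; encoded, not encrypted
```

For real protection, restrict access to Secrets with RBAC and enable encryption at rest on the cluster.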

Ingress

A set of routing rules that tells the ingress controller (e.g., Traefik, nginx) how to route incoming HTTP/HTTPS requests to services. Like a smart router sitting in front of your services.

Volume & PersistentVolume

A Volume is a directory accessible to containers in a pod. A PersistentVolume (PV) is a piece of storage provisioned by an admin. A PersistentVolumeClaim (PVC) is a request for storage by a user.

DaemonSet

Ensures that exactly one copy of a pod runs on every node. Used for log collectors (Fluent Bit), monitoring agents (Prometheus node exporter), and other node-level services.

Job & CronJob

A Job creates pods that run until a task completes successfully. A CronJob runs a Job on a schedule (like cron). Use for batch processing, database migrations, and periodic tasks.
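
A minimal CronJob sketch, assuming a hypothetical nightly task; the name, schedule, and command are placeholders for illustration.

```yaml
# Sketch only: name, schedule, and command are illustrative assumptions.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-report
spec:
  schedule: "0 2 * * *"            # every day at 02:00, standard cron syntax
  jobTemplate:
    spec:
      backoffLimit: 3              # retry a failed Job up to 3 times
      template:
        spec:
          restartPolicy: OnFailure # Job pods must use OnFailure or Never
          containers:
            - name: report
              image: busybox:1.36
              command: ["sh", "-c", "echo generating report"]
```

Everything under `jobTemplate` is an ordinary Job spec, so the same block works as a one-off Job when wrapped in `kind: Job`.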

StatefulSet

Like a Deployment, but for stateful workloads (databases, message queues). Provides stable network identities, ordered deployment/scaling, and persistent storage per pod.

HorizontalPodAutoscaler

Automatically scales the number of pod replicas based on observed CPU utilization or custom metrics. Set a target CPU percentage and min/max replica counts — Kubernetes handles the rest.
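
For example, the following sketch targets 70% average CPU utilization for a hypothetical Deployment named `api`. It assumes the metrics server is installed and that the pods declare CPU requests, since utilization is computed relative to the request.

```yaml
# Sketch only: Deployment name and thresholds are illustrative assumptions.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api                      # hypothetical Deployment to scale
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # percent of each pod's CPU request
```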

Security Context

Defines privilege and access control settings for Pods and containers. Key settings: runAsNonRoot, runAsUser, readOnlyRootFilesystem, allowPrivilegeEscalation, capabilities.drop.
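
The key settings listed above fit together roughly as in this sketch. The Pod name, user ID, and image are assumptions; the image must be able to run as a non-root user with a read-only root filesystem.

```yaml
# Sketch only: names, UID, and image are illustrative assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: hardened
spec:
  securityContext:                 # pod-level: applies to all containers
    runAsNonRoot: true
    runAsUser: 10001
  containers:
    - name: app
      image: registry.example.com/app:1.0.0   # hypothetical image
      securityContext:             # container-level settings
        readOnlyRootFilesystem: true
        allowPrivilegeEscalation: false
        capabilities:
          drop: ["ALL"]            # remove all Linux capabilities
```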

Production Kubernetes Concepts

RBAC

Role-Based Access Control. Controls who can do what in your cluster. Define Roles (namespace-scoped) or ClusterRoles (cluster-wide), then bind them to users or service accounts.

Network Policy

Kubernetes firewall rules for pods. By default, all pods can communicate with each other. Network Policies let you restrict this — requires a CNI plugin that supports them (Calico, Cilium).

Pod Disruption Budget

Limits how many pods of a Deployment can be unavailable at the same time during voluntary disruptions (node upgrades). Prevents downtime during cluster maintenance.

Resource Quota

Limits the total resources a namespace can consume. Prevents one team's workload from starving another — set limits on total CPU, memory, and pod count per namespace.

Cluster Autoscaler

Automatically adjusts the number of worker nodes based on pending pods and node utilization. Scale up when pods can't be scheduled; scale down when nodes are underutilized.

CNI Plugin

Container Network Interface plugin: implements pod networking. Examples: Calico, Cilium, Flannel (on-prem), AWS VPC CNI (EKS), Azure CNI (AKS). Network Policies are enforced only if the CNI plugin supports them; Calico and Cilium do, while Flannel on its own does not.

Conclusion

This tutorial has taken you from zero to a solid understanding of Kubernetes — from spinning up your first local cluster with k3d, to writing YAML manifests, deploying applications, exposing them with Services and Ingress, managing them with Helm, exploring advanced topics like DaemonSets, StatefulSets, security contexts, and the HPA, and finally understanding how production Kubernetes works on AWS EKS, Google GKE, and Azure AKS.

The key insight is that the Kubernetes API is the same everywhere. Every concept you've learned — Pods, Deployments, Services, Ingress, ConfigMaps, Secrets, PVCs, Jobs, HPA, Helm — works identically on any Kubernetes cluster, whether it's running on your laptop with k3d or in production on a managed cloud service. The cloud-specific differences are only in the infrastructure layer beneath Kubernetes.

About the Author

Wayne Cheng is the founder and AI app developer at Audoir, LLC. Prior to founding Audoir, he worked as a hardware design engineer for Silicon Valley startups and an audio engineer for creative organizations. He holds an MSEE from UC Davis and a Music Technology degree from Foothill College.

Learn More

Further Exploration

Explore the complete tutorial repository and experiment with extending the examples. Consider deploying a real multi-service application, setting up a CI/CD pipeline that deploys to your cluster, or migrating one of the chapters to a managed cloud provider to deepen your understanding of production Kubernetes.

For more AI-powered development tools and tutorials, visit Audoir.