Kubernetes the Hard Way on AWS

A production-grade Kubernetes cluster built from scratch on AWS infrastructure without using kubeadm, EKS, or any managed tooling. This project demonstrates deep understanding of Kubernetes internals, networking, security, and operational troubleshooting.

📖 Attribution

This project is based on Kubernetes the Hard Way by Kelsey Hightower, adapted for AWS infrastructure. While the original guide uses GCP, I implemented the cluster on AWS with custom VPC design, Network Load Balancer for HA, and AWS-specific security configurations. The debugging challenges and solutions documented here are from my actual implementation experience.

🎯 Project Overview

This implementation follows the "Kubernetes the Hard Way" methodology, where every component is manually configured to understand the complete architecture of a production Kubernetes cluster. The cluster features high availability, custom networking, and manual PKI/TLS configuration.

πŸ—οΈ Architecture

Infrastructure Components

  • VPC & Networking: Custom VPC with public/private subnets across multiple AZs
  • Load Balancing: Network Load Balancer for HA control plane access
  • Security: Custom security groups with principle of least privilege
  • Compute: EC2 instances for control plane and worker nodes

Cluster Components

Control Plane (3 nodes - High Availability)

  • etcd cluster (3 members with Raft consensus)
  • kube-apiserver (multiple instances behind NLB)
  • kube-controller-manager (with leader election)
  • kube-scheduler (with leader election)

Worker Nodes

  • containerd runtime
  • kubelet
  • kube-proxy
  • CNI networking plugins

Networking Configuration

  • Pod CIDR: 10.200.0.0/16
  • Service CIDR: 10.32.0.0/24
  • CNI-based pod networking with cross-node routing
  • CoreDNS for service discovery

🔧 Implementation Details

1. Infrastructure Provisioning

Custom VPC
β”œβ”€β”€ Public Subnets (Multi-AZ)
β”œβ”€β”€ Private Subnets (Multi-AZ)
β”œβ”€β”€ Internet Gateway
β”œβ”€β”€ NAT Gateways
β”œβ”€β”€ Route Tables
└── Security Groups (Control Plane, Workers, etcd)

Configured Network Load Balancer for external API access with health checks targeting kube-apiserver on all control plane nodes.

2. PKI Infrastructure with cfssl

Generated and distributed certificates for:

  • Kubernetes CA
  • API Server (with proper SANs for NLB, node IPs, and cluster DNS)
  • Controller Manager
  • Scheduler
  • Kubelet (per-node client certificates)
  • Kube-proxy
  • Service Account signing keys
  • Admin user certificates

Key learnings: Proper certificate configuration with correct Common Names (CN) and Subject Alternative Names (SANs) is critical for component authentication and API server TLS verification.
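As a local illustration of the SAN requirement (not the actual cluster CA flow, which used cfssl), a throwaway self-signed certificate can be generated and inspected. Filenames here are arbitrary, and the SAN list mirrors the typical API server entries.

```shell
# Issue a short-lived self-signed cert carrying typical API server SANs
# (requires OpenSSL 1.1.1+ for -addext).
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -keyout demo-api-key.pem -out demo-api.pem \
  -subj "/CN=kube-apiserver" \
  -addext "subjectAltName=DNS:kubernetes,DNS:kubernetes.default,DNS:kubernetes.default.svc.cluster.local,IP:10.32.0.1,IP:127.0.0.1"

# Print only the SAN extension to confirm every access path is covered
openssl x509 -in demo-api.pem -noout -ext subjectAltName
```

Running the same `-noout -ext subjectAltName` check against the real certificates before distribution catches a missing NLB DNS name or cluster IP early, instead of as a TLS verification failure at runtime.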

3. etcd Cluster Setup

Deployed 3-node etcd cluster with:

  • TLS peer and client communication
  • Data directory persistence
  • Cluster member configuration
  • Health monitoring

Verified cluster health with etcdctl member list and endpoint health checks.
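The member configuration above corresponds to a systemd unit along these lines on each control plane node; member names, IPs, and file paths are placeholders, not the exact values used.

```ini
# /etc/systemd/system/etcd.service (illustrative excerpt)
[Service]
ExecStart=/usr/local/bin/etcd \
  --name controller-0 \
  --cert-file=/etc/etcd/kubernetes.pem \
  --key-file=/etc/etcd/kubernetes-key.pem \
  --peer-cert-file=/etc/etcd/kubernetes.pem \
  --peer-key-file=/etc/etcd/kubernetes-key.pem \
  --trusted-ca-file=/etc/etcd/ca.pem \
  --peer-trusted-ca-file=/etc/etcd/ca.pem \
  --client-cert-auth \
  --peer-client-cert-auth \
  --initial-advertise-peer-urls https://10.240.0.10:2380 \
  --listen-peer-urls https://10.240.0.10:2380 \
  --listen-client-urls https://10.240.0.10:2379,https://127.0.0.1:2379 \
  --advertise-client-urls https://10.240.0.10:2379 \
  --initial-cluster controller-0=https://10.240.0.10:2380,controller-1=https://10.240.0.11:2380,controller-2=https://10.240.0.12:2380 \
  --initial-cluster-state new \
  --data-dir=/var/lib/etcd
Restart=on-failure
```

The `--client-cert-auth` and `--peer-client-cert-auth` flags are what enforce the TLS peer and client communication listed above.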

4. Control Plane Configuration

kube-apiserver

  • Configured with etcd cluster endpoints
  • TLS certificate authentication
  • Service account key configuration
  • Authorization modes (Node, RBAC)
  • Admission controllers enabled

kube-controller-manager

  • Cluster CIDR configuration
  • Service cluster IP range
  • Leader election enabled
  • Certificate signing controller

kube-scheduler

  • Leader election configuration
  • TLS client authentication

5. Worker Node Bootstrap

Each worker node configured with:

  • containerd as container runtime
  • CNI plugins for pod networking
  • kubelet with TLS bootstrap
  • kube-proxy for service networking

πŸ› Critical Issues Debugged

Issue 1: Services Unreachable - kube-proxy Misconfiguration

Symptom: Services had ClusterIPs assigned but were completely unreachable. No connectivity to any service from pods or nodes.

Root Cause: kube-proxy kubeconfig pointed to an invalid API server endpoint, preventing kube-proxy from watching service/endpoint objects and creating iptables rules.

Debugging Process:

# Checked kube-proxy logs
kubectl logs -n kube-system kube-proxy-xxxxx

# Verified iptables rules were missing
sudo iptables -t nat -L KUBE-SERVICES

# Inspected kube-proxy configuration
kubectl get configmap -n kube-system kube-proxy -o yaml

Resolution:

  • Rebuilt kube-proxy kubeconfig with correct API server endpoint (NLB DNS)
  • Redeployed kube-proxy DaemonSet
  • Validated iptables NAT chains were properly populated with KUBE-SVC-* chains

Learning: kube-proxy implements the Service abstraction by programming iptables rules. Without API connectivity it cannot watch Service and Endpoints objects, so service routing breaks cluster-wide.

Issue 2: DNS Failures - CoreDNS Selector Mismatch

Symptom: DNS queries from pods returned NXDOMAIN for all service names. nslookup kubernetes.default failed.

Root Cause: CoreDNS pods had incorrect labels that didn't match the kube-dns service selector, resulting in zero endpoints.

Debugging Process:

# Checked DNS from pod
kubectl exec -it busybox -- nslookup kubernetes.default

# Verified service endpoints
kubectl get endpoints kube-dns -n kube-system

# Inspected service selector vs pod labels
kubectl get svc kube-dns -n kube-system -o yaml
kubectl get pods -n kube-system --show-labels

Resolution:

  • Updated CoreDNS deployment with matching labels: k8s-app=kube-dns
  • Verified endpoints populated correctly
  • Tested DNS resolution from test pods

Learning: The service → endpoints → pod chain is critical for service discovery. Label mismatches break the entire DNS resolution mechanism.

Issue 3: Pod-to-Pod Communication Failure

Symptom: Pods on different nodes couldn't communicate. Cross-node traffic failed silently.

Root Cause: Incomplete CNI configuration and missing kernel network settings (IP forwarding disabled, bridge netfilter not configured).

Debugging Process:

# Checked pod connectivity
kubectl exec -it pod-1 -- ping <pod-2-ip>

# Verified routing tables
ip route
route -n

# Checked IP forwarding
cat /proc/sys/net/ipv4/ip_forward

# Inspected CNI bridge
ip link show cni0
bridge fdb show

Resolution:

  • Enabled IP forwarding: net.ipv4.ip_forward=1
  • Configured bridge netfilter: net.bridge.bridge-nf-call-iptables=1
  • Verified CNI bridge creation and route propagation
  • Added routes for pod CIDR ranges to appropriate worker nodes

Learning: CNI plugins require proper kernel networking configuration. Understanding how pod IPs are assigned and how cross-node routing works is essential for troubleshooting network issues.
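The kernel settings from the resolution above can be persisted across reboots with a sysctl drop-in; the filename follows a common convention but is otherwise arbitrary.

```
# /etc/sysctl.d/99-kubernetes-net.conf
# br_netfilter must be loaded first (e.g. via /etc/modules-load.d/),
# otherwise the net.bridge.* keys do not exist.
net.ipv4.ip_forward                 = 1
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
```

Applying with `sysctl --system` and re-checking `/proc/sys/net/ipv4/ip_forward` confirms the values survive a reboot, which ad-hoc `sysctl -w` calls do not.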

Issue 4: TLS/PKI Authentication Errors

Symptom: Components couldn't authenticate to API server. Certificate validation errors in logs.

Root Cause: Incorrect certificate Common Names, missing SANs for NLB endpoint, and improper RBAC bindings for system components.

Debugging Process:

# Checked certificate details
openssl x509 -in /path/to/cert -text -noout

# Verified API server was serving correct cert
openssl s_client -connect <nlb-endpoint>:6443

# Reviewed RBAC bindings
kubectl get clusterrolebindings

Resolution:

  • Regenerated certificates with proper CNs (e.g., system:kube-controller-manager)
  • Added all API server access points to SANs (NLB DNS, node IPs, localhost, cluster IP)
  • Created proper RBAC bindings for system:node and system:kube-proxy
  • Redistributed certificates to all nodes

Learning: Kubernetes PKI is highly specific about certificate identities. The CN must match RBAC expectations, and SANs must cover all access patterns.

🧠 Key Technical Learnings

API Request Flow

Understanding the complete path of a kubectl command:

kubectl → NLB → kube-apiserver (any of 3) → authentication → RBAC check → admission → etcd → response

Service Networking Implementation

How kube-proxy implements services using iptables:

ClusterIP request → KUBE-SERVICES chain → KUBE-SVC-* chain → 
KUBE-SEP-* chains (endpoints) → DNAT to pod IP
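This chain traversal materializes as NAT rules of roughly the following shape in iptables-save output; the endpoint IP, port, and hash suffixes here are illustrative, not captured from the cluster.

```
-A KUBE-SERVICES -d 10.32.0.1/32 -p tcp -m tcp --dport 443 -j KUBE-SVC-NPX46M4PTMTKRN6Y
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m statistic --mode random --probability 0.33333 -j KUBE-SEP-EXAMPLE1
-A KUBE-SEP-EXAMPLE1 -p tcp -m tcp -j DNAT --to-destination 10.240.0.10:6443
```

The `statistic` match is how kube-proxy load-balances across endpoints: each KUBE-SEP-* jump is taken with a probability proportional to its share of the remaining endpoints.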

DNS Resolution Flow

Pod → /etc/resolv.conf (CoreDNS ClusterIP) → kube-proxy iptables → 
CoreDNS pod → Kubernetes API (endpoints) → Response

Pod Networking

Container → CNI bridge → veth pair → node routing table → 
destination node → CNI bridge → destination pod

etcd and Cluster State

  • All Kubernetes objects stored in etcd under /registry
  • Watch mechanism enables real-time controller reconciliation
  • Quorum required for write operations (2 of 3 nodes)

πŸ› οΈ Skills Demonstrated

Cloud Infrastructure

  • AWS VPC design and networking
  • Load balancer configuration
  • Security group management
  • Multi-AZ high availability

Kubernetes Internals

  • Control plane component configuration
  • etcd cluster management
  • Container runtime (containerd) setup
  • CNI plugin implementation

Networking

  • Linux networking (iptables, routing, bridges)
  • Service mesh implementation
  • DNS resolution and CoreDNS
  • Cross-node pod communication

Security

  • PKI/TLS certificate management
  • RBAC configuration
  • Component authentication
  • Secure inter-component communication

Debugging & Troubleshooting

  • Systematic problem isolation
  • Log analysis and correlation
  • Network packet tracing
  • State validation (iptables, routes, endpoints)

📊 Validation & Testing

Cluster validation tests performed:

# Component health
kubectl get componentstatuses
kubectl get nodes
etcdctl member list

# Networking
kubectl run busybox --image=busybox --command -- sleep 3600
kubectl exec -it busybox -- nslookup kubernetes.default

# Service routing
kubectl create deployment nginx --image=nginx
kubectl expose deployment nginx --port=80
kubectl run curl --image=curlimages/curl -it --rm -- curl nginx

# DNS
kubectl run -it --rm debug --image=busybox --restart=Never -- nslookup kubernetes.default.svc.cluster.local

🎓 Production-Readiness Insights

This project taught me to think like an SRE debugging production failures:

  1. Never trust abstractions - Understanding what's actually happening at the networking/kernel level is critical
  2. Follow the data path - API requests, service routing, and DNS all have specific paths that can be traced
  3. Validate assumptions - "The config looks correct" means nothing; verify with actual runtime state
  4. Use first principles - When debugging, go back to "what is this component supposed to do?" rather than guessing
  5. Correlate multiple signals - Logs, iptables rules, endpoint objects, and network traces together tell the complete story

🚀 Future Enhancements

Potential extensions to deepen production knowledge:

  • Persistent Storage: EBS CSI driver integration
  • Observability: Prometheus, Grafana, and metrics-server
  • Advanced Networking: Calico/Cilium for network policies
  • Ingress: NGINX ingress controller with TLS
  • Security: Pod Security Standards, OPA Gatekeeper
  • Autoscaling: HPA and Cluster Autoscaler
  • Service Mesh: Istio or Linkerd integration

📚 Technologies Used

  • Infrastructure: AWS (EC2, VPC, NLB, Security Groups)
  • Container Runtime: containerd
  • Kubernetes: v1.28 (all components built from source)
  • Networking: CNI plugins, CoreDNS, iptables
  • PKI: cfssl, OpenSSL
  • Etcd: v3.5
  • Operating System: Ubuntu 22.04 LTS

💡 Why This Matters

Most engineers interact with Kubernetes through abstractions (EKS, GKE, managed services). While this is fine for application development, it creates a knowledge gap when:

  • Production clusters experience networking issues
  • Security incidents require certificate rotation
  • Control plane failures need root cause analysis
  • Custom networking requirements emerge
  • Cost optimization requires architectural changes

This project bridges that gap by building production-grade infrastructure without abstractions, enabling me to debug and operate Kubernetes at the same level as platform engineering and SRE teams at scale.


📫 Contact

If you'd like to discuss Kubernetes architecture, SRE practices, or this implementation, feel free to reach out:

