A modern homelab running on Kubernetes with Talos Linux, migrated from Proxmox/Docker
Architecture • Services • Infrastructure • Deployment • Roadmap
What: Production-grade Kubernetes homelab for self-hosted services
Why: GitOps automation, better scalability, and learning cloud-native tech
How: Talos Linux bare-metal cluster with declarative configuration
Board: Project Board on Jira
```mermaid
graph TB
    subgraph "External Access"
        Internet((Internet))
        CF[Cloudflare DNS]
        DD[DuckDNS]
    end
    subgraph "Homelab Network"
        Router[Router<br/>192.168.10.1]
        subgraph "Kubernetes Cluster"
            subgraph "Control Plane"
                CP[beelink-1<br/>192.168.10.147<br/>Control Plane]
            end
            subgraph "Worker Nodes"
                W1[proxmox<br/>192.168.10.165<br/>Worker Node]
            end
            subgraph "Network Layer"
                MLB[MetalLB<br/>Load Balancer]
                TRF[Traefik<br/>Ingress Controller]
            end
        end
        subgraph "Storage"
            NAS[Synology DS423+<br/>36TB Raw / 24TB Usable<br/>3x 12TB SHR - 1 Drive Redundancy<br/>NFS + iSCSI]
        end
    end
    Internet --> CF
    Internet --> DD
    CF --> Router
    DD --> Router
    Router --> MLB
    MLB --> TRF
    TRF --> CP
    TRF --> W1
    CP -.-> W1
    W1 --> NAS
    CP --> NAS
    classDef control fill:#326CE5,stroke:#fff,stroke-width:2px,color:#fff
    classDef worker fill:#00ADD8,stroke:#fff,stroke-width:2px,color:#fff
    classDef network fill:#FF7300,stroke:#fff,stroke-width:2px,color:#fff
    classDef storage fill:#40C463,stroke:#fff,stroke-width:2px,color:#fff
    class CP control
    class W1 worker
    class MLB,TRF network
    class NAS storage
```
| Service | Purpose | Access |
|---|---|---|
| **🔒 VPN Group (Gluetun Sidecar)** | | |
| ├─ qBittorrent | Torrent downloads | Port 8080 |
| ├─ NZBGet | Usenet downloads | Port 6789 |
| └─ Prowlarr | Indexer management | Port 9696 |
| **📺 Media Management** | | |
| Sonarr / Sonarr2 | TV show automation | Ports 8989 / 8990 |
| Radarr / Radarr2 | Movie automation | Ports 7878 / 7879 |
| Bazarr / Bazarr2 | Subtitle management | Ports 6767 / 6768 |
| Notifiarr | Discord notifications | Port 5454 |
- Jellyfin - Media streaming server with Intel GPU transcoding
- Jellyseerr - Media request management
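The VPN group above relies on every container in a pod sharing one network namespace: Gluetun opens the tunnel, and the download clients' traffic egresses through it automatically. A minimal sketch of that sidecar pattern (image tags, provider, and env values are assumptions, not the exact manifests used here):

```yaml
# Sketch: qBittorrent routed through a Gluetun VPN sidecar.
# Containers in a pod share a network namespace, so qBittorrent's
# traffic leaves through the tunnel Gluetun establishes.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: qbittorrent
spec:
  replicas: 1
  selector:
    matchLabels:
      app: qbittorrent
  template:
    metadata:
      labels:
        app: qbittorrent
    spec:
      containers:
        - name: gluetun
          image: qmcgaw/gluetun        # VPN client sidecar
          securityContext:
            capabilities:
              add: ["NET_ADMIN"]       # needed to create the tun device
          env:
            - name: VPN_SERVICE_PROVIDER
              value: "mullvad"         # assumption; any supported provider
        - name: qbittorrent
          image: linuxserver/qbittorrent
          ports:
            - containerPort: 8080      # WebUI, reachable on the pod IP
```

If Gluetun crashes, the pod's network goes down with it, which is exactly the fail-closed behavior you want for torrent traffic.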
| Service | Namespace | Purpose |
|---|---|---|
| Traefik | traefik | Ingress controller & reverse proxy |
| Cert-Manager | traefik | Automatic SSL certificates via DuckDNS |
| MetalLB | metallb | Bare-metal load balancer |
| K8s-Cleaner | k8s-cleaner | Cleanup completed pods/jobs |
| Descheduler | kube-system | Workload distribution optimization |
| NFS Provisioner | synology-csi | Dynamic volume provisioning |
- LibreChat (`ai-stuff` namespace) - Self-hosted AI chat interface with MongoDB backend
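Services like these are published through Traefik using the wildcard hostname pattern from the networking section. A hypothetical IngressRoute (the service name, namespace, and port here are illustrative):

```yaml
# Sketch: exposing a service via Traefik v3 over HTTPS, terminating TLS
# with the duckdns-wildcard-tls secret issued by cert-manager.
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: jellyseerr          # illustrative
  namespace: jelly          # illustrative
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`jellyseerr.arkhaya.duckdns.org`)
      kind: Rule
      services:
        - name: jellyseerr
          port: 5055
  tls:
    secretName: duckdns-wildcard-tls
```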
```yaml
Cluster:
  OS: Talos Linux v1.6
  Kubernetes: v1.29
  CNI: Flannel
Nodes:
  - Name: beelink-1
    Role: Control Plane
    IP: 192.168.10.147
    Specs: Intel N100, 16GB RAM
  - Name: proxmox
    Role: Worker
    IP: 192.168.10.165
    Specs: Intel i5-7400, 16GB RAM, NVIDIA GT-730
```

Synology DS423+ (36TB Raw / 24TB Usable, 1-drive fault tolerance)
```
├── /volume1/
│   ├── NAS/
│   │   ├── Movies
│   │   ├── Shows
│   │   ├── Music
│   │   ├── Youtube
│   │   └── Downloads/
│   │       ├── Qbittorrent/
│   │       │   ├── Torrents
│   │       │   ├── Completed
│   │       │   └── Incomplete
│   │       └── Nzbget/
│   │           ├── Queue
│   │           ├── Nzb
│   │           ├── Intermediate
│   │           ├── Tmp
│   │           └── Completed
│   │
│   ├── kube/                    # NFS-based PVCs
│   │   ├── jelly/
│   │   │   └── jellyseerr-pvc
│   │   ├── ai-stuff/
│   │   │   └── mongodb-backup-pvc
│   │   ├── default/
│   │   │   └── test-pvc-worker
│   │   └── test-nfs/
│   │       └── test-nfs-pvc
│   │
│   ├── TimeMachine/             # Macbook Backups
│   │
│   └── Docker/                  # Legacy
│       └── Pihole
│
└── iSCSI LUNs (19 total)        # High-performance PVCs
    ├── jellyfin-config          # Jellyfin configs (5Gi)
    ├── jellyfin-data            # Jellyfin metadata
    ├── jellyfin-cache           # Transcoding cache
    ├── jellyfin-log             # Jellyfin logs
    ├── arr-stack configs        # All *arr app configs
    ├── librechat volumes        # AI app storage
    └── ... (other service volumes)
```
Storage Classes:
- `nfs-client` - Dynamic NFS provisioning for general workloads
- `synology-iscsi` - iSCSI LUNs for high-performance/database workloads
- `syno-storage` - Synology CSI driver (alternative option)
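Workloads consume these classes through ordinary PersistentVolumeClaims; the provisioner creates the backing NFS export or iSCSI LUN on demand. A minimal sketch (the claim name and size are illustrative):

```yaml
# Sketch: requesting high-performance iSCSI-backed storage.
# Swap storageClassName for nfs-client on general-purpose data.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-config       # illustrative name
spec:
  storageClassName: synology-iscsi
  accessModes:
    - ReadWriteOnce          # iSCSI LUNs attach to a single node
  resources:
    requests:
      storage: 5Gi
```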
- Load Balancer: MetalLB with IP pool `192.168.10.200-192.168.10.250`
- Ingress: Traefik v3 with automatic SSL
- Domains:
  - Local: `*.arkhaya.duckdns.org` (internal services)
  - Public: `*.arkhaya.xyz` (external access)
- Security: Cloudflare proxy for public services
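The MetalLB pool above is typically declared as an `IPAddressPool` plus an `L2Advertisement` so the speaker answers ARP for assigned addresses. A sketch assuming the `metallb` namespace from the services table (resource names are assumptions):

```yaml
# Sketch: MetalLB layer-2 configuration for the homelab IP range.
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: homelab-pool        # name is an assumption
  namespace: metallb
spec:
  addresses:
    - 192.168.10.200-192.168.10.250
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: homelab-l2          # name is an assumption
  namespace: metallb
spec:
  ipAddressPools:
    - homelab-pool
```

Any Service of type `LoadBalancer` (e.g. Traefik's entry point) then gets an address from this pool automatically.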
Synced from Jira • Updates every 6 hours and on push
When using cert-manager with DuckDNS webhook for wildcard certificates, you may encounter issues:
- "no api token secret provided" - The ClusterIssuer is looking for a secret in the wrong namespace
- DNS propagation timeouts - DuckDNS can take 5-10 minutes to propagate DNS changes
- Wrong ClusterIssuer references - Ensure you're using the Helm-deployed issuer
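For the first error, the DuckDNS token secret must live in the namespace the webhook runs in. A sketch of what that secret looks like, assuming the webhook sits in `cert-manager` as in the Helm install below (the secret name is an assumption; the chart's default may differ):

```yaml
# Sketch: DuckDNS API token secret, co-located with the webhook.
apiVersion: v1
kind: Secret
metadata:
  name: cert-manager-webhook-duckdns  # assumption; check the chart's docs
  namespace: cert-manager             # must match the webhook's namespace
type: Opaque
stringData:
  token: <your-duckdns-token>         # elided; never commit the real token
```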
If you installed the webhook via Helm:

```bash
helm install cert-manager-webhook-duckdns cert-manager-webhook-duckdns/cert-manager-webhook-duckdns \
  --namespace cert-manager \
  --set duckdns.token=$DUCKDNS_TOKEN \
  --set clusterIssuer.production.create=true \
  --set clusterIssuer.staging.create=true \
  --set clusterIssuer.email=gauranshmathur1999@gmail.com
```

Then use the Helm-created ClusterIssuer in your Certificate resources:
```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: duckdns-wildcard-cert
  namespace: traefik
spec:
  secretName: duckdns-wildcard-tls
  issuerRef:
    name: cert-manager-webhook-duckdns-production # Helm-created issuer
    kind: ClusterIssuer
  dnsNames:
    - "arkhaya.duckdns.org"
    - "*.arkhaya.duckdns.org"
```

- Hardware: 2+ machines with 8GB+ RAM
- Network: Static IPs, router access for port forwarding
- Storage: NAS with NFS enabled
- Tools: `kubectl`, `helm`, `talosctl`
```bash
# 1. Apply Talos configuration
talosctl apply-config --nodes 192.168.10.147 --file controlplane.yaml
talosctl apply-config --nodes 192.168.10.165 --file worker.yaml

# 2. Bootstrap cluster
talosctl bootstrap --nodes 192.168.10.147

# 3. Get kubeconfig
talosctl kubeconfig --nodes 192.168.10.147

# 4. Install core services
kubectl apply -f kubernetes/namespaces/
helm install metallb metallb/metallb -n metallb -f helm/metallb/values.yaml
helm install traefik traefik/traefik -n traefik -f helm/traefik/values.yaml

# 5. Deploy applications
kubectl apply -k kubernetes/
```

```
Homelab/
├── kubernetes/          # Raw Kubernetes manifests
│   ├── arr-stack/       # Media automation stack
│   ├── jellyfin/        # Media server configs
│   └── ...
├── helm/                # Helm charts and values
│   ├── traefik/         # Ingress controller
│   ├── cert-manager/    # SSL certificates
│   └── ...
├── ansible/             # Migration playbooks
└── docs/                # Additional documentation
```
| Component | v1 (Proxmox/Docker) | v2 (Kubernetes) |
|---|---|---|
| Platform | Proxmox VE + LXC | Talos Linux bare-metal |
| Containers | Docker Compose | Kubernetes deployments |
| Networking | Manual port mapping | Service mesh + ingress |
| Storage | Local volumes | Dynamic PVCs |
| Updates | Manual per-service | Rolling updates |
| Backups | Scripts | Persistent volumes |
- ✅ Declarative Configuration - Everything as code
- ✅ Self-Healing - Automatic pod restarts
- ✅ Easy Scaling - Just update replica count
- ✅ Better Isolation - Namespace separation
- ✅ Unified Ingress - Single entry point
- ✅ Automated SSL - Cert-manager handles certificates
- VPN Networking → Gluetun sidecar pattern
- GPU Transcoding → Intel device plugin
- Data Migration → Ansible playbooks
- Service Discovery → CoreDNS + Traefik
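Once the Intel device plugin is running, GPU transcoding just becomes a resource request on the Jellyfin container. A sketch of the relevant container spec fragment (image tag elided; only the resources stanza matters):

```yaml
# Sketch: claiming the Intel iGPU for Jellyfin hardware transcoding.
containers:
  - name: jellyfin
    image: jellyfin/jellyfin
    resources:
      limits:
        gpu.intel.com/i915: "1"   # device exposed by the Intel GPU device plugin
```

The scheduler then places the pod only on a node advertising that resource, and the plugin mounts the `/dev/dri` render device into the container.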
- 📋 Project Board
- 📖 v1 README (Legacy setup)
- 🏷️ Talos Linux Docs
- 🎯 TRaSH Guides (Media quality settings)
This is a personal project, but suggestions and improvements are welcome! Feel free to open an issue.
MIT License - Feel free to use this as inspiration for your own homelab!