This is a mono repository for my home infrastructure and Kubernetes cluster. I try to adhere to Infrastructure as Code (IaC) and GitOps practices using tools like Ansible, Terraform, Kubernetes, Flux, Renovate, and GitHub Actions.
My Kubernetes cluster is deployed with Talos. This is a semi-hyper-converged cluster using OpenEBS Mayastor for high-performance block storage, while a separate server with ZFS provides NFS/SMB shares for bulk file storage and backups.
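Talos nodes are configured declaratively with machine configs. As a minimal sketch (the disk, hostname, and endpoint below are hypothetical, not taken from this repo):

```yaml
# Fragment of a Talos machine config; all values here are illustrative.
machine:
  type: controlplane        # or "worker"
  install:
    disk: /dev/vda          # install target disk (hypothetical)
  network:
    hostname: k8s-1         # hypothetical node name
cluster:
  clusterName: home-kubernetes          # hypothetical cluster name
  controlPlane:
    endpoint: https://10.100.0.10:6443  # hypothetical control plane endpoint
```

A config like this gets applied to each node with `talosctl apply-config`.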
There is a template over at onedr0p/cluster-template if you want to try to follow along with some of the practices used here.
- actions-runner-controller: Self-hosted GitHub runners.
- cert-manager: Creates SSL certificates for services in my cluster.
- cilium: Internal Kubernetes container networking interface.
- cloudflared: Enables Cloudflare secure access to certain ingresses.
- external-dns: Automatically syncs ingress DNS records to a DNS provider.
- external-secrets: Manages Kubernetes secrets using Infisical (see the sketch after this list).
- ingress-nginx: Kubernetes ingress controller using NGINX as a reverse proxy and load balancer.
- openebs-mayastor: High-performance block storage for persistent storage.
- sops: Manages secrets for Kubernetes and Terraform which are committed to Git.
- spegel: Stateless cluster-local OCI registry mirror.
- volsync: Backup and recovery of persistent volume claims.
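As an illustration of the external-secrets item above, a typical ExternalSecret points at a secret store backed by Infisical and materializes a regular Kubernetes Secret. The store and key names below are hypothetical:

```yaml
# Sketch of an ExternalSecret backed by an Infisical store (names hypothetical).
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: app-secret
spec:
  secretStoreRef:
    kind: ClusterSecretStore
    name: infisical          # hypothetical store name
  target:
    name: app-secret         # the Kubernetes Secret to create
  data:
    - secretKey: API_KEY     # key in the generated Secret
      remoteRef:
        key: API_KEY         # key in Infisical (hypothetical)
```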
Flux watches the clusters in my `kubernetes` folder (see Directories below) and makes changes to my clusters based on the state of my Git repository.
The way Flux works for me here is it will recursively search the `kubernetes/apps` folder until it finds the top-most `kustomization.yaml` per directory, and then apply all the resources listed in it. That aforementioned `kustomization.yaml` will generally only have a namespace resource and one or many Flux kustomizations (`ks.yaml`). Under the control of those Flux kustomizations there will be a `HelmRelease` or other resources related to the application which will be applied.
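As a sketch of that layout (paths and names hypothetical), a top-level `kustomization.yaml` and its companion `ks.yaml` might look like:

```yaml
# kubernetes/apps/default/atuin/kustomization.yaml (hypothetical path)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ./namespace.yaml   # the namespace resource
  - ./ks.yaml          # the Flux kustomization(s)
---
# kubernetes/apps/default/atuin/ks.yaml (hypothetical)
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: atuin
  namespace: flux-system
spec:
  interval: 30m
  path: ./kubernetes/apps/default/atuin/app   # where the HelmRelease lives
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
```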
Renovate watches my entire repository looking for dependency updates; when they are found, a PR is automatically created. When PRs are merged, Flux applies the changes to my cluster.
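For example, a version pin like the one in this hypothetical `HelmRelease` excerpt is something Renovate's Flux manager can detect and propose an update for:

```yaml
# Hypothetical HelmRelease excerpt; Renovate opens a PR bumping spec.chart.spec.version.
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: ingress-nginx
  namespace: network      # hypothetical namespace
spec:
  interval: 30m
  chart:
    spec:
      chart: ingress-nginx
      version: 4.11.0     # the pin Renovate keeps up to date
      sourceRef:
        kind: HelmRepository
        name: ingress-nginx
```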
This Git repository contains the following directories under `kubernetes`.
📁 kubernetes
├── 📁 apps          # applications
├── 📁 bootstrap     # bootstrap procedures
├── 📁 components    # re-useable components
└── 📁 flux          # flux system configuration
This is a high-level look at how Flux deploys my applications with dependencies. In most cases a `HelmRelease` will depend on other `HelmRelease`s, in other cases a `Kustomization` will depend on other `Kustomization`s, and in rare situations an app can depend on both a `HelmRelease` and a `Kustomization`. The example below shows that `atuin` won't be deployed or upgraded until the `mayastor` Helm release is installed and in a healthy state.
graph TD
%% Styling
classDef kustomization fill:#2f73d8,stroke:#fff,stroke-width:2px,color:#fff
classDef helmRelease fill:#389826,stroke:#fff,stroke-width:2px,color:#fff
%% Nodes
A>Kustomization: openebs]:::kustomization
B[HelmRelease: openebs]:::helmRelease
C[HelmRelease: mayastor]:::helmRelease
D>Kustomization: atuin]:::kustomization
E[HelmRelease: atuin]:::helmRelease
%% Relationships with styled edges
A -->|Creates| B
A -->|Creates| C
C -->|Depends on| B
D -->|Creates| E
E -->|Depends on| C
%% Link styling
linkStyle default stroke:#666,stroke-width:2px
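In Flux terms, the dependency in the diagram above is declared with `dependsOn`. A sketch (namespaces and chart name hypothetical):

```yaml
# Sketch of the atuin HelmRelease waiting on mayastor (namespaces hypothetical).
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: atuin
  namespace: default
spec:
  interval: 30m
  dependsOn:
    - name: mayastor
      namespace: openebs-system   # hypothetical namespace of the mayastor release
  chart:
    spec:
      chart: atuin                # hypothetical chart name
      sourceRef:
        kind: HelmRepository
        name: atuin
```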
Below is my high-level network diagram.
graph TD
%% Styling
classDef network fill:#2f73d8,stroke:#fff,stroke-width:2px,color:#fff
classDef hardware fill:#d83933,stroke:#fff,stroke-width:2px,color:#fff
classDef vm fill:#389826,stroke:#fff,stroke-width:2px,color:#fff
subgraph LAN [LAN - 192.168.1.0/24]
OPN[OPNsense Router]:::hardware
SW[Aruba S2500-48p Switch]:::hardware
PH1[Proxmox Host - Kubernetes]:::hardware
PH2[Proxmox Host - NAS]:::hardware
end
subgraph VLAN100 [SERVERS - 10.100.0.0/24]
K8S1[Talos VM 1]:::vm
K8S2[Talos VM 2]:::vm
K8S3[Talos VM 3]:::vm
K8S4[Talos VM 4]:::vm
K8S5[Talos VM 5]:::vm
K8S6[Talos VM 6]:::vm
K8S7[Talos VM 7]:::vm
end
%% Network connections with styled edges
OPN --- SW
SW --- PH1
SW --- PH2
%% VM connections with styled edges
PH1 --> K8S1
PH1 --> K8S2
PH1 --> K8S3
PH1 --> K8S4
PH1 --> K8S5
PH1 --> K8S6
PH1 --> K8S7
%% Subgraph styling
style LAN fill:#f5f5f5,stroke:#666,stroke-width:2px
style VLAN100 fill:#f5f5f5,stroke:#666,stroke-width:2px
%% Link styling
linkStyle default stroke:#666,stroke-width:2px
While most of my infrastructure and workloads are self-hosted, I do rely upon the cloud for certain key parts of my setup. This saves me from having to worry about three things: (1) dealing with chicken/egg scenarios, (2) services I critically need whether my cluster is online or not, and (3) the "hit by a bus" factor: what happens to critical apps (e.g. email, password manager, photos) that my family relies on when I am no longer around.
An alternative solution to the first two of these problems would be to host a Kubernetes cluster in the cloud and deploy applications like HCVault, Vaultwarden, ntfy, and Gatus; however, maintaining another cluster and monitoring another group of workloads would be more work and would probably cost the same as or more than the services described below.
| Service | Use | Cost |
|---|---|---|
| Infisical | Secrets with External Secrets | Free |
| Cloudflare | Domain and S3 | Free |
| GCP | Voice interactions with Home Assistant over Google Assistant | Free |
| GitHub | Hosting this repository and continuous integration/deployments | Free |
| Migadu | Email hosting | ~$20/yr |
| Pushover | Kubernetes alerts and application notifications | $5 one-time purchase |
| UptimeRobot | Monitoring internet connectivity and external-facing applications | Free |
| Total | | ~$2/mo |
In my cluster there are two instances of ExternalDNS running. One syncs private DNS records to my UDM Pro Max using the ExternalDNS webhook provider for UniFi, while the other syncs public DNS records to Cloudflare. This setup is managed by creating ingresses with two specific classes: `internal` for private DNS and `external` for public DNS. The `external-dns` instances then sync the DNS records to their respective platforms accordingly.
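For instance, an ingress meant to be reachable publicly would use the `external` class, and the Cloudflare-facing instance picks it up. The hostname and target below are hypothetical:

```yaml
# Sketch of a public ingress synced by the external ExternalDNS instance.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app
  annotations:
    # hypothetical CNAME target for the public record
    external-dns.alpha.kubernetes.io/target: external.example.com
spec:
  ingressClassName: external    # picked up by the Cloudflare instance
  rules:
    - host: app.example.com     # hypothetical hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: app
                port:
                  number: 80
```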
| Device | CPU | RAM | Storage | Function |
|---|---|---|---|---|
| Proxmox Host (Kubernetes) | 2x Intel Xeon E5-2697A v4 (64 cores @ 2.60GHz) | 512GB | 1TB NVMe (host), 4x 3.84TB SSD (passthrough) | Kubernetes Cluster |
| Proxmox Host (NAS) | 2x Intel Xeon E5-2687W (32 cores @ 3.10GHz) | 126GB | 2x 120GB SSD (boot), 800GB NVMe, Various HDDs | NAS + Storage |
| OPNsense Router | Intel i3-4130T (2 cores, 4 threads @ 2.90GHz) | 16GB | 120GB SSD | Router |
| Aruba S2500-48p | - | - | - | PoE Switch |
Additional Hardware:
- 4x Tesla P100 16GB GPUs (passthrough to Kubernetes host)
- 7x Talos VMs running on the Kubernetes host
Thanks to all the people who donate their time to the Home Operations Discord community. Be sure to check out kubesearch.dev for ideas on how to deploy applications or get ideas on what you could deploy.