# Turing RK1 Kubernetes Cluster


A 4-node bare-metal Kubernetes cluster built on Turing RK1 compute modules, running either Talos Linux or K3s on Armbian. Designed for edge computing, AI/ML workloads with NPU acceleration, and distributed storage.

## Choose Your Distribution

| Distribution | Best For | NPU/GPU | Shell Access |
| --- | --- | --- | --- |
| Talos Linux | Production, security | No | API only |
| K3s on Armbian | Development, AI/ML | Yes | SSH |

See [docs/COMPARISON.md](docs/COMPARISON.md) for a detailed feature comparison.

## Quick Start

```shell
# Talos Linux (automated deployment)
./scripts/deploy-talos-cluster.sh prereq    # Check prerequisites
./scripts/deploy-talos-cluster.sh deploy    # Full deployment

# K3s on Armbian
./scripts/setup-k3s-node.sh      # Run on each node
./scripts/deploy-k3s-cluster.sh  # Deploy from workstation

# Check cluster status (works with both distributions)
./scripts/talos-cluster-status.sh  # Auto-detects and shows health summary
```

**Note:** This project is under active development. See CONTRIBUTING.md for how to get involved.

## Hardware Summary

### Turing Pi 2 Board

| Component | Specification |
| --- | --- |
| Form Factor | Mini-ITX |
| Node Slots | 4x CM4/RK1 compatible |
| BMC | Integrated management controller |
| Networking | Gigabit Ethernet per node |
| Storage | NVMe slot per node |

### Turing RK1 Compute Modules (x4)

| Component | Specification |
| --- | --- |
| SoC | Rockchip RK3588 |
| CPU | 4x Cortex-A76 @ 2.4GHz + 4x Cortex-A55 @ 1.8GHz |
| RAM | 16GB / 32GB LPDDR4X |
| GPU | Mali-G610 MP4 |
| NPU | 6 TOPS (INT8) - see limitations |
| eMMC | 32GB (system disk) |
| NVMe | 500GB Crucial P3 (worker nodes) |

## Cluster Topology

```text
┌─────────────────────────────────────────────────────────────┐
│                    Turing Pi 2 BMC                          │
│                     10.10.88.70                             │
├─────────────┬─────────────┬─────────────┬───────────────────┤
│   Node 1    │   Node 2    │   Node 3    │      Node 4       │
│ Control Pl. │   Worker    │   Worker    │      Worker       │
│ 10.10.88.73 │ 10.10.88.74 │ 10.10.88.75 │   10.10.88.76     │
│   32GB eMMC │ 32GB + 500GB│ 32GB + 500GB│  32GB + 500GB     │
└─────────────┴─────────────┴─────────────┴───────────────────┘
```

### Total Resources

| Resource | Amount |
| --- | --- |
| CPU Cores | 32 (8 per node) |
| RAM | 64-128GB |
| Storage (eMMC) | 128GB |
| Storage (NVMe) | 1.5TB |
| Network | 4x 1Gbps |

## Software Stack

### Operating System

| Component | Version | Notes |
| --- | --- | --- |
| Talos Linux | v1.11.6 | Immutable, API-driven Kubernetes OS |
| Linux Kernel | 6.12.62 | Mainline kernel (ARM64) |

### Kubernetes Components

| Component | Version | Purpose |
| --- | --- | --- |
| Kubernetes | v1.34.1 | Container orchestration |
| containerd | v2.1.5 | Container runtime |
| etcd | Bundled | Distributed key-value store |

### Storage

| Component | Version | Purpose |
| --- | --- | --- |
| Longhorn | Latest | Distributed block storage |
| CSI Driver | Longhorn | Persistent volume provisioning |

### Networking

| Component | Version | Purpose |
| --- | --- | --- |
| Flannel | Bundled | Pod networking (CNI) |
| MetalLB | Latest | LoadBalancer for bare-metal |
| NGINX Ingress | Latest | HTTP/HTTPS ingress controller |

### Monitoring

| Component | Version | Purpose |
| --- | --- | --- |
| Prometheus | Latest | Metrics collection & alerting |
| Grafana | Latest | Visualization & dashboards |
| Alertmanager | Latest | Alert routing & management |
| Node Exporter | Latest | Host-level metrics |
| kube-state-metrics | Latest | Kubernetes state metrics |

### Management

| Component | Version | Purpose |
| --- | --- | --- |
| Portainer Agent | v2.33.6 | Remote cluster management |
| talosctl | v1.11.6 | Talos node management |
| kubectl | v1.34.x | Kubernetes CLI |
| Helm | v3.x | Package manager |

## Cluster Capabilities

### What This Cluster Can Do

#### Container Orchestration

- Run containerized workloads across 4 nodes
- Automatic pod scheduling and load balancing
- Rolling updates and rollbacks
- Health monitoring and self-healing

#### Distributed Storage

- ~1.5TB distributed storage via Longhorn
- Volume replication across nodes (configurable 1-3 replicas)
- Snapshots and backups
- Dynamic volume provisioning
- High-performance NVMe-backed storage class
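The replica count can be pinned per StorageClass. A minimal sketch, assuming a custom 2-replica class (the class and PVC names here are illustrative, not the ones the repo ships; `driver.longhorn.io` is Longhorn's standard CSI provisioner):

```yaml
# Illustrative StorageClass pinning Longhorn to 2 replicas per volume
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-2-replica        # hypothetical class name
provisioner: driver.longhorn.io
parameters:
  numberOfReplicas: "2"
  staleReplicaTimeout: "30"
---
# A PVC that provisions a volume from that class
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-data                  # hypothetical claim name
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: longhorn-2-replica
  resources:
    requests:
      storage: 10Gi
```

As the limitations section notes, 2 replicas trade redundancy for capacity: a single node failure leaves only one copy.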

#### Networking

- LoadBalancer services via MetalLB (10.10.88.80-89)
- HTTP/HTTPS ingress with NGINX
- TLS termination
- Path- and host-based routing
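A Service can request a specific address from the MetalLB pool via annotation. A hedged sketch (the service name, selector, and target port are hypothetical; the actual pool is defined in `cluster-config/metallb-config.yaml`):

```yaml
# Illustrative LoadBalancer Service pinned to an address from the pool
apiVersion: v1
kind: Service
metadata:
  name: demo-web                   # hypothetical service
  annotations:
    metallb.universe.tf/loadBalancerIPs: 10.10.88.82  # from the available pool
spec:
  type: LoadBalancer
  selector:
    app: demo-web
  ports:
    - port: 80
      targetPort: 8080
```

Omitting the annotation lets MetalLB assign the next free address from the pool automatically.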

#### Edge Computing

- Low-power ARM64 architecture (~10W per node)
- Compact form factor (Mini-ITX)
- Suitable for remote/edge deployments

#### Development & Testing

- Full Kubernetes API compatibility
- Helm chart deployment
- GitOps-ready
- Multi-architecture image support (arm64)

#### AI/ML Workloads (CPU)

- ARM64-optimized inference
- NumPy, ONNX Runtime, PyTorch (CPU)
- ~12 GFLOPS matrix operations per node
- Distributed training/inference across nodes

#### Monitoring & Observability

- Full cluster metrics via Prometheus
- Pre-configured Grafana dashboards
- Node-, pod-, and container-level monitoring
- Alerting with Alertmanager
- External Docker host monitoring support
- Longhorn storage metrics integration

## Limitations & Known Issues

### NPU Not Available (Talos Only)

| Issue | Status | Details |
| --- | --- | --- |
| RK3588 NPU inaccessible | Talos: Not supported | Talos uses the mainline Linux kernel, which lacks Rockchip's proprietary RKNPU driver |
| | K3s/Armbian: Supported | BSP kernel includes full NPU support |

**Impact:** On Talos, the 6 TOPS NPU in each RK3588 cannot be used for hardware-accelerated AI inference.

**Solutions:**

1. Use K3s on Armbian - full NPU support with the RKNN SDK (see docs/INSTALLATION-K3S.md)
2. Use CPU-based inference on Talos (ONNX Runtime, TensorFlow Lite)
3. Wait for the mainline NPU driver (in kernel review)

### GPU Not Available (Talos Only)

| Issue | Status | Details |
| --- | --- | --- |
| Mali-G610 GPU inaccessible | Talos: Not supported | No GPU driver/passthrough in Talos |
| | K3s/Armbian: Supported | OpenCL and Vulkan available |

**Impact:** On Talos, no GPU acceleration for graphics or compute workloads. K3s on Armbian provides full GPU support.

### Storage Limitations

| Issue | Status | Details |
| --- | --- | --- |
| Control plane has no NVMe | By design | Only workers have NVMe; the control plane uses eMMC only |
| Single-replica risk | Configurable | Default is 3 replicas; 2-replica mode loses redundancy if a node fails |

### Network Limitations

| Issue | Status | Details |
| --- | --- | --- |
| No native LoadBalancer | Mitigated | MetalLB provides L2 LoadBalancer functionality |
| Single network interface | Hardware | Each node has only 1x 1Gbps NIC |

### Talos-Specific Considerations

| Issue | Details |
| --- | --- |
| Immutable filesystem | Cannot install packages; must use extensions or containers |
| No SSH access | Nodes managed via the `talosctl` API only |
| Privileged namespaces | Many add-ons require the `pod-security.kubernetes.io/enforce=privileged` label |

### Known Bugs

| Issue | Status | Workaround |
| --- | --- | --- |
| PodSecurity warnings on deploy | Expected | Label namespaces as privileged |
| MetalLB speaker pods require privileges | Expected | Namespace is pre-labeled |
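The privileged-namespace workaround amounts to setting the standard Pod Security labels at namespace creation. An illustrative manifest (the namespace name is just an example; the same labels apply to any add-on namespace that needs them):

```yaml
# Example: a namespace pre-labeled for privileged workloads
apiVersion: v1
kind: Namespace
metadata:
  name: longhorn-system            # example namespace
  labels:
    pod-security.kubernetes.io/enforce: privileged
    pod-security.kubernetes.io/audit: privileged
    pod-security.kubernetes.io/warn: privileged
```

Setting `audit` and `warn` alongside `enforce` also silences the PodSecurity warnings noted above.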

## Network Configuration

### IP Allocation

| Resource | IP Address | Port(s) |
| --- | --- | --- |
| BMC | 10.10.88.70 | 22 (SSH) |
| Control Plane | 10.10.88.73 | 6443 (API) |
| Worker 1 | 10.10.88.74 | - |
| Worker 2 | 10.10.88.75 | - |
| Worker 3 | 10.10.88.76 | - |
| Ingress Controller | 10.10.88.80 | 80, 443 |
| Portainer Agent | 10.10.88.81 | 9001 |
| Available Pool | 10.10.88.82-89 | - |

### Internal Networks

| Network | CIDR | Purpose |
| --- | --- | --- |
| Pod Network | 10.244.0.0/16 | Container IPs |
| Service Network | 10.96.0.0/12 | ClusterIP services |

## Quick Access

### Management URLs

| Service | URL | Notes |
| --- | --- | --- |
| Kubernetes API | https://10.10.88.73:6443 | Use kubeconfig |
| Grafana | http://grafana.local | Default: admin/admin |
| Prometheus | http://prometheus.local | Metrics & queries |
| Alertmanager | http://alertmanager.local | Alert management |
| Longhorn UI | http://longhorn.local | Storage management |
| Portainer | Your Portainer instance | Connect agent: 10.10.88.81:9001 |

Add to `/etc/hosts`:

```text
10.10.88.80  grafana.local prometheus.local alertmanager.local longhorn.local
```
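All of the `*.local` hosts resolve to the single ingress IP, and NGINX then routes by `Host` header. A sketch of what one such rule might look like (resource and backend service names are assumptions; the actual rules live in `cluster-config/ingress-config.yaml`):

```yaml
# Illustrative host-based routing rule for the ingress controller
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: grafana                    # hypothetical resource name
spec:
  ingressClassName: nginx
  rules:
    - host: grafana.local
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: grafana      # assumed backend service name
                port:
                  number: 80
```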

### CLI Access

```shell
# Set environment variables
export TALOSCONFIG=/path/to/cluster-config/talosconfig
export KUBECONFIG=/path/to/cluster-config/kubeconfig

# Verify cluster
kubectl get nodes
talosctl health
```

## BMC Access Setup

The deployment scripts require access to the Turing Pi BMC. Configure credentials by copying the example file:

```shell
cp .env.example .env
# Edit .env with your BMC credentials
```

Required variables in `.env`:

| Variable | Description | Default |
| --- | --- | --- |
| TPI_HOSTNAME | BMC IP address | 10.10.88.70 |
| TPI_USERNAME | BMC login username | - |
| TPI_PASSWORD | BMC login password | - |
| USE_LOCAL_TPI | Use local `tpi` CLI (1) or SSH to BMC (0) | 1 |
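A filled-in `.env` might look like this (the username and password are placeholders; only the hostname default comes from the table above):

```shell
# .env - Turing Pi BMC access (placeholder credentials)
TPI_HOSTNAME=10.10.88.70   # BMC IP address (default)
TPI_USERNAME=root          # placeholder - use your BMC username
TPI_PASSWORD=changeme      # placeholder - use your BMC password
USE_LOCAL_TPI=1            # 1 = local tpi CLI, 0 = SSH to the BMC
```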

Test BMC connectivity:

```shell
./scripts/wipe-cluster.sh status
```

## Documentation Map

### Primary Documentation

| Document | Path | Description |
| --- | --- | --- |
| Docs Index | docs/README.md | Documentation overview |
| Talos Installation | docs/INSTALLATION.md | Talos Linux setup guide |
| K3s Installation | docs/INSTALLATION-K3S.md | K3s on Armbian setup guide |
| Distribution Comparison | docs/COMPARISON.md | Talos vs K3s feature matrix |
| Architecture Diagrams | docs/ARCHITECTURE.md | Visual cluster architecture (Mermaid) |
| Storage Guide | docs/STORAGE.md | Longhorn and NVMe configuration |
| Networking Guide | docs/NETWORKING.md | MetalLB and Ingress setup |
| Monitoring Guide | docs/MONITORING.md | Prometheus, Grafana & external monitoring |
| Quick Reference | docs/QUICKREF.md | Command cheatsheet |

### Configuration Files

| File | Path | Description |
| --- | --- | --- |
| Talos Config | cluster-config/talosconfig | Talos CLI configuration |
| Kubeconfig | cluster-config/kubeconfig | Kubernetes access |
| Cluster Secrets | cluster-config/secrets.yaml | Keep secure! |
| MetalLB Config | cluster-config/metallb-config.yaml | IP pool configuration |
| Ingress Config | cluster-config/ingress-config.yaml | Ingress rules |
| Portainer Agent | cluster-config/portainer-agent.yaml | Agent deployment |
| Prometheus Values | cluster-config/prometheus-values.yaml | Monitoring stack config |
| External Scrape | cluster-config/external-scrape-config.yaml | Docker host monitoring |

### Reference Documentation

| Document | Path | Description |
| --- | --- | --- |
| Cluster Plan | CLUSTER_PLAN.md | Original deployment plan |
| Talos Schematic | talos-schematic.yaml | Custom image configuration |

### External Resources

| Resource | URL |
| --- | --- |
| Talos Documentation | https://www.talos.dev/docs/ |
| K3s Documentation | https://docs.k3s.io/ |
| Longhorn Documentation | https://longhorn.io/docs/ |
| Turing Pi Documentation | https://docs.turingpi.com/ |
| MetalLB Documentation | https://metallb.io/ |
| NGINX Ingress | https://kubernetes.github.io/ingress-nginx/ |
| Prometheus Documentation | https://prometheus.io/docs/ |
| Grafana Documentation | https://grafana.com/docs/ |
| RKNN SDK (NPU) | https://github.com/airockchip/rknn-toolkit2 |
| RKLLM (LLM inference) | https://github.com/airockchip/rknn-llm |

## Directory Structure

```text
turing-rk1-cluster/
├── README.md                 # This file
├── CLUSTER_PLAN.md           # Deployment planning document
├── .env.example              # Environment variables template
├── talos-schematic.yaml      # Talos image customization
├── cluster-config/           # Cluster configurations
│   ├── talosconfig           # Talos CLI config
│   ├── kubeconfig            # Kubernetes access
│   ├── secrets.yaml          # Cluster secrets (sensitive!)
│   ├── controlplane.yaml     # Control plane config (Talos)
│   ├── worker.yaml           # Worker config (Talos)
│   ├── metallb-config.yaml   # MetalLB IP pool
│   ├── ingress-config.yaml   # Ingress rules
│   ├── prometheus-values.yaml # Monitoring stack config
│   ├── external-scrape-config.yaml # External targets
│   └── *.yaml                # Other configurations
├── scripts/                  # Automation scripts
│   ├── deploy-talos-cluster.sh # Automated Talos deployment
│   ├── talos-cluster-status.sh # Cluster health and status checker
│   ├── setup-k3s-node.sh     # Armbian node preparation
│   ├── deploy-k3s-cluster.sh # K3s cluster deployment
│   └── wipe-cluster.sh       # Cluster reset/migration tool
├── docs/                     # Documentation
│   ├── README.md             # Docs index
│   ├── INSTALLATION.md       # Talos setup guide
│   ├── INSTALLATION-K3S.md   # K3s on Armbian setup guide
│   ├── COMPARISON.md         # Talos vs K3s comparison
│   ├── ARCHITECTURE.md       # Cluster architecture diagrams
│   ├── STORAGE.md            # Storage guide
│   ├── NETWORKING.md         # Network guide
│   ├── MONITORING.md         # Monitoring guide
│   └── QUICKREF.md           # Quick reference
├── images/                   # Talos images
│   └── latest/
│       └── metal-arm64.raw   # Current Talos image
└── repo/                     # Submodules/repos
    ├── sbc-rockchip/         # Talos Rockchip overlay
    ├── rknn-toolkit2/        # RKNN SDK v2.3.2 (for K3s)
    ├── rknn-llm/             # RKLLM v1.2.3 (for K3s)
    └── rknn_model_zoo/       # Pre-built models (for K3s)
```

## Security Notes

1. **Secrets Protection:** `cluster-config/secrets.yaml` contains cluster credentials. Keep it secure and never commit it to public repositories.
2. **BMC Access:** The BMC (10.10.88.70) has full control over all nodes. Restrict network access appropriately.
3. **Privileged Workloads:** Many add-ons require privileged namespace labels. Review the security implications before deploying untrusted workloads.
4. **Network Segmentation:** Consider isolating the cluster network (10.10.88.x) from untrusted networks.


## Contributing

This is a personal homelab cluster. Configuration files and documentation are provided as-is for reference.

## License

Configuration files and documentation are provided under MIT license. Third-party components retain their original licenses.