Commit 2cccf2b

docs: rewrite README with Helm as primary install method
Restructure around user journey: install, configure, upgrade, uninstall. Move Helm to top as recommended method. Add full configuration table. Remove demo/contributor sections to focus on end users.
1 parent fdc584f commit 2cccf2b

File tree: 1 file changed (+54, −138 lines)

README.md

Lines changed: 54 additions & 138 deletions
@@ -6,107 +6,92 @@ Kubernetes controller that automatically assigns node roles based on a configura
 
 By default, `kubeadm` enables the NodeRestriction admission controller that restricts what labels `kubelet` can self-apply on node registration. The `node-role.kubernetes.io/*` label is restricted and can't be set in cloud init scripts or during other node bootstrap processes.
 
-## Features
-
-- Watches node add/update events via Kubernetes informer
-- Patches nodes with role labels derived from a configurable source label
-- Leader election via Lease for safe multi-replica deployments
-- Exponential backoff with permanent error detection for patch retries
-- Rate-limited Kubernetes API client to prevent API server overload
-- Prometheus metrics for successful and failed patch operations
-- Health (`/healthz`) and readiness (`/readyz`) endpoints
-- Graceful shutdown on SIGINT/SIGTERM with context propagation
-
-## Requirements
-
-- Kubernetes 1.33+
-- RBAC permissions: nodes (list, watch, patch) and leases (get, create, update)
-
-## Usage
-
-Update [patch-configmap.yaml](deployment/overlays/prod/patch-configmap.yaml) to configure the source label and behavior:
-
-```yaml
-apiVersion: v1
-kind: ConfigMap
-metadata:
-  name: node-role-controller-config
-  namespace: node-labeler
-data:
-  roleLabel: "nodeGroup"    # value of this label becomes the node role
-  roleReplace: "false"      # whether to replace existing node roles
-  logLevel: "info"          # debug, info, warn, error
-```
+## Install
 
-Then apply to the cluster:
+### Helm (recommended)
 
-```sh
-kubectl apply -k deployment/overlays/prod
+```shell
+helm install node-role-controller oci://ghcr.io/mchmarny/node-role-controller \
+  --namespace node-labeler --create-namespace
 ```
 
-This ensures all nodes with `nodeGroup=customer-gpu` get labeled with `node-role.kubernetes.io/customer-gpu`.
-
-> If you change ConfigMap values after deployment, restart to apply: `kubectl -n node-labeler rollout restart deployment node-role-controller`
-
-Alternatively, apply the prebuilt manifest:
+### Manifest
 
 ```shell
 kubectl apply -f https://raw.githubusercontent.com/mchmarny/rolesetter/refs/heads/main/deployment/manifest.yaml
 ```
 
-This creates:
-
-* `Namespace` - Isolates the controller resources
-* `ServiceAccount` - Authenticates the controller
-* `ClusterRole` - Grants node list/watch/patch and lease permissions
-* `ClusterRoleBinding` - Links the role to the ServiceAccount
-* `ConfigMap` - Defines label, replace, and logging configuration
-* `Deployment` - Runs the controller with leader election, health probes, and security hardening
-
-## Helm
-
-Install from the OCI registry:
+### Kustomize
 
 ```shell
-helm install node-role-controller oci://ghcr.io/mchmarny/node-role-controller \
-  --namespace node-labeler --create-namespace
+kubectl apply -k deployment/overlays/prod
 ```
 
-Configure via values:
+## Configuration
+
+The controller is configured via environment variables sourced from a ConfigMap. With Helm, set values directly:
 
 ```shell
 helm install node-role-controller oci://ghcr.io/mchmarny/node-role-controller \
   --namespace node-labeler --create-namespace \
   --set config.roleLabel=nodeGroup \
   --set config.roleReplace=true \
-  --set replicas=2
+  --set config.logLevel=debug
 ```
 
-Uninstall:
+| Parameter | Default | Description |
+|-----------|---------|-------------|
+| `config.roleLabel` | `nodeGroup` | Source label whose value becomes the node role |
+| `config.roleReplace` | `false` | Replace existing `node-role.kubernetes.io/*` labels |
+| `config.logLevel` | `info` | Log level (`debug`, `info`, `warn`, `error`) |
+| `replicas` | `1` | Number of controller replicas (leader election enabled) |
+| `image.tag` | Chart `appVersion` | Override the image tag |
+| `resources.requests.cpu` | `50m` | CPU request |
+| `resources.requests.memory` | `64Mi` | Memory request |
+| `resources.limits.cpu` | `250m` | CPU limit |
+| `resources.limits.memory` | `256Mi` | Memory limit |
+| `tolerations` | `[]` | Pod tolerations |
+| `nodeSelector` | `{}` | Pod node selector |
+
+> After changing configuration, restart to apply: `kubectl -n node-labeler rollout restart deployment node-role-controller`
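For reference, the same settings can be kept in a values file and passed with `helm install -f` (an illustrative sketch, not part of this commit; keys assumed to mirror the configuration table above):

```yaml
# values.yaml (illustrative; keys mirror the Helm configuration table)
config:
  roleLabel: nodeGroup    # source label whose value becomes the node role
  roleReplace: true       # replace existing node-role.kubernetes.io/* labels
  logLevel: debug         # debug, info, warn, error
replicas: 2               # leader election keeps only one replica active
tolerations: []
nodeSelector: {}
```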
+
+## Upgrade
+
+```shell
+helm upgrade node-role-controller oci://ghcr.io/mchmarny/node-role-controller \
+  --namespace node-labeler
+```
+
+## Uninstall
 
 ```shell
 helm uninstall node-role-controller -n node-labeler
 ```
 
-## Metrics
+## How It Works
 
-The controller exposes:
+1. Nodes are labeled with a source label (e.g., `nodeGroup=gpu-worker`)
+2. The controller watches node add/update events via a Kubernetes informer
+3. When a node has the source label, the controller patches it with `node-role.kubernetes.io/<value>`
+4. Leader election via Lease ensures only one replica is active
 
-- `node_role_patch_success_total` - Successful node patch operations (labeled by role)
-- `node_role_patch_failure_total` - Failed node patch operations (labeled by role)
+**Example:** A node with `nodeGroup=gpu-worker` gets `node-role.kubernetes.io/gpu-worker`.
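The effect of that patch on node metadata can be sketched as follows (illustrative fragments, not part of this commit; role labels conventionally carry an empty value):

```yaml
# Before: the node carries only the source label
metadata:
  labels:
    nodeGroup: gpu-worker
---
# After the controller reconciles: the role label is added
metadata:
  labels:
    nodeGroup: gpu-worker
    node-role.kubernetes.io/gpu-worker: ""
```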
 
-Available at the `/metrics` endpoint on port `8080`.
+## Metrics
 
-## Validation
+| Metric | Description |
+|--------|-------------|
+| `node_role_patch_success_total` | Successful patch operations (labeled by role) |
+| `node_role_patch_failure_total` | Failed patch operations (labeled by role) |
 
-The image comes with SLSA attestation verifying it was built in this repo. You can verify using [Sigstore](https://docs.sigstore.dev/about/overview/) CLI or the in-cluster policy controller.
+Available at `/metrics` on port `8080`. Health at `/healthz`, readiness at `/readyz`.
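A Prometheus scrape job for these metrics might look like this (an illustrative sketch, not part of this commit; the `app` pod label used for filtering is an assumption):

```yaml
# prometheus.yml fragment (hypothetical) scraping the controller pods on port 8080
scrape_configs:
  - job_name: node-role-controller
    kubernetes_sd_configs:
      - role: pod
        namespaces:
          names: [node-labeler]
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_label_app]
        regex: node-role-controller
        action: keep
```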
 
-### Manual
+## Image Verification
 
-> Update the image digest to the version you are using.
+Every release includes [SLSA](https://slsa.dev) provenance attestation:
 
 ```shell
-export IMAGE=ghcr.io/mchmarny/node-role-controller:v0.5.1
+export IMAGE=ghcr.io/mchmarny/node-role-controller:v0.6.0
 
 cosign verify-attestation \
   --output json \
@@ -116,85 +101,16 @@ cosign verify-attestation \
   $IMAGE
 ```
 
-### In Cluster
-
-To enforce provenance verification on the `node-labeler` namespace:
+To enforce verification in-cluster with the [Sigstore policy controller](https://docs.sigstore.dev/about/overview/):
 
 ```shell
 kubectl label namespace node-labeler policy.sigstore.dev/include=true
 kubectl apply -f policy/clusterimagepolicy.yaml
 ```
 
-Test admission:
-
-```shell
-kubectl -n node-labeler run test --image=$IMAGE
-```
-
-If you don't already have the [Sigstore](https://docs.sigstore.dev/about/overview/) policy controller:
-
-```shell
-kubectl create namespace cosign-system
-helm repo add sigstore https://sigstore.github.io/helm-charts
-helm repo update
-helm install policy-controller -n cosign-system sigstore/policy-controller
-```
-
-## Demo
+## Contributing
 
-> Requires [Kind](https://kind.sigs.k8s.io/)
-
-Create a Kind cluster with multiple nodes:
-
-```shell
-make up
-```
-
-Check the nodes (workers have no role):
-
-```shell
-kubectl get nodes
-```
-
-```
-NAME                                 STATUS   ROLES           AGE    VERSION
-node-role-controller-control-plane   Ready    control-plane   2m9s   v1.33.1
-node-role-controller-worker          Ready    <none>          114s   v1.33.1
-node-role-controller-worker2         Ready    <none>          114s   v1.33.1
-```
-
-Label the workers and deploy:
-
-```shell
-kubectl get nodes -l '!node-role.kubernetes.io/control-plane' -o name | \
-  xargs -I {} kubectl label {} nodeGroup=worker --overwrite
-kubectl apply -k deployment/overlays/dev
-```
-
-After a few seconds, roles appear:
-
-```shell
-kubectl get nodes
-```
-
-```
-NAME                                 STATUS   ROLES           AGE     VERSION
-node-role-controller-control-plane   Ready    control-plane   3m12s   v1.33.1
-node-role-controller-worker          Ready    worker          2m57s   v1.33.1
-node-role-controller-worker2         Ready    worker          2m57s   v1.33.1
-```
-
-Change a node's label to see the role update:
-
-```shell
-kubectl label node node-role-controller-worker nodeGroup=gpu --overwrite
-```
-
-Clean up:
-
-```shell
-make down
-```
+See [CONTRIBUTING.md](CONTRIBUTING.md). Run `make pre` before submitting PRs.
 
 ## Disclaimer
 