21 changes: 21 additions & 0 deletions examples/sharded-namespace/Dockerfile
@@ -0,0 +1,21 @@
# examples/sharded-namespace/Dockerfile
# Multi-stage to keep image slim
FROM golang:1.24 AS build

WORKDIR /src
# Copy go.mod/sum from repo root so deps resolve; adjust paths if needed.
COPY go.mod go.sum ./
RUN go mod download

# Copy the whole repo to compile the example against local packages
COPY . .

# Build the example
WORKDIR /src/examples/sharded-namespace
RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -o /out/sharded-example main.go

FROM gcr.io/distroless/static:nonroot
WORKDIR /
COPY --from=build /out/sharded-example /sharded-example
USER 65532:65532
ENTRYPOINT ["/sharded-example"]
137 changes: 137 additions & 0 deletions examples/sharded-namespace/README.md
@@ -0,0 +1,137 @@
# Sharded Namespace Example

This demo runs two replicas of a multicluster manager that **splits ownership** of downstream "clusters" discovered by the *namespace provider* (each Kubernetes Namespace == one cluster). Synchronization is decided by HRW hashing across peers, then serialized with per-cluster fencing Leases so exactly one peer reconciles a given cluster at a time.

**Key Components:**
- **Peer membership (presence)**: `coordination.k8s.io/Lease` objects with the `mcr-peer-` prefix (one per pod)
- **Per-cluster fencing (synchronization)**: `coordination.k8s.io/Lease` objects named `mcr-shard-<namespace>` (one per cluster)

Controllers attach per-cluster watches when synchronization starts, and cleanly detach & re-attach when it transfers.
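
For intuition, the sketch below shows what HRW (rendezvous) hashing boils down to: every peer independently scores each cluster against every peer ID, the highest score wins, so all peers agree on an owner without talking to each other, and when membership changes only the clusters whose top-scoring peer changed move. This is a minimal illustration, not the manager's implementation; the function name and the FNV hash choice are assumptions.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// hrwOwner returns the peer with the highest hash score for the given
// cluster. Every peer computes the same ranking independently, so no
// coordination is needed to agree on ownership.
func hrwOwner(cluster string, peers []string) string {
	var best string
	var bestScore uint64
	for _, p := range peers {
		h := fnv.New64a()
		h.Write([]byte(p + "/" + cluster))
		if s := h.Sum64(); best == "" || s > bestScore {
			best, bestScore = p, s
		}
	}
	return best
}

func main() {
	peers := []string{"sharded-namespace-0", "sharded-namespace-1"}
	for _, c := range []string{"zoo", "jungle", "island", "default"} {
		fmt.Printf("%s -> %s\n", c, hrwOwner(c, peers))
	}
}
```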

## Build the image

From repo root:

```bash
docker build -t mcr-namespace:dev -f examples/sharded-namespace/Dockerfile .
```

If using KinD:
```bash
kind create cluster --name mcr-demo
kind load docker-image mcr-namespace:dev --name mcr-demo
```

## Deploy

```bash
kubectl apply -k examples/sharded-namespace/manifests
```

## Observe

Tail logs from pods:

```bash
kubectl -n mcr-demo get pods
kubectl -n mcr-demo logs statefulset/sharded-namespace -f
```

You should see lines like:
```bash
"synchronization start {"cluster": "zoo", "peer": "sharded-namespace-0"}"
"synchronization start {"cluster": "jungle", "peer": "sharded-namespace-1"}"

Reconciling ConfigMap {"controller": "multicluster-configmaps", "controllerGroup": "", "controllerKind": "ConfigMap", "reconcileID": "4f1116b3-b5 │
│ 4e-4e6a-b84f-670ca5cfc9ce", "cluster": "zoo", "ns": "default", "name": "elephant"}

Reconciling ConfigMap {"controller": "multicluster-configmaps", "controllerGroup": "", "controllerKind": "ConfigMap", "reconcileID": "688b8467-f5 │
│ d3-491b-989e-87bc8aad780e", "cluster": "jungle", "ns": "default", "name": "monkey"}
```

Check Leases:
```bash
# Peer membership (one per pod)
kubectl -n kube-system get lease | grep '^mcr-peer-'

# Per-cluster fencing (one per namespace/"cluster")
kubectl -n kube-system get lease | grep '^mcr-shard-'
```

Who owns a given cluster?
```bash
C=zoo
kubectl -n kube-system get lease mcr-shard-$C \
  -o custom-columns=HOLDER:.spec.holderIdentity,RENEW:.spec.renewTime
```


## Test Synchronization

Scale down to 1 replica and watch synchronization consolidate:
```bash
# Scale down
kubectl -n mcr-demo scale statefulset/sharded-namespace --replicas=1

# Watch leases lose their holders as pods terminate
kubectl -n kube-system get lease -o custom-columns=NAME:.metadata.name,HOLDER:.spec.holderIdentity | grep "^mcr-shard-"

# Wait for all clusters to be owned by the single remaining pod (~30s)
kubectl -n kube-system wait --for=jsonpath='{.spec.holderIdentity}'=sharded-namespace-0 \
  lease/mcr-shard-zoo lease/mcr-shard-jungle lease/mcr-shard-island --timeout=60s
```
Create/patch a ConfigMap and confirm the single owner reconciles it:
```bash
# Pick a cluster and create a test ConfigMap
C=island
kubectl -n "$C" create cm test-$(date +%s) --from-literal=ts=$(date +%s) --dry-run=client -oyaml | kubectl apply -f -

# Verify only pod 0 reconciles it (since it owns everything now)
kubectl -n mcr-demo logs pod/sharded-namespace-0 --since=100s | grep "Reconciling ConfigMap.*$C"
```

Scale up to 3 replicas and watch synchronization rebalance:
```bash
# Scale up
kubectl -n mcr-demo scale statefulset/sharded-namespace --replicas=3
# Watch leases regain holders as pods start
kubectl -n kube-system get lease -o custom-columns=NAME:.metadata.name,HOLDER:.spec.holderIdentity | grep "^mcr-shard-"

# Create a cm in the default ns which belongs to sharded-namespace-2
C=default
kubectl -n "$C" create cm test-$(date +%s) --from-literal=ts=$(date +%s) --dry-run=client -oyaml | kubectl apply -f -
# Verify only pod-2 reconciles it (since it owns the default ns now)
kubectl -n mcr-demo logs pod/sharded-namespace-2 --since=100s | grep "Reconciling ConfigMap.*$C"
```

## Tuning

In your example app (e.g. `examples/sharded-namespace/main.go`), configure fencing and timings:

```go
mgr, err := mcmanager.New(cfg, provider, manager.Options{},
	// Per-cluster fencing Leases live here as mcr-shard-<namespace>
	mcmanager.WithShardLease("kube-system", "mcr-shard"),
	mcmanager.WithPerClusterLease(true), // enabled by default

	// Optional: tune fencing timings (duration, renew, throttle):
	// mcmanager.WithLeaseTimings(30*time.Second, 10*time.Second, 750*time.Millisecond),

	// Optional: peer weight for HRW:
	// mcmanager.WithPeerWeight(1),
)
if err != nil {
	// handle error
}
```

The peer registry uses `mcr-peer-*` Leases automatically and derives the peer ID from the pod hostname (the StatefulSet pod name, e.g. `sharded-namespace-0`).
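
To get a feel for what the per-cluster fencing Lease semantics look like in plain client-go terms, here is a sketch built on `k8s.io/client-go/tools/leaderelection` with one `LeaseLock` per cluster. It is an analogy, not the manager's actual code path; `clientset`, `podName`, and the `engage` callback are assumptions, and the timings mirror the (duration, renew, throttle) values from the tuning snippet above.

```go
import (
	"context"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/leaderelection"
	"k8s.io/client-go/tools/leaderelection/resourcelock"
)

// runFencedCluster competes for the per-cluster Lease and runs engage()
// only while this peer holds it. clientset and podName come from the caller.
func runFencedCluster(ctx context.Context, clientset kubernetes.Interface, podName, cluster string, engage func(context.Context)) {
	lock := &resourcelock.LeaseLock{
		LeaseMeta:  metav1.ObjectMeta{Namespace: "kube-system", Name: "mcr-shard-" + cluster},
		Client:     clientset.CoordinationV1(),
		LockConfig: resourcelock.ResourceLockConfig{Identity: podName},
	}
	leaderelection.RunOrDie(ctx, leaderelection.LeaderElectionConfig{
		Lock:            lock,
		ReleaseOnCancel: true,
		LeaseDuration:   30 * time.Second,       // how long a holder is trusted without renewing
		RenewDeadline:   10 * time.Second,       // holder must renew within this window or step down
		RetryPeriod:     750 * time.Millisecond, // how often non-holders retry acquisition
		Callbacks: leaderelection.LeaderCallbacks{
			OnStartedLeading: engage,     // attach watches / start reconciling this cluster
			OnStoppedLeading: func() {},  // detach; another peer may take over
		},
	})
}
```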

## Cleanup

```bash
kubectl delete -k examples/sharded-namespace/manifests
kind delete cluster --name mcr-demo
```
120 changes: 120 additions & 0 deletions examples/sharded-namespace/main.go
@@ -0,0 +1,120 @@
// examples/sharded-namespace/main.go
package main

import (
	"context"
	"errors"
	"fmt"
	"os"

	"golang.org/x/sync/errgroup"

	corev1 "k8s.io/api/core/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/klog/v2"

	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/cluster"
	"sigs.k8s.io/controller-runtime/pkg/log/zap"
	"sigs.k8s.io/controller-runtime/pkg/manager"

	mcbuilder "sigs.k8s.io/multicluster-runtime/pkg/builder"
	mcmanager "sigs.k8s.io/multicluster-runtime/pkg/manager"
	mcreconcile "sigs.k8s.io/multicluster-runtime/pkg/reconcile"
	"sigs.k8s.io/multicluster-runtime/providers/namespace"
)

func main() {
	klog.Background() // ensure klog initialized
	ctrl.SetLogger(zap.New(zap.UseDevMode(true)))
	log := ctrl.Log.WithName("sharded-example")

	ctx := ctrl.SetupSignalHandler()

	if err := run(ctx); err != nil {
		log.Error(err, "exiting")
		os.Exit(1)
	}
}

func run(ctx context.Context) error {
	// use in-cluster config; fall back to default loading rules for local runs
	cfg, err := ctrl.GetConfig()
	if err != nil {
		return fmt.Errorf("get kubeconfig: %w", err)
	}

	// Provider: treats namespaces in the host cluster as “downstream clusters”.
	host, err := cluster.New(cfg)
	if err != nil {
		return fmt.Errorf("create host cluster: %w", err)
	}
	provider := namespace.New(host)

	// Multicluster manager (no peer ID passed; pod hostname becomes peer ID).
	// Configure sharding:
	//   - fencing prefix: "mcr-shard" (per-cluster Lease names become mcr-shard-<cluster>)
	//   - peer membership still uses "mcr-peer" internally (set in WithMultiCluster)
	// Peer ID defaults to os.Hostname().
	mgr, err := mcmanager.New(cfg, provider, manager.Options{},
		mcmanager.WithShardLease("kube-system", "mcr-shard"),
		// optional but explicit (the manager already defaults this to true)
		mcmanager.WithPerClusterLease(true),
	)
	if err != nil {
		return fmt.Errorf("create mc manager: %w", err)
	}

	// A simple controller that logs ConfigMaps per owned “cluster” (namespace).
	if err := mcbuilder.ControllerManagedBy(mgr).
		Named("multicluster-configmaps").
		For(&corev1.ConfigMap{}).
		Complete(mcreconcile.Func(func(ctx context.Context, req mcreconcile.Request) (ctrl.Result, error) {
			// attach cluster once; don't repeat it in Info()
			l := ctrl.LoggerFrom(ctx).WithValues("cluster", req.ClusterName)

			// get the right cluster client
			cl, err := mgr.GetCluster(ctx, req.ClusterName)
			if err != nil {
				return ctrl.Result{}, err
			}

			// fetch the object, then log from the object (the source of truth)
			cm := &corev1.ConfigMap{}
			if err := cl.GetClient().Get(ctx, req.Request.NamespacedName, cm); err != nil {
				if apierrors.IsNotFound(err) {
					// object vanished; nothing to do
					return ctrl.Result{}, nil
				}
				return ctrl.Result{}, err
			}

			// now cm.Namespace is accurate (e.g., "zoo", "kube-system", etc.)
			l.Info("Reconciling ConfigMap",
				"ns", cm.GetNamespace(),
				"name", cm.GetName(),
			)

			// show which peer handled it (pod hostname)
			if host, _ := os.Hostname(); host != "" {
				l.Info("Handled by peer", "peer", host)
			}
			return ctrl.Result{}, nil
		})); err != nil {
		return fmt.Errorf("build controller: %w", err)
	}

	// Start everything
	g, ctx := errgroup.WithContext(ctx)
	g.Go(func() error { return ignoreCanceled(provider.Run(ctx, mgr)) })
	g.Go(func() error { return ignoreCanceled(host.Start(ctx)) })
	g.Go(func() error { return ignoreCanceled(mgr.Start(ctx)) })
	return g.Wait()
}

func ignoreCanceled(err error) error {
	if errors.Is(err, context.Canceled) {
		return nil
	}
	return err
}
6 changes: 6 additions & 0 deletions examples/sharded-namespace/manifests/kustomization.yaml
@@ -0,0 +1,6 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- rbac.yaml
- statefulset.yaml
- sample-data.yaml
56 changes: 56 additions & 0 deletions examples/sharded-namespace/manifests/rbac.yaml
@@ -0,0 +1,56 @@
apiVersion: v1
kind: Namespace
metadata:
  name: mcr-demo
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: mcr-demo
  namespace: mcr-demo
---
# Allow reading/writing peer Leases in kube-system
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: mcr-lease
rules:
- apiGroups: ["coordination.k8s.io"]
  resources: ["leases"]
  verbs: ["get","list","watch","create","update","patch","delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: mcr-lease
subjects:
- kind: ServiceAccount
  name: mcr-demo
  namespace: mcr-demo
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: mcr-lease
---
# Allow the example to read namespaces/configmaps (namespace provider + demo controller)
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: mcr-example
rules:
- apiGroups: [""]
  resources: ["namespaces","configmaps"]
  verbs: ["get","list","watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: mcr-example
subjects:
- kind: ServiceAccount
  name: mcr-demo
  namespace: mcr-demo
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: mcr-example
42 changes: 42 additions & 0 deletions examples/sharded-namespace/manifests/sample-data.yaml
@@ -0,0 +1,42 @@
apiVersion: v1
kind: Namespace
metadata:
  name: zoo
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: elephant
  namespace: zoo
data: { a: "1" }
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: lion
  namespace: zoo
data: { a: "1" }
---
apiVersion: v1
kind: Namespace
metadata:
  name: jungle
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: monkey
  namespace: jungle
data: { a: "1" }
---
apiVersion: v1
kind: Namespace
metadata:
  name: island
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: bird
  namespace: island
data: { a: "1" }