Quick start guide for local development using Kind (Kubernetes in Docker) with emulated GPU resources.
Prerequisites:

- Docker
- Kind
- kubectl
- Helm (optional, but recommended)
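Before starting, it can help to confirm these tools are on your `PATH`. A small sketch (the `check_tools` helper is illustrative, not part of the repo):

```shell
# Report which prerequisite CLIs are installed; prints "ok:" or "missing:" per tool.
check_tools() {
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "ok: $tool"
    else
      echo "missing: $tool"
    fi
  done
}

check_tools docker kind kubectl helm
```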
Deploy WVA with full llm-d infrastructure:

```shell
# From project root
make deploy-llm-d-inferno-emulated-on-kind
```

This creates:
- Kind cluster with 3 nodes, emulated GPUs (mixed vendors)
- WVA controller
- llm-d infrastructure (simulation mode)
- Prometheus monitoring
- vLLM emulator
1. Create Kind cluster:

   ```shell
   make create-kind-cluster

   # With custom configuration
   make create-kind-cluster KIND_ARGS="-t mix -n 4 -g 2"
   # -t: vendor type (nvidia, amd, intel, mix)
   # -n: number of nodes
   # -g: GPUs per node
   ```

2. Deploy WVA only:

   ```shell
   make deploy-wva-emulated-on-kind
   ```

3. Deploy with llm-d:

   ```shell
   make deploy-llm-d-wva-emulated-on-kind
   ```

`setup.sh` creates a Kind cluster with emulated GPU support:

```shell
./setup.sh -t mix -n 3 -g 2
```

Options:
- `-t`: vendor type (`nvidia`|`amd`|`intel`|`mix`), default: `mix`
- `-n`: number of nodes, default: `3`
- `-g`: GPUs per node, default: `2`
`deploy-wva.sh` deploys the WVA controller to an existing cluster:

```shell
./deploy-wva.sh
```

`deploy-llm-d.sh` deploys WVA with the llm-d infrastructure:

```shell
./deploy-llm-d.sh -i <your-image>
```

`undeploy-llm-d.sh` removes WVA and the llm-d infrastructure:

```shell
./undeploy-llm-d.sh
```

`teardown.sh` destroys the Kind cluster:

```shell
./teardown.sh
```

Default cluster created by `setup.sh`:

```yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  extraMounts:
  - hostPath: /dev/null
    containerPath: /dev/nvidia0
- role: worker
- role: worker
```

GPUs are emulated using extended resources:

- `nvidia.com/gpu`
- `amd.com/gpu`
- `intel.com/gpu`
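Once advertised, these behave like any other Kubernetes extended resource. A minimal sketch of a pod requesting one emulated NVIDIA GPU (the pod name and image are illustrative, not from the repo):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-consumer          # illustrative name
spec:
  containers:
  - name: app
    image: busybox            # placeholder image
    command: ["sleep", "3600"]
    resources:
      limits:
        nvidia.com/gpu: 1    # extended resources are requested via limits
```

Kubernetes will only schedule such a pod onto a node whose capacity advertises `nvidia.com/gpu`.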
Port-forward WVA metrics:
```shell
kubectl port-forward -n workload-variant-autoscaler-system \
  svc/workload-variant-autoscaler-controller-manager-metrics 8080:8080
```

Port-forward Prometheus:

```shell
kubectl port-forward -n workload-variant-autoscaler-monitoring \
  svc/prometheus-operated 9090:9090
```

Port-forward vLLM emulator:

```shell
kubectl port-forward -n llm-d-sim svc/vllme-service 8000:80
```

Port-forward Inference Gateway:

```shell
kubectl port-forward -n llm-d-sim svc/infra-sim-inference-gateway 8000:80
```

Note that the vLLM emulator and the Inference Gateway both map to local port 8000, so forward one at a time or pick a different local port.

Apply a sample VariantAutoscaling:

```shell
kubectl apply -f ../../config/samples/
```

Run the load generator against the emulator:

```shell
cd ../../tools/vllm-emulator

# Install dependencies
pip install -r requirements.txt

# Run load generator
python loadgen.py \
  --model default/default \
  --rate '[[120, 60]]' \
  --url http://localhost:8000/v1 \
  --content 50
```

Observe scaling behavior:

```shell
# Watch deployments scale
watch kubectl get deploy -n llm-d-sim

# Watch VariantAutoscaling status
watch kubectl get variantautoscalings.llmd.ai -A

# View controller logs
kubectl logs -n workload-variant-autoscaler-system \
  -l control-plane=controller-manager -f
```
If cluster creation fails, clean up and retry:

```shell
kind delete cluster --name kind-inferno-gpu-cluster
make create-kind-cluster
```

If the controller is not starting:

```shell
# Check controller logs
kubectl logs -n workload-variant-autoscaler-system \
  deployment/workload-variant-autoscaler-controller-manager

# Verify CRDs installed
kubectl get crd variantautoscalings.llmd.ai

# Check RBAC
kubectl get clusterrole,clusterrolebinding -l app=workload-variant-autoscaler
```

If emulated GPUs are not visible:

```shell
# Verify GPU resources on nodes
kubectl get nodes -o json | jq '.items[].status.capacity'
# Should see nvidia.com/gpu, amd.com/gpu, or intel.com/gpu
```

If port-forwarding fails:

```shell
# Kill existing port-forwards
pkill -f "kubectl port-forward"

# Verify pod is running before port-forwarding
kubectl get pods -n <namespace>
```

To iterate on controller changes:

- Make code changes
- Build new image:

  ```shell
  make docker-build IMG=localhost:5000/wva:dev
  ```

- Load image into Kind:

  ```shell
  kind load docker-image localhost:5000/wva:dev --name kind-inferno-gpu-cluster
  ```

- Update deployment:

  ```shell
  kubectl set image deployment/workload-variant-autoscaler-controller-manager \
    -n workload-variant-autoscaler-system \
    manager=localhost:5000/wva:dev
  ```

- Verify changes:

  ```shell
  kubectl logs -n workload-variant-autoscaler-system \
    deployment/workload-variant-autoscaler-controller-manager -f
  ```
Remove deployments:
```shell
make undeploy-llm-d-inferno-emulated-on-kind
```

Destroy cluster:

```shell
make destroy-kind-cluster
```

Or use scripts directly:

```shell
./undeploy-llm-d.sh
./teardown.sh
```