Infrastructure Helm charts for deploying Red Hat AI Inference Server (KServe LLMInferenceService) on managed Kubernetes platforms (AKS, CoreWeave).
Getting started? See the Deploying Red Hat AI Inference Server on Managed Kubernetes guide for step-by-step deployment instructions.
| Repository | Purpose |
|---|---|
| llm-d-xks-aks | AKS cluster provisioning (creates cluster + GPU nodes + GPU Operator) |
| Component | App Version | Description |
|---|---|---|
| cert-manager-operator | 1.15.2 | TLS certificate management |
| sail-operator (Istio) | 3.2.x | Gateway API for inference routing |
| lws-operator | 1.0 | LeaderWorkerSet controller for multi-node workloads |
| kserve | 3.4.0-ea.1 | KServe controller for LLMInferenceService lifecycle |
| Component | Version | Notes |
|---|---|---|
| OSSM (Sail Operator) | 3.2.x | Gateway API for inference routing |
| Istio | v1.27.x | Service mesh |
| InferencePool API | v1 | inference.networking.k8s.io/v1 |
| KServe | rhoai-3.4+ | LLMInferenceService controller |
- Kubernetes cluster (AKS or CoreWeave) - see llm-d-xks-aks for AKS provisioning
- kubectl, helm (v3.17+), helmfile
- Red Hat account (for Sail Operator and vLLM images from registry.redhat.io)
Cluster readiness check (optional): Run `cd validation && make container && make run` to verify cloud provider, GPU availability, and instance types before deploying. CRD checks will pass only after operators are deployed. See Preflight Validation.
The Sail Operator and RHAIIS vLLM images are hosted on registry.redhat.io, which requires authentication.
Choose one of the following methods:
Create a Registry Service Account (works for both Sail Operator and vLLM images):
- Go to: https://access.redhat.com/terms-based-registry/
- Click "New Service Account"
- Create the account and note the username (e.g., `12345678|myserviceaccount`)
- Log in with the service account credentials:
$ podman login registry.redhat.io
Username: {REGISTRY-SERVICE-ACCOUNT-USERNAME}
Password: {REGISTRY-SERVICE-ACCOUNT-PASSWORD}
Login Succeeded!
# Verify it works
$ podman pull registry.redhat.io/openshift-service-mesh/istio-sail-operator-bundle:3.2

Then configure values.yaml:

useSystemPodmanAuth: true

Alternative: Download the pull secret file (OpenShift secret tab) and copy it to a persistent location:

mkdir -p ~/.config/containers
cp ~/pull-secret.txt ~/.config/containers/auth.json

Note: Registry Service Accounts are recommended because they don't expire like personal credentials.
If you have direct Red Hat account access (e.g., internal developers):
$ podman login registry.redhat.io
Username: {YOUR-REDHAT-USERNAME}
Password: {YOUR-REDHAT-PASSWORD}
Login Succeeded!

This stores credentials in ${XDG_RUNTIME_DIR}/containers/auth.json or ~/.config/containers/auth.json.
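To sanity-check the stored credentials before running the deploy, you can confirm the auth file carries a registry.redhat.io entry. The sketch below writes a sample auth.json for illustration; in practice point `AUTH_FILE` at `${XDG_RUNTIME_DIR}/containers/auth.json` or `~/.config/containers/auth.json` (the `AUTH_FILE` variable and the sample contents are assumptions, not part of this repo):

```shell
# Sketch: confirm an auth.json has a registry.redhat.io entry.
# A sample file is written here for illustration only; point AUTH_FILE
# at your real podman auth file instead.
AUTH_FILE="$(mktemp)"
cat > "$AUTH_FILE" <<'EOF'
{"auths": {"registry.redhat.io": {"auth": "ZXhhbXBsZTpzZWNyZXQ="}}}
EOF

# Exit 0 when the auths map contains a registry.redhat.io key.
if python3 -c 'import json,sys; d=json.load(open(sys.argv[1])); sys.exit(0 if "registry.redhat.io" in d.get("auths", {}) else 1)' "$AUTH_FILE"; then
  echo "registry.redhat.io credentials found"
else
  echo "no registry.redhat.io entry in $AUTH_FILE" >&2
fi
```

If the check fails against your real auth file, re-run `podman login registry.redhat.io` before setting `useSystemPodmanAuth: true`.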
Then configure values.yaml:
useSystemPodmanAuth: true

git clone https://github.com/opendatahub-io/rhaii-on-xks.git
cd rhaii-on-xks
# 1. Deploy all components (cert-manager + Istio + LWS + KServe)
make deploy-all
# 2. Set up inference gateway
./scripts/setup-gateway.sh
# 3. Validate deployment
cd validation && make container && make run
# 4. Check status
make status

For deploying LLM inference services, GPU requirements, and testing inference, see the full deployment guide.
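As a rough illustration of what the deployment guide walks through, a minimal LLMInferenceService manifest might look like the following. This is a hypothetical sketch: the model, namespace, apiVersion, and field names such as `spec.model.uri` are assumptions here, so verify them against the KServe LLMInferenceService API reference before applying anything.

```yaml
# Hypothetical minimal manifest; field names and apiVersion are assumptions,
# check the KServe LLMInferenceService reference for the real schema.
apiVersion: serving.kserve.io/v1alpha1
kind: LLMInferenceService
metadata:
  name: qwen-demo        # illustrative name
  namespace: llm-d       # illustrative namespace
spec:
  replicas: 1
  model:
    uri: hf://Qwen/Qwen2.5-0.5B-Instruct   # illustrative model reference
    name: Qwen/Qwen2.5-0.5B-Instruct
```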
# Deploy
make deploy # cert-manager + istio + lws
make deploy-all # cert-manager + istio + lws + kserve
make deploy-kserve # Deploy KServe
# Undeploy
make undeploy # Remove all infrastructure
make undeploy-kserve # Remove KServe
# Test (ODH conformance)
make test NAMESPACE=llm-d # Run conformance tests
make test PROFILE=kserve-gpu # With specific profile
# Other
make status # Show status
make sync            # Update helm repos

Edit values.yaml:
# Option 1: Use system podman auth (recommended)
useSystemPodmanAuth: true
# Option 2: Use pull secret file directly
# pullSecretFile: ~/pull-secret.txt
# Operators
certManager:
  enabled: true
sailOperator:
  enabled: true
lwsOperator:
  enabled: true   # Required for multi-node LLM workloads

If you encounter issues, collect diagnostic information for troubleshooting or to share with Red Hat support:
./scripts/collect-debug-info.sh

See the Collecting Debug Information guide for details.
For detailed troubleshooting steps (KServe controller issues, gateway errors, webhook problems, monitoring setup), see the full deployment guide - Troubleshooting.
rhaii-on-xks/
├── helmfile.yaml.gotmpl
├── values.yaml
├── Makefile
├── README.md
├── charts/
│ ├── cert-manager-operator/ # cert-manager operator Helm chart
│ ├── sail-operator/ # Sail/Istio operator Helm chart
│ ├── lws-operator/ # LWS operator Helm chart
│ └── kserve/ # KServe controller Helm chart (auto-generated)
├── validation/ # Preflight validation checks
│ ├── llmd_xks_checks.py # Validation script
│ ├── Containerfile # Container build
│ └── Makefile # Build and run helpers
└── scripts/
├── cleanup.sh # Cleanup infrastructure (helmfile destroy + finalizers)
└── setup-gateway.sh # Set up Gateway with CA bundle for mTLS
Helm charts are included locally under charts/:
- charts/cert-manager-operator/ — cert-manager operator
- charts/sail-operator/ — Sail/Istio operator
- charts/lws-operator/ — LeaderWorkerSet operator
- charts/kserve/ — KServe controller (auto-generated from Kustomize overlays, all images from registry.redhat.io)
The helmfile imports the infrastructure charts (cert-manager, sail-operator, lws-operator) including presync hooks for CRD installation. The KServe OCI chart is deployed via helmfile from ghcr.io/opendatahub-io/kserve-rhaii-xks.
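For orientation, a release entry for an OCI-hosted chart in a helmfile generally takes the shape below. The names, version, and condition key here are assumptions for illustration; the repository's actual wiring lives in helmfile.yaml.gotmpl.

```yaml
# Hypothetical helmfile release entry for the KServe OCI chart.
# Chart path, version, and condition key are assumptions; consult
# helmfile.yaml.gotmpl in this repo for the real configuration.
releases:
  - name: kserve
    namespace: kserve
    chart: oci://ghcr.io/opendatahub-io/kserve-rhaii-xks
    version: 3.4.0-ea.1        # illustrative; pin to the tested release
    condition: kserve.enabled  # toggled via values.yaml
    values:
      - values.yaml
```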