diff --git a/charts/workload-variant-autoscaler/Chart.yaml b/charts/workload-variant-autoscaler/Chart.yaml index 8e5a64681..7439ad42a 100644 --- a/charts/workload-variant-autoscaler/Chart.yaml +++ b/charts/workload-variant-autoscaler/Chart.yaml @@ -2,5 +2,5 @@ apiVersion: v2 name: workload-variant-autoscaler description: Helm chart for Workload-Variant-Autoscaler (WVA) - GPU-aware autoscaler for LLM inference workloads type: application -version: 0.4.1 -appVersion: "v0.4.1" +version: 0.4.3 +appVersion: "v0.4.3" diff --git a/charts/workload-variant-autoscaler/README.md b/charts/workload-variant-autoscaler/README.md index 2af11e9c8..93d835486 100644 --- a/charts/workload-variant-autoscaler/README.md +++ b/charts/workload-variant-autoscaler/README.md @@ -1,13 +1,108 @@ # workload-variant-autoscaler -![Version: 0.4.1](https://img.shields.io/badge/Version-0.4.1-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: v0.4.1](https://img.shields.io/badge/AppVersion-v0.4.1-informational?style=flat-square) +![Version: 0.4.3](https://img.shields.io/badge/Version-0.4.3-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: v0.4.3](https://img.shields.io/badge/AppVersion-v0.4.3-informational?style=flat-square) Helm chart for Workload-Variant-Autoscaler (WVA) - GPU-aware autoscaler for LLM inference workloads +## Installation Modes + +WVA supports three installation modes to enable flexible deployment architectures: + +### Mode 1: `all` (Default) +Installs both the WVA controller and model-specific resources in a single installation. This is the traditional mode and is backward compatible with previous versions. + +**Use case**: Single llm-d stack with one model. + +```bash +helm install workload-variant-autoscaler ./workload-variant-autoscaler \ + -n workload-variant-autoscaler-system \ + --set installMode=all +``` + +### Mode 2: `controller-only` +Installs only the WVA controller without any model-specific resources. This enables a cluster-wide controller that can manage multiple models across different namespaces. + +**Use case**: Install the controller once for the entire cluster, then deploy model-specific resources separately as needed. + +```bash +# Install the controller once +helm install wva-controller ./workload-variant-autoscaler \ + -n workload-variant-autoscaler-system \ + --create-namespace \ + --set installMode=controller-only \ + --set wva.namespaceScoped=false +``` + +### Mode 3: `model-resources-only` +Installs only model-specific resources (VariantAutoscaling, HPA, Service, ServiceMonitor) without the controller. Use this mode to add new models when a cluster-wide controller is already running. + +**Use case**: Deploy resources for additional models in different namespaces after the controller is installed. 
+ +```bash +# Deploy model resources for Model A in namespace-a +helm install model-a-resources ./workload-variant-autoscaler \ + -n namespace-a \ + --set installMode=model-resources-only \ + --set llmd.namespace=namespace-a \ + --set llmd.modelName=model-a \ + --set llmd.modelID="vendor/model-a" + +# Deploy model resources for Model B in namespace-b +helm install model-b-resources ./workload-variant-autoscaler \ + -n namespace-b \ + --set installMode=model-resources-only \ + --set llmd.namespace=namespace-b \ + --set llmd.modelName=model-b \ + --set llmd.modelID="vendor/model-b" +``` + +## Multi-Model Architecture Example + +For supporting multiple llm-d stacks with a single controller: + +```bash +# Step 1: Install the WVA controller once (cluster-wide) +helm install wva-controller ./workload-variant-autoscaler \ + -n workload-variant-autoscaler-system \ + --create-namespace \ + --set installMode=controller-only \ + --set wva.namespaceScoped=false \ + --set wva.prometheus.baseURL="https://prometheus:9090" + +# Step 2: Deploy Model A resources +helm install model-a ./workload-variant-autoscaler \ + --set installMode=model-resources-only \ + --set llmd.namespace=llm-d-model-a \ + --set llmd.modelName=ms-model-a-llm-d-modelservice \ + --set llmd.modelID="meta-llama/Llama-2-7b-hf" \ + --set va.accelerator=H100 + +# Step 3: Deploy Model B resources (in a different namespace) +helm install model-b ./workload-variant-autoscaler \ + --set installMode=model-resources-only \ + --set llmd.namespace=llm-d-model-b \ + --set llmd.modelName=ms-model-b-llm-d-modelservice \ + --set llmd.modelID="mistralai/Mistral-7B-v0.1" \ + --set va.accelerator=A100 +``` + +### Important Configuration Notes + +**Namespace Scoping:** +- When using `installMode=controller-only` for multi-model deployments, you must set `wva.namespaceScoped=false` to allow the controller to watch all namespaces. +- When using `installMode=all` (default), you can keep `wva.namespaceScoped=true` for single-namespace operation or set it to `false` for cluster-wide operation. +- `installMode=model-resources-only` does not use the `namespaceScoped` setting since it doesn't install the controller. + +**Resource Isolation:** +- Each model's resources (VariantAutoscaling, HPA, Service, ServiceMonitor) are deployed in the model's namespace. +- The controller remains in its dedicated namespace (typically `workload-variant-autoscaler-system`). +- Multiple Helm releases can coexist: one for the controller and one per model. 
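+
+To confirm a multi-release layout, list the releases and inspect what each one rendered (a quick sanity check, assuming the release names used in the examples above):
+
+```bash
+# One release for the controller, one per model
+helm ls -A
+
+# The controller release should render no model resources...
+helm get manifest wva-controller -n workload-variant-autoscaler-system | grep "^kind:" | sort -u
+
+# ...and each model release only model resources (VariantAutoscaling, HPA, Service, ServiceMonitor)
+helm get manifest model-a-resources -n namespace-a | grep "^kind:" | sort -u
+```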
+ ## Values | Key | Type | Default | Description | |-----|------|---------|-------------| +| installMode | string | `"all"` | Installation mode: "all" (controller + model resources), "controller-only" (just controller), "model-resources-only" (just model resources) | | hpa.enabled | bool | `true` | | | hpa.maxReplicas | int | `10` | | | hpa.targetAverageValue | string | `"1"` | | @@ -24,6 +119,7 @@ Helm chart for Workload-Variant-Autoscaler (WVA) - GPU-aware autoscaler for LLM | vllmService.scheme | string | `"http"` | | | wva.enabled | bool | `true` | | | wva.experimentalHybridOptimization | enum | `off` | supports on, off, and model-only | +| wva.namespaceScoped | bool | `true` | If true, controller watches only its namespace; if false, watches all namespaces (cluster-scoped) | | wva.image.repository | string | `"ghcr.io/llm-d-incubation/workload-variant-autoscaler"` | | | wva.image.tag | string | `"latest"` | | | wva.imagePullPolicy | string | `"Always"` | | @@ -41,6 +137,9 @@ Helm chart for Workload-Variant-Autoscaler (WVA) - GPU-aware autoscaler for LLM Autogenerated from chart metadata using [helm-docs v1.14.2](https://github.com/norwoodj/helm-docs/releases/v1.14.2) ### INSTALL (on OpenShift) + +> **Note**: The default installation mode is `all`, which installs both the controller and model resources. For multi-model deployments, use `controller-only` mode first, then use `model-resources-only` mode for each model. See the [Installation Modes](#installation-modes) section above for details. + 1. Before running, be sure to delete all previous helm installations of workload-variant-autoscaler and prometheus-adapter. 2. llm-d must be installed for WVA to do its magic. If you plan on installing llm-d with these instructions, please be sure to remove any other helm installation of llm-d before proceeding. 
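+
+The two new keys can also be pinned in a values file instead of repeated `--set` flags (a sketch; the file name `controller-values.yaml` is illustrative):
+
+```yaml
+# controller-values.yaml -- cluster-wide controller for multi-model use
+installMode: controller-only
+wva:
+  # Watch all namespaces so a single controller can serve every model
+  namespaceScoped: false
+```
+
+```bash
+helm install wva-controller ./workload-variant-autoscaler \
+  -n workload-variant-autoscaler-system \
+  --create-namespace \
+  -f controller-values.yaml
+```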
diff --git a/charts/workload-variant-autoscaler/templates/hpa.yaml b/charts/workload-variant-autoscaler/templates/hpa.yaml index 7ad0c8d66..cdbf06738 100644 --- a/charts/workload-variant-autoscaler/templates/hpa.yaml +++ b/charts/workload-variant-autoscaler/templates/hpa.yaml @@ -1,4 +1,4 @@ -{{- if .Values.hpa.enabled }} +{{- if and (or (eq .Values.installMode "all") (eq .Values.installMode "model-resources-only")) .Values.hpa.enabled }} apiVersion: autoscaling/v2 kind: HorizontalPodAutoscaler metadata: diff --git a/charts/workload-variant-autoscaler/templates/manager/prometheus-clusterrolebinding.yaml b/charts/workload-variant-autoscaler/templates/manager/prometheus-clusterrolebinding.yaml index fce4b3d9b..bd303dff9 100644 --- a/charts/workload-variant-autoscaler/templates/manager/prometheus-clusterrolebinding.yaml +++ b/charts/workload-variant-autoscaler/templates/manager/prometheus-clusterrolebinding.yaml @@ -1,3 +1,4 @@ +{{- if or (eq .Values.installMode "all") (eq .Values.installMode "controller-only") }} apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: @@ -13,3 +14,4 @@ subjects: - kind: ServiceAccount name: {{ .Values.wva.prometheus.serviceAccountName }} namespace: {{ .Values.wva.prometheus.monitoringNamespace }} +{{- end }} diff --git a/charts/workload-variant-autoscaler/templates/manager/wva-clusterrolebinding.yaml b/charts/workload-variant-autoscaler/templates/manager/wva-clusterrolebinding.yaml index 58376493c..3fd78ef8d 100644 --- a/charts/workload-variant-autoscaler/templates/manager/wva-clusterrolebinding.yaml +++ b/charts/workload-variant-autoscaler/templates/manager/wva-clusterrolebinding.yaml @@ -1,3 +1,4 @@ +{{- if or (eq .Values.installMode "all") (eq .Values.installMode "controller-only") }} apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: @@ -13,3 +14,4 @@ subjects: - kind: ServiceAccount name: workload-variant-autoscaler-controller-manager namespace: {{ .Release.Namespace }} +{{- end }} diff --git a/charts/workload-variant-autoscaler/templates/manager/wva-configmap-accelerator-costs.yaml b/charts/workload-variant-autoscaler/templates/manager/wva-configmap-accelerator-costs.yaml index d83284c11..a52d032ed 100644 --- a/charts/workload-variant-autoscaler/templates/manager/wva-configmap-accelerator-costs.yaml +++ b/charts/workload-variant-autoscaler/templates/manager/wva-configmap-accelerator-costs.yaml @@ -1,3 +1,4 @@ +{{- if or (eq .Values.installMode "all") (eq .Values.installMode "controller-only") }} apiVersion: v1 kind: ConfigMap metadata: @@ -34,3 +35,4 @@ data: "device": "NVIDIA-L40S", "cost": "32.00" } +{{- end }} diff --git a/charts/workload-variant-autoscaler/templates/manager/wva-configmap-service-class.yaml b/charts/workload-variant-autoscaler/templates/manager/wva-configmap-service-class.yaml index 49d0cbbe6..19acf3aa5 100644 --- a/charts/workload-variant-autoscaler/templates/manager/wva-configmap-service-class.yaml +++ b/charts/workload-variant-autoscaler/templates/manager/wva-configmap-service-class.yaml @@ -1,3 +1,4 @@ +{{- if or (eq .Values.installMode "all") (eq .Values.installMode "controller-only") }} apiVersion: v1 kind: ConfigMap # This configMap defines the set of accelerators available @@ -35,3 +36,4 @@ data: - model: meta/llama0-7b slo-tpot: 150 slo-ttft: 1500 +{{- end }} diff --git a/charts/workload-variant-autoscaler/templates/manager/wva-configmap.yaml b/charts/workload-variant-autoscaler/templates/manager/wva-configmap.yaml index ddc2782d1..6e01d4c07 100644 --- 
a/charts/workload-variant-autoscaler/templates/manager/wva-configmap.yaml +++ b/charts/workload-variant-autoscaler/templates/manager/wva-configmap.yaml @@ -1,3 +1,4 @@ +{{- if or (eq .Values.installMode "all") (eq .Values.installMode "controller-only") }} apiVersion: v1 kind: ConfigMap metadata: @@ -56,3 +57,4 @@ data: # EPP_METRICS_CACHE_TTL: "15s" # EPP_METRICS_CACHE_MAX_SIZE: "500" # EPP_METRICS_CACHE_CLEANUP_INTERVAL: "30s" +{{- end }} diff --git a/charts/workload-variant-autoscaler/templates/manager/wva-cp-configmap.yaml b/charts/workload-variant-autoscaler/templates/manager/wva-cp-configmap.yaml index 0a5214cde..9adc56e7b 100644 --- a/charts/workload-variant-autoscaler/templates/manager/wva-cp-configmap.yaml +++ b/charts/workload-variant-autoscaler/templates/manager/wva-cp-configmap.yaml @@ -1,3 +1,4 @@ +{{- if or (eq .Values.installMode "all") (eq .Values.installMode "controller-only") }} apiVersion: v1 kind: ConfigMap # This ConfigMap defines saturation-based scaling thresholds for model variants. @@ -61,4 +62,4 @@ data: # model_id: meta/llama-70b # namespace: production # kvCacheThreshold: 0.85 - # kvSpareTrigger: 0.15 \ No newline at end of file + # kvSpareTrigger: 0.15{{- end }} diff --git a/charts/workload-variant-autoscaler/templates/manager/wva-deployment-controller-manager.yaml b/charts/workload-variant-autoscaler/templates/manager/wva-deployment-controller-manager.yaml index e779b8344..3f4264233 100644 --- a/charts/workload-variant-autoscaler/templates/manager/wva-deployment-controller-manager.yaml +++ b/charts/workload-variant-autoscaler/templates/manager/wva-deployment-controller-manager.yaml @@ -1,3 +1,4 @@ +{{- if or (eq .Values.installMode "all") (eq .Values.installMode "controller-only") }} apiVersion: apps/v1 kind: Deployment metadata: @@ -144,3 +145,4 @@ spec: optional: true serviceAccountName: workload-variant-autoscaler-controller-manager terminationGracePeriodSeconds: 10 +{{- end }} diff --git a/charts/workload-variant-autoscaler/templates/manager/wva-serviceaccount.yaml b/charts/workload-variant-autoscaler/templates/manager/wva-serviceaccount.yaml index 10694f1c8..c630b59f5 100644 --- a/charts/workload-variant-autoscaler/templates/manager/wva-serviceaccount.yaml +++ b/charts/workload-variant-autoscaler/templates/manager/wva-serviceaccount.yaml @@ -1,3 +1,4 @@ +{{- if or (eq .Values.installMode "all") (eq .Values.installMode "controller-only") }} apiVersion: v1 kind: ServiceAccount metadata: @@ -7,3 +8,4 @@ metadata: app.kubernetes.io/name: workload-variant-autoscaler imagePullSecrets: - name: {{ .Values.wva.imagePullSecret | default "" }} +{{- end }} diff --git a/charts/workload-variant-autoscaler/templates/manager/wva-servicemonitor.yaml b/charts/workload-variant-autoscaler/templates/manager/wva-servicemonitor.yaml index 920ef2d40..a782a941c 100644 --- a/charts/workload-variant-autoscaler/templates/manager/wva-servicemonitor.yaml +++ b/charts/workload-variant-autoscaler/templates/manager/wva-servicemonitor.yaml @@ -1,3 +1,4 @@ +{{- if or (eq .Values.installMode "all") (eq .Values.installMode "controller-only") }} apiVersion: monitoring.coreos.com/v1 kind: ServiceMonitor metadata: @@ -27,3 +28,4 @@ spec: matchLabels: app.kubernetes.io/name: workload-variant-autoscaler control-plane: controller-manager +{{- end }} diff --git a/charts/workload-variant-autoscaler/templates/manager/wva-token-secret.yaml b/charts/workload-variant-autoscaler/templates/manager/wva-token-secret.yaml index e42ca8299..dd6ae58f1 100644 --- 
a/charts/workload-variant-autoscaler/templates/manager/wva-token-secret.yaml +++ b/charts/workload-variant-autoscaler/templates/manager/wva-token-secret.yaml @@ -1,3 +1,4 @@ +{{- if or (eq .Values.installMode "all") (eq .Values.installMode "controller-only") }} apiVersion: v1 kind: Secret metadata: @@ -5,4 +6,4 @@ metadata: namespace: {{ .Release.Namespace }} annotations: kubernetes.io/service-account.name: workload-variant-autoscaler-controller-manager -type: kubernetes.io/service-account-token \ No newline at end of file +type: kubernetes.io/service-account-token{{- end }} diff --git a/charts/workload-variant-autoscaler/templates/metrics_service.yaml b/charts/workload-variant-autoscaler/templates/metrics_service.yaml index a89c5b089..dc6275a41 100644 --- a/charts/workload-variant-autoscaler/templates/metrics_service.yaml +++ b/charts/workload-variant-autoscaler/templates/metrics_service.yaml @@ -1,3 +1,4 @@ +{{- if or (eq .Values.installMode "all") (eq .Values.installMode "controller-only") }} apiVersion: v1 kind: Service metadata: @@ -15,3 +16,4 @@ spec: selector: control-plane: controller-manager app.kubernetes.io/name: workload-variant-autoscaler +{{- end }} diff --git a/charts/workload-variant-autoscaler/templates/prometheus-ca-configmap-prom.yaml b/charts/workload-variant-autoscaler/templates/prometheus-ca-configmap-prom.yaml index 25d1693ec..a424bd5ab 100644 --- a/charts/workload-variant-autoscaler/templates/prometheus-ca-configmap-prom.yaml +++ b/charts/workload-variant-autoscaler/templates/prometheus-ca-configmap-prom.yaml @@ -1,3 +1,4 @@ +{{- if or (eq .Values.installMode "all") (eq .Values.installMode "controller-only") }} apiVersion: v1 kind: ConfigMap metadata: @@ -10,3 +11,4 @@ data: {{- else }} # CA certificate not provided - using system CA bundle {{- end }} +{{- end }} diff --git a/charts/workload-variant-autoscaler/templates/prometheus-ca-configmap-wva.yaml b/charts/workload-variant-autoscaler/templates/prometheus-ca-configmap-wva.yaml index 509913708..bab3f2083 100644 --- a/charts/workload-variant-autoscaler/templates/prometheus-ca-configmap-wva.yaml +++ b/charts/workload-variant-autoscaler/templates/prometheus-ca-configmap-wva.yaml @@ -1,3 +1,4 @@ +{{- if or (eq .Values.installMode "all") (eq .Values.installMode "controller-only") }} apiVersion: v1 kind: ConfigMap metadata: @@ -10,3 +11,4 @@ data: {{- else }} # CA certificate not provided - using system CA bundle {{- end }} +{{- end }} diff --git a/charts/workload-variant-autoscaler/templates/rbac/leader_election_role.yaml b/charts/workload-variant-autoscaler/templates/rbac/leader_election_role.yaml index 1771f26ca..3d567c728 100644 --- a/charts/workload-variant-autoscaler/templates/rbac/leader_election_role.yaml +++ b/charts/workload-variant-autoscaler/templates/rbac/leader_election_role.yaml @@ -1,3 +1,4 @@ +{{- if or (eq .Values.installMode "all") (eq .Values.installMode "controller-only") }} # permissions to do leader election. 
apiVersion: rbac.authorization.k8s.io/v1 kind: Role @@ -38,3 +39,4 @@ rules: verbs: - create - patch +{{- end }} diff --git a/charts/workload-variant-autoscaler/templates/rbac/leader_election_role_binding.yaml b/charts/workload-variant-autoscaler/templates/rbac/leader_election_role_binding.yaml index a7217dde5..c87a7b510 100644 --- a/charts/workload-variant-autoscaler/templates/rbac/leader_election_role_binding.yaml +++ b/charts/workload-variant-autoscaler/templates/rbac/leader_election_role_binding.yaml @@ -1,3 +1,4 @@ +{{- if or (eq .Values.installMode "all") (eq .Values.installMode "controller-only") }} apiVersion: rbac.authorization.k8s.io/v1 kind: RoleBinding metadata: @@ -13,3 +14,4 @@ subjects: - kind: ServiceAccount name: workload-variant-autoscaler-controller-manager namespace: {{ .Release.Namespace }} +{{- end }} diff --git a/charts/workload-variant-autoscaler/templates/rbac/metrics_auth_role.yaml b/charts/workload-variant-autoscaler/templates/rbac/metrics_auth_role.yaml index 1ffc31c62..9d9f00aeb 100644 --- a/charts/workload-variant-autoscaler/templates/rbac/metrics_auth_role.yaml +++ b/charts/workload-variant-autoscaler/templates/rbac/metrics_auth_role.yaml @@ -1,3 +1,4 @@ +{{- if or (eq .Values.installMode "all") (eq .Values.installMode "controller-only") }} apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: @@ -15,3 +16,4 @@ rules: - subjectaccessreviews verbs: - create +{{- end }} diff --git a/charts/workload-variant-autoscaler/templates/rbac/metrics_auth_role_binding.yaml b/charts/workload-variant-autoscaler/templates/rbac/metrics_auth_role_binding.yaml index 418ecdc27..836213067 100644 --- a/charts/workload-variant-autoscaler/templates/rbac/metrics_auth_role_binding.yaml +++ b/charts/workload-variant-autoscaler/templates/rbac/metrics_auth_role_binding.yaml @@ -1,3 +1,4 @@ +{{- if or (eq .Values.installMode "all") (eq .Values.installMode "controller-only") }} apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: @@ -10,3 +11,4 @@ subjects: - kind: ServiceAccount name: workload-variant-autoscaler-controller-manager namespace: {{ .Release.Namespace}} +{{- end }} diff --git a/charts/workload-variant-autoscaler/templates/rbac/metrics_reader_role.yaml b/charts/workload-variant-autoscaler/templates/rbac/metrics_reader_role.yaml index d1d0e009b..28f896fe1 100644 --- a/charts/workload-variant-autoscaler/templates/rbac/metrics_reader_role.yaml +++ b/charts/workload-variant-autoscaler/templates/rbac/metrics_reader_role.yaml @@ -1,3 +1,4 @@ +{{- if or (eq .Values.installMode "all") (eq .Values.installMode "controller-only") }} apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: @@ -7,3 +8,4 @@ rules: - "/metrics" verbs: - get +{{- end }} diff --git a/charts/workload-variant-autoscaler/templates/rbac/metrics_reader_role_binding.yaml b/charts/workload-variant-autoscaler/templates/rbac/metrics_reader_role_binding.yaml index 59d7c44f6..256602dbf 100644 --- a/charts/workload-variant-autoscaler/templates/rbac/metrics_reader_role_binding.yaml +++ b/charts/workload-variant-autoscaler/templates/rbac/metrics_reader_role_binding.yaml @@ -1,3 +1,4 @@ +{{- if or (eq .Values.installMode "all") (eq .Values.installMode "controller-only") }} apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: @@ -9,4 +10,4 @@ roleRef: subjects: - kind: ServiceAccount name: workload-variant-autoscaler-controller-manager - namespace: {{ .Release.Namespace }} \ No newline at end of file + namespace: {{ .Release.Namespace }}{{- end }} diff 
--git a/charts/workload-variant-autoscaler/templates/rbac/prometheus_metrics_auth_role_binding.yaml b/charts/workload-variant-autoscaler/templates/rbac/prometheus_metrics_auth_role_binding.yaml index 3293e0c90..454643489 100644 --- a/charts/workload-variant-autoscaler/templates/rbac/prometheus_metrics_auth_role_binding.yaml +++ b/charts/workload-variant-autoscaler/templates/rbac/prometheus_metrics_auth_role_binding.yaml @@ -1,3 +1,4 @@ +{{- if or (eq .Values.installMode "all") (eq .Values.installMode "controller-only") }} apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: @@ -14,3 +15,4 @@ subjects: - kind: ServiceAccount name: {{ .Values.wva.prometheus.serviceAccountName }} namespace: {{ .Values.wva.prometheus.monitoringNamespace }} +{{- end }} diff --git a/charts/workload-variant-autoscaler/templates/rbac/role.yaml b/charts/workload-variant-autoscaler/templates/rbac/role.yaml index 4107ebcdf..790aa7185 100644 --- a/charts/workload-variant-autoscaler/templates/rbac/role.yaml +++ b/charts/workload-variant-autoscaler/templates/rbac/role.yaml @@ -1,3 +1,4 @@ +{{- if or (eq .Values.installMode "all") (eq .Values.installMode "controller-only") }} --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole @@ -83,3 +84,4 @@ rules: verbs: - create - patch +{{- end }} diff --git a/charts/workload-variant-autoscaler/templates/rbac/role_binding.yaml b/charts/workload-variant-autoscaler/templates/rbac/role_binding.yaml index da246f23e..e0a8ff010 100644 --- a/charts/workload-variant-autoscaler/templates/rbac/role_binding.yaml +++ b/charts/workload-variant-autoscaler/templates/rbac/role_binding.yaml @@ -1,3 +1,4 @@ +{{- if or (eq .Values.installMode "all") (eq .Values.installMode "controller-only") }} apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: @@ -12,3 +13,4 @@ subjects: - kind: ServiceAccount name: workload-variant-autoscaler-controller-manager namespace: {{ .Release.Namespace }} +{{- end }} diff --git a/charts/workload-variant-autoscaler/templates/rbac/variantautoscaling_admin_role.yaml b/charts/workload-variant-autoscaler/templates/rbac/variantautoscaling_admin_role.yaml index 8cfc66882..7587034a5 100644 --- a/charts/workload-variant-autoscaler/templates/rbac/variantautoscaling_admin_role.yaml +++ b/charts/workload-variant-autoscaler/templates/rbac/variantautoscaling_admin_role.yaml @@ -1,3 +1,4 @@ +{{- if or (eq .Values.installMode "all") (eq .Values.installMode "controller-only") }} # This rule is not used by the project workload-variant-autoscaler itself. # It is provided to allow the cluster admin to help manage permissions for users. # @@ -24,3 +25,4 @@ rules: - variantautoscalings/status verbs: - get +{{- end }} diff --git a/charts/workload-variant-autoscaler/templates/rbac/variantautoscaling_editor_role.yaml b/charts/workload-variant-autoscaler/templates/rbac/variantautoscaling_editor_role.yaml index d9a59f600..6ea85ed66 100644 --- a/charts/workload-variant-autoscaler/templates/rbac/variantautoscaling_editor_role.yaml +++ b/charts/workload-variant-autoscaler/templates/rbac/variantautoscaling_editor_role.yaml @@ -1,3 +1,4 @@ +{{- if or (eq .Values.installMode "all") (eq .Values.installMode "controller-only") }} # This rule is not used by the project workload-variant-autoscaler itself. # It is provided to allow the cluster admin to help manage permissions for users. 
# @@ -30,3 +31,4 @@ rules: - variantautoscalings/status verbs: - get +{{- end }} diff --git a/charts/workload-variant-autoscaler/templates/rbac/variantautoscaling_viewer_role.yaml b/charts/workload-variant-autoscaler/templates/rbac/variantautoscaling_viewer_role.yaml index f5c4c01aa..8318dfae5 100644 --- a/charts/workload-variant-autoscaler/templates/rbac/variantautoscaling_viewer_role.yaml +++ b/charts/workload-variant-autoscaler/templates/rbac/variantautoscaling_viewer_role.yaml @@ -1,3 +1,4 @@ +{{- if or (eq .Values.installMode "all") (eq .Values.installMode "controller-only") }} # This rule is not used by the project workload-variant-autoscaler itself. # It is provided to allow the cluster admin to help manage permissions for users. # @@ -26,3 +27,4 @@ rules: - variantautoscalings/status verbs: - get +{{- end }} diff --git a/charts/workload-variant-autoscaler/templates/variantautoscaling.yaml b/charts/workload-variant-autoscaler/templates/variantautoscaling.yaml index 1a7c79d99..d2ec82773 100644 --- a/charts/workload-variant-autoscaler/templates/variantautoscaling.yaml +++ b/charts/workload-variant-autoscaler/templates/variantautoscaling.yaml @@ -1,4 +1,4 @@ -{{- if .Values.va.enabled }} +{{- if and (or (eq .Values.installMode "all") (eq .Values.installMode "model-resources-only")) .Values.va.enabled }} apiVersion: llmd.ai/v1alpha1 # Optimizing a variant, create only when the model is deployed and serving traffic # this is for the collector to collect existing (previous) running metrics of the variant. @@ -58,4 +58,4 @@ spec: maxBatchSize: 4 {{- end}} -{{- end }} \ No newline at end of file +{{- end }} diff --git a/charts/workload-variant-autoscaler/templates/vllm-service.yaml b/charts/workload-variant-autoscaler/templates/vllm-service.yaml index ee13b17c5..9e42ea3c2 100644 --- a/charts/workload-variant-autoscaler/templates/vllm-service.yaml +++ b/charts/workload-variant-autoscaler/templates/vllm-service.yaml @@ -1,4 +1,4 @@ -{{- if .Values.vllmService.enabled }} +{{- if and (or (eq .Values.installMode "all") (eq .Values.installMode "model-resources-only")) .Values.vllmService.enabled }} apiVersion: v1 kind: Service metadata: @@ -16,4 +16,4 @@ spec: targetPort: 8200 nodePort: {{ .Values.vllmService.nodePort }} type: NodePort -{{- end }} \ No newline at end of file +{{- end }} diff --git a/charts/workload-variant-autoscaler/templates/vllm-servicemonitor.yaml b/charts/workload-variant-autoscaler/templates/vllm-servicemonitor.yaml index 247999cea..c65b7813f 100644 --- a/charts/workload-variant-autoscaler/templates/vllm-servicemonitor.yaml +++ b/charts/workload-variant-autoscaler/templates/vllm-servicemonitor.yaml @@ -1,4 +1,4 @@ -{{- if .Values.vllmService.enabled }} +{{- if and (or (eq .Values.installMode "all") (eq .Values.installMode "model-resources-only")) .Values.vllmService.enabled }} apiVersion: monitoring.coreos.com/v1 kind: ServiceMonitor metadata: @@ -23,4 +23,4 @@ spec: {{- end }} namespaceSelector: any: true -{{- end }} \ No newline at end of file +{{- end }} diff --git a/charts/workload-variant-autoscaler/values-dev.yaml b/charts/workload-variant-autoscaler/values-dev.yaml index 4138936f9..30ec1ee68 100644 --- a/charts/workload-variant-autoscaler/values-dev.yaml +++ b/charts/workload-variant-autoscaler/values-dev.yaml @@ -1,6 +1,12 @@ # Development values for workload-variant-autoscaler # This file contains development-specific configurations with relaxed security settings +# Installation mode controls which components are installed: +# - "all": Install both controller and 
model-specific resources (default, backward compatible) +# - "controller-only": Install only the WVA controller (for cluster-wide controller management) +# - "model-resources-only": Install only model-specific resources (VA, HPA, Service, ServiceMonitor) +installMode: all + wva: enabled: true replicaCount: 1 diff --git a/charts/workload-variant-autoscaler/values.yaml b/charts/workload-variant-autoscaler/values.yaml index 11472ec04..05b9f8986 100644 --- a/charts/workload-variant-autoscaler/values.yaml +++ b/charts/workload-variant-autoscaler/values.yaml @@ -1,3 +1,9 @@ +# Installation mode controls which components are installed: +# - "all": Install both controller and model-specific resources (default, backward compatible) +# - "controller-only": Install only the WVA controller (for cluster-wide controller management) +# - "model-resources-only": Install only model-specific resources (VA, HPA, Service, ServiceMonitor) +installMode: all + wva: enabled: true @@ -13,6 +19,8 @@ wva: # If true, the controller will only watch the namespace it is deployed in. # If false, the controller will watch all namespaces (cluster-scoped). + # Note: When using installMode="controller-only" for multi-model deployments, + # set this to false to allow the controller to watch all namespaces. namespaceScoped: true reconcileInterval: 60s diff --git a/deploy/README.md b/deploy/README.md index e6601877c..9c63df5d2 100644 --- a/deploy/README.md +++ b/deploy/README.md @@ -219,7 +219,9 @@ The WVA can be deployed as a standalone using Helm, assuming you have: - ServiceMonitors configured - Prometheus Adapter (optional, for HPA) -This method is particularly useful when there is one (or more) existing llm-d infrastructure deployed +This method is particularly useful when one or more llm-d stacks are already deployed. + +> **New in v0.4.3**: The Helm chart now supports three installation modes (`all`, `controller-only`, `model-resources-only`) to enable flexible multi-model deployments. See the [Helm Chart Installation Modes](#helm-chart-installation-modes) section below for details. #### Helm Chart Quick Start @@ -529,10 +531,178 @@ spec: EOF ``` + +#### Helm Chart Installation Modes + +**New in v0.4.3**: The Helm chart supports three installation modes to enable flexible multi-model deployments across multiple namespaces. This addresses the limitation where installing WVA for a new model would overwrite resources from existing models. + +##### Mode 1: `all` (Default - Backward Compatible) + +Installs both the WVA controller and model-specific resources. This is the traditional mode and maintains backward compatibility. + +**Use case**: Single llm-d stack with one model. + +```bash +helm install workload-variant-autoscaler ./charts/workload-variant-autoscaler \ + -n workload-variant-autoscaler-system \ + --create-namespace \ + --set installMode=all # This is the default +``` + +##### Mode 2: `controller-only` + +Installs only the WVA controller (Deployment, ServiceAccount, RBAC, ConfigMaps) without any model-specific resources. + +**Use case**: Install a cluster-wide controller once, then deploy model-specific resources separately as needed. 
+ +```bash +# Step 1: Install the controller once for the entire cluster +helm install wva-controller ./charts/workload-variant-autoscaler \ + -n workload-variant-autoscaler-system \ + --create-namespace \ + --set installMode=controller-only \ + --set wva.namespaceScoped=false \ + --set wva.prometheus.baseURL="https://prometheus-k8s.monitoring.svc:9090" \ + --set wva.prometheus.tls.insecureSkipVerify=false +``` + +##### Mode 3: `model-resources-only` + +Installs only model-specific resources (VariantAutoscaling, HPA, Service, ServiceMonitor) without the controller. + +**Use case**: Deploy resources for additional models in different namespaces after a cluster-wide controller is installed. + +```bash +# Step 2a: Deploy model resources for Model A in namespace-a +helm install model-a-resources ./charts/workload-variant-autoscaler \ + --set installMode=model-resources-only \ + --set llmd.namespace=llm-d-model-a \ + --set llmd.modelName=ms-model-a-llm-d-modelservice \ + --set llmd.modelID="meta-llama/Llama-2-7b-hf" \ + --set va.accelerator=H100 \ + --set va.enabled=true \ + --set hpa.enabled=true \ + --set vllmService.enabled=true + +# Step 2b: Deploy model resources for Model B in namespace-b +helm install model-b-resources ./charts/workload-variant-autoscaler \ + --set installMode=model-resources-only \ + --set llmd.namespace=llm-d-model-b \ + --set llmd.modelName=ms-model-b-llm-d-modelservice \ + --set llmd.modelID="mistralai/Mistral-7B-v0.1" \ + --set va.accelerator=A100 \ + --set va.enabled=true \ + --set hpa.enabled=true \ + --set vllmService.enabled=true +``` + +##### Complete Multi-Model Example + +Here's a complete example showing how to deploy WVA to support multiple llm-d stacks: + +```bash +# Prerequisites: Multiple llm-d stacks already deployed in different namespaces + +# Step 1: Install the WVA controller once (cluster-wide) +helm install wva-controller ./charts/workload-variant-autoscaler \ + -n workload-variant-autoscaler-system \ + --create-namespace \ + --set installMode=controller-only \ + --set wva.namespaceScoped=false \ + --set wva.prometheus.baseURL="https://prometheus-k8s.monitoring.svc:9090" + +# Step 2: Deploy model resources for each llm-d stack +# Model A in llm-d-stack-a namespace +helm install wva-model-a ./charts/workload-variant-autoscaler \ + --set installMode=model-resources-only \ + --set llmd.namespace=llm-d-stack-a \ + --set llmd.modelName=ms-model-a-llm-d-modelservice \ + --set llmd.modelID="meta-llama/Llama-2-7b-hf" \ + --set va.accelerator=H100 + +# Model B in llm-d-stack-b namespace +helm install wva-model-b ./charts/workload-variant-autoscaler \ + --set installMode=model-resources-only \ + --set llmd.namespace=llm-d-stack-b \ + --set llmd.modelName=ms-model-b-llm-d-modelservice \ + --set llmd.modelID="mistralai/Mistral-7B-v0.1" \ + --set va.accelerator=A100 + +# Model C in llm-d-stack-c namespace +helm install wva-model-c ./charts/workload-variant-autoscaler \ + --set installMode=model-resources-only \ + --set llmd.namespace=llm-d-stack-c \ + --set llmd.modelName=ms-model-c-llm-d-modelservice \ + --set llmd.modelID="google/gemma-2b" \ + --set va.accelerator=L40S +``` + +**Architecture with Multiple Models:** + +``` +Cluster +├── workload-variant-autoscaler-system (namespace) +│ └── wva-controller (Deployment) ← Single controller watching all namespaces +│ +├── llm-d-stack-a (namespace) +│ ├── llm-d resources (Gateway, Scheduler, vLLM) +│ └── wva-resources (VA, HPA, Service, ServiceMonitor) for Model A +│ +├── llm-d-stack-b (namespace) +│ ├── llm-d 
resources (Gateway, Scheduler, vLLM) +│ └── wva-resources (VA, HPA, Service, ServiceMonitor) for Model B +│ +└── llm-d-stack-c (namespace) + ├── llm-d resources (Gateway, Scheduler, vLLM) + └── wva-resources (VA, HPA, Service, ServiceMonitor) for Model C +``` + +**Benefits:** +- Single WVA controller manages all models across the cluster +- Each model's resources are isolated in their own namespace +- Adding/removing models doesn't affect other models +- Supports multiple llm-d stacks without resource conflicts + +##### Upgrading Existing Installations + +If you have an existing WVA installation (v0.4.1 or earlier), it will continue to work with the default `all` mode. To migrate to the new multi-model architecture: + +```bash +# 1. Note your current model configuration +kubectl get variantautoscaling -A +kubectl get hpa -A | grep vllm + +# 2. Uninstall the old WVA installation +helm uninstall workload-variant-autoscaler -n workload-variant-autoscaler-system + +# 3. Reinstall with controller-only mode +helm install wva-controller ./charts/workload-variant-autoscaler \ + -n workload-variant-autoscaler-system \ + --create-namespace \ + --set installMode=controller-only \ + --set wva.namespaceScoped=false \ + --set wva.prometheus.baseURL="<prometheus-url>" + +# 4. Reinstall model resources for each model +helm install wva-model-a ./charts/workload-variant-autoscaler \ + --set installMode=model-resources-only \ + --set llmd.namespace=<namespace> \ + --set llmd.modelName=<model-name> \ + --set llmd.modelID="<model-id>" \ + --set va.accelerator=<accelerator> +``` + #### Helm Uninstall ```bash -# Uninstall the release +# Uninstall controller +helm uninstall wva-controller -n workload-variant-autoscaler-system + +# Uninstall model resources (repeat for each model) +helm uninstall wva-model-a +helm uninstall wva-model-b +# ... + +# Or uninstall all-in-one installation helm uninstall workload-variant-autoscaler -n workload-variant-autoscaler-system ``` diff --git a/docs/user-guide/installation.md b/docs/user-guide/installation.md index 5a6fc4777..037314d40 100644 --- a/docs/user-guide/installation.md +++ b/docs/user-guide/installation.md @@ -2,6 +2,8 @@ This guide covers installing Workload-Variant-Autoscaler (WVA) on your Kubernetes cluster. +> **New in v0.4.3**: WVA now supports flexible installation modes for multi-model deployments. See the [Multi-Model Migration Guide](multi-model-migration.md) for details on deploying WVA across multiple llm-d stacks. + ## Prerequisites - Kubernetes v1.32.0 or later @@ -13,7 +15,13 @@ This guide covers installing Workload-Variant-Autoscaler (WVA) on your Kubernete ### Option 1: Helm Installation (Recommended) -The simplest way to install WVA is using Helm: +The simplest way to install WVA is using Helm. 
The Helm chart supports three installation modes: + +- **`all` (default)**: Install both controller and model resources together +- **`controller-only`**: Install only the controller for cluster-wide management +- **`model-resources-only`**: Install only model-specific resources + +**Basic installation (single model):** ```bash # Install WVA with default configuration @@ -28,6 +36,28 @@ helm install workload-variant-autoscaler ./charts/workload-variant-autoscaler \ --values custom-values.yaml ``` +**Multi-model installation:** + +For deploying WVA to manage multiple models across different namespaces: + +```bash +# Step 1: Install controller once +helm install wva-controller ./charts/workload-variant-autoscaler \ + -n workload-variant-autoscaler-system \ + --create-namespace \ + --set installMode=controller-only \ + --set wva.namespaceScoped=false + +# Step 2: Install model resources for each model +helm install wva-model-a ./charts/workload-variant-autoscaler \ + --set installMode=model-resources-only \ + --set llmd.namespace=llm-d-model-a \ + --set llmd.modelName=model-a \ + --set llmd.modelID="meta-llama/Llama-2-7b-hf" +``` + +See the [Multi-Model Migration Guide](multi-model-migration.md) for complete details. + **Verify the installation:** ```bash kubectl get pods -n workload-variant-autoscaler-system diff --git a/docs/user-guide/multi-model-migration.md b/docs/user-guide/multi-model-migration.md new file mode 100644 index 000000000..4c644d024 --- /dev/null +++ b/docs/user-guide/multi-model-migration.md @@ -0,0 +1,311 @@ +# Multi-Model Migration Guide + +This guide helps you migrate from a single-model WVA installation to a multi-model architecture that supports multiple llm-d stacks across different namespaces. + +## Overview + +**Prior to v0.4.3**, the WVA Helm chart installed both the controller and model-specific resources together. This meant that installing WVA for a new model would overwrite the resources from existing models, making it impossible to support multiple llm-d stacks. 
+ +**Starting with v0.4.3**, WVA supports three installation modes that enable you to decouple the controller from model resources: + +- `all` (default) - Install both controller and model resources together +- `controller-only` - Install only the WVA controller +- `model-resources-only` - Install only model-specific resources + +## When to Migrate + +You should consider migrating to the new multi-model architecture if you: + +- Have multiple llm-d stacks in different namespaces +- Want to add models without affecting existing models +- Need to scale different models independently +- Want to manage model lifecycles separately from the controller + +## Migration Steps + +### Step 1: Document Current Configuration + +Before starting the migration, document your current setup: + +```bash +# List existing WVA installations +helm ls -A | grep workload-variant-autoscaler + +# Save current VariantAutoscaling resources +kubectl get variantautoscaling -A -o yaml > /tmp/existing-va.yaml + +# Save current HPA resources +kubectl get hpa -A | grep vllm > /tmp/existing-hpa.txt + +# Save current model configuration +kubectl get variantautoscaling -A -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.namespace}{"\t"}{.spec.modelID}{"\n"}{end}' > /tmp/models.txt +``` + +### Step 2: Backup Current Installation + +Create a backup of your current Helm values: + +```bash +# Get current values +helm get values workload-variant-autoscaler -n workload-variant-autoscaler-system > /tmp/current-values.yaml +``` + +### Step 3: Uninstall Old Installation + +Remove the existing WVA installation: + +```bash +# Uninstall WVA +helm uninstall workload-variant-autoscaler -n workload-variant-autoscaler-system + +# Verify resources are removed +kubectl get pods -n workload-variant-autoscaler-system +kubectl get variantautoscaling -A +kubectl get hpa -A | grep vllm +``` + +**Note**: The VariantAutoscaling and HPA resources will be deleted. This is expected as we will recreate them in the next steps. 
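+Before reinstalling, it can help to confirm the old controller pods have fully terminated so the new release does not race the old one (a minimal check, assuming the same label selector used by the other commands in this guide):
+
+```bash
+# Block until the old controller pods are gone (returns immediately if none match)
+kubectl wait --for=delete pod \
+  -l app.kubernetes.io/name=workload-variant-autoscaler \
+  -n workload-variant-autoscaler-system \
+  --timeout=120s
+```
+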
+ +### Step 4: Install WVA Controller (Cluster-Wide) + +Install the WVA controller once for the entire cluster: + +```bash +# Install controller in controller-only mode +helm install wva-controller ./charts/workload-variant-autoscaler \ + -n workload-variant-autoscaler-system \ + --create-namespace \ + --set installMode=controller-only \ + --set wva.namespaceScoped=false \ + --set wva.prometheus.baseURL="<prometheus-url>" \ + --set wva.prometheus.tls.insecureSkipVerify=<true|false> \ + --set-file wva.prometheus.caCert=<path-to-ca-cert> # If needed + +# Verify controller is running +kubectl get pods -n workload-variant-autoscaler-system +kubectl logs -n workload-variant-autoscaler-system -l app.kubernetes.io/name=workload-variant-autoscaler +``` + +### Step 5: Deploy Model Resources for Each Model + +For each llm-d stack/model, install the model-specific resources: + +#### Example: Model A + +```bash +helm install wva-model-a ./charts/workload-variant-autoscaler \ + --set installMode=model-resources-only \ + --set llmd.namespace=llm-d-model-a \ + --set llmd.modelName=ms-model-a-llm-d-modelservice \ + --set llmd.modelID="meta-llama/Llama-2-7b-hf" \ + --set va.enabled=true \ + --set va.accelerator=H100 \ + --set va.sloTpot=10 \ + --set va.sloTtft=1000 \ + --set hpa.enabled=true \ + --set hpa.maxReplicas=10 \ + --set vllmService.enabled=true \ + --set vllmService.nodePort=30000 \ + --set vllmService.interval=15s +``` + +#### Example: Model B + +```bash +helm install wva-model-b ./charts/workload-variant-autoscaler \ + --set installMode=model-resources-only \ + --set llmd.namespace=llm-d-model-b \ + --set llmd.modelName=ms-model-b-llm-d-modelservice \ + --set llmd.modelID="mistralai/Mistral-7B-v0.1" \ + --set va.enabled=true \ + --set va.accelerator=A100 \ + --set va.sloTpot=8 \ + --set va.sloTtft=800 \ + --set hpa.enabled=true \ + --set hpa.maxReplicas=8 \ + --set vllmService.enabled=true \ + --set vllmService.nodePort=30001 \ + --set vllmService.interval=15s +``` + +**Tip**: Use the information from Step 1 to configure each model correctly. 
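+
+With more than a couple of models, the per-model flags above are easier to manage as one values file per model, applied in a loop (a sketch; the `values/model-*.yaml` layout is hypothetical, and each file is assumed to hold the `llmd.*`, `va.*`, `hpa.*`, and `vllmService.*` keys shown above):
+
+```bash
+# Install or upgrade one release per model values file
+for f in values/model-*.yaml; do
+  release="wva-$(basename "$f" .yaml)"   # e.g. values/model-a.yaml -> wva-model-a
+  helm upgrade --install "$release" ./charts/workload-variant-autoscaler \
+    --set installMode=model-resources-only \
+    -f "$f"
+done
+```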
+ +### Step 6: Verify Migration + +Verify that all components are working correctly: + +```bash +# Check controller +kubectl get pods -n workload-variant-autoscaler-system +kubectl logs -n workload-variant-autoscaler-system -l app.kubernetes.io/name=workload-variant-autoscaler + +# Check model resources for each namespace +kubectl get variantautoscaling -A +kubectl get hpa -A | grep vllm +kubectl get service -A | grep vllm +kubectl get servicemonitor -A | grep vllm + +# Check that the controller is watching all namespaces +kubectl logs -n workload-variant-autoscaler-system -l app.kubernetes.io/name=workload-variant-autoscaler | grep "Starting workers" + +# Verify autoscaling is working +kubectl describe variantautoscaling -n llm-d-model-a +kubectl describe hpa -n llm-d-model-a +``` + +## Post-Migration Architecture + +After migration, your architecture will look like this: + +``` +Cluster +├── workload-variant-autoscaler-system (namespace) +│ └── wva-controller (Deployment) ← Single controller watching all namespaces +│ +├── llm-d-model-a (namespace) +│ ├── llm-d resources (Gateway, Scheduler, vLLM) +│ └── wva-resources +│ ├── VariantAutoscaling +│ ├── HPA +│ ├── Service +│ └── ServiceMonitor +│ +├── llm-d-model-b (namespace) +│ ├── llm-d resources (Gateway, Scheduler, vLLM) +│ └── wva-resources +│ ├── VariantAutoscaling +│ ├── HPA +│ ├── Service +│ └── ServiceMonitor +│ +└── llm-d-model-c (namespace) + ├── llm-d resources (Gateway, Scheduler, vLLM) + └── wva-resources + ├── VariantAutoscaling + ├── HPA + ├── Service + └── ServiceMonitor +``` + +## Adding New Models + +After migration, adding new models is straightforward: + +```bash +# Just install model resources for the new model +helm install wva-model-new ./charts/workload-variant-autoscaler \ + --set installMode=model-resources-only \ + --set llmd.namespace=llm-d-model-new \ + --set llmd.modelName=ms-model-new-llm-d-modelservice \ + --set llmd.modelID="new/model-id" \ + --set va.accelerator=<accelerator> +``` + +The new model resources won't affect existing models! + +## Removing Models + +To remove a model without affecting others: + +```bash +# Uninstall model resources +helm uninstall wva-model-a + +# Verify only that model's resources are removed +kubectl get variantautoscaling -A +kubectl get hpa -A | grep vllm +``` + +The controller and other models remain unaffected. + +## Troubleshooting + +### Controller Not Watching All Namespaces + +**Problem**: Controller only watches its own namespace. + +**Solution**: Ensure `wva.namespaceScoped=false` was set during controller installation: + +```bash +# Check controller arguments +kubectl get deployment -n workload-variant-autoscaler-system workload-variant-autoscaler-controller-manager -o yaml | grep watch-namespace + +# If --watch-namespace is present, reinstall with correct settings +helm uninstall wva-controller -n workload-variant-autoscaler-system +helm install wva-controller ./charts/workload-variant-autoscaler \ + -n workload-variant-autoscaler-system \ + --create-namespace \ + --set installMode=controller-only \ + --set wva.namespaceScoped=false \ + --set wva.prometheus.baseURL="<prometheus-url>" +``` + +### Model Resources Not Reconciling + +**Problem**: VariantAutoscaling resources are not being reconciled. + +**Solution**: +1. Check controller logs for errors +2. Verify the controller has RBAC permissions for the model namespace +3. 
Ensure the VariantAutoscaling resource is correctly configured + +```bash +# Check controller logs +kubectl logs -n workload-variant-autoscaler-system -l app.kubernetes.io/name=workload-variant-autoscaler | grep ERROR + +# Check RBAC +kubectl auth can-i get variantautoscalings --as=system:serviceaccount:workload-variant-autoscaler-system:workload-variant-autoscaler-controller-manager -n llm-d-model-a +``` + +### HPA Shows Unknown Metrics + +**Problem**: HPA shows `<unknown>` for the external metric. + +**Solution**: +1. Verify Prometheus Adapter is installed and configured +2. Check that the VariantAutoscaling resource is emitting metrics +3. Verify the HPA metric selector matches the VariantAutoscaling metric labels + +```bash +# Check external metrics API +kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/llm-d-model-a/inferno_desired_replicas" | jq + +# Check VariantAutoscaling status +kubectl describe variantautoscaling -n llm-d-model-a +``` + +## Rollback + +If you need to roll back to the old installation method: + +```bash +# Uninstall new components +helm uninstall wva-controller -n workload-variant-autoscaler-system +helm uninstall wva-model-a +helm uninstall wva-model-b +# ... uninstall all model releases + +# Reinstall using the old method (installMode=all) +helm install workload-variant-autoscaler ./charts/workload-variant-autoscaler \ + -n workload-variant-autoscaler-system \ + --create-namespace \ + --set installMode=all \ + -f /tmp/current-values.yaml # Use your saved values +``` + +## Benefits of Multi-Model Architecture + +After migration, you'll benefit from: + +- **Isolation**: Each model's resources are independent +- **Flexibility**: Add/remove models without affecting others +- **Scalability**: Scale models independently based on their workload +- **Simplicity**: Simpler management with one controller for all models +- **Reliability**: Failure in one model's resources doesn't affect others + +## Related Documentation + +- [Installation Guide](installation.md) +- [Configuration Guide](configuration.md) +- [Helm Chart README](../../charts/workload-variant-autoscaler/README.md) +- [Deployment Guide](../../deploy/README.md)