-This guide explains how to use the migration scripts in `hack/` to move Azure Disk backed PVCs from Premium_LRS to PremiumV2_LRS. It covers the two supported modes (`inplace` and `dual`), prerequisites, validation steps, safety / rollback, cleanup, and troubleshooting.
+This guide explains how to use the migration scripts in `hack/` to move Azure Disk backed PVCs from Premium_LRS to PremiumV2_LRS. It now covers three supported modes (`inplace`, `dual`, and `attrclass` / VolumeAttributesClass), prerequisites, validation steps, safety / rollback, cleanup, and troubleshooting.

 ---

@@ -22,11 +22,39 @@ They are intended for controlled batches (not fire‑and‑forget across an enti
 | In-place | `hack/premium-to-premiumv2-migrator-inplace.sh` | Deletes original PVC (keeping original PV), recreates same name PVC pointing to snapshot and PremiumV2 SC | Same name preserved; minimal object sprawl | Short window where PVC is absent; workload must be quiesced/detached; rollback relies on retained PV | Smaller batches, controlled maintenance windows |
 | Dual (pv1→pv2) | `hack/premium-to-premiumv2-migrator-dualpvc.sh` | Creates intermediate CSI PV/PVC (if source was in-tree), snapshots, creates a *pv2* PVC (suffix), monitors migration events | Keeps original PVC around longer (reduced disruption); clearer staged artifacts | More objects (intermediate PV/PVC + target); higher cleanup burden; naming complexity | Migration where minimizing initial disruption matters or need visibility before switch |
+| AttrClass (in-place attribute update) | `hack/premium-to-premiumv2-migrator-vac.sh` | (Optionally) converts in-tree PV to CSI same-name first, then applies a `VolumeAttributesClass` to mutate the disk SKU | No new pv2 PVC; minimal object churn; preserves PVC name; avoids creating SC variants | Requires cluster & driver support for VolumeAttributesClass; rollback of SKU change requires another class or snapshot-based restore | Clusters already CSI-enabled or ready to convert; desire lowest object churn |

 Recommendation:
 1. Pilot on a tiny subset using `inplace` (simpler) in a non-prod namespace.
-2. If operational constraints demand minimal rename churn or extra observation time, use `dual` for broader rollout.
-3. Always label PVCs explicitly to opt them in (staged adoption).
+2. If you need prolonged coexistence / observation, use `dual`.
+3. If your cluster + Azure Disk CSI driver support `VolumeAttributesClass`, prefer `attrclass` for lowest object churn (especially when most PVs are already CSI).
+4. Always label PVCs explicitly to opt them in (staged adoption).
+
+### 2.1 AttrClass Mode Details
+
+`hack/premium-to-premiumv2-migrator-vac.sh`:
+- Ensures (or recreates if forced) a `VolumeAttributesClass` (default `azuredisk-premiumv2`) with `parameters.skuName=PremiumV2_LRS`.
+- For CSI Premium_LRS PVCs: patches `spec.volumeAttributesClassName` only (no new PVC/PV).
+- For in-tree azureDisk PVs: performs a one-time snapshot-based same-name CSI recreation (like a narrowed “inplace” convert) then patches attr class.
+- Central monitoring loop watches both:
+  - PV `.spec.csi.volumeAttributes.skuName|skuname` flip to `PremiumV2_LRS`.
+  - `SKUMigration*` events (if emitted) similar to other modes.
+- Rollback before SKU change: same as inplace (retained original PV + annotation / backup). After successful SKU mutation: must apply a different attr class pointing back to Premium_LRS (not auto-created) or restore from snapshot.
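
As a rough illustration of the two objects AttrClass mode works with, the sketch below renders a `VolumeAttributesClass` manifest and the merge patch that sets `spec.volumeAttributesClassName` on a PVC. The helper names `render_attr_class` and `render_pvc_patch` are hypothetical and not taken from the script, and `driverName: disk.csi.azure.com` is an assumption based on the provisioner name used in this guide. The variable defaults mirror the documented env vars.

```bash
# Sketch only: defaults mirror the env vars documented for AttrClass mode.
ATTR_CLASS_NAME="${ATTR_CLASS_NAME:-azuredisk-premiumv2}"
ATTR_CLASS_API_VERSION="${ATTR_CLASS_API_VERSION:-storage.k8s.io/v1beta1}"
TARGET_SKU="${TARGET_SKU:-PremiumV2_LRS}"

# Print a VolumeAttributesClass manifest (hypothetical helper, not from the script).
# driverName is assumed to be the Azure Disk CSI driver name.
render_attr_class() {
  cat <<EOF
apiVersion: ${ATTR_CLASS_API_VERSION}
kind: VolumeAttributesClass
metadata:
  name: ${ATTR_CLASS_NAME}
driverName: disk.csi.azure.com
parameters:
  skuName: ${TARGET_SKU}
EOF
}

# Print the merge patch that points a PVC at the class (hypothetical helper).
render_pvc_patch() {
  printf '{"spec":{"volumeAttributesClassName":"%s"}}\n' "${ATTR_CLASS_NAME}"
}

# Usage against a real cluster (not executed here):
#   render_attr_class | kubectl apply -f -
#   kubectl patch pvc <pvc-name> -n <namespace> --type merge -p "$(render_pvc_patch)"
```

Rendering the objects separately from applying them makes it possible to review exactly what would be sent to the API server before touching a labeled PVC.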
+./premium-to-premiumv2-migrator-vac.sh | tee run-attrclass-$(date +%Y%m%d-%H%M%S).log
+```
+
+Additional env (see section 5):
+```
+ATTR_CLASS_NAME=azuredisk-premiumv2
+ATTR_CLASS_API_VERSION=storage.k8s.io/v1beta1 # or storage.k8s.io/v1 when GA
+TARGET_SKU=PremiumV2_LRS
+ATTR_CLASS_FORCE_RECREATE=false
+```

 ---

@@ -82,6 +110,12 @@ Change the label (or add additional selectors externally) to control scope. Only
 | `MIGRATION_LABEL` | see above | PVC selection. |
 | `AUDIT_ENABLE` | `true` | Enable audit log lines. |
 | `AUDIT_LOG_FILE` | `pv1-pv2-migration-audit.log` | Rolling append log file. |
+| `ATTR_CLASS_NAME` | `azuredisk-premiumv2` | (AttrClass mode) Name of VolumeAttributesClass to apply. |
+| `ATTR_CLASS_API_VERSION` | `storage.k8s.io/v1beta1` | API version for VolumeAttributesClass (adjust if GA). |
+| `TARGET_SKU` | `PremiumV2_LRS` | Target skuName parameter for the VolumeAttributesClass. |
+| `ATTR_CLASS_FORCE_RECREATE` | `false` | Recreate the attr class each run. |
+| `PV_POLL_INTERVAL_SECONDS` | `10` | (AttrClass) Poll interval for sku check. |
+| `SKU_UPDATE_TIMEOUT_MINUTES` | `60` | (AttrClass optional blocking helper) Per-PVC sku update wait if used directly. |

 (See top of `lib-premiumv2-migration-common.sh` for the complete list.)

@@ -90,22 +124,74 @@ Change the label (or add additional selectors externally) to control scope. Only
 ## 6. Prerequisites & Validation Checklist

 Before running:
-1. RBAC: Ensure your principal can `get/list/create/patch/delete` PV/PVC/Snapshot/SC as required. Script will abort if critical verbs fail.
-2. Quota: Check PremiumV2 disk quotas in target subscription/region (script does NOT enforce).
-3. StorageClasses: Confirm original SC(s) are Premium_LRS (cachingMode=none, no unsupported encryption combos).
-4. Workload readiness: Plan for pods referencing target PVCs to be idle / safe to pause if using in-place.
-5. Snapshot CRDs: Ensure `VolumeSnapshot` CRDs installed (the script creates a class if absent).
-6. Label small test set:
+1. **RBAC**: Ensure your principal can `get/list/create/patch/delete` PV/PVC/Snapshot/SC as required. Script will abort if critical verbs fail.
+
+2. **Quota**: Check PremiumV2 disk quotas in target subscription/region (script does NOT enforce).
+
+3. **StorageClasses**: Confirm original SC(s) are Premium_LRS (cachingMode=none, no unsupported encryption combos).
+
+4. **⚠️ Zone Topology Requirements (Critical for PremiumV2_LRS)**:
+
+**PremiumV2_LRS disks can only be attached to VMs running in the same Availability Zone.** If your workloads are zone-constrained or you're using topology-aware scheduling, you **must** update your source StorageClasses with `allowedTopologies` before migration.
+
+**Action Required**: Update your existing Premium_LRS StorageClasses to include the correct zone topology constraints:
+
+```yaml
+apiVersion: storage.k8s.io/v1
+kind: StorageClass
+metadata:
+  name: managed-premium # Your existing StorageClass name
+provisioner: disk.csi.azure.com
+parameters:
+  skuName: Premium_LRS
+  cachingMode: None
+allowedTopologies:
+- matchLabelExpressions:
+  - key: topology.disk.csi.azure.com/zone
+    values:
+    - eastus2-1 # Replace with your target zone(s)
+    - eastus2-2 # Add multiple zones if needed
+    - eastus2-3
+reclaimPolicy: Delete
+allowVolumeExpansion: true
+volumeBindingMode: WaitForFirstConsumer # Recommended for zone-aware scheduling
+```
+
+```bash
+kubectl get nodes -o custom-columns="NAME:.metadata.name,ZONE:.metadata.labels['topology\.kubernetes\.io/zone']"
+
+# Check zones where your existing PVs are located
+kubectl get pv -o custom-columns="NAME:.metadata.name,ZONE:.spec.nodeAffinity.required.nodeSelectorTerms[0].matchExpressions[0].values[0]"
+
+# Check current PVC zones
+kubectl get pvc -A -o custom-columns="NAMESPACE:.metadata.namespace,NAME:.metadata.name,ZONE:.metadata.annotations['volume\.kubernetes\.io/selected-node']" | grep -v '<none>'
 ```
-7. Dry run *logic* (syntax & preflight only):
+
+**Why this matters**:
+- The migration script inherits `allowedTopologies` from your source StorageClass when creating PremiumV2_LRS variants
+- Without proper topology constraints, PremiumV2 PVCs may be provisioned in zones where your workloads cannot access them
+- This can result in pod scheduling failures or volume attachment timeouts
+
+5. **Workload readiness**: Plan for pods referencing target PVCs to be idle / safe to pause if using in-place.
+
+6. **Snapshot CRDs**: Ensure `VolumeSnapshot` CRDs installed (the script creates a class if absent).
-8. Optional: Run with a deliberately empty label selector to validate preflight (set `MIGRATION_LABEL="doesnotexist=true"` temporarily).

+9. **Optional**: Run with a deliberately empty label selector to validate preflight (set `MIGRATION_LABEL="doesnotexist=true"` temporarily).
+
+**Important**: After updating your source StorageClasses with topology constraints, verify that existing workloads can still schedule properly before proceeding with migration. The script will automatically inherit these topology settings when creating the PremiumV2_LRS variant StorageClasses.
 ---

 ## 7. Running the Scripts
@@ -127,6 +213,20 @@ MAX_PVCS=5 MIG_SUFFIX=csi \
 ./premium-to-premiumv2-migrator-dualpvc.sh 2>&1 | tee run-dual-$(date +%Y%m%d-%H%M%S).log
 ```

+AttrClass example:
+```bash
+cd hack
+MAX_PVCS=5 ATTR_CLASS_NAME=azuredisk-premiumv2 \
+./premium-to-premiumv2-migrator-vac.sh 2>&1 | tee run-attrclass-$(date +%Y%m%d-%H%M%S).log
+```
+
+AttrClass with in-tree presence (override baseline CSI SC):
+```bash
+cd hack
+CSI_BASELINE_SC=csi-azuredisk-premium \
+MAX_PVCS=3 ./premium-to-premiumv2-migrator-vac.sh
+```
+
 Important runtime phases (both):
 1. Pre-req scan (size, SC parameters, binding).
 2. RBAC preflight.
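
The RBAC preflight phase can be approximated by hand with `kubectl auth can-i`. The sketch below is an assumption about what such a check might look like, not the script's actual implementation: the function name `rbac_preflight` is hypothetical, and the resource list is inferred from the "PV/PVC/Snapshot/SC" wording in the prerequisites checklist.

```bash
# Hypothetical preflight helper; the real script's checks may differ.
# Prints every denied verb/resource pair and returns non-zero if any is denied.
rbac_preflight() {
  local ns="$1" failed=0 verb res answer
  for verb in get list create patch delete; do
    # Cluster-scoped resources: no namespace flag.
    for res in persistentvolumes storageclasses.storage.k8s.io; do
      answer="$(kubectl auth can-i "$verb" "$res" 2>/dev/null)"
      [ "$answer" = "yes" ] || { echo "DENIED: $verb $res"; failed=1; }
    done
    # Namespaced resources: checked in the target namespace.
    for res in persistentvolumeclaims volumesnapshots.snapshot.storage.k8s.io; do
      answer="$(kubectl auth can-i "$verb" "$res" -n "$ns" 2>/dev/null)"
      [ "$answer" = "yes" ] || { echo "DENIED: $verb $res (namespace $ns)"; failed=1; }
    done
  done
  return "$failed"
}

# Usage (not executed here):
#   rbac_preflight my-namespace || echo "fix RBAC before migrating"
```

Running a check like this before labeling any PVCs surfaces missing verbs early, rather than partway through a batch.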
@@ -415,6 +515,9 @@ Summary:
 | No `SKUMigration*` events | Controller not emitting or watch delay | Force in-progress label (script auto after threshold) |
 | Released PV leftovers | Rollback or partial batch | Confirm not needed → delete PV |
 | Rollback failed to rebind | claimRef not cleared or PV reclaimPolicy=Delete | Ensure reclaimPolicy changed to Retain earlier |
+| AttrClass PVC never flips sku | Driver / cluster lacks VolumeAttributesClass update support | Confirm driver version & feature gate; inspect PV `.spec.csi.volumeAttributes` |
+| AttrClass run shows no events | Controller not emitting `SKUMigration*` | Rely on sku attribute polling; consider driver log inspection |
+| AttrClass rollback after sku change | SKU already mutated on disk | Apply alternate attr class (Premium_LRS) or snapshot restore |
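
When no `SKUMigration*` events arrive, the sku attribute on the PV can be polled directly. The following is a minimal sketch of such a loop, assuming the PV exposes the sku under `.spec.csi.volumeAttributes.skuName` or `.skuname` as described for AttrClass mode. The helper names `pv_sku` and `wait_for_sku` are hypothetical; the variable defaults mirror `PV_POLL_INTERVAL_SECONDS`, `SKU_UPDATE_TIMEOUT_MINUTES`, and `TARGET_SKU` from the configuration table.

```bash
PV_POLL_INTERVAL_SECONDS="${PV_POLL_INTERVAL_SECONDS:-10}"
SKU_UPDATE_TIMEOUT_MINUTES="${SKU_UPDATE_TIMEOUT_MINUTES:-60}"
TARGET_SKU="${TARGET_SKU:-PremiumV2_LRS}"

# Read the PV's sku attribute, trying both key spellings (hypothetical helper).
pv_sku() {
  local pv="$1" sku
  sku="$(kubectl get pv "$pv" -o jsonpath='{.spec.csi.volumeAttributes.skuName}' 2>/dev/null)"
  if [ -z "$sku" ]; then
    sku="$(kubectl get pv "$pv" -o jsonpath='{.spec.csi.volumeAttributes.skuname}' 2>/dev/null)"
  fi
  echo "$sku"
}

# Poll until the PV reports TARGET_SKU or the timeout elapses (hypothetical helper).
wait_for_sku() {
  local pv="$1" deadline
  deadline=$(( $(date +%s) + SKU_UPDATE_TIMEOUT_MINUTES * 60 ))
  while [ "$(date +%s)" -lt "$deadline" ]; do
    if [ "$(pv_sku "$pv")" = "$TARGET_SKU" ]; then
      echo "PV $pv is now $TARGET_SKU"
      return 0
    fi
    sleep "$PV_POLL_INTERVAL_SECONDS"
  done
  echo "Timed out waiting for PV $pv to reach $TARGET_SKU" >&2
  return 1
}

# Usage (not executed here):
#   wait_for_sku <pv-name> || kubectl describe pv <pv-name>
```

A timeout from this loop does not by itself indicate failure on the Azure side; it only means the attribute had not changed within the configured window, so driver logs remain the next place to look.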