Commit dd0333a

Justin committed: update troubleshooting ask
1 parent 70642aa

1 file changed

articles/aks/troubleshooting.md

Lines changed: 6 additions & 34 deletions
@@ -55,7 +55,7 @@ The reason for the warnings on the dashboard is that the cluster is now enabled
 
 The easiest way to access your service outside the cluster is to run `kubectl proxy`, which proxies requests sent to your localhost port 8001 to the Kubernetes API server. From there, the API server can proxy to your service: `http://localhost:8001/api/v1/namespaces/kube-system/services/kubernetes-dashboard/proxy/#!/node?namespace=default`.
 
-If you dont see the Kubernetes dashboard, check whether the `kube-proxy` pod is running in the `kube-system` namespace. If it isn't in a running state, delete the pod and it will restart.
+If you don't see the Kubernetes dashboard, check whether the `kube-proxy` pod is running in the `kube-system` namespace. If it isn't in a running state, delete the pod and it will restart.
 
 ## I can't get logs by using kubectl logs or I can't connect to the API server. I'm getting "Error from server: error dialing backend: dial tcp…". What should I do?
 
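To make that dashboard check concrete, a minimal sequence might look like this; the pod name is a placeholder, since real kube-proxy pods carry a generated suffix:

```console
# Proxy the Kubernetes API server to localhost:8001
kubectl proxy

# In a second shell, confirm kube-proxy is running in kube-system
kubectl get pods --namespace kube-system | grep kube-proxy

# If a kube-proxy pod isn't in the Running state, delete it;
# per the note above, it restarts automatically
kubectl delete pod kube-proxy-abc12 --namespace kube-system
```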
@@ -116,7 +116,7 @@ Naming restrictions are implemented by both the Azure platform and AKS. If a res
 * The AKS *MC_* resource group name combines resource group name and resource name. The auto-generated syntax of `MC_resourceGroupName_resourceName_AzureRegion` must be no greater than 80 chars. If needed, reduce the length of your resource group name or AKS cluster name.
 * The *dnsPrefix* must start and end with alphanumeric values and must be between 1-54 characters. Valid characters include alphanumeric values and hyphens (-). The *dnsPrefix* can't include special characters such as a period (.).
 
-## Im receiving errors when trying to create, update, scale, delete or upgrade cluster, that operation is not allowed as another operation is in progress.
+## I'm receiving errors when trying to create, update, scale, delete or upgrade cluster, that operation is not allowed as another operation is in progress.
 
 *This troubleshooting assistance is directed from aka.ms/aks-pending-operation*
 
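As an illustration of the *MC_* naming rule, querying a cluster's node resource group shows the generated name; `myRG` and `myAKS` are placeholder names:

```console
# A cluster "myAKS" in resource group "myRG" in eastus yields a node
# resource group like "MC_myRG_myAKS_eastus" (subject to the 80-char limit)
az aks show --resource-group myRG --name myAKS --query nodeResourceGroup --output tsv
```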
@@ -154,41 +154,14 @@ Verify that your settings are not conflicting with any of the required or option
 | 1.14 | 1.14.2 or later |
 
 
-### What versions of Kubernetes have Azure Disk support on the Sovereign Cloud?
+### What versions of Kubernetes have Azure Disk support on the Sovereign Clouds?
 
 | Kubernetes version | Recommended version |
 | -- | :--: |
 | 1.12 | 1.12.0 or later |
 | 1.13 | 1.13.0 or later |
 | 1.14 | 1.14.0 or later |
 
-
-### WaitForAttach failed for Azure Disk: parsing "/dev/disk/azure/scsi1/lun1": invalid syntax
-
-In Kubernetes version 1.10, MountVolume.WaitForAttach may fail with an the Azure Disk remount.
-
-On Linux, you may see an incorrect DevicePath format error. For example:
-
-```console
-MountVolume.WaitForAttach failed for volume "pvc-f1562ecb-3e5f-11e8-ab6b-000d3af9f967" : azureDisk - Wait for attach expect device path as a lun number, instead got: /dev/disk/azure/scsi1/lun1 (strconv.Atoi: parsing "/dev/disk/azure/scsi1/lun1": invalid syntax)
-Warning FailedMount 1m (x10 over 21m) kubelet, k8s-agentpool-66825246-0 Unable to mount volumes for pod
-```
-
-On Windows, you may see a wrong DevicePath(LUN) number error. For example:
-
-```console
-Warning FailedMount 1m kubelet, 15282k8s9010 MountVolume.WaitForAttach failed for volume "disk01" : azureDisk - WaitForAttach failed within timeout node (15282k8s9010) diskId:(andy-mghyb
-1102-dynamic-pvc-6c526c51-4a18-11e8-ab5c-000d3af7b38e) lun:(4)
-```
-
-This issue has been fixed in the following versions of Kubernetes:
-
-| Kubernetes version | Fixed version |
-| -- | :--: |
-| 1.10 | 1.10.2 or later |
-| 1.11 | 1.11.0 or later |
-| 1.12 and later | N/A |
-
 ### Failure when setting uid and gid in mountOptions for Azure Disk
 
 Azure Disk uses the ext4,xfs filesystem by default, and mountOptions such as uid=x,gid=x can't be set at mount time. For example, if you tried to set mountOptions uid=999,gid=999, you would see an error like:
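Since ext4 and xfs reject uid and gid as mount options, a common workaround (a sketch with placeholder names, not part of this commit) is to set ownership through the pod's security context instead:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: security-context-demo      # placeholder name
spec:
  securityContext:
    runAsUser: 999                 # processes in the pod run as uid 999
    fsGroup: 999                   # mounted volumes are made accessible to gid 999
  containers:
  - name: app
    image: nginx
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: azure-managed-disk   # placeholder PVC name
```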
@@ -293,7 +266,7 @@ If you are using a version of Kubernetes that does not have the fix for this iss
 In some cases, if an Azure Disk detach operation fails on the first attempt, it will not retry the detach operation and will remain attached to the original node VM. This error can occur when moving a disk from one node to another. For example:
 
 ```console
-[Warning] AttachVolume.Attach failed for volume pvc-7b7976d7-3a46-11e9-93d5-dee1946e6ce9 : Attach volume kubernetes-dynamic-pvc-7b7976d7-3a46-11e9-93d5-dee1946e6ce9" to instance /subscriptions/XXX/resourceGroups/XXX/providers/Microsoft.Compute/virtualMachines/aks-agentpool-57634498-0 failed with compute.VirtualMachinesClient#CreateOrUpdate: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status= Code=ConflictingUserInput Message=Disk /subscriptions/XXX/resourceGroups/XXX/providers/Microsoft.Compute/disks/kubernetes-dynamic-pvc-7b7976d7-3a46-11e9-93d5-dee1946e6ce9 cannot be attached as the disk is already owned by VM /subscriptions/XXX/resourceGroups/XXX/providers/Microsoft.Compute/virtualMachines/aks-agentpool-57634498-1’.”
+[Warning] AttachVolume.Attach failed for volume "pvc-7b7976d7-3a46-11e9-93d5-dee1946e6ce9" : Attach volume "kubernetes-dynamic-pvc-7b7976d7-3a46-11e9-93d5-dee1946e6ce9" to instance "/subscriptions/XXX/resourceGroups/XXX/providers/Microsoft.Compute/virtualMachines/aks-agentpool-57634498-0" failed with compute.VirtualMachinesClient#CreateOrUpdate: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status= Code="ConflictingUserInput" Message="Disk '/subscriptions/XXX/resourceGroups/XXX/providers/Microsoft.Compute/disks/kubernetes-dynamic-pvc-7b7976d7-3a46-11e9-93d5-dee1946e6ce9' cannot be attached as the disk is already owned by VM '/subscriptions/XXX/resourceGroups/XXX/providers/Microsoft.Compute/virtualMachines/aks-agentpool-57634498-1'."
 ```
 
 This issue has been fixed in the following versions of Kubernetes:
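If upgrading to a fixed version isn't immediately possible, one manual mitigation (a sketch; the resource group and VM names are placeholders echoing the error above) is to detach the stuck disk with the Azure CLI:

```console
# Detach the disk from the VM that still owns it (placeholder names)
az vm disk detach \
    --resource-group MC_myRG_myAKS_eastus \
    --vm-name aks-agentpool-57634498-1 \
    --name kubernetes-dynamic-pvc-7b7976d7-3a46-11e9-93d5-dee1946e6ce9
```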
@@ -322,7 +295,6 @@ This issue has been fixed in the following versions of Kubernetes:
 
 If you are using a version of Kubernetes that does not have the fix for this issue and your node VM has an obsolete disk list, you can mitigate the issue by detaching all non-existing disks from the VM as a single, bulk operation. **Individually detaching non-existing disks may fail.**
 
-
 ### Large number of Azure Disks causes slow attach/detach
 
 When the number of Azure Disks attached to a node VM is larger than 10, attach and detach operations may be slow. This is a known issue and there are no workarounds at this time.
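Whether you suspect an obsolete disk list or simply many attached disks, querying the VM model shows the data disks it currently references; the names below are placeholders:

```console
# List the data disks the VM model references (placeholder names)
az vm show \
    --resource-group MC_myRG_myAKS_eastus \
    --name aks-agentpool-57634498-1 \
    --query "storageProfile.dataDisks[].name" --output table
```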
@@ -379,7 +351,7 @@ Recommended settings:
 | 1.12.0 - 1.12.1 | 0755 |
 | 1.12.2 and later | 0777 |
 
-If using a cluster with Kuberetes version 1.8.5 or greater and dynamically creating the persistent volume with a storage class, mount options can be specified on the storage class object. The following example sets *0777*:
+Mount options can be specified on the storage class object. The following example sets *0777*:
 
 ```yaml
 kind: StorageClass
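A hedged sketch of a complete storage class along those lines, assuming the azure-file provisioner; the metadata name and skuName are placeholders:

```yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: azurefile               # placeholder name
provisioner: kubernetes.io/azure-file
mountOptions:
  - dir_mode=0777               # directory permissions applied at mount
  - file_mode=0777              # file permissions applied at mount
parameters:
  skuName: Standard_LRS         # assumed SKU; adjust as needed
```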
@@ -475,7 +447,7 @@ To update your Azure secret file, use `kubectl edit secret`. For example:
 kubectl edit secret azure-storage-account-{storage-account-name}-secret
 ```
 
-After a few minutes, the agent node will retry the azure file mount with the updated storage key.
+After a few minutes, the agent node will retry the file mount with the updated storage key.
 
 ### Cluster autoscaler fails to scale with error failed to fix node group sizes
 
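When editing that secret, the replacement key must be base64-encoded. A sketch for fetching and encoding it; `myRG` and `mystorageaccount` are placeholder names:

```console
# Fetch the current storage account key and base64-encode it for the secret
az storage account keys list \
    --resource-group myRG \
    --account-name mystorageaccount \
    --query "[0].value" --output tsv | tr -d '\n' | base64
```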