You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: articles/databox-online/azure-stack-edge-gpu-2403-release-notes.md
+16-17Lines changed: 16 additions & 17 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -30,9 +30,9 @@ This article applies to the **Azure Stack Edge 2403** release, which maps to sof
30
30
31
31
To apply the 2403 update, your device must be running version 2303 or later.
32
32
33
-
- If you are not running the minimum required version, you'll see this error:
33
+
- If you aren't running the minimum required version, you'll see this error:
34
34
35
-
*Update package cannot be installed as its dependencies are not met.*
35
+
*Update package can't be installed as its dependencies aren't met.*
36
36
37
37
- You can update to 2303 from 2207 or later, and then install 2403.
38
38
@@ -52,26 +52,26 @@ The 2403 release has the following new features and enhancements:
52
52
53
53
- CAT-1 STIG security fixes Mariner Guest OS for AKS on Azure Stack Edge.
54
54
- Deprecating support for AKS-Telemetry on AKS on Azure Stack Edge.
55
-
- Zone-label support for 2-node Kubernetes clusters.
55
+
- Zone-label support for two-node Kubernetes clusters.
56
56
- Hyper-V VM management: memory usage monitoring on Azure Stack Edge host.
57
57
58
58
## Issues fixed in this release
59
59
60
60
| No. | Feature | Issue |
61
61
| --- | --- | --- |
62
-
|**1.**|2-node cold boot of the server causes high availability VM cluster resources to come up as offline. | Changed ColdStartSetting to AlwaysStart. |
62
+
|**1.**|Two-node cold boot of the server causes high availability VM cluster resources to come up as offline. | Changed ColdStartSetting to AlwaysStart. |
63
63
|**2.**| Marketplace image support. | Fixed bug allowing Windows Marketplace image on Azure Stack Edge A and TMA. |
64
64
|**3.**| Fixed VM NIC link flapping after Azure Stack Edge host power off/on, which can cause VM losing its DHCP IP. ||
65
65
|**4.**| Due to proxy ARP configurations in some customer environments, **IP address in use** check returns false positive even though no endpoint in the network is using the IP. | The fix skips the ARP-based VM **IP address in use** check if the IP address is allocated from an internal network managed by Azure Stack Edge. |
66
-
|**5.**| VM NIC change operation times out after 3 hours which blocks other VM update operations. On Microsoft Kubernetes clusters, Persistent Volume (PV) dependent pods get stuck. The issue occurs when multiple NICs within a VM are being transferred from a VLAN virtual network to a non-VLAN virtual network. The transfer involves asynchronous calls that are processed in the same thread. If one NIC transfer is made while the other NIC transfer is still processing, a deadlock could occur due to the shared API activity ID in one thread. | To mitigate this issue, the API activity ID would be uniquely generated even within one thread, ensuring that the parallel calls are independent and don't affect each other. After the fix, the VM NIC change operation times out quickly and the VM update won't be blocked. |
67
-
|**6.**| Kubernetes | Overall 2-node Kubernetes resiliency improvements, like increasing memory for control plane for AKS workload cluster, increasing limits for etcd, multi-replica, and hard anti-affinity support for core DNS and Azure disk csi controller pods to improve VM failover times. |
66
+
|**5.**| VM NIC change operation times out after 3 hours, which blocks other VM update operations. On Microsoft Kubernetes clusters, Persistent Volume (PV) dependent pods get stuck. The issue occurs when multiple NICs within a VM are being transferred from a VLAN virtual network to a non-VLAN virtual network. The transfer involves asynchronous calls that are processed in the same thread. If one NIC transfer is made while the other NIC transfer is still processing, a deadlock could occur due to the shared API activity ID in one thread. | To mitigate this issue, the API activity ID would be uniquely generated even within one thread, ensuring that the parallel calls are independent and don't affect each other. After the fix, the VM NIC change operation times out quickly and the VM update won't be blocked. |
67
+
|**6.**| Kubernetes | Overall two-node Kubernetes resiliency improvements, like increasing memory for control plane for AKS workload cluster, increasing limits for etcd, multi-replica, and hard anti-affinity support for core DNS and Azure disk csi controller pods to improve VM failover times. |
68
68
|**7.**| Compute Diagnostic and Update | Resiliency fixes |
69
69
70
70
CAT-1 STIG security fixes Mariner Guest OS for AKS on ASE.
71
71
72
72
Deprecating support for AKS-Telemetry on AKS on ASE
73
73
74
-
Zone-label support for Kubernetes cluster on 2-node
74
+
Zone-label support for Kubernetes cluster on two-node
75
75
76
76
Field IcM - monitor "Hyper-V Virtual Machine Management" memory usage on ASE host and keep it in check.
77
77
@@ -86,19 +86,19 @@ Field IcM - monitor "Hyper-V Virtual Machine Management" memory usage on ASE hos
86
86
87
87
| No. | Feature | Issue | Workaround/comments |
88
88
| --- | --- | --- | --- |
89
-
|**1.**| Azure Storage Explorer | The Blob Endpoint certificate that's auto-generated by the Azure Stack Edge device may not work properly with Azure Storage Explorer. | Replace Blob Endpoint certificate. For detailed steps, see [Bring your own certificates](azure-stack-edge-gpu-deploy-configure-certificates.md#bring-your-own-certificates). |
90
-
|**2.**| Network connectivity | On a 2-node Azure Stack Edge Pro 2 cluster with a teamed virtual switch for Port 1 and Port 2, if a Port 1 or Port 2 link is down, it can take up to 5 seconds to resume network connectivity on the remaining active port. If a Kubernetes cluster uses this teamed virtual switch for management traffic, pod communication may be disrupted up to 5 seconds. ||
91
-
|**3.**| Virtual machine | After the host or Kubernetes node pool VM is shut down, there is a chance that kubelet in nodepool VM fails to start due to a CPU static policy error. Nodepool VM shows **Not ready** status, and pods won't be scheduled on this VM. | Enter a support session and ssh into the nodepool VM, then follow steps in [Changin the CPU Manager Policy](https://kubernetes.io/docs/tasks/administer-cluster/cpu-management-policies/#changing-the-cpu-manager-policy) to remediate the kubelet service. |
89
+
|**1.**| Azure Storage Explorer | The Blob Endpoint certificate that's autogenerated by the Azure Stack Edge device may not work properly with Azure Storage Explorer. | Replace Blob Endpoint certificate. For detailed steps, see [Bring your own certificates](azure-stack-edge-gpu-deploy-configure-certificates.md#bring-your-own-certificates). |
90
+
|**2.**| Network connectivity | On a two-node Azure Stack Edge Pro 2 cluster with a teamed virtual switch for Port 1 and Port 2, if a Port 1 or Port 2 link is down, it can take up to 5 seconds to resume network connectivity on the remaining active port. If a Kubernetes cluster uses this teamed virtual switch for management traffic, pod communication may be disrupted up to 5 seconds. ||
91
+
|**3.**| Virtual machine | After the host or Kubernetes node pool VM is shut down, there's a chance that kubelet in node pool VM fails to start due to a CPU static policy error. Node pool VM shows **Not ready** status, and pods won't be scheduled on this VM. | Enter a support session and ssh into the node pool VM, then follow steps in [Changing the CPU Manager Policy](https://kubernetes.io/docs/tasks/administer-cluster/cpu-management-policies/#changing-the-cpu-manager-policy) to remediate the kubelet service. |
92
92
93
93
## Known issues from previous releases
94
94
95
95
The following table provides a summary of known issues carried over from the previous releases.
96
96
97
97
| No. | Feature | Issue | Workaround/comments |
98
98
| --- | --- | --- | --- |
99
-
|**1.**|Azure Stack Edge Pro + Azure SQL | Creating SQL database requires Administrator access. |Do the following steps instead of Steps 1-2 in [Create-the-sql-database](../iot-edge/tutorial-store-data-sql-server.md#create-the-sql-database). <br> 1. In the local UI of your device, enable compute interface. Select **Compute > Port # > Enable for compute > Apply.**<br> 2. Download `sqlcmd` on your client machine from [SQL command utility](/sql/tools/sqlcmd-utility). <br> 3. Connect to your compute interface IP address (the port that was enabled), adding a ",1401" to the end of the address.<br> 4. Final command will look like this: sqlcmd -S {Interface IP},1401 -U SA -P "Strong!Passw0rd". After this, steps 3-4 from the current documentation should be identical. |
99
+
|**1.**|Azure Stack Edge Pro + Azure SQL | Creating SQL database requires Administrator access. |Do the following steps instead of Steps 1-2 in [Create-the-sql-database](../iot-edge/tutorial-store-data-sql-server.md#create-the-sql-database). <br> 1. In the local UI of your device, enable compute interface. Select **Compute > Port # > Enable for compute > Apply.**<br> 2. Download `sqlcmd` on your client machine from [SQL command utility](/sql/tools/sqlcmd-utility). <br> 3. Connect to your compute interface IP address (the port that was enabled), adding a ",1401" to the end of the address.<br> 4. Final command looks like this: sqlcmd -S {Interface IP},1401 -U SA -P "Strong!Passw0rd". After this, steps 3-4 from the current documentation should be identical. |
100
100
|**2.**|Refresh| Incremental changes to blobs restored via **Refresh** are NOT supported |For Blob endpoints, partial updates of blobs after a Refresh, might result in the updates not getting uploaded to the cloud. For example, sequence of actions such as:<br> 1. Create blob in cloud. Or delete a previously uploaded blob from the device.<br> 2. Refresh blob from the cloud into the appliance using the refresh functionality.<br> 3. Update only a portion of the blob using Azure SDK REST APIs. These actions can result in the updated sections of the blob to not get updated in the cloud. <br>**Workaround**: Use tools such as robocopy, or regular file copy through Explorer or command line, to replace entire blobs.|
101
-
|**3.**|Throttling|During throttling, if new writes to the device aren't allowed, writes by the NFS client fail with a "Permission Denied" error.| The error will show as below:<br>`hcsuser@ubuntu-vm:~/nfstest$ mkdir test`<br>mkdir: can't create directory 'test': Permission denied|
101
+
|**3.**|Throttling|During throttling, if new writes to the device aren't allowed, writes by the NFS client fail with a "Permission Denied" error.| The error shows as below:<br>`hcsuser@ubuntu-vm:~/nfstest$ mkdir test`<br>mkdir: can't create directory 'test': Permission denied|
102
102
|**4.**|Blob Storage ingestion|When using AzCopy version 10 for Blob storage ingestion, run AzCopy with the following argument: `Azcopy <other arguments> --cap-mbps 2000`| If these limits aren't provided for AzCopy, it could potentially send a large number of requests to the device, resulting in issues with the service.|
103
103
|**5.**|Tiered storage accounts|The following apply when using tiered storage accounts:<br> - Only block blobs are supported. Page blobs aren't supported.<br> - There's no snapshot or copy API support.<br> - Hadoop workload ingestion through `distcp` isn't supported as it uses the copy operation heavily.||
104
104
|**6.**|NFS share connection|If multiple processes are copying to the same share, and the `nolock` attribute isn't used, you might see errors during the copy.|The `nolock` attribute must be passed to the mount command to copy files to the NFS share. For example: `C:\Users\aseuser mount -o anon \\10.1.1.211\mnt\vms Z:`.|
@@ -113,20 +113,19 @@ The following table provides a summary of known issues carried over from the pre
113
113
|**15.**|Certificates|Alerts related to signing chain certificates aren't removed from the portal even after uploading new signing chain certificates.||
114
114
|**16.**|Web proxy |NTLM authentication-based web proxy isn't supported. ||
115
115
|**17.**|Internet Explorer|If enhanced security features are enabled, you might not be able to access local web UI pages. | Disable enhanced security, and restart your browser.|
116
-
|**18.**|Kubernetes |Kubernetes doesn't support ":" in environment variable names that are used by .NET applications. This is also required for Event Grid IoT Edge module to function on Azure Stack Edge device and other applications. For more information, see [ASP.NET core documentation](/aspnet/core/fundamentals/configuration/?tabs=basicconfiguration#environment-variables).|Replace ":" by double underscore. For more information,see [Kubernetes issue](https://github.com/kubernetes/kubernetes/issues/53201)|
116
+
|**18.**|Kubernetes |Kubernetes doesn't support ":" in environment variable names that are used by .NET applications. This is also required for Event Grid IoT Edge module to function on Azure Stack Edge device and other applications. For more information, see [ASP.NET core documentation](/aspnet/core/fundamentals/configuration/?tabs=basicconfiguration#environment-variables).|Replace ":" by double underscore. For more information,see [Kubernetes issue](https://github.com/kubernetes/kubernetes/issues/53201)|
117
117
|**19.**|Azure Arc + Kubernetes cluster |By default, when resource `yamls` are deleted from the Git repository, the corresponding resources aren't deleted from the Kubernetes cluster. |To allow the deletion of resources when they're deleted from the git repository, set `--sync-garbage-collection` in Arc OperatorParams. For more information, see [Delete a configuration](../azure-arc/kubernetes/tutorial-use-gitops-connected-cluster.md#additional-parameters). |
118
118
|**20.**|NFS |Applications that use NFS share mounts on your device to write data should use Exclusive write. That ensures the writes are written to the disk.||
119
119
|**21.**|Compute configuration |Compute configuration fails in network configurations where gateways or switches or routers respond to Address Resolution Protocol (ARP) requests for systems that don't exist on the network.||
120
120
|**22.**|Compute and Kubernetes |If Kubernetes is set up first on your device, it claims all the available GPUs. Hence, it isn't possible to create Azure Resource Manager VMs using GPUs after setting up the Kubernetes. |If your device has 2 GPUs, then you can create one VM that uses the GPU and then configure Kubernetes. In this case, Kubernetes will use the remaining available one GPU. |
121
121
|**23.**|Custom script VM extension |There's a known issue in the Windows VMs that were created in an earlier release and the device was updated to 2103. <br> If you add a custom script extension on these VMs, the Windows VM Guest Agent (Version 2.7.41491.901 only) gets stuck in the update causing the extension deployment to time out. | To work around this issue: <br> 1. Connect to the Windows VM using remote desktop protocol (RDP). <br> 2. Make sure that the `waappagent.exe` is running on the machine: `Get-Process WaAppAgent`. <br> 3. If the `waappagent.exe` isn't running, restart the `rdagent` service: `Get-Service RdAgent` \| `Restart-Service`. Wait for 5 minutes.<br> 4. While the `waappagent.exe` is running, kill the `WindowsAzureGuest.exe` process. <br> 5. After you kill the process, the process starts running again with the newer version. <br> 6. Verify that the Windows VM Guest Agent version is 2.7.41491.971 using this command: `Get-Process WindowsAzureGuestAgent` \| `fl ProductVersion`.<br> 7. [Set up custom script extension on Windows VM](azure-stack-edge-gpu-deploy-virtual-machine-custom-script-extension.md). |
122
122
|**24.**|Multi-Process Service (MPS) |When the device software and the Kubernetes cluster are updated, the MPS setting isn't retained for the workloads. |[Re-enable MPS](azure-stack-edge-gpu-connect-powershell-interface.md#connect-to-the-powershell-interface) and redeploy the workloads that were using MPS. |
123
123
|**25.**|Wi-Fi |Wi-Fi doesn't work on Azure Stack Edge Pro 2 in this release. |
124
-
|**26.**|Azure IoT Edge |The managed Azure IoT Edge solution on Azure Stack Edge is running on an older, obsolete IoT Edge runtime that is at end of life. For more information, see [IoT Edge v1.1 EoL: What does that mean for me?](https://techcommunity.microsoft.com/t5/internet-of-things-blog/iot-edge-v1-1-eol-what-does-that-mean-for-me/ba-p/3662137). Although the solution does not stop working past end of life, there are no plans to update it. |To run the latest version of Azure IoT Edge [LTSs](../iot-edge/version-history.md#version-history) with the latest updates and features on their Azure Stack Edge, we **recommend** that you deploy a [customer self-managed IoT Edge solution](azure-stack-edge-gpu-deploy-iot-edge-linux-vm.md) that runs on a Linux VM. For more information, see [Move workloads from managed IoT Edge on Azure Stack Edge to an IoT Edge solution on a Linux VM](azure-stack-edge-move-to-self-service-iot-edge.md). |
125
-
|**27.**|AKS on Azure Stack Edge |In this release, you can't modify the virtual networks once the AKS cluster is deployed on your Azure Stack Edge cluster.| To modify the virtual network, you will need to delete the AKS cluster, then modify virtual networks, and then recreate AKS cluster on your Azure Stack Edge. |
126
-
|**28.**|AKS Update |The AKS Kubernetes update might fail if one of the AKS VMs is not running. This issue might be seen in the 2-node cluster. |If the AKS update has failed, [Connect to the PowerShell interface of the device](azure-stack-edge-gpu-connect-powershell-interface.md). Check the state of the Kubernetes VMs by running `Get-VM` cmdlet. If the VM is off, run the `Start-VM` cmdlet to restart the VM. Once the Kubernetes VM is running, reapply the update. |
124
+
|**26.**|Azure IoT Edge |The managed Azure IoT Edge solution on Azure Stack Edge is running on an older, obsolete IoT Edge runtime that is at end of life. For more information, see [IoT Edge v1.1 EoL: What does that mean for me?](https://techcommunity.microsoft.com/t5/internet-of-things-blog/iot-edge-v1-1-eol-what-does-that-mean-for-me/ba-p/3662137). Although the solution doesn't stop working past end of life, there are no plans to update it. |To run the latest version of Azure IoT Edge [LTSs](../iot-edge/version-history.md#version-history) with the latest updates and features on their Azure Stack Edge, we **recommend** that you deploy a [customer self-managed IoT Edge solution](azure-stack-edge-gpu-deploy-iot-edge-linux-vm.md) that runs on a Linux VM. For more information, see [Move workloads from managed IoT Edge on Azure Stack Edge to an IoT Edge solution on a Linux VM](azure-stack-edge-move-to-self-service-iot-edge.md). |
125
+
|**27.**|AKS on Azure Stack Edge |In this release, you can't modify the virtual networks once the AKS cluster is deployed on your Azure Stack Edge cluster.| To modify the virtual network, you'll need to delete the AKS cluster, then modify virtual networks, and then recreate AKS cluster on your Azure Stack Edge. |
126
+
|**28.**|AKS Update |The AKS Kubernetes update might fail if one of the AKS VMs isn't running. This issue might be seen in the two-node cluster. |If the AKS update has failed, [Connect to the PowerShell interface of the device](azure-stack-edge-gpu-connect-powershell-interface.md). Check the state of the Kubernetes VMs by running `Get-VM` cmdlet. If the VM is off, run the `Start-VM` cmdlet to restart the VM. Once the Kubernetes VM is running, reapply the update. |
127
127
|**29.**|Wi-Fi |Wi-Fi functionality for Azure Stack Edge Mini R has been deprecated. |
128
128
129
129
## Next steps
130
130
131
131
-[Update your device](azure-stack-edge-gpu-install-update.md).
0 commit comments