
Commit 9864548

Moving host maintenance to a separate article
1 parent 6d7c50f commit 9864548

2 files changed: +9 -50 lines changed
articles/azure-vmware/architecture-private-clouds copy.md

Lines changed: 9 additions & 4 deletions
@@ -59,10 +59,15 @@ Azure VMware Solution monitors the following conditions on the host:
5959
## Alert Codes and Remediation Table
6060
| Error Code | Error Details | Recommended Action |
6161
|--------------------|---------------------------------|---------------------|
62-
| EPC_SCSIDEVICE_SHARINGMODE | This error is encountered when a Virtual Machine is configured to use a device that prevents a maintenance operation: A device that is a SCSI controller which is engaged in bus-sharing | Follow the KB article for the removal of any SCSI controller engaged in bus-sharing attached to VMs  https://knowledge.broadcom.com/external/article?legacyId=79910 |
63-
| EPC_CDROM_EMULATEMODE | This error is encountered when CD-ROM on the Virtual Machine uses emulate mode, whose ISO image is not accessible | Follow the KB article for the removal of any CDROM mounted on customer's workload Virtual Machines in emulate mode or detach ISO. It is recommended to use Passthrough mode for mounting any CD-ROM. https://knowledge.broadcom.com/external/article?legacyId=79306 |
64-
| EPC_DATASTORE_INACCESSIBLE | This error is encountered when any external Datastore attached to AVS Private Cloud becomes inaccessible | Follow the KB article for the removal of any stale Datastore attached to cluster /azure/azure-vmware/attach-azure-netapp-files-to-azure-vmware-solution-hosts?tabs=azure-portal#performance-best-practices |
65-
| EPC_NWADAPTER_STALE | This error is encountered when connected Network interface on the Virtual Machine uses network adapter which becomes inaccessible | Follow the KB article for the removal of any stale N/W adapters attached to Virtual Machines https://knowledge.broadcom.com/external/article/318738/troubleshooting-the-migration-compatibil.html |
62+
| EPC_CDROM_EMULATEMODE | This error is encountered when CD-ROM on the Virtual Machine uses emulate mode, whose ISO image is not accessible | Follow [this KB article](https://knowledge.broadcom.com/external/article?legacyId=79306) for the removal of any CDROM mounted on customer's workload Virtual Machines in emulate mode or detach ISO. It is recommended to use "Passthrough mode" for mounting any CD-ROM. |
63+
| EPC_DRSOVERRIDERULE | This error is encountered when there is a Virtual Machine with DRS Override set to “Disabled” mode. | VM should not block vMotion while putting host into maintenance. Set Partially Automated DRS rules for the VM. Refer to [this document](/azure/azure-vmware/create-placement-policy#enable-restrict-vm-movement-for-specific-vms) to know more about VM placement policies. |
64+
| EPC_SCSIDEVICE_SHARINGMODE | This error is encountered when a Virtual Machine is configured to use a device that prevents a maintenance operation: A device that is a SCSI controller which is engaged in bus-sharing | Follow [this KB article](https://knowledge.broadcom.com/external/article?legacyId=79910) for the removal of any SCSI controller engaged in bus-sharing attached to VMs |
65+
| EPC_DATASTORE_INACCESSIBLE | This error is encountered when any external Datastore attached to AVS Private Cloud becomes inaccessible | Follow [this article](/azure/azure-vmware/attach-azure-netapp-files-to-azure-vmware-solution-hosts?tabs=azure-portal#performance-best-practices) for the removal of any stale Datastore attached to cluster |
66+
| EPC_NWADAPTER_STALE | This error is encountered when connected Network interface on the Virtual Machine uses network adapter which becomes inaccessible | Follow [this KB article](https://knowledge.broadcom.com/external/article/318738/troubleshooting-the-migration-compatibil.html) for the removal of any stale N/W adapters attached to Virtual Machines |
67+
| EPC_SERIAL_PORT | This error is encountered when a Virtual Machine’s serial port is connected to a device that cannot be accessed on the destination host. | If you are using an image file (ISO, FLP, and so on), ensure that it is accessible from all ESXi servers on the cluster. Store the files on a data store that is shared between all ESXi servers that will participate in vMotion of the virtual machine. Refer to [this KB article](https://knowledge.broadcom.com/external/article/324829/vmotion-fails-with-the-compatibility-err.html) from Broadcom for more information. |
68+
| EPC_HARDWARE_DEVICE | This error is encountered when a Virtual Machine’s parallel Port/USB Device is connected to a device cannot be accessed on the destination host. | If you are using an image file (ISO, FLP, and so on), ensure that it is accessible from all ESXi servers of the cluster. Store the files on a data store that is shared between all ESXi servers that will participate in the vMotion of the virtual machine. Refer to [this KB article](https://knowledge.broadcom.com/external/article/324829/vmotion-fails-with-the-compatibility-err.html) from Broadcom for more information. |
69+
| EPC_INVALIDVM / EPC_ORPHANVM | This error is encountered when there is an orphaned or Invalid VM in the inventory | Ensure all your Virtual Machines are accessible to the vCenter. Refer to [this KB article](https://knowledge.broadcom.com/external/article/312831/virtual-machines-appear-as-invalid-or-or.html) for more information |
70+
6671

6772
> [!NOTE]
6873
> Azure VMware Solution tenant admins must not edit or delete the previously defined VMware vCenter Server alarms because they are managed by the Azure VMware Solution control plane on vCenter Server. These alarms are used by Azure VMware Solution monitoring to trigger the Azure VMware Solution host remediation process.
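
The alert codes in the table above lend themselves to a simple lookup when triaging a blocked maintenance operation. The following is a minimal sketch, not an Azure VMware Solution API: the `REMEDIATION` dictionary and the `recommended_action` helper are hypothetical names, and the action text is a condensed paraphrase of the "Recommended Action" column.

```python
# Hypothetical lookup built from the alert codes in the table above.
# The summaries paraphrase the "Recommended Action" column; they are not
# returned by any Azure VMware Solution API.
_SHARED_IMAGE_ADVICE = (
    "Ensure any image file (ISO, FLP, and so on) is on a datastore shared "
    "by all ESXi servers that participate in vMotion of the VM."
)
_INVENTORY_ADVICE = "Ensure all virtual machines are accessible to vCenter."

REMEDIATION = {
    "EPC_CDROM_EMULATEMODE": (
        "Remove the emulate-mode CD-ROM or detach the ISO; "
        "use passthrough mode for mounting any CD-ROM."
    ),
    "EPC_DRSOVERRIDERULE": (
        "Set Partially Automated DRS rules for the VM so it does not block vMotion."
    ),
    "EPC_SCSIDEVICE_SHARINGMODE": (
        "Remove any SCSI controller engaged in bus-sharing attached to VMs."
    ),
    "EPC_DATASTORE_INACCESSIBLE": (
        "Remove any stale datastore attached to the cluster."
    ),
    "EPC_NWADAPTER_STALE": (
        "Remove any stale network adapters attached to virtual machines."
    ),
    "EPC_SERIAL_PORT": _SHARED_IMAGE_ADVICE,
    "EPC_HARDWARE_DEVICE": _SHARED_IMAGE_ADVICE,
    # The table lists these two codes in a single row.
    "EPC_INVALIDVM": _INVENTORY_ADVICE,
    "EPC_ORPHANVM": _INVENTORY_ADVICE,
}


def recommended_action(code: str) -> str:
    """Return the condensed remediation text for an alert code."""
    key = code.strip().upper()
    # Fall back to a pointer at the table rather than guessing an action.
    return REMEDIATION.get(
        key, f"Unknown alert code {key!r}; consult the remediation table."
    )


if __name__ == "__main__":
    for alert in ("EPC_NWADAPTER_STALE", "epc_orphanvm", "EPC_SOMETHING_ELSE"):
        print(f"{alert}: {recommended_action(alert)}")
```

Keeping the two combined codes as separate keys lets either one be looked up directly, and normalizing case and whitespace tolerates codes copied from an email or log line.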

articles/azure-vmware/architecture-private-clouds.md

Lines changed: 0 additions & 46 deletions
@@ -125,52 +125,6 @@ Microsoft is a member of the VMware Metal-as-a-Service (MaaS) program and uses t
125125

126126
[!INCLUDE [vmware-software-versions](includes/vmware-software-versions.md)]
127127

128-
## Host maintenance and lifecycle management
129-
130-
[!INCLUDE [vmware-software-update-frequency](includes/vmware-software-update-frequency.md)]
131-
132-
## Host monitoring and remediation
133-
134-
Azure VMware Solution continuously monitors the health of both the VMware components and underlay. When Azure VMware Solution detects a failure, it takes action to repair the failed components. When Azure VMware Solution detects a degradation or failure on an Azure VMware Solution node, it triggers the host remediation process.
135-
136-
Host remediation involves replacing the faulty node with a new healthy node in the cluster. Then, when possible, the faulty host is placed in VMware vSphere maintenance mode. VMware vSphere vMotion moves the VMs off the faulty host to other available servers in the cluster, potentially allowing zero downtime for live migration of workloads. If the faulty host can't be placed in maintenance mode, the host is removed from the cluster. Before the faulty host is removed, the customer workloads are migrated to a newly added host.
137-
138-
> [!TIP]
139-
> **Customer communication:** An email is sent to the customer's email address before the replacement is initiated and again after the replacement is successful.
140-
>
141-
> To receive emails related to host replacement, you need to be added to any of the following Azure RBAC roles in the subscription: 'ServiceAdmin', 'CoAdmin', 'Owner', 'Contributor'.
142-
143-
Azure VMware Solution monitors the following conditions on the host:
144-
145-
- Processor status
146-
- Memory status
147-
- Connection and power state
148-
- Hardware fan status
149-
- Network connectivity loss
150-
- Hardware system board status
151-
- Errors occurred on the disk(s) of a vSAN host
152-
- Hardware voltage
153-
- Hardware temperature status
154-
- Hardware power status
155-
- Storage status
156-
- Connection failure
157-
158-
## Alert Codes and Remediation Table
159-
| Error Code | Error Details | Recommended Action |
160-
|--------------------|---------------------------------|---------------------|
161-
| EPC_CDROM_EMULATEMODE | This error is encountered when CD-ROM on the Virtual Machine uses emulate mode, whose ISO image is not accessible | Follow [this KB article](https://knowledge.broadcom.com/external/article?legacyId=79306) for the removal of any CDROM mounted on customer's workload Virtual Machines in emulate mode or detach ISO. It is recommended to use "Passthrough mode" for mounting any CD-ROM. |
162-
| EPC_DRSOVERRIDERULE | This error is encountered when there is a Virtual Machine with DRS Override set to “Disabled” mode. | VM should not block vMotion while putting host into maintenance. Set Partially Automated DRS rules for the VM. Refer to [this document](/azure/azure-vmware/create-placement-policy#enable-restrict-vm-movement-for-specific-vms) to know more about VM placement policies. |
163-
| EPC_SCSIDEVICE_SHARINGMODE | This error is encountered when a Virtual Machine is configured to use a device that prevents a maintenance operation: A device that is a SCSI controller which is engaged in bus-sharing | Follow [this KB article](https://knowledge.broadcom.com/external/article?legacyId=79910) for the removal of any SCSI controller engaged in bus-sharing attached to VMs |
164-
| EPC_DATASTORE_INACCESSIBLE | This error is encountered when any external Datastore attached to AVS Private Cloud becomes inaccessible | Follow [this article](/azure/azure-vmware/attach-azure-netapp-files-to-azure-vmware-solution-hosts?tabs=azure-portal#performance-best-practices) for the removal of any stale Datastore attached to cluster |
165-
| EPC_NWADAPTER_STALE | This error is encountered when connected Network interface on the Virtual Machine uses network adapter which becomes inaccessible | Follow [this KB article](https://knowledge.broadcom.com/external/article/318738/troubleshooting-the-migration-compatibil.html) for the removal of any stale N/W adapters attached to Virtual Machines |
166-
| EPC_SERIAL_PORT | This error is encountered when a Virtual Machine’s serial port is connected to a device that cannot be accessed on the destination host. | If you are using an image file (ISO, FLP, and so on), ensure that it is accessible from all ESXi servers on the cluster. Store the files on a data store that is shared between all ESXi servers that will participate in vMotion of the virtual machine. Refer to [this KB article](https://knowledge.broadcom.com/external/article/324829/vmotion-fails-with-the-compatibility-err.html) from Broadcom for more information. |
167-
| EPC_HARDWARE_DEVICE | This error is encountered when a Virtual Machine’s parallel Port/USB Device is connected to a device cannot be accessed on the destination host. | If you are using an image file (ISO, FLP, and so on), ensure that it is accessible from all ESXi servers of the cluster. Store the files on a data store that is shared between all ESXi servers that will participate in the vMotion of the virtual machine. Refer to [this KB article](https://knowledge.broadcom.com/external/article/324829/vmotion-fails-with-the-compatibility-err.html) from Broadcom for more information. |
168-
| EPC_INVALIDVM / EPC_ORPHANVM | This error is encountered when there is an orphaned or Invalid VM in the inventory | Ensure all your Virtual Machines are accessible to the vCenter. Refer to [this KB article](https://knowledge.broadcom.com/external/article/312831/virtual-machines-appear-as-invalid-or-or.html) for more information |
169-
170-
171-
> [!NOTE]
172-
> Azure VMware Solution tenant admins must not edit or delete the previously defined VMware vCenter Server alarms because they are managed by the Azure VMware Solution control plane on vCenter Server. These alarms are used by Azure VMware Solution monitoring to trigger the Azure VMware Solution host remediation process.
173-
174128
## Backup and restore
175129

176130
Azure VMware Solution private cloud vCenter Server and HCX Manager (if enabled) configurations are on a daily backup schedule and NSX configuration has an hourly backup schedule. The backups are retained for a minimum of three days. Open a [support request](https://rc.portal.azure.com/#create/Microsoft.Support) in the Azure portal to request restoration.
