Commit e6a0e82

Merge remote-tracking branch 'upstream/main' into Branch-CI3606
2 parents: 8a5c419 + 7dd51b7

25 files changed: +614 / -830 lines

.openpublishing.redirection.json

Lines changed: 33 additions & 1 deletion
@@ -12883,6 +12883,38 @@
     {
       "source_path": "support/power-platform/dataverse/d365-app-outlook/current-user-role-not-have-required-permissions.md",
       "redirect_url": "/troubleshoot/power-platform/dataverse/d365-app-outlook/privilege-error-occurs-when-using-dynamics-365-app-for-outlook"
-    }
+    },
+    {
+      "source_path": "support/windows-server/performance/disable-enable-dr-watson-program.md",
+      "redirect_url": "/previous-versions/troubleshoot/windows-server/disable-enable-dr-watson-program"
+    },
+    {
+      "source_path": "support/windows-server/performance/accessing-shared-folder-from-application-fails.md",
+      "redirect_url": "/previous-versions/troubleshoot/windows-server/accessing-shared-folder-from-application-fails"
+    },
+    {
+      "source_path": "support/windows-server/performance/stop-error-code-0x00000019.md",
+      "redirect_url": "/previous-versions/troubleshoot/windows-server/stop-error-code-0x00000019"
+    },
+    {
+      "source_path": "support/windows-server/performance/stop-error-driver-irql-not-less-or-equal.md",
+      "redirect_url": "/previous-versions/troubleshoot/windows-server/stop-error-driver-irql-not-less-or-equal"
+    },
+    {
+      "source_path": "support/windows-server/performance/stop-error-0x109-on-VMWare-virtual-machine.md",
+      "redirect_url": "/previous-versions/troubleshoot/windows-server/stop-error-0x109-on-VMWare-virtual-machine"
+    },
+    {
+      "source_path": "support/windows-server/performance/stop-0x0000000a-error-processor-c1-idle-state.md",
+      "redirect_url": "/previous-versions/troubleshoot/windows-server/stop-0x0000000a-error-processor-c1-idle-state"
+    },
+    {
+      "source_path": "support/windows-server/performance/troubleshoot-stop-0xc000021a-error.md",
+      "redirect_url": "/previous-versions/troubleshoot/windows-server/troubleshoot-stop-0xc000021a-error"
+    },
+    {
+      "source_path": "support/windows-server/performance/toggle-terminal-services-application-server-mode.md",
+      "redirect_url": "/previous-versions/troubleshoot/windows-server/toggle-terminal-services-application-server-mode"
+    }
   ]
 }

support/azure/azure-kubernetes/availability-performance/cluster-node-virtual-machine-failed-state.md

Lines changed: 2 additions & 2 deletions
@@ -1,7 +1,7 @@
 ---
 title: Azure Kubernetes Service cluster/node is in a failed state
 description: Helps troubleshoot an issue where an Azure Kubernetes Service (AKS) cluster/node is in a failed state.
-ms.date: 04/01/2024
+ms.date: 03/10/2025
 ms.reviewer: chiragpa, nickoman, v-weizhu, v-six, aritraghosh
 ms.service: azure-kubernetes-service
 keywords:
@@ -114,7 +114,7 @@ If you prefer to use Azure CLI to view the activity log for a failed cluster, fo
 
 In the Azure portal, navigate to your AKS cluster resource and select **Diagnose and solve problems** from the left menu. You'll see a list of categories and scenarios that you can select to run diagnostic checks and get recommended solutions.
 
-In the Azure CLI, use the `az aks collect` command with the `--name` and `--resource-group` parameters to collect diagnostic data from your cluster nodes. You can also use the `--storage-account` and `--sas-token` parameters to specify an Azure Storage account where the data will be uploaded. The output will include a link to the **Diagnose and Solve Problems** blade where you can view the results and suggested actions.
+In the Azure CLI, use the `az aks kollect` command with the `--name` and `--resource-group` parameters to collect diagnostic data from your cluster nodes. You can also use the `--storage-account` and `--sas-token` parameters to specify an Azure Storage account where the data will be uploaded. The output will include a link to the **Diagnose and Solve Problems** blade where you can view the results and suggested actions.
 
 In the **Diagnose and Solve Problems** blade, you can select **Cluster Issues** as the category. If any issues are detected, you'll see a list of possible solutions that you can follow to fix them.
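For reference, an invocation of the `az aks kollect` command that the corrected paragraph above describes might look like the following sketch; the cluster name, resource group, storage account, and SAS token are placeholders, not values from the commit:

```bash
# Sketch only: collect diagnostic data from AKS cluster nodes and upload it
# to a storage account. All resource names and the SAS token are placeholders.
az aks kollect \
    --name MyAKSCluster \
    --resource-group MyResourceGroup \
    --storage-account mystorageaccount \
    --sas-token "sv=2024-01-01&ss=b&sig=REDACTED"
```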

support/azure/azure-kubernetes/load-bal-ingress-c/create-unmanaged-ingress-controller.md

Lines changed: 3 additions & 3 deletions
@@ -1,11 +1,11 @@
 ---
 title: Create an unmanaged ingress controller
 description: Learn how to create and configure an ingress controller in an Azure Kubernetes Service (AKS) cluster.
-ms.reviewer: allensu, v-rekhanain, v-weizhu
+ms.reviewer: allensu, v-rekhanain, jamielo, v-weizhu
 ms.service: azure-kubernetes-service
 ms.custom: sap:Load balancer and Ingress controller
 ms.topic: how-to
-ms.date: 10/17/2024
+ms.date: 03/10/2025
 ---
 # Create an unmanaged ingress controller
 
@@ -574,7 +574,7 @@ Alternatively, a more granular approach is to delete the individual resources cr
 
 To configure TLS with your existing ingress components, see [Use TLS with an ingress controller](/previous-versions/azure/aks/ingress-tls).
 
-To configure your AKS cluster to use HTTP application routing, see [Enable the HTTP application routing add-on](/previous-versions/azure/aks/http-application-routing).
+To configure your AKS cluster to use application routing, see [Application routing add-on](/azure/aks/app-routing).
 
 This article included some external components to AKS. To learn more about these components, see the following project pages:

support/azure/virtual-machines/linux/serial-console-linux.md

Lines changed: 2 additions & 2 deletions
@@ -12,7 +12,7 @@ ms.collection: linux
 ms.topic: article
 ms.tgt_pltfrm: vm-linux
 ms.workload: infrastructure-services
-ms.date: 02/10/2025
+ms.date: 03/11/2025
 ms.author: mbifeld
 ---
 
@@ -131,7 +131,7 @@ Serial Console uses the storage account configured for boot diagnostics in its c
 | UAE | UAE Central, UAE North | 20.38.141.5, 20.45.95.64, 20.45.95.65, 20.45.95.66, 20.203.93.198, 20.233.132.205, 40.120.87.50, 40.120.87.51 |
 | United Kingdom | UK South, UK West | 20.58.68.62, 20.58.68.63, 20.90.32.180, 20.90.132.144, 20.90.132.145, 51.104.30.169, 172.187.0.26, 172.187.65.53 |
 | United States | US Central, US East, US East 2, US East 2 EUAP, US North, US South, US West, US West 2, US West 3 | 4.149.249.197, 4.150.239.210, 20.14.127.175, 20.40.200.175, 20.45.242.18, 20.45.242.19, 20.45.242.20, 20.47.232.186, 20.51.21.252, 20.69.5.160, 20.69.5.161, 20.69.5.162, 20.83.222.100, 20.83.222.101, 20.83.222.102, 20.98.146.84, 20.98.146.85, 20.98.194.64, 20.98.194.65, 20.98.194.66, 20.168.188.34, 20.241.116.153, 52.159.214.194, 57.152.124.244, 68.220.123.194, 74.249.127.175, 74.249.142.218, 157.55.93.0, 168.61.232.59, 172.183.234.204, 172.191.219.35 |
-| USGov | All US Government Cloud regions | 20.140.104.48, 20.140.105.3, 20.140.144.58, 20.140.144.59, 20.140.147.168, 20.140.53.121, 20.141.10.130, 20.141.10.131, 20.141.13.121, 20.141.15.104, 52.127.55.131, 52.235.252.252, 52.235.252.253, 52.243.247.124, 52.245.155.139, 52.245.156.185, 62.10.196.24, 62.10.196.25, 62.10.84.240, 62.11.6.64, 62.11.6.65 |
+| USGov | All US Government Cloud regions | 20.140.104.48, 20.140.105.3, 20.140.144.58, 20.140.144.59, 20.140.147.168, 20.140.53.121, 20.141.10.130, 20.141.10.131, 20.141.13.121, 20.141.15.104, 52.127.55.131, 52.235.252.252, 52.235.252.253, 52.243.247.124, 52.245.155.139, 52.245.156.185, 62.10.84.240 |
 
 > [!IMPORTANT]
 > - The IPs that need to be permitted are specific to the region where the VM is located. For example, a virtual machine deployed in the North Europe region needs to add the following IP exclusions to the storage account firewall for the Europe geography: 52.146.139.220 and 20.105.209.72. View the table above to find the correct IPs for your region and geography.
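As a side note on the IMPORTANT callout above: the IP exclusions it describes can be added to the storage account firewall with the Azure CLI. The following sketch assumes a hypothetical storage account `mystorageaccount` in resource group `MyResourceGroup` and uses the two North Europe IPs quoted in the callout:

```bash
# Sketch only: permit the regional Serial Console IPs on the boot diagnostics
# storage account firewall. Account and resource group names are placeholders.
az storage account network-rule add \
    --resource-group MyResourceGroup \
    --account-name mystorageaccount \
    --ip-address 52.146.139.220
az storage account network-rule add \
    --resource-group MyResourceGroup \
    --account-name mystorageaccount \
    --ip-address 20.105.209.72
```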

support/azure/virtual-machines/linux/troubleshoot-rhel-pacemaker-cluster-services-resources-startup-issues.md

Lines changed: 94 additions & 10 deletions
@@ -1,11 +1,11 @@
 ---
-title: Troubleshoot RHEL pacemaker cluster services and resources startup issues in Azure
+title: Troubleshoot RHEL Pacemaker Cluster Services and Resources Startup Issues in Azure
 description: Provides troubleshooting guidance for issues related to cluster resources or services in RedHat Enterprise Linux (RHEL)) Pacemaker Cluster
 ms.reviewer: rnirek,srsakthi
-ms.author: skarthikeyan
+ms.author: rnirek
 author: skarthikeyan7-msft
 ms.topic: troubleshooting
-ms.date: 01/22/2025
+ms.date: 02/24/2025
 ms.service: azure-virtual-machines
 ms.collection: linux
 ms.custom: sap:Issue with Pacemaker clustering, and fencing
@@ -71,7 +71,7 @@ quorum {
 
 ### Resolution for scenario 1
 
-1. Before you make any changes, ensure you have a backup or snapshot. For more information, see [Azure VM backup](/azure/backup/backup-azure-vms-introduction).
+1. Before you make any changes, make sure that you have a backup or snapshot. For more information, see [Azure VM backup](/azure/backup/backup-azure-vms-introduction).
 
 2. Check for missing quorum section in `/etc/corosync/corosync.conf`. Compare the existing `corosync.conf` with any backup that's available in `/etc/corosync/`.
 
@@ -125,7 +125,7 @@ quorum {
 }
 ```
 
-5. Remove the cluster from maintenance-mode.
+5. Remove the cluster from maintenance mode.
 
 ```bash
 sudo pcs property set maintenance-mode=false
@@ -149,7 +149,7 @@ quorum {
 
 A virtual IP resource (`IPaddr2` resource) didn't start or stop in Pacemaker.
 
-The following error messages are logged in `/var/log/pacemaker.log`:
+The following error entries are logged in `/var/log/pacemaker.log`:
 
 ```output
 25167 IPaddr2(VIP)[16985]: 2024/09/07_15:44:19 ERROR: Unable to find nic or netmask.
@@ -208,7 +208,7 @@ vip_HN1_03_start_0 on node-1 'unknown error' (1): call=30, status=complete, exit
 
 If a route that matches the `VIP` isn't in the default routing table, you can specify the `NIC` name in the Pacemaker resource so that it can be configured to bypass the check:
 
-1. Before you make any changes, ensure you have a backup or snapshot. For more information, see [Azure VM backup](/azure/backup/backup-azure-vms-introduction).
+1. Before you make any changes, make sure that you have a backup or snapshot. For more information, see [Azure VM backup](/azure/backup/backup-azure-vms-introduction).
 
 2. Put the cluster into maintenance mode:
 
@@ -334,7 +334,7 @@ The SAP HANA resource can't be started by Pacemaker if there are `SYN` failures
 > [!Important]
 > Steps 2, 3, and 4 must be performed by using a SAP administrator account. This is because these steps use a SAP System ID to stop, start, and re-enable replication manually.
 
-1. Before you make any changes, ensure you have a backup or snapshot. For more information, see [Azure VM backup](/azure/backup/backup-azure-vms-introduction).
+1. Before you make any changes, make sure that you have a backup or snapshot. For more information, see [Azure VM backup](/azure/backup/backup-azure-vms-introduction).
 
 2. Put the cluster into maintenance mode:
@@ -512,7 +512,7 @@ This issue frequently occurs if the database is modified (manually stopped or st
 > [!Note]
 > Steps 1 through 5 should be performed by an SAP administrator.
 
-1. Before you make any changes, ensure you have a backup or snapshot. For more information, see [Azure VM backup](/azure/backup/backup-azure-vms-introduction).
+1. Before you make any changes, make sure that you have a backup or snapshot. For more information, see [Azure VM backup](/azure/backup/backup-azure-vms-introduction).
 
 2. Put the cluster into maintenance mode:
 
@@ -620,7 +620,7 @@ Because of incorrect `InstanceName` and `START_PROFILE` attributes, the SAP inst
 > [!Note]
 > This resolution is applicable if `InstanceName` and `START_PROFILE` are separate files.
 
-1. Before you make any changes, ensure you have a backup or snapshot. For more information, see [Azure VM backup](/azure/backup/backup-azure-vms-introduction).
+1. Before you make any changes, make sure that you have a backup or snapshot. For more information, see [Azure VM backup](/azure/backup/backup-azure-vms-introduction).
 
 2. Put the cluster into maintenance mode:
@@ -659,6 +659,90 @@ Because of incorrect `InstanceName` and `START_PROFILE` attributes, the SAP inst
    sudo pcs property set maintenance-mode=false
    ```
 
+## Scenario 5: Fenced node doesn't rejoin the cluster
+
+### Symptom for scenario 5
+
+After a fencing operation finishes, the affected node typically doesn't rejoin the Pacemaker cluster. The Pacemaker and Corosync services remain stopped until they're started manually to bring the cluster back online.
+
+### Cause for scenario 5
+
+After the node is fenced, restarted, and has restarted its cluster services, it receives a message that states `We were allegedly just fenced`. This message causes it to shut down its Pacemaker and Corosync services and prevents the cluster from starting. Consider the following example: node1 initiates a STONITH action against node2. At `03:27:23`, when the network issue is resolved, node2 rejoins the Corosync membership. Consequently, a new two-node membership is established, as shown in `/var/log/messages` on node1:
+
+```output
+Feb 20 03:26:56 node1 corosync[1722]: [TOTEM ] A processor failed, forming new configuration.
+Feb 20 03:27:23 node1 corosync[1722]: [TOTEM ] A new membership (1.116f4) was formed. Members left: 2
+Feb 20 03:27:24 node1 corosync[1722]: [QUORUM] Members[1]: 1
+...
+Feb 20 03:27:24 node1 pacemaker-schedulerd[1739]: warning: Cluster node node2 will be fenced: peer is no longer part of the cluster
+...
+Feb 20 03:27:24 node1 pacemaker-fenced[1736]: notice: Delaying 'reboot' action targeting node2 using for 20s
+Feb 20 03:27:25 node1 corosync[1722]: [TOTEM ] A new membership (1.116f8) was formed. Members joined: 2
+Feb 20 03:27:25 node1 corosync[1722]: [QUORUM] Members[2]: 1 2
+Feb 20 03:27:25 node1 corosync[1722]: [MAIN ] Completed service synchronization, ready to provide service.
+```
+
+Node1 then received confirmation that node2 was successfully restarted, as shown in `/var/log/messages` on node1:
+
+```output
+Feb 20 03:27:46 node1 pacemaker-fenced[1736]: notice: Operation 'reboot' [43895] (call 28 from pacemaker-controld.1740) targeting node2 using xvm2 returned 0 (OK)
+```
+
+To fully complete the STONITH action, the system had to deliver the confirmation message to every node. Because node2 rejoined the membership at `03:27:25`, and no new membership that excluded node2 had yet formed (the token and consensus timeouts hadn't expired), the confirmation message was delayed until node2 restarted its cluster services after startup. Upon receiving the message, node2 recognized that it had been fenced and, consequently, shut down its services, as the following entries show.
+
+`/var/log/messages` on node1:
+
+```output
+Feb 20 03:29:02 node1 corosync[1722]: [TOTEM ] A processor failed, forming new configuration.
+Feb 20 03:29:10 node1 corosync[1722]: [TOTEM ] A new membership (1.116fc) was formed. Members joined: 2 left: 2
+Feb 20 03:29:10 node1 corosync[1722]: [QUORUM] Members[2]: 1 2
+Feb 20 03:29:10 node1 pacemaker-fenced[1736]: notice: Operation 'reboot' targeting node2 by node1 for pacemaker-controld.1740@node1: OK
+Feb 20 03:29:10 node1 pacemaker-controld[1740]: notice: Peer node2 was terminated (reboot) by node1 on behalf of pacemaker-controld.1740: OK
+...
+Feb 20 03:29:11 node1 corosync[1722]: [CFG ] Node 2 was shut down by sysadmin
+Feb 20 03:29:11 node1 corosync[1722]: [TOTEM ] A new membership (1.11700) was formed. Members left: 2
+Feb 20 03:29:11 node1 corosync[1722]: [QUORUM] Members[1]: 1
+Feb 20 03:29:11 node1 corosync[1722]: [MAIN ] Completed service synchronization, ready to provide service.
+```
+
+`/var/log/messages` on node2:
+
+```output
+Feb 20 03:29:11 [1155] node2 corosync notice [TOTEM ] A new membership (1.116fc) was formed. Members joined: 1
+Feb 20 03:29:11 [1155] node2 corosync notice [QUORUM] Members[2]: 1 2
+Feb 20 03:29:09 node2 pacemaker-controld [1323] (tengine_stonith_notify) crit: We were allegedly just fenced by node1 for node1!
+```
+
+### Resolution for scenario 5
+
+Configure a startup delay for the Corosync service. This pause provides sufficient time for a new Closed Process Group (CPG) membership to form and exclude the fenced node, so that the STONITH restart process can finish by making sure that the completion message reaches all nodes in the membership.
+
+To achieve this effect, run the following commands:
+
+1. Put the cluster into maintenance mode:
+
+    ```bash
+    sudo pcs property set maintenance-mode=true
+    ```
+
+2. Create a systemd drop-in file on all the nodes in the cluster:
+
+    - Edit the Corosync service:
+
+      ```bash
+      sudo systemctl edit corosync.service
+      ```
+
+    - Add the following lines:
+
+      ```config
+      [Service]
+      ExecStartPre=/bin/sleep 60
+      ```
+
+    - After you save the file and exit the text editor, reload the systemd manager configuration:
+
+      ```bash
+      sudo systemctl daemon-reload
+      ```
+
+3. Remove the cluster from maintenance mode:
+
+    ```bash
+    sudo pcs property set maintenance-mode=false
+    ```
+
+For more information, see [Fenced Node Fails to Rejoin Cluster Without Manual Intervention](https://access.redhat.com/solutions/5644441).
+
 ## Next steps
 
 For additional help, open a support request by using the following instructions. When you submit your request, attach the [SOS report](https://access.redhat.com/solutions/3592) from all the nodes in the cluster for troubleshooting.
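As a follow-up to the Corosync drop-in added in scenario 5 above, one way to verify the override on each node (a suggested check, not part of the commit) is:

```bash
# Sketch only: print the merged unit definition, including drop-ins, and the
# effective ExecStartPre entry that adds the 60-second start delay.
sudo systemctl cat corosync.service
sudo systemctl show corosync.service -p ExecStartPre
```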

support/azure/virtual-machines/windows/serial-console-windows.md

Lines changed: 2 additions & 2 deletions
@@ -11,7 +11,7 @@ ms.collection: windows
 ms.topic: article
 ms.tgt_pltfrm: vm-windows
 ms.workload: infrastructure-services
-ms.date: 01/10/2025
+ms.date: 03/11/2025
 ms.author: mbifeld
 ms.custom: sap:VM Admin - Windows (Guest OS)
 ---
@@ -185,7 +185,7 @@ Serial Console uses the storage account configured for boot diagnostics in its c
 | UAE | UAE Central, UAE North | 20.38.141.5, 20.45.95.64, 20.45.95.65, 20.45.95.66, 20.203.93.198, 20.233.132.205, 40.120.87.50, 40.120.87.51 |
 | United Kingdom | UK South, UK West | 20.58.68.62, 20.58.68.63, 20.90.32.180, 20.90.132.144, 20.90.132.145, 51.104.30.169, 172.187.0.26, 172.187.65.53 |
 | United States | US Central, US East, US East 2, US East 2 EUAP, US North, US South, US West, US West 2, US West 3 | 4.149.249.197, 4.150.239.210, 20.14.127.175, 20.40.200.175, 20.45.242.18, 20.45.242.19, 20.45.242.20, 20.47.232.186, 20.51.21.252, 20.69.5.160, 20.69.5.161, 20.69.5.162, 20.83.222.100, 20.83.222.101, 20.83.222.102, 20.98.146.84, 20.98.146.85, 20.98.194.64, 20.98.194.65, 20.98.194.66, 20.168.188.34, 20.241.116.153, 52.159.214.194, 57.152.124.244, 68.220.123.194, 74.249.127.175, 74.249.142.218, 157.55.93.0, 168.61.232.59, 172.183.234.204, 172.191.219.35 |
-| USGov | All US Government Cloud regions | 20.140.104.48, 20.140.105.3, 20.140.144.58, 20.140.144.59, 20.140.147.168, 20.140.53.121, 20.141.10.130, 20.141.10.131, 20.141.13.121, 20.141.15.104, 52.127.55.131, 52.235.252.252, 52.235.252.253, 52.243.247.124, 52.245.155.139, 52.245.156.185, 62.10.196.24, 62.10.196.25, 62.10.84.240, 62.11.6.64, 62.11.6.65 |
+| USGov | All US Government Cloud regions | 20.140.104.48, 20.140.105.3, 20.140.144.58, 20.140.144.59, 20.140.147.168, 20.140.53.121, 20.141.10.130, 20.141.10.131, 20.141.13.121, 20.141.15.104, 52.127.55.131, 52.235.252.252, 52.235.252.253, 52.243.247.124, 52.245.155.139, 52.245.156.185, 62.10.84.240 |
 
 > [!IMPORTANT]
 > - The IPs that need to be permitted are specific to the region where the VM is located. For example, a virtual machine deployed in the North Europe region needs to add the following IP exclusions to the storage account firewall for the Europe geography: 52.146.139.220 and 20.105.209.72. View the table above to find the correct IPs for your region and geography.
