title: Repair a Linux VM automatically with the help of ALAR
3
-
description: This article describes how to automatically repair a nonbootable VM with the Azure Linux Auto Repair (ALAR) scripts.
3
+
description: This article describes how to automatically repair a non-bootable VM with the Azure Linux Auto Repair (ALAR) scripts.
4
4
services: virtual-machines-linux
5
5
documentationcenter: ''
6
-
author: malachma
7
-
manager: noambi
6
+
author: pagienge
8
7
editor: v-jsitser
9
8
tags: virtual-machines
10
9
ms.custom: sap:VM Admin - Linux (Guest OS), linux-related-content
@@ -13,8 +12,8 @@ ms.topic: troubleshooting
13
12
ms.workload: infrastructure-services
14
13
ms.tgt_pltfrm: vm-linux
15
14
ms.devlang: azurecli
16
-
ms.date: 09/24/2024
17
-
ms.author: malachma
15
+
ms.date: 10/31/2025
16
+
ms.author: pagienge
18
17
---
19
18
20
19
# Use Azure Linux Auto Repair (ALAR) to fix a Linux VM
@@ -27,13 +26,61 @@ ALAR utilizes the VM repair extension that's described in [Repair a Linux VM by
27
26
28
27
ALAR covers the following repair scenarios:
29
28
30
-
- Malformed /etc/fstab
31
-
syntax error
32
-
missing disk
33
-
- Damaged initrd or missing initrd line in the /boot/grub/grub.cfg
34
-
- Last installed kernel isn't bootable
35
-
- Serial console and GRUB serial are incorrectly configured or are missing
36
-
- GRUB/EFI installation or configuration damaged
29
+
- No-boot scenarios
30
+
- Malformed */etc/fstab*
31
+
- syntax error
32
+
- missing disk
33
+
- Damaged initrd or missing initrd line in the */boot/grub/grub.cfg*
34
+
- Last installed kernel isn't bootable
35
+
- GRUB/EFI installation or configuration damaged
36
+
- Disk space/auditd forced shutdowns
37
+
- Configuration issues
38
+
- Serial console and GRUB serial are incorrectly configured or are missing
39
+
- Sudo misconfiguration
40
+
41
+
## How to use ALAR
42
+
43
+
The ALAR scripts use the `run` command of the [az vm repair](/cli/azure/vm/repair) extension and its `--run-id` option. The value of the `--run-id` option for the automated recovery is `linux-alar2`. To fix a Linux VM by using an ALAR script, follow these steps:
44
+
45
+
> [!NOTE]
46
+
> The VM Contributor role doesn't provide enough permissions to run these scripted operations because they require permissions to read, write, and delete resources in the resource group that includes the target VM. Therefore, a role such as Contributor or Owner at the resource group level is required.
47
+
48
+
1. Create a rescue VM:
49
+
50
+
```azurecli-interactive
51
+
az vm repair create --verbose --resource-group <RG-NAME> --name <VM-NAME>
52
+
```
53
+
54
+
- Three parameters currently prompt for values if they aren't given on the command line. To run the command non-interactively, add these parameters (and values, where shown) to the command:
55
+
- `--repair-username <RESCUE-USERNAME>`
56
+
- `--repair-password <RESCUE-PASS>`
57
+
- `--associate-public-ip`
58
+
- For more options that control how the repair VM is created, see the [az vm repair](/cli/azure/vm/repair) documentation.
59
+
60
+
2. Run the `linux-alar2` script, along with parameters for one or more of the ALAR actions on the rescue VM:
61
+
62
+
```azurecli-interactive
63
+
az vm repair run --verbose --resource-group <RG-NAME> --name <VM-NAME> --run-id linux-alar2 --parameters <action1,action2,...> --run-on-repair
64
+
```
65
+
66
+
For valid action names, see the [ALAR actions](#the-alar-actions) section.
67
+
68
+
3. Swap the copy of the OS disk back to the original VM and delete the temporary resources:
69
+
70
+
```azurecli-interactive
71
+
az vm repair restore --verbose --resource-group <RG-NAME> --name <VM-NAME>
72
+
```
73
+
74
+
> [!NOTE]
75
+
> The original and new disks aren't deleted during the `restore` phase.
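
The three steps above can be chained in a small wrapper. The following is a minimal sketch, not part of ALAR or the `az vm repair` extension: the function names `run` and `alar_repair` and the `DRY_RUN` variable are hypothetical, and only the `az` commands and options already shown in this article are used. With `DRY_RUN` left at its default of `1`, the commands are printed instead of executed, so the sequence can be reviewed first.

```shell
# Sketch only: 'run', 'alar_repair', and DRY_RUN are hypothetical helper names.
# DRY_RUN=1 (the default) prints each command; DRY_RUN=0 executes it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Usage: alar_repair <RG-NAME> <VM-NAME> <action1,action2,...>
# The steps are chained with && so that 'restore' only runs
# if 'create' and 'run' both succeeded.
alar_repair() {
  local rg="$1" vm="$2" actions="$3"
  run az vm repair create --verbose --resource-group "$rg" --name "$vm" &&
    run az vm repair run --verbose --resource-group "$rg" --name "$vm" \
      --run-id linux-alar2 --parameters "$actions" --run-on-repair &&
    run az vm repair restore --verbose --resource-group "$rg" --name "$vm"
}
```

For example, `alar_repair myRG myVM fstab,sudo` prints the three commands, and `DRY_RUN=0 alar_repair myRG myVM fstab,sudo` runs them. When executed for real, the `create` step still prompts for any of the three values that aren't supplied on the command line.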
76
+
77
+
The example commands use the following placeholders:
78
+
79
+
- `RG-NAME`: The name of the resource group containing the broken VM.
80
+
- `VM-NAME`: The name of the broken VM.
81
+
- `RESCUE-USERNAME`: The user created on the repair VM for login. It's the equivalent of the user created on a new VM in the Azure portal.
82
+
- `RESCUE-PASS`: The password for `RESCUE-USERNAME`, enclosed in single quotes. For example: `'password!234'`.
83
+
- `action1,action2`, etc.: One or more of the defined actions to apply to the broken VM. For a complete list of actions, see the ALAR actions section and the [ALAR GitHub ReadMe](https://github.com/Azure/ALAR). Multiple actions run consecutively. Separate them with commas and no spaces, such as `fstab,sudo`.
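
Because the `--parameters` value must be comma-separated with no spaces, it can help to build it with a small helper. The following is a minimal sketch: the `join_actions` function and the `ALAR_ACTIONS` variable are hypothetical names, and the list of actions is taken from the headings in this article, so it might be out of date compared with the ALAR GitHub ReadMe.

```shell
# Sketch only: action names copied from this article's section headings.
ALAR_ACTIONS="fstab efifix grubfix initrd kernel serialconsole sudo auditd"

# Print the arguments as a single comma-separated value for --parameters.
# Returns 1 if no action is given or if an argument isn't a known action name.
join_actions() {
  local joined="" action
  [ "$#" -gt 0 ] || { echo "no ALAR action given" >&2; return 1; }
  for action in "$@"; do
    # Reject empty names and anything other than lowercase letters
    # (this also rejects embedded spaces and commas).
    case "$action" in
      ""|*[!a-z]*) echo "invalid ALAR action: $action" >&2; return 1 ;;
    esac
    case " $ALAR_ACTIONS " in
      *" $action "*) ;;
      *) echo "unknown ALAR action: $action" >&2; return 1 ;;
    esac
    joined="${joined:+$joined,}$action"
  done
  printf '%s\n' "$joined"
}
```

For example, `join_actions fstab sudo` prints `fstab,sudo`, which can be passed directly as the `--parameters` value.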
37
84
38
85
## The ALAR actions
39
86
@@ -43,11 +90,13 @@ This action strips off any lines in the */etc/fstab* file that aren't needed to
43
90
44
91
For more information about issues with a malformed */etc/fstab* file, see [Troubleshoot Linux VM starting issues because of fstab errors](./linux-virtual-machine-cannot-start-fstab-errors.md).
45
92
46
-
### kernel
93
+
### efifix
47
94
48
-
This action changes the default kernel. The script replaces the broken kernel with the previously installed version.
95
+
This action reinstalls the software that's required to boot a Generation 2 (Gen2) VM. The *grub.cfg* file is also regenerated.
49
96
50
-
For more information about messages that might be logged on the serial console for kernel-related startup events, see [How to recover an Azure Linux virtual machine from kernel-related boot issues](kernel-related-boot-issues.md).
97
+
### grubfix
98
+
99
+
This action can be used to reinstall GRUB and regenerate the *grub.cfg* file.
51
100
52
101
### initrd
53
102
@@ -64,63 +113,30 @@ In both cases, the following information is logged before the error entries are
This action changes the default kernel by replacing the default/broken kernel with a previously installed version.
119
+
120
+
For more information about messages that might be logged on the serial console for kernel-related startup events, see [How to recover an Azure Linux virtual machine from kernel-related boot issues](kernel-related-boot-issues.md).
121
+
67
122
### serialconsole
68
123
69
124
This action corrects an incorrect or malformed serial console configuration for the Linux kernel or GRUB. We recommend that you run this action in the following cases:
70
125
71
126
- No GRUB menu is displayed at VM startup.
72
127
- No operating system related information is written to the serial console.
73
128
74
-
### grubfix
75
-
76
-
This action can be used to reinstall GRUB and regenerate the *grub.cfg* file.
77
-
78
-
### efifix
129
+
### sudo
79
130
80
-
This action can be used to reinstall the required software to boot from a GEN2 VM. The *grub.cfg* file is also regenerated.
131
+
The `sudo` action resets the permissions on the */etc/sudoers* file and all files in */etc/sudoers.d* to the required 0440 mode and checks other best practices. A basic check detects and reports duplicate user entries, and moves only the */etc/sudoers.d/waagent* file if it conflicts with other files.
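
To see which files the 0440 requirement applies to, the modes can be inspected before or after the action runs. The following is a read-only sketch and isn't how ALAR itself is implemented: the function name `list_bad_sudoers_modes` is hypothetical, and the optional path arguments exist only so that the check can be pointed at a mounted copy of the OS disk or at test files.

```shell
# Sketch only: list sudoers files whose mode is not exactly 0440.
# Nothing is changed; arguments default to the standard locations.
# Usage: list_bad_sudoers_modes [SUDOERS_FILE] [SUDOERS_DIR]
list_bad_sudoers_modes() {
  local sudoers_file="${1:-/etc/sudoers}" sudoers_dir="${2:-/etc/sudoers.d}"
  local path
  for path in "$sudoers_file" "$sudoers_dir"; do
    # Skip locations that don't exist on this system.
    [ -e "$path" ] || continue
    # '-perm 0440' matches the mode exactly; '!' inverts the match.
    find "$path" -type f ! -perm 0440 -print
  done
}
```

An empty result means that every file checked already has mode 0440. Reading these locations on a live system typically requires root privileges.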
81
132
82
133
### auditd
83
134
84
-
If your VM shuts down immediately upon startup due to the audit daemon configuration, use this action. This action modifies the audit daemon configuration (in the */etc/audit/auditd.conf* file) by changing the `HALT` value configured for any `action` parameters to `SYSLOG`, which doesn't force the system to shut down. In a Logical Volume Manager (LVM) environment, if the logical volume that contains the audit logs is full and there's available space in the volume group, the logical volume will also be extended by 10% of the current size. However, if you're not using an LVM environment or there's no available space, only the configuration file is altered.
135
+
If your VM shuts down immediately upon startup due to the audit daemon configuration, use this action. This action modifies the audit daemon configuration (in the */etc/audit/auditd.conf* file) by changing the `HALT` value configured for any `action` parameters to `SYSLOG`, which doesn't force the system to shut down. In a Logical Volume Manager (LVM) environment, if the logical volume that contains the audit logs is full and there's available space in the volume group, the logical volume can be extended by 10% of the current size. However, if you're not using an LVM environment or there's no available space, only the `auditd` configuration file is altered.
85
136
86
137
> [!IMPORTANT]
87
-
> This action will change the VM's security posture by altering the audit daemon configuration so that the VM shutdown issue can be resolved. Once the VM is running and accessible, you need to revert the audit daemon configuration to the original state. For this purpose, a backup of the *auditd.conf* file is created in */etc/audit* by the ALAR action.
88
-
89
-
## How to use ALAR
90
-
91
-
The ALAR scripts use the repair extension `run` command and its `--run-id` option. The value of the `--run-id` option for the automated recovery is `linux-alar2`. To fix a Linux VM by using an ALAR script, follow these steps:
92
-
93
-
> [!NOTE]
94
-
> The VM Contributor role doesn't provide enough permissions to run the scripts, as they require permissions to read, write, and delete resources in the resource group that includes the target VM. Therefore roles such as Contributor or Owner at the resource group level is required.
95
-
96
-
1. Create a rescue VM:
97
-
98
-
```azurecli-interactive
99
-
az vm repair create --verbose -g RG-NAME -n VM-NAME --repair-username RESCUE-UID --repair-password RESCUE-PASS --copy-disk-name DISK-COPY
100
-
```
101
-
2. Run a script with one of the ALAR actions on the rescue VM:
102
-
103
-
```azurecli-interactive
104
-
az vm repair run --verbose -g RG-NAME -n VM-NAME --run-id linux-alar2 --parameters ACTION --run-on-repair
105
-
```
106
-
3. Swap the OS disks and delete the temporary resources:
107
-
108
-
```azurecli-interactive
109
-
az vm repair restore --verbose -g RG-NAME -n VM-NAME
110
-
```
111
-
112
-
> [!NOTE]
113
-
> The original and new disks won't be deleted.
138
+
> This action changes the VM's security posture by altering the audit daemon configuration so that the VM shutdown issue can be resolved. Once the VM is running and accessible, you need to evaluate the configuration and potentially revert it to the original state. For this purpose, a backup of the *auditd.conf* file is created in */etc/audit* by the ALAR action.
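
The configuration change described above can be previewed without modifying anything. The following is a minimal sketch of the `HALT` to `SYSLOG` substitution, not the ALAR implementation: the function name `preview_auditd_fix` is hypothetical, and the pattern assumes that the affected settings are written as `<name>_action = HALT` on a single line, as in a typical *auditd.conf* file.

```shell
# Sketch only: print auditd.conf with HALT replaced by SYSLOG on
# lines whose parameter name ends in 'action'. The file is not modified.
# Usage: preview_auditd_fix [PATH_TO_AUDITD_CONF]
preview_auditd_fix() {
  local conf="${1:-/etc/audit/auditd.conf}"
  sed -E 's/^([[:space:]]*[a-z_]*action[[:space:]]*=[[:space:]]*)[Hh][Aa][Ll][Tt][[:space:]]*$/\1SYSLOG/' "$conf"
}
```

Comparing this output with the current file (for example, by piping it to `diff`) shows which settings would be affected. The same comparison against the backup that ALAR creates can help when you evaluate whether to revert the change.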
114
139
115
-
Here are explanations for the parameters in the commands above:
116
-
117
-
- `RG-NAME`: The name of the resource group containing the broken VM.
118
-
- `VM-NAME`: The name of the broken VM.
119
-
- `RESCUE-UID`: The user created on the repair VM for login. It's the equivalent of the user created on a new VM in the Azure portal.
120
-
- `RESCUE-PASS`: The password for `RESCUE-UID`, enclosed in single quotes. For example: `'password!234'`.
121
-
- `DISK-COPY`: The name of the OS disk copy that will be created from the broken VM.
122
-
- `ACTION`: A scripted task to run, such as `initrd` or `fstab`.
123
-
You can pass over single or multiple recovery operations. For multiple operations, delineate them using commas without spaces, such as `fstab,initrd`.
124
140
125
141
## Limitation
126
142
@@ -133,3 +149,5 @@ If you experience a bug or want to request an enhancement to the ALAR tool, post
133
149
You can also find the latest information about the ALAR tool on [GitHub](https://github.com/Azure/ALAR).
134
150
135
151
[!INCLUDE [Azure Help Support](../../../includes/azure-help-support.md)]