Commit 38b980e

Merge pull request #15607 from pauljewellmsft/amlfs-monitoring
[AMLFS] Add monitoring/metrics articles
2 parents 305e9ad + 2801983 commit 38b980e

4 files changed, +154 -0 lines changed

.openpublishing.publish.config.json

Lines changed: 6 additions & 0 deletions
@@ -92,6 +92,12 @@
       "branch": "main",
       "branch_mapping": {}
     },
+    {
+      "path_to_root": "azure-reference-other-repo",
+      "url": "https://github.com/MicrosoftDocs/azure-reference-other-pr",
+      "branch": "main",
+      "branch_mapping": {}
+    },
     {
       "path_to_root": "quickstart-templates",
       "url": "https://github.com/Azure/azure-quickstart-templates",

azure-managed-lustre/TOC.yml

Lines changed: 6 additions & 0 deletions
@@ -44,6 +44,12 @@
       href: configure-network-security-group.md
     - name: Use customer-managed encryption keys
       href: customer-managed-encryption-keys.md
+  - name: Monitoring metrics and logs
+    items:
+    - name: Monitor a file system
+      href: monitor-file-system.md
+    - name: Monitoring reference for metrics and logs
+      href: monitor-file-system-reference.md
   - name: Availability and disaster recovery
     items:
     - name: Recover from a regional outage
Lines changed: 62 additions & 0 deletions
@@ -0,0 +1,62 @@
---
title: Monitoring data reference for Azure Managed Lustre
description: This article contains important reference material you need when you monitor Azure Managed Lustre.
ms.date: 08/16/2024
ms.custom: horz-monitor
ms.topic: reference
author: pauljewellmsft
ms.author: pauljewell
ms.service: azure-managed-lustre
---

# Azure Managed Lustre monitoring data reference

[!INCLUDE [horz-monitor-ref-intro](~/../azure-stack/reusable-content/ce-skilling/azure/includes/azure-monitor/horizontals/horz-monitor-ref-intro.md)]

See [Monitor Azure Managed Lustre](monitor-file-system.md) for details on the data you can collect for Azure Managed Lustre and how to use it.

[!INCLUDE [horz-monitor-ref-metrics-intro](~/../azure-stack/reusable-content/ce-skilling/azure/includes/azure-monitor/horizontals/horz-monitor-ref-metrics-intro.md)]

### Supported metrics for Microsoft.StorageCache/amlFilesystems

The following table lists the metrics available for the Microsoft.StorageCache/amlFilesystems resource type.

[!INCLUDE [horz-monitor-ref-metrics-tableheader](~/../azure-stack/reusable-content/ce-skilling/azure/includes/azure-monitor/horizontals/horz-monitor-ref-metrics-tableheader.md)]

[!INCLUDE [Microsoft.StorageCache/amlFilesystems](~/../azure-reference-other-repo/azure-monitor-ref/supported-metrics/includes/microsoft-storagecache-amlfilesystems-metrics-include.md)]

> [!NOTE]
> The metric `OSTBytesUsed` represents the total capacity consumed on the file system, including all metadata and overhead associated with the files. The value for `OSTBytesUsed` might be greater than the result of running `lfs df` on the file system, as `df` output for **Used** only attempts to capture the data that the end user has placed on the file system.

[!INCLUDE [horz-monitor-ref-metrics-dimensions-intro](~/../azure-stack/reusable-content/ce-skilling/azure/includes/azure-monitor/horizontals/horz-monitor-ref-metrics-dimensions-intro.md)]

[!INCLUDE [horz-monitor-ref-metrics-dimensions](~/../azure-stack/reusable-content/ce-skilling/azure/includes/azure-monitor/horizontals/horz-monitor-ref-metrics-dimensions.md)]

### Dimensions specific to Azure Managed Lustre

| Dimension name | Description |
| --- | --- |
| `ostnum` | Object Storage Target (OST) index number |
| `mdtnum` | Metadata Target (MDT) index number |
| `operation` | Type of operation performed |

### Supported resource logs for Microsoft.StorageCache/amlFilesystems

[!INCLUDE [Microsoft.StorageCache/amlFilesystems](~/../azure-reference-other-repo/azure-monitor-ref/supported-logs/includes/microsoft-storagecache-amlfilesystems-logs-include.md)]

### Azure Monitor Logs tables

This section lists the Azure Monitor Logs tables relevant to this service, which are available for query by Log Analytics using Kusto queries.

- [AFSAuditLogs](/azure/azure-monitor/reference/tables/AFSAuditLogs)
- [AzureActivity](/azure/azure-monitor/reference/tables/azureactivity)
- [AzureMetrics](/azure/azure-monitor/reference/tables/azuremetrics)

[!INCLUDE [horz-monitor-ref-activity-log](~/../azure-stack/reusable-content/ce-skilling/azure/includes/azure-monitor/horizontals/horz-monitor-ref-activity-log.md)]

- [Microsoft.StorageCache permissions](/azure/role-based-access-control/permissions/storage#microsoftstoragecache)

## Related content

- See [Monitor Azure Managed Lustre](monitor-file-system.md) for a description of monitoring Azure Managed Lustre.
- See [Monitor Azure resources with Azure Monitor](/azure/azure-monitor/essentials/monitor-azure-resource) for details on monitoring Azure resources.
Lines changed: 80 additions & 0 deletions
@@ -0,0 +1,80 @@
---
title: Monitor Azure Managed Lustre
description: Start here to learn how to monitor Azure Managed Lustre.
ms.date: 08/16/2024
ms.custom: horz-monitor
ms.topic: conceptual
author: pauljewellmsft
ms.author: pauljewell
ms.service: azure-managed-lustre
---

# Monitor Azure Managed Lustre

[!INCLUDE [horz-monitor-intro](~/../azure-stack/reusable-content/ce-skilling/azure/includes/azure-monitor/horizontals/horz-monitor-intro.md)]

[!INCLUDE [horz-monitor-resource-types](~/../azure-stack/reusable-content/ce-skilling/azure/includes/azure-monitor/horizontals/horz-monitor-resource-types.md)]

For more information about the resource types for Azure Managed Lustre, see [Azure Managed Lustre monitoring data reference](monitor-file-system-reference.md).

[!INCLUDE [horz-monitor-data-storage](~/../azure-stack/reusable-content/ce-skilling/azure/includes/azure-monitor/horizontals/horz-monitor-data-storage.md)]

[!INCLUDE [horz-monitor-platform-metrics](~/../azure-stack/reusable-content/ce-skilling/azure/includes/azure-monitor/horizontals/horz-monitor-platform-metrics.md)]

For a list of available metrics for Azure Managed Lustre, see [Azure Managed Lustre monitoring data reference](monitor-file-system-reference.md#metrics).

[!INCLUDE [horz-monitor-resource-logs](~/../azure-stack/reusable-content/ce-skilling/azure/includes/azure-monitor/horizontals/horz-monitor-resource-logs.md)]

For the available resource log categories, their associated Log Analytics tables, and the log schemas for Azure Managed Lustre, see [Azure Managed Lustre monitoring data reference](monitor-file-system-reference.md#supported-resource-logs-for-microsoftstoragecacheamlfilesystems).

[!INCLUDE [horz-monitor-activity-log](~/../azure-stack/reusable-content/ce-skilling/azure/includes/azure-monitor/horizontals/horz-monitor-activity-log.md)]

[!INCLUDE [horz-monitor-analyze-data](~/../azure-stack/reusable-content/ce-skilling/azure/includes/azure-monitor/horizontals/horz-monitor-analyze-data.md)]

[!INCLUDE [horz-monitor-external-tools](~/../azure-stack/reusable-content/ce-skilling/azure/includes/azure-monitor/horizontals/horz-monitor-external-tools.md)]

[!INCLUDE [horz-monitor-kusto-queries](~/../azure-stack/reusable-content/ce-skilling/azure/includes/azure-monitor/horizontals/horz-monitor-kusto-queries.md)]

This section shows queries that you can enter in the **Log search** bar to help you monitor your Managed Lustre file system.

- **Aggregate operations query**: List all the UnsuspendAmlFilesystem requests for a given time duration.

```kusto
AFSAuditLogs
// The OperationName below can be replaced to obtain other operations, such as "RebootAmlFilesystemNode" or "AmlFSRefreshHSMToken".
| where OperationName has "UnsuspendAmlFilesystem"
| project TimeGenerated, _ResourceId, ActivityId, ResultSignature, ResultDescription, Location
| sort by TimeGenerated asc
| limit 100
```

- **Unauthorized requests query**: Count of failed AMLFilesystems requests due to unauthorized access.

```kusto
AFSAuditLogs
// 401 below could be replaced by other result signatures to obtain different operation results.
// For example, 'ResultSignature == 202' to obtain accepted requests.
| where ResultSignature == 401
| summarize count() by _ResourceId, OperationName
```
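
To make the shape of the unauthorized requests query's result concrete, the following Python sketch reproduces its filter-and-count logic on a few in-memory rows. The rows, the resource names (`fs-a`, `fs-b`), and the helper `count_by_resource_and_operation` are hypothetical and only illustrate the grouping; they are not real log data and not part of any Azure SDK.

```python
from collections import Counter

# Hypothetical sample rows. The field names mirror the AFSAuditLogs columns
# used in the query above; the values are made up for illustration.
SAMPLE_ROWS = [
    {"_ResourceId": "fs-a", "OperationName": "UnsuspendAmlFilesystem", "ResultSignature": 401},
    {"_ResourceId": "fs-a", "OperationName": "UnsuspendAmlFilesystem", "ResultSignature": 401},
    {"_ResourceId": "fs-a", "OperationName": "RebootAmlFilesystemNode", "ResultSignature": 202},
    {"_ResourceId": "fs-b", "OperationName": "AmlFSRefreshHSMToken", "ResultSignature": 401},
]


def count_by_resource_and_operation(rows, result_signature=401):
    """Keep rows with the given result signature, then count them per
    (_ResourceId, OperationName) pair, like `summarize count() by ...`."""
    return Counter(
        (row["_ResourceId"], row["OperationName"])
        for row in rows
        if row["ResultSignature"] == result_signature
    )


counts = count_by_resource_and_operation(SAMPLE_ROWS)
for (resource_id, operation), n in sorted(counts.items()):
    print(resource_id, operation, n)
```

Passing `result_signature=202` instead mirrors the substitution suggested in the query comment.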

[!INCLUDE [horz-monitor-alerts](~/../azure-stack/reusable-content/ce-skilling/azure/includes/azure-monitor/horizontals/horz-monitor-alerts.md)]

### Azure Managed Lustre alert rules

The following table lists some suggested alert rules for Azure Managed Lustre. The alerts in this table are just examples. You can set alerts for any metric, log entry, or activity log entry listed in the [Azure Managed Lustre monitoring data reference](monitor-file-system-reference.md).

| Alert type | Condition | Description |
| --- | --- | --- |
| Metric | (**OST Bytes Used** / **OST Bytes Total**) > 0.85 | Storage capacity usage for the file system has exceeded 85% of total |
| Metric | (**OST Files Used** / **OST Files Total**) > 0.85 | Number of files in the file system has exceeded 85% of total |

> [!NOTE]
> The threshold value of 85% is used as an example to show an alert before the file system reaches full capacity. You can adjust the threshold based on your requirements.
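
As a rough illustration of the conditions in the table above, the sketch below evaluates a used/total ratio against a threshold. The function names and sample values are hypothetical; the alert itself is configured in Azure Monitor rather than computed in application code, so treat this only as a way to sanity-check a threshold choice.

```python
def usage_ratio(used: float, total: float) -> float:
    """Return used / total, for example OST Bytes Used / OST Bytes Total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return used / total


def exceeds_threshold(used: float, total: float, threshold: float = 0.85) -> bool:
    """True when the ratio is strictly greater than the threshold,
    matching the '> 0.85' condition in the example alert rules."""
    return usage_ratio(used, total) > threshold


# Hypothetical sample values: 90 TiB used out of 100 TiB total.
TIB = 1024 ** 4
print(exceeds_threshold(90 * TIB, 100 * TIB))        # capacity check
print(exceeds_threshold(50 * TIB, 100 * TIB, 0.75))  # stricter, adjusted threshold
```

The same function applies to the file-count rule by passing file counts instead of byte counts.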

[!INCLUDE [horz-monitor-advisor-recommendations](~/../azure-stack/reusable-content/ce-skilling/azure/includes/azure-monitor/horizontals/horz-monitor-advisor-recommendations.md)]

## Related content

- See [Azure Managed Lustre monitoring data reference](monitor-file-system-reference.md) for a reference of the metrics, logs, and other important values created for Azure Managed Lustre.
- See [Monitoring Azure resources with Azure Monitor](/azure/azure-monitor/essentials/monitor-azure-resource) for general details on monitoring Azure resources.
