Commit b0e1810

Merge pull request #285033 from bwren/ci-monitor
Update to Container insights monitoring
2 parents c71c98c + 910f9df commit b0e1810

7 files changed: 39 additions and 74 deletions

articles/azure-monitor/containers/container-insights-analyze.md

Lines changed: 17 additions & 11 deletions
Original file line number | Diff line number | Diff line change
@@ -1,16 +1,14 @@
11
---
2-
title: Kubernetes monitoring with Container insights | Microsoft Docs
2+
title: Monitor your Kubernetes cluster performance with Container insights
33
description: This article describes how you can view and analyze the performance of a Kubernetes cluster with Container insights.
44
ms.topic: conceptual
5-
ms.date: 05/17/2023
5+
ms.date: 08/19/2024
66
ms.reviewer: viviandiec
77
---
88

99
# Monitor your Kubernetes cluster performance with Container insights
1010

11-
Use the workbooks, performance charts, and health status in Container insights to monitor the workload of Kubernetes clusters hosted on Azure Kubernetes Service (AKS), Azure Stack, or another environment.
12-
13-
This article helps you understand the two perspectives and how Azure Monitor helps you quickly assess, investigate, and resolve detected issues.
11+
Use the workbooks, performance charts, and health status in Container insights to monitor the workload of Kubernetes clusters hosted on Azure Kubernetes Service (AKS), Azure Stack, or another environment. This article helps you understand how to use Azure Monitor to help you quickly assess, investigate, and resolve detected issues.
1412

1513

1614

@@ -228,7 +226,7 @@ The icons in the status field indicate the online status of the containers.
228226

229227
| Icon | Status |
230228
|--------|-------------|
231-
| :::image type="content" source="./media/container-insights-analyze/containers-ready-icon.png" alt-text="Ready running status icon.":::|
229+
| :::image type="content" source="./media/container-insights-analyze/containers-ready-icon.png" alt-text="Ready running status icon.":::| Running |
232230
| :::image type="content" source="./media/container-insights-analyze/containers-waiting-icon.png" alt-text="Waiting or Paused status icon."::: | Waiting or Paused|
233231
| :::image type="content" source="./media/container-insights-analyze/containers-grey-icon.png" alt-text="Last reported running status icon."::: | Last reported running but hasn't responded for more than 30 minutes|
234232
| :::image type="content" source="./media/container-insights-analyze/containers-green-icon.png" alt-text="Successful status icon."::: | Successfully stopped or failed to stop|
@@ -261,6 +259,19 @@ The information that's displayed when you view containers is described in the fo
261259
| Uptime | Represents the time since a container was started or rebooted. |
262260
| Trend Min %, Avg %, 50th %, 90th %, 95th %, Max % | Bar graph trend represents the average percentile metric percentage of the container. |
263261

262+
### Other processes
263+
The *Other processes* entry in the **Node** view is intended to help you clearly understand the root cause of the high resource usage on your node. This information helps you to distinguish usage between containerized processes versus noncontainerized processes. These are noncontainerized processes that run on your node and include the following:
264+
265+
- Self-managed or managed Kubernetes noncontainerized processes.
266+
- Container run-time processes.
267+
- Kubelet.
268+
- System processes running on your node.
269+
- Other non-Kubernetes workloads running on node hardware or a VM.
270+
271+
The value of *other processes* is `Total usage from CAdvisor - Usage from containerized process`.
272+
273+
### Status
274+
264275
The icons in the status field indicate the online statuses of pods, as described in the following table.
265276

266277
| Icon | Status |
@@ -275,11 +286,6 @@ The icons in the status field indicate the online statuses of pods, as described
275286

276287
Azure Network Policy Manager includes informative Prometheus metrics that you can use to monitor and better understand your network configurations. It provides built-in visualizations in either the Azure portal or Grafana Labs. For more information, see [Monitor and visualize network configurations with Azure npm](../../virtual-network/kubernetes-network-policies.md#monitor-and-visualize-network-configurations-with-azure-npm).
277288

278-
## Frequently asked questions
279-
280-
This section provides answers to common questions.
281-
282-
[!INCLUDE [container-insights-faq-what-does-other-processes-represent](../includes/container-insights-faq-what-does-other-processes-represent.md)]
283289

284290
## Next steps
285291
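The *Other processes* formula added to `container-insights-analyze.md` in this commit is a plain subtraction. A minimal Python sketch follows, assuming hypothetical inputs: a node-level total as reported by cAdvisor and a list of per-container usage values in the same unit. The function name and the clamp at zero are illustrative assumptions, not part of Container insights.

```python
def other_processes_usage(total_node_usage, container_usages):
    """Total usage from cAdvisor minus usage from containerized processes."""
    containerized = sum(container_usages)
    # Assumption: clamp at zero so sampling skew between the two readings
    # never produces a negative "other processes" value.
    return max(total_node_usage - containerized, 0.0)


# Hypothetical sample: the node reports 1500 millicores in total and its
# containers account for 400 + 350 + 250 millicores.
print(other_processes_usage(1500.0, [400.0, 350.0, 250.0]))  # 500.0
```

The remainder is what the **Node** view attributes to the kubelet, the container runtime, system processes, and other noncontainerized workloads.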

articles/azure-monitor/containers/container-insights-deployment-hpa-metrics.md

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ ms.reviewer: viviandiec
88

99
# Deployment and HPA metrics with Container insights
1010

11-
The Container insights integrated agent now collects metrics for deployments and horizontal pod autoscalers (HPAs) starting with agent version *ciprod08072020*.
11+
The Container insights integrated agent automatically collects metrics for deployments and horizontal pod autoscalers (HPAs).
1212

1313
## Deployment metrics
1414

articles/azure-monitor/containers/container-insights-gpu-monitoring.md

Lines changed: 2 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -2,25 +2,12 @@
22
title: Configure GPU monitoring with Container insights
33
description: This article describes how you can configure monitoring Kubernetes clusters with NVIDIA and AMD GPU enabled nodes with Container insights.
44
ms.topic: conceptual
5-
ms.date: 08/09/2023
5+
ms.date: 08/19/2024
66
ms.reviewer: aul
77
---
88

99
# Configure GPU monitoring with Container insights
1010

11-
Starting with agent version *ciprod03022019*, the Container insights integrated agent now supports monitoring graphical processing unit (GPU) usage on GPU-aware Kubernetes cluster nodes and monitors pods or containers that request and use GPU resources.
12-
13-
>[!NOTE]
14-
> As per the Kubernetes [upstream announcement](https://kubernetes.io/blog/2020/12/16/third-party-device-metrics-reaches-ga/#nvidia-gpu-metrics-deprecated), Kubernetes is deprecating GPU metrics that are being reported by the kubelet, for Kubernetes version 1.20+. As a result, Container insights will no longer be able to collect the following metrics out of the box:
15-
>
16-
> * containerGpuDutyCycle
17-
> * containerGpumemoryTotalBytes
18-
> * containerGpumemoryUsedBytes
19-
>
20-
> To continue collecting GPU metrics through Container insights, migrate to your GPU vendor-specific metrics exporter by December 31, 2022. Configure [Prometheus scraping](./container-insights-prometheus.md) to scrape metrics from the deployed vendor-specific exporter.
21-
22-
## Supported GPU vendors
23-
2411
Container insights supports monitoring GPU clusters from the following GPU vendors:
2512

2613
- [NVIDIA](https://developer.nvidia.com/kubernetes-gpu)
@@ -41,8 +28,7 @@ Container insights automatically starts monitoring GPU usage on nodes and GPU re
4128
|nodeGpuAllocatable |container.azm.ms/clusterId, container.azm.ms/clusterName, gpuVendor |Number of GPUs in a node that can be used by Kubernetes. |
4229
|nodeGpuCapacity |container.azm.ms/clusterId, container.azm.ms/clusterName, gpuVendor |Total number of GPUs in a node. |
4330

44-
\* Based on Kubernetes upstream changes, these metrics are no longer collected out of the box. As a temporary hotfix, for AKS, upgrade your GPU node pool to the latest version or \*-2022.06.08 or higher. For Azure Arc-enabled Kubernetes, enable the feature gate `DisableAcceleratorUsageMetrics=false` in kubelet configuration of the node and restart the kubelet. After the upstream changes reach general availability, this fix will no longer work. Make plans to migrate to using your GPU vendor-specific metrics exporter by December 31, 2022.
45-
31+
\* Based on Kubernetes upstream changes, these metrics are no longer collected out of the box. As a temporary hotfix, for AKS, upgrade your GPU node pool to the latest version or \*-2022.06.08 or higher. For Azure Arc-enabled Kubernetes, enable the feature gate `DisableAcceleratorUsageMetrics=false` in kubelet configuration of the node and restart the kubelet. After the upstream changes reach general availability, this fix will no longer work.
4632
## GPU performance charts
4733

4834
Container insights includes preconfigured charts for the metrics listed earlier in the table as a GPU workbook for every cluster. For a description of the workbooks available for Container insights, see [Workbooks in Container insights](container-insights-reports.md).

articles/azure-monitor/containers/container-insights-persistent-volumes.md

Lines changed: 0 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -8,10 +8,6 @@ ms.reviewer: aul
88

99
# Configure PV monitoring with Container insights
1010

11-
Starting with agent version *ciprod10052020*, the Container insights integrated agent now supports monitoring persistent volume (PV) usage. With agent version *ciprod01112021*, the agent supports monitoring PV inventory, including information about the status, storage class, type, access modes, and other details.
12-
13-
## PV metrics
14-
1511
Container insights automatically starts monitoring PV usage by collecting the following metrics at 60-second intervals and storing them in the **InsightsMetrics** table.
1612

1713
| Metric name | Metric dimension (tags) | Metric description |

articles/azure-monitor/containers/container-insights-syslog.md

Lines changed: 15 additions & 30 deletions
Original file line numberDiff line numberDiff line change
@@ -1,25 +1,28 @@
11
---
2-
title: Syslog collection with Container Insights
3-
description: This article describes how to collect Syslog from AKS nodes using Container insights.
2+
title: Access Syslog data in Container Insights
3+
description: Describes how to access Syslog data collected from AKS nodes using Container insights.
44
ms.topic: conceptual
5-
ms.date: 05/31/2024
5+
ms.date: 08/19/2024
66
ms.reviewer: damendo
77
---
88

9-
# Syslog collection with Container Insights
9+
# Access Syslog data in Container Insights
1010

1111
Container Insights offers the ability to collect Syslog events from Linux nodes in your [Azure Kubernetes Service (AKS)](/azure/aks/intro-kubernetes) clusters. This includes the ability to collect logs from control plane components like kubelet. Customers can also use Syslog for monitoring security and health events, typically by ingesting syslog into a SIEM system like [Microsoft Sentinel](https://azure.microsoft.com/products/microsoft-sentinel/#overview).
1212

1313
## Prerequisites
1414

15-
- Syslog collection needs to be enabled for your cluster using the guidance in [Configure and filter log collection in Container insights](./container-insights-data-collection-configure.md).
15+
- Syslog collection needs to be enabled for your cluster using the guidance in [Configure and filter log collection in Container insights](./container-insights-data-collection-configure.md#configure-data-collection-using-dcr).
1616
- Port 28330 should be available on the host node.
1717

1818

19-
## Access Syslog data using built-in workbooks
19+
## Built-in workbooks
2020

2121
To get a quick snapshot of your syslog data, use the built-in Syslog workbook using one of the following methods:
2222

23+
> [!NOTE]
24+
> The **Reports** tab won't be available if you enable the [Container insights Prometheus experience](./container-insights-experience-v2.md) for your cluster.
25+
2326
- **Reports** tab in Container Insights.
2427
Navigate to your cluster in the Azure portal and open the **Insights**. Open the **Reports** tab and locate the **Syslog** workbook.
2528

@@ -30,26 +33,26 @@ Navigate to your cluster in the Azure portal. Open the **Workbooks** tab and loc
3033

3134
:::image type="content" source="media/container-insights-syslog/syslog-workbook-container-insights-reports-tab.gif" lightbox="media/container-insights-syslog/syslog-workbook-container-insights-reports-tab.gif" alt-text="Video of Syslog workbook being accessed from cluster workbooks tab." border="true":::
3235

33-
### Access Syslog data using a Grafana dashboard
36+
## Grafana dashboard
3437

35-
Customers can use our Syslog dashboard for Grafana to get an overview of their Syslog data. Customers who create a new Azure-managed Grafana instance will have this dashboard available by default. Customers with existing instances or those running their own instance can [import the Syslog dashboard from the Grafana marketplace](https://grafana.com/grafana/dashboards/19866-azure-monitor-container-insights-syslog/).
38+
If you use Grafana, you can use the Syslog dashboard for Grafana to get an overview of your Syslog data. This dashboard is available by default if you create a new Azure-managed Grafana instance. Otherwise, you can [import the Syslog dashboard from the Grafana marketplace](https://grafana.com/grafana/dashboards/19866-azure-monitor-container-insights-syslog/).
3639

3740
> [!NOTE]
38-
> You will need to have the **Monitoring Reader** role on the Subscription containing the Azure Managed Grafana instance to access syslog from Container Insights.
41+
> You need the **Monitoring Reader** role on the Subscription containing the Azure Managed Grafana instance to access syslog from Container Insights.
3942
4043
:::image type="content" source="media/container-insights-syslog/grafana-screenshot.png" lightbox="media/container-insights-syslog/grafana-screenshot.png" alt-text="Screenshot of Syslog Grafana dashboard." border="false":::
4144

42-
### Access Syslog data using log queries
45+
## Log queries
4346

4447
Syslog data is stored in the [Syslog](/azure/azure-monitor/reference/tables/syslog) table in your Log Analytics workspace. You can create your own [log queries](../logs/log-query-overview.md) in [Log Analytics](../logs/log-analytics-overview.md) to analyze this data or use any of the [prebuilt queries](../logs/log-query-overview.md).
4548

4649
:::image type="content" source="media/container-insights-syslog/azmon-3.png" lightbox="media/container-insights-syslog/azmon-3.png" alt-text="Screenshot of Syslog query loaded in the query editor in the Azure Monitor Portal UI." border="false":::
4750

48-
You can open Log Analytics from the **Logs** menu in the **Monitor** menu to access Syslog data for all clusters or from the AKs cluster's menu to access Syslog data for only that cluster.
51+
You can open Log Analytics from the **Logs** menu in the **Monitor** menu to access Syslog data for all clusters or from the AKS cluster's menu to access Syslog data for a single cluster.
4952

5053
:::image type="content" source="media/container-insights-syslog/aks-4.png" lightbox="media/container-insights-syslog/aks-4.png" alt-text="Screenshot of Query editor with Syslog query." border="false":::
5154

52-
#### Sample queries
55+
### Sample queries
5356

5457
The following table provides different examples of log queries that retrieve Syslog records.
5558

@@ -62,24 +65,6 @@ The following table provides different examples of log queries that retrieve Sys
6265
| `Syslog | where ProcessName == "kubelet"` | All Syslog records from the kubelet process |
6366
| `Syslog | where ProcessName == "kubelet" and SeverityLevel == "error"` | Syslog records from kubelet process with errors |
6467

65-
## Editing your Syslog collection settings
66-
67-
To modify the configuration for your Syslog collection, you modify the [data collection rule (DCR)](../essentials/data-collection-rule-overview.md) that was created when you enabled it.
68-
69-
Select **Data Collection Rules** from the **Monitor** menu in the Azure portal.
70-
71-
:::image type="content" source="media/container-insights-syslog/dcr-1.png" lightbox="media/container-insights-syslog/dcr-1.png" alt-text="Screenshot of Data Collection Rules tab in the Azure Monitor portal UI." border="false":::
72-
73-
Select your DCR and then **View data sources**. Select the **Linux Syslog** data source to view the Syslog collection details.
74-
>[!NOTE]
75-
> A DCR is created automatically when you enable syslog. The DCR follows the naming convention `MSCI-<WorkspaceRegion>-<ClusterName>`.
76-
77-
:::image type="content" source="media/container-insights-syslog/dcr-3.png" lightbox="media/container-insights-syslog/dcr-3.png" alt-text="Screenshot of Data Sources tab for Syslog data collection rule." border="false":::
78-
79-
Select the minimum log level for each facility that you want to collect.
80-
81-
:::image type="content" source="media/container-insights-syslog/dcr-4.png" lightbox="media/container-insights-syslog/dcr-4.png" alt-text="Screenshot of Configuration panel for Syslog data collection rule." border="false":::
82-
8368

8469

8570
## Next steps
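The sample queries kept in `container-insights-syslog.md` all follow the same `Syslog | where ...` shape. As a rough sketch only, the helper below assembles such query strings from optional filters. The function name `build_syslog_query` is hypothetical, and it assumes trusted input (it does no escaping of quotes); only the `Syslog` table and the `ProcessName` and `SeverityLevel` columns come from the article.

```python
def build_syslog_query(process_name=None, severity=None):
    """Build a KQL query string against the Syslog table (illustrative only)."""
    filters = []
    if process_name is not None:
        filters.append(f'ProcessName == "{process_name}"')
    if severity is not None:
        filters.append(f'SeverityLevel == "{severity}"')
    if not filters:
        # No filters: return every Syslog record.
        return "Syslog"
    return "Syslog | where " + " and ".join(filters)


# Reproduces the two kubelet rows of the sample-queries table.
print(build_syslog_query("kubelet"))
print(build_syslog_query("kubelet", "error"))
```

The resulting strings can be pasted into Log Analytics as they are.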

articles/azure-monitor/containers/kubernetes-metric-alerts.md

Lines changed: 3 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -2,26 +2,18 @@
22
title: Recommended alert rules for Kubernetes clusters
33
description: Describes how to enable recommended metric alerts rules for a Kubernetes cluster in Azure Monitor.
44
ms.topic: conceptual
5-
ms.date: 06/17/2024
5+
ms.date: 08/19/2024
66
ms.reviewer: vdiec
77
---
88

99
# Recommended alert rules for Kubernetes clusters
1010
[Alerts](../alerts/alerts-overview.md) in Azure Monitor proactively identify issues related to the health and performance of your Azure resources. This article describes how to enable and edit a set of recommended metric alert rules that are predefined for your Kubernetes clusters.
1111

12-
## Types of alert rules
13-
There are two types of metric alert rules used with Kubernetes clusters.
14-
15-
| Alert rule type | Description |
16-
|:---|:---|
17-
| [Prometheus metric alert rules](../alerts/alerts-types.md#prometheus-alerts) | Use metric data collected from your Kubernetes cluster in a [Azure Monitor managed service for Prometheus](../essentials/prometheus-metrics-overview.md). These rules require [Prometheus to be enabled on your cluster](./kubernetes-monitoring-enable.md#enable-prometheus-and-grafana) and are stored in a [Prometheus rule group](../essentials/prometheus-rule-groups.md). |
18-
| [Platform metric alert rules](../alerts/alerts-types.md#metric-alerts) | Use metrics that are automatically collected from your AKS cluster and are stored as [Azure Monitor alert rules](../alerts/alerts-overview.md). |
19-
2012
## Enable recommended alert rules
2113
Use one of the following methods to enable the recommended alert rules for your cluster. You can enable both Prometheus and platform metric alert rules for the same cluster.
2214

2315
>[!NOTE]
24-
>To enable recommended alerts on Arc-enabled Kubernetes clusters, ARM templates are the only supported method.
16+
> ARM templates are the only supported method to enable recommended alerts on Arc-enabled Kubernetes clusters.
2517
>
2618
2719
### [Azure portal](#tab/portal)
@@ -189,7 +181,7 @@ The following tables list the details of each recommended alert rule. Source cod
189181

190182
## Legacy Container insights metric alerts (preview)
191183

192-
Metric rules in Container insights will be retired on May 31, 2024 (this was previously announced as March 14, 2026). These rules haven't been available for creation using the portal since August 15, 2023. These rules were in public preview but will be retired without reaching general availability since the new recommended metric alerts described in this article are now available.
184+
Metric rules in Container insights were retired on May 31, 2024. These rules were in public preview but were retired without reaching general availability since the new recommended metric alerts described in this article are now available.
193185

194186
If you already enabled these legacy alert rules, you should disable them and enable the new experience.
195187

articles/azure-monitor/toc.yml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -536,7 +536,7 @@ items:
536536
href: containers/container-insights-deployment-hpa-metrics.md
537537
- name: Monitor Persistent Volumes (PVs)
538538
href: containers/container-insights-persistent-volumes.md
539-
- name: Monitor Security with Syslog
539+
- name: Monitor Syslog
540540
href: containers/container-insights-syslog.md
541541
- name: Reports tab
542542
href: containers/container-insights-reports.md
