Merge pull request #285033 from bwren/ci-monitor

v-ccolin · web-flow · commit b0e181083a9f · 2024-08-20T09:56:16.000+01:00
Update to Container insights monitoring
diff --git a/articles/azure-monitor/containers/container-insights-analyze.md b/articles/azure-monitor/containers/container-insights-analyze.md
@@ -1,16 +1,14 @@
 ---
-title: Kubernetes monitoring with Container insights | Microsoft Docs
+title: Monitor your Kubernetes cluster performance with Container insights
 description: This article describes how you can view and analyze the performance of a Kubernetes cluster with Container insights.
 ms.topic: conceptual
-ms.date: 05/17/2023
+ms.date: 08/19/2024
 ms.reviewer: viviandiec
 ---
 
 # Monitor your Kubernetes cluster performance with Container insights
 
-Use the workbooks, performance charts, and health status in Container insights to monitor the workload of Kubernetes clusters hosted on Azure Kubernetes Service (AKS), Azure Stack, or another environment.
-
-This article helps you understand the two perspectives and how Azure Monitor helps you quickly assess, investigate, and resolve detected issues.
+Use the workbooks, performance charts, and health status in Container insights to monitor the workload of Kubernetes clusters hosted on Azure Kubernetes Service (AKS), Azure Stack, or another environment. This article describes how to use Azure Monitor to quickly assess, investigate, and resolve detected issues.
 
 
 
@@ -228,7 +226,7 @@ The icons in the status field indicate the online status of the containers.
 
 | Icon | Status |
 |--------|-------------|
-| :::image type="content" source="./media/container-insights-analyze/containers-ready-icon.png" alt-text="Ready running status icon.":::|
+| :::image type="content" source="./media/container-insights-analyze/containers-ready-icon.png" alt-text="Ready running status icon.":::| Running |
 | :::image type="content" source="./media/container-insights-analyze/containers-waiting-icon.png" alt-text="Waiting or Paused status icon."::: | Waiting or Paused|
 | :::image type="content" source="./media/container-insights-analyze/containers-grey-icon.png" alt-text="Last reported running status icon."::: | Last reported running but hasn't responded for more than 30 minutes|
 | :::image type="content" source="./media/container-insights-analyze/containers-green-icon.png" alt-text="Successful status icon."::: | Successfully stopped or failed to stop|
@@ -261,6 +259,19 @@ The information that's displayed when you view containers is described in the fo
 | Uptime | Represents the time since a container was started or rebooted. |
 | Trend Min&nbsp;%, Avg&nbsp;%, 50th&nbsp;%, 90th&nbsp;%, 95th&nbsp;%, Max&nbsp;% | Bar graph trend represents the average percentile metric percentage of the container. |
 
+### Other processes
+The *Other processes* entry in the **Node** view helps you identify the root cause of high resource usage on your node by distinguishing usage by containerized processes from usage by noncontainerized processes. *Other processes* are the noncontainerized processes that run on your node, and they include the following:
+          
+- Self-managed or managed Kubernetes noncontainerized processes.
+- Container run-time processes.
+- Kubelet.
+- System processes running on your node.
+- Other non-Kubernetes workloads running on node hardware or a VM.
+
+The value of *Other processes* is `Total usage from cAdvisor - Usage from containerized processes`.
+
+### Status
+
 The icons in the status field indicate the online statuses of pods, as described in the following table.
 
 | Icon | Status |
@@ -275,11 +286,6 @@ The icons in the status field indicate the online statuses of pods, as described
 
 Azure Network Policy Manager includes informative Prometheus metrics that you can use to monitor and better understand your network configurations. It provides built-in visualizations in either the Azure portal or Grafana Labs. For more information, see [Monitor and visualize network configurations with Azure npm](../../virtual-network/kubernetes-network-policies.md#monitor-and-visualize-network-configurations-with-azure-npm).
 
-## Frequently asked questions
-
-This section provides answers to common questions.
-
-[!INCLUDE [container-insights-faq-what-does-other-processes-represent](../includes/container-insights-faq-what-does-other-processes-represent.md)]
 
 ## Next steps
 
diff --git a/articles/azure-monitor/containers/container-insights-deployment-hpa-metrics.md b/articles/azure-monitor/containers/container-insights-deployment-hpa-metrics.md
@@ -8,7 +8,7 @@ ms.reviewer: viviandiec
 
 # Deployment and HPA metrics with Container insights
 
-The Container insights integrated agent now collects metrics for deployments and horizontal pod autoscalers (HPAs) starting with agent version *ciprod08072020*.
+The Container insights integrated agent automatically collects metrics for deployments and horizontal pod autoscalers (HPAs).
 
 ## Deployment metrics
 
diff --git a/articles/azure-monitor/containers/container-insights-gpu-monitoring.md b/articles/azure-monitor/containers/container-insights-gpu-monitoring.md
@@ -2,25 +2,12 @@
 title: Configure GPU monitoring with Container insights
 description: This article describes how you can configure monitoring Kubernetes clusters with NVIDIA and AMD GPU enabled nodes with Container insights.
 ms.topic: conceptual
-ms.date: 08/09/2023
+ms.date: 08/19/2024
 ms.reviewer: aul
 ---
 
 # Configure GPU monitoring with Container insights
 
-Starting with agent version *ciprod03022019*, the Container insights integrated agent now supports monitoring graphical processing unit (GPU) usage on GPU-aware Kubernetes cluster nodes and monitors pods or containers that request and use GPU resources.
-
->[!NOTE]
-> As per the Kubernetes [upstream announcement](https://kubernetes.io/blog/2020/12/16/third-party-device-metrics-reaches-ga/#nvidia-gpu-metrics-deprecated), Kubernetes is deprecating GPU metrics that are being reported by the kubelet, for Kubernetes version 1.20+. As a result, Container insights will no longer be able to collect the following metrics out of the box:
->
-> * containerGpuDutyCycle
-> * containerGpumemoryTotalBytes
-> * containerGpumemoryUsedBytes
->
-> To continue collecting GPU metrics through Container insights, migrate to your GPU vendor-specific metrics exporter by December 31, 2022. Configure [Prometheus scraping](./container-insights-prometheus.md) to scrape metrics from the deployed vendor-specific exporter.
-
-## Supported GPU vendors
-
 Container insights supports monitoring GPU clusters from the following GPU vendors:
 
 - [NVIDIA](https://developer.nvidia.com/kubernetes-gpu)
@@ -41,8 +28,7 @@ Container insights automatically starts monitoring GPU usage on nodes and GPU re
 |nodeGpuAllocatable |container.azm.ms/clusterId, container.azm.ms/clusterName, gpuVendor |Number of GPUs in a node that can be used by Kubernetes. |
 |nodeGpuCapacity |container.azm.ms/clusterId, container.azm.ms/clusterName, gpuVendor |Total number of GPUs in a node. |
 
-\* Based on Kubernetes upstream changes, these metrics are no longer collected out of the box. As a temporary hotfix, for AKS, upgrade your GPU node pool to the latest version or \*-2022.06.08 or higher. For Azure Arc-enabled Kubernetes, enable the feature gate `DisableAcceleratorUsageMetrics=false` in kubelet configuration of the node and restart the kubelet. After the upstream changes reach general availability, this fix will no longer work. Make plans to migrate to using your GPU vendor-specific metrics exporter by December 31, 2022.
-
+\* Based on Kubernetes upstream changes, these metrics are no longer collected out of the box. As a temporary hotfix, for AKS, upgrade your GPU node pool to the latest version or \*-2022.06.08 or higher. For Azure Arc-enabled Kubernetes, enable the feature gate `DisableAcceleratorUsageMetrics=false` in kubelet configuration of the node and restart the kubelet. After the upstream changes reach general availability, this fix will no longer work.
 ## GPU performance charts
 
 Container insights includes preconfigured charts for the metrics listed earlier in the table as a GPU workbook for every cluster. For a description of the workbooks available for Container insights, see [Workbooks in Container insights](container-insights-reports.md).
diff --git a/articles/azure-monitor/containers/container-insights-persistent-volumes.md b/articles/azure-monitor/containers/container-insights-persistent-volumes.md
@@ -8,10 +8,6 @@ ms.reviewer: aul
 
 # Configure PV monitoring with Container insights
 
-Starting with agent version *ciprod10052020*, the Container insights integrated agent now supports monitoring persistent volume (PV) usage. With agent version *ciprod01112021*, the agent supports monitoring PV inventory, including information about the status, storage class, type, access modes, and other details.
-
-## PV metrics
-
 Container insights automatically starts monitoring PV usage by collecting the following metrics at 60-second intervals and storing them in the **InsightsMetrics** table.
 
 | Metric name | Metric dimension (tags) | Metric description |
diff --git a/articles/azure-monitor/containers/container-insights-syslog.md b/articles/azure-monitor/containers/container-insights-syslog.md
@@ -1,25 +1,28 @@
 ---
-title: Syslog collection with Container Insights
-description: This article describes how to collect Syslog from AKS nodes using Container insights.
+title: Access Syslog data in Container Insights 
+description: Describes how to access Syslog data collected from AKS nodes using Container insights.
 ms.topic: conceptual
-ms.date: 05/31/2024
+ms.date: 08/19/2024
 ms.reviewer: damendo
 ---
 
-# Syslog collection with Container Insights 
+# Access Syslog data in Container Insights 
 
 Container Insights offers the ability to collect Syslog events from Linux nodes in your [Azure Kubernetes Service (AKS)](/azure/aks/intro-kubernetes) clusters. This includes the ability to collect logs from control plane components like kubelet. Customers can also use Syslog for monitoring security and health events, typically by ingesting syslog into a SIEM system like [Microsoft Sentinel](https://azure.microsoft.com/products/microsoft-sentinel/#overview).  
 
 ## Prerequisites 
 
-- Syslog collection needs to be enabled for your cluster using the guidance in [Configure and filter log collection in Container insights](./container-insights-data-collection-configure.md).
+- Syslog collection needs to be enabled for your cluster using the guidance in [Configure and filter log collection in Container insights](./container-insights-data-collection-configure.md#configure-data-collection-using-dcr).
 - Port 28330 should be available on the host node.
 
 
-## Access Syslog data using built-in workbooks
+## Built-in workbooks
 
 To get a quick snapshot of your syslog data, use the built-in Syslog workbook using one of the following methods:
 
+> [!NOTE]
+> The **Reports** tab won't be available if you enable the [Container insights Prometheus experience](./container-insights-experience-v2.md) for your cluster.
+
 - **Reports** tab in Container Insights. 
 Navigate to your cluster in the Azure portal and open **Insights**. Open the **Reports** tab and locate the **Syslog** workbook. 
 
@@ -30,26 +33,26 @@ Navigate to your cluster in the Azure portal. Open the **Workbooks** tab and loc
 
     :::image type="content" source="media/container-insights-syslog/syslog-workbook-container-insights-reports-tab.gif" lightbox="media/container-insights-syslog/syslog-workbook-container-insights-reports-tab.gif" alt-text="Video of Syslog workbook being accessed from cluster workbooks tab." border="true":::
 
-### Access Syslog data using a Grafana dashboard
+## Grafana dashboard
 
-Customers can use our Syslog dashboard for Grafana to get an overview of their Syslog data. Customers who create a new Azure-managed Grafana instance will have this dashboard available by default. Customers with existing instances or those running their own instance can [import the Syslog dashboard from the Grafana marketplace](https://grafana.com/grafana/dashboards/19866-azure-monitor-container-insights-syslog/). 
+If you use Grafana, you can use the Syslog dashboard for Grafana to get an overview of your Syslog data. This dashboard is available by default if you create a new Azure-managed Grafana instance. Otherwise, you can [import the Syslog dashboard from the Grafana marketplace](https://grafana.com/grafana/dashboards/19866-azure-monitor-container-insights-syslog/). 
 
 > [!NOTE]
-> You will need to have the **Monitoring Reader** role on the Subscription containing the Azure Managed Grafana instance to access syslog from Container Insights. 
+> You need the **Monitoring Reader** role on the subscription containing the Azure Managed Grafana instance to access Syslog data from Container Insights. 
 
 :::image type="content" source="media/container-insights-syslog/grafana-screenshot.png" lightbox="media/container-insights-syslog/grafana-screenshot.png" alt-text="Screenshot of Syslog Grafana dashboard." border="false":::
 
-### Access Syslog data  using log queries
+## Log queries
 
 Syslog data is stored in the [Syslog](/azure/azure-monitor/reference/tables/syslog) table in your Log Analytics workspace. You can create your own [log queries](../logs/log-query-overview.md) in [Log Analytics](../logs/log-analytics-overview.md) to analyze this data or use any of the [prebuilt queries](../logs/log-query-overview.md).
 
 :::image type="content" source="media/container-insights-syslog/azmon-3.png" lightbox="media/container-insights-syslog/azmon-3.png" alt-text="Screenshot of Syslog query loaded in the query editor in the Azure Monitor Portal UI." border="false":::    
 
-You can open Log Analytics from the **Logs** menu in the **Monitor** menu to access Syslog data for all clusters or from the AKs cluster's menu to access Syslog data for only that cluster.
+You can open Log Analytics from the **Logs** menu in the **Monitor** menu to access Syslog data for all clusters or from the AKS cluster's menu to access Syslog data for a single cluster.
  
 :::image type="content" source="media/container-insights-syslog/aks-4.png" lightbox="media/container-insights-syslog/aks-4.png" alt-text="Screenshot of Query editor with Syslog query." border="false":::
   
-#### Sample queries
+### Sample queries
   
 The following table provides different examples of log queries that retrieve Syslog records.
 
@@ -62,24 +65,6 @@ The following table provides different examples of log queries that retrieve Sys
 | `Syslog | where ProcessName == "kubelet"` | All Syslog records from the kubelet process |
 | `Syslog | where ProcessName == "kubelet" and  SeverityLevel == "error"` | Syslog records from kubelet process with errors |
 
-## Editing your Syslog collection settings
-
-To modify the configuration for your Syslog collection, you modify the [data collection rule (DCR)](../essentials/data-collection-rule-overview.md) that was created when you enabled it. 
-
-Select **Data Collection Rules** from the **Monitor** menu in the Azure portal. 
-
-:::image type="content" source="media/container-insights-syslog/dcr-1.png" lightbox="media/container-insights-syslog/dcr-1.png" alt-text="Screenshot of Data Collection Rules tab in the Azure Monitor portal UI." border="false":::
-
-Select your DCR and then **View data sources**. Select the **Linux Syslog** data source to view the Syslog collection details.
->[!NOTE]
-> A DCR is created automatically when you enable syslog. The DCR follows the naming convention `MSCI-<WorkspaceRegion>-<ClusterName>`.
-
-:::image type="content" source="media/container-insights-syslog/dcr-3.png" lightbox="media/container-insights-syslog/dcr-3.png" alt-text="Screenshot of Data Sources tab for Syslog data collection rule." border="false":::
-
-Select the minimum log level for each facility that you want to collect.
-
-:::image type="content" source="media/container-insights-syslog/dcr-4.png" lightbox="media/container-insights-syslog/dcr-4.png" alt-text="Screenshot of Configuration panel for Syslog data collection rule." border="false":::
-
 
 
 ## Next steps
diff --git a/articles/azure-monitor/containers/kubernetes-metric-alerts.md b/articles/azure-monitor/containers/kubernetes-metric-alerts.md
@@ -2,26 +2,18 @@
 title: Recommended alert rules for Kubernetes clusters
 description: Describes how to enable recommended metric alerts rules for a Kubernetes cluster in Azure Monitor.
 ms.topic: conceptual
-ms.date: 06/17/2024
+ms.date: 08/19/2024
 ms.reviewer: vdiec
 ---
 
 # Recommended alert rules for Kubernetes clusters
 [Alerts](../alerts/alerts-overview.md) in Azure Monitor proactively identify issues related to the health and performance of your Azure resources. This article describes how to enable and edit a set of recommended metric alert rules that are predefined for your Kubernetes clusters. 
 
-## Types of alert rules
-There are two types of metric alert rules used with Kubernetes clusters.
-
-| Alert rule type | Description |
-|:---|:---|
-| [Prometheus metric alert rules](../alerts/alerts-types.md#prometheus-alerts) | Use metric data collected from your Kubernetes cluster in a [Azure Monitor managed service for Prometheus](../essentials/prometheus-metrics-overview.md). These rules require [Prometheus to be enabled on your cluster](./kubernetes-monitoring-enable.md#enable-prometheus-and-grafana) and are stored in a [Prometheus rule group](../essentials/prometheus-rule-groups.md). |
-| [Platform metric alert rules](../alerts/alerts-types.md#metric-alerts) | Use metrics that are automatically collected from your AKS cluster and are stored as [Azure Monitor alert rules](../alerts/alerts-overview.md). |
-
 ## Enable recommended alert rules
 Use one of the following methods to enable the recommended alert rules for your cluster. You can enable both Prometheus and platform metric alert rules for the same cluster.
 
 >[!NOTE]
->To enable recommended alerts on Arc-enabled Kubernetes clusters, ARM templates are the only supported method.
+> ARM templates are the only supported method to enable recommended alerts on Arc-enabled Kubernetes clusters.
 >
 
 ### [Azure portal](#tab/portal)
@@ -189,7 +181,7 @@ The following tables list the details of each recommended alert rule. Source cod
 
 ## Legacy Container insights metric alerts (preview)
 
-Metric rules in Container insights will be retired on May 31, 2024 (this was previously announced as March 14, 2026). These rules haven't been available for creation using the portal since August 15, 2023. These rules were in public preview but will be retired without reaching general availability since the new recommended metric alerts described in this article are now available.
+Metric rules in Container insights were retired on May 31, 2024. These rules were in public preview but were retired without reaching general availability since the new recommended metric alerts described in this article are now available.
 
 If you already enabled these legacy alert rules, you should disable them and enable the new experience. 
 
diff --git a/articles/azure-monitor/toc.yml b/articles/azure-monitor/toc.yml
@@ -536,7 +536,7 @@ items:
         href: containers/container-insights-deployment-hpa-metrics.md
       - name: Monitor Persistent Volumes (PVs)
         href: containers/container-insights-persistent-volumes.md
-      - name: Monitor Security with Syslog 
+      - name: Monitor Syslog 
         href: containers/container-insights-syslog.md  
       - name: Reports tab
         href: containers/container-insights-reports.md