Commit d16385d

Merge pull request #64135 from gracewehner/grwehner/container-insights-pv-usage
grwehner/add pv usage info to container insights docs
2 parents cbc5c56 + 50c60c0 commit d16385d

3 files changed, +43 -19 lines changed

articles/azure-monitor/insights/container-insights-agent-config.md

Lines changed: 13 additions & 5 deletions
Original file line number | Diff line number | Diff line change
@@ -2,7 +2,7 @@
22
title: Configure Azure Monitor for containers agent data collection | Microsoft Docs
33
description: This article describes how you can configure the Azure Monitor for containers agent to control stdout/stderr and environment variables log collection.
44
ms.topic: conceptual
5-
ms.date: 06/01/2020
5+
ms.date: 10/09/2020
66
---
77

88
# Configure agent data collection for Azure Monitor for containers
@@ -24,7 +24,7 @@ A template ConfigMap file is provided that allows you to easily edit it with you
2424
2525
### Data collection settings
2626

27-
The following are the settings that can be configured to control data collection.
27+
The following table describes the settings you can configure to control data collection:
2828

2929
| Key | Data type | Value | Description |
3030
|--|--|--|--|
@@ -38,16 +38,24 @@ The following are the settings that can be configured to control data collection
3838
| `[log_collection_settings.enrich_container_logs] enabled =` | Boolean | true or false | This setting controls container log enrichment to populate the Name and Image property values<br> for every log record written to the ContainerLog table for all container logs in the cluster.<br> It defaults to `enabled = false` when not specified in ConfigMap. |
3939
| `[log_collection_settings.collect_all_kube_events]` | Boolean | true or false | This setting allows the collection of Kube events of all types.<br> By default the Kube events with type *Normal* are not collected. When this setting is set to `true`, the *Normal* events are no longer filtered and all events are collected.<br> By default, this is set to `false`. |
4040

41+
### Metric collection settings
42+
43+
The following table describes the settings you can configure to control metric collection:
44+
45+
| Key | Data type | Value | Description |
46+
|--|--|--|--|
47+
| `[metric_collection_settings.collect_kube_system_pv_metrics] enabled =` | Boolean | true or false | This setting allows persistent volume (PV) usage metrics to be collected in the kube-system namespace. By default, usage metrics for persistent volumes with persistent volume claims in the kube-system namespace are not collected. When this setting is set to `true`, PV usage metrics for all namespaces are collected. By default, this is set to `false`. |
48+
4149
ConfigMaps is a global list and there can be only one ConfigMap applied to the agent. You cannot have another ConfigMaps overruling the collections.
4250

4351
## Configure and deploy ConfigMaps
4452

4553
Perform the following steps to configure and deploy your ConfigMap configuration file to your cluster.
4654

47-
1. [Download](https://github.com/microsoft/OMS-docker/blob/ci_feature_prod/Kubernetes/container-azm-ms-agentconfig.yaml) the template ConfigMap yaml file and save it as container-azm-ms-agentconfig.yaml.
55+
1. Download the [template ConfigMap YAML file](https://github.com/microsoft/Docker-Provider/blob/ci_prod/kubernetes/container-azm-ms-agentconfig.yaml) and save it as container-azm-ms-agentconfig.yaml.
4856

49-
>[!NOTE]
50-
>This step is not required when working with Azure Red Hat OpenShift since the ConfigMap template already exists on the cluster.
57+
> [!NOTE]
58+
> This step is not required when working with Azure Red Hat OpenShift because the ConfigMap template already exists on the cluster.
5159
5260
2. Edit the ConfigMap yaml file with your customizations to collect stdout, stderr, and/or environmental variables. If you are editing the ConfigMap yaml file for Azure Red Hat OpenShift, first run the command `oc edit configmaps container-azm-ms-agentconfig -n openshift-azure-logging` to open the file in a text editor.
5361

articles/azure-monitor/insights/container-insights-metric-alerts.md

Lines changed: 28 additions & 13 deletions
@@ -2,7 +2,7 @@
22
title: Metric alerts from Azure Monitor for containers
33
description: This article reviews the recommended metric alerts available from Azure Monitor for containers in public preview.
44
ms.topic: conceptual
5-
ms.date: 09/24/2020
5+
ms.date: 10/09/2020
66

77
---
88

@@ -41,6 +41,7 @@ To alert on what matters, Azure Monitor for containers includes the following me
4141
|Average container working set memory % |Calculates average working set memory used per container.|When average working set memory usage per container is greater than 95%. |
4242
|Average CPU % |Calculates average CPU used per node. |When average node CPU utilization is greater than 80% |
4343
|Average Disk Usage % |Calculates average disk usage for a node.|When disk usage for a node is greater than 80%. |
44+
|Average Persistent Volume Usage % |Calculates average PV usage per pod. |When average PV usage per pod is greater than 80%.|
4445
|Average Working set memory % |Calculates average Working set memory for a node. |When average Working set memory for a node is greater than 80%. |
4546
|Restarting container count |Calculates number of restarting containers. | When container restarts are greater than 0. |
4647
|Failed Pod Counts |Calculates if any pod in failed state.|When a number of pods in failed state are greater than 0. |
@@ -71,6 +72,8 @@ The following alert-based metrics have unique behavior characteristics compared
7172

7273
* *cpuExceededPercentage*, *memoryRssExceededPercentage*, and *memoryWorkingSetExceededPercentage* metrics are sent when the CPU, memory Rss, and Memory Working set values exceed the configured threshold (the default threshold is 95%). These thresholds are exclusive of the alert condition threshold specified for the corresponding alert rule. Meaning, if you want to collect these metrics and analyze them from [Metrics explorer](../platform/metrics-getting-started.md), we recommend you configure the threshold to a value lower than your alerting threshold. The configuration related to the collection settings for their container resource utilization thresholds can be overridden in the ConfigMaps file under the section `[alertable_metrics_configuration_settings.container_resource_utilization_thresholds]`. See the section [Configure alertable metrics ConfigMaps](#configure-alertable-metrics-in-configmaps) for details related to configuring your ConfigMap configuration file.
7374

75+
* *pvUsageExceededPercentage* metric is sent when the persistent volume usage percentage exceeds the configured threshold (the default threshold is 60%). This threshold is exclusive of the alert condition threshold specified for the corresponding alert rule. Meaning, if you want to collect these metrics and analyze them from [Metrics explorer](../platform/metrics-getting-started.md), we recommend you configure the threshold to a value lower than your alerting threshold. The configuration related to the collection settings for persistent volume utilization thresholds can be overridden in the ConfigMaps file under the section `[alertable_metrics_configuration_settings.pv_utilization_thresholds]`. See the section [Configure alertable metrics ConfigMaps](#configure-alertable-metrics-in-configmaps) for details related to configuring your ConfigMap configuration file. Collection of persistent volume metrics with claims in the *kube-system* namespace are excluded by default. To enable collection in this namespace, use the section `[metric_collection_settings.collect_kube_system_pv_metrics]` in the ConfigMap file. See [Metric collection settings](https://docs.microsoft.com/azure/azure-monitor/insights/container-insights-agent-config#metric-collection-settings) for details.
76+
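The recommendation to keep the collection threshold below the alerting threshold can be illustrated with a simplified model. The function names here are hypothetical and the comparison operators are assumptions drawn from the wording in this change ("exceeds or becomes equal to" for collection, "greater than" for the alert condition); the real alert rule evaluates aggregated averages rather than single samples.

```python
def metric_is_sent(usage_pct: float, collection_threshold_pct: float = 60.0) -> bool:
    """The agent emits pvUsageExceededPercentage only at or above the collection threshold."""
    return usage_pct >= collection_threshold_pct


def alert_can_fire(usage_pct: float,
                   collection_threshold_pct: float,
                   alert_threshold_pct: float = 80.0) -> bool:
    """An alert rule can only evaluate samples that were actually emitted."""
    emitted = metric_is_sent(usage_pct, collection_threshold_pct)
    return emitted and usage_pct > alert_threshold_pct


# Collection threshold (60) below the alert threshold (80): the 85% sample is visible.
print(alert_can_fire(85.0, collection_threshold_pct=60.0))
# Collection threshold (90) above the alert threshold (80): the 85% sample is never sent.
print(alert_can_fire(85.0, collection_threshold_pct=90.0))
```

In the second call the usage is above the alert threshold, yet nothing is emitted, so neither the alert rule nor Metrics explorer has data to work with.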
7477
## Metrics collected
7578

7679
The following metrics are enabled and collected, unless otherwise specified, as part of this feature:
@@ -93,6 +96,7 @@ The following metrics are enabled and collected, unless otherwise specified, as
9396
|Insights.container/containers |cpuExceededPercentage |CPU utilization percentage for containers exceeding user configurable threshold (default is 95.0) by container name, controller name, Kubernetes namespace, pod name.<br> Collected |
9497
|Insights.container/containers |memoryRssExceededPercentage |Memory RSS percentage for containers exceeding user configurable threshold (default is 95.0) by container name, controller name, Kubernetes namespace, pod name.|
9598
|Insights.container/containers |memoryWorkingSetExceededPercentage |Memory Working Set percentage for containers exceeding user configurable threshold (default is 95.0) by container name, controller name, Kubernetes namespace, pod name.|
99+
|Insights.container/persistentvolumes |pvUsageExceededPercentage |PV utilization percentage for persistent volumes exceeding user configurable threshold (default is 60.0) by claim name, Kubernetes namespace, volume name, pod name, and node name.|
96100

97101
## Enable alert rules
98102

@@ -203,29 +207,40 @@ To view alerts created for the enabled rules, in the **Recommended alerts** pane
203207
204208
## Configure alertable metrics in ConfigMaps
205209
206-
Perform the following steps to configure your ConfigMap configuration file to override the default container resource utilization thresholds. These steps are only applicable for the following alertable metrics.
210+
Perform the following steps to configure your ConfigMap configuration file to override the default utilization thresholds. These steps are applicable only for the following alertable metrics:
207211
208212
* *cpuExceededPercentage*
209213
* *memoryRssExceededPercentage*
210214
* *memoryWorkingSetExceededPercentage*
215+
* *pvUsageExceededPercentage*
211216
212-
1. Edit the ConfigMap yaml file under the section `[alertable_metrics_configuration_settings.container_resource_utilization_thresholds]`.
217+
1. Edit the ConfigMap YAML file under the section `[alertable_metrics_configuration_settings.container_resource_utilization_thresholds]` or `[alertable_metrics_configuration_settings.pv_utilization_thresholds]`.
213218
214-
2. To to modify the *cpuExceededPercentage* threshold to 90% and begin collection of this metric when that threshold is met and exceeded, configure the ConfigMap file using the following example.
219+
- To modify the *cpuExceededPercentage* threshold to 90% and begin collection of this metric when that threshold is met and exceeded, configure the ConfigMap file using the following example:
215220
216-
```
217-
container_cpu_threshold_percentage = 90.0
218-
# Threshold for container memoryRss, metric will be sent only when memory rss exceeds or becomes equal to the following percentage
219-
container_memory_rss_threshold_percentage = 95.0
220-
# Threshold for container memoryWorkingSet, metric will be sent only when memory working set exceeds or becomes equal to the following percentage
221-
container_memory_working_set_threshold_percentage = 95.0
222-
```
221+
```
222+
[alertable_metrics_configuration_settings.container_resource_utilization_thresholds]
223+
# Threshold for container cpu, metric will be sent only when cpu utilization exceeds or becomes equal to the following percentage
224+
container_cpu_threshold_percentage = 90.0
225+
# Threshold for container memoryRss, metric will be sent only when memory rss exceeds or becomes equal to the following percentage
226+
container_memory_rss_threshold_percentage = 95.0
227+
# Threshold for container memoryWorkingSet, metric will be sent only when memory working set exceeds or becomes equal to the following percentage
228+
container_memory_working_set_threshold_percentage = 95.0
229+
```
230+
231+
- To modify the *pvUsageExceededPercentage* threshold to 80% and begin collection of this metric when that threshold is met and exceeded, configure the ConfigMap file using the following example:
232+
233+
```
234+
[alertable_metrics_configuration_settings.pv_utilization_thresholds]
235+
# Threshold for persistent volume usage bytes, metric will be sent only when persistent volume utilization exceeds or becomes equal to the following percentage
236+
pv_usage_threshold_percentage = 80.0
237+
```
223238
224-
3. Run the following kubectl command: `kubectl apply -f <configmap_yaml_file.yaml>`.
239+
2. Run the following kubectl command: `kubectl apply -f <configmap_yaml_file.yaml>`.
225240
226241
Example: `kubectl apply -f container-azm-ms-agentconfig.yaml`.
227242
228-
The configuration change can take a few minutes to finish before taking effect, and all omsagent pods in the cluster will restart. The restart is a rolling restart for all omsagent pods, not all restart at the same time. When the restarts are finished, a message is displayed that's similar to the following and includes the result: `configmap "container-azm-ms-agentconfig" created`.
243+
The configuration change can take a few minutes to finish before taking effect, and all omsagent pods in the cluster will restart. The restart is a rolling restart for all omsagent pods; they don't all restart at the same time. When the restarts are finished, a message is displayed that's similar to the following example and includes the result: `configmap "container-azm-ms-agentconfig" created`.
229244
230245
## Next steps
231246

articles/azure-monitor/insights/container-insights-update-metrics.md

Lines changed: 2 additions & 1 deletion
@@ -2,7 +2,7 @@
22
title: How to update Azure Monitor for containers for metrics | Microsoft Docs
33
description: This article describes how you update Azure Monitor for containers to enable the custom metrics feature that supports exploring and alerting on aggregated metrics.
44
ms.topic: conceptual
5-
ms.date: 09/24/2020
5+
ms.date: 10/09/2020
66
ms.custom: devx-track-azurecli
77

88
---
@@ -22,6 +22,7 @@ The following metrics are enabled as part of this feature:
2222
| Insights.container/nodes | cpuUsageMillicores, cpuUsagePercentage, memoryRssBytes, memoryRssPercentage, memoryWorkingSetBytes, memoryWorkingSetPercentage, nodesCount, diskUsedPercentage, | As *node* metrics, they include *host* as a dimension. They also include the<br> node's name as value for the *host* dimension. |
2323
| Insights.container/pods | podCount, completedJobsCount, restartingContainerCount, oomKilledContainerCount, podReadyPercentage | As *pod* metrics, they include the following as dimensions - ControllerName, Kubernetes namespace, name, phase. |
2424
| Insights.container/containers | cpuExceededPercentage, memoryRssExceededPercentage, memoryWorkingSetExceededPercentage | |
25+
| Insights.container/persistentvolumes | pvUsageExceededPercentage | |
2526

2627
To support these new capabilities, a new containerized agent is included in the release, version **microsoft/oms:ciprod05262020** for AKS and version **microsoft/oms:ciprod09252020** for Azure Arc enabled Kubernetes clusters. New deployments of AKS automatically include this configuration change and capabilities. Updating your cluster to support this feature can be performed from the Azure portal, Azure PowerShell, or with Azure CLI. With Azure PowerShell and CLI, you can enable this per-cluster or for all clusters in your subscription.
2728
