Commit 10e0017

committed
fixes
1 parent fa2afa1 commit 10e0017

1 file changed

+27
-26
lines changed

articles/virtual-network/kubernetes-network-policies.md

Lines changed: 27 additions & 26 deletions
@@ -21,6 +21,7 @@ Azure Network Policy Management implementation works with the Azure CNI that pro
 When implementing security for your cluster, use network security groups (NSGs) to filter traffic entering and leaving your cluster subnet (North-South traffic). Use Azure Network Policy Manager for traffic between pods in your cluster (East-West traffic).
 
 ## Using Azure Network Policy Manager
+
 Azure Network Policy Manager can be used in the following ways to provide micro-segmentation for pods.
 
 ### Azure Kubernetes Service (AKS)
@@ -49,7 +50,7 @@ For Windows:
 
 The solution is also open source and the code is available on the [Azure Container Networking repository](https://github.com/Azure/azure-container-networking/tree/master/Network Policy Manager).
 
-## Monitor and visualize network configurations with Azure Network Policy Manager
+## Monitor and visualize network configurations with Azure NPM
 
 Azure Network Policy Manager includes informative Prometheus metrics that allow you to monitor and better understand your configurations. It provides built-in visualizations in either the Azure portal or Grafana Labs. You can start collecting these metrics using either Azure Monitor or a Prometheus server.
 
@@ -81,25 +82,25 @@ See a [configuration for these alerts](#set-up-alerts-for-alertmanager) as follo
 
 2. Correlate cluster counts (for example, ACLs) to execution times.
 
-3. Get the human-friendly name of an ipset in a given IPTables rule (for example, `azure-Network Policy Manager-487392` represents `podlabel-role:database`).
+3. Get the human-friendly name of an ipset in a given IPTables rule (for example, `azure-npm-487392` represents `podlabel-role:database`).
 
 ### All supported metrics
 
 The following list is of supported metrics. Any `quantile` label has possible values `0.5`, `0.9`, and `0.99`. Any `had_error` label has possible values `false` and `true`, representing whether the operation succeeded or failed.
 
 | Metric Name | Description | Prometheus Metric Type | Labels |
 | ----- | ----- | ----- | ----- |
-| `Network Policy Manager_num_policies` | number of network policies | Gauge | - |
-| `Network Policy Manager_num_iptables_rules` | number of IPTables rules | Gauge | - |
-| `Network Policy Manager_num_ipsets` | number of IPSets | Gauge | - |
-| `Network Policy Manager_num_ipset_entries` | number of IP address entries in all IPSets | Gauge | - |
-| `Network Policy Manager_add_iptables_rule_exec_time` | runtime for adding an IPTables rule | Summary | `quantile` |
-| `Network Policy Manager_add_ipset_exec_time` | runtime for adding an IPSet | Summary | `quantile` |
-| `Network Policy Manager_ipset_counts` (advanced) | number of entries within each individual IPSet | GaugeVec | `set_name` & `set_hash` |
-| `Network Policy Manager_add_policy_exec_time` | runtime for adding a network policy | Summary | `quantile` & `had_error` |
-| `Network Policy Manager_controller_policy_exec_time` | runtime for updating/deleting a network policy | Summary | `quantile` & `had_error` & `operation` (with values `update` or `delete`) |
-| `Network Policy Manager_controller_namespace_exec_time` | runtime for creating/updating/deleting a namespace | Summary | `quantile` & `had_error` & `operation` (with values `create`, `update`, or `delete`) |
-| `Network Policy Manager_controller_pod_exec_time` | runtime for creating/updating/deleting a pod | Summary | `quantile` & `had_error` & `operation` (with values `create`, `update`, or `delete`) |
+| `npm_num_policies` | number of network policies | Gauge | - |
+| `npm_num_iptables_rules` | number of IPTables rules | Gauge | - |
+| `npm_num_ipsets` | number of IPSets | Gauge | - |
+| `npm_num_ipset_entries` | number of IP address entries in all IPSets | Gauge | - |
+| `npm_add_iptables_rule_exec_time` | runtime for adding an IPTables rule | Summary | `quantile` |
+| `npm_add_ipset_exec_time` | runtime for adding an IPSet | Summary | `quantile` |
+| `npm_ipset_counts` (advanced) | number of entries within each individual IPSet | GaugeVec | `set_name` & `set_hash` |
+| `npm_add_policy_exec_time` | runtime for adding a network policy | Summary | `quantile` & `had_error` |
+| `npm_controller_policy_exec_time` | runtime for updating/deleting a network policy | Summary | `quantile` & `had_error` & `operation` (with values `update` or `delete`) |
+| `npm_controller_namespace_exec_time` | runtime for creating/updating/deleting a namespace | Summary | `quantile` & `had_error` & `operation` (with values `create`, `update`, or `delete`) |
+| `npm_controller_pod_exec_time` | runtime for creating/updating/deleting a pod | Summary | `quantile` & `had_error` & `operation` (with values `create`, `update`, or `delete`) |
 
 There are also "exec_time_count" and "exec_time_sum" metrics for each "exec_time" Summary metric.
 
@@ -142,7 +143,7 @@ Besides viewing the workbook, you can also directly query the Prometheus metrics
 
 ```query
 | where TimeGenerated > ago(5h)
-| where Name contains "Network Policy Manager_"
+| where Name contains "npm_"
 ```
 
 You can also query log analytics directly for the metrics. Learn more about it with [Getting Started with Log Analytics Queries](../azure-monitor/containers/container-insights-log-query.md)
@@ -175,7 +176,7 @@ helm install prometheus stable/prometheus -n monitoring \
 where `prometheus-server-scrape-config.yaml` consists of
 
 ```
-- job_name: "azure-Network Policy Manager-node-metrics"
+- job_name: "azure-npm-node-metrics"
   metrics_path: /node-metrics
   kubernetes_sd_configs:
   - role: node
@@ -185,7 +186,7 @@ where `prometheus-server-scrape-config.yaml` consists of
     regex: ([^:]+)(?::\d+)?
     replacement: "$1:10091"
     target_label: __address__
-- job_name: "azure-Network Policy Manager-cluster-metrics"
+- job_name: "azure-npm-cluster-metrics"
   metrics_path: /cluster-metrics
   kubernetes_sd_configs:
   - role: service
@@ -194,19 +195,19 @@ where `prometheus-server-scrape-config.yaml` consists of
     regex: kube-system
     action: keep
   - source_labels: [__meta_kubernetes_service_name]
-    regex: Network Policy Manager-metrics-cluster-service
+    regex: npm-metrics-cluster-service
     action: keep
   # Comment from here to the end to collect advanced metrics: number of entries for each IPSet
   metric_relabel_configs:
   - source_labels: [__name__]
-    regex: Network Policy Manager_ipset_counts
+    regex: npm_ipset_counts
     action: drop
 ```
 
-You can also replace the `azure-Network Policy Manager-node-metrics` job with the following content or incorporate it into a pre-existing job for Kubernetes pods:
+You can also replace the `azure-npm-node-metrics` job with the following content or incorporate it into a pre-existing job for Kubernetes pods:
 
 ```
-- job_name: "azure-Network Policy Manager-node-metrics-from-pod-config"
+- job_name: "azure-npm-node-metrics-from-pod-config"
   metrics_path: /node-metrics
   kubernetes_sd_configs:
   - role: pod
@@ -229,21 +230,21 @@ If you use a Prometheus server, you can set up an AlertManager like so. Here's a
 
 ```
 groups:
-- name: Network Policy Manager.rules
+- name: npm.rules
   rules:
   # fire when Network Policy Manager has a new failure with an OS call or when translating a Network Policy (suppose there's a scraping interval of 5m)
   - alert: AzureNetwork Policy ManagerFailureCreatePolicy
     # this expression says to grab the current count minus the count 5 minutes ago, or grab the current count if there was no data 5 minutes ago
-    expr: (Network Policy Manager_add_policy_exec_time_count{had_error='true'} - (Network Policy Manager_add_policy_exec_time_count{had_error='true'} offset 5m)) or Network Policy Manager_add_policy_exec_time_count{had_error='true'}
+    expr: (npm_add_policy_exec_time_count{had_error='true'} - (npm_add_policy_exec_time_count{had_error='true'} offset 5m)) or npm_add_policy_exec_time_count{had_error='true'}
     labels:
       severity: warning
-      addon: azure-Network Policy Manager
+      addon: azure-npm
     annotations:
       summary: "Azure Network Policy Manager failed to handle a policy create event"
       description: "Current failure count since Network Policy Manager started: {{ $value }}"
   # fire when the median time to apply changes for a pod create event is more than 100 milliseconds.
-  - alert: AzureNetwork Policy ManagerHighControllerPodCreateTimeMedian
-    expr: topk(1, Network Policy Manager_controller_pod_exec_time{operation="create",quantile="0.5",had_error="false"}) > 100.0
+  - alert: AzurenpmHighControllerPodCreateTimeMedian
+    expr: topk(1, npm_controller_pod_exec_time{operation="create",quantile="0.5",had_error="false"}) > 100.0
     labels:
       severity: warning
       addon: azure-Network Policy Manager
@@ -252,7 +253,7 @@ groups:
       # could have a simpler description like the one for the alert above,
       # but this description includes the number of pod creates that were handled in the past 10 minutes,
       # which is the retention period for observations when calculating quantiles for a Prometheus Summary metric
-      description: "value: [{{ $value }}] and observation count: [{{ printf `(Network Policy Manager_controller_pod_exec_time_count{operation='create',pod='%s',had_error='false'} - (Network Policy Manager_controller_pod_exec_time_count{operation='create',pod='%s',had_error='false'} offset 10m)) or Network Policy Manager_controller_pod_exec_time_count{operation='create',pod='%s',had_error='false'}` $labels.pod $labels.pod $labels.pod | query | first | value }}] for pod: [{{ $labels.pod }}]"
+      description: "value: [{{ $value }}] and observation count: [{{ printf `(npm_controller_pod_exec_time_count{operation='create',pod='%s',had_error='false'} - (npm_controller_pod_exec_time_count{operation='create',pod='%s',had_error='false'} offset 10m)) or npm_controller_pod_exec_time_count{operation='create',pod='%s',had_error='false'}` $labels.pod $labels.pod $labels.pod | query | first | value }}] for pod: [{{ $labels.pod }}]"
 ```
 
 ### Visualization options for Prometheus
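
Review note: every removed line in this diff has the product name "Network Policy Manager" sitting where the identifier token `npm` belongs (metric names, job names, the service regex, the rule group name). The following is a minimal Python sketch of how that pattern could be detected and reversed mechanically. It is not part of the commit; the helper name `restore_identifiers` and the chosen patterns are assumptions for illustration, and they deliberately leave prose mentions of the product name alone.

```python
import re

# Hypothetical helper, not part of commit 10e0017.
WRONG = re.escape("Network Policy Manager")

# Each pattern matches the product name only in an identifier context,
# so sentences such as "Azure Network Policy Manager includes ..." are untouched.
_PATTERNS = [
    re.compile(WRONG + r"(?=_)"),          # metric names: npm_num_policies
    re.compile(WRONG + r"(?=\.rules)"),    # rule group: npm.rules
    re.compile(WRONG + r"(?=-metrics)"),   # service name: npm-metrics-cluster-service
    re.compile(r"(?<=azure-)" + WRONG),    # job names and labels: azure-npm-...
]


def restore_identifiers(text: str) -> str:
    """Replace the product name with `npm` wherever it appears inside an identifier."""
    for pattern in _PATTERNS:
        text = pattern.sub("npm", text)
    return text


if __name__ == "__main__":
    print(restore_identifiers('| where Name contains "Network Policy Manager_"'))
    print(restore_identifiers("addon: azure-Network Policy Manager"))
```

Under these assumed patterns, the sketch would also rewrite the `addon: azure-Network Policy Manager` label in the second alert, which this diff shows as an unchanged context line. The `alert:` names that embed the product name without a separator are not covered by any of the patterns.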
