Commit dd747ab

Merge pull request #2 from Khushbu-Parekh/deanwe-patch-1
Update advanced-network-observability-concepts.md
2 parents 6571d5f + eb01c9b commit dd747ab

1 file changed: +30 -32 lines changed

articles/aks/advanced-network-observability-concepts.md

Lines changed: 30 additions & 32 deletions
Original file line number | Diff line number | Diff line change
@@ -1,6 +1,6 @@
11
---
22
title: Advanced Network Observability - Advanced Container Networking Services for Azure Kubernetes Service (AKS)
3-
description: An overview of Advanced Network Observability - Advanced Container Networking Services for Azure Kubernetes Service (AKS).
3+
description: An overview of Advanced Container Networking Services' Advanced Network Observability capabilities for Azure Kubernetes Service (AKS).
44
author: Khushbu-Parekh
55
ms.author: kparekh
66
ms.service: azure-kubernetes-service
@@ -13,61 +13,59 @@ ms.date: 05/10/2024
1313

1414
Advanced Network Observability is the foundation of the [Advanced Container Networking Services](advanced-container-networking-services-overview.md) suite. It equips you with next-level monitoring and diagnostics tools, providing unparalleled visibility into your containerized workloads. These tools empower you to pinpoint and troubleshoot network issues with ease, ensuring optimal performance for your applications.
1515

16-
Advanced Network Observability offers compatibility across all Linux workloads. It seamlessly integrates with Hubble, regardless of the underlying data planes.
16+
Advanced Network Observability is compatible with all Linux workloads and integrates seamlessly with Hubble, regardless of whether the underlying data plane is Cilium or non-Cilium. Both are supported, ensuring flexibility for your container networking needs.
1717

18-
Advanced Container Networking Services offers support for both Cilium and non-Cilium data planes, ensuring flexibility for your container networking needs.
18+
* Cilium data plane: A high-performance, OSS (open-source), eBPF-based data plane specifically designed for Kubernetes environments. For more information, see [Cilium](https://cilium.io/).
1919

20-
* Cilium Data plane: This is a high-performance, eBPF-based data plane specifically designed for Kubernetes environments. This data plane is Powered by Open-source project [Cilium](https://cilium.io/).
21-
22-
* Non-Cilium Data plane: For non-cilium data plane users, we are using an ebpf based open-source project [Retina](https://retina.sh) to collect network related metrics.
20+
* Non-Cilium data plane: For non-Cilium data plane scenarios, Advanced Network Observability uses Retina, an OSS, eBPF-based network observability project originally built by Microsoft, to collect network-related metrics. For more information, see [Retina](https://retina.sh).
2321

2422
:::image type="content" source="./media/advanced-container-networking-services/advanced-network-observability.png" alt-text="Diagram of Advanced Network Observability.":::
2523

2624
> [!NOTE]
27-
> For deployments leveraging Cilium data planes, Advanced Network Observability is readily available starting with Kubernetes version 1.29.
28-
> For Non-Cilium Linux data planes, Advanced Network Observability is supported on all Linux distributions. Azure Linux is supported starting with version 2.0 and greater.
25+
> For Cilium data plane scenarios, Advanced Network Observability is available beginning with Kubernetes version 1.29.
26+
> For non-Cilium data plane scenarios, Advanced Network Observability is supported on all Linux distributions, including Azure Linux beginning with version 2.0.
2927
3028
## Features of Advanced Network Observability
3129

3230
Advanced Network Observability offers the following capabilities to monitor network-related issues in your cluster:
3331

34-
* **Node-Level Metrics:** Understanding the health of your container network at the node-level is crucial for maintaining optimal application performance. These metrics indicate traffic volume, dropped packets, number of connections, etc. by node. Since they are Prometheus metrics, you can view them in Grafana or create custom alerts.
32+
* **Node-Level Metrics:** Understanding the health of your container network at the node level is crucial for maintaining optimal application performance. These metrics provide insights into traffic volume, dropped packets, number of connections, etc. by node. The metrics are stored in Prometheus format and, as such, you can view them in Grafana.
3533

36-
* **Hubble Metrics (DNS and Pod-Level Metrics):** These Prometheus metrics include source/destination Pod information, empowering you to pinpoint network-related issues at a granular level. Metrics cover traffic volume, dropped packets, TCP resets, L4/L7 packet flows, etc. There are also DNS metrics (currently only for Non-Cilium data planes), covering DNS errors and DNS requests missing responses.
34+
* **Hubble Metrics (DNS and Pod-Level Metrics):** These Prometheus metrics include source and destination pod information, allowing you to pinpoint network-related issues at a granular level. Metrics cover traffic volume, dropped packets, TCP resets, L4/L7 packet flows, etc. There are also DNS metrics (currently only for non-Cilium data planes), covering DNS errors and DNS requests missing responses.
3735

38-
* **Hubble Flow Logs:** Flow logs unlock deep visibility into your cluster's network activity. All communications to/from Pods are logged, allowing you to investigate connectivity issues and more. Flow logs help answer questions such as: did the server receive the client's request? What is the round-trip latency between the client's request and server's response?
36+
* **Hubble Flow Logs:** Flow logs provide deep visibility into your cluster's network activity. All communications to and from pods are logged, allowing you to investigate connectivity issues over time. Flow logs help answer questions such as: Did the server receive the client's request? What is the round-trip latency between the client's request and the server's response?
3937

40-
* **Hubble CLI:** The Hubble Command-Line Interface (CLI) provides a means to retrieve flow logs from across the cluster with customizable filtering and formatting.
38+
* **Hubble CLI:** The Hubble Command-Line Interface (CLI) can retrieve flow logs across the entire cluster with customizable filtering and formatting.
4139

42-
* **Hubble UI:** Hubble UI is a user-friendly web-browser interface for exploring your cluster's network activity. It creates a service-connection graph based on Flow logs, and it also displays flow logs for the selected namespace. You're responsible for provisioning and managing the infrastructure required to run Hubble UI.
40+
* **Hubble UI:** Hubble UI is a user-friendly browser-based interface for exploring cluster network activity. It creates a service-connection graph based on flow logs, and displays flow logs for the selected namespace. Users are responsible for provisioning and managing the infrastructure required to run Hubble UI.
4341

4442
## Key Benefits of Advanced Network Observability
4543

46-
* **CNI-Agnostic**: Supported on kubenet and all Azure CNI modes.
44+
* **CNI-Agnostic**: Supported on kubenet and all Azure CNI variants.
4745

48-
* **Cilium and Non-Cilium**: Uniform and seamless experience across Cilium and Non-Cilium data planes.
46+
* **Cilium and Non-Cilium**: Provides a uniform, seamless experience across both Cilium and non-Cilium data planes.
4947

50-
* **eBPF-Based Network Observability:** Identify potential bottlenecks and congestion issues before they impact application performance. Gain insights into key network health indicators, including traffic volume, dropped packets, and connection information.
48+
* **eBPF-Based Network Observability:** Leverages eBPF (extended Berkeley Packet Filter) for performance and scalability to identify potential bottlenecks and congestion issues before they impact application performance. Gain insights into key network health indicators, including traffic volume, dropped packets, and connection information.
5149

5250
* **Deep Visibility into Network Activity:** Understand how your applications are communicating with each other through detailed network flow logs.
5351

54-
* **Simplified monitoring options**: Choose between:
55-
* **Azure Managed Prometheus and Grafana**: With this option, Azure manages the infrastructure and maintenance, allowing you to focus on configuring and visualizing metrics.
56-
* **Bring your own (BYO) Prometheus and Grafana**: With this option, you set up your own instances and manage the underlying infrastructure.
52+
* **Simplified Metrics Storage and Visualization Options**: Choose between:
53+
* **Azure Managed Prometheus and Grafana**: Azure manages the infrastructure and maintenance, allowing users to focus on configuring and visualizing metrics.
54+
* **Bring Your Own (BYO) Prometheus and Grafana**: Users deploy and configure their own instances and manage the underlying infrastructure.
5755

5856
## Metrics
5957

6058
### Node-Level Metrics
6159

62-
The following metrics are aggregated per Node. All metrics include the labels:
60+
The following metrics are aggregated per node. All metrics include labels:
6361

6462
* `cluster`
6563
* `instance` (Node name)
6664

6765
#### [**Non-Cilium**](#tab/non-cilium)
6866

69-
On Non-Cilium data plane, the Network Observability add-on provides metrics in both Linux and Windows platforms.
70-
The below table outlines the different metrics generated.
67+
For non-Cilium data plane scenarios, Advanced Network Observability provides metrics for both Linux and Windows operating systems.
68+
The table below outlines the different metrics generated.
7169

7270
| Metric Name | Description | Extra Labels | Linux | Windows |
7371
|------------------------------------------------|-------------|--------------|-------|---------|
@@ -86,47 +84,47 @@ The below table outlines the different metrics generated.
8684

8785
#### [**Cilium**](#tab/cilium)
8886

89-
Cilium currently only supports Linux nodes.
90-
It exposes several metrics including the following for network observability.
87+
For Cilium data plane scenarios, Advanced Network Observability provides metrics only for Linux. Windows is currently not supported.
88+
Cilium exposes several metrics, including the following, which are used by Advanced Network Observability.
9189

9290
| Metric Name | Description | Extra Labels |Linux | Windows |
9391
|--------------------------------|------------------------------|-----------------------|-------|---------|
9492
| **cilium_forward_count_total** | Total forwarded packet count | `direction` |||
95-
| **cilium_forward_bytes_total** | Total forwarded byte count | `direction` |||
93+
| **cilium_forward_bytes_total** | Total forwarded byte count | `direction` | ||
9694
| **cilium_drop_count_total** | Total dropped packet count | `direction`, `reason` |||
9795
| **cilium_drop_bytes_total** | Total dropped byte count | `direction`, `reason` |||
9896

9997
---
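
As a rough illustration of how the node-level counters above might be consumed outside Grafana, the following Python sketch parses a few sample lines in the Prometheus text exposition format and computes a per-node drop ratio from `cilium_drop_count_total` and `cilium_forward_count_total`. The sample values, the node names, and the helper names (`parse_samples`, `drop_ratio_by_node`) are assumptions made for this example and are not part of Advanced Network Observability. The parser is deliberately minimal and does not handle escaped quotes or braces inside label values.

```python
import re
from collections import defaultdict

# Matches one labeled sample line of the Prometheus text exposition format:
#   metric_name{label="value",...} 123
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text):
    """Yield (metric_name, labels_dict, value) for each labeled sample line."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match:
            labels = dict(LABEL_RE.findall(match.group("labels")))
            yield match.group("name"), labels, float(match.group("value"))


def drop_ratio_by_node(text):
    """Return {instance: dropped / (dropped + forwarded)}, summed over all directions."""
    dropped = defaultdict(float)
    forwarded = defaultdict(float)
    for name, labels, value in parse_samples(text):
        node = labels.get("instance", "unknown")
        if name == "cilium_drop_count_total":
            dropped[node] += value
        elif name == "cilium_forward_count_total":
            forwarded[node] += value
    ratios = {}
    for node in set(dropped) | set(forwarded):
        total = dropped[node] + forwarded[node]
        ratios[node] = dropped[node] / total if total else 0.0
    return ratios


# Hypothetical sample data; real label values will differ.
SAMPLE = """
cilium_forward_count_total{cluster="demo",instance="node-a",direction="EGRESS"} 900
cilium_forward_count_total{cluster="demo",instance="node-a",direction="INGRESS"} 1000
cilium_drop_count_total{cluster="demo",instance="node-a",direction="INGRESS",reason="Policy denied"} 100
cilium_forward_count_total{cluster="demo",instance="node-b",direction="EGRESS"} 50
"""

if __name__ == "__main__":
    for node, ratio in sorted(drop_ratio_by_node(SAMPLE).items()):
        print(f"{node}: {ratio:.2%}")
```

Because these metrics are cumulative counters, this ratio reflects totals since the counters started. For alerting, a rate over a time window computed in Prometheus is usually the better fit.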
10098

10199
### Pod-Level Metrics (Hubble Metrics)
102100

103-
The following metrics are aggregated per Pod (still containing Node information). All metrics include the labels:
101+
The following metrics are aggregated per pod (node information is preserved). All metrics include labels:
104102

105103
* `cluster`
106104
* `instance` (Node name)
107105
* `source` or `destination`
108106

109-
For *outgoing traffic*, there will be a `source` label with source Pod namespace/name.
110-
For *incoming traffic*, there will be a `destination` label with destination Pod namespace/name.
107+
For *outgoing traffic*, there will be a `source` label with source pod namespace/name.
108+
For *incoming traffic*, there will be a `destination` label with destination pod namespace/name.
111109

112110
| Metric Name | Description | Extra Labels | Linux | Windows |
113111
|----------------------------------|------------------------------|-----------------------|-------|---------|
114112
| **hubble_dns_queries_total** | Total DNS requests by query | `source` or `destination`, `query`, `qtypes` (query type) |||
115113
| **hubble_dns_responses_total** | Total DNS responses by query/response | `source` or `destination`, `query`, `qtypes` (query type), `rcode` (return code), `ips_returned` (number of IPs) |||
116114
| **hubble_drop_total** | Total dropped packet count | `source` or `destination`, `protocol`, `reason` |||
117-
| **hubble_tcp_flags_total** | Toctal TCP packets count by flag. | `source` or `destination`, `flag` |||
115+
| **hubble_tcp_flags_total** | Total TCP packet count by flag | `source` or `destination`, `flag` |||
118116
| **hubble_flows_processed_total** | Total network flows processed (L4/L7 traffic) | `source` or `destination`, `protocol`, `verdict`, `type`, `subtype` |||
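
The `source` and `destination` labels described above carry the pod as `namespace/name`. The following Python sketch shows one way already-scraped `hubble_drop_total` samples could be grouped per pod and traffic direction. The sample label values (pod names, the `POLICY_DENIED` reason string) and the helper names are hypothetical and chosen only for illustration. The actual label values emitted in your cluster may differ.

```python
from collections import Counter


def split_pod(label_value):
    """Split a 'namespace/name' label value into (namespace, pod_name)."""
    namespace, _, name = label_value.partition("/")
    return namespace, name


def drops_by_pod(samples):
    """Sum drop counts per (direction, namespace, pod, reason).

    `samples` is an iterable of (labels_dict, value) pairs for hubble_drop_total.
    """
    totals = Counter()
    for labels, value in samples:
        # Outgoing traffic carries `source`; incoming traffic carries `destination`.
        if "source" in labels:
            direction, pod_label = "outgoing", labels["source"]
        elif "destination" in labels:
            direction, pod_label = "incoming", labels["destination"]
        else:
            continue  # no pod context on this sample
        namespace, pod = split_pod(pod_label)
        totals[(direction, namespace, pod, labels.get("reason", ""))] += value
    return totals


# Hypothetical samples; label values are illustrative only.
SAMPLES = [
    ({"cluster": "demo", "instance": "node-a", "source": "default/web-1",
      "protocol": "TCP", "reason": "POLICY_DENIED"}, 4),
    ({"cluster": "demo", "instance": "node-b", "source": "default/web-1",
      "protocol": "TCP", "reason": "POLICY_DENIED"}, 3),
    ({"cluster": "demo", "instance": "node-a", "destination": "kube-system/coredns-0",
      "protocol": "UDP", "reason": "POLICY_DENIED"}, 2),
]

if __name__ == "__main__":
    for key, count in drops_by_pod(SAMPLES).most_common():
        print(key, count)
```

Grouping by pod rather than by `instance` merges drops for the same pod reported from different nodes, which is why the two `default/web-1` samples collapse into a single entry.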
119117

120118
### Limitations
121119

122-
* Pod-level metrics available only on Linux.
120+
* Pod-level metrics are available only on Linux.
123121
* Cilium data plane is supported starting with Kubernetes version 1.29.
124-
* Metric labels may have subtle differences between Cilium and Non-Cilium clusters.
122+
* Metric labels may have subtle differences between Cilium and non-Cilium clusters.
125123
* Cilium data plane does not currently support DNS metrics.
126124

127125
### Scale
128126

129-
Certain scale limitations apply when you use Azure managed Prometheus and Grafana. For more information, see [Scrape Prometheus metrics at scale in Azure Monitor](/azure/azure-monitor/essentials/prometheus-metrics-scrape-scale)
127+
Azure managed Prometheus and Grafana impose service-specific scale limitations. For more information, see [Scrape Prometheus metrics at scale in Azure Monitor](/azure/azure-monitor/essentials/prometheus-metrics-scrape-scale).
130128

131129
## Next steps
132130
