|
| 1 | +--- |
| 2 | +title: Azure Operator Nexus Observability Metrics |
| 3 | +description: Observability metrics in Azure Operator Nexus |
| 4 | +ms.topic: article |
| 5 | +ms.date: 02/27/2024 |
| 6 | +author: joemarshallmsft |
| 7 | +ms.author: joemarshall |
| 8 | +ms.service: azure-operator-nexus |
| 9 | +--- |
| 10 | + |
| 11 | +# Azure Operator Nexus Observability Metrics |
| 12 | + |
| 13 | +In Operator Nexus Network Fabric (NNF), Ethernet monitoring is a critical component in maintaining optimal network performance, ensuring availability, and proactively addressing potential issues before they cause disruptions in the fabric. Monitoring includes traffic analysis, device health, security, and details specific to individual Ethernet interfaces. By closely monitoring the fabric infrastructure, we can ensure that NNF operates smoothly and efficiently, and that any potential problems are identified and addressed early on. |
| 14 | + |
| 15 | +The following aspects of NNF devices are monitored: |
| 16 | + |
| 17 | +- **Availability**: Monitoring device connectivity confirms that the network is reachable and helps prevent downtime. |
| 18 | + |
| 19 | +- **Performance**: Tracking metrics such as interface bandwidth utilization, packet loss, latency, and jitter lets us evaluate network performance and pinpoint bottlenecks. |
| 20 | + |
| 21 | +- **Security**: Monitoring helps to identify suspicious activity, unauthorized access attempts, and potential security threats on the network. |
| 22 | + |
| 23 | +- **Health**: Monitoring device CPU, memory, temperature, fan status, power supply status, and interface operational status lets us identify potential failures. |
| 24 | + |
| 25 | +## ACL state counters |
| 26 | + |
| 27 | +State counters for Access Control Lists (ACLs) in a network device help you oversee and control network traffic. They report the number of packets that matched each ACL entry. These counters can be examined globally or per interface, and separately for incoming and outgoing traffic. |
| 28 | + |
| 29 | + |
| 30 | +| Metrics Category | Description/Usage | Collection interval | Measured unit | |
| 31 | +|--|--|--|--| |
| 32 | +| ACL (Access List) Matched Packets | The total count of network packets that match the criteria set by the current Access Control List (ACL) entry in a network device. This count helps in monitoring and managing network traffic. | 5 mins | Number of packets | |
| 33 | + |
| 34 | +## BGP status |
| 35 | + |
| 36 | +Border Gateway Protocol (BGP) sessions carry routing information between BGP peers, so their state directly affects network performance. Network administrators can detect network problems or disruptions by observing the connection state of each peer. For example, a connection that remains in the 'Idle' state could suggest a configuration problem. The 'Established' state, which indicates that the peers can exchange routing information, is essential for the network to function correctly. |
| 37 | + |
| 38 | +| Metrics Category | Description/Usage | Collection interval | Measured unit | |
| 39 | +|--|--|--|--| |
| 40 | +| BGP Peer Status | The BGP peer status, as defined by [RFC 4271](https://datatracker.ietf.org/doc/html/rfc4271), and summarized after this table. | 5 mins and on demand | N/A | |
| 41 | + |
| 42 | +The BGP connection states are: |
| 43 | + |
| 44 | +- **Idle (1):** The initial state of a BGP connection. |
| 45 | +- **Connect (2):** The system is waiting for the TCP connection to be completed. |
| 46 | +- **Active (3):** The system is trying to acquire a peer by listening for, and accepting, a TCP connection. |
| 47 | +- **OpenSent (4):** The system is waiting to receive an OPEN message from the peer. |
| 48 | +- **OpenConfirm (5):** The system is waiting for a KEEPALIVE or NOTIFICATION message from the peer. |
| 49 | +- **Established (6):** The BGP connection is fully established and the peers can exchange UPDATE messages. |
| 50 | + |
| 51 | +## Component operational state |
| 52 | + |
| 53 | +The operational state of a hardware or software component indicates whether the component is currently functioning. |
| 54 | + |
| 55 | +| Metrics Category | Description/Usage | Collection interval | Measured unit | |
| 56 | +|--|--|--|--| |
| 57 | +| Component Operation Status | Operational status of the entities that can be part of the device's inventory, such as line cards, transceivers, fans, and power supplies. The possible values are described after this table. | 5 mins and on demand | N/A | |
| 58 | + |
| 59 | +The possible operational states are: |
| 60 | + |
| 61 | +- **Active (0):** The component is enabled and active (up). |
| 62 | +- **Inactive (1):** The component is enabled but inactive (down). |
| 63 | +- **Disabled (2):** The component is administratively disabled. |
| 64 | + |
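
The numeric codes listed in the diff above for BGP Peer Status (1 to 6) and Component Operation Status (0 to 2) can be decoded on the consumer side. The following is a minimal sketch, not part of the article or of any Operator Nexus API: the class and function names are hypothetical, and it assumes the metric values arrive as plain integers. The rule for what "needs attention" is likewise an illustrative assumption.

```python
from enum import IntEnum


class BgpPeerState(IntEnum):
    """BGP connection states, numbered as in the article's list."""
    IDLE = 1
    CONNECT = 2
    ACTIVE = 3
    OPEN_SENT = 4
    OPEN_CONFIRM = 5
    ESTABLISHED = 6


class ComponentOperState(IntEnum):
    """Component operational states, numbered as in the article's list."""
    ACTIVE = 0
    INACTIVE = 1
    DISABLED = 2


def bgp_peer_needs_attention(value: int) -> bool:
    """Return True when a sampled BGP peer state is anything but Established.

    Assumption: any non-Established sample is worth a look. A real alert
    rule would probably also require the state to persist across samples.
    Raises ValueError for a code outside 1..6.
    """
    return BgpPeerState(value) is not BgpPeerState.ESTABLISHED


def component_needs_attention(value: int) -> bool:
    """Return True when a component is enabled but down (Inactive).

    Assumption: Disabled is treated as intentional, so it is not flagged.
    Raises ValueError for a code outside 0..2.
    """
    return ComponentOperState(value) is ComponentOperState.INACTIVE
```

For example, `BgpPeerState(1).name` evaluates to `'IDLE'`, and `component_needs_attention(2)` returns `False` because an administratively disabled component is not flagged in this sketch.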