Commit 504b7f3

amitslavin and guyarb authored
[USM] Enhance USM debugging: add netstat command (#44610)
### What does this PR do?

1. Enhanced usm sysinfo command - Now displays detected service names alongside process information
2. New usm netstat command - Shows network connections similar to netstat -antpu with process information

usm sysinfo Enhancements
- Added "Service" column showing detected service names
- Detects DD_SERVICE environment variable from Docker containers and Kubernetes pods
- Falls back to generated service names using the same logic as process-agent
- Added --max-service-length flag (default: 20) to control service name display width
- Refactored to use service discovery infrastructure (envs.Variables, kernel.HostProc())
- Removed all magic numbers and strings, replaced with well-named constants

New usm netstat Command
- Displays active network connections (TCP/UDP, IPv4/IPv6)
- Maps connections to processes showing PID and process name
- Supports filtering: --tcp/-t, --udp/-u, --listening/-l flags
- Shows connection state (ESTABLISHED, LISTEN, TIME_WAIT, etc.)
- Optimized inode-to-PID mapping for performance
- Uses kernel.HostProc() for proper container/namespace support

### Motivation

When debugging USM issues, these commands provide essential diagnostic information:

For usm sysinfo:
- Verify service discovery is working correctly for target applications
- Confirm expected services are running and properly identified
- Validate service names match what's expected in USM monitoring
- Debug DD_SERVICE environment variable propagation in containers

For usm netstat:
- Identify which processes own specific network connections
- Debug USM connectivity issues
- Verify which processes are listening on ports
- Check established connections for monitored services
- Troubleshoot port conflicts

### Describe how you validated your changes

Manual Testing:
- Verified DD_SERVICE detection from Docker containers (-e DD_SERVICE=mysrv)
- Tested netstat with various flags (--tcp, --udp, --listening)
- Confirmed TCP state display (TIME_WAIT, ESTABLISHED, etc.)
- Validated process name and PID mapping for network connections

Testing:
- Added unit test for --max-service-length flag (default: 20)

### Additional Notes

Co-authored-by: guyarb <guy20495@gmail.com>
Co-authored-by: guy.arbitman <guy.arbitman@datadoghq.com>
1 parent 876e309 commit 504b7f3

9 files changed: +425 -19 lines changed

cmd/system-probe/subcommands/usm/README.md

Lines changed: 71 additions & 10 deletions
@@ -34,25 +34,32 @@ service_monitoring_config:
 
 ### `usm sysinfo`
 
-Shows system information relevant to USM debugging.
+Shows system information relevant to USM debugging, including detected services and programming languages.
 
 **Usage:**
 ```bash
 sudo ./system-probe usm sysinfo
 sudo ./system-probe usm sysinfo --max-cmdline-length 100 # Extended command line display
 sudo ./system-probe usm sysinfo --max-name-length 50 # Extended process name display
-sudo ./system-probe usm sysinfo --max-cmdline-length 0 --max-name-length 0 # Unlimited
+sudo ./system-probe usm sysinfo --max-service-length 30 # Extended service name display
+sudo ./system-probe usm sysinfo --max-cmdline-length 0 --max-name-length 0 --max-service-length 0 # Unlimited
 ```
 
 **Options:**
 - `--max-cmdline-length` - Maximum command line length to display (default: 50, 0 for unlimited)
 - `--max-name-length` - Maximum process name length to display (default: 25, 0 for unlimited)
+- `--max-service-length` - Maximum service name length to display (default: 20, 0 for unlimited)
 
 **Output:**
 - Kernel version
 - OS type and architecture
 - Hostname
-- List of all running processes with PIDs, PPIDs, names, and command lines
+- List of all running processes with:
+  - PIDs and PPIDs
+  - Process names
+  - Detected service names (using the same logic as the process-agent)
+  - Detected programming language and version
+  - Command lines
 
 **Output Example:**
 ```
@@ -65,14 +72,54 @@ Hostname: agent-dev-ubuntu-22
 
 Running Processes: 127
 
-PID     | PPID    | Name                      | Command
---------|---------|---------------------------|--------------------------------------------------
-1       | 0       | systemd                   | /sbin/init
-156     | 1       | sshd                      | /usr/sbin/sshd -D
+PID     | PPID    | Name                      | Service              | Language     | Command
+--------|---------|---------------------------|----------------------|--------------|--------------------------------------------------
+1       | 0       | systemd                   | systemd              | -            | /sbin/init autoinstall
+774     | 1       | containerd                | containerd           | go/go1.23.7  | /usr/bin/containerd
+1046    | 1       | dockerd                   | dockerd              | go/go1.23.8  | /usr/bin/dockerd -H fd:// --containerd=/run/con...
 ...
 ```
 
 
+### `usm netstat`
+
+Shows network connections similar to `netstat -antpu`. Displays TCP and UDP connections with process information.
+
+**Usage:**
+```bash
+sudo ./system-probe usm netstat # Show all TCP and UDP connections
+sudo ./system-probe usm netstat --tcp=false # Show only UDP connections
+sudo ./system-probe usm netstat --udp=false # Show only TCP connections
+```
+
+**Options:**
+- `--tcp` / `-t` - Show TCP connections (default: true)
+- `--udp` / `-u` - Show UDP connections (default: true)
+
+**Output:**
+- Protocol (tcp, tcp6, udp, udp6)
+- Local address and port
+- Remote address and port
+- Connection state (ESTABLISHED, LISTEN, etc. for TCP)
+- PID and process name
+
+**Output Example:**
+```
+Proto | Local Address           | Foreign Address         | State       | PID/Program
+------|-------------------------|-------------------------|-------------|------------------
+tcp   | 0.0.0.0:22              | 0.0.0.0:0               | LISTEN      | 1234/sshd
+tcp   | 127.0.0.1:8080          | 127.0.0.1:45678         | ESTABLISHED | 5678/python
+tcp6  | :::80                   | :::0                    | LISTEN      | 9012/nginx
+udp   | 0.0.0.0:53              | 0.0.0.0:0               |             | 3456/systemd-resolved
+```
+
+**Use Cases:**
+- Debug USM connectivity issues
+- Verify which processes are listening on ports
+- Check established connections for monitored services
+- Identify which processes own specific network connections
+- Troubleshoot port conflicts
+
 ### `usm symbols ls`
 
 Lists symbols from ELF binaries, similar to the Unix `nm` utility. Useful for analyzing symbol visibility, library versions, and linkage in monitored applications.
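
The `--tcp` and `--udp` flags documented for `usm netstat` act as protocol filters, and the README states that connections are sorted by protocol and local port. A minimal sketch of that filter-then-sort step follows. The `Conn` type, the helper names, and the lexicographic protocol ordering are assumptions made for illustration, not code from this commit.

```go
package main

import (
	"fmt"
	"sort"
)

// Conn is a hypothetical, simplified view of one row in the netstat table.
type Conn struct {
	Proto     string // "tcp", "tcp6", "udp" or "udp6"
	LocalPort uint16
	State     string // empty for UDP
}

// isTCP reports whether the protocol label belongs to the TCP family.
func isTCP(proto string) bool { return proto == "tcp" || proto == "tcp6" }

// isUDP reports whether the protocol label belongs to the UDP family.
func isUDP(proto string) bool { return proto == "udp" || proto == "udp6" }

// filterAndSort keeps the protocol families enabled by the two flags and
// orders the result by protocol name, then by local port.
// Assumption: the real command may order protocols differently.
func filterAndSort(conns []Conn, showTCP, showUDP bool) []Conn {
	out := make([]Conn, 0, len(conns))
	for _, c := range conns {
		if (isTCP(c.Proto) && showTCP) || (isUDP(c.Proto) && showUDP) {
			out = append(out, c)
		}
	}
	sort.SliceStable(out, func(i, j int) bool {
		if out[i].Proto != out[j].Proto {
			return out[i].Proto < out[j].Proto
		}
		return out[i].LocalPort < out[j].LocalPort
	})
	return out
}

func main() {
	conns := []Conn{
		{"udp", 53, ""},
		{"tcp", 8080, "ESTABLISHED"},
		{"tcp6", 80, "LISTEN"},
		{"tcp", 22, "LISTEN"},
	}
	// Similar in spirit to `usm netstat --udp=false`.
	for _, c := range filterAndSort(conns, true, false) {
		fmt.Printf("%-5s | %5d | %s\n", c.Proto, c.LocalPort, c.State)
	}
}
```

Passing `false` for both flags yields an empty result in this sketch. Whether the real command rejects that combination is not stated in the source.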
@@ -140,6 +187,8 @@ This provides complete context about the USM configuration and system environmen
 Use `usm sysinfo` to see what processes are running that USM might be monitoring, helping to:
 - Verify target applications are running
 - Check if applications are running with expected command line arguments
+- Identify detected service names for processes (e.g., "nginx", "postgres", "node")
+- See which programming languages are detected and their versions
 - Identify processes by PID for further investigation
 
 ### Inspecting eBPF Maps
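
The commit message says the service name comes from the `DD_SERVICE` environment variable when present, with a generated name as fallback, and that the column is cut to `--max-service-length` (0 meaning unlimited). The sketch below illustrates that flow under simplifying assumptions: it reads a NUL-separated environment string (the layout of `/proc/<pid>/environ` on Linux), falls back to the plain process name rather than the process-agent's generation logic, and truncates with a trailing `...` as the sample output suggests. All function names here are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// serviceFromEnviron scans NUL-separated KEY=VALUE pairs and returns the
// value of DD_SERVICE if it is set to a non-empty string.
func serviceFromEnviron(environ string) (string, bool) {
	const key = "DD_SERVICE="
	for _, kv := range strings.Split(environ, "\x00") {
		if strings.HasPrefix(kv, key) && len(kv) > len(key) {
			return kv[len(key):], true
		}
	}
	return "", false
}

// truncate shortens s to at most limit bytes, ending in "..." when cut.
// A limit of 0 means unlimited. Slicing is byte-based, so this sketch
// assumes ASCII names.
func truncate(s string, limit int) string {
	if limit <= 0 || len(s) <= limit {
		return s
	}
	if limit <= 3 {
		return s[:limit]
	}
	return s[:limit-3] + "..."
}

// displayService prefers DD_SERVICE and falls back to the process name.
// Assumption: the real fallback uses the process-agent naming logic.
func displayService(environ, procName string, maxLen int) string {
	name, ok := serviceFromEnviron(environ)
	if !ok {
		name = procName
	}
	return truncate(name, maxLen)
}

func main() {
	env := "PATH=/usr/bin\x00DD_SERVICE=mysrv\x00HOME=/root\x00"
	fmt.Println(displayService(env, "python3", 20))
	fmt.Println(displayService("PATH=/usr/bin\x00", "a-very-long-generated-service-name", 20))
}
```

In a container test such as `-e DD_SERVICE=mysrv`, the first call models the case where the variable is found. The second models the fallback path with truncation applied.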
@@ -164,10 +213,22 @@ See the [eBPF subcommands README](../ebpf/README.md) for full documentation on e
 ### Sysinfo Command
 - Collects process information using `procutil.NewProcessProbe()` (same as process-agent)
 - Uses `kernel.Release()` for kernel version detection
+- Detects service names using `parser.NewServiceExtractor()` (same logic as process-agent service discovery)
 - Processes are sorted by PID
-- Output truncates long process names (default 25 chars) and command lines (default 50 chars) for readability
-- Both truncation limits are configurable via flags
-- Use `--max-cmdline-length 0` and `--max-name-length 0` for unlimited display
+- Output truncates long process names (default 25 chars), service names (default 20 chars), and command lines (default 50 chars) for readability
+- All truncation limits are configurable via flags
+- Use `--max-cmdline-length 0`, `--max-name-length 0`, and `--max-service-length 0` for unlimited display
+
+### Netstat Command
+- Uses `procnet.GetTCPConnections()` for robust TCP connection parsing with PID/FD mapping
+- Reads UDP connections from `/proc/net/udp` and `/proc/net/udp6` (manual parsing)
+- Maps socket inodes to processes by reading `/proc/*/fd/*` symlinks for UDP
+- Parses hexadecimal IP addresses and ports to human-readable format
+- Shows TCP connection states (ESTABLISHED, LISTEN, TIME_WAIT, etc.)
+- Filters connections based on protocol flags (`--tcp`, `--udp`)
+- Connections sorted by protocol and local port
+- Use standard Unix tools like `grep` for additional filtering (e.g., `| grep LISTEN`)
+- Linux with eBPF support only (requires `linux_bpf` build tag)
 
 ### Symbols Ls Command
 - Parses ELF binaries using `pkg/util/safeelf` package for safe symbol table reading
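
Two of the Netstat Command notes are easier to follow with a concrete case: decoding the hexadecimal `ADDR:PORT` fields of `/proc/net/udp`, and recognizing socket inodes in `/proc/*/fd/*` symlink targets. The sketch below covers IPv4 only and assumes a little-endian host, where the kernel prints `127.0.0.1` as `0100007F`. Both function names are hypothetical and are not taken from the commit.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"net"
	"strconv"
	"strings"
)

// parseHexIPv4Endpoint decodes a field such as "0100007F:1F90".
// The address is a 32-bit value printed in host byte order (assumed
// little-endian here); the port is plain hexadecimal.
func parseHexIPv4Endpoint(field string) (string, error) {
	parts := strings.SplitN(field, ":", 2)
	if len(parts) != 2 || len(parts[0]) != 8 {
		return "", fmt.Errorf("unexpected endpoint %q", field)
	}
	raw, err := hex.DecodeString(parts[0])
	if err != nil {
		return "", err
	}
	port, err := strconv.ParseUint(parts[1], 16, 16)
	if err != nil {
		return "", err
	}
	// Reverse the bytes to obtain network order for net.IP.
	ip := net.IPv4(raw[3], raw[2], raw[1], raw[0])
	return net.JoinHostPort(ip.String(), strconv.FormatUint(port, 10)), nil
}

// parseSocketInode extracts the inode from an fd symlink target of the
// form "socket:[12345]". Other targets (files, pipes) return false.
func parseSocketInode(target string) (uint64, bool) {
	const prefix, suffix = "socket:[", "]"
	if !strings.HasPrefix(target, prefix) || !strings.HasSuffix(target, suffix) {
		return 0, false
	}
	inode, err := strconv.ParseUint(target[len(prefix):len(target)-len(suffix)], 10, 64)
	return inode, err == nil
}

func main() {
	endpoint, err := parseHexIPv4Endpoint("0100007F:1F90")
	fmt.Println(endpoint, err)

	inode, ok := parseSocketInode("socket:[12345]")
	fmt.Println(inode, ok)
}
```

A full mapper would walk every `/proc/<pid>/fd` directory once, call something like `parseSocketInode` on each link target, and store `inode -> pid` in a map, so each UDP row is resolved with a single lookup. Whether the commit's "optimized" mapping works this way is not stated in the source.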

cmd/system-probe/subcommands/usm/command.go

Lines changed: 5 additions & 0 deletions
@@ -27,6 +27,11 @@ func Commands(globalParams *command.GlobalParams) []*cobra.Command {
 		usmCmd.AddCommand(sysinfoCmd)
 	}
 
+	// Add netstat command if available on this platform
+	if netstatCmd := makeNetstatCommand(globalParams); netstatCmd != nil {
+		usmCmd.AddCommand(netstatCmd)
+	}
+
 	// Add check-maps command if available on this platform
 	if checkMapsCmd := makeCheckMapsCommand(globalParams); checkMapsCmd != nil {
 		usmCmd.AddCommand(checkMapsCmd)
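
The hunk above registers a subcommand only when its constructor returns a non-nil value. The following self-contained sketch shows that pattern without the cobra dependency. The `Command` type is a hypothetical stand-in for `*cobra.Command`, and the `supported` parameter replaces whatever platform check the real `makeNetstatCommand` performs; the README mentions a `linux_bpf` build tag, but how the nil case is produced is an assumption here.

```go
package main

import "fmt"

// Command is a hypothetical stand-in for *cobra.Command.
type Command struct {
	Name string
	subs []*Command
}

// AddCommand appends a subcommand, mirroring the cobra method name.
func (c *Command) AddCommand(sub *Command) { c.subs = append(c.subs, sub) }

// makeNetstat returns nil when the feature is unavailable, imitating a
// constructor whose unsupported-platform variant is a stub.
func makeNetstat(supported bool) *Command {
	if !supported {
		return nil
	}
	return &Command{Name: "netstat"}
}

// buildUSM wires the parent command and skips constructors that return nil.
func buildUSM(netstatSupported bool) *Command {
	usm := &Command{Name: "usm"}
	if cmd := makeNetstat(netstatSupported); cmd != nil {
		usm.AddCommand(cmd)
	}
	return usm
}

func main() {
	fmt.Println(len(buildUSM(true).subs), len(buildUSM(false).subs))
}
```

Keeping the nil check at the call site means the parent command needs no build tags of its own; only the constructors vary by platform.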
