Commit bc7816f

Merge pull request #39615 from mikemckiernan/feat-must-gath-tcpdump
OSDOCS-2171: run tcpdump from must gather
2 parents 2370e17 + 21f2851 commit bc7816f

3 files changed

+113
-0
lines changed

Lines changed: 73 additions & 0 deletions
@@ -0,0 +1,73 @@
+// Module included in the following assemblies:
+//
+// * support/gathering-cluster-data.adoc
+
+[id="support-collecting-host-network-trace_{context}"]
+= Collecting a host network trace
+
+Sometimes, troubleshooting a network-related issue is simplified by tracing network communication and capturing packets on multiple nodes at the same time.
+
+You can use a combination of the `oc adm must-gather` command and the `registry.redhat.io/openshift4/network-tools-rhel8` container image to gather packet captures from nodes.
+Analyzing packet captures can help you troubleshoot network communication issues.
+
+The `oc adm must-gather` command is used to run the `tcpdump` command in pods on specific nodes.
+The `tcpdump` command records the packet captures in the pods.
+When the `tcpdump` command exits, the `oc adm must-gather` command transfers the files with the packet captures from the pods to your client machine.
+
+[TIP]
+====
+The sample command in the following procedure demonstrates performing a packet capture with the `tcpdump` command.
+However, you can run any command in the container image that is specified in the `--image` argument to gather troubleshooting information from multiple nodes at the same time.
+====
+
+.Prerequisites
+
+* You have access to the cluster as a user with the `cluster-admin` role.
+
+* You have installed the OpenShift CLI (`oc`).
+
+.Procedure
+
+. Run a packet capture from the host network on some nodes by running the following command:
++
+[source,terminal]
+----
+$ oc adm must-gather \
+--dest-dir /tmp/captures \ <.>
+--source-dir '/tmp/tcpdump/' \ <.>
+--image registry.redhat.io/openshift4/network-tools-rhel8:latest \ <.>
+--node-selector 'node-role.kubernetes.io/worker' \ <.>
+--host-network=true \ <.>
+--timeout 30s \ <.>
+-- \
+tcpdump -i any \ <.>
+-w /tmp/tcpdump/%Y-%m-%dT%H:%M:%S.pcap -W 1 -G 300
+----
+<.> The `--dest-dir` argument specifies that `oc adm must-gather` stores the packet captures in directories that are relative to `/tmp/captures` on the client machine. You can specify any writable directory.
+<.> When `tcpdump` is run in the debug pod that `oc adm must-gather` starts, the `--source-dir` argument specifies that the packet captures are temporarily stored in the `/tmp/tcpdump` directory in the pod.
+<.> The `--image` argument specifies a container image that includes the `tcpdump` command.
+<.> The `--node-selector` argument and example value specify that the packet captures are performed on the worker nodes. Alternatively, you can specify the `--node-name` argument to run the packet capture on a single node. If you omit both the `--node-selector` and the `--node-name` arguments, the packet captures are performed on all nodes.
+<.> The `--host-network=true` argument is required so that the packet captures are performed on the network interfaces of the node.
+<.> The `--timeout` argument and value specify that the debug pod runs for 30 seconds. If you do not specify the `--timeout` argument and a duration, the debug pod runs for 10 minutes.
+<.> The `-i any` argument for the `tcpdump` command specifies that packets are captured on all network interfaces. Alternatively, you can specify a network interface name.
+
+. Perform the action, such as accessing a web application, that triggers the network communication issue while the network trace captures packets.
+
+. Review the packet capture files that `oc adm must-gather` transferred from the pods to your client machine:
++
+[source,text]
+----
+tmp/captures
+├── event-filter.html
+├── ip-10-0-192-217-ec2-internal  <1>
+│   └── registry-redhat-io-openshift4-network-tools-rhel8-sha256-bca...
+│       └── 2022-01-13T19:31:31.pcap
+├── ip-10-0-201-178-ec2-internal  <1>
+│   └── registry-redhat-io-openshift4-network-tools-rhel8-sha256-bca...
+│       └── 2022-01-13T19:31:30.pcap
+├── ip-...
+└── timestamp
+----
++
+<1> The packet captures are stored in directories that identify the hostname, container, and file name.
+If you did not specify the `--node-selector` argument, then the directory level for the hostname is not present.
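
Reviewer note: the review step in this module asks the reader to locate the transferred capture files by hand. A minimal Python sketch of that lookup follows. It assumes the directory layout shown in the sample tree, where the first directory level under the destination directory names the node. The function name `find_captures` is hypothetical and is not part of `oc` or of this commit.

```python
from pathlib import Path


def find_captures(dest_dir):
    """Group the .pcap files under dest_dir by their top-level directory name."""
    root = Path(dest_dir)
    captures = {}
    for pcap in sorted(root.rglob("*.pcap")):
        relative = pcap.relative_to(root)
        # Assumption: the first path component names the node, as in the
        # sample tree. A file that sits directly in dest_dir is keyed as ".".
        node = relative.parts[0] if len(relative.parts) > 1 else "."
        captures.setdefault(node, []).append(pcap)
    return captures
```

Calling `find_captures("/tmp/captures")` after the sample command completes would return a dictionary keyed by directory names such as `ip-10-0-192-217-ec2-internal`. When the hostname directory level is absent, as the module notes can happen, the keys would be the image directory names instead, so treat the grouping as a convenience rather than a reliable node identifier.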
Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
+// Module included in the following assemblies:
+//
+// * support/gathering-cluster-data.adoc
+
+[id="support-network-trace-methods_{context}"]
+= Network trace methods
+
+Collecting network traces, in the form of packet capture records, can assist Red Hat Support with troubleshooting network issues.
+
+{product-title} supports two ways of performing a network trace.
+Review the following table and choose the method that meets your needs.
+
+.Supported methods of collecting a network trace
+[cols="1,4a",options="header"]
+|===
+
+|Method
+|Benefits and capabilities
+
+|Collecting a host network trace
+|You perform a packet capture for a duration that you specify on one or more nodes at the same time.
+The packet capture files are transferred from nodes to the client machine when the specified duration elapses.
+
+You can troubleshoot why a specific action triggers network communication issues. Run the packet capture, perform the action that triggers the issue, and use the packet capture files to diagnose the issue.
+
+|Collecting a network trace from an {product-title} node or container
+|You perform a packet capture on one node or one container.
+You run the `tcpdump` command interactively, so you can control the duration of the packet capture.
+
+You can start the packet capture manually, trigger the network communication issue, and then stop the packet capture manually.
+
+This method uses the `cat` command and shell redirection to copy the packet capture data from the node or container to the client machine.
+
+|===
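
Reviewer note: the host network trace method in the table above takes a duration and a set of nodes as its inputs. The Python sketch below shows one way those inputs could map onto the arguments of the sample `oc adm must-gather` command from the host network trace module in this commit. Every flag and value is copied from that sample. The function name `build_capture_command` and its parameters are hypothetical, and the sketch only builds the argument list without running anything.

```python
def build_capture_command(dest_dir, timeout, node_name=None, node_selector=None):
    """Return the argument list for a host network capture with oc adm must-gather."""
    argv = [
        "oc", "adm", "must-gather",
        "--dest-dir", dest_dir,
        "--source-dir", "/tmp/tcpdump/",
        "--image", "registry.redhat.io/openshift4/network-tools-rhel8:latest",
        "--host-network=true",
        "--timeout", timeout,
    ]
    # A single node takes precedence over a selector in this sketch.
    # With neither argument, the module states that all nodes are captured.
    if node_name is not None:
        argv += ["--node-name", node_name]
    elif node_selector is not None:
        argv += ["--node-selector", node_selector]
    # tcpdump arguments copied unchanged from the sample procedure.
    argv += [
        "--", "tcpdump", "-i", "any",
        "-w", "/tmp/tcpdump/%Y-%m-%dT%H:%M:%S.pcap", "-W", "1", "-G", "300",
    ]
    return argv
```

For example, `build_capture_command("/tmp/captures", "30s", node_selector="node-role.kubernetes.io/worker")` would reproduce the arguments of the sample command, and the returned list could be passed to `subprocess.run` on a machine where `oc` is installed and logged in.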

support/gathering-cluster-data.adoc

Lines changed: 6 additions & 0 deletions
@@ -41,6 +41,12 @@ include::modules/querying-bootstrap-node-journal-logs.adoc[leveloffset=+1]
 // Querying cluster node journal logs
 include::modules/querying-cluster-node-journal-logs.adoc[leveloffset=+1]
 
+// Network trace methods
+include::modules/support-network-trace-methods.adoc[leveloffset=+1]
+
+// Collecting a host network trace
+include::modules/support-collecting-host-network-trace.adoc[leveloffset=+1]
+
 // Collecting a network trace from an {product-title} node or container
 include::modules/support-collecting-network-trace.adoc[leveloffset=+1]
 