Commit d8597ac

alexandra5000naemono
authored and committed
Add troubleshooting guide for OTel connectivity issues (elastic#3030)
This PR adds a new troubleshooting page covering connectivity issues between OpenTelemetry SDKs, the Collector, and Elastic endpoints. The page includes:

* Common error symptoms
* Differentiation of root causes (firewall, endpoint errors, proxy misconfiguration, SDK vs Collector failures)
* Step-by-step resolution checks (DNS, reachability, port testing, TLS validation)
* Links to related existing pages (proxy settings, debug logging)

This content is intended to help users diagnose networking problems even when proxy settings appear correct but ports or firewalls block traffic.

Resolves elastic#2351
1 parent b25c4ef commit d8597ac

2 files changed

+160
-0
lines changed
Lines changed: 159 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,159 @@
1+
---
2+
navigation_title: Connectivity issues
3+
description: Troubleshoot connectivity issues between EDOT SDKs, the EDOT Collector, and Elastic.
4+
applies_to:
5+
serverless: all
6+
product:
7+
edot_collector: ga
8+
products:
9+
- id: observability
10+
- id: edot-collector
11+
- id: edot-sdk
12+
---
13+
14+
# Connectivity issues with EDOT
15+
16+
Connectivity problems occur when the EDOT SDKs or the EDOT Collector can't reach Elastic. Even with correct proxy settings, network restrictions such as blocked ports or firewalls can prevent data from flowing.
17+
18+
19+
## Symptoms
20+
21+
You might see one or more of the following error messages:
22+
23+
- `connection refused`
24+
- `network unreachable`
25+
- `i/o timeout`
26+
- `tls: handshake failure`
27+
28+
These errors might appear either in application logs (from the SDK) or in the Collector logs.
29+
30+
Example (Collector):
31+
32+
```text
33+
2024-09-15T12:44:30Z error exporterhelper/queued_retry.go:149 Exporting failed. Rejecting data. Error: context deadline exceeded
34+
```
35+
36+
Example (Python SDK):
37+
38+
```text
39+
opentelemetry.sdk ERROR OTLPSpanExporter - Failed to export spans: [Errno 111] Connection refused
40+
```
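
To check whether a log file contains any of the messages above, a small shell helper can scan for them. This is a minimal sketch: the function name `find_connectivity_errors` and the log path in the usage comment are hypothetical, and the pattern covers only the messages shown in this section.

```bash
# Hypothetical helper: print the lines of a log file that contain one of
# the connectivity error messages listed above, prefixed with line numbers.
find_connectivity_errors() {
  local logfile="$1"
  # -i ignores case, -n adds line numbers, -E enables alternation with |.
  grep -n -i -E \
    'connection refused|network unreachable|i/o timeout|tls: handshake failure|context deadline exceeded' \
    "$logfile"
}

# Usage (the path is an assumption; point it at your own log file):
#   find_connectivity_errors /var/log/otelcol/collector.log
```

The function returns a non-zero status when nothing matches, so it can also be used in an `if` condition.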
41+
42+
## Causes
43+
44+
Connectivity errors usually trace back to one of the following issues:
45+
46+
- **Firewall or port blocking**
47+
48+
Outbound traffic may be blocked by corporate firewalls or network policies.
49+
50+
Check that the required protocol and port combination is allowed:
51+
52+
- OTLP/HTTP: TCP 4318
53+
- OTLP/gRPC: TCP 4317
54+
- {{es}} (over HTTPS): 443
55+
- {{es}}: 9200
56+
57+
Also confirm whether your environment uses IPv4 or IPv6, as routing and firewall rules may differ.
58+
59+
60+
- **Endpoint errors**
61+
62+
The endpoint is unreachable or not listening on the specified port:
63+
64+
- `connection refused`: endpoint not listening
65+
- `network unreachable`: VPN, routing, or DNS failure
66+
- `timeout`: traffic dropped by firewall, proxy, or load balancer
67+
68+
- **Proxy misconfiguration**
69+
70+
Proxy environment variables (`HTTP_PROXY`, `HTTPS_PROXY`) might be set correctly, but the proxy itself might lack access to Elastic or restrict the required ports. Refer to [Proxy settings](opentelemetry://reference/edot-collector/config/proxy.md) for more information.
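
The mapping from error message to likely cause under **Endpoint errors** can be sketched as a shell function. The name `likely_cause` is hypothetical, and the function is only as precise as the list it mirrors: treat its output as a starting point, not a diagnosis.

```bash
# Hypothetical helper: map an error message to the likely cause
# described in the "Endpoint errors" list above.
likely_cause() {
  local msg
  # Lowercase the message so that matching is case-insensitive.
  msg=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$msg" in
    *"connection refused"*)  echo "endpoint not listening on that port" ;;
    *"network unreachable"*) echo "VPN, routing, or DNS failure" ;;
    *"timeout"*)             echo "traffic dropped by firewall, proxy, or load balancer" ;;
    *)                       echo "unknown: enable debug logging for more detail" ;;
  esac
}

# Usage:
#   likely_cause "Failed to export spans: [Errno 111] Connection refused"
```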
71+
72+
73+
### Differences between SDK and Collector issues
74+
75+
Errors can look similar whether they come from an SDK or the Collector. Identifying the source helps you isolate the problem.
76+
77+
:::{note}
78+
Some SDKs support setting a proxy directly (for example, using `HTTPS_PROXY`). Refer to [Proxy settings for EDOT SDKs](../opentelemetry/edot-sdks/proxy.md) for details.
79+
:::
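
Before you investigate the proxy itself, it can help to confirm which proxy variables the process environment actually contains. The following is a minimal sketch: `show_proxy_env` is a hypothetical name, and checking `NO_PROXY` and the lowercase spellings is an assumption based on common tool conventions, not something this page requires.

```bash
# Hypothetical helper: report the proxy-related variables that are
# exported in the current environment. printenv only sees exported
# variables, which are the ones a child process such as the Collector inherits.
show_proxy_env() {
  local name value
  for name in HTTP_PROXY HTTPS_PROXY NO_PROXY http_proxy https_proxy no_proxy; do
    if value=$(printenv "$name"); then
      printf '%s=%s\n' "$name" "$value"
    else
      printf '%s is not set\n' "$name"
    fi
  done
}

# Usage: run it in the same shell or service environment that starts
# the SDK-instrumented application or the Collector.
#   show_proxy_env
```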
80+
81+
#### SDK
82+
83+
Application logs report failures when the SDK cannot send data to the Collector or directly to Elastic. These often appear as `connection refused` or `timeout` messages. If you see them, verify that the endpoint the SDK sends to (the Collector or Elastic) is reachable.
84+
85+
For guidance on enabling logs in your SDK, see [Enable SDK debug logging](../opentelemetry/edot-sdks/enable-debug-logging.md).
86+
87+
Example (Java SDK):
88+
89+
```text
90+
io.opentelemetry.exporter.otlp.internal.grpc.OkHttpGrpcExporter - Failed to export spans. Error: UNAVAILABLE: io exception
91+
```
92+
93+
#### The Collector
94+
95+
Collector logs show export failures when it cannot forward data to Elastic. Look for messages like `cannot send spans` or `failed to connect to <endpoint>`. If you see them, check the Collector's exporter configuration and confirm that the Collector host has network access to the endpoint.
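
To see which endpoints a Collector configuration refers to, you can list the relevant lines from the file. This sketch assumes the configuration uses `endpoint:` or `endpoints:` keys; the function name `list_endpoints` and the file path in the usage comment are hypothetical.

```bash
# Hypothetical helper: list every `endpoint:` or `endpoints:` key in a
# Collector configuration file, with line numbers. Receivers and exporters
# can both use these keys, so read each match in the context of the file.
list_endpoints() {
  grep -n -E '^[[:space:]]*endpoints?:' "$1"
}

# Usage (the path is an assumption; use your own configuration file):
#   list_endpoints ./otel-collector-config.yaml
```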
96+
97+
98+
## Resolution
99+
100+
Before you dig into SDK or Collector configuration, confirm that your environment can reach the Elastic endpoint.
101+
102+
:::{note}
103+
The examples below use command syntax from Linux and macOS. On Windows or when testing IPv6, the equivalent tooling or syntax may differ (for example, `Test-NetConnection` in PowerShell).
104+
:::
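
The checks in this section take a hostname and a port, while SDK and Collector settings usually hold a full endpoint URL. The following sketch splits a URL into the two parts. The function name `endpoint_host_port` is hypothetical, the fallback ports (443 for `https`, 80 for anything else) are an assumption, and IPv6 literals such as `[::1]:4318` are not handled.

```bash
# Hypothetical helper: print "host port" for an endpoint URL.
# Without an explicit port, fall back to 443 for https and 80 otherwise.
endpoint_host_port() {
  local url scheme rest hostport host port
  url="$1"
  scheme="${url%%://*}"     # text before "://"
  rest="${url#*://}"        # drop the scheme
  hostport="${rest%%/*}"    # drop any path
  case "$hostport" in
    *:*)
      host="${hostport%%:*}"
      port="${hostport##*:}"
      ;;
    *)
      host="$hostport"
      case "$scheme" in
        https) port=443 ;;
        *)     port=80 ;;
      esac
      ;;
  esac
  printf '%s %s\n' "$host" "$port"
}

# Usage:
#   endpoint_host_port "https://<your-endpoint>/v1/traces"
```

You can then pass the resulting host and port to the commands in the steps that follow.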
105+
106+
:::::{stepper}
107+
108+
::::{step} Verify DNS resolution
109+
110+
Make sure the hostname for your Elastic endpoint resolves correctly:
111+
112+
```bash
113+
nslookup <your-endpoint>
114+
```
115+
116+
::::
117+
118+
::::{step} Test network reachability
119+
120+
```bash
121+
ping <your-endpoint>
122+
```

Some networks block ICMP, so a failed `ping` on its own doesn't prove that the endpoint is unreachable. Continue with the port check either way.
123+
124+
::::
125+
126+
::::{step} Check open ports
127+
128+
Test whether the required port is open. The example uses `443`, the default for HTTPS. To test a Collector's OTLP ports, use `4317` (gRPC) or `4318` (HTTP) instead:
129+
130+
```bash
131+
nc -vz <your-endpoint> 443
132+
```
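
To test several ports in one pass, you can wrap the same `nc` call in a loop. This is a sketch under assumptions: `check_ports` is a hypothetical name, the 5-second timeout is arbitrary, and timeout handling differs between `nc` variants.

```bash
# Hypothetical helper: try to open a TCP connection to each port on a
# host and report the result.
# -z only checks that a connection can be opened; -w sets a timeout in seconds.
check_ports() {
  local host port
  host="$1"
  shift
  for port in "$@"; do
    if nc -z -w 5 "$host" "$port" >/dev/null 2>&1; then
      echo "$port open"
    else
      echo "$port blocked or closed"
    fi
  done
}

# Usage:
#   check_ports <your-endpoint> 443 4317 4318
```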
133+
134+
::::
135+
136+
::::{step} Verify TLS/SSL
137+
138+
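Check that TLS certificates can be validated. In the output, a `Verify return code: 0 (ok)` line indicates that the certificate chain was validated: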
139+
140+
```bash
141+
openssl s_client -connect <your-endpoint>:443
142+
```
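
The `openssl s_client` command stays connected and prints a lot of output. The following sketch runs it non-interactively and keeps only the verification result. The function name `tls_verify_result` is hypothetical, and passing `-servername` explicitly is an assumption that your endpoint relies on SNI.

```bash
# Hypothetical helper: run openssl s_client non-interactively and print
# only the certificate verification result.
tls_verify_result() {
  local endpoint port
  endpoint="$1"
  port="${2:-443}"
  # Redirecting stdin from /dev/null makes s_client exit after the handshake.
  openssl s_client -connect "$endpoint:$port" -servername "$endpoint" \
    </dev/null 2>/dev/null | grep -i 'Verify return code'
}

# Usage:
#   tls_verify_result <your-endpoint>
#   tls_verify_result <your-endpoint> 4317
```

Any return code other than `0 (ok)` points to a certificate problem, such as an untrusted CA or an expired certificate. The line can appear more than once in the output.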
143+
144+
::::
145+
146+
:::::
147+
148+
If any of these steps fail, the issue is likely caused by network infrastructure rather than your SDK or Collector configuration.
149+
150+
151+
### Next steps
152+
153+
If basic checks and configuration look correct but issues persist, collect more details before escalating:
154+
155+
* Review proxy settings. For more information, refer to [Proxy settings](opentelemetry://reference/edot-collector/config/proxy.md).
156+
157+
* If ports are confirmed open but errors persist, [enable debug logging in the SDK](../opentelemetry/edot-sdks/enable-debug-logging.md) or [in the Collector](../opentelemetry/edot-collector/enable-debug-logging.md) for more detail.
158+
159+
* Contact your network administrator with test results if you suspect firewall restrictions.

troubleshoot/ingest/opentelemetry/toc.yml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -26,4 +26,5 @@ toc:
2626
- file: edot-sdks/proxy.md
2727
- file: edot-sdks/misconfigured-sampling-sdk.md
2828
- file: no-data-in-kibana.md
29+
- file: connectivity.md
2930
- file: contact-support.md