Commit 888f7c6
Add troubleshooting guide for 429 errors when using the Elastic Cloud Managed OTLP Endpoint (#3669)
This PR adds a new troubleshooting topic that explains how to diagnose and resolve HTTP 429 Too Many Requests errors when sending data to the mOTLP endpoint in both Elastic Cloud Serverless and Elastic Cloud Hosted environments. It also updates the mOTLP quickstart page, replacing the inline "Error: too many requests" section with a link to the new troubleshooting guide. Closes [#6054](elastic/ingest-dev#6054)
1 parent e54e73d commit 888f7c6

File tree

3 files changed (+128 −1 lines)


solutions/observability/get-started/quickstart-elastic-cloud-otel-endpoint.md

Lines changed: 4 additions & 1 deletion
@@ -162,7 +162,10 @@ You must format your API key as `"Authorization": "ApiKey <api-key-value-here>"`
 ### Error: too many requests

-The Managed OTLP endpoint has per-project rate limits in place. If you reach this limit, reach out to our [support team](https://support.elastic.co). Refer to [Rate limiting](opentelemetry://reference/motlp.md#rate-limiting) for more information.
+If you see HTTP `429 Too Many Requests` errors when sending data through the Elastic Cloud Managed OTLP (mOTLP) endpoint, your project might be hitting ingest rate limits.
+
+Refer to the dedicated [429 errors when using the Elastic Cloud Managed OTLP Endpoint](/troubleshoot/ingest/opentelemetry/429-errors-motlp.md) troubleshooting guide for details on causes, rate limits, and solutions.

 ## Provide feedback

troubleshoot/ingest/opentelemetry/429-errors-motlp.md

Lines changed: 123 additions & 0 deletions
@@ -0,0 +1,123 @@
---
navigation_title: 429 errors when using the mOTLP endpoint
description: Resolve HTTP 429 `Too Many Requests` errors when sending data through the Elastic Cloud Managed OTLP (mOTLP) endpoint in Elastic Cloud Serverless or Elastic Cloud Hosted (ECH).
applies_to:
  stack:
  serverless:
    observability:
  product:
    edot_collector:
products:
  - id: cloud-serverless
  - id: cloud-hosted
  - id: observability
  - id: edot-collector
---
# 429 errors when using the Elastic Cloud Managed OTLP Endpoint

When sending telemetry data through the {{motlp}} (mOTLP), you might encounter HTTP `429 Too Many Requests` errors. These errors indicate that your ingest rate has temporarily exceeded the rate or burst limits configured for your {{ecloud}} project.

This issue can occur in both {{serverless-full}} and {{ech}} (ECH) environments.
## Symptoms

You might see log messages similar to the following in your EDOT Collector output or SDK logs:

```json
{
  "code": 8,
  "message": "error exporting items, request to <ingest endpoint> responded with HTTP Status Code 429"
}
```

In some cases, you might also see warnings or backpressure metrics increase in your Collector's internal telemetry (for example, queue length or failed send count).
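One way to catch throttling early is to expose the Collector's internal telemetry and watch the exporter queue and failure metrics. A minimal sketch, assuming a recent Collector build that supports the `service::telemetry::metrics::readers` syntax; the host and port values are illustrative, and older Collector versions use a single `address` setting instead:

```yaml
# Expose internal Collector metrics in Prometheus format on port 8888,
# so queue length and failed-send counters can be scraped and alerted on.
service:
  telemetry:
    metrics:
      level: detailed
      readers:
        - pull:
            exporter:
              prometheus:
                host: 0.0.0.0
                port: 8888
```

You can then scrape `http://localhost:8888/metrics` and alert when failed-send counters grow at the same time 429 responses appear in the logs.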
## Causes

A 429 status means that the rate of requests sent to the Managed OTLP endpoint has exceeded the allowed thresholds. This can happen for several reasons:

* Your telemetry pipeline is sending data faster than the allowed ingest rate.
* Bursts of telemetry data exceed the short-term burst limit, even if your sustained rate is within limits.
* In {{ech}}, the {{es}} capacity for your deployment might be underscaled for the current ingest rate.
* In {{serverless-full}}, rate limiting should not result from {{es}} capacity, because the platform automatically scales ingest capacity. If you suspect a scaling issue, [contact Elastic Support](contact-support.md).
* Multiple Collectors or SDKs are sending data concurrently without load balancing or backoff mechanisms.

The specific limits depend on your environment:

| Deployment type | Rate limit | Burst limit |
|-----------------|------------|-------------|
| Serverless | 15 MB/s | 30 MB/s |
| ECH | Depends on deployment size and available {{es}} capacity | Depends on deployment size and available {{es}} capacity |

Exact limits also depend on your subscription tier. Refer to the [Rate limiting section](opentelemetry://reference/motlp.md#rate-limiting) in the mOTLP reference documentation for details.
## Resolution

To resolve 429 errors, first identify whether the bottleneck is caused by ingest rate limits or by {{es}} capacity.
### Scale your deployment or request higher limits

If you've confirmed that your ingest configuration is stable but you still encounter 429 errors:

* {{serverless-full}}: [Contact Elastic Support](contact-support.md) to request an increase in ingest limits.
* {{ech}} (ECH): Increase your {{es}} capacity by scaling or resizing your deployment:
  * [Scaling considerations](../../../deploy-manage/production-guidance/scaling-considerations.md)
  * [Resize deployment](../../../deploy-manage/deploy/cloud-enterprise/resize-deployment.md)
  * [Autoscaling in ECE and ECH](../../../deploy-manage/autoscaling/autoscaling-in-ece-and-ech.md)

After scaling, monitor your ingest metrics to verify that the rate of accepted requests increases and 429 responses stop appearing.
### Reduce ingest rate or enable backpressure

Lower the telemetry export rate by enabling batching and retry mechanisms in your EDOT Collector or SDK configuration. For example:

```yaml
processors:
  batch:
    send_batch_size: 1000
    timeout: 5s

exporters:
  otlp:
    retry_on_failure:
      enabled: true
      initial_interval: 1s
      max_interval: 30s
      max_elapsed_time: 300s
```

These settings help smooth out spikes and automatically retry failed exports after rate-limit responses.
### Enable retry logic and queueing

To minimize data loss during temporary throttling, configure your exporter to use a sending queue and retry logic. For example:

```yaml
exporters:
  otlp:
    sending_queue:
      enabled: true
      num_consumers: 10
      queue_size: 1000
    retry_on_failure:
      enabled: true
```

With this configuration, the Collector buffers data locally while waiting for the ingest endpoint to recover from throttling.
## Best practices

To prevent 429 errors and maintain reliable telemetry data flow, follow these best practices:

* Monitor internal Collector metrics (such as `otelcol_exporter_send_failed` and `otelcol_exporter_queue_capacity`) to detect backpressure early.
* Distribute telemetry load evenly across multiple Collectors instead of sending all data through a single instance.
* When possible, enable batching and compression to reduce payload size.
* Keep retry and backoff intervals conservative to avoid overwhelming the endpoint after a temporary throttle.
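The batching and compression practices can be combined in the exporter configuration. A minimal sketch, assuming an OTLP/gRPC exporter; the endpoint value is a placeholder, not a real URL:

```yaml
processors:
  # Group telemetry into larger, less frequent requests
  batch:
    send_batch_size: 1000
    timeout: 5s

exporters:
  otlp:
    endpoint: https://<your-motlp-endpoint>:443
    # gzip-compress payloads to reduce bytes sent per request
    compression: gzip
```

Fewer, smaller requests lower both the request rate and the ingest byte rate, which are the two quantities the endpoint throttles on.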
## Resources

* [{{motlp}} reference](opentelemetry://reference/motlp.md)
* [Quickstart: Send OTLP data to Elastic Serverless or {{ech}}](../../../solutions/observability/get-started/quickstart-elastic-cloud-otel-endpoint.md)

troubleshoot/toc.yml

Lines changed: 1 addition & 0 deletions
@@ -171,6 +171,7 @@ toc:
   - file: ingest/opentelemetry/edot-sdks/misconfigured-sampling-sdk.md
   - file: ingest/opentelemetry/no-data-in-kibana.md
   - file: ingest/opentelemetry/connectivity.md
+  - file: ingest/opentelemetry/429-errors-motlp.md
   - file: ingest/opentelemetry/contact-support.md
   - file: ingest/logstash.md
     children:
