Remove batcher and related config in favor of sending queue (open-telemetry#42767)
#### Description
Removes the deprecated `batcher` configuration and makes `sending_queue`
the default. To preserve the previous behaviour, the sending queue
defaults are kept as close to the current defaults as possible. The PR
also drops `asyncBulkIndexer` and related code; a future PR will
refactor the code around bulk indexer sessions.
#### Link to tracking issue
Part of open-telemetry#42718
#### Testing
Updated
#### Documentation
Updated
---------
Co-authored-by: Carson Ip <[email protected]>
+# Use this changelog template to create an entry for release notes.
+
+# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
+change_type: breaking
+
+# The name of the component, or a single word describing the area of concern, (e.g. filelogreceiver)
+component: elasticsearchexporter
+
+# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
+note: Remove batcher and related config in favor of sending queue
+
+# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
+issues: [42718]
+
+# (Optional) One or more lines of additional information to render under the primary note.
+# These lines will be padded with 2 spaces and then inserted directly into the document.
+# Use pipe (|) for multiline entries.
+subtext: Previously deprecated `batcher` configuration is removed. `num_workers` and `flush` are now deprecated as they conflict with `sending_queue` configurations.
+
+# If your change doesn't affect end users or the exported elements of any package,
+# you should instead start your pull request title with [chore] or use the "Skip Changelog" label.
+# Optional: The change log or logs in which this entry should be included.
+# e.g. '[user]' or '[user, api]'
+# Include 'user' if the change is relevant to end users.
+# Include 'api' if there is a change to a library API.
exporter/elasticsearchexporter/README.md: 31 additions & 50 deletions
@@ -82,52 +82,24 @@ All other defaults are as defined by [confighttp].
 
 ### Queuing and batching
 
-The exporter is transitioning from its own internal batching to OpenTelemetry's standard
-queueing and batching. The below sections describe the current default and the latest
-configuration option for queueing and batching available via the `sending_queue` configuration.
-
-#### Internal batching by Elasticsearch exporter
-
-By default, the exporter will perform its own buffering and batching, as configured through the
-`flush` config. In this case both `sending_queue` and `batcher` will be unused. The exporter
-will perform its own buffering and batching and will issue async requests to Elasticsearch in
-all cases other than if any of the following conditions are met:
-
-- `sending_queue::batch` is defined (irrespective of `sending_queue` being enabled or not)
-- `batcher::enabled` is defined (set to `true` or `false`)
-
-In a future release when the `sending_queue` config is stable, and has feature parity
-with the exporter's existing `flush` config, it will be enabled by default.
-
-Using the `sending_queue` functionality provides several benefits over the default behavior:
-- With a persistent queue, or no queue at all, `sending_queue` enables at least once delivery.
-  On the other hand, with the default behavior, the exporter will accept data and process it
-  asynchronously, which interacts poorly with queueing.
-- By ensuring the exporter makes requests to Elasticsearch synchronously (batching disabled),
-  client metadata can be passed through to Elasticsearch requests,
-  e.g. by using the [`headers_setter` extension](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/extension/headerssetterextension/README.md).
-
-#### Queueing and batching using sending queue
-
 The Elasticsearch exporter supports the common [`sending_queue` settings][exporterhelper] which
-supports both queueing and batching. However, the sending queue is currently disabled by
-default. Sending queue can be enabled by setting `sending_queue::enabled` to `true`. The batching support in sending queue is also disabled by default. Batching can be enabled by defining `sending_queue::batch`.
+supports both queueing and batching. The default sending queue is configured to do async batching
+with the following configuration:
 
-The [`exporterhelper` documentation][exporterhelper] provides more details on the `sending_queue` settings.
-
-#### Deprecated batcher config
-
-> [!WARNING]
-> The `batcher` config is now deprecated and will be removed in an upcoming version. Check the [queueing and batching](#queueing-and-batching) section for using the `sending_queue` setting that supersedes `batcher`. In the interim, `batcher` configurations are still valid, however, they will be ignored if `sending_queue::batch` is defined even if `sending_queue` is not enabled.
-
-The Elasticsearch exporter supports the [common `batcher` settings](https://github.com/open-telemetry/opentelemetry-collector/blob/main/exporter/exporterhelper/internal/queue_sender.go).
+```yaml
+sending_queue:
+  enabled: true
+  sizer: requests
+  num_consumers: 10
+  queue_size: 10
+  batch:
+    flush_timeout: 10s
+    min_size: 1e+6 # 1MB
+    max_size: 5e+6 # 5MB
+    sizer: bytes
+```
 
-- `batcher`:
-  - `enabled` (default=unset): Enable batching of requests into 1 or more bulk requests. On a batcher flush, it is possible for a batched request to be translated to more than 1 bulk request due to `flush::bytes`.
-  - `sizer` (default=items): Unit of `min_size` and `max_size`. Currently supports only "items", in the future will also support "bytes".
-  - `min_size` (default=5000): Minimum batch size to be exported to Elasticsearch, measured in units according to `batcher::sizer`.
-  - `max_size` (default=0): Maximum batch size to be exported to Elasticsearch, measured in units according to `batcher::sizer`. To limit bulk request size, configure `flush::bytes` instead. :warning: It is recommended to keep `max_size` as 0 as a non-zero value may lead to broken metrics grouping and indexing rejections.
-  - `flush_timeout` (default=10s): Maximum time of the oldest item spent inside the batcher buffer, aka "max age of batcher buffer". A batcher flush will happen regardless of the size of content in batcher buffer.
+The defaults are chosen to stay close to those of the exporter's previous built-in batching feature. The [`exporterhelper` documentation][exporterhelper] provides more details on the `sending_queue` settings.
 
 ### Elasticsearch document routing
 
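For orientation, the `sending_queue` block nests under the exporter entry in the collector configuration. The sketch below repeats the default keys listed in the README hunk above, with a larger queue and a shorter flush timeout. The endpoint and the two changed values are illustrative assumptions, not recommendations from the PR.

```yaml
exporters:
  elasticsearch:
    endpoints: [https://elasticsearch.example.com:9200] # hypothetical endpoint
    sending_queue:
      enabled: true
      sizer: requests
      num_consumers: 10
      queue_size: 100       # illustrative: larger than the default of 10
      batch:
        flush_timeout: 5s   # illustrative: shorter than the default of 10s
        min_size: 1000000   # 1MB, same as the default
        max_size: 5000000   # 5MB, same as the default
        sizer: bytes
```

Every key is spelled out here so the example does not depend on how unspecified keys are merged with the exporter defaults.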
@@ -321,20 +293,29 @@ This can be configured through the following settings:
 The Elasticsearch exporter uses the [Elasticsearch Bulk API] for indexing documents.
 The behaviour of this bulk indexing can be configured with the following settings:
 
-- `num_workers` (default=runtime.NumCPU()): Number of workers publishing bulk requests concurrently. Note this is not applicable if `batcher::enabled` is `true` or `false`.
-  - `bytes` (default=5000000): Write buffer flush size limit before compression. A bulk request will be sent immediately when its buffer exceeds this limit. This value should be much lower than [Elasticsearch's `http.max_content_length`](https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-network.html#http-settings) config to avoid HTTP 413 Entity Too Large error. It is recommended to keep this value under 5MB.
-  - `interval` (default=10s): Write buffer flush time limit.
+- `num_workers` (DEPRECATED, use `sending_queue::num_consumers` instead): This config is deprecated and will be used to configure `sending_queue::num_consumers` if `sending_queue::num_consumers` is not explicitly defined. Number of workers publishing bulk requests concurrently.
+- `flush` (DEPRECATED, use `sending_queue` instead): This config is deprecated and will be used to configure different options for `sending_queue` if `sending_queue` options are not explicitly defined. Event bulk indexer buffer flush settings.
+  - `bytes` (DEPRECATED, use `sending_queue::batch::max_size` instead): This config is deprecated and will be used to configure `sending_queue::batch::max_size` if `sending_queue::batch::max_size` is not explicitly defined. See the `sending_queue::batch::max_size` for more details.
+  - `interval` (DEPRECATED, use `sending_queue::batch::flush_timeout` instead): This config is deprecated and will be used to configure `sending_queue::batch::flush_timeout` if `sending_queue::batch::flush_timeout` is not explicitly defined. See the `sending_queue::batch::flush_timeout` for more details.
   - `enabled` (default=true): Enable/Disable request retry on error. Failed requests are retried with exponential backoff.
   - `max_requests` (DEPRECATED, use retry::max_retries instead): Number of HTTP request retries including the initial attempt. If used, `retry::max_retries` will be set to `max_requests - 1`.
   - `max_retries` (default=2): Number of HTTP request retries. To disable retries, set `retry::enabled` to `false` instead of setting `max_retries` to `0`.
   - `initial_interval` (default=100ms): Initial waiting time if a HTTP request failed.
   - `max_interval` (default=1m): Max waiting time if a HTTP request failed.
   - `retry_on_status` (default=[429]): Status codes that trigger request or document level retries. Request level retry and document level retry status codes are shared and cannot be configured separately. To avoid duplicates, it defaults to `[429]`.
-
-> [!NOTE]
-> The `flush::interval` config will be ignored when `batcher::enabled` config is explicitly set to `true` or `false`.
+- `sending_queue`: Configures the queueing and batching behaviour. Below are the defaults (which may differ from the standard defaults); for the full configuration, see the [exporterhelper docs][exporterhelper].
+  - `enabled` (default=true): Enable queueing and batching behaviour.
+  - `num_consumers` (default=10): Number of consumers that dequeue batches.
+  - `wait_for_result` (default=false): If `true`, blocks incoming requests until processed.
+  - `block_on_overflow` (default=false): If `true`, blocks the request until the queue has space.
+  - `sizer` (default=requests): Measure queueing by requests.
+  - `queue_size` (default=10): Maximum size the queue can accept.
+  - `batch`:
+    - `flush_timeout` (default=10s): Time after which batch is exported irrespective of other settings.
+    - `sizer` (default=bytes): Size batches by bytes. Note that bytes here are based on the pdata model and not on the NDJSON docs that will constitute the bulk indexer requests. To address this discrepancy, the bulk indexers may also flush when their size exceeds the configured `max_size`, since the pdata model can be smaller than its corresponding NDJSON encoding.
+    - `min_size` (default=1MB): Min size of the batch.
+    - `max_size` (default=5MB): Max size of the batch. This value should be much lower than [Elasticsearch's `http.max_content_length`](https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-network.html#http-settings) config to avoid HTTP 413 Entity Too Large error. It is recommended to keep this value under 5MB.
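The deprecation notes above map each legacy key to a `sending_queue` key: `num_workers` to `sending_queue::num_consumers`, `flush::bytes` to `sending_queue::batch::max_size`, and `flush::interval` to `sending_queue::batch::flush_timeout`. As a rough migration sketch, a configuration that previously tuned the legacy keys might look like the following. The endpoint and all numeric values are illustrative assumptions, not taken from the PR.

```yaml
# Before: deprecated keys (used only as fallbacks when the
# corresponding sending_queue options are not explicitly defined)
exporters:
  elasticsearch:
    endpoints: [https://elasticsearch.example.com:9200] # hypothetical endpoint
    num_workers: 4
    flush:
      bytes: 2000000
      interval: 5s
```

The equivalent settings expressed through `sending_queue` would then be:

```yaml
# After: the same tuning via sending_queue
exporters:
  elasticsearch:
    endpoints: [https://elasticsearch.example.com:9200] # hypothetical endpoint
    sending_queue:
      num_consumers: 4      # replaces num_workers
      batch:
        flush_timeout: 5s   # replaces flush::interval
        min_size: 1000000   # 1MB, the documented default, kept explicit
        max_size: 2000000   # replaces flush::bytes
        sizer: bytes
```

Because the batch sizer measures the pdata model rather than the NDJSON payload, the resulting bulk request sizes may not match the old `flush::bytes` behaviour exactly.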