Commit f2e9a02

backpressure: fix mem_buf_limit vs storage.max_chunks_up (#1124)
Signed-off-by: Wesley Pettit <[email protected]>
1 parent 8d0bb6a commit f2e9a02

1 file changed

+30
-6
lines changed

administration/backpressure.md

Lines changed: 30 additions & 6 deletions
@@ -2,17 +2,17 @@
22

33
Under certain scenarios it is possible for logs or data to be ingested or created faster than the ability to flush it to some destinations. One such common scenario is when reading from big log files, especially with a large backlog, and dispatching the logs to a backend over the network, which takes time to respond. This generates backpressure leading to high memory consumption in the service.
44

5-
In order to avoid backpressure, Fluent Bit implements a mechanism in the engine that restricts the amount of data that an input plugin can ingest, this is done through the configuration parameter **Mem\_Buf\_Limit**.
5+
To avoid backpressure, Fluent Bit implements a mechanism in the engine that restricts the amount of data that an input plugin can ingest. This is done through the configuration parameters **Mem\_Buf\_Limit** and **storage.Max\_Chunks\_Up**.
66

7-
As described in the [Buffering](../concepts/buffering.md) concepts section, Fluent Bit offers a hybrid mode for data handling: in-memory and filesystem \(optional\).
7+
As described in the [Buffering](../concepts/buffering.md) concepts section, Fluent Bit offers two modes for data handling: in-memory only (default) and in-memory + filesystem \(optional\).
88

9-
In `memory` is always available and can be restricted with **Mem\_Buf\_Limit**. If memory reaches this limit and you reach a backpressure scenario, you will not be able to ingest more data until the data chunks that are in memory can be flushed.
9+
The default `storage.type memory` buffer can be restricted with **Mem\_Buf\_Limit**. If memory reaches this limit and you reach a backpressure scenario, you will not be able to ingest more data until the data chunks that are in memory can be flushed. The input will be paused and Fluent Bit will [emit](https://github.com/fluent/fluent-bit/blob/v2.0.0/src/flb_input_chunk.c#L1334) a `[warn] [input] {input name or alias} paused (mem buf overlimit)` log message. Depending on the input plugin in use, this might lead to incoming data being discarded \(e.g. the TCP input plugin\). The tail plugin can handle a pause without data loss; it will store its current file offset and resume reading later. When buffer memory is available again, the input will resume collecting/accepting logs and Fluent Bit will [emit](https://github.com/fluent/fluent-bit/blob/v2.0.0/src/flb_input_chunk.c#L1277) a `[info] [input] {input name or alias} resume (mem buf overlimit)` message.
1010

11-
Depending on the input plugin in use, this might lead to discard incoming data \(e.g: TCP input plugin\). This can be mitigated by configuring secondary storage on the filesystem using the `storage.type` of `filesystem` \(as described in [Buffering & Storage](buffering-and-storage.md)\). When the limit is reached, all the new data will be stored safely in the filesystem.
11+
This risk of data loss can be mitigated by configuring secondary storage on the filesystem using the `storage.type` of `filesystem` \(as described in [Buffering & Storage](buffering-and-storage.md)\). Initially, logs will be buffered to *both* memory and filesystem. When the `storage.max_chunks_up` limit is reached, Fluent Bit will stop enqueueing new data in memory and all new data will be stored safely in the filesystem only. Please note that when `storage.type filesystem` is set, the `Mem_Buf_Limit` setting no longer has any effect; instead, the `[SERVICE]` level `storage.max_chunks_up` setting controls the size of the memory buffer.
1212

1313
## Mem\_Buf\_Limit
1414

15-
This option is disabled by default and can be applied to all input plugins. Let's explain its behavior using the following scenario:
15+
This option is disabled by default and can be applied to all input plugins. Please note that `Mem_Buf_Limit` only applies with the default `storage.type memory`. Let's explain its behavior using the following scenario:
1616

1717
* Mem\_Buf\_Limit is set to 1MB \(one megabyte\)
1818
* input plugin tries to append 700KB
@@ -36,8 +36,32 @@ After some time, usually measured in seconds, if the scheduler was able to flush
3636
* If the plugin is paused, it invokes a **resume** callback
3737
* input plugin can continue appending more data
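
As an illustration of the scenario above, here is a minimal classic-mode configuration sketch that sets a 1MB limit on a tail input. The log path, host, and port are hypothetical placeholders, not values taken from this commit; the HTTP output stands in for any slow or unavailable backend.

```
[INPUT]
    Name          tail
    # Hypothetical path, for illustration only
    Path          /var/log/app/*.log
    # storage.type is left at its default (memory), so Mem_Buf_Limit applies
    Mem_Buf_Limit 1MB

[OUTPUT]
    Name  http
    Match *
    # Hypothetical backend; if it is slow or down, chunks accumulate in memory
    Host  logs.example.internal
    Port  8080
```

With a setup along these lines, the `paused (mem buf overlimit)` warning should appear once the input's in-memory chunks exceed the limit and the output cannot flush them fast enough.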
3838

39+
## storage.max\_chunks\_up
40+
41+
Please note that when `storage.type filesystem` is set, the `Mem_Buf_Limit` setting no longer has any effect; instead, the `[SERVICE]` level `storage.max_chunks_up` setting controls the size of the memory buffer.
42+
43+
The setting behaves similarly to the above scenario with `Mem_Buf_Limit` when the non-default `storage.pause_on_chunks_overlimit` is enabled.
44+
45+
When `storage.pause_on_chunks_overlimit` is disabled (the default), the input will not pause when the memory limit is reached. Instead, it will switch to only buffering logs in the filesystem. The disk space used for filesystem buffering can be limited with `storage.total_limit_size`.
46+
47+
Please consult the [Buffering & Storage](buffering-and-storage.md) docs for more information.
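
The settings discussed in this section could be combined roughly as follows. This is a sketch under assumptions: the storage path, log path, backend host, and the specific limit values are hypothetical and chosen only for illustration.

```
[SERVICE]
    # Hypothetical location; filesystem buffering needs a storage.path
    storage.path          /var/lib/fluent-bit/storage
    # Caps how many chunks are kept "up" in memory (illustrative value)
    storage.max_chunks_up 128

[INPUT]
    Name          tail
    # Hypothetical path, for illustration only
    Path          /var/log/app/*.log
    storage.type  filesystem
    # Default behavior: do not pause, keep buffering to the filesystem only
    storage.pause_on_chunks_overlimit off

[OUTPUT]
    Name  http
    Match *
    # Hypothetical backend
    Host  logs.example.internal
    Port  8080
    # Illustrative cap on disk space used for this output's buffered chunks
    storage.total_limit_size 500M
```

Setting `storage.pause_on_chunks_overlimit on` instead would make the input pause at the memory limit, similar to the `Mem_Buf_Limit` scenario.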
48+
3949
## About pause and resume Callbacks
4050

4151
Each plugin is independent and not all of them implement the **pause** and **resume** callbacks. As said, these callbacks are just a notification mechanism for the plugin.
4252

43-
One example of a plugin that implements these callbacks and keeps state correctly is the [Tail Input](../pipeline/inputs/tail.md) plugin. When the **pause** callback is triggered, it pauses its collectors and stops appending data. Upon **resume**, it resumes the collectors and continues ingesting data.
53+
One example of a plugin that implements these callbacks and keeps state correctly is the [Tail Input](../pipeline/inputs/tail.md) plugin. When the **pause** callback is triggered, it pauses its collectors and stops appending data. Upon **resume**, it resumes the collectors and continues ingesting data. Tail will track the current file offset when it pauses and resume at the same position. If the file has not been deleted or moved, it can still be read.
54+
55+
With the default `storage.type memory` and `Mem_Buf_Limit`, the following log messages will be emitted for pause and resume:
56+
57+
```
58+
[warn] [input] {input name or alias} paused (mem buf overlimit)
59+
[info] [input] {input name or alias} resume (mem buf overlimit)
60+
```
61+
62+
With `storage.type filesystem` and `storage.max_chunks_up`, the following log messages will be emitted for pause and resume:
63+
64+
```
65+
[input] {input name or alias} paused (storage buf overlimit
66+
[input] {input name or alias} resume (storage buf overlimit
67+
```
