diff --git a/administration/troubleshooting.md b/administration/troubleshooting.md index 033ad23d1..a0b0c93d5 100644 --- a/administration/troubleshooting.md +++ b/administration/troubleshooting.md @@ -2,17 +2,17 @@ -* [Tap Functionality: generate events or records](troubleshooting.md#tap-functionality) -* [Dump Internals Signal](troubleshooting#dump-internals-signal) +- [Tap: generate events or records](troubleshooting.md#tap) +- [Dump internals signal](troubleshooting#dump-internals-signal) -## Tap Functionality +## Tap Tap can be used to generate events or records detailing what messages pass through Fluent Bit, at what time and what filters affect them. -### Simple example +### Basic Tap example -First, we will make sure that the container image we are going to use actually supports Fluent Bit Tap (available in Fluent Bit 2.0+): +Ensure that the container image supports Fluent Bit Tap (available in Fluent Bit 2.0+): ```shell $ docker run --rm -ti fluent/fluent-bit:latest --help | grep trace @@ -23,9 +23,11 @@ $ docker run --rm -ti fluent/fluent-bit:latest --help | grep trace --trace setup a trace pipeline on startup. Uses a single line, ie: "input=dummy.0 output=stdout output.format='json'" ``` -If the `--enable-chunk-trace` option is present it means Fluent Bit has support for Fluent Bit Tap but it is disabled by default, so remember to enable it with this option. +If the `--enable-chunk-trace` option is present, your Fluent Bit version supports +Fluent Bit Tap, but it's disabled by default. Use this option to enable it. 
-You can start fluent-bit with tracing activated from the beginning by using the `trace-input` and `trace-output` properties, like so: +You can start Fluent Bit with tracing activated from the beginning by using the +`--trace-input` and `--trace-output` options: ```bash $ fluent-bit -Z -i dummy -o stdout -f 1 --trace-input=dummy.0 --trace-output=stdout @@ -75,13 +77,13 @@ Fluent Bit v2.1.8 [2023/07/21 16:27:07] [ info] [output:stdout:stdout.0] thread worker #0 stopped ``` -If you see the following warning then the `-Z` or `--enable-chunk-tracing` option is missing: +The following warning indicates the `-Z` or `--enable-chunk-trace` option is missing: -```bash +```text [2023/07/21 16:26:42] [ warn] [chunk trace] enable chunk tracing via the configuration or command line to be able to activate tracing. ``` -Properties can be set for the output using the `--trace-output-property` option: +Set properties for the output using the `--trace-output-property` option: ```bash $ fluent-bit -Z -i dummy -o stdout -f 1 --trace-input=dummy.0 --trace-output=stdout --trace-output-property=format=json_lines @@ -113,25 +115,25 @@ Fluent Bit v2.1.8 [0] dummy.0: [[1689971342.068613646, {}], {"message"=>"dummy"}] ``` -With that options set the stdout plugin is now emitting traces in `json_lines` format: +With that option set, the stdout plugin emits traces in `json_lines` format: ```json {"date":1689971340.068745,"type":1,"trace_id":"0","plugin_instance":"dummy.0","records":[{"timestamp":1689971340,"record":{"message":"dummy"}}],"start_time":1689971340,"end_time":1689971340} ``` -All three options can also be defined using the much more flexible `--trace` option: +All three options can also be defined using the more flexible `--trace` option: ```bash -$ fluent-bit -Z -i dummy -o stdout -f 1 --trace="input=dummy.0 output=stdout output.format=json_lines" +fluent-bit -Z -i dummy -o stdout -f 1 --trace="input=dummy.0 output=stdout output.format=json_lines" ``` -We defined the entire tap 
pipeline using this configuration: `input=dummy.0 output=stdout output.format=json_lines` which defines the following: +This example defines the Tap pipeline using the configuration `input=dummy.0 output=stdout output.format=json_lines`, which sets the following: - * input: dummy.0 (listens to the tag and/or alias `dummy.0`) - * output: stdout (outputs to a stdout plugin) - * output.format: json_lines (sets the stdout format o `json_lines`) +- `input`: `dummy.0` listens to the tag or alias `dummy.0`. +- `output`: `stdout` outputs to a stdout plugin. +- `output.format`: `json_lines` sets the stdout format to `json_lines`. -Tap support can also be activated and deactivated via the embedded web server: +Tap support can also be activated and deactivated using the embedded web server: ```shell $ docker run --rm -ti -p 2020:2020 fluent/fluent-bit:latest -Z -H -i dummy -p alias=input_dummy -o stdout -f 1 @@ -154,17 +156,16 @@ Fluent Bit v2.0.0 ``` -In another terminal we can activate Tap by either using the instance id of the input; `dummy.0` or its alias. - -Since the alias is more predictable that is what we will use: - +In another terminal, activate Tap by using either the instance ID of the input +(`dummy.0`) or its alias. The alias is more predictable, so it's used here: ```shell $ curl 127.0.0.1:2020/api/v1/trace/input_dummy {"status":"ok"} ``` -This response means we have activated Tap, the terminal with Fluent Bit running should now look like this: +This response means Tap is active. The terminal with Fluent Bit running should now +look like this: ```shell [0] dummy.0: [1666346615.203253156, {"message"=>"dummy"}] @@ -185,38 +186,42 @@ This response means we have activated Tap, the terminal with Fluent Bit running ``` -All the records that now appear are those emitted by the activities of the dummy plugin. +All the records that display are those emitted by the dummy plugin. 
-### Complex example +### Complex Tap example -This example takes the same steps but demonstrates the same mechanism works with more complicated configurations. -In this example we will follow a single input of many which passes through several filters. +This example takes the same steps but demonstrates how the mechanism works with more +complicated configurations. -``` +This example follows a single input, out of many, that passes through several +filters. + +```shell $ docker run --rm -ti -p 2020:2020 \ - fluent/fluent-bit:latest \ - -Z -H \ - -i dummy -p alias=dummy_0 -p \ - dummy='{"dummy": "dummy_0", "key_name": "foo", "key_cnt": "1"}' \ - -i dummy -p alias=dummy_1 -p dummy='{"dummy": "dummy_1"}' \ - -i dummy -p alias=dummy_2 -p dummy='{"dummy": "dummy_2"}' \ - -F record_modifier -m 'dummy.0' -p record="powered_by fluent" \ - -F record_modifier -m 'dummy.1' -p record="powered_by fluent-bit" \ - -F nest -m 'dummy.0' \ - -p operation=nest -p wildcard='key_*' -p nest_under=data \ - -o null -m '*' -f 1 + fluent/fluent-bit:latest \ + -Z -H \ + -i dummy -p alias=dummy_0 -p \ + dummy='{"dummy": "dummy_0", "key_name": "foo", "key_cnt": "1"}' \ + -i dummy -p alias=dummy_1 -p dummy='{"dummy": "dummy_1"}' \ + -i dummy -p alias=dummy_2 -p dummy='{"dummy": "dummy_2"}' \ + -F record_modifier -m 'dummy.0' -p record="powered_by fluent" \ + -F record_modifier -m 'dummy.1' -p record="powered_by fluent-bit" \ + -F nest -m 'dummy.0' \ + -p operation=nest -p wildcard='key_*' -p nest_under=data \ + -o null -m '*' -f 1 ``` -To make sure the window is not cluttered by the actual records generated by the input plugins we send all of it to `null`. +To ensure the window isn't cluttered by the records generated by the input plugins, +send all of them to `null`. 
-We activate with the following 'curl' command: +Activate with the following `curl` command: ```shell $ curl 127.0.0.1:2020/api/v1/trace/dummy_0 {"status":"ok"} ``` -Now we should start seeing output similar to the following: +You should start seeing output similar to the following: ```shell [0] trace: [1666349359.325597543, {"type"=>1, "trace_id"=>"trace.0", "plugin_instance"=>"dummy.0", "plugin_alias"=>"dummy_0", "records"=>[{"timestamp"=>1666349359, "record"=>{"dummy"=>"dummy_0", "key_name"=>"foo", "key_cnt"=>"1"}}], "start_time"=>1666349359, "end_time"=>1666349359}] @@ -251,12 +256,18 @@ Now we should start seeing output similar to the following: [2022/10/21 10:49:25] [ info] [output:null:null.0] thread worker #0 stopping... [2022/10/21 10:49:25] [ info] [output:null:null.0] thread worker #0 stopped ``` -### Parameters for the output in Tap -When activating Tap, any plugin parameter can be given. These can be used to modify, for example, the output format, the name of the time key, the format of the date, etc. -In the next example we will use the parameter ```"format": "json"``` to demonstrate how in Tap, stdout can be shown in Json format. +### Parameters for the output in Tap + +When activating Tap, any plugin parameter can be given. These parameters can be used +to modify the output format, the name of the time key, the format of the date, and +other details. + +The following example uses the parameter `"format": "json"` to demonstrate how +to show `stdout` in JSON format. First, run Fluent Bit enabling Tap: + ```shell $ docker run --rm -ti -p 2020:2020 fluent/fluent-bit:latest -Z -H -i dummy -p alias=input_dummy -o stdout -f 1 Fluent Bit v2.0.8 @@ -277,108 +288,102 @@ Fluent Bit v2.0.8 [0] dummy.0: [1674805466.973669512, {"message"=>"dummy"}] ... 
``` -Next, in another terminal, we activate Tap including the output, in this case stdout, and the parameters wanted, in this case ```"format": "json"```: + +In another terminal, activate Tap including the output (`stdout`), and the +parameters wanted (`"format": "json"`): ```shell $ curl 127.0.0.1:2020/api/v1/trace/input_dummy -d '{"output":"stdout", "params": {"format": "json"}}' {"status":"ok"} ``` -In the first terminal, we should be seeing the output similar to the following: + +In the first terminal, you should see the output similar to the following: + ```shell [0] dummy.0: [1674805635.972373840, {"message"=>"dummy"}] [{"date":1674805634.974457,"type":1,"trace_id":"0","plugin_instance":"dummy.0","plugin_alias":"input_dummy","records":[{"timestamp":1674805634,"record":{"message":"dummy"}}],"start_time":1674805634,"end_time":1674805634},{"date":1674805634.974605,"type":3,"trace_id":"0","plugin_instance":"dummy.0","plugin_alias":"input_dummy","records":[{"timestamp":1674805634,"record":{"message":"dummy"}}],"start_time":1674805634,"end_time":1674805634},{"date":1674805635.972398,"type":1,"trace_id":"1","plugin_instance":"dummy.0","plugin_alias":"input_dummy","records":[{"timestamp":1674805635,"record":{"message":"dummy"}}],"start_time":1674805635,"end_time":1674805635},{"date":1674805635.972413,"type":3,"trace_id":"1","plugin_instance":"dummy.0","plugin_alias":"input_dummy","records":[{"timestamp":1674805635,"record":{"message":"dummy"}}],"start_time":1674805635,"end_time":1674805635}] [0] dummy.0: [1674805636.973970215, {"message"=>"dummy"}] 
[{"date":1674805636.974008,"type":1,"trace_id":"2","plugin_instance":"dummy.0","plugin_alias":"input_dummy","records":[{"timestamp":1674805636,"record":{"message":"dummy"}}],"start_time":1674805636,"end_time":1674805636},{"date":1674805636.974034,"type":3,"trace_id":"2","plugin_instance":"dummy.0","plugin_alias":"input_dummy","records":[{"timestamp":1674805636,"record":{"message":"dummy"}}],"start_time":1674805636,"end_time":1674805636}] ``` -This parameter shows stdout in Json format, however, as mentioned before, parameters can be passed to any plugin. -Please visit the following link for more information on other output plugins: -https://docs.fluentbit.io/manual/pipeline/outputs +This parameter shows stdout in JSON format. + +See [output plugins](https://docs.fluentbit.io/manual/pipeline/outputs) for +additional information. -### Analysis of a single Tap record +### Analyze a single Tap record -Here we analyze a single record from a filter event to explain the meaning of each field in detail. -We chose a filter record since it includes the most details of all the record types. +This filter record is an example to explain the details of a Tap record: ```json { - "type": 2, - "start_time": 1666349231, - "end_time": 1666349231, - "trace_id": "trace.1", - "plugin_instance": "nest.2", - "records": [{ - "timestamp": 1666349231, - "record": { - "dummy": "dummy_0", - "powered_by": "fluent", - "data": { - "key_name": "foo", - "key_cnt": "1" - } - } - }] + "type": 2, + "start_time": 1666349231, + "end_time": 1666349231, + "trace_id": "trace.1", + "plugin_instance": "nest.2", + "records": [{ + "timestamp": 1666349231, + "record": { + "dummy": "dummy_0", + "powered_by": "fluent", + "data": { + "key_name": "foo", + "key_cnt": "1" + } + } + }] } ``` -### type - -The type defines at what stage the event is generated: - -- type=1: input record - - this is the unadulterated input record -- type=2: filtered record - - this is a record once it has been filtered. 
One record is generated per filter. -- type=3: pre-output record - - this is the record right before it is sent for output. - -Since this is a record generated by the manipulation of a record by a filter is has the type `2`. - -### start_time and end_time - -This records the start and end of an event, it is a bit different for each event type: - -- type 1: when the input is received, both the start and end time. -- type 2: the time when filtering is matched until it has finished processing. -- type 3: the time when the input is received and when it is finally slated for output. - -### trace_id - -This is a string composed of a prefix and a number which is incremented with each record received by the input during the Tap session. - -### plugin_instance - -This is the plugin instance name as it is generated by Fluent Bit at runtime. - -### plugin_alias - -If an alias is set this field will contain the alias set for a plugin. - -### records - -This is an array of all the records being sent. Since Fluent Bit handles records in chunks of multiple records and chunks are indivisible the same is done in the Tap output. Each record consists of its timestamp followed by the actual data which is a composite type of keys and values. +- `type`: Defines the stage at which the event is generated: + - `1`: Input record. This is the unmodified input record. + - `2`: Filtered record. This is a record after it was filtered. One record is + generated per filter. + - `3`: Pre-output record. This is the record right before it's sent for output. + + This example is a record produced by a filter modifying a record, so + it has the type `2`. +- `start_time` and `end_time`: Record the start and end of an event. The meaning is + different for each event type: + - type 1: When the input is received, both the start and end time. + - type 2: The time when filtering is matched until it has finished processing. 
+ - type 3: The time when the input is received and when it's finally slated for output. +- `trace_id`: A string composed of a prefix and a number which is incremented with + each record received by the input during the Tap session. +- `plugin_instance`: The plugin instance name as generated by Fluent Bit at runtime. +- `plugin_alias`: If an alias is set, this field contains the alias set for the plugin. +- `records`: An array of all the records being sent. Because Fluent Bit handles records in + chunks of multiple records and chunks are indivisible, the same is done in the Tap + output. Each record consists of its timestamp followed by the actual data, which is + a composite type of keys and values. ## Dump Internals / Signal -When the service is running we can export [metrics](monitoring.md) to see the overall status of the data flow of the service. But there are other use cases where we would like to know the current status of the internals of the service, specifically to answer questions like _what's the current status of the internal buffers ?_ , the Dump Internals feature is the answer. +When the service is running, you can export [metrics](monitoring.md) to see the +overall status of the data flow of the service. There are other use cases where +you might need to know the current status of the service internals, like the current +status of the internal buffers. Dump Internals can help provide this information. -Fluent Bit v1.4 introduces the Dump Internals feature that can be triggered easily from the command line triggering the `CONT` Unix signal. +Fluent Bit v1.4 introduced the Dump Internals feature, which can be triggered from +the command line by sending the `CONT` Unix signal. {% hint style="info" %} -note: this feature is only available on Linux and BSD family operating systems +This feature is only available on Linux and BSD operating systems. 
{% endhint %} ### Usage Run the following `kill` command to signal Fluent Bit: -```text +```shell kill -CONT `pidof fluent-bit` ``` -> The command `pidof` aims to lookup the Process ID of Fluent Bit. You can replace the +The `pidof` command looks up the process ID of Fluent Bit. -Fluent Bit will dump the following information to the standard output interface \(stdout\): +Fluent Bit dumps the following information to the standard output interface +(`stdout`): ```text [engine] caught signal (SIGCONT) @@ -414,9 +419,9 @@ total chunks : 92 └─ down : 57 ``` -### Input Plugins Dump +### Input plugins -The dump provides insights for every input instance configured. +The input plugins dump provides insights for every input instance configured. ### Status @@ -424,46 +429,52 @@ Overall ingestion status of the plugin. | Entry | Sub-entry | Description | | :--- | :--- | :--- | -| overlimit | | If the plugin has been configured with [Mem\_Buf\_Limit](backpressure.md), this entry will report if the plugin is over the limit or not at the moment of the dump. If it is overlimit, it will print `yes`, otherwise `no`. | -| | mem\_size | Current memory size in use by the input plugin in-memory. | -| | mem\_limit | Limit set by Mem\_Buf\_Limit. | +| `overlimit` | | If the plugin has been configured with [`Mem_Buf_Limit`](backpressure.md), this entry reports whether the plugin is over the limit at the moment of the dump. Over the limit prints `yes`, otherwise `no`. | +| | `mem_size` | Current memory size in use by the input plugin. | +| | `mem_limit` | Limit set by `Mem_Buf_Limit`. | ### Tasks -When an input plugin ingest data into the engine, a Chunk is created. A Chunk can contains multiple records. Upon flush time, the engine creates a Task that contains the routes for the Chunk associated in question. +When an input plugin ingests data into the engine, a Chunk is created. A Chunk can +contain multiple records. 
At flush time, the engine creates a Task that contains the +routes for the Chunk in question. The Task dump describes the tasks associated to the input plugin: | Entry | Description | | :--- | :--- | -| total\_tasks | Total number of active tasks associated to data generated by the input plugin. | -| new | Number of tasks not assigned yet to an output plugin. Tasks are in `new` status for a very short period of time \(most of the time this value is very low or zero\). | -| running | Number of active tasks being processed by output plugins. | -| size | Amount of memory used by the Chunks being processed \(Total chunks size\). | +| `total_tasks` | Total number of active tasks associated with data generated by the input plugin. | +| `new` | Number of tasks not yet assigned to an output plugin. Tasks are in `new` status for a very short period of time. This value is normally very low or zero. | +| `running` | Number of active tasks being processed by output plugins. | +| `size` | Amount of memory used by the Chunks being processed (total chunk size). | ### Chunks -The Chunks dump tells more details about all the chunks that the input plugin has generated and are still being processed. +The Chunks dump provides more details about all the Chunks that the input plugin has +generated and that are still being processed. -Depending of the buffering strategy and limits imposed by configuration, some Chunks might be `up` \(in memory\) or `down` \(filesystem\). +Depending on the buffering strategy and limits imposed by configuration, some Chunks +might be `up` (in memory) or `down` (filesystem). | Entry | Sub-entry | Description | | :--- | :--- | :--- | -| total\_chunks | | Total number of Chunks generated by the input plugin that are still being processed by the engine. | -| up\_chunks | | Total number of Chunks that are loaded in memory. | -| down\_chunks | | Total number of Chunks that are stored in the filesystem but not loaded in memory yet. 
| -| busy\_chunks | | Chunks marked as busy \(being flushed\) or locked. Busy Chunks are immutable and likely are ready to \(or being\) processed. | -| | size | Amount of bytes used by the Chunk. | -| | size err | Number of Chunks in an error state where it size could not be retrieved. | +| `total_chunks` | | Total number of Chunks generated by the input plugin that are still being processed by the engine. | +| `up_chunks` | | Total number of Chunks loaded in memory. | +| `down_chunks` | | Total number of Chunks stored in the filesystem but not loaded in memory yet. | +| `busy_chunks` | | Chunks marked as busy (being flushed) or locked. Busy Chunks are immutable and are likely ready to be processed or are being processed. | +| | `size` | Number of bytes used by the Chunk. | +| | `size err` | Number of Chunks in an error state whose size couldn't be retrieved. | -### Storage Layer Dump +### Storage Layer -Fluent Bit relies on a custom storage layer interface designed for hybrid buffering. The `Storage Layer` entry contains a total summary of Chunks registered by Fluent Bit: +Fluent Bit relies on a custom storage layer interface designed for hybrid buffering. +The `Storage Layer` entry contains a total summary of Chunks registered by Fluent +Bit: | Entry | Sub-Entry | Description | | :--- | :--- | :--- | -| total chunks | | Total number of Chunks | -| mem chunks | | Total number of Chunks memory-based | -| fs chunks | | Total number of Chunks filesystem based | -| | up | Total number of filesystem chunks up in memory | -| | down | Total number of filesystem chunks down \(not loaded in memory\) | +| `total chunks` | | Total number of Chunks. | +| `mem chunks` | | Total number of memory-based Chunks. | +| `fs chunks` | | Total number of filesystem-based Chunks. | +| | `up` | Total number of filesystem chunks up in memory. | +| | `down` | Total number of filesystem chunks down (not loaded in memory). |
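Editorial sketch, not part of the change set: the `kill -CONT` step from the Usage section can also be scripted. The following Python example is a minimal sketch of that idea, assuming a Linux or BSD host where `pidof` is installed. The helper names `parse_pids`, `find_pids`, and `request_dump` are hypothetical and aren't part of Fluent Bit.

```python
from __future__ import annotations

import os
import shutil
import signal
import subprocess


def parse_pids(pidof_output: str) -> list[int]:
    """Turn the whitespace-separated output of `pidof` into a list of PIDs."""
    return [int(token) for token in pidof_output.split()]


def find_pids(process_name: str = "fluent-bit") -> list[int]:
    """Look up PIDs with `pidof`; return an empty list if nothing is found."""
    if shutil.which("pidof") is None:
        # Assumption: without `pidof` this sketch simply reports no processes.
        return []
    result = subprocess.run(
        ["pidof", process_name], capture_output=True, text=True, check=False
    )
    return parse_pids(result.stdout)


def request_dump(pid: int) -> None:
    """Send SIGCONT, the signal that triggers the internals dump."""
    os.kill(pid, signal.SIGCONT)


if __name__ == "__main__":
    for pid in find_pids():
        request_dump(pid)
        print(f"sent SIGCONT to {pid}")
```

The dump itself still appears on the standard output of the Fluent Bit process, not in the output of this script.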