| /api/v1/metrics/prometheus | Display internal metrics per loaded plugin in Prometheus Server format. | Prometheus Text 0.0.4 |
106
-
| /api/v1/storage | Get internal metrics of the storage layer / buffered data. This option is enabled only if in the `SERVICE` section of the property `storage.metrics` is enabled. | JSON |
107
-
| /api/v1/health | Display the Fluent Bit health check result. | String |
108
-
| /api/v2/metrics | Display internal metrics per loaded plugin. |[cmetrics text format](https://github.com/fluent/cmetrics)|
109
-
| /api/v2/metrics/prometheus | Display internal metrics per loaded plugin ready in Prometheus Server format. | Prometheus Text 0.0.4 |
110
-
| /api/v2/reload | Execute hot reloading or get the status of hot reloading. See the [hot-reloading documentation](hot-reload.md). | JSON |
101
+
|`/`| Fluent Bit build information. | JSON |
102
+
|`/api/v1/uptime`| Return uptime information in seconds. | JSON |
103
+
|`/api/v1/metrics`| Display internal metrics per loaded plugin. | JSON |
104
+
|`/api/v1/metrics/prometheus`| Display internal metrics per loaded plugin in Prometheus Server format. | Prometheus Text 0.0.4 |
105
+
|`/api/v1/storage`| Get internal metrics of the storage layer / buffered data. This option is enabled only if in the `SERVICE` section of the property `storage.metrics` is enabled. | JSON |
106
+
|`/api/v1/health`| Display the Fluent Bit health check result. | String |
107
+
|`/api/v2/metrics`| Display internal metrics per loaded plugin. |[cmetrics text format](https://github.com/fluent/cmetrics)|
108
+
|`/api/v2/metrics/prometheus`| Display internal metrics per loaded plugin ready in Prometheus Server format. | Prometheus Text 0.0.4 |
109
+
|`/api/v2/reload`| Execute hot reloading or get the status of hot reloading. See the [hot-reloading documentation](hot-reload.md). | JSON |
111
110
112
111
### v1 metrics
113
112
@@ -131,14 +130,14 @@ The following terms are key to understanding how Fluent Bit processes metrics:
131
130
as successful, or it can fail the chunk entirely if an unrecoverable error is
132
131
encountered, or it can ask for the chunk to be retried.
133
132
134
-
| Metric name | Labels | Description | Type | Unit|
| Metric name | Labels | Description | Type | Unit |
134
+
|-----------|------|-----------|----|----|
136
135
|`fluentbit_input_bytes_total`| name: the name or alias for the input instance | The number of bytes of log records that this input instance has ingested successfully. | counter | bytes |
137
136
|`fluentbit_input_records_total`| name: the name or alias for the input instance | The number of log records this input ingested successfully. | counter | records |
138
137
|`fluentbit_output_dropped_records_total`| name: the name or alias for the output instance | The number of log records dropped by the output. These records hit an unrecoverable error or retries expired for their chunk. | counter | records |
139
138
|`fluentbit_output_errors_total`| name: the name or alias for the output instance | The number of chunks with an error that's either unrecoverable or unable to retry. This metric represents the number of times a chunk failed, and doesn't correspond with the number of error messages visible in the Fluent Bit log output. | counter | chunks |
140
-
|`fluentbit_output_proc_bytes_total`| name: the name or alias for the output instance | The number of bytes of log records that this output instance sent successfully. This metric represents the total byte size of all unique chunks sent by this output. If a record is not sent due to some error, it doesn't count towards this metric. | counter | bytes |
141
-
|`fluentbit_output_proc_records_total`| name: the name or alias for the output instance | The number of log records that this output instance sent successfully. This metric represents the total record count of all unique chunks sent by this output. If a record is not sent successfully, it doesn't count towards this metric. | counter | records |
139
+
|`fluentbit_output_proc_bytes_total`| name: the name or alias for the output instance | The number of bytes of log records that this output instance sent successfully. This metric represents the total byte size of all unique chunks sent by this output. If a record isn't sent due to some error, it doesn't count towards this metric. | counter | bytes |
140
+
|`fluentbit_output_proc_records_total`| name: the name or alias for the output instance | The number of log records that this output instance sent successfully. This metric represents the total record count of all unique chunks sent by this output. If a record isn't sent successfully, it doesn't count towards this metric. | counter | records |
142
141
|`fluentbit_output_retried_records_total`| name: the name or alias for the output instance | The number of log records that experienced a retry. This metric is calculated at the chunk level, the count increased when an entire chunk is marked for retry. An output plugin might perform multiple actions that generate many error messages when uploading a single chunk. | counter | records |
143
142
|`fluentbit_output_retries_failed_total`| name: the name or alias for the output instance | The number of times that retries expired for a chunk. Each plugin configures a `Retry_Limit`, which applies to chunks. When the `Retry_Limit` is exceeded, the chunk is discarded and this metric is incremented. | counter | chunks |
144
143
|`fluentbit_output_retries_total`| name: the name or alias for the output instance | The number of times this output instance requested a retry for a chunk. | counter | chunks |
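
The counters in the table above are exposed as Prometheus text lines with a `name` label per plugin instance. As a minimal sketch of how such output could be read, the following Python parses sample lines into `(metric, labels, value)` tuples. The `SAMPLE` text, its values and timestamps, and the helper name `parse_metrics` are hypothetical and chosen for illustration; the parser handles only simple `name{labels} value [timestamp]` lines, not the full exposition format.

```python
import re

# Matches: metric_name{label="value",...} value [timestamp]
LINE_RE = re.compile(
    r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+(-?\d+))?$'
)
# Matches one label pair such as name="cpu.0"
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (metric name, labels dict, float value) tuples."""
    samples = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue  # Ignore lines this simplified parser can't handle.
        name, label_blob, value, _timestamp = match.groups()
        labels = dict(LABEL_RE.findall(label_blob or ""))
        samples.append((name, labels, float(value)))
    return samples


# Hypothetical sample output; real values and timestamps will differ.
SAMPLE = """\
# TYPE fluentbit_input_records_total counter
fluentbit_input_records_total{name="cpu.0"} 57 1509150350542
# TYPE fluentbit_output_retries_total counter
fluentbit_output_retries_total{name="stdout.0"} 0 1509150350542
"""

if __name__ == "__main__":
    for metric, labels, value in parse_metrics(SAMPLE):
        print(metric, labels.get("name"), value)
```

For production use, an existing Prometheus client library parser is likely a better choice than a hand-written regular expression.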
@@ -163,7 +162,7 @@ The following descriptions apply to metrics outputted in JSON format by the
163
162
|`input_chunks.{plugin name}.chunks.total`| The current total number of chunks owned by this input instance. | chunks |
164
163
|`input_chunks.{plugin name}.chunks.up`| The current number of chunks that are in memory for this input. If file system storage is enabled, chunks that are "up" are also stored in the filesystem layer. | chunks |
165
164
|`input_chunks.{plugin name}.chunks.down`| The current number of chunks that are "down" in the filesystem for this input. | chunks |
166
-
|`input_chunks.{plugin name}.chunks.busy`| Chunks are that are being processed or sent by outputs and are not eligible to have new data appended. | chunks |
165
+
|`input_chunks.{plugin name}.chunks.busy`| The current number of chunks that are being processed or sent by outputs and aren't eligible to have new data appended. | chunks |
167
166
|`input_chunks.{plugin name}.chunks.busy_size`| The sum of the byte size of each chunk which is currently marked as busy. | bytes |
168
167
169
168
### v2 metrics
@@ -198,8 +197,8 @@ The following terms are key to understanding how Fluent Bit processes metrics:
198
197
|`fluentbit_filter_drop_records_total`| name: the name or alias for the filter instance | The number of log records dropped by the filter and removed from the data pipeline. | counter | records |
199
198
|`fluentbit_output_dropped_records_total`| name: the name or alias for the output instance | The number of log records dropped by the output. These records hit an unrecoverable error or retries expired for their chunk. | counter | records |
200
199
|`fluentbit_output_errors_total`| name: the name or alias for the output instance | The number of chunks with an error that's either unrecoverable or unable to retry. This metric represents the number of times a chunk failed, and doesn't correspond with the number of error messages visible in the Fluent Bit log output. | counter | chunks |
201
-
|`fluentbit_output_proc_bytes_total`| name: the name or alias for the output instance | The number of bytes of log records that this output instance sent successfully. This metric represents the total byte size of all unique chunks sent by this output. If a record is not sent due to some error, it doesn't count towards this metric. | counter | bytes |
202
-
|`fluentbit_output_proc_records_total`| name: the name or alias for the output instance | The number of log records that this output instance sent successfully. This metric represents the total record count of all unique chunks sent by this output. If a record is not sent successfully, it doesn't count towards this metric. | counter | records |
200
+
|`fluentbit_output_proc_bytes_total`| name: the name or alias for the output instance | The number of bytes of log records that this output instance sent successfully. This metric represents the total byte size of all unique chunks sent by this output. If a record isn't sent due to some error, it doesn't count towards this metric. | counter | bytes |
201
+
|`fluentbit_output_proc_records_total`| name: the name or alias for the output instance | The number of log records that this output instance sent successfully. This metric represents the total record count of all unique chunks sent by this output. If a record isn't sent successfully, it doesn't count towards this metric. | counter | records |
203
202
|`fluentbit_output_retried_records_total`| name: the name or alias for the output instance | The number of log records that experienced a retry. This metric is calculated at the chunk level, the count increased when an entire chunk is marked for retry. An output plugin might perform multiple actions that generate many error messages when uploading a single chunk. | counter | records |
204
203
|`fluentbit_output_retries_failed_total`| name: the name or alias for the output instance | The number of times that retries expired for a chunk. Each plugin configures a `Retry_Limit`, which applies to chunks. When the `Retry_Limit` is exceeded, the chunk is discarded and this metric is incremented. | counter | chunks |
205
204
|`fluentbit_output_retries_total`| name: the name or alias for the output instance | The number of times this output instance requested a retry for a chunk. | counter | chunks |
@@ -227,7 +226,7 @@ layer.
227
226
|`fluentbit_input_storage_chunks`| name: the name or alias for the input instance | The current total number of chunks owned by this input instance. | gauge | chunks |
228
227
|`fluentbit_input_storage_chunks_up`| name: the name or alias for the input instance | The current number of chunks that are in memory for this input. If file system storage is enabled, chunks that are "up" are also stored in the filesystem layer. | gauge | chunks |
229
228
|`fluentbit_input_storage_chunks_down`| name: the name or alias for the input instance | The current number of chunks that are "down" in the filesystem for this input. | gauge | chunks |
230
-
|`fluentbit_input_storage_chunks_busy`| name: the name or alias for the input instance | Chunks are that are being processed or sent by outputs and are not eligible to have new data appended. | gauge | chunks |
229
+
|`fluentbit_input_storage_chunks_busy`| name: the name or alias for the input instance | The current number of chunks that are being processed or sent by outputs and aren't eligible to have new data appended. | gauge | chunks |
231
230
|`fluentbit_input_storage_chunks_busy_bytes`| name: the name or alias for the input instance | The sum of the byte size of each chunk which is currently marked as busy. | gauge | bytes |
232
231
|`fluentbit_output_upstream_total_connections`| name: the name or alias for the output instance | The total number of connections for this output plugin. | gauge | connections |
233
232
|`fluentbit_output_upstream_busy_connections`| name: the name or alias for the output instance | The number of connections in a busy state for this output plugin. | gauge | connections |
@@ -236,8 +235,8 @@ layer.
236
235
237
236
Query the service uptime with the following command:
|`Health_Check`|enable Health check feature | Off |
379
-
|`HC_Errors_Count`| the error count to meet the unhealthy requirement, this is a sum for all output plugins in a defined HC_Period, example for output error: `[2022/02/16 10:44:10] [ warn] [engine] failed to flush chunk '1-1645008245.491540684.flb', retry in 7 seconds: task_id=0, input=forward.1 > output=cloudwatch_logs.3 (out_id=3)`|5|
380
-
|`HC_Retry_Failure_Count`| the retry failure count to meet the unhealthy requirement, this is a sum for all output plugins in a defined HC_Period, example for retry failure: `[2022/02/16 20:11:36] [ warn] [engine] chunk '1-1645042288.260516436.flb' cannot be retried: task_id=0, input=tcp.3 > output=cloudwatch_logs.1`|5|
381
-
|`HC_Period`| The time period by second to count the error and retry failure data point |60|
377
+
|`Health_Check`|Enable Health check feature |`Off`|
378
+
|`HC_Errors_Count`| the error count to meet the unhealthy requirement, this is a sum for all output plugins in a defined `HC_Period`, example for output error: `[2022/02/16 10:44:10] [ warn] [engine] failed to flush chunk '1-1645008245.491540684.flb', retry in 7 seconds: task_id=0, input=forward.1 > output=cloudwatch_logs.3 (out_id=3)`|`5`|
379
+
|`HC_Retry_Failure_Count`| the retry failure count to meet the unhealthy requirement, this is a sum for all output plugins in a defined `HC_Period`, example for retry failure: `[2022/02/16 20:11:36] [ warn] [engine] chunk '1-1645042288.260516436.flb' cannot be retried: task_id=0, input=tcp.3 > output=cloudwatch_logs.1`|`5`|
380
+
|`HC_Period`| The time period, in seconds, over which errors and retry failures are counted. |`60`|
382
381
383
382
Not every error log entry counts as an error. Errors and retry failures are counted
384
383
only for specific messages, such as the examples in the configuration table descriptions.
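
To make the threshold logic concrete, here is a minimal Python sketch of how the two counters might be compared against their limits within one `HC_Period`. This isn't Fluent Bit's implementation: the class `HealthConfig` and the function `is_unhealthy` are hypothetical names. The defaults mirror the configuration table, and the strict `>` comparison follows the health status equation this document gives for the example configuration.

```python
from dataclasses import dataclass


@dataclass
class HealthConfig:
    """Hypothetical container mirroring the health check settings."""
    errors_count: int = 5          # HC_Errors_Count
    retry_failure_count: int = 5   # HC_Retry_Failure_Count
    period_seconds: int = 60       # HC_Period (window the counts apply to)


def is_unhealthy(errors: int, retry_failures: int, cfg: HealthConfig) -> bool:
    """Return True if either count, summed across all output plugins
    within one period, exceeds its configured threshold."""
    return errors > cfg.errors_count or retry_failures > cfg.retry_failure_count


if __name__ == "__main__":
    cfg = HealthConfig()
    print(is_unhealthy(errors=6, retry_failures=0, cfg=cfg))
    print(is_unhealthy(errors=5, retry_failures=5, cfg=cfg))
```

The sketch assumes the caller has already summed the counts for the current period; how Fluent Bit tracks and resets that window internally isn't covered here.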
@@ -425,7 +424,7 @@ Use the following command to call the health endpoint:
425
424
curl -s http://127.0.0.1:2020/api/v1/health
426
425
```
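
The same request can be made from a script. The following is a sketch using only the Python standard library and the address from the `curl` command above. The function names `health_url` and `fetch_health` are hypothetical, and the host and port should be adjusted to match your HTTP server settings.

```python
import urllib.error
import urllib.request


def health_url(host: str = "127.0.0.1", port: int = 2020) -> str:
    """Build the health endpoint URL used in the curl example."""
    return f"http://{host}:{port}/api/v1/health"


def fetch_health(url: str, timeout: float = 2.0):
    """Return (HTTP status code, response body) for the health endpoint.

    urlopen raises HTTPError for 4xx/5xx responses, so in that case the
    status and body are read from the exception instead.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8").strip()
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8").strip()


if __name__ == "__main__":
    # Requires a running Fluent Bit instance with the health check enabled.
    status, body = fetch_health(health_url())
    print(status, body)
```

Connection failures, such as Fluent Bit not running, raise `urllib.error.URLError`, which this sketch deliberately leaves to the caller.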
427
426
428
-
With the example config, the health status is determined by the following equation:
427
+
With the example configuration, the health status is determined by the following equation:
429
428
430
429
```text
431
430
Health status = (HC_Errors_Count > 5) OR (HC_Retry_Failure_Count > 5) IN 5 seconds
@@ -437,5 +436,5 @@ Health status = (HC_Errors_Count > 5) OR (HC_Retry_Failure_Count > 5) IN 5 secon
437
436
## Telemetry Pipeline
438
437
439
438
[Telemetry Pipeline](https://chronosphere.io/platform/telemetry-pipeline/) is a
440
-
hosted service that allows you to monitor your Fluent Bit agents including data flow,
439
+
hosted service that lets you monitor your Fluent Bit agents including data flow,