fluent · tarruda · Jul 22, 2022 · Oct 27, 2025 · Oct 27, 2025
@@ -476,6 +476,11 @@
 | `HC_Errors_Count`       | the error count to meet the unhealthy requirement, this is a sum for all output plugins in a defined `HC_Period`, example for output error: `[2022/02/16 10:44:10] [ warn] [engine] failed to flush chunk '1-1645008245.491540684.flb', retry in 7 seconds: task_id=0, input=forward.1 > output=cloudwatch_logs.3 (out_id=3)` | `5` |
 | `HC_Retry_Failure_Count` | the retry failure count to meet the unhealthy requirement, this is a sum for all output plugins in a defined `HC_Period`, example for retry failure: `[2022/02/16 20:11:36] [ warn] [engine] chunk '1-1645042288.260516436.flb' cannot be retried: task_id=0, input=tcp.3 > output=cloudwatch_logs.1` | `5` |
 | `HC_Period` | The time period by second to count the error and retry failure data point | `60` |
+| `HC_Throughput` | Enable throughput health checking. Here, throughput is the `OUTPUT_RATE/INPUT_RATE` ratio, and the check runs according to `HC_Period`. If this is `On`, all related `HC_Throughput_*` options must be set because they have no default values. | `Off` |
+| `HC_Throughput_Input_Plugins`    | Comma-separated list of input plugins used to calculate the input rate. | _none_ |
+| `HC_Throughput_Output_Plugins`   | Comma-separated list of output plugins used to calculate the output rate. | _none_ |
+| `HC_Throughput_Ratio_Threshold`  | `OUTPUT_RATE/INPUT_RATE` ratio failure threshold. If the ratio is below this number, the current check fails. A single failed check isn't enough to trigger a health error; see `HC_Throughput_Min_Failures` for details. | _none_ |
+| `HC_Throughput_Min_Failures`     | Minimum number of consecutive ratio check failures required before the health endpoint returns an error. For example, if this is `60` and `HC_Period` is left at its default, the ratio must stay below the threshold for 1 minute before an error is returned. | _none_ |
 
 Not every error log means an error to be counted. The error retry failures count only on specific errors, which is the example in configuration table description.
 
@@ -527,18 +532,34 @@
   HC_Errors_Count 5
   HC_Retry_Failure_Count 5
   HC_Period 5
-
+
+
 [INPUT]
   Name  cpu
 
+
 [OUTPUT]
   Name  stdout
   Match *
 ```
 
+### Throughput health check
+
+If `HC_Throughput` and its related options are set, Fluent Bit monitors the output/input ratio, and the health endpoint returns an error if the ratio stays below the configured threshold. For example:
+
+```text
+hc_throughput                 On
+hc_throughput_input_plugins   tail.0
+hc_throughput_output_plugins  http.0
+hc_throughput_ratio_threshold 0.1
+hc_throughput_min_failures    60
+```
+
 {% endtab %}
 {% endtabs %}
 
+In the previous example, if the HTTP output rate stays below 1/10 of the tail input rate for one full minute of consecutive checks, the `/api/v1/health` endpoint returns `error`. Once the ratio rises back above the threshold, the status returns to `OK` until another full minute of consecutive checks fails.
+
 Use the following command to call the health endpoint:
 
 ```shell
@@ -556,6 +577,4 @@
 
 ## Telemetry Pipeline
 
-[Telemetry Pipeline](https://chronosphere.io/platform/telemetry-pipeline/) is a
-hosted service that lets you monitor your Fluent Bit agents including data flow,
-metrics, and configurations.
+[Telemetry Pipeline](https://chronosphere.io/platform/telemetry-pipeline/) is a hosted service that lets you monitor your Fluent Bit agents including data flow, metrics, and configurations.