diff --git a/pipeline/outputs/nats.md b/pipeline/outputs/nats.md index 10d17a004..e20c71b59 100644 --- a/pipeline/outputs/nats.md +++ b/pipeline/outputs/nats.md @@ -4,11 +4,11 @@ The **nats** output plugin, allows to flush your records into a [NATS Server](ht ## Configuration parameters -| parameter | description | default | -| :--- | :--- | :--- | -| host | IP address or hostname of the NATS Server | 127.0.0.1 | -| port | TCP port of the target NATS Server | 4222 | -| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` | +| parameter | description | default | +|:----------|:---------------------------------------------------------------------------------------------------------------------|:----------| +| host | IP address or hostname of the NATS Server | 127.0.0.1 | +| port | TCP port of the target NATS Server | 4222 | +| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` | In order to override the default configuration values, the plugin uses the optional Fluent Bit network address format, e.g: @@ -20,14 +20,10 @@ nats://host:port [Fluent Bit](http://fluentbit.io) only requires to know that it needs to use the **nats** output plugin, if no extra information is given, it will use the default values specified in the above table. -```bash -$ bin/fluent-bit -i cpu -o nats -V -f 5 -Fluent Bit v1.x.x -* Copyright (C) 2019-2020 The Fluent Bit Authors -* Copyright (C) 2015-2018 Treasure Data -* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd -* https://fluentbit.io +```shell +$ fluent-bit -i cpu -o nats -V -f 5 +... [2016/03/04 10:17:33] [ info] Configuration flush time : 5 seconds input plugins : cpu @@ -46,9 +42,9 @@ As described above, the target service and storage point can be changed, e.g: ## Data format -For every set of records flushed to a NATS Server, Fluent Bit uses the following JSON format: +For every set of records flushed to a NATS Server, Fluent Bit uses the following format: -```javascript +```text [ [UNIX_TIMESTAMP, JSON_MAP_1], [UNIX_TIMESTAMP, JSON_MAP_2], @@ -56,12 +52,12 @@ For every set of records flushed to a NATS Server, Fluent Bit uses the following ] ``` -Each record is an individual entity represented in a JSON array that contains a UNIX\_TIMESTAMP and a JSON map with a set of key/values. A summarized output of the CPU input plugin will looks as this: +Each record is an individual entity represented in a JSON array that contains a UNIX\_TIMESTAMP and a JSON map with a set of key/values. A summarized output of the CPU input plugin will look as this: -```text +```json [ [1457108504,{"tag":"fluentbit","cpu_p":1.500000,"user_p":1,"system_p":0.500000}], [1457108505,{"tag":"fluentbit","cpu_p":4.500000,"user_p":3,"system_p":1.500000}], [1457108506,{"tag":"fluentbit","cpu_p":6.500000,"user_p":4.500000,"system_p":2}] ] -``` +``` \ No newline at end of file diff --git a/pipeline/outputs/postgresql.md b/pipeline/outputs/postgresql.md index 6ce6d1b2e..3cd18e803 100644 --- a/pipeline/outputs/postgresql.md +++ b/pipeline/outputs/postgresql.md @@ -10,7 +10,11 @@ PostgreSQL 9.4 or higher is required. According to the parameters you have set in the configuration file, the plugin will create the table defined by the `table` option in the database defined by the `database` option hosted on the server defined by the `host` option. 
It will use the PostgreSQL user defined by the `user` option, which needs to have the right privileges to create such a table in that database. -> **NOTE:** If you are not familiar with how PostgreSQL's users and grants system works, you might find useful reading the recommended links in the "References" section at the bottom. +{% hint style="info" %} + +If you are not familiar with how PostgreSQL's users and grants system works, you might find useful reading the recommended links in the "References" section at the bottom. + +{% endhint %} A typical installation normally consists of a self-contained database for Fluent Bit in which you can store the output of one or more pipelines. Ultimately, it is your choice to store them in the same table, or in separate tables, or even in separate databases based on several factors, including workload, scalability, data protection and security. @@ -20,7 +24,7 @@ In this example, for the sake of simplicity, we use a single table called `fluen Generate a robust random password \(e.g. `pwgen 20 1`\) and store it safely. Then, as `postgres` system user on the server where PostgreSQL is installed, execute: -```bash +```shell createuser -P fluentbit ``` @@ -34,7 +38,7 @@ If you prefer, instead of the `createuser` application, you can directly use the As `postgres` system user, please run: -```bash +```shell createdb -O fluentbit fluentbit ``` @@ -48,21 +52,21 @@ Make sure that the `fluentbit` user can connect to the `fluentbit` database on t ## Configuration Parameters -| Key | Description | Default | -| :--- | :--- | :--- | -| `Host` | Hostname/IP address of the PostgreSQL instance | - \(127.0.0.1\) | -| `Port` | PostgreSQL port | - \(5432\) | -| `User` | PostgreSQL username | - \(current user\) | -| `Password` | Password of PostgreSQL username | - | -| `Database` | Database name to connect to | - \(current user\) | -| `Table` | Table name where to store data | - | -| `Connection_Options` | Specifies any valid [PostgreSQL connection options](https://www.postgresql.org/docs/devel/libpq-connect.html#LIBPQ-CONNECT-OPTIONS) | - | -| `Timestamp_Key` | Key in the JSON object containing the record timestamp | date | -| `Async` | Define if we will use async or sync connections | false | -| `min_pool_size` | Minimum number of connection in async mode | 1 | -| `max_pool_size` | Maximum amount of connections in async mode | 4 | -| `cockroachdb` | Set to `true` if you will connect the plugin with a CockroachDB | false | -| `workers` | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. 
| `0` | +| Key | Description | Default | +|:---------------------|:------------------------------------------------------------------------------------------------------------------------------------|:-------------------| +| `Host` | Hostname/IP address of the PostgreSQL instance | - \(127.0.0.1\) | +| `Port` | PostgreSQL port | - \(5432\) | +| `User` | PostgreSQL username | - \(current user\) | +| `Password` | Password of PostgreSQL username | - | +| `Database` | Database name to connect to | - \(current user\) | +| `Table` | Table name where to store data | - | +| `Connection_Options` | Specifies any valid [PostgreSQL connection options](https://www.postgresql.org/docs/devel/libpq-connect.html#LIBPQ-CONNECT-OPTIONS) | - | +| `Timestamp_Key` | Key in the JSON object containing the record timestamp | date | +| `Async` | Define if we will use async or sync connections | false | +| `min_pool_size` | Minimum number of connection in async mode | 1 | +| `max_pool_size` | Maximum amount of connections in async mode | 4 | +| `cockroachdb` | Set to `true` if you will connect the plugin with a CockroachDB | false | +| `workers` | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` | ### Libpq @@ -74,20 +78,45 @@ For security reasons, it is advised to follow the directives included in the [pa In your main configuration file add the following section: +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + + outputs: + - name: pgsql + match: '*' + host: 172.17.0.2 + port: 5432 + user: fluentbit + password: YourCrazySecurePassword + database: fluentbit + table: fluentbit + connection_options: '-c statement_timeout=0' + timestamp_key: ts +``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + ```text [OUTPUT] - Name pgsql - Match * - Host 172.17.0.2 - Port 5432 - User fluentbit - Password YourCrazySecurePassword - Database fluentbit - Table fluentbit - Connection_Options -c statement_timeout=0 - Timestamp_Key ts + Name pgsql + Match * + Host 172.17.0.2 + Port 5432 + User fluentbit + Password YourCrazySecurePassword + Database fluentbit + Table fluentbit + Connection_Options -c statement_timeout=0 + Timestamp_Key ts ``` +{% endtab %} +{% endtabs %} + ## The output table The output plugin automatically creates a table with the name specified by the `table` configuration option and made up of the following fields: @@ -96,7 +125,7 @@ The output plugin automatically creates a table with the name specified by the ` * `time TIMESTAMP WITHOUT TIMEZONE` * `data JSONB` -As you can see, the timestamp does not contain any information about the time zone and it is therefore referred to the time zone used by the connection to PostgreSQL \(`timezone` setting\). +As you can see, the timestamp does not contain any information about the time zone, and it is therefore referred to the time zone used by the connection to PostgreSQL \(`timezone` setting\). For more information on the `JSONB` data type in PostgreSQL, please refer to the [JSON types](https://www.postgresql.org/docs/current/datatype-json.html) page in the official documentation, where you can find instructions on how to index or query the objects \(including `jsonpath` introduced in PostgreSQL 12\). 
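As a quick sanity check of what the plugin is writing, you can query the `data` column with the standard JSONB operators. The sketch below is illustrative only: it assumes the `fluentbit` user, database and table created in the examples above, and a hypothetical `cpu_p` key in the stored records (as produced by the CPU input); adjust the names to match your own pipeline.

```shell
# Show the five most recent records and extract one key from the JSONB payload.
# Assumes the "fluentbit" user/database/table created in the examples above and
# a hypothetical "cpu_p" key in the stored records.
psql -U fluentbit -d fluentbit -c \
  "SELECT time, data->>'cpu_p' AS cpu_p FROM fluentbit ORDER BY time DESC LIMIT 5;"
```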
@@ -129,4 +158,4 @@ Here follows a list of useful resources from the PostgreSQL documentation: * [libpq - C API for PostgreSQL](https://www.postgresql.org/docs/current/libpq.html) * [libpq - Environment variables](https://www.postgresql.org/docs/current/libpq-envars.html) * [libpq - password file](https://www.postgresql.org/docs/current/libpq-pgpass.html) -* [Trigger functions](https://www.postgresql.org/docs/current/plpgsql-trigger.html) +* [Trigger functions](https://www.postgresql.org/docs/current/plpgsql-trigger.html) \ No newline at end of file diff --git a/pipeline/outputs/prometheus-exporter.md b/pipeline/outputs/prometheus-exporter.md index feac59d76..1198a7508 100644 --- a/pipeline/outputs/prometheus-exporter.md +++ b/pipeline/outputs/prometheus-exporter.md @@ -8,20 +8,21 @@ The prometheus exporter allows you to take metrics from Fluent Bit and expose th Important Note: The prometheus exporter only works with metric plugins, such as Node Exporter Metrics -| Key | Description | Default | -| :--- | :--- | :--- | -| host | This is address Fluent Bit will bind to when hosting prometheus metrics. Note: `listen` parameter is deprecated from v1.9.0. | 0.0.0.0 | -| port | This is the port Fluent Bit will bind to when hosting prometheus metrics | 2021 | -| add\_label | This allows you to add custom labels to all metrics exposed through the prometheus exporter. You may have multiple of these fields | | -| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `1` | +| Key | Description | Default | +|:-----------|:-----------------------------------------------------------------------------------------------------------------------------------|:--------| +| host | This is address Fluent Bit will bind to when hosting prometheus metrics. Note: `listen` parameter is deprecated from v1.9.0. | 0.0.0.0 | +| port | This is the port Fluent Bit will bind to when hosting prometheus metrics | 2021 | +| add\_label | This allows you to add custom labels to all metrics exposed through the prometheus exporter. You may have multiple of these fields | | +| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `1` | ## Getting Started The Prometheus exporter only works with metrics captured from metric plugins. In the following example, host metrics are captured by the node exporter metrics plugin and then are routed to prometheus exporter. Within the output plugin two labels are added `app="fluent-bit"`and `color="blue"` {% tabs %} -{% tab title="fluent-bit.conf" %} -```text +{% tab title="fluent-bit.yaml" %} + +```yaml # Node Exporter Metrics + Prometheus Exporter # ------------------------------------------- # The following example collect host metrics on Linux and expose @@ -31,28 +32,31 @@ The Prometheus exporter only works with metrics captured from metric plugins. 
In # # $ curl http://127.0.0.1:2021/metrics # -[SERVICE] - flush 1 - log_level info +service: + flush: 1 + log_level: info -[INPUT] - name node_exporter_metrics - tag node_metrics - scrape_interval 2 +pipeline: + inputs: + - name: node_exporter_metrics + tag: node_metrics + scrape_interval: 2 -[OUTPUT] - name prometheus_exporter - match node_metrics - host 0.0.0.0 - port 2021 - # add user-defined labels - add_label app fluent-bit - add_label color blue + outputs: + - name: prometheus_exporter + match: node_metrics + host: 0.0.0.0 + port: 2021 + # add user-defined labels + add_label: + - app fluent-bit + - color blue ``` + {% endtab %} +{% tab title="fluent-bit.conf" %} -{% tab title="fluent-bit.yaml" %} -```yaml +```text # Node Exporter Metrics + Prometheus Exporter # ------------------------------------------- # The following example collect host metrics on Linux and expose @@ -62,23 +66,24 @@ The Prometheus exporter only works with metrics captured from metric plugins. In # # $ curl http://127.0.0.1:2021/metrics # -service: - flush: 1 - log_level: info -pipeline: - inputs: - - name: node_exporter_metrics - tag: node_metrics - scrape_interval: 2 - outputs: - - name: prometheus_exporter - match: node_metrics - host: 0.0.0.0 - port: 2021 - # add user-defined labels - add_label: - - app fluent-bit - - color blue +[SERVICE] + flush 1 + log_level info + +[INPUT] + name node_exporter_metrics + tag node_metrics + scrape_interval 2 + +[OUTPUT] + name prometheus_exporter + match node_metrics + host 0.0.0.0 + port 2021 + # add user-defined labels + add_label app fluent-bit + add_label color blue ``` + {% endtab %} -{% endtabs %} +{% endtabs %} \ No newline at end of file diff --git a/pipeline/outputs/prometheus-remote-write.md b/pipeline/outputs/prometheus-remote-write.md index b866f7193..08be0ec3d 100644 --- a/pipeline/outputs/prometheus-remote-write.md +++ b/pipeline/outputs/prometheus-remote-write.md @@ -8,62 +8,106 @@ The prometheus remote write plugin allows you to take metrics from Fluent Bit an Important Note: The prometheus exporter only works with metric plugins, such as Node Exporter Metrics -| Key | Description | Default | -| -------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------- | -| host | IP address or hostname of the target HTTP Server | 127.0.0.1 | -| http_user | Basic Auth Username | | -| http_passwd | Basic Auth Password. 
Requires HTTP_user to be set | | -| AWS\_Auth | Enable AWS SigV4 authentication | false | -| AWS\_Service | For Amazon Managed Service for Prometheus, the service name is aps | aps | -| AWS\_Region | Region of your Amazon Managed Service for Prometheus workspace | | -| AWS\_STS\_Endpoint | Specify the custom sts endpoint to be used with STS API, used with the AWS_Role_ARN option, used by SigV4 authentication | | -| AWS\_Role\_ARN | AWS IAM Role to assume, used by SigV4 authentication | | -| AWS\_External\_ID | External ID for the AWS IAM Role specified with `aws_role_arn`, used by SigV4 authentication | | -| port | TCP port of the target HTTP Server | 80 | +| Key | Description | Default | +|----------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------| +| host | IP address or hostname of the target HTTP Server | 127.0.0.1 | +| http_user | Basic Auth Username | | +| http_passwd | Basic Auth Password. Requires HTTP_user to be set | | +| AWS\_Auth | Enable AWS SigV4 authentication | false | +| AWS\_Service | For Amazon Managed Service for Prometheus, the service name is aps | aps | +| AWS\_Region | Region of your Amazon Managed Service for Prometheus workspace | | +| AWS\_STS\_Endpoint | Specify the custom sts endpoint to be used with STS API, used with the AWS_Role_ARN option, used by SigV4 authentication | | +| AWS\_Role\_ARN | AWS IAM Role to assume, used by SigV4 authentication | | +| AWS\_External\_ID | External ID for the AWS IAM Role specified with `aws_role_arn`, used by SigV4 authentication | | +| port | TCP port of the target HTTP Server | 80 | | proxy | Specify an HTTP Proxy. The expected format of this value is `http://HOST:PORT`. Note that HTTPS is **not** currently supported. It is recommended not to set this and to configure the [HTTP proxy environment variables](https://docs.fluentbit.io/manual/administration/http-proxy) instead as they support both HTTP and HTTPS. | | -| uri | Specify an optional HTTP URI for the target web server, e.g: /something | / | -| header | Add a HTTP header key/value pair. Multiple headers can be set. | | -| log_response_payload | Log the response payload within the Fluent Bit log | false | -| add_label | This allows you to add custom labels to all metrics exposed through the prometheus exporter. You may have multiple of these fields | | -| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `2` | +| uri | Specify an optional HTTP URI for the target web server, e.g: /something | / | +| header | Add a HTTP header key/value pair. Multiple headers can be set. | | +| log_response_payload | Log the response payload within the Fluent Bit log | false | +| add_label | This allows you to add custom labels to all metrics exposed through the prometheus exporter. You may have multiple of these fields | | +| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `2` | ## Getting Started -The Prometheus remote write plugin only works with metrics collected by one of the from metric input plugins. 
In the following example, host metrics are collected by the node exporter metrics plugin and then delivered by the prometheus remote write output plugin. +The Prometheus remote write plugin only works with metrics collected by one of the metric input plugins. In the following example, host metrics are collected by the node exporter metrics plugin and then delivered by the prometheus remote write output plugin. +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +# Node Exporter Metrics + Prometheus remote write output plugin +# ------------------------------------------- +# The following example collects host metrics on Linux and delivers +# them through the Prometheus remote write plugin to new relic : +# +service: + flush: 1 + log_level: info + +pipeline: + inputs: + - name: node_exporter_metrics + tag: node_metrics + scrape_interval: 2 + + outputs: + - name: prometheus_remote_write + match: node_metrics + host: metric-api.newrelic.com + port: 443 + uri: /prometheus/v1/write?prometheus_server=YOUR_DATA_SOURCE_NAME + header: 'Authorization Bearer YOUR_LICENSE_KEY' + log_response_payload: true + tls: on + tls.verify: on + # add user-defined labels + add_label: + - app fluent-bit + - color blue + +# Note : it would be necessary to replace both YOUR_DATA_SOURCE_NAME and YOUR_LICENSE_KEY +# with real values for this example to work. ``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + +```text # Node Exporter Metrics + Prometheus remote write output plugin # ------------------------------------------- # The following example collects host metrics on Linux and delivers # them through the Prometheus remote write plugin to new relic : # [SERVICE] - Flush 1 - Log_level info + Flush 1 + Log_level info [INPUT] - Name node_exporter_metrics - Tag node_metrics - Scrape_interval 2 + Name node_exporter_metrics + Tag node_metrics + Scrape_interval 2 [OUTPUT] - Name prometheus_remote_write - Match node_metrics - Host metric-api.newrelic.com - Port 443 - Uri /prometheus/v1/write?prometheus_server=YOUR_DATA_SOURCE_NAME - Header Authorization Bearer YOUR_LICENSE_KEY - Log_response_payload True - Tls On - Tls.verify On - # add user-defined labels - add_label app fluent-bit - add_label color blue + Name prometheus_remote_write + Match node_metrics + Host metric-api.newrelic.com + Port 443 + Uri /prometheus/v1/write?prometheus_server=YOUR_DATA_SOURCE_NAME + Header Authorization Bearer YOUR_LICENSE_KEY + Log_response_payload True + Tls On + Tls.verify On + # add user-defined labels + add_label app fluent-bit + add_label color blue # Note : it would be necessary to replace both YOUR_DATA_SOURCE_NAME and YOUR_LICENSE_KEY # with real values for this example to work. ``` +{% endtab %} +{% endtabs %} + ## Examples The following are examples of using Prometheus remote write with hosted services below @@ -72,86 +116,208 @@ The following are examples of using Prometheus remote write with hosted services With [Grafana Cloud](https://grafana.com/products/cloud/) hosted metrics you will need to use the specific host that is mentioned as well as specify the HTTP username and password given within the Grafana Cloud page. 
+ +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + + outputs: + - name: prometheus_remote_write + match: '*' + host: prometheus-us-central1.grafana.net + uri: /api/prom/push + port: 443 + tls: on + tls.verify: on + http_user: + http_passwd: ``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + +```text [OUTPUT] - name prometheus_remote_write - host prometheus-us-central1.grafana.net - match * - uri /api/prom/push - port 443 - tls on - tls.verify on - http_user - http_passwd + name prometheus_remote_write + match * + host prometheus-us-central1.grafana.net + uri /api/prom/push + port 443 + tls on + tls.verify on + http_user + http_passwd ``` +{% endtab %} +{% endtabs %} + ### Logz.io Infrastructure Monitoring With Logz.io [hosted prometheus](https://logz.io/solutions/infrastructure-monitoring/) you will need to make use of the header option and add the Authorization Bearer with the proper key. The host and port may also differ within your specific hosted instance. +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + + outputs: + - name: prometheus_remote_write + match: '*' + host: listener.logz.io + port: 8053 + tls: on + tls.verify: on + log_response_payload: true ``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + +```text [OUTPUT] - name prometheus_remote_write - host listener.logz.io - port 8053 - match * - header Authorization Bearer - tls on - tls.verify on - log_response_payload true + name prometheus_remote_write + match * + host listener.logz.io + port 8053 + header Authorization Bearer + tls on + tls.verify on + log_response_payload true ``` +{% endtab %} +{% endtabs %} + ### Coralogix With [Coralogix Metrics](https://coralogix.com/platform/metrics/) you may need to customize the URI. Additionally, you will make use of the header key with Coralogix private key. +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + + outputs: + - name: prometheus_remote_write + match: '*' + host: metrics-api.coralogix.com + uri: prometheus/api/v1/write?appLabelName=path&subSystemLabelName=path&severityLabelName=severity + port: 443 + header: 'Authorization Bearer ' + tls: on + tls.verify: on ``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + +```text [OUTPUT] - name prometheus_remote_write - host metrics-api.coralogix.com - uri prometheus/api/v1/write?appLabelName=path&subSystemLabelName=path&severityLabelName=severity - match * - port 443 - tls on - tls.verify on - header Authorization Bearer + name prometheus_remote_write + match * + host metrics-api.coralogix.com + uri prometheus/api/v1/write?appLabelName=path&subSystemLabelName=path&severityLabelName=severity + port 443 + header Authorization Bearer + tls on + tls.verify on ``` +{% endtab %} +{% endtabs %} + ### Levitate With [Levitate](https://last9.io/levitate-tsdb), you must use the Levitate cluster-specific write URL and specify the HTTP username and password for the token created for your Levitate cluster. 
+{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + + outputs: + - name: prometheus_remote_write + match: '*' + host: app-tsdb.last9.io + uri: /v1/metrics/82xxxx/sender/org-slug/write + port: 443 + tls: on + tls.verify: on + http_user: + http_passwd: ``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + +```text [OUTPUT] - name prometheus_remote_write - host app-tsdb.last9.io - match * - uri /v1/metrics/82xxxx/sender/org-slug/write - port 443 - tls on - tls.verify on - http_user - http_passwd + name prometheus_remote_write + match * + host app-tsdb.last9.io + uri /v1/metrics/82xxxx/sender/org-slug/write + port 443 + tls on + tls.verify on + http_user + http_passwd ``` +{% endtab %} +{% endtabs %} + ### Add Prometheus like Labels Ordinary prometheus clients add some of the labels as below: +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + + outputs: + - name: prometheus_remote_write + match: your.metric + host: xxxxxxx.yyyyy.zzzz + port: 443 + uri: /api/v1/write + header: 'Authorization Bearer YOUR_LICENSE_KEY' + log_response_payload: true + tls: on + tls.verify: on + # add user-defined labels + add_label: + - instance ${HOSTNAME} + - job fluent-bit ``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + +```text [OUTPUT] - Name prometheus_remote_write - Match your.metric - Host xxxxxxx.yyyyy.zzzz - Port 443 - Uri /api/v1/write - Header Authorization Bearer YOUR_LICENSE_KEY - Log_response_payload True - Tls On - Tls.verify On - # add user-defined labels - add_label instance ${HOSTNAME} - add_label job fluent-bit + Name prometheus_remote_write + Match your.metric + Host xxxxxxx.yyyyy.zzzz + Port 443 + Uri /api/v1/write + Header Authorization Bearer YOUR_LICENSE_KEY + Log_response_payload True + Tls On + Tls.verify On + # add user-defined labels + add_label instance ${HOSTNAME} + add_label job fluent-bit ``` -`instance` label can be emulated with `add_label instance ${HOSTNAME}`. And other labels can be added with `add_label ` setting. +{% endtab %} +{% endtabs %} + +`instance` label can be emulated with `add_label instance ${HOSTNAME}`. And other labels can be added with `add_label ` setting. \ No newline at end of file diff --git a/pipeline/outputs/skywalking.md b/pipeline/outputs/skywalking.md index 1d0f42925..f1f3b99bc 100644 --- a/pipeline/outputs/skywalking.md +++ b/pipeline/outputs/skywalking.md @@ -1,17 +1,17 @@ # Apache SkyWalking -The **Apache SkyWalking** output plugin, allows to flush your records to a [Apache SkyWalking](https://skywalking.apache.org/) OAP. The following instructions assumes that you have a fully operational Apache SkyWalking OAP in place. +The **Apache SkyWalking** output plugin, allows to flush your records to an [Apache SkyWalking](https://skywalking.apache.org/) OAP. The following instructions assumes that you have a fully operational Apache SkyWalking OAP in place. ## Configuration Parameters -| parameter | description | default | -| :--- | :--- | :--- | -| host | Hostname of Apache SkyWalking OAP | 127.0.0.1 | -| port | TCP port of the Apache SkyWalking OAP | 12800 | -| auth_token | Authentication token if needed for Apache SkyWalking OAP | None | -| svc_name | Service name that fluent-bit belongs to | sw-service | -| svc_inst_name | Service instance name of fluent-bit | fluent-bit | -| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. 
| `0` | +| parameter | description | default | +|:--------------|:---------------------------------------------------------------------------------------------------------------------|:-----------| +| host | Hostname of Apache SkyWalking OAP | 127.0.0.1 | +| port | TCP port of the Apache SkyWalking OAP | 12800 | +| auth_token | Authentication token if needed for Apache SkyWalking OAP | None | +| svc_name | Service name that fluent-bit belongs to | sw-service | +| svc_inst_name | Service instance name of fluent-bit | fluent-bit | +| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` | ### TLS / SSL @@ -24,33 +24,53 @@ In order to start inserting records into an Apache SkyWalking service, you can r ### Configuration File -In your main configuration file append the following _Input_ & _Output_ sections: +In your main configuration file append the following: + +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + inputs: + - name: cpu + + outputs: + - name: skywalking + svc_name: dummy-service + svc_inst_name: dummy-service-fluentbit +``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} ```text [INPUT] - Name cpu + Name cpu [OUTPUT] - Name skywalking - svc_name dummy-service - svc_inst_name dummy-service-fluentbit + Name skywalking + svc_name dummy-service + svc_inst_name dummy-service-fluentbit ``` +{% endtab %} +{% endtabs %} + ## Output Format The format of the plugin output follows the [data collect protocol](https://github.com/apache/skywalking-data-collect-protocol/blob/743f33119dc5621ae98b596eb8b131dd443445c7/logging/Logging.proto). For example, if we get log as follows, -```text +```json { - "log": "This is the original log message" + "log": "This is the original log message" } ``` This message is packed into the following protocol format and written to the OAP via the REST API. -```text +```json [{ "timestamp": 123456789, "service": "dummy-service", @@ -61,4 +81,4 @@ This message is packed into the following protocol format and written to the OAP } } }] -``` +``` \ No newline at end of file diff --git a/pipeline/outputs/slack.md b/pipeline/outputs/slack.md index 5cbee7f03..9a19553cf 100644 --- a/pipeline/outputs/slack.md +++ b/pipeline/outputs/slack.md @@ -6,7 +6,7 @@ This connector uses the Slack _Incoming Webhooks_ feature to post messages to Sl ## Slack Webhook -Before configuring this plugin, make sure to setup your Incoming Webhook. For detailed step-by-step instructions, review the following official documentation: +Before configuring this plugin, make sure to set up your Incoming Webhook. For detailed step-by-step instructions, review the following official documentation: * [https://api.slack.com/messaging/webhooks\#getting\_started](https://api.slack.com/messaging/webhooks#getting_started) @@ -14,18 +14,36 @@ Once you have obtained the Webhook address you can place it in the configuration ## Configuration Parameters -| Key | Description | Default | -| :--- | :--- | :--- | -| webhook | Absolute address of the Webhook provided by Slack | | -| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. 
| `0` | +| Key | Description | Default | +|:--------|:---------------------------------------------------------------------------------------------------------------------|:--------| +| webhook | Absolute address of the Webhook provided by Slack | | +| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` | ### Configuration File Get started quickly with this configuration file: +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + + outputs: + - name: slack + match: '*' + webhook: https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX +``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + ```text [OUTPUT] name slack match * webhook https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX ``` + +{% endtab %} +{% endtabs %} \ No newline at end of file diff --git a/pipeline/outputs/splunk.md b/pipeline/outputs/splunk.md index af83d4a17..d63a328eb 100644 --- a/pipeline/outputs/splunk.md +++ b/pipeline/outputs/splunk.md @@ -6,38 +6,38 @@ description: Send logs to Splunk HTTP Event Collector Splunk output plugin allows to ingest your records into a [Splunk Enterprise](https://www.splunk.com/en_us/products/splunk-enterprise.html) service through the HTTP Event Collector \(HEC\) interface. -To get more details about how to setup the HEC in Splunk please refer to the following documentation: [Splunk / Use the HTTP Event Collector](http://docs.splunk.com/Documentation/Splunk/7.0.3/Data/UsetheHTTPEventCollector) +To get more details about how to set up the HEC in Splunk please refer to the following documentation: [Splunk / Use the HTTP Event Collector](http://docs.splunk.com/Documentation/Splunk/7.0.3/Data/UsetheHTTPEventCollector) ## Configuration Parameters Connectivity, transport and authentication configuration properties: -| Key | Description | default | -| :--- | :--- | :--- | -| host | IP address or hostname of the target Splunk service. | 127.0.0.1 | -| port | TCP port of the target Splunk service. | 8088 | -| splunk\_token | Specify the Authentication Token for the HTTP Event Collector interface. | | -| http\_user | Optional username for Basic Authentication on HEC | | -| http\_passwd | Password for user defined in HTTP\_User | | -| http\_buffer\_size | Buffer size used to receive Splunk HTTP responses | 2M | -| compress | Set payload compression mechanism. The only available option is `gzip`. | | -| channel | Specify X-Splunk-Request-Channel Header for the HTTP Event Collector interface. | | -| http_debug_bad_request | If the HTTP server response code is 400 (bad request) and this flag is enabled, it will print the full HTTP request and response to the stdout interface. This feature is available for debugging purposes. | | -| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `2` | +| Key | Description | default | +|:-----------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------| +| host | IP address or hostname of the target Splunk service. | 127.0.0.1 | +| port | TCP port of the target Splunk service. | 8088 | +| splunk\_token | Specify the Authentication Token for the HTTP Event Collector interface. 
| | +| http\_user | Optional username for Basic Authentication on HEC | | +| http\_passwd | Password for user defined in HTTP\_User | | +| http\_buffer\_size | Buffer size used to receive Splunk HTTP responses | 2M | +| compress | Set payload compression mechanism. The only available option is `gzip`. | | +| channel | Specify X-Splunk-Request-Channel Header for the HTTP Event Collector interface. | | +| http_debug_bad_request | If the HTTP server response code is 400 (bad request) and this flag is enabled, it will print the full HTTP request and response to the stdout interface. This feature is available for debugging purposes. | | +| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `2` | Content and Splunk metadata \(fields\) handling configuration properties: -| Key | Description | default | -| :--- | :--- | :--- | -| splunk\_send\_raw | When enabled, the record keys and values are set in the top level of the map instead of under the event key. Refer to the _Sending Raw Events_ section from the docs for more details to make this option work properly. | off | -| event\_key | Specify the key name that will be used to send a single value as part of the record. | | -| event\_host | Specify the key name that contains the host value. This option allows a record accessors pattern. | | -| event\_source | Set the source value to assign to the event data. | | -| event\_sourcetype | Set the sourcetype value to assign to the event data. | | -| event\_sourcetype\_key | Set a record key that will populate 'sourcetype'. If the key is found, it will have precedence over the value set in `event_sourcetype`. | | -| event\_index | The name of the index by which the event data is to be indexed. | | -| event\_index\_key | Set a record key that will populate the `index` field. If the key is found, it will have precedence over the value set in `event_index`. | | -| event\_field | Set event fields for the record. This option can be set multiple times and the format is `key_name record_accessor_pattern`. | | +| Key | Description | default | +|:-----------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:--------| +| splunk\_send\_raw | When enabled, the record keys and values are set in the top level of the map instead of under the event key. Refer to the _Sending Raw Events_ section from the docs for more details to make this option work properly. | off | +| event\_key | Specify the key name that will be used to send a single value as part of the record. | | +| event\_host | Specify the key name that contains the host value. This option allows a record accessors pattern. | | +| event\_source | Set the source value to assign to the event data. | | +| event\_sourcetype | Set the sourcetype value to assign to the event data. | | +| event\_sourcetype\_key | Set a record key that will populate 'sourcetype'. If the key is found, it will have precedence over the value set in `event_sourcetype`. | | +| event\_index | The name of the index by which the event data is to be indexed. | | +| event\_index\_key | Set a record key that will populate the `index` field. If the key is found, it will have precedence over the value set in `event_index`. | | +| event\_field | Set event fields for the record. 
This option can be set multiple times and the format is `key_name record_accessor_pattern`. | | ### TLS / SSL @@ -52,7 +52,7 @@ In order to insert records into a Splunk service, you can run the plugin from th The **splunk** plugin, can read the parameters from the command line in two ways, through the **-p** argument \(property\), e.g: -```text +```shell fluent-bit -i cpu -t cpu -o splunk -p host=127.0.0.1 -p port=8088 \ -p tls=on -p tls.verify=off -m '*' ``` @@ -61,20 +61,44 @@ fluent-bit -i cpu -t cpu -o splunk -p host=127.0.0.1 -p port=8088 \ In your main configuration file append the following _Input_ & _Output_ sections: +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + inputs: + - name: cpu + tag: cpu + + outputs: + - name: splunk + match: '*' + host: 127.0.0.1 + port: 8088 + tls: on + tls.verify: off +``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + ```text [INPUT] - Name cpu - Tag cpu + Name cpu + Tag cpu [OUTPUT] - Name splunk - Match * - Host 127.0.0.1 - Port 8088 - TLS On - TLS.Verify Off + Name splunk + Match * + Host 127.0.0.1 + Port 8088 + TLS On + TLS.Verify Off ``` +{% endtab %} +{% endtabs %} + ### Data format By default, the Splunk output plugin nests the record under the `event` key in the payload sent to the HEC. It will also append the time of the record to a top level `time` key. @@ -83,6 +107,41 @@ If you would like to customize any of the Splunk event metadata, such as the hos For example, to add a custom index and hostname: + +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + inputs: + - name: cpu + tag: cpu + + filters: + # nest the record under the 'event' key + - name: nest + match: '*' + operation: nest + wildcard: '*' + nest_under: event + + - name: modify + match: '*' + add: + - index my-splunk-index + - host my-host + + outputs: + - name: splunk + match: '*' + host: 127.0.0.1 + splunk_token: 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxx' + splunk_send_raw: On +``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + ```text [INPUT] Name cpu @@ -111,18 +170,21 @@ For example, to add a custom index and hostname: Splunk_Send_Raw On ``` +{% endtab %} +{% endtabs %} + This will create a payload that looks like: -```javascript +```json { - "time": "1535995058.003385189", - "index": "my-splunk-index", - "host": "my-host", - "event": { - "cpu_p":0.000000, - "user_p":0.000000, - "system_p":0.000000 - } + "time": "1535995058.003385189", + "index": "my-splunk-index", + "host": "my-host", + "event": { + "cpu_p":0.000000, + "user_p":0.000000, + "system_p":0.000000 + } } ``` @@ -130,29 +192,29 @@ For more information on the Splunk HEC payload format and all event metadata Spl ### Sending Raw Events -If the option `splunk_send_raw` has been enabled, the user must take care to put all log details in the event field, and only specify fields known to Splunk in the top level event, if there is a mismatch, Splunk will return a HTTP error 400. +If the option `splunk_send_raw` has been enabled, the user must take care to put all log details in the event field, and only specify fields known to Splunk in the top level event, if there is a mismatch, Splunk will return an HTTP error 400. 
Consider the following example: **splunk\_send\_raw off** -```javascript -{"time": ..., "event": {"k1": "foo", "k2": "bar", "index": "applogs"}} +```json +{"time": "SOMETIME", "event": {"k1": "foo", "k2": "bar", "index": "applogs"}} ``` **splunk\_send\_raw on** -```text -{"time": .., "k1": "foo", "k2": "bar", "index": "applogs"} +```json +{"time": "SOMETIME", "k1": "foo", "k2": "bar", "index": "applogs"} ``` -For up to date information about the valid keys in the top level object, refer to the Splunk documentation: +For up-to-date information about the valid keys in the top level object, refer to the Splunk documentation: [http://docs.splunk.com/Documentation/Splunk/latest/Data/AboutHEC](http://docs.splunk.com/Documentation/Splunk/latest/Data/AboutHEC) ## Splunk Metric Index -With Splunk version 8.0> you can also use the Fluent Bit Splunk output plugin to send data to metric indices. This allows you to perform visualizations, metric queries, and analysis with other metrics you may be collecting. This is based off of Splunk 8.0 support of multi metric support via single JSON payload, more details can be found on [Splunk's documentation page](https://docs.splunk.com/Documentation/Splunk/8.1.2/Metrics/GetMetricsInOther#The_multiple-metric_JSON_format) +With Splunk version 8.0> you can also use the Fluent Bit Splunk output plugin to send data to metric indices. This allows you to perform visualizations, metric queries, and analysis with other metrics you may be collecting. This is based off of Splunk 8.0 support of multi metric support via single JSON payload, more details can be found on [Splunk documentation page](https://docs.splunk.com/Documentation/Splunk/8.1.2/Metrics/GetMetricsInOther#The_multiple-metric_JSON_format) Sending to a Splunk Metric index requires the use of `Splunk_send_raw` option being enabled and formatting the message properly. This includes three specific operations @@ -164,45 +226,89 @@ Sending to a Splunk Metric index requires the use of `Splunk_send_raw` option be The following configuration gathers CPU metrics, nests the appropriate field, adds the required identifiers and then sends to Splunk. 
+{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + inputs: + - name: cpu + tag: cpu + + filters: + # Move CPU metrics to be nested under "fields" and + # add the prefix "metric_name:" to all metrics + # NOTE: you can change Wildcard field to only select metric fields + - name: nest + match: cpu + wildcard: '*' + operation: nest + nest_under: fields + add_prefix: 'metric_name:' + + # Add index, source, sourcetype + - name: modify + match: cpu + set: + - index cpu-metrics + - source fluent-bit + - sourcetype custom + + outputs: + - name: splunk + match: '*' + host: + port: 8088 + splunk_token: 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxx' + tls: on + tls.verify: off +``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + ```text [INPUT] - name cpu - tag cpu + name cpu + tag cpu # Move CPU metrics to be nested under "fields" and # add the prefix "metric_name:" to all metrics # NOTE: you can change Wildcard field to only select metric fields [FILTER] - Name nest - Match cpu - Wildcard * - Operation nest - Nest_under fields - Add_Prefix metric_name: + Name nest + Match cpu + Wildcard * + Operation nest + Nest_under fields + Add_Prefix metric_name: # Add index, source, sourcetype [FILTER] - Name modify - Match cpu - Set index cpu-metrics - Set source fluent-bit - Set sourcetype custom + Name modify + Match cpu + Set index cpu-metrics + Set source fluent-bit + Set sourcetype custom # ensure splunk_send_raw is on [OUTPUT] - name splunk - match * - host - port 8088 - splunk_send_raw on - splunk_token f9bd5bdb-c0b2-4a83-bcff-9625e5e908db - tls on - tls.verify off + name splunk + match * + host + port 8088 + splunk_send_raw on + splunk_token xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxx + tls on + tls.verify off ``` +{% endtab %} +{% endtabs %} + ## Send Metrics Events of Fluent Bit -With Fluent Bit 2.0, you can also send Fluent Bit's metrics type of events into Splunk via Splunk HEC. +Starting with Fluent Bit 2.0, you can also send Fluent Bit's metrics type of events into Splunk via Splunk HEC. This allows you to perform visualizations, metric queries, and analysis with directly sent Fluent Bit's metrics type of events. This is based off Splunk 8.0 support of multi metric support via single concatenated JSON payload. 
@@ -214,6 +320,28 @@ This example includes two specific operations * Collect node or Fluent Bit's internal metrics * Send metrics as single concatenated JSON payload +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + inputs: + - name: node_exporter_metrics + tag: node_exporter_metrics + + outputs: + - name: splunk + match: '*' + host: + port: 8088 + splunk_token: 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxx' + tls: on + tls.verify: off +``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + ```text [INPUT] name node_exporter_metrics @@ -224,7 +352,10 @@ This example includes two specific operations match * host port 8088 - splunk_token ee7edc62-19ad-4d1e-b957-448d3b326fb6 + splunk_token xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxx tls on tls.verify off ``` + +{% endtab %} +{% endtabs %} \ No newline at end of file diff --git a/pipeline/outputs/stackdriver.md b/pipeline/outputs/stackdriver.md index 22617d1cb..a4608f3d4 100644 --- a/pipeline/outputs/stackdriver.md +++ b/pipeline/outputs/stackdriver.md @@ -6,57 +6,81 @@ Before getting started with the plugin configuration, make sure to obtain the pr * [Creating a Google Service Account for Stackdriver](https://cloud.google.com/logging/docs/agent/authorization#create-service-account) -> Your goal is to obtain a credentials JSON file that will be used later by Fluent Bit Stackdriver output plugin. +{% hint style="info" %} + +Your goal is to obtain a credentials JSON file that will be used later by Fluent Bit Stackdriver output plugin. + +{% endhint %} Refer to the [Google Cloud `LogEntry` API documentation](https://cloud.google.com/logging/docs/reference/v2/rest/v2/LogEntry) for information on the meaning of some of the parameters below. ## Configuration Parameters -| Key | Description | default | -| :--- | :--- | :--- | -| google\_service\_credentials | Absolute path to a Google Cloud credentials JSON file | Value of environment variable `$GOOGLE_APPLICATION_CREDENTIALS` | -| service\_account\_email | Account email associated to the service. Only available if **no credentials file** has been provided. | Value of environment variable `$SERVICE_ACCOUNT_EMAIL` | -| service\_account\_secret | Private key content associated with the service account. Only available if **no credentials file** has been provided. | Value of environment variable `$SERVICE_ACCOUNT_SECRET` | -| metadata\_server | Prefix for a metadata server. | Value of environment variable `$METADATA_SERVER`, or http://metadata.google.internal if unset. | -| location | The GCP or AWS region in which to store data about the resource. If the resource type is one of the _generic\_node_ or _generic\_task_, then this field is required. | | -| namespace | A namespace identifier, such as a cluster name or environment. If the resource type is one of the _generic\_node_ or _generic\_task_, then this field is required. | | -| node\_id | A unique identifier for the node within the namespace, such as hostname or IP address. If the resource type is _generic\_node_, then this field is required. | | -| job | An identifier for a grouping of related task, such as the name of a microservice or distributed batch. If the resource type is _generic\_task_, then this field is required. | | -| task\_id | A unique identifier for the task within the namespace and job, such as a replica index identifying the task within the job. If the resource type is _generic\_task_, then this field is required. | | -| export\_to\_project\_id | The GCP project that should receive these logs. 
| The `project_id` in the google\_service\_credentials file, or the `project_id` from Google's metadata.google.internal server. | -| resource | Set resource type of data. Supported resource types: _k8s\_container_, _k8s\_node_, _k8s\_pod_, _global_, _generic\_node_, _generic\_task_, and _gce\_instance_. | global, gce\_instance | -| k8s\_cluster\_name | The name of the cluster that the container \(node or pod based on the resource type\) is running in. If the resource type is one of the _k8s\_container_, _k8s\_node_ or _k8s\_pod_, then this field is required. | | -| k8s\_cluster\_location | The physical location of the cluster that contains \(node or pod based on the resource type\) the container. If the resource type is one of the _k8s\_container_, _k8s\_node_ or _k8s\_pod_, then this field is required. | | -| labels\_key | The name of the key from the original record that contains the LogEntry's `labels`. | logging.googleapis.com/labels | -| labels | Optional list of comma-separated of strings specifying `key=value` pairs. The resulting labels will be combined with the elements obtained from `labels_key` to set the LogEntry Labels. Elements from `labels` will override duplicate values from `labels_key`.| | -| log\_name\_key | The name of the key from the original record that contains the logName value. | logging.googleapis.com/logName | -| tag\_prefix | Set the tag\_prefix used to validate the tag of logs with k8s resource type. Without this option, the tag of the log must be in format of k8s\_container\(pod/node\).\* in order to use the k8s\_container resource type. Now the tag prefix is configurable by this option \(note the ending dot\). | k8s\_container., k8s\_pod., k8s\_node. | -| severity\_key | The name of the key from the original record that contains the severity. | logging.googleapis.com/severity | -| project_id_key | The value of this field is used by the Stackdriver output plugin to find the gcp project id from jsonPayload and then extract the value of it to set the PROJECT_ID within LogEntry logName, which controls the gcp project that should receive these logs. | `logging.googleapis.com/projectId`. See [Stackdriver Special Fields][StackdriverSpecialFields] for more info. | -| autoformat\_stackdriver\_trace | Rewrite the _trace_ field to include the projectID and format it for use with Cloud Trace. When this flag is enabled, the user can get the correct result by printing only the traceID (usually 32 characters). | false | -| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `1` | -| custom\_k8s\_regex | Set a custom regex to extract field like pod\_name, namespace\_name, container\_name and docker\_id from the local\_resource\_id in logs. This is helpful if the value of pod or node name contains dots. | `(?[a-z0-9](?:[-a-z0-9]*[a-z0-9])?(?:\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?[^_]+)_(?.+)-(?[a-z0-9]{64})\.log$` | -| resource_labels | An optional list of comma-separated strings specifying resource label plaintext assignments (`new=value`) and/or mappings from an original field in the log entry to a destination field (`destination=$original`). Nested fields and environment variables are also supported using the [record accessor syntax](https://docs.fluentbit.io/manual/administration/configuring-fluent-bit/classic-mode/record-accessor). If configured, *all* resource labels will be assigned using this API only, with the exception of `project_id`. 
See [Resource Labels](#resource-labels) for more details. | | -| http_request_key | The name of the key from the original record that contains the LogEntry's `httpRequest`. Note that the default value erroneously uses an underscore; users will likely need to set this to `logging.googleapis.com/httpRequest`. | logging.googleapis.com/http_request | -| compress | Set payload compression mechanism. The only available option is `gzip`. Default = "", which means no compression.| | -| cloud\_logging\_base\_url | Set the base Cloud Logging API URL to use for the `/v2/entries:write` API request. | https://logging.googleapis.com | +| Key | Description | default | +|:-------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| google\_service\_credentials | Absolute path to a Google Cloud credentials JSON file | Value of environment variable `$GOOGLE_APPLICATION_CREDENTIALS` | +| service\_account\_email | Account email associated to the service. Only available if **no credentials file** has been provided. | Value of environment variable `$SERVICE_ACCOUNT_EMAIL` | +| service\_account\_secret | Private key content associated with the service account. Only available if **no credentials file** has been provided. | Value of environment variable `$SERVICE_ACCOUNT_SECRET` | +| metadata\_server | Prefix for a metadata server. | Value of environment variable `$METADATA_SERVER`, or http://metadata.google.internal if unset. | +| location | The GCP or AWS region in which to store data about the resource. If the resource type is one of the _generic\_node_ or _generic\_task_, then this field is required. | | +| namespace | A namespace identifier, such as a cluster name or environment. If the resource type is one of the _generic\_node_ or _generic\_task_, then this field is required. | | +| node\_id | A unique identifier for the node within the namespace, such as hostname or IP address. If the resource type is _generic\_node_, then this field is required. | | +| job | An identifier for a grouping of related task, such as the name of a microservice or distributed batch. If the resource type is _generic\_task_, then this field is required. | | +| task\_id | A unique identifier for the task within the namespace and job, such as a replica index identifying the task within the job. If the resource type is _generic\_task_, then this field is required. | | +| export\_to\_project\_id | The GCP project that should receive these logs. | The `project_id` in the google\_service\_credentials file, or the `project_id` from Google's metadata.google.internal server. | +| resource | Set resource type of data. Supported resource types: _k8s\_container_, _k8s\_node_, _k8s\_pod_, _global_, _generic\_node_, _generic\_task_, and _gce\_instance_. 
| global, gce\_instance | +| k8s\_cluster\_name | The name of the cluster that the container \(node or pod based on the resource type\) is running in. If the resource type is one of the _k8s\_container_, _k8s\_node_ or _k8s\_pod_, then this field is required. | | +| k8s\_cluster\_location | The physical location of the cluster that contains \(node or pod based on the resource type\) the container. If the resource type is one of the _k8s\_container_, _k8s\_node_ or _k8s\_pod_, then this field is required. | | +| labels\_key | The name of the key from the original record that contains the LogEntry's `labels`. | logging.googleapis.com/labels | +| labels | Optional list of comma-separated of strings specifying `key=value` pairs. The resulting labels will be combined with the elements obtained from `labels_key` to set the LogEntry Labels. Elements from `labels` will override duplicate values from `labels_key`. | | +| log\_name\_key | The name of the key from the original record that contains the logName value. | logging.googleapis.com/logName | +| tag\_prefix | Set the tag\_prefix used to validate the tag of logs with k8s resource type. Without this option, the tag of the log must be in format of k8s\_container\(pod/node\).\* in order to use the k8s\_container resource type. Now the tag prefix is configurable by this option \(note the ending dot\). | k8s\_container., k8s\_pod., k8s\_node. | +| severity\_key | The name of the key from the original record that contains the severity. | logging.googleapis.com/severity | +| project_id_key | The value of this field is used by the Stackdriver output plugin to find the gcp project id from jsonPayload and then extract the value of it to set the PROJECT_ID within LogEntry logName, which controls the gcp project that should receive these logs. | `logging.googleapis.com/projectId`. See [Stackdriver Special Fields][StackdriverSpecialFields] for more info. | +| autoformat\_stackdriver\_trace | Rewrite the _trace_ field to include the projectID and format it for use with Cloud Trace. When this flag is enabled, the user can get the correct result by printing only the traceID (usually 32 characters). | false | +| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `1` | +| custom\_k8s\_regex | Set a custom regex to extract field like pod\_name, namespace\_name, container\_name and docker\_id from the local\_resource\_id in logs. This is helpful if the value of pod or node name contains dots. | `(?[a-z0-9](?:[-a-z0-9]*[a-z0-9])?(?:\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?[^_]+)_(?.+)-(?[a-z0-9]{64})\.log$` | +| resource_labels | An optional list of comma-separated strings specifying resource label plaintext assignments (`new=value`) and/or mappings from an original field in the log entry to a destination field (`destination=$original`). Nested fields and environment variables are also supported using the [record accessor syntax](https://docs.fluentbit.io/manual/administration/configuring-fluent-bit/classic-mode/record-accessor). If configured, *all* resource labels will be assigned using this API only, with the exception of `project_id`. See [Resource Labels](#resource-labels) for more details. | | +| http_request_key | The name of the key from the original record that contains the LogEntry's `httpRequest`. Note that the default value erroneously uses an underscore; users will likely need to set this to `logging.googleapis.com/httpRequest`. 
| logging.googleapis.com/http_request | +| compress | Set payload compression mechanism. The only available option is `gzip`. Default = "", which means no compression. | | +| cloud\_logging\_base\_url | Set the base Cloud Logging API URL to use for the `/v2/entries:write` API request. | https://logging.googleapis.com | ### Configuration File -If you are using a _Google Cloud Credentials File_, the following configuration is enough to get started: +If you are using a `Google Cloud Credentials File`, the following configuration is enough to get started: + +{% tabs %} +{% tab title="fluent-bit.yaml" %} +```yaml +pipeline: + inputs: + - name: cpu + tag: cpu + + outputs: + - name: stackdriver + match: '*' ``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + +```text [INPUT] - Name cpu - Tag cpu + Name cpu + Tag cpu [OUTPUT] - Name stackdriver - Match * + Name stackdriver + Match * ``` -Example configuration file for k8s resource type: +{% endtab %} +{% endtabs %} + +#### Example configuration file for k8s resource type: `local_resource_id` is used by the Stackdriver output plugin to set the labels field for different k8s resource types. Stackdriver plugin will try to find the `local_resource_id` field in the log entry. If there is no field `logging.googleapis.com/local_resource_id` in the log, the plugin will then construct it by using the tag value of the log. @@ -68,31 +92,59 @@ The local_resource_id should be in format: This implies that if there is no local_resource_id in the log entry then the tag of logs should match this format. Note that we have an option tag_prefix so it is not mandatory to use k8s_container(node/pod) as the prefix for tag. +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + inputs: + - name: tail + tag_regex: 'var.log.containers.(?[a-z0-9](?:[-a-z0-9]*[a-z0-9])?(?:\\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?[^_]+)_(?.+)-(?[a-z0-9]{64})\.log$' + tag: custom_tag... + path: /var/log/containers/*.log + parser: docker + db: /var/log/fluent-bit-k8s-container.db + + outputs: + - name: stackdriver + match: 'custom_tag.*' + resource: k8s_container + k8s_cluster_name: test_cluster_name + k8s_cluster_location: test_cluster_location + tag_prefix: 'custom_tag.' ``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + +```text [INPUT] - Name tail - Tag_Regex var.log.containers.(?[a-z0-9](?:[-a-z0-9]*[a-z0-9])?(?:\\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?[^_]+)_(?.+)-(?[a-z0-9]{64})\.log$ - Tag custom_tag... - Path /var/log/containers/*.log - Parser docker - DB /var/log/fluent-bit-k8s-container.db + Name tail + Tag_Regex var.log.containers.(?[a-z0-9](?:[-a-z0-9]*[a-z0-9])?(?:\\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?[^_]+)_(?.+)-(?[a-z0-9]{64})\.log$ + Tag custom_tag... + Path /var/log/containers/*.log + Parser docker + DB /var/log/fluent-bit-k8s-container.db [OUTPUT] - Name stackdriver - Match custom_tag.* - Resource k8s_container - k8s_cluster_name test_cluster_name - k8s_cluster_location test_cluster_location - tag_prefix custom_tag. + Name stackdriver + Match custom_tag.* + Resource k8s_container + k8s_cluster_name test_cluster_name + k8s_cluster_location test_cluster_location + tag_prefix custom_tag. ``` +{% endtab %} +{% endtabs %} + ## Resource Labels Currently, there are four ways which fluent-bit uses to assign fields into the resource/labels section. 1. Resource Labels API 2. Monitored Resource API -3. Local Resource Id +3. Local Resource ID 4. 
Credentials / Config Parameters If `resource_labels` is correctly configured, then fluent-bit will attempt to populate all resource/labels using the entries specified. Otherwise, fluent-bit will attempt to use the monitored resource API. Similarly, if the monitored resource API cannot be used, then fluent-bit will attempt to populate resource/labels using configuration parameters and/or credentials specific to the resource type. As mentioned in the [Configuration File](#configuration-file) section, fluent-bit will attempt to use or construct a local resource ID for a K8s resource type which does not use the resource labels or monitored resource API. @@ -104,7 +156,8 @@ Note that the `project_id` resource label will always be set from the service cr The `resource_labels` configuration parameter offers an alternative API for assigning the resource labels. To use, input a list of comma separated strings specifying resource labels plaintext assignments (`new=value`), mappings from an original field in the log entry to a destination field (`destination=$original`) and/or environment variable assignments (`new=${var}`). For instance, consider the following log entry: -``` + +```json { "keyA": "valA", "toplevel": { @@ -113,23 +166,43 @@ For instance, consider the following log entry: } ``` -Combined with the following Stackdriver configuration: +Combined with the following Fluent Bit Stackdriver output plugin configuration: + +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + + outputs: + - name: stackdriver + match: '*' + resource_labels: keyC=$keyA,keyD=$toplevel['keyB'],keyE=valC ``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + +```text [OUTPUT] - Name stackdriver - Match * - Resource_Labels keyC=$keyA,keyD=$toplevel['keyB'],keyE=valC + Name stackdriver + Match * + Resource_Labels keyC=$keyA,keyD=$toplevel['keyB'],keyE=valC ``` +{% endtab %} +{% endtabs %} + This will produce the following log: -``` + +```json { "resource": { "type": "global", "labels": { "project_id": "fluent-bit", "keyC": "valA", - "keyD": "valB" + "keyD": "valB", "keyE": "valC" } }, @@ -140,7 +213,7 @@ This will produce the following log: "toplevel": { "keyB": "valB" } - }, + } } ] } @@ -150,34 +223,52 @@ This makes the `resource_labels` API the recommended choice for supporting new o For instance, for a K8s resource type, `resource_labels` can be used in tandem with the [Kubernetes filter](https://docs.fluentbit.io/manual/pipeline/filters/kubernetes) to pack all six resource labels. 
Below is an example of what this could look like for a `k8s_container` resource: +{% tabs %} +{% tab title="fluent-bit.yaml" %} +```yaml +pipeline: + + outputs: + - name: stackdriver + match: '*' + resource: k8s_container + resource_labels: cluster_name=my-cluster,location=us-central1-c,container_name=$kubernetes['container_name'],namespace_name=$kubernetes['namespace_name'],pod_name=$kubernetes['pod_name'] ``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + +```text [OUTPUT] - Name stackdriver - Match * - Resource k8s_container - Resource_Labels cluster_name=my-cluster,location=us-central1-c,container_name=$kubernetes['container_name'],namespace_name=$kubernetes['namespace_name'],pod_name=$kubernetes['pod_name'] + Name stackdriver + Match * + Resource k8s_container + Resource_Labels cluster_name=my-cluster,location=us-central1-c,container_name=$kubernetes['container_name'],namespace_name=$kubernetes['namespace_name'],pod_name=$kubernetes['pod_name'] ``` +{% endtab %} +{% endtabs %} + `resource_labels` also supports validation for required labels based on the input resource type. This allows fluent-bit to check if all specified labels are present for a given configuration before runtime. If validation is not currently supported for a resource type that you would like to use this API with, we encourage you to open a pull request for it. Adding validation for a new resource type is simple - all that is needed is to specify the resources associated with the type alongside the required labels [here](https://github.com/fluent/fluent-bit/blob/master/plugins/out_stackdriver/stackdriver_resource_types.c#L27). ## Log Name By default, the plugin will write to the following log name: -``` + +```text /projects//logs/ ``` -You may be in a scenario where being more specific about the log name is important (for example [integration with Log Router rules](https://cloud.google.com/logging/docs/routing/overview) or [controlling cardinality of log based metrics]((https://cloud.google.com/logging/docs/logs-based-metrics/troubleshooting#too-many-time-series))). You can control the log name directly on a per-log basis by using the [`logging.googleapis.com/logName` special field][StackdriverSpecialFields]. You can configure a `log_name_key` if you'd like to use something different than `logging.googleapis.com/logName`, i.e. if the `log_name_key` is set to `mylognamefield` will extract the log name from `mylognamefield` in the log. + +You may be in a scenario where being more specific about the log name is important (for example [integration with Log Router rules](https://cloud.google.com/logging/docs/routing/overview) or [controlling cardinality of log based metrics]((https://cloud.google.com/logging/docs/logs-based-metrics/troubleshooting#too-many-time-series))). You can control the log name directly on a per-log basis by using the [`logging.googleapis.com/logName` special field][StackdriverSpecialFields]. You can configure a `log_name_key` if you'd like to use something different from `logging.googleapis.com/logName`, i.e. if the `log_name_key` is set to `mylognamefield` will extract the log name from `mylognamefield` in the log. 
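+
+As a minimal sketch, the per-log log name described above can be exercised by combining `log_name_key` with the [`modify` filter](https://docs.fluentbit.io/manual/pipeline/filters/modify). The field name `mylognamefield` comes from the example above; the value `my_custom_log` is illustrative only:
+
+```text
+[FILTER]
+    Name   modify
+    Match  *
+    # Add the field that log_name_key will read; "my_custom_log" is an illustrative value
+    Add    mylognamefield my_custom_log
+
+[OUTPUT]
+    Name          stackdriver
+    Match         *
+    log_name_key  mylognamefield
+```
+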
## Troubleshooting Notes ### Upstream connection error -> Github reference: [#761](https://github.com/fluent/fluent-bit/issues/761) - An upstream connection error means Fluent Bit was not able to reach Google services, the error looks like this: -``` +```text [2019/01/07 23:24:09] [error] [oauth2] could not get an upstream connection ``` @@ -186,11 +277,18 @@ This is due to a network issue in the environment where Fluent Bit is running. M * [https://www.googleapis.com](https://www.googleapis.com) * [https://logging.googleapis.com](https://logging.googleapis.com) + +{% hint style="warning" %} + +For more details, see GitHub reference: [#761](https://github.com/fluent/fluent-bit/issues/761) + +{% endhint %} + ### Fail to process local_resource_id The error looks like this: -``` +```text [2020/08/04 14:43:03] [error] [output:stackdriver:stackdriver.0] fail to process local_resource_id from log entry for k8s_container ``` @@ -205,4 +303,4 @@ Stackdriver officially supports a [logging agent based on Fluentd](https://cloud We plan to support some [special fields in structured payloads](https://cloud.google.com/logging/docs/agent/configuration#special-fields). Use cases of special fields is [here](./stackdriver_special_fields.md). -[StackdriverSpecialFields]: ./stackdriver_special_fields.md#log-entry-fields +[StackdriverSpecialFields]: ./stackdriver_special_fields.md#log-entry-fields \ No newline at end of file diff --git a/pipeline/outputs/stackdriver_special_fields.md b/pipeline/outputs/stackdriver_special_fields.md index 3bb2b9cd3..b79ec1abf 100644 --- a/pipeline/outputs/stackdriver_special_fields.md +++ b/pipeline/outputs/stackdriver_special_fields.md @@ -10,7 +10,7 @@ Currently, we support some special fields in fluent-bit for setting fields on th | `logging.googleapis.com/logName` | `logName` | `string` | The log name to write this log to. | | `logging.googleapis.com/labels` | `labels` | `object` | The labels for this log. | | `logging.googleapis.com/severity` | `severity` | [`LogSeverity` enum](https://cloud.google.com/logging/docs/reference/v2/rest/v2/LogEntry#LogSeverity) | The severity of this log. | -| `logging.googleapis.com/monitored_resource` | `resource` | [`MonitoredResource`](https://cloud.google.com/logging/docs/reference/v2/rest/v2/MonitoredResource) (without `type`) | Resource labels for this log. See [caveats](#monitored-resource). | +| `logging.googleapis.com/monitored_resource` | `resource` | [`MonitoredResource`](https://cloud.google.com/logging/docs/reference/v2/rest/v2/MonitoredResource) (without `type`) | Resource labels for this log. | | `logging.googleapis.com/operation` | `operation` | [`LogEntryOperation`](https://cloud.google.com/logging/docs/reference/v2/rest/v2/LogEntry#LogEntryOperation) | Additional information about a potentially long-running operation. | | `logging.googleapis.com/insertId` | `insertId` | `string` | A unique identifier for the log entry. It is used to order logEntries. | | `logging.googleapis.com/sourceLocation` | `sourceLocation` | [`LogEntrySourceLocation`](https://cloud.google.com/logging/docs/reference/v2/rest/v2/LogEntry#LogEntrySourceLocation) | Additional information about the source code location that produced the log entry. | @@ -23,76 +23,90 @@ Currently, we support some special fields in fluent-bit for setting fields on th ## Other Special Fields -| JSON log field | Description | -| :--- | :--- | -| `logging.googleapis.com/projectId` | Changes the project ID that this log will be written to. 
Ensure that you are authenticated to write logs to this project. | -| `logging.googleapis.com/local_resource_id` | Overrides the [configured `local_resource_id`](./stackdriver.md#resource-labels). | +| JSON log field | Description | +|:-------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------| +| `logging.googleapis.com/projectId` | Changes the project ID that this log will be written to. Ensure that you are authenticated to write logs to this project. | +| `logging.googleapis.com/local_resource_id` | Overrides the [configured `local_resource_id`](./stackdriver.md#resource-labels). | ## Using Special Fields To use a special field, you must add a field with the right name and value to your log. Given an example structured log (internally in MessagePack but shown in JSON for demonstration): + ```json { - "log": "Hello world!" + "log": "Hello world!" } ``` + To use the `logging.googleapis.com/logName` special field, you would add it to your structured log as follows: + ```json { - "log": "Hello world!", - "logging.googleapis.com/logName": "my_log" + "log": "Hello world!", + "logging.googleapis.com/logName": "my_log" } ``` -For the special fields that map to `LogEntry` protos, you will need to add them as objects with field names that match the proto. For example, to use the `logging.googleapis.com/operation`: + +For the special fields that map to `LogEntry` prototypes, you will need to add them as objects with field names that match the proto. For example, to use the `logging.googleapis.com/operation`: + ```json { - "log": "Hello world!", - "logging.googleapis.com/operation": { - "id": "test_id", - "producer": "test_producer", - "first": true, - "last": true - } + "log": "Hello world!", + "logging.googleapis.com/operation": { + "id": "test_id", + "producer": "test_producer", + "first": true, + "last": true + } } ``` + Adding special fields to logs is best done through the [`modify` filter](https://docs.fluentbit.io/manual/pipeline/filters/modify) for simple fields, or [a Lua script using the `lua` filter](https://docs.fluentbit.io/manual/pipeline/filters/lua) for more complex fields. ## Simple Type Special Fields -For special fields with simple types (with the exception of the [`logging.googleapis.com/insertId` field](#insert-id)), they will follow this pattern (demonstrated with the `logging.googleapis.com/logName` field): +For special fields with simple types (except for the [`logging.googleapis.com/insertId` field](#insert-id)), they will follow this pattern (demonstrated with the `logging.googleapis.com/logName` field): 1. If the special field matches the type, it will be moved to the corresponding LogEntry field. For example: + ```text { - "logging.googleapis.com/logName": "my_log" - ... + ... + "logging.googleapis.com/logName": "my_log" + ... } ``` + the logEntry will be: + ```text { - "jsonPayload": { - ... - } - "logName": "my_log" + "jsonPayload": { ... + } + "logName": "my_log" + ... } ``` 2. If the field is non-empty but an invalid, it will be left in the jsonPayload. For example: + ```text { - "logging.googleapis.com/logName": 12345 - ... + ... + "logging.googleapis.com/logName": 12345 + ... } ``` + the logEntry will be: + ```text { - "jsonPayload": { - "logging.googleapis.com/logName": 12345 - ... - } + "jsonPayload": { + "logging.googleapis.com/logName": 12345 + ... 
+ } } ``` @@ -107,20 +121,26 @@ If the `logging.googleapis.com/insertId` field has an invalid type, the log will If the [`autoformat_stackdriver_trace` plugin configuration option]() is set to `true`, the value provided in the `trace` field will be formatted into the format that Cloud Logging expects along with the detected Project ID (from the Google Metadata server, configured in the plugin, or provided via special field). For example, if `autoformat_stackdriver_trace` is enabled, this: + ```text { - "logging.googleapis.com/projectId": "my-project-id", - "logging.googleapis.com/trace": "12345" + ... + "logging.googleapis.com/projectId": "my-project-id", + "logging.googleapis.com/trace": "12345" + ... } ``` + Will become this: + ```text { - "jsonPayload": { - ... - }, - "projectId": "my-project-id", - "trace": "/projects/my-project-id/traces/12345" + "jsonPayload": { + ... + }, + "projectId": "my-project-id", + "trace": "/projects/my-project-id/traces/12345" + ... } ``` @@ -130,86 +150,96 @@ The `timestampSecond` and `timestampNano` fields don't map directly to the `time ## Proto Special Fields -For special fields that expect the format of a proto type from the `LogEntry` (with the exception of the `logging.googleapis.com/monitored_resource` field) will follow this pattern (demonstrated with the `logging.googleapis.com/operation` field): +For special fields that expect the format of a prototype from the `LogEntry` (except for the `logging.googleapis.com/monitored_resource` field) will follow this pattern (demonstrated with the `logging.googleapis.com/operation` field): If any subfields of the proto are empty or in incorrect type, the plugin will set these subfields empty. For example: + ```text { - "logging.googleapis.com/operation": { - "id": 123, #incorrect type - # no producer here - "first": true, - "last": true - } - ... + "logging.googleapis.com/operation": { + "id": 123, #incorrect type + # no producer here + "first": true, + "last": true + } + ... } ``` + the logEntry will be: + ```text { - "jsonPayload": { - ... - } - "operation": { - "first": true, - "last": true - } + "jsonPayload": { ... + } + "operation": { + "first": true, + "last": true + } + ... } ``` If the field itself is not a map, the plugin will leave this field untouched. For example: + ```text { - "logging.googleapis.com/operation": "some string", - ... + ... + "logging.googleapis.com/operation": "some string", + ... } ``` + the logEntry will be: + ```text { - "jsonPayload": { - "logging.googleapis.com/operation": "some string", - ... - } + "jsonPayload": { + "logging.googleapis.com/operation": "some string", ... + } + ... } ``` If there are extra subfields, the plugin will add the recognized fields to the corresponding field in the LogEntry, and preserve the extra subfields in jsonPayload. For example: + ```text { - "logging.googleapis.com/operation": { - "id": "test_id", - "producer": "test_producer", - "first": true, - "last": true, - - "extra1": "some string", - "extra2": 123, - "extra3": true - } - ... + ... + "logging.googleapis.com/operation": { + "id": "test_id", + "producer": "test_producer", + "first": true, + "last": true, + "extra1": "some string", + "extra2": 123, + "extra3": true + } + ... } ``` + the logEntry will be: + ```text { - "jsonPayload": { - "logging.googleapis.com/operation": { - "extra1": "some string", - "extra2": 123, - "extra3": true - } - ... 
- } - "operation": { - "id": "test_id", - "producer": "test_producer", - "first": true, - "last": true + "jsonPayload": { + "logging.googleapis.com/operation": { + "extra1": "some string", + "extra2": 123, + "extra3": true } ... + } + "operation": { + "id": "test_id", + "producer": "test_producer", + "first": true, + "last": true + } + ... } ``` @@ -223,7 +253,7 @@ The `type` field from the [`MonitoredResource` proto]() is not parsed out of the The `labels` field is expected to be an `object`. If any fields have a value that is not a string, the value is ignored and not preserved. The plugin logs an error and drops the field. -If no valid `labels` field is found, or if all of entries in the `labels` object provided are invalid, the `logging.googleapis.com/monitored_resource` field is dropped in favour of automatically setting resource labels using other available information based on the configured `resource` type. +If no valid `labels` field is found, or if all entries in the `labels` object provided are invalid, the `logging.googleapis.com/monitored_resource` field is dropped in favour of automatically setting resource labels using other available information based on the configured `resource` type. ## Timestamp @@ -231,6 +261,7 @@ We support two formats of time-related fields: Format 1 - timestamp: Log body contains a `timestamp` field that includes the seconds and nanos fields. + ```text { "timestamp": { @@ -239,14 +270,15 @@ Log body contains a `timestamp` field that includes the seconds and nanos fields } } ``` + Format 2 - timestampSeconds/timestampNanos: Log body contains both the `timestampSeconds` and `timestampNanos` fields. + ```text { - "timestampSeconds": CURRENT_SECONDS, - "timestampNanos": CURRENT_NANOS + "timestampSeconds": CURRENT_SECONDS, + "timestampNanos": CURRENT_NANOS } - ``` If one of the following JSON timestamp representations is present in a structured record, the plugin collapses them into a single representation in the timestamp field in the `LogEntry` object. @@ -258,21 +290,23 @@ Without time-related fields, the plugin will set the current time as timestamp. Set the input log as followed: ```text { - "timestamp": { - "seconds": 1596149787, - "nanos": 12345 - } - ... + "timestamp": { + "seconds": 1596149787, + "nanos": 12345 + } + ... } ``` + the logEntry will be: + ```text { - "jsonPayload": { - ... - } - "timestamp": "2020-07-30T22:56:27.000012345Z" + "jsonPayload": { ... + } + "timestamp": "2020-07-30T22:56:27.000012345Z" + ... } ``` @@ -281,19 +315,21 @@ the logEntry will be: Set the input log as followed: ```text { - "timestampSeconds":1596149787, - "timestampNanos": 12345 - ... + "timestampSeconds":1596149787, + "timestampNanos": 12345 + ... } ``` + the logEntry will be: + ```text { - "jsonPayload": { - ... - } - "timestamp": "2020-07-30T22:56:27.000012345Z" + "jsonPayload": { ... + } + "timestamp": "2020-07-30T22:56:27.000012345Z" + ... } ``` diff --git a/pipeline/outputs/standard-output.md b/pipeline/outputs/standard-output.md index 665a34c35..1e5333d48 100644 --- a/pipeline/outputs/standard-output.md +++ b/pipeline/outputs/standard-output.md @@ -4,34 +4,30 @@ The **stdout** output plugin allows to print to the standard output the data rec ## Configuration Parameters -| Key | Description | default | -| :--- | :--- | :--- | -| Format | Specify the data format to be printed. Supported formats are _msgpack_, _json_, _json\_lines_ and _json\_stream_. | msgpack | -| json\_date\_key | Specify the name of the time key in the output record. 
To disable the time key just set the value to `false`. | date | -| json\_date\_format | Specify the format of the date. Supported formats are _double_, _epoch_, _iso8601_ (eg: _2018-05-30T09:39:52.000681Z_) and _java_sql_timestamp_ (eg: _2018-05-30 09:39:52.000681_) | double | -| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `1` | +| Key | Description | default | +|:-------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:--------| +| Format | Specify the data format to be printed. Supported formats are _msgpack_, _json_, _json\_lines_ and _json\_stream_. | msgpack | +| json\_date\_key | Specify the name of the time key in the output record. To disable the time key just set the value to `false`. | date | +| json\_date\_format | Specify the format of the date. Supported formats are _double_, _epoch_, _iso8601_ (eg: _2018-05-30T09:39:52.000681Z_) and _java_sql_timestamp_ (eg: _2018-05-30 09:39:52.000681_) | double | +| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `1` | ### Command Line -```bash -bin/fluent-bit -i cpu -o stdout -v +```shell +fluent-bit -i cpu -o stdout -v ``` -We have specified to gather [CPU](https://github.com/fluent/fluent-bit-docs/tree/ddc1cf3d996966b9db39f8784596c8b7132b4d5b/pipeline/input/cpu.md) usage metrics and print them out to the standard output in a human readable way: +We have specified to gather [CPU](https://github.com/fluent/fluent-bit-docs/tree/ddc1cf3d996966b9db39f8784596c8b7132b4d5b/pipeline/input/cpu.md) usage metrics and print them out to the standard output in a human-readable way: -```bash -$ bin/fluent-bit -i cpu -o stdout -p format=msgpack -v -Fluent Bit v1.x.x -* Copyright (C) 2019-2020 The Fluent Bit Authors -* Copyright (C) 2015-2018 Treasure Data -* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd -* https://fluentbit.io +```shell +$ fluent-bit -i cpu -o stdout -p format=msgpack -v -[2016/10/07 21:52:01] [ info] [engine] started +... 
[0] cpu.0: [1475898721, {"cpu_p"=>0.500000, "user_p"=>0.250000, "system_p"=>0.250000, "cpu0.p_cpu"=>0.000000, "cpu0.p_user"=>0.000000, "cpu0.p_system"=>0.000000, "cpu1.p_cpu"=>0.000000, "cpu1.p_user"=>0.000000, "cpu1.p_system"=>0.000000, "cpu2.p_cpu"=>0.000000, "cpu2.p_user"=>0.000000, "cpu2.p_system"=>0.000000, "cpu3.p_cpu"=>1.000000, "cpu3.p_user"=>0.000000, "cpu3.p_system"=>1.000000}] [1] cpu.0: [1475898722, {"cpu_p"=>0.250000, "user_p"=>0.250000, "system_p"=>0.000000, "cpu0.p_cpu"=>0.000000, "cpu0.p_user"=>0.000000, "cpu0.p_system"=>0.000000, "cpu1.p_cpu"=>1.000000, "cpu1.p_user"=>1.000000, "cpu1.p_system"=>0.000000, "cpu2.p_cpu"=>0.000000, "cpu2.p_user"=>0.000000, "cpu2.p_system"=>0.000000, "cpu3.p_cpu"=>0.000000, "cpu3.p_user"=>0.000000, "cpu3.p_system"=>0.000000}] [2] cpu.0: [1475898723, {"cpu_p"=>0.750000, "user_p"=>0.250000, "system_p"=>0.500000, "cpu0.p_cpu"=>2.000000, "cpu0.p_user"=>1.000000, "cpu0.p_system"=>1.000000, "cpu1.p_cpu"=>0.000000, "cpu1.p_user"=>0.000000, "cpu1.p_system"=>0.000000, "cpu2.p_cpu"=>1.000000, "cpu2.p_user"=>0.000000, "cpu2.p_system"=>1.000000, "cpu3.p_cpu"=>0.000000, "cpu3.p_user"=>0.000000, "cpu3.p_system"=>0.000000}] [3] cpu.0: [1475898724, {"cpu_p"=>1.000000, "user_p"=>0.750000, "system_p"=>0.250000, "cpu0.p_cpu"=>1.000000, "cpu0.p_user"=>1.000000, "cpu0.p_system"=>0.000000, "cpu1.p_cpu"=>2.000000, "cpu1.p_user"=>1.000000, "cpu1.p_system"=>1.000000, "cpu2.p_cpu"=>1.000000, "cpu2.p_user"=>1.000000, "cpu2.p_system"=>0.000000, "cpu3.p_cpu"=>1.000000, "cpu3.p_user"=>1.000000, "cpu3.p_system"=>0.000000}] +... ``` -No more, no less, it just works. +No more, no less, it just works. \ No newline at end of file diff --git a/pipeline/outputs/syslog.md b/pipeline/outputs/syslog.md index 9cce9d2e5..a927382da 100644 --- a/pipeline/outputs/syslog.md +++ b/pipeline/outputs/syslog.md @@ -9,29 +9,29 @@ You must be aware of the structure of your original record so you can configure ## Configuration Parameters -| Key | Description | Default | -| :--- | :--- | :--- | -| host | Domain or IP address of the remote Syslog server. | 127.0.0.1 | -| port | TCP or UDP port of the remote Syslog server. | 514 | -| mode | Desired transport type. Available options are `tcp` and `udp`. | udp | -| syslog\_format | The Syslog protocol format to use. Available options are `rfc3164` and `rfc5424`. | rfc5424 | -| syslog\_maxsize | The maximum size allowed per message. The value must be an integer representing the number of bytes allowed. If no value is provided, the default size is set depending of the protocol version specified by `syslog_format`.

`rfc3164` sets max size to 1024 bytes.

`rfc5424` sets the size to 2048 bytes. | | -| syslog\_severity\_key | The key name from the original record that contains the Syslog severity number. This configuration is optional. | | -| syslog\_severity\_preset | The preset severity number. It will be overwritten if `syslog_severity_key` is set and a key of a record is matched. This configuration is optional. | 6 | -| syslog\_facility\_key | The key name from the original record that contains the Syslog facility number. This configuration is optional. | | -| syslog\_facility\_preset | The preset facility number. It will be overwritten if `syslog_facility_key` is set and a key of a record is matched. This configuration is optional. | 1 | -| syslog\_hostname\_key | The key name from the original record that contains the hostname that generated the message. This configuration is optional. | | -| syslog\_hostname\_preset | The preset hostname. It will be overwritten if `syslog_hostname_key` is set and a key of a record is matched. This configuration is optional. | | -| syslog\_appname\_key | The key name from the original record that contains the application name that generated the message. This configuration is optional. | | -| syslog\_appname\_preset | The preset application name. It will be overwritten if `syslog_appname_key` is set and a key of a record is matched. This configuration is optional. | | -| syslog\_procid\_key | The key name from the original record that contains the Process ID that generated the message. This configuration is optional. | | -| syslog\_procid\_preset | The preset process ID. It will be overwritten if `syslog_procid_key` is set and a key of a record is matched. This configuration is optional. | | -| syslog\_msgid\_key | The key name from the original record that contains the Message ID associated to the message. This configuration is optional. | | -| syslog\_msgid\_preset | The preset message ID. It will be overwritten if `syslog_msgid_key` is set and a key of a record is matched. This configuration is optional. | | -| syslog\_sd\_key | The key name from the original record that contains a map of key/value pairs to use as Structured Data \(SD\) content. The key name is included in the resulting SD field as shown in examples below. This configuration is optional. | | -| syslog\_message\_key | The key name from the original record that contains the message to deliver. Note that this property is **mandatory**, otherwise the message will be empty. | | -| allow\_longer\_sd\_id| If true, Fluent-bit allows SD-ID that is longer than 32 characters. Such long SD-ID violates RFC 5424.| false | -| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` | +| Key | Description | Default | +|:-------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------| +| host | Domain or IP address of the remote Syslog server. | 127.0.0.1 | +| port | TCP or UDP port of the remote Syslog server. | 514 | +| mode | Desired transport type. Available options are `tcp` and `udp`. | udp | +| syslog\_format | The Syslog protocol format to use. Available options are `rfc3164` and `rfc5424`. | rfc5424 | +| syslog\_maxsize | The maximum size allowed per message. 
The value must be an integer representing the number of bytes allowed. If no value is provided, the default size is set depending on the protocol version specified by `syslog_format`.

`rfc3164` sets max size to 1024 bytes.

`rfc5424` sets the size to 2048 bytes. | | +| syslog\_severity\_key | The key name from the original record that contains the Syslog severity number. This configuration is optional. | | +| syslog\_severity\_preset | The preset severity number. It will be overwritten if `syslog_severity_key` is set and a key of a record is matched. This configuration is optional. | 6 | +| syslog\_facility\_key | The key name from the original record that contains the Syslog facility number. This configuration is optional. | | +| syslog\_facility\_preset | The preset facility number. It will be overwritten if `syslog_facility_key` is set and a key of a record is matched. This configuration is optional. | 1 | +| syslog\_hostname\_key | The key name from the original record that contains the hostname that generated the message. This configuration is optional. | | +| syslog\_hostname\_preset | The preset hostname. It will be overwritten if `syslog_hostname_key` is set and a key of a record is matched. This configuration is optional. | | +| syslog\_appname\_key | The key name from the original record that contains the application name that generated the message. This configuration is optional. | | +| syslog\_appname\_preset | The preset application name. It will be overwritten if `syslog_appname_key` is set and a key of a record is matched. This configuration is optional. | | +| syslog\_procid\_key | The key name from the original record that contains the Process ID that generated the message. This configuration is optional. | | +| syslog\_procid\_preset | The preset process ID. It will be overwritten if `syslog_procid_key` is set and a key of a record is matched. This configuration is optional. | | +| syslog\_msgid\_key | The key name from the original record that contains the Message ID associated to the message. This configuration is optional. | | +| syslog\_msgid\_preset | The preset message ID. It will be overwritten if `syslog_msgid_key` is set and a key of a record is matched. This configuration is optional. | | +| syslog\_sd\_key | The key name from the original record that contains a map of key/value pairs to use as Structured Data \(SD\) content. The key name is included in the resulting SD field as shown in examples below. This configuration is optional. | | +| syslog\_message\_key | The key name from the original record that contains the message to deliver. Note that this property is **mandatory**, otherwise the message will be empty. | | +| allow\_longer\_sd\_id | If true, Fluent-bit allows SD-ID that is longer than 32 characters. Such long SD-ID violates RFC 5424. | false | +| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. 
| `0` | ### TLS / SSL @@ -45,45 +45,51 @@ For more details about the properties available and general configuration, see [ Get started quickly with this configuration file: {% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + + outputs: + - name: syslog + match: "*" + host: syslog.yourserver.com + port: 514 + mode: udp + syslog_format: rfc5424 + syslog_maxsize: 2048 + syslog_severity_key: severity + syslog_facility_key: facility + syslog_hostname_key: hostname + syslog_appname_key: appname + syslog_procid_key: procid + syslog_msgid_key: msgid + syslog_sd_key: sd + syslog_message_key: message +``` + +{% endtab %} {% tab title="fluent-bit.conf" %} + ```text [OUTPUT] - name syslog - match * - host syslog.yourserver.com - port 514 - mode udp - syslog_format rfc5424 - syslog_maxsize 2048 - syslog_severity_key severity - syslog_facility_key facility - syslog_hostname_key hostname - syslog_appname_key appname - syslog_procid_key procid - syslog_msgid_key msgid - syslog_sd_key sd - syslog_message_key message -``` -{% endtab %} -{% tab title="fluent-bit.yaml" %} -```yaml - outputs: - - name: syslog - match: "*" - host: syslog.yourserver.com - port: 514 - mode: udp - syslog_format: rfc5424 - syslog_maxsize: 2048 - syslog_severity_key: severity - syslog_facility_key: facility - syslog_hostname_key: hostname - syslog_appname_key: appname - syslog_procid_key: procid - syslog_msgid_key: msgid - syslog_sd_key: sd - syslog_message_key: message + name syslog + match * + host syslog.yourserver.com + port 514 + mode udp + syslog_format rfc5424 + syslog_maxsize 2048 + syslog_severity_key severity + syslog_facility_key facility + syslog_hostname_key hostname + syslog_appname_key appname + syslog_procid_key procid + syslog_msgid_key msgid + syslog_sd_key sd + syslog_message_key message ``` + {% endtab %} {% endtabs %} @@ -95,42 +101,27 @@ Example log: ```json { - "hostname": "myhost", - "appname": "myapp", - "procid": "1234", - "msgid": "ID98", - "uls@0": { - "logtype": "access", - "clustername": "mycluster", - "namespace": "mynamespace" - }, - "log": "Sample app log message." + "hostname": "myhost", + "appname": "myapp", + "procid": "1234", + "msgid": "ID98", + "uls@0": { + "logtype": "access", + "clustername": "mycluster", + "namespace": "mynamespace" + }, + "log": "Sample app log message." } ``` Example configuration file: {% tabs %} -{% tab title="fluent-bit.conf" %} -```text -[OUTPUT] - name syslog - match * - host syslog.yourserver.com - port 514 - mode udp - syslog_format rfc5424 - syslog_maxsize 2048 - syslog_hostname_key hostname - syslog_appname_key appname - syslog_procid_key procid - syslog_msgid_key msgid - syslog_sd_key uls@0 - syslog_message_key log -``` -{% endtab %} {% tab title="fluent-bit.yaml" %} + ```yaml +pipeline: + outputs: - name: syslog match: "*" @@ -146,13 +137,36 @@ Example configuration file: syslog_sd_key: uls@0 syslog_message_key: log ``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + +```text +[OUTPUT] + name syslog + match * + host syslog.yourserver.com + port 514 + mode udp + syslog_format rfc5424 + syslog_maxsize 2048 + syslog_hostname_key hostname + syslog_appname_key appname + syslog_procid_key procid + syslog_msgid_key msgid + syslog_sd_key uls@0 + syslog_message_key log +``` + {% endtab %} {% endtabs %} Example output: -```bash +```text +... <14>1 2021-07-12T14:37:35.569848Z myhost myapp 1234 ID98 [uls@0 logtype="access" clustername="mycluster" namespace="mynamespace"] Sample app log message. +... 
``` ### Adding Structured Data Authentication Token @@ -162,32 +176,11 @@ However, this requires setting the token as a key rather than as a value. Here's an example of how that might be achieved, using `AUTH_TOKEN` as a [variable](../../administration/configuring-fluent-bit/classic-mode/variables.md): {% tabs %} -{% tab title="fluent-bit.conf" %} -```text -[FILTER] - name lua - match * - call append_token - code function append_token(tag, timestamp, record) record["${AUTH_TOKEN}"] = {} return 2, timestamp, record end - -[OUTPUT] - name syslog - match * - host syslog.yourserver.com - port 514 - mode tcp - syslog_format rfc5424 - syslog_hostname_preset my-hostname - syslog_appname_preset my-appname - syslog_message_key log - allow_longer_sd_id true - syslog_sd_key ${AUTH_TOKEN} - tls on - tls.crt_file /path/to/my.crt -``` -{% endtab %} {% tab title="fluent-bit.yaml" %} + ```yaml +pipeline: + filters: - name: lua match: "*" @@ -213,5 +206,32 @@ Here's an example of how that might be achieved, using `AUTH_TOKEN` as a [variab tls: on tls.crt_file: /path/to/my.crt ``` + {% endtab %} -{% endtabs %} +{% tab title="fluent-bit.conf" %} + +```text +[FILTER] + name lua + match * + call append_token + code function append_token(tag, timestamp, record) record["${AUTH_TOKEN}"] = {} return 2, timestamp, record end + +[OUTPUT] + name syslog + match * + host syslog.yourserver.com + port 514 + mode tcp + syslog_format rfc5424 + syslog_hostname_preset my-hostname + syslog_appname_preset my-appname + syslog_message_key log + allow_longer_sd_id true + syslog_sd_key ${AUTH_TOKEN} + tls on + tls.crt_file /path/to/my.crt +``` + +{% endtab %} +{% endtabs %} \ No newline at end of file diff --git a/pipeline/outputs/tcp-and-tls.md b/pipeline/outputs/tcp-and-tls.md index efcd8d016..15d02efe6 100644 --- a/pipeline/outputs/tcp-and-tls.md +++ b/pipeline/outputs/tcp-and-tls.md @@ -4,35 +4,35 @@ The **tcp** output plugin allows to send records to a remote TCP server. The pay ## Configuration Parameters -| Key | Description | default | -| :--- | :--- | :--- | -| Host | Target host where Fluent-Bit or Fluentd are listening for Forward messages. | 127.0.0.1 | -| Port | TCP Port of the target service. | 5170 | -| Format | Specify the data format to be printed. Supported formats are _msgpack_ _json_, _json\_lines_ and _json\_stream_. | msgpack | -| json\_date\_key | Specify the name of the time key in the output record. To disable the time key just set the value to `false`. | date | -| json\_date\_format | Specify the format of the date. Supported formats are _double_, _epoch_, _iso8601_ (eg: _2018-05-30T09:39:52.000681Z_) and _java_sql_timestamp_ (eg: _2018-05-30 09:39:52.000681_) | double | -| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `2` | +| Key | Description | default | +|:-------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------| +| Host | Target host where Fluent-Bit or Fluentd are listening for Forward messages. | 127.0.0.1 | +| Port | TCP Port of the target service. | 5170 | +| Format | Specify the data format to be printed. Supported formats are _msgpack_ _json_, _json\_lines_ and _json\_stream_. | msgpack | +| json\_date\_key | Specify the name of the time key in the output record. To disable the time key just set the value to `false`. 
| date | +| json\_date\_format | Specify the format of the date. Supported formats are _double_, _epoch_, _iso8601_ (eg: _2018-05-30T09:39:52.000681Z_) and _java_sql_timestamp_ (eg: _2018-05-30 09:39:52.000681_) | double | +| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `2` | ## TLS Configuration Parameters The following parameters are available to configure a secure channel connection through TLS: -| Key | Description | Default | -| :--- | :--- | :--- | -| tls | Enable or disable TLS support | Off | -| tls.verify | Force certificate validation | On | -| tls.debug | Set TLS debug verbosity level. It accept the following values: 0 \(No debug\), 1 \(Error\), 2 \(State change\), 3 \(Informational\) and 4 Verbose | 1 | -| tls.ca\_file | Absolute path to CA certificate file | | -| tls.crt\_file | Absolute path to Certificate file. | | -| tls.key\_file | Absolute path to private Key file. | | -| tls.key\_passwd | Optional password for tls.key\_file file. | | +| Key | Description | Default | +|:----------------|:--------------------------------------------------------------------------------------------------------------------------------------------------|:--------| +| tls | Enable or disable TLS support | Off | +| tls.verify | Force certificate validation | On | +| tls.debug | Set TLS debug verbosity level. It accept the following values: 0 \(No debug\), 1 \(Error\), 2 \(State change\), 3 \(Informational\) and 4 Verbose | 1 | +| tls.ca\_file | Absolute path to CA certificate file | | +| tls.crt\_file | Absolute path to Certificate file. | | +| tls.key\_file | Absolute path to private Key file. | | +| tls.key\_passwd | Optional password for tls.key\_file file. | | ### Command Line #### JSON format -```bash -bin/fluent-bit -i cpu -o tcp://127.0.0.1:5170 -p format=json_lines -v +```shell +fluent-bit -i cpu -o tcp://127.0.0.1:5170 -p format=json_lines -v ``` We have specified to gather [CPU](https://github.com/fluent/fluent-bit-docs/tree/16f30161dc4c79d407cd9c586a0c6839d0969d97/pipeline/input/cpu.md) usage metrics and send them in JSON lines mode to a remote end-point using netcat service. @@ -40,18 +40,20 @@ We have specified to gather [CPU](https://github.com/fluent/fluent-bit-docs/tree Run the following in a separate terminal, `netcat` will start listening for messages on TCP port 5170. Once it connects to Fluent Bit ou should see the output as above in JSON format: -```bash +```shell $ nc -l 5170 + +... {"date":1644834856.905985,"cpu_p":1.1875,"user_p":0.5625,"system_p":0.625,"cpu0.p_cpu":0.0,"cpu0.p_user":0.0,"cpu0.p_system":0.0,"cpu1.p_cpu":1.0,"cpu1.p_user":1.0,"cpu1.p_system":0.0,"cpu2.p_cpu":4.0,"cpu2.p_user":2.0,"cpu2.p_system":2.0,"cpu3.p_cpu":1.0,"cpu3.p_user":0.0,"cpu3.p_system":1.0,"cpu4.p_cpu":1.0,"cpu4.p_user":0.0,"cpu4.p_system":1.0,"cpu5.p_cpu":1.0,"cpu5.p_user":1.0,"cpu5.p_system":0.0,"cpu6.p_cpu":0.0,"cpu6.p_user":0.0,"cpu6.p_system":0.0,"cpu7.p_cpu":3.0,"cpu7.p_user":1.0,"cpu7.p_system":2.0,"cpu8.p_cpu":0.0,"cpu8.p_user":0.0,"cpu8.p_system":0.0,"cpu9.p_cpu":1.0,"cpu9.p_user":0.0,"cpu9.p_system":1.0,"cpu10.p_cpu":1.0,"cpu10.p_user":0.0,"cpu10.p_system":1.0,"cpu11.p_cpu":0.0,"cpu11.p_user":0.0,"cpu11.p_system":0.0,"cpu12.p_cpu":0.0,"cpu12.p_user":0.0,"cpu12.p_system":0.0,"cpu13.p_cpu":3.0,"cpu13.p_user":2.0,"cpu13.p_system":1.0,"cpu14.p_cpu":1.0,"cpu14.p_user":1.0,"cpu14.p_system":0.0,"cpu15.p_cpu":0.0,"cpu15.p_user":0.0,"cpu15.p_system":0.0} +... 
``` #### Msgpack format Repeat the JSON approach but using the `msgpack` output format. -```bash -bin/fluent-bit -i cpu -o tcp://127.0.0.1:5170 -p format=msgpack -v - +```shell +fluent-bit -i cpu -o tcp://127.0.0.1:5170 -p format=msgpack -v ``` We could send this to stdout but as it is a serialized format you would end up with strange output. @@ -79,9 +81,11 @@ while True: print(unpacked) ``` -```bash +```shell $ pip install msgpack $ python3 test.py -(ExtType(code=0, data=b'b\n5\xc65\x05\x14\xac'), {'cpu_p': 0.1875, 'user_p': 0.125, 'system_p': 0.0625, 'cpu0.p_cpu': 0.0, 'cpu0.p_user': 0.0, 'cpu0.p_system': 0.0, 'cpu1.p_cpu': 0.0, 'cpu1.p_user': 0.0, 'cpu1.p_system': 0.0, 'cpu2.p_cpu': 1.0, 'cpu2.p_user': 0.0, 'cpu2.p_system': 1.0, 'cpu3.p_cpu': 0.0, 'cpu3.p_user': 0.0, 'cpu3.p_system': 0.0, 'cpu4.p_cpu': 0.0, 'cpu4.p_user': 0.0, 'cpu4.p_system': 0.0, 'cpu5.p_cpu': 0.0, 'cpu5.p_user': 0.0, 'cpu5.p_system': 0.0, 'cpu6.p_cpu': 0.0, 'cpu6.p_user': 0.0, 'cpu6.p_system': 0.0, 'cpu7.p_cpu': 0.0, 'cpu7.p_user': 0.0, 'cpu7.p_system': 0.0, 'cpu8.p_cpu': 0.0, 'cpu8.p_user': 0.0, 'cpu8.p_system': 0.0, 'cpu9.p_cpu': 1.0, 'cpu9.p_user': 1.0, 'cpu9.p_system': 0.0, 'cpu10.p_cpu': 0.0, 'cpu10.p_user': 0.0, 'cpu10.p_system': 0.0, 'cpu11.p_cpu': 0.0, 'cpu11.p_user': 0.0, 'cpu11.p_system': 0.0, 'cpu12.p_cpu': 0.0, 'cpu12.p_user': 0.0, 'cpu12.p_system': 0.0, 'cpu13.p_cpu': 0.0, 'cpu13.p_user': 0.0, 'cpu13.p_system': 0.0, 'cpu14.p_cpu': 0.0, 'cpu14.p_user': 0.0, 'cpu14.p_system': 0.0, 'cpu15.p_cpu': 0.0, 'cpu15.p_user': 0.0, 'cpu15.p_system': 0.0}) -``` +... +(ExtType(code=0, data=b'b\n5\xc65\x05\x14\xac'), {'cpu_p': 0.1875, 'user_p': 0.125, 'system_p': 0.0625, 'cpu0.p_cpu': 0.0, 'cpu0.p_user': 0.0, 'cpu0.p_system': 0.0, 'cpu1.p_cpu': 0.0, 'cpu1.p_user': 0.0, 'cpu1.p_system': 0.0, 'cpu2.p_cpu': 1.0, 'cpu2.p_user': 0.0, 'cpu2.p_system': 1.0, 'cpu3.p_cpu': 0.0, 'cpu3.p_user': 0.0, 'cpu3.p_system': 0.0, 'cpu4.p_cpu': 0.0, 'cpu4.p_user': 0.0, 'cpu4.p_system': 0.0, 'cpu5.p_cpu': 0.0, 'cpu5.p_user': 0.0, 'cpu5.p_system': 0.0, 'cpu6.p_cpu': 0.0, 'cpu6.p_user': 0.0, 'cpu6.p_system': 0.0, 'cpu7.p_cpu': 0.0, 'cpu7.p_user': 0.0, 'cpu7.p_system': 0.0, 'cpu8.p_cpu': 0.0, 'cpu8.p_user': 0.0, 'cpu8.p_system': 0.0, 'cpu9.p_cpu': 1.0, 'cpu9.p_user': 1.0, 'cpu9.p_system': 0.0, 'cpu10.p_cpu': 0.0, 'cpu10.p_user': 0.0, 'cpu10.p_system': 0.0, 'cpu11.p_cpu': 0.0, 'cpu11.p_user': 0.0, 'cpu11.p_system': 0.0, 'cpu12.p_cpu': 0.0, 'cpu12.p_user': 0.0, 'cpu12.p_system': 0.0, 'cpu13.p_cpu': 0.0, 'cpu13.p_user': 0.0, 'cpu13.p_system': 0.0, 'cpu14.p_cpu': 0.0, 'cpu14.p_user': 0.0, 'cpu14.p_system': 0.0, 'cpu15.p_cpu': 0.0, 'cpu15.p_user': 0.0, 'cpu15.p_system': 0.0}) +... +``` \ No newline at end of file diff --git a/pipeline/outputs/treasure-data.md b/pipeline/outputs/treasure-data.md index 5db12a997..c3c146770 100644 --- a/pipeline/outputs/treasure-data.md +++ b/pipeline/outputs/treasure-data.md @@ -6,13 +6,13 @@ The **td** output plugin, allows to flush your records into the [Treasure Data]( The plugin supports the following configuration parameters: -| Key | Description | Default | -| :--- | :--- | :--- | -| API | The [Treasure Data](http://treasuredata.com) API key. To obtain it please log into the [Console](https://console.treasuredata.com) and in the API keys box, copy the API key hash. | | -| Database | Specify the name of your target database. | | -| Table | Specify the name of your target table where the records will be stored. 
| | -| Region | Set the service region, available values: US and JP | US | -| Workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` | +| Key | Description | Default | +|:---------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:--------| +| API | The [Treasure Data](http://treasuredata.com) API key. To obtain it please log into the [Console](https://console.treasuredata.com) and in the API keys box, copy the API key hash. | | +| Database | Specify the name of your target database. | | +| Table | Specify the name of your target table where the records will be stored. | | +| Region | Set the service region, available values: US and JP | US | +| Workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` | ## Getting Started @@ -20,7 +20,7 @@ In order to start inserting records into [Treasure Data](https://www.treasuredat ### Command Line: -```bash +```shell fluent-bit -i cpu -o td -p API="abc" -p Database="fluentbit" -p Table="cpu_samples" ``` @@ -28,17 +28,41 @@ Ideally you don't want to expose your API key from the command line, using a con ### Configuration File -In your main configuration file append the following _Input_ & _Output_ sections: +In your main configuration file append the following: -```python +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + inputs: + - name: cpu + tag: my_cpu + + outputs: + - name: td + match: '*' + api: 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX' + database: fluentbit + table: cpu_samples +``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + + +```text [INPUT] - Name cpu - Tag my_cpu + Name cpu + Tag my_cpu [OUTPUT] - Name td - Match * - API 5713/e75be23caee19f8041dfa635ddfbd0dcd8c8d981 - Database fluentbit - Table cpu_samples + Name td + Match * + API XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX + Database fluentbit + Table cpu_samples ``` + +{% endtab %} +{% endtabs %} \ No newline at end of file diff --git a/pipeline/outputs/vivo-exporter.md b/pipeline/outputs/vivo-exporter.md index 661b30968..b79b0c926 100644 --- a/pipeline/outputs/vivo-exporter.md +++ b/pipeline/outputs/vivo-exporter.md @@ -5,31 +5,55 @@ Vivo Exporter is an output plugin that exposes logs, metrics, and traces through ### Configuration Parameters | Key | Description | Default | -| ------------------------ | -------------------------------------------------------------------------------------------------------------------------------------- | ------- | +|--------------------------|----------------------------------------------------------------------------------------------------------------------------------------|---------| | `empty_stream_on_read` | If enabled, when an HTTP client consumes the data from a stream, the stream content will be removed. | Off | | `stream_queue_size` | Specify the maximum queue size per stream. Each specific stream for logs, metrics and traces can hold up to `stream_queue_size` bytes. | 20M | | `http_cors_allow_origin` | Specify the value for the HTTP Access-Control-Allow-Origin header (CORS). | | -| `workers` | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `1` | +| `workers` | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. 
| `1` | ### Getting Started Here is a simple configuration of Vivo Exporter, note that this example is not based on defaults. -```python +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + inputs: + - name: dummy + tag: events + rate: 2 + + outputs: + - name: vivo_exporter + match: '*' + empty_stream_on_read: off + stream_queue_size: 20M + http_cors_allow_origin: '*' +``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + +```text [INPUT] - name dummy - tag events - rate 2 + name dummy + tag events + rate 2 [OUTPUT] - name vivo_exporter - match * - empty_stream_on_read off - stream_queue_size 20M - http_cors_allow_origin * + name vivo_exporter + match * + empty_stream_on_read off + stream_queue_size 20M + http_cors_allow_origin * ``` +{% endtab %} +{% endtabs %} + ### How it works Vivo Exporter provides buffers that serve as streams for each telemetry data type, in this case, `logs`, `metrics`, and `traces`. Each buffer contains a fixed capacity in terms of size (20M by default). When the data arrives at a stream, it's appended to the end. If the buffer is full, it removes the older entries to make room for new data. @@ -41,7 +65,7 @@ The `data` that arrives is a `chunk`. A chunk is a group of events that belongs By using a simple HTTP request, you can retrieve the data from the streams. The following are the endpoints available: | endpoint | Description | -| ---------- | ----------------------------------------------------------------------------------------------------------------------------- | +|------------|-------------------------------------------------------------------------------------------------------------------------------| | `/logs` | Exposes log events in JSON format. Each event contains a timestamp, metadata and the event content. | | `/metrics` | Exposes metrics events in JSON format. Each metric contains name, metadata, metric type and labels (dimensions). | | `/traces` | Exposes traces events in JSON format. Each trace contains a name, resource spans, spans, attributes, events information, etc. | @@ -50,21 +74,42 @@ The example below will generate dummy log events which will be consuming by usin **Configure and start Fluent Bit** -```python + +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + inputs: + - name: dummy + tag: events + rate: 2 + + outputs: + - name: vivo_exporter + match: '*' +``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + +```text [INPUT] - name dummy - tag events - rate 2 + name dummy + tag events + rate 2 [OUTPUT] - name vivo_exporter - match * - + name vivo_exporter + match * ``` +{% endtab %} +{% endtabs %} + **Retrieve the data** -```bash +```shell curl -i http://127.0.0.1:2025/logs ``` @@ -72,7 +117,7 @@ curl -i http://127.0.0.1:2025/logs Curl output would look like this: -```bash +```shell HTTP/1.1 200 OK Server: Monkey/1.7.0 Date: Tue, 21 Mar 2023 16:42:28 GMT @@ -88,6 +133,7 @@ Vivo-Stream-End-ID: 3 [[1679416947459806000,{"_tag":"events"}],{"message":"dummy"}] [[1679416947958777000,{"_tag":"events"}],{"message":"dummy"}] [[1679416948459391000,{"_tag":"events"}],{"message":"dummy"}] +... 
``` ### Streams and IDs @@ -105,18 +151,20 @@ A client might be interested into always retrieve the latest chunks available an To query ranges or starting from specific chunks IDs, remember that they are incremental, you can use a mix of the following options: | Query string option | Description | -| ------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------- | +|---------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------| | `from` | Specify the first chunk ID that is desired to be retrieved. Note that if the `chunk` ID does not exists the next one in the queue will be provided. | | `to` | The last chunk ID is desired. If not found, the whole stream will be provided (starting from `from` if was set). | | `limit` | Limit the output to a specific number of chunks. The default value is `0`, which means: send everything. | The following example specifies the range from chunk ID 1 to chunk ID 3 and only 1 chunk: -`curl -i "http://127.0.0.1:2025/logs?from=1&to=3&limit=1"` +```shell +curl -i "http://127.0.0.1:2025/logs?from=1&to=3&limit=1"` +``` Output: -```bash +```shell HTTP/1.1 200 OK Server: Monkey/1.7.0 Date: Tue, 21 Mar 2023 16:45:05 GMT @@ -127,4 +175,5 @@ Vivo-Stream-End-ID: 1 [[1679416945959398000,{"_tag":"events"}],{"message":"dummy"}] [[1679416946459271000,{"_tag":"events"}],{"message":"dummy"}] -``` +... +``` \ No newline at end of file diff --git a/pipeline/outputs/websocket.md b/pipeline/outputs/websocket.md index 64610000b..22cfa5b31 100644 --- a/pipeline/outputs/websocket.md +++ b/pipeline/outputs/websocket.md @@ -1,23 +1,23 @@ # WebSocket -The **websocket** output plugin allows to flush your records into a WebSocket endpoint. For now the functionality is pretty basic and it issues a HTTP GET request to do the handshake, and then use TCP connections to send the data records in either JSON or [MessagePack](http://msgpack.org) \(or JSON\) format. +The **websocket** output plugin allows to flush your records into a WebSocket endpoint. For now the functionality is pretty basic, and it issues an HTTP GET request to do the handshake, and then use TCP connections to send the data records in either JSON or [MessagePack](http://msgpack.org) \(or JSON\) format. ## Configuration Parameters -| Key | Description | default | -| :--- | :--- | :--- | -| Host | IP address or hostname of the target WebSocket Server | 127.0.0.1 | -| Port | TCP port of the target WebSocket Server | 80 | -| URI | Specify an optional HTTP URI for the target websocket server, e.g: /something | / | -| Header | Add a HTTP header key/value pair. Multiple headers can be set. | | -| Format | Specify the data format to be used in the HTTP request body, by default it uses _msgpack_. Other supported formats are _json_, _json\_stream_ and _json\_lines_ and _gelf_. | msgpack | -| json\_date\_key | Specify the name of the date field in output | date | -| json\_date\_format | Specify the format of the date. Supported formats are _double_, _epoch_, _iso8601_ (eg: _2018-05-30T09:39:52.000681Z_) and _java_sql_timestamp_ (eg: _2018-05-30 09:39:52.000681_) | double | -| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. 
| `0` | +| Key | Description | default | +|:-------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------| +| Host | IP address or hostname of the target WebSocket Server | 127.0.0.1 | +| Port | TCP port of the target WebSocket Server | 80 | +| URI | Specify an optional HTTP URI for the target websocket server, e.g: /something | / | +| Header | Add a HTTP header key/value pair. Multiple headers can be set. | | +| Format | Specify the data format to be used in the HTTP request body, by default it uses _msgpack_. Other supported formats are _json_, _json\_stream_ and _json\_lines_ and _gelf_. | msgpack | +| json\_date\_key | Specify the name of the date field in output | date | +| json\_date\_format | Specify the format of the date. Supported formats are _double_, _epoch_, _iso8601_ (eg: _2018-05-30T09:39:52.000681Z_) and _java_sql_timestamp_ (eg: _2018-05-30 09:39:52.000681_) | double | +| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` | ## Getting Started -In order to insert records into a HTTP server, you can run the plugin from the command line or through the configuration file: +In order to insert records into an HTTP server, you can run the plugin from the command line or through the configuration file: ### Command Line @@ -29,74 +29,118 @@ http://host:port/something Using the format specified, you could start Fluent Bit through: -```text +```shell fluent-bit -i cpu -t cpu -o websocket://192.168.2.3:80/something -m '*' ``` ### Configuration File -In your main configuration file, append the following _Input_ & _Output_ sections: +In your main configuration file, append the following: + +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + inputs: + - name: cpu + tag: cpu + + outputs: + - name: websocket + match: '*' + host: 192.168.2.3 + port: 80 + uri: /something + format: json +``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} -```python +```text [INPUT] - Name cpu - Tag cpu + Name cpu + Tag cpu [OUTPUT] - Name websocket - Match * - Host 192.168.2.3 - Port 80 - URI /something - Format json + Name websocket + Match * + Host 192.168.2.3 + Port 80 + URI /something + Format json ``` -Websocket plugin is working with tcp keepalive mode, please refer to [networking](https://docs.fluentbit.io/manual/v/master/administration/networking#configuration-options) section for details. Since websocket is a stateful plugin, it will decide when to send out handshake to server side, for example when plugin just begins to work or after connection with server has been dropped. In general, the interval to init a new websocket handshake would be less than the keepalive interval. With that strategy, it could detect and resume websocket connections. +{% endtab %} +{% endtabs %} +Websocket plugin is working with tcp keepalive mode, please refer to [networking](https://docs.fluentbit.io/manual/v/master/administration/networking#configuration-options) section for details. Since websocket is a stateful plugin, it will decide when to send out handshake to server side, for example when plugin just begins to work or after connection with server has been dropped. In general, the interval to init a new websocket handshake would be less than the keepalive interval. With that strategy, it could detect and resume websocket connections. 
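To make that relationship concrete, here is a minimal sketch of a websocket output with the keepalive options spelled out. The host, port and URI are placeholder values reused from the command-line example above, and the timeout is only illustrative; the Testing section below uses the same options in a complete pipeline.

```yaml
pipeline:
  outputs:
    - name: websocket
      match: '*'
      host: 192.168.2.3
      port: 80
      uri: /something
      format: json
      # Reuse idle TCP connections; a new handshake is only issued when a
      # connection is (re)established, e.g. at startup or after a drop.
      net.keepalive: on
      # Seconds an idle kept-alive connection is held open (30 is the default).
      net.keepalive_idle_timeout: 30
```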
## Testing ### Configuration File +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + inputs: + - name: tcp + listen: 0.0.0.0 + port: 5170 + format: json + + outputs: + - name: websocket + match: '*' + host: 127.0.0.1 + port: 8080 + uri: / + format: json + workers: 4 + net.keepalive: on + net.keepalive_idle_timeout: 30 +``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + ```text [INPUT] - Name tcp - Listen 0.0.0.0 - Port 5170 - Format json + Name tcp + Listen 0.0.0.0 + Port 5170 + Format json [OUTPUT] - Name websocket - Match * - Host 127.0.0.1 - Port 8080 - URI / - Format json - workers 4 - net.keepalive on - net.keepalive_idle_timeout 30 + Name websocket + Match * + Host 127.0.0.1 + Port 8080 + URI / + Format json + workers 4 + net.keepalive on + net.keepalive_idle_timeout 30 ``` +{% endtab %} +{% endtabs %} + Once Fluent Bit is running, you can send some messages using the _netcat_: -```bash +```shell echo '{"key 1": 123456789, "key 2": "abcdefg"}' | nc 127.0.0.1 5170; sleep 35; echo '{"key 1": 123456789, "key 2": "abcdefg"}' | nc 127.0.0.1 5170 ``` In [Fluent Bit](http://fluentbit.io) we should see the following output: -```bash -bin/fluent-bit -c ../conf/out_ws.conf -Fluent Bit v1.7.0 -* Copyright (C) 2019-2020 The Fluent Bit Authors -* Copyright (C) 2015-2018 Treasure Data -* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd -* https://fluentbit.io - -[2021/02/05 22:17:09] [ info] [engine] started (pid=6056) -[2021/02/05 22:17:09] [ info] [storage] version=1.1.0, initializing... -[2021/02/05 22:17:09] [ info] [storage] in-memory -[2021/02/05 22:17:09] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128 +```shell +$ fluent-bit -c ../conf/out_ws.conf + +... [2021/02/05 22:17:09] [ info] [input:tcp:tcp.0] listening on 0.0.0.0:5170 [2021/02/05 22:17:09] [ info] [out_ws] we have following parameter /, 127.0.0.1, 8080, 25 [2021/02/05 22:17:09] [ info] [output:websocket:websocket.0] worker #1 started @@ -119,10 +163,11 @@ Fluent Bit v1.7.0 [2021/02/05 22:18:27] [ info] [output:websocket:websocket.0] thread worker #3 stopping... [2021/02/05 22:18:27] [ info] [output:websocket:websocket.0] thread worker #3 stopped [2021/02/05 22:18:27] [ info] [out_ws] flb_ws_conf_destroy +... ``` ### Scenario Description -From the output of fluent-bit log, we see that once data has been ingested into fluent bit, plugin would perform handshake. After a while, no data or traffic is undergoing, tcp connection would been abort. And then another piece of data arrived, a retry for websocket plugin has been triggered, with another handshake and data flush. +From the output of fluent-bit log, we see that once data has been ingested into fluent bit, plugin would perform handshake. After a while, no data or traffic is undergoing, tcp connection would be aborted. And then another piece of data arrived, a retry for websocket plugin has been triggered, with another handshake and data flush. -There is another scenario, once websocket server flaps in a short time, which means it goes down and up in a short time, fluent-bit would resume tcp connection immediately. But in that case, websocket output plugin is a malfunction state, it needs to restart fluent-bit to get back to work. +There is another scenario, once websocket server flaps in a short time, which means it goes down and up in a short time, fluent-bit would resume tcp connection immediately. 
But in that case, the websocket output plugin is left in a malfunctioning state, and Fluent Bit needs to be restarted to get back to work.
\ No newline at end of file
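When that happens, it can help to confirm the server is accepting handshakes again before restarting Fluent Bit. The following is only a diagnostic sketch using the address from the Testing example above: the `Sec-WebSocket-Key` is the RFC 6455 sample value (any base64-encoded 16-byte value will do), and a healthy server should reply with `HTTP/1.1 101 Switching Protocols`.

```shell
# Manual WebSocket handshake check: a plain HTTP GET with Upgrade headers.
curl -i -N \
  -H "Connection: Upgrade" \
  -H "Upgrade: websocket" \
  -H "Sec-WebSocket-Version: 13" \
  -H "Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==" \
  http://127.0.0.1:8080/
```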