Commit 66da244

Merge branch 'master' into lynettemiles/sc-136228/update-fluent-bit-docs-pipeline-outputs-azure
Signed-off-by: Lynette Miles <[email protected]>
2 parents d901cc4 + bd07152 commit 66da244

10 files changed: +177 -188 lines changed

pipeline/outputs/azure.md

Lines changed: 24 additions & 24 deletions
@@ -1,40 +1,40 @@
 ---
-description: 'Send logs, metrics to Azure Log Analytics'
+description: Send logs, metrics to Azure Log Analytics
 ---

 # Azure Log Analytics

-Azure output plugin allows to ingest your records into [Azure Log Analytics](https://azure.microsoft.com/en-us/services/log-analytics/) service.
+The Azure output plugin lets you ingest your records into the [Azure Log Analytics](https://azure.microsoft.com/en-us/services/log-analytics/) service.

-To get more details about how to setup Azure Log Analytics, please refer to the following documentation: [Azure Log Analytics](https://docs.microsoft.com/en-us/azure/log-analytics/)
+For details about how to set up Azure Log Analytics, see the [Azure Log Analytics](https://docs.microsoft.com/en-us/azure/log-analytics/) documentation.

-## Configuration Parameters
+## Configuration parameters

-| Key | Description | default |
+| Key | Description | Default |
 | :--- | :--- | :--- |
-| Customer\_ID | Customer ID or WorkspaceID string. | |
-| Shared\_Key | The primary or the secondary Connected Sources client authentication key. | |
-| Log\_Type | The name of the event type. | fluentbit |
-| Log_Type_Key | If included, the value for this key will be looked upon in the record and if present, will over-write the `log_type`. If not found then the `log_type` value will be used. | |
-| Time\_Key | Optional parameter to specify the key name where the timestamp will be stored. | @timestamp |
-| Time\_Generated | If enabled, the HTTP request header 'time-generated-field' will be included so Azure can override the timestamp with the key specified by 'time_key' option. | off |
-| Workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` |
+| `Customer_ID` | Customer ID or WorkspaceID string. | _none_ |
+| `Shared_Key` | The primary or the secondary Connected Sources client authentication key. | _none_ |
+| `Log_Type` | The name of the event type. | `fluentbit` |
+| `Log_Type_Key` | If included, the value for this key is checked in the record and, if present, overwrites the `log_type`. If not found, the `log_type` value is used. | _none_ |
+| `Time_Key` | Optional. Specify the key name where the timestamp will be stored. | `@timestamp` |
+| `Time_Generated` | If enabled, the HTTP request header `time-generated-field` will be included so Azure can override the timestamp with the key specified by the `time_key` option. | `off` |
+| `Workers` | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` |
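
As a reference for the optional parameters above, the following is a minimal YAML sketch of an `azure` output that sets `log_type`, `time_key`, and `time_generated` explicitly. The `customer_id`, `shared_key`, and `log_type` values are placeholders:

```yaml
pipeline:
  inputs:
    - name: cpu

  outputs:
    - name: azure
      match: '*'
      customer_id: abc             # placeholder Workspace ID
      shared_key: def              # placeholder client authentication key
      log_type: my_events          # event type name (defaults to fluentbit)
      time_key: '@timestamp'       # key that stores the record timestamp
      time_generated: on           # send the time-generated-field request header
```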

-## Getting Started
+## Get started

-In order to insert records into an Azure Log Analytics instance, you can run the plugin from the command line or through the configuration file:
+To insert records into an Azure Log Analytics instance, run the plugin from the command line or through the configuration file.

-### Command Line
+### Command line

-The **azure** plugin, can read the parameters from the command line in two ways, through the **-p** argument \(property\), e.g:
+The _Azure_ plugin can read the parameters from the command line using the `-p` argument (property):

 ```shell
 fluent-bit -i cpu -o azure -p customer_id=abc -p shared_key=def -m '*' -f 1
 ```

-### Configuration File
+### Configuration file

-In your main configuration file append the following _Input_ & _Output_ sections:
+In your main configuration file, append the following sections:

 {% tabs %}
 {% tab title="fluent-bit.yaml" %}
@@ -43,12 +43,12 @@ In your main configuration file append the following _Input_ & _Output_ sections
 pipeline:
   inputs:
     - name: cpu
-
+
   outputs:
     - name: azure
       match: '*'
       customer_id: abc
-      shared_key: def
+      shared_key: def
 ```

 {% endtab %}
@@ -68,7 +68,7 @@ pipeline:
 {% endtab %}
 {% endtabs %}

-Another example using the `Log_Type_Key` with [record-accessor](https://docs.fluentbit.io/manual/administration/configuring-fluent-bit/classic-mode/record-accessor), which will read the table name (or event type) dynamically from kubernetes label `app`, instead of `Log_Type`:
+The following example uses `Log_Type_Key` with [record-accessor](https://docs.fluentbit.io/manual/administration/configuring-fluent-bit/classic-mode/record-accessor), which reads the table name (or event type) dynamically from the Kubernetes label `app` instead of `Log_Type`:

 {% tabs %}
 {% tab title="fluent-bit.yaml" %}
@@ -77,13 +77,13 @@ Another example using the `Log_Type_Key` with [record-accessor](https://docs.flu
 pipeline:
   inputs:
     - name: cpu
-
+
   outputs:
     - name: azure
       match: '*'
       log_type_key: $kubernetes['labels']['app']
       customer_id: abc
-      shared_key: def
+      shared_key: def
 ```

 {% endtab %}
@@ -102,4 +102,4 @@ pipeline:
 ```

 {% endtab %}
-{% endtabs %}
+{% endtabs %}

pipeline/outputs/bigquery.md

Lines changed: 38 additions & 53 deletions
@@ -1,71 +1,56 @@
 # Google Cloud BigQuery

-BigQuery output plugin is an _experimental_ plugin that allows you to stream records into [Google Cloud BigQuery](https://cloud.google.com/bigquery/) service. The implementation does not support the following, which would be expected in a full production version:
+The _BigQuery_ output plugin is an experimental plugin that lets you stream records
+into the [Google Cloud BigQuery](https://cloud.google.com/bigquery/) service.

-* [Application Default Credentials](https://cloud.google.com/docs/authentication/production).
-* [Data deduplication](https://cloud.google.com/bigquery/streaming-data-into-bigquery) using `insertId`.
-* [Template tables](https://cloud.google.com/bigquery/streaming-data-into-bigquery) using `templateSuffix`.
+The implementation doesn't support the following, which would be expected in a full production version:

-## Google Cloud Configuration
+- [Application Default Credentials](https://cloud.google.com/docs/authentication/production).
+- [Data deduplication](https://cloud.google.com/bigquery/streaming-data-into-bigquery) using `insertId`.
+- [Template tables](https://cloud.google.com/bigquery/streaming-data-into-bigquery) using `templateSuffix`.

-Fluent Bit streams data into an existing BigQuery table using a service account that you specify. Therefore, before using the BigQuery output plugin, you must create a service account, create a BigQuery dataset and table, authorize the service account to write to the table, and provide the service account credentials to Fluent Bit.
+## Google Cloud configuration

-### Creating a Service Account
+Fluent Bit streams data into an existing BigQuery table using a service account that you specify. Before using the BigQuery output plugin, you must:

-To stream data into BigQuery, the first step is to create a Google Cloud service account for Fluent Bit:
+1. Create a [Google Cloud service account](https://cloud.google.com/iam/docs/creating-managing-service-accounts) for Fluent Bit to stream data into BigQuery.
+1. Create a BigQuery dataset.
+   Fluent Bit doesn't create datasets for your data, so you must [create the dataset](https://cloud.google.com/bigquery/docs/datasets) ahead of time. You must also grant the service account `WRITER` permission on the dataset.

-* [Creating a Google Cloud Service Account](https://cloud.google.com/iam/docs/creating-managing-service-accounts)
+   Within the dataset you must create a table for the data to reside in. Use the following instructions for creating your table. Pay close attention to the schema, as it must match the schema of your output JSON. Because BigQuery doesn't allow dots in field names, you must use a filter to change the fields for many of the standard inputs (for example, `mem` or `cpu`); see the filter sketch after this list.
+1. [Create a BigQuery table](https://cloud.google.com/bigquery/docs/tables).
+1. The Fluent Bit BigQuery output plugin uses a JSON credentials file for authentication. [Authorize the service account](https://cloud.google.com/iam/docs/creating-managing-service-account-keys) to write to the table.
+1. Provide the service account credentials to Fluent Bit.
+   With [workload identity federation](https://cloud.google.com/iam/docs/workload-identity-federation), you can grant on-premises or multi-cloud workloads access to Google Cloud resources without using a service account key. It can be used as a more secure alternative to service account credentials. Google Cloud workload identity federation supports several identity providers, but the Fluent Bit BigQuery plugin currently supports Amazon Web Services (AWS) only.
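
As a reference for the filter mentioned in the dataset step above, the following is a minimal, hypothetical sketch that uses the `modify` filter to rename a dotted key from the standard `mem` input before streaming to BigQuery. The key names, credentials path, and BigQuery identifiers are placeholders:

```yaml
pipeline:
  inputs:
    - name: mem

  filters:
    # Rename a dotted field to a BigQuery-compatible column name.
    # Add one rename rule for each dotted key the input produces.
    - name: modify
      match: '*'
      rename: Mem.total mem_total

  outputs:
    - name: bigquery
      match: '*'
      google_service_credentials: /path/to/my_credentials.json
      project_id: my-gcp-project
      dataset_id: my_dataset
      table_id: mem_table
```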

-### Creating a BigQuery Dataset and Table
+You must configure workload identity federation in GCP before using it with Fluent Bit.

-Fluent Bit does not create datasets or tables for your data, so you must create these ahead of time. You must also grant the service account `WRITER` permission on the dataset:
+- [Configuring workload identity federation](https://cloud.google.com/iam/docs/configuring-workload-identity-federation#aws)
+- [Obtaining short-lived credentials with identity federation](https://cloud.google.com/iam/docs/using-workload-identity-federation)

-* [Creating and using datasets](https://cloud.google.com/bigquery/docs/datasets)
+## Configuration parameters

-Within the dataset you will need to create a table for the data to reside in. You can follow the following instructions for creating your table. Pay close attention to the schema. It must match the schema of your output JSON. Unfortunately, since BigQuery does not allow dots in field names, you will need to use a filter to change the fields for many of the standard inputs \(e.g, mem or cpu\).
-
-* [Creating and using tables](https://cloud.google.com/bigquery/docs/tables)
-
-### Retrieving Service Account Credentials
-
-Fluent Bit BigQuery output plugin uses a JSON credentials file for authentication credentials. Download the credentials file by following these instructions:
-
-* [Creating and Managing Service Account Keys](https://cloud.google.com/iam/docs/creating-managing-service-account-keys)
-
-### Workload Identity Federation
-
-Using identity federation, you can grant on-premises or multi-cloud workloads access to Google Cloud resources, without using a service account key. It can be used as a more secure alternative to service account credentials. Google Cloud's workload identity federation supports several identity providers (see documentation) but Fluent Bit BigQuery plugin currently supports Amazon Web Services (AWS) only.
-
-* [Workload Identity Federation overview](https://cloud.google.com/iam/docs/workload-identity-federation)
-
-You must configure workload identity federation in GCP before using it with Fluent Bit.
-
-* [Configuring workload identity federation](https://cloud.google.com/iam/docs/configuring-workload-identity-federation#aws)
-* [Obtaining short-lived credentials with identity federation](https://cloud.google.com/iam/docs/using-workload-identity-federation)
-
-## Configurations Parameters
-
-| Key | Description | default |
+| Key | Description | Default |
 | :--- | :--- | :--- |
-| google\_service\_credentials | Absolute path to a Google Cloud credentials JSON file. | Value of the environment variable _$GOOGLE\_SERVICE\_CREDENTIALS_ |
-| project\_id | The project id containing the BigQuery dataset to stream into. | The value of the `project_id` in the credentials file |
-| dataset\_id | The dataset id of the BigQuery dataset to write into. This dataset must exist in your project. | |
-| table\_id | The table id of the BigQuery table to write into. This table must exist in the specified dataset and the schema must match the output. | |
-| skip\_invalid\_rows | Insert all valid rows of a request, even if invalid rows exist. The default value is false, which causes the entire request to fail if any invalid rows exist. | Off |
-| ignore\_unknown\_values | Accept rows that contain values that do not match the schema. The unknown values are ignored. Default is false, which treats unknown values as errors. | Off |
-| enable\_workload\_identity\_federation | Enables workload identity federation as an alternative authentication method. Cannot be used with service account credentials file or environment variable. AWS is the only identity provider currently supported. | Off |
-| aws\_region | Used to construct a regional endpoint for AWS STS to verify AWS credentials obtained by Fluent Bit. Regional endpoints are recommended by AWS. | |
-| project\_number | GCP project number where the identity provider was created. Used to construct the full resource name of the identity provider. | |
-| pool\_id | GCP workload identity pool where the identity provider was created. Used to construct the full resource name of the identity provider. | |
-| provider\_id | GCP workload identity provider. Used to construct the full resource name of the identity provider. Currently only AWS accounts are supported. | |
-| google\_service\_account | Email address of the Google service account to impersonate. The workload identity provider must have permissions to impersonate this service account, and the service account must have permissions to access Google BigQuery resources (e.g. `write` access to tables) | |
-| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` |
+| `google_service_credentials` | Absolute path to a Google Cloud credentials JSON file. | Value of the environment variable `$GOOGLE_SERVICE_CREDENTIALS`. |
+| `project_id` | The project ID containing the BigQuery dataset to stream into. | Value of the `project_id` in the credentials file. |
+| `dataset_id` | The dataset ID of the BigQuery dataset to write into. This dataset must exist in your project. | _none_ |
+| `table_id` | The table ID of the BigQuery table to write into. This table must exist in the specified dataset and the schema must match the output. | _none_ |
+| `skip_invalid_rows` | Insert all valid rows of a request, even if invalid rows exist. The default value is `Off`, which causes the entire request to fail if any invalid rows exist. | `Off` |
+| `ignore_unknown_values` | Accept rows that contain values that don't match the schema. The unknown values are ignored. Default is `Off`, which treats unknown values as errors. | `Off` |
+| `enable_workload_identity_federation` | Enables workload identity federation as an alternative authentication method. Can't be used with a service account credentials file or environment variable. AWS is the only identity provider currently supported. | `Off` |
+| `aws_region` | Used to construct a regional endpoint for AWS STS to verify AWS credentials obtained by Fluent Bit. Regional endpoints are recommended by AWS. | _none_ |
+| `project_number` | GCP project number where the identity provider was created. Used to construct the full resource name of the identity provider. | _none_ |
+| `pool_id` | GCP workload identity pool where the identity provider was created. Used to construct the full resource name of the identity provider. | _none_ |
+| `provider_id` | GCP workload identity provider. Used to construct the full resource name of the identity provider. Currently only AWS accounts are supported. | _none_ |
+| `google_service_account` | The email address of the Google service account to impersonate. The workload identity provider must have permissions to impersonate this service account, and the service account must have permissions to access Google BigQuery resources (for example, `write` access to tables). | _none_ |
+| `workers` | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` |

 See Google's [official documentation](https://cloud.google.com/bigquery/docs/reference/rest/v2/tabledata/insertAll) for further details.
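
As a reference for the workload identity federation parameters in the table above, the following is a minimal, hypothetical sketch of a `bigquery` output authenticating through an AWS identity provider. Every identifier is a placeholder, and the pool, provider, and service account must already exist in GCP:

```yaml
pipeline:
  outputs:
    - name: bigquery
      match: '*'
      project_id: my-gcp-project
      dataset_id: my_dataset
      table_id: my_table
      # Use workload identity federation instead of a credentials file.
      enable_workload_identity_federation: on
      aws_region: us-east-1
      project_number: '123456789012'
      pool_id: my-identity-pool
      provider_id: my-aws-provider
      google_service_account: [email protected]
```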

-## Configuration File
+## Configuration file

-If you are using a _Google Cloud Credentials File_, the following configuration is enough to get you started:
+If you are using a Google Cloud credentials file, the following configuration will get you started:

 {% tabs %}
 {% tab title="fluent-bit.yaml" %}
@@ -75,7 +60,7 @@ pipeline:
   inputs:
     - name: dummy
       tag: dummy
-
+
   outputs:
     - name: bigquery
       match: '*'
@@ -99,4 +84,4 @@ pipeline:
 ```

 {% endtab %}
-{% endtabs %}
+{% endtabs %}
