diff --git a/pipeline/outputs/cloudwatch.md b/pipeline/outputs/cloudwatch.md index 50cc7d296..a63e96621 100644 --- a/pipeline/outputs/cloudwatch.md +++ b/pipeline/outputs/cloudwatch.md @@ -45,15 +45,33 @@ In order to send records into Amazon Cloudwatch, you can run the plugin from the The **cloudwatch** plugin, can read the parameters from the command line through the **-p** argument (property), e.g: -``` +```shell fluent-bit -i cpu -o cloudwatch_logs -p log_group_name=group -p log_stream_name=stream -p region=us-west-2 -m '*' -f 1 ``` ### Configuration File -In your main configuration file append the following _Output_ section: - +In your main configuration file append the following: + +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + + outputs: + - name: cloudwatch_logs + match: '*' + region: us-east-1 + log_group_name: fluent-bit-cloudwatch + log_stream_prefix: from-fluent-bit- + auto_create_group: on ``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + +```text [OUTPUT] Name cloudwatch_logs Match * @@ -62,22 +80,56 @@ In your main configuration file append the following _Output_ section: log_stream_prefix from-fluent-bit- auto_create_group On ``` -#### Intergration with Localstack (Cloudwatch Logs) -For an instance of Localstack running at `http://localhost:4566`, the following configuration needs to be added to the `[OUTPUT]` section: +{% endtab %} +{% endtabs %} + +#### Integration with Localstack (Cloudwatch Logs) + +For an instance of `Localstack` running at `http://localhost:4566`, the following configuration is needed: + +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + + outputs: + - name: cloudwatch_logs + match: '*' + region: us-east-1 + log_group_name: fluent-bit-cloudwatch + log_stream_prefix: from-fluent-bit- + auto_create_group: on + endpoint: localhost + port: 4566 +``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} ```text -endpoint localhost -port 4566 +[OUTPUT] + Name cloudwatch_logs + Match * + 
region us-east-1 + log_group_name fluent-bit-cloudwatch + log_stream_prefix from-fluent-bit- + auto_create_group On + endpoint localhost + port 4566 ``` +{% endtab %} +{% endtabs %} + Any testing credentials can be exported as local variables, such as `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`. ### Permissions The following AWS IAM permissions are required to use this plugin: -``` +```json { "Version": "2012-10-17", "Statement": [{ @@ -100,7 +152,7 @@ Here is an example usage, for a common use case- templating log group and stream Recall that the kubernetes filter can add metadata which will look like the following: -``` +```text kubernetes: { annotations: { "kubernetes.io/psp": "eks.privileged" @@ -121,9 +173,29 @@ kubernetes: { Using record\_accessor, we can build a template based on this object. -Here is our output configuration: - +Here is our configuration: + +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + + outputs: + - name: cloudwatch_logs + match: '*' + region: us-east-1 + log_group_name: fallback-group + log_stream_prefix: fallback-stream + auto_create_group: on + log_group_template: application-logs-$kubernetes['host'].$kubernetes['namespace_name'] + log_stream_template: $kubernetes['pod_name'].$kubernetes['container_name'] ``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + +```text [OUTPUT] Name cloudwatch_logs Match * @@ -135,17 +207,20 @@ Here is our output configuration: log_stream_template $kubernetes['pod_name'].$kubernetes['container_name'] ``` +{% endtab %} +{% endtabs %} + With the above kubernetes metadata, the log group name will be `application-logs-ip-10-1-128-166.us-east-2.compute.internal.my-namespace`. And the log stream name will be `myapp-5468c5d4d7-n2swr.myapp`. 
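To make the substitution concrete, here is a minimal Python sketch of how such a template resolves against the kubernetes metadata shown above. This is an illustration only, not Fluent Bit's actual record_accessor code:

```python
import re

def render_template(template, record):
    """Resolve $root['key'] variables in a template against a record dict.

    A missing key raises KeyError, which corresponds to the fallback case
    where Fluent Bit reverts to log_group_name / log_stream_prefix.
    """
    def lookup(match):
        root, key = match.group(1), match.group(2)
        return str(record[root][key])
    return re.sub(r"\$(\w+)\['(\w+)'\]", lookup, template)

# Sample of the kubernetes metadata added by the kubernetes filter.
record = {
    "kubernetes": {
        "host": "ip-10-1-128-166.us-east-2.compute.internal",
        "namespace_name": "my-namespace",
        "pod_name": "myapp-5468c5d4d7-n2swr",
        "container_name": "myapp",
    }
}

print(render_template(
    "application-logs-$kubernetes['host'].$kubernetes['namespace_name']",
    record))
# application-logs-ip-10-1-128-166.us-east-2.compute.internal.my-namespace
print(render_template(
    "$kubernetes['pod_name'].$kubernetes['container_name']", record))
# myapp-5468c5d4d7-n2swr.myapp
```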
If the kubernetes structure is not found in the log record, then the `log_group_name` and `log_stream_prefix` will be used instead, and Fluent Bit will log an error like: -``` +```text [2022/06/30 06:09:29] [ warn] [record accessor] translation failed, root key=kubernetes ``` #### Limitations of record\_accessor syntax -Notice in the example above, that the template values are separated by dot characters. This is important; the Fluent Bit record\_accessor library has a limitation in the characters that can separate template variables- only dots and commas (`.` and `,`) can come after a template variable. This is because the templating library must parse the template and determine the end of a variable. +Notice in the example above, that the template values are separated by dot characters. This is important; the Fluent Bit record\_accessor library has a limitation in the characters that can separate template variables; only dots and commas (`.` and `,`) can come after a template variable. This is because the templating library must parse the template and determine the end of a variable. Assume that your log records contain the metadata keys `container_name` and `task`. The following would be invalid templates because the two template variables are not separated by commas or dots: @@ -168,20 +243,51 @@ And the following are valid since they only contain one template variable with n ### Metrics Tutorial -Fluent Bit has different input plugins (cpu, mem, disk, netif) to collect host resource usage metrics. `cloudwatch_logs` output plugin can be used to send these host metrics to CloudWatch in Embedded Metric Format (EMF). If data comes from any of the above mentioned input plugins, `cloudwatch_logs` output plugin will convert them to EMF format and sent to CloudWatch as JSON log. Additionally, if we set `json/emf` as the value of `log_format` config option, CloudWatch will extract custom metrics from embedded JSON payload. 
+Fluent Bit has different input plugins (cpu, mem, disk, netif) to collect host resource usage metrics. The `cloudwatch_logs` output plugin can be used to send these host metrics to CloudWatch in Embedded Metric Format (EMF). If data comes from any of the above-mentioned input plugins, the `cloudwatch_logs` output plugin will convert them to EMF format and send them to CloudWatch as a JSON log. Additionally, if we set `json/emf` as the value of the `log_format` config option, CloudWatch will extract custom metrics from the embedded JSON payload. Note: Right now, only `cpu` and `mem` metrics can be sent to CloudWatch. For using the `mem` input plugin and sending memory usage metrics to CloudWatch, we can consider the following example config file. Here, we use the `aws` filter which adds `ec2_instance_id` and `az` (availability zone) to the log records. Later, in the output config section, we set `ec2_instance_id` as our metric dimension. +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +service: + log_level: info + +pipeline: + inputs: + - name: mem + tag: mem + + filters: + - name: aws + match: '*' + + outputs: + - name: cloudwatch_logs + match: '*' + region: us-west-2 + log_stream_name: fluent-bit-cloudwatch + log_group_name: fluent-bit-cloudwatch + log_format: json/emf + metric_namespace: fluent-bit-metrics + metric_dimensions: ec2_instance_id + auto_create_group: true ``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + +```text [SERVICE] Log_Level info [INPUT] Name mem Tag mem - + [FILTER] Name aws Match * @@ -189,18 +295,59 @@ For using the `mem` input plugin and sending memory usage metrics to CloudWatch, [OUTPUT] Name cloudwatch_logs Match * + region us-west-2 log_stream_name fluent-bit-cloudwatch log_group_name fluent-bit-cloudwatch - region us-west-2 log_format json/emf metric_namespace fluent-bit-metrics metric_dimensions ec2_instance_id auto_create_group true ``` +{% endtab %} +{% endtabs %} + The following config will set two dimensions to all of our metrics- 
`ec2_instance_id` and `az`. +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +service: + log_level: info + +pipeline: + inputs: + - name: mem + tag: mem + + filters: + - name: aws + match: '*' + + outputs: + - name: cloudwatch_logs + match: '*' + region: us-west-2 + log_stream_name: fluent-bit-cloudwatch + log_group_name: fluent-bit-cloudwatch + log_format: json/emf + metric_namespace: fluent-bit-metrics + metric_dimensions: ec2_instance_id,az + auto_create_group: true ``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + +```text +[SERVICE] + Log_Level info + +[INPUT] + Name mem + Tag mem + [FILTER] Name aws Match * @@ -208,15 +355,18 @@ The following config will set two dimensions to all of our metrics- `ec2_instanc [OUTPUT] Name cloudwatch_logs Match * + region us-west-2 log_stream_name fluent-bit-cloudwatch log_group_name fluent-bit-cloudwatch - region us-west-2 log_format json/emf metric_namespace fluent-bit-metrics metric_dimensions ec2_instance_id,az auto_create_group true ``` +{% endtab %} +{% endtabs %} + ### AWS for Fluent Bit Amazon distributes a container image with Fluent Bit and these plugins. @@ -231,19 +381,19 @@ Amazon distributes a container image with Fluent Bit and these plugins. Our images are available in Amazon ECR Public Gallery. 
You can download images with different tags by following command: -``` +```shell docker pull public.ecr.aws/aws-observability/aws-for-fluent-bit: ``` For example, you can pull the image with latest version by: -``` +```shell docker pull public.ecr.aws/aws-observability/aws-for-fluent-bit:latest ``` If you see errors for image pull limits, try log into public ECR with your AWS credentials: -``` +```shell aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws ``` @@ -257,8 +407,8 @@ You can check the [Amazon ECR Public official doc](https://docs.aws.amazon.com/A You can use our SSM Public Parameters to find the Amazon ECR image URI in your region: -``` +```shell aws ssm get-parameters-by-path --path /aws/service/aws-for-fluent-bit/ ``` -For more see [the AWS for Fluent Bit github repo](https://github.com/aws/aws-for-fluent-bit#public-images). \ No newline at end of file +For more see [the AWS for Fluent Bit GitHub repo](https://github.com/aws/aws-for-fluent-bit#public-images). \ No newline at end of file diff --git a/pipeline/outputs/firehose.md b/pipeline/outputs/firehose.md index 9a97ae307..fb3d9c4cb 100644 --- a/pipeline/outputs/firehose.md +++ b/pipeline/outputs/firehose.md @@ -4,8 +4,6 @@ description: Send logs to Amazon Kinesis Firehose # Amazon Kinesis Data Firehose -![](../../.gitbook/assets/image%20%288%29.png) - The Amazon Kinesis Data Firehose output plugin allows to ingest your records into the [Firehose](https://aws.amazon.com/kinesis/data-firehose/) service. This is the documentation for the core Fluent Bit Firehose plugin written in C. It can replace the [aws/amazon-kinesis-firehose-for-fluent-bit](https://github.com/aws/amazon-kinesis-firehose-for-fluent-bit) Golang Fluent Bit plugin released last year. The Golang plugin was named `firehose`; this new high performance and highly efficient firehose plugin is called `kinesis_firehose` to prevent conflicts/confusion. 
@@ -38,13 +36,29 @@ In order to send records into Amazon Kinesis Data Firehose, you can run the plug The **firehose** plugin, can read the parameters from the command line through the **-p** argument \(property\), e.g: -```text +```shell fluent-bit -i cpu -o kinesis_firehose -p delivery_stream=my-stream -p region=us-west-2 -m '*' -f 1 ``` ### Configuration File -In your main configuration file append the following _Output_ section: +In your main configuration file append the following: + +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + + outputs: + - name: kinesis_firehose + match: '*' + region: us-east-1 + delivery_stream: my-stream +``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} ```text [OUTPUT] @@ -54,11 +68,14 @@ In your main configuration file append the following _Output_ section: delivery_stream my-stream ``` +{% endtab %} +{% endtabs %} + ### Permissions The following AWS IAM permissions are required to use this plugin: -``` +```json { "Version": "2012-10-17", "Statement": [{ @@ -77,6 +94,23 @@ Fluent Bit 1.7 adds a new feature called `workers` which enables outputs to have Example: +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + + outputs: + - name: kinesis_firehose + match: '*' + region: us-east-1 + delivery_stream: my-stream + workers: 2 +``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + +```text [OUTPUT] Name kinesis_firehose @@ -86,8 +120,15 @@ Example: workers 2 ``` +{% endtab %} +{% endtabs %} + +{% hint style="info" %} + If you enable a single worker, you are enabling a dedicated thread for your Firehose output. We recommend starting without workers, evaluating the performance, and then adding workers one at a time until you reach your desired/needed throughput. For most users, no workers or a single worker will be sufficient. +{% endhint %} + ### AWS for Fluent Bit Amazon distributes a container image with Fluent Bit and these plugins. 
@@ -102,19 +143,19 @@ Amazon distributes a container image with Fluent Bit and these plugins. Our images are available in Amazon ECR Public Gallery. You can download images with different tags by following command: -```text +```shell docker pull public.ecr.aws/aws-observability/aws-for-fluent-bit: ``` For example, you can pull the image with latest version by: -```text +```shell docker pull public.ecr.aws/aws-observability/aws-for-fluent-bit:latest ``` If you see errors for image pull limits, try log into public ECR with your AWS credentials: -```text +```shell aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws ``` @@ -128,8 +169,8 @@ You can check the [Amazon ECR Public official doc](https://docs.aws.amazon.com/A You can use our SSM Public Parameters to find the Amazon ECR image URI in your region: -```text +```shell aws ssm get-parameters-by-path --path /aws/service/aws-for-fluent-bit/ ``` -For more see [the AWS for Fluent Bit github repo](https://github.com/aws/aws-for-fluent-bit#public-images). +For more see [the AWS for Fluent Bit GitHub repo](https://github.com/aws/aws-for-fluent-bit#public-images). \ No newline at end of file diff --git a/pipeline/outputs/kinesis.md b/pipeline/outputs/kinesis.md index 6a4a48cab..9318b4e8d 100644 --- a/pipeline/outputs/kinesis.md +++ b/pipeline/outputs/kinesis.md @@ -4,8 +4,6 @@ description: Send logs to Amazon Kinesis Streams # Amazon Kinesis Data Streams -![](../../.gitbook/assets/image%20%288%29.png) - The Amazon Kinesis Data Streams output plugin allows to ingest your records into the [Kinesis](https://aws.amazon.com/kinesis/data-streams/) service. This is the documentation for the core Fluent Bit Kinesis plugin written in C. It has all the core features of the [aws/amazon-kinesis-streams-for-fluent-bit](https://github.com/aws/amazon-kinesis-streams-for-fluent-bit) Golang Fluent Bit plugin released in 2019. 
The Golang plugin was named `kinesis`; this new high performance and highly efficient kinesis plugin is called `kinesis_streams` to prevent conflicts/confusion. @@ -40,13 +38,29 @@ In order to send records into Amazon Kinesis Data Streams, you can run the plugi The **kinesis\_streams** plugin, can read the parameters from the command line through the **-p** argument \(property\), e.g: -```text +```shell fluent-bit -i cpu -o kinesis_streams -p stream=my-stream -p region=us-west-2 -m '*' -f 1 ``` ### Configuration File -In your main configuration file append the following _Output_ section: +In your main configuration file append the following: + +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + + outputs: + - name: kinesis_streams + match: '*' + region: us-east-1 + stream: my-stream +``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} ```text [OUTPUT] @@ -56,11 +70,14 @@ In your main configuration file append the following _Output_ section: stream my-stream ``` +{% endtab %} +{% endtabs %} + ### Permissions The following AWS IAM permissions are required to use this plugin: -``` +```json { "Version": "2012-10-17", "Statement": [{ @@ -87,19 +104,19 @@ Amazon distributes a container image with Fluent Bit and these plugins. Our images are available in Amazon ECR Public Gallery. 
You can download images with different tags by following command: -```text +```shell docker pull public.ecr.aws/aws-observability/aws-for-fluent-bit: ``` For example, you can pull the image with latest version by: -```text +```shell docker pull public.ecr.aws/aws-observability/aws-for-fluent-bit:latest ``` If you see errors for image pull limits, try log into public ECR with your AWS credentials: -```text +```shell aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws ``` @@ -113,8 +130,8 @@ You can check the [Amazon ECR Public official doc](https://docs.aws.amazon.com/A You can use our SSM Public Parameters to find the Amazon ECR image URI in your region: -```text +```shell aws ssm get-parameters-by-path --path /aws/service/aws-for-fluent-bit/ ``` -For more see [the AWS for Fluent Bit github repo](https://github.com/aws/aws-for-fluent-bit#public-images). +For more see [the AWS for Fluent Bit github repo](https://github.com/aws/aws-for-fluent-bit#public-images). \ No newline at end of file diff --git a/pipeline/outputs/s3.md b/pipeline/outputs/s3.md index e1346bd0a..0f3f2e512 100644 --- a/pipeline/outputs/s3.md +++ b/pipeline/outputs/s3.md @@ -29,11 +29,13 @@ See [AWS Credentials](https://github.com/fluent/fluent-bit-docs/tree/43c4fe134611da471e706b0edb2f9acd7cdfdbc3/administration/aws-credentials.md) for details about fetching AWS credentials. -{% hint style="info" %} +{% hint style="warning" %} + The [Prometheus success/retry/error metrics values](administration/monitoring.md) output by the built-in HTTP server in Fluent Bit are meaningless for S3 output. S3 has its own buffering and retry mechanisms. The Fluent Bit AWS S3 maintainers apologize for this feature gap; you can [track issue progress on GitHub](https://github.com/fluent/fluent-bit/issues/6141). 
+ {% endhint %} ## Configuration Parameters @@ -79,7 +81,7 @@ properties available and general configuration, refer to The plugin requires the following AWS IAM permissions: -```text +```json { "Version": "2012-10-17", "Statement": [{ @@ -148,7 +150,26 @@ inject the tag into the S3 key using the following syntax: In the following example, assume the date is `January 1st, 2020 00:00:00` and the tag associated with the logs in question is `my_app_name-logs.prod`. -```python +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + + outputs: + - name: s3 + match: '*' + bucket: my-bucket + region: us-west-2 + total_file_size: 250M + s3_key_format: '/$TAG[2]/$TAG[0]/%Y/%m/%d/%H/%M/%S/$UUID.gz' + s3_key_format_tag_delimiters: '.-' +``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + +```text [OUTPUT] Name s3 Match * @@ -159,6 +180,9 @@ associated with the logs in question is `my_app_name-logs.prod`. s3_key_format_tag_delimiters .- ``` +{% endtab %} +{% endtabs %} + With the delimiters as `.` and `-`, the tag splits into parts as follows: - `$TAG[0]` = `my_app_name` @@ -198,7 +222,27 @@ random UUID appended to it. Disabled this with `static_file_path On`. 
This example attempts to set a `.gz` extension without specifying `$UUID`: -```python +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + + outputs: + - name: s3 + match: '*' + bucket: my-bucket + region: us-west-2 + total_file_size: 50M + use_put_object: off + compression: gzip + s3_key_format: '/$TAG/%Y/%m/%d/%H_%M_%S.gz' +``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + +```text [OUTPUT] Name s3 Match * @@ -210,6 +254,9 @@ This example attempts to set a `.gz` extension without specifying `$UUID`: s3_key_format /$TAG/%Y/%m/%d/%H_%M_%S.gz ``` +{% endtab %} +{% endtabs %} + In the case where pending data is uploaded on shutdown, if the tag was `app`, the S3 key in the S3 bucket might be: @@ -224,7 +271,28 @@ There are two ways of disabling this behavior: - Use `static_file_path`: - ```python + {% tabs %} + {% tab title="fluent-bit.yaml" %} + + ```yaml + pipeline: + + outputs: + - name: s3 + match: '*' + bucket: my-bucket + region: us-west-2 + total_file_size: 50M + use_put_object: off + compression: gzip + s3_key_format: '/$TAG/%Y/%m/%d/%H_%M_%S.gz' + static_file_path: on + ``` + + {% endtab %} + {% tab title="fluent-bit.conf" %} + + ```text [OUTPUT] Name s3 Match * @@ -236,20 +304,46 @@ There are two ways of disabling this behavior: s3_key_format /$TAG/%Y/%m/%d/%H_%M_%S.gz static_file_path On ``` - -- Explicitly define where the random UUID will go in the S3 key format: - - ```python - [OUTPUT] - Name s3 - Match * - bucket my-bucket - region us-west-2 - total_file_size 50M - use_put_object Off - compression gzip - s3_key_format /$TAG/%Y/%m/%d/%H_%M_%S/$UUID.gz - ``` + + {% endtab %} + {% endtabs %} + + - Explicitly define where the random UUID will go in the S3 key format: + + {% tabs %} + {% tab title="fluent-bit.yaml" %} + + ```yaml + pipeline: + + outputs: + - name: s3 + match: '*' + bucket: my-bucket + region: us-west-2 + total_file_size: 50M + use_put_object: off + compression: gzip + s3_key_format: 
'/$TAG/%Y/%m/%d/%H_%M_%S/$UUID.gz' + ``` + + {% endtab %} + {% tab title="fluent-bit.conf" %} + + ```text + [OUTPUT] + Name s3 + Match * + bucket my-bucket + region us-west-2 + total_file_size 50M + use_put_object Off + compression gzip + s3_key_format /$TAG/%Y/%m/%d/%H_%M_%S/$UUID.gz + ``` + + {% endtab %} + {% endtabs %} ## Reliability @@ -285,17 +379,39 @@ in the event Fluent Bit is killed unexpectedly. The following settings are recommended for this use case: -```python +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + + outputs: + - name: s3 + match: '*' + bucket: your-bucket + region: us-east-1 + total_file_size: 1M + upload_timeout: 1m + use_put_object: on +``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + +```text [OUTPUT] - Name s3 + Name s3 Match * - bucket your-bucket - region us-east-1 - total_file_size 1M - upload_timeout 1m - use_put_object On + bucket your-bucket + region us-east-1 + total_file_size 1M + upload_timeout 1m + use_put_object On ``` +{% endtab %} +{% endtabs %} + ## S3 Multipart Uploads With `use_put_object Off` (default), S3 will attempt to send files using multipart @@ -393,14 +509,33 @@ at `localhost:9000`, and create a bucket of `your-bucket`. Example: -```python +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + + outputs: + - name: s3 + match: '*' + bucket: your-bucket + endpoint: http://localhost:9000 +``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + +```text [OUTPUT] - Name s3 - Match * - bucket your-bucket - endpoint http://localhost:9000 + Name s3 + Match * + bucket your-bucket + endpoint http://localhost:9000 ``` +{% endtab %} +{% endtabs %} + The records store in the MinIO server. ## Usage with Google Cloud @@ -410,14 +545,33 @@ those keys for `access-key` and `access-secret`. 
Example: -```python +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + + outputs: + - name: s3 + match: '*' + bucket: your-bucket + endpoint: https://storage.googleapis.com +``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + +```text [OUTPUT] - Name s3 - Match * - bucket your-bucket - endpoint https://storage.googleapis.com + Name s3 + Match * + bucket your-bucket + endpoint https://storage.googleapis.com ``` +{% endtab %} +{% endtabs %} + ## Get Started To send records into Amazon S3, you can run the plugin from the command line or @@ -427,7 +581,7 @@ through the configuration file. The S3 plugin reads parameters from the command line through the `-p` argument: -```text +```shell fluent-bit -i cpu -o s3 -p bucket=my-bucket -p region=us-west-2 -p -m '*' -f 1 ``` @@ -435,31 +589,76 @@ fluent-bit -i cpu -o s3 -p bucket=my-bucket -p region=us-west-2 -p -m '*' -f 1 In your main configuration file append the following `Output` section: -```python +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + + outputs: + - name: s3 + match: '*' + bucket: your-bucket + region: us-east-1 + store_dir: /home/ec2-user/buffer + total_file_size: 50M + upload_timeout: 10m +``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + +```text [OUTPUT] - Name s3 - Match * - bucket your-bucket - region us-east-1 - store_dir /home/ec2-user/buffer + Name s3 + Match * + bucket your-bucket + region us-east-1 + store_dir /home/ec2-user/buffer total_file_size 50M - upload_timeout 10m + upload_timeout 10m ``` +{% endtab %} +{% endtabs %} + An example using `PutObject` instead of multipart: -```python +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + + outputs: + - name: s3 + match: '*' + bucket: your-bucket + region: us-east-1 + store_dir: /home/ec2-user/buffer + use_put_object: on + total_file_size: 10M + upload_timeout: 10m +``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + +```text [OUTPUT] - Name s3 - Match * - bucket 
your-bucket - region us-east-1 - store_dir /home/ec2-user/buffer - use_put_object On + Name s3 + Match * + bucket your-bucket + region us-east-1 + store_dir /home/ec2-user/buffer + use_put_object On total_file_size 10M - upload_timeout 10m + upload_timeout 10m ``` +{% endtab %} +{% endtabs %} + ## AWS for Fluent Bit Amazon distributes a container image with Fluent Bit and plugins. @@ -475,20 +674,20 @@ Images are available in the Amazon ECR Public Gallery as You can download images with different tags using the following command: -```text +```shell docker pull public.ecr.aws/aws-observability/aws-for-fluent-bit: ``` For example, you can pull the image with latest version with: -```text +```shell docker pull public.ecr.aws/aws-observability/aws-for-fluent-bit:latest ``` If you see errors for image pull limits, try signing in to public ECR with your AWS credentials: -```text +```shell aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws ``` @@ -505,7 +704,7 @@ is also available from the Docker Hub. Use Fluent Bit SSM Public Parameters to find the Amazon ECR image URI in your region: -```text +```shell aws ssm get-parameters-by-path --path /aws/service/aws-for-fluent-bit/ ``` @@ -523,7 +722,7 @@ default, and has a dependency on a shared version of `libarrow`. To use this feature, `FLB_ARROW` must be turned on at compile time. Use the following commands: -```text +```shell cd build/ cmake -DFLB_ARROW=On .. cmake --build . 
@@ -533,7 +732,27 @@ After being compiled, Fluent Bit can upload incoming data to S3 in Apache Arrow For example: -```python +{% tabs %} +{% tab title="fluent-bit.yaml" %} + +```yaml +pipeline: + inputs: + - name: cpu + + outputs: + - name: s3 + bucket: your-bucket-name + total_file_size: 1M + use_put_object: on + upload_timeout: 60s + compression: arrow +``` + +{% endtab %} +{% tab title="fluent-bit.conf" %} + +```text [INPUT] Name cpu @@ -546,8 +765,10 @@ For example: Compression arrow ``` -Setting `Compression` to `arrow` makes Fluent Bit convert payload into Apache Arrow -format. +{% endtab %} +{% endtabs %} + +Setting `Compression` to `arrow` makes Fluent Bit convert payload into Apache Arrow format. Load, analyze, and process stored data using popular data processing tools such as Python pandas, Apache Spark and Tensorflow. @@ -568,4 +789,4 @@ The following example uses `pyarrow` to analyze the uploaded data: 2 2021-04-27T09:33:55.539305Z 1.0 0.0 1.0 1.0 0.0 1.0 3 2021-04-27T09:33:56.539430Z 0.0 0.0 0.0 0.0 0.0 0.0 4 2021-04-27T09:33:57.539803Z 0.0 0.0 0.0 0.0 0.0 0.0 -``` +``` \ No newline at end of file