pipeline/inputs/tail.md: 12 additions & 0 deletions
@@ -37,6 +37,7 @@ The plugin supports the following configuration parameters:
|`static_batch_size`| Set the maximum number of bytes to process per iteration for the monitored static files (files that already exist upon Fluent Bit start). |`50M`|
|`file_cache_advise`| Set the `posix_fadvise` in `POSIX_FADV_DONTNEED` mode. This reduces the usage of the kernel file cache. This option is ignored if not running on Linux. |`on`|
|`threaded`| Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). |`false`|
+|`Unicode.Encoding`| Set the Unicode character encoding of the file data. This parameter requires two-byte aligned chunk and buffer sizes. If the configured sizes aren't two-byte aligned, Fluent Bit aligns them automatically to avoid splitting characters at read boundaries. Supported values: `UTF-16LE`, `UTF-16BE`, and `auto`. |`none`|
## Buffers and memory management
@@ -77,6 +78,17 @@ If no database file is present, positioning behavior depends on the value of `re
The database file essentially stores `inode=offset` pairs, so it should be unique per instance of the plugin. For example, if you have two tail inputs, use a separate `db` file for each. That way, each tail input can independently track its own state.
+{% hint style="info" %}
+`Unicode.Encoding` depends on the simdutf library, which is written in C++11 or later.
+As a result, older platforms don't support this feature.
+In addition, `Unicode.Encoding auto` doesn't work reliably in every case.
+This is because automatic detection sometimes guesses the wrong character encoding.
+
+Use `UTF-16LE` or `UTF-16BE` if the target file encoding is known beforehand.
+This parameter requires two-byte aligned chunk and buffer sizes.
+If the configured sizes aren't two-byte aligned, Fluent Bit aligns them automatically to avoid splitting characters at read boundaries.
+{% endhint %}
+
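To illustrate the two points in the hint, here is a minimal Python sketch. It isn't Fluent Bit code and doesn't use simdutf: `align_down` and `read_chunks` are hypothetical helpers that only mimic reading a file in fixed-size chunks. The sketch assumes UTF-16LE input made up of characters that each fit in a single two-byte code unit.

```python
def align_down(size: int, width: int = 2) -> int:
    """Round size down to a multiple of width (hypothetical helper)."""
    return size - (size % width)


def read_chunks(data: bytes, chunk_size: int):
    """Yield consecutive slices of data, chunk_size bytes each (the last may be shorter)."""
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


text = "héllo wörld"
raw = text.encode("utf-16-le")  # each character here is one two-byte code unit

# An odd chunk size cuts a code unit in half, so the chunk can't be decoded alone.
odd_chunks = list(read_chunks(raw, 5))
try:
    odd_chunks[0].decode("utf-16-le")
    odd_ok = True
except UnicodeDecodeError:
    odd_ok = False

# Aligning the chunk size to two bytes keeps every code unit intact.
even_chunks = list(read_chunks(raw, align_down(5)))
decoded = "".join(chunk.decode("utf-16-le") for chunk in even_chunks)

# Without a byte order mark, the same two bytes decode without error in both
# byte orders, which is one way automatic detection can guess wrong.
as_le = b"A\x00".decode("utf-16-le")
as_be = b"A\x00".decode("utf-16-be")
```

With a five-byte chunk the first slice fails to decode on its own, while the aligned four-byte chunks decode independently and join back into the original text. The last two lines show the same bytes producing a different character in each byte order, which is why an explicit `UTF-16LE` or `UTF-16BE` setting is safer when the encoding is known.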
## Monitor a large number of files
To monitor a large number of files, you can increase the `inotify` settings in your Linux environment by modifying the following `sysctl` parameters: