Closed
1 change: 1 addition & 0 deletions configuration/README.md
@@ -13,4 +13,5 @@ Note that all configuration files use a specific fixed and strict schema, please
* [TLS / SSL](tls_ssl.md)
* [Backpressure](backpressure.md)
* [Memory Usage](memory_usage.md)
* [UTF-8 Encoding](encoder.md)

31 changes: 31 additions & 0 deletions configuration/encoder.md
@@ -0,0 +1,31 @@
# Encoding input to UTF-8

Some input plugins converting input to UTF-8 from a specified encoding. The current set of supported encodings are:
Review comment (Contributor Author): converting -> convert

* [iso-8859-1](https://en.wikipedia.org/wiki/ISO/IEC_8859-1) Latin-1 Western European
* [iso-8859-2](https://en.wikipedia.org/wiki/ISO/IEC_8859-2) Latin-2 East European
* [iso-8859-3](https://en.wikipedia.org/wiki/ISO/IEC_8859-3) Latin-3 South European
* [iso-8859-4](https://en.wikipedia.org/wiki/ISO/IEC_8859-4) Latin-4 North European
* [iso-8859-5](https://en.wikipedia.org/wiki/ISO/IEC_8859-5) Part 5: Latin/Cyrillic
* [iso-8859-6](https://en.wikipedia.org/wiki/ISO/IEC_8859-6) Part 6: Latin/Arabic
* [iso-8859-7](https://en.wikipedia.org/wiki/ISO/IEC_8859-7) Part 7: Latin/Greek
* [iso-8859-8](https://en.wikipedia.org/wiki/ISO/IEC_8859-8) Part 8: Latin/Hebrew
* [iso-8859-9](https://en.wikipedia.org/wiki/ISO/IEC_8859-9) Latin-5 Turkish
* [iso-8859-10](https://en.wikipedia.org/wiki/ISO/IEC_8859-10) Latin-6 Nordic
* [iso-8859-11](https://en.wikipedia.org/wiki/ISO/IEC_8859-11) Part 11: Latin/Thai
* [iso-8859-13](https://en.wikipedia.org/wiki/ISO/IEC_8859-13) Latin-7 Baltic Rim
* [iso-8859-14](https://en.wikipedia.org/wiki/ISO/IEC_8859-14) Latin-8 Celtic
* [iso-8859-15](https://en.wikipedia.org/wiki/ISO/IEC_8859-15) Latin-9 Western European
* [iso-8859-16](https://en.wikipedia.org/wiki/ISO/IEC_8859-16) Latin-10 South-Eastern European

* [windows-1250](https://en.wikipedia.org/wiki/Windows-1250) Central European and Eastern European
* [windows-1251](https://en.wikipedia.org/wiki/Windows-1251) Cyrillic
* [windows-1252](https://en.wikipedia.org/wiki/Windows-1252) English
* [windows-1253](https://en.wikipedia.org/wiki/Windows-1253) Greek
* [windows-1254](https://en.wikipedia.org/wiki/Windows-1254) Turkish
* [windows-1255](https://en.wikipedia.org/wiki/Windows-1255) Hebrew
* [windows-1256](https://en.wikipedia.org/wiki/Windows-1256) Arabic
* [windows-1257](https://en.wikipedia.org/wiki/Windows-1257) Baltic
* [windows-1258](https://en.wikipedia.org/wiki/Windows-1258) Vietnamese

The plugins supporting UTF-8 encoding currently include [head](../input/head.md), [tail](../input/tail.md) and [syslog](../input/syslog.md) via the `Encoder` parameter.
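
To illustrate what converting input to UTF-8 means at the byte level, here is a minimal Python sketch. It is not Fluent Bit's implementation: the helper `to_utf8` is hypothetical and relies on Python's built-in codecs, which accept encoding names such as `iso-8859-1` and `windows-1252`. It also shows why the chosen encoding must match the input, since the same byte maps to different characters in different encodings.

```python
def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Decode raw bytes from source_encoding and re-encode them as UTF-8.

    Hypothetical helper for illustration only; not part of Fluent Bit.
    """
    return raw.decode(source_encoding).encode("utf-8")


# 0xE9 is "é" in ISO-8859-1; UTF-8 represents it with two bytes.
latin1_line = b"caf\xe9"
utf8_line = to_utf8(latin1_line, "iso-8859-1")

# The same input byte yields different output depending on the encoding:
# 0x80 is the euro sign in windows-1252 but a control character in ISO-8859-1.
euro_1252 = to_utf8(b"\x80", "windows-1252")
c1_latin1 = to_utf8(b"\x80", "iso-8859-1")

print(utf8_line)   # b'caf\xc3\xa9'
print(euro_1252)   # b'\xe2\x82\xac'
print(c1_latin1)   # b'\xc2\x80'
```

A mismatched setting does not necessarily raise an error; as the last two lines show, it can silently produce valid UTF-8 for the wrong character.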
1 change: 1 addition & 0 deletions input/head.md
@@ -16,6 +16,7 @@ The plugin supports the following configuration parameters:
| Key | Rename a key. Default: head. |
| Lines | Line number to read. If the number N is set, in\_head reads first N lines like head\(1\) -n. |
| Split\_line | If enabled, in\_head generates key-value pair per line. |
| Encoder | Optionally encode input to UTF-8. E.g. `iso-8859-1` |
Review comment (Contributor Author): Link to configuration/encoder.md

### Split Line Mode

1 change: 1 addition & 0 deletions input/syslog.md
@@ -16,6 +16,7 @@ The plugin supports the following configuration parameters:
| Parser | Specify an alternative parser for the message. By default, the plugin uses the parser _syslog-rfc3164_. If your syslog messages have fractional seconds set this Parser value to _syslog-rfc5424_ instead. | |
| Buffer\_Chunk\_Size | By default the buffer to store the incoming Syslog messages, do not allocate the maximum memory allowed, instead it allocate memory when is required. The rounds of allocations are set by _Buffer\_Chunk\_Size_ in KB. If not set, _Buffer\_Chunk\_Size_ is equal to 32 \(32KB\). Read considerations below when using _udp_ or _unix\_udp_ mode. | |
| Buffer\_Max_Size | Specify the maximum buffer size in KB to receive a Syslog message. If not set, the default size will be the value of _Buffer\_Chunk\_Size_. | |
| Encoder | Optionally encode input to UTF-8. E.g. `iso-8859-1` | |

### Considerations

3 changes: 2 additions & 1 deletion input/tail.md
@@ -1,6 +1,6 @@
# Tail

-The **tail** input plugin allows to monitor one or several text files. It has a similar behavior like _tail -f_ shell command.
+The **tail** input plugin allows to monitor one or several text files. It has a similar behavior to the _tail -f_ shell command.

The plugin reads every matched file in the _Path_ pattern and for every new line found \(separated by a \n\), it generates a new record. Optionally a database file can be used so the plugin can have a history of tracked files and a state of offsets, this is very useful to resume a state if the service is restarted.

@@ -34,6 +34,7 @@ The plugin supports the following configuration parameters:
| Key | When a message is unstructured \(no parser applied\), it's appended as a string under the key name _log_. This option allows to define an alternative name for that key. | log |
| Tag | Set a tag (with regex-extract fields) that will be placed on lines read. E.g. `kube.<namespace_name>.<pod_name>.<container_name>` | |
| Tag_Regex | Set a regex to exctract fields from the file. E.g. `(?<pod_name>[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace_name>[^_]+)_(?<container_name>.+)-` | |
| Encoder | Optionally encode input to UTF-8. E.g. `iso-8859-1` | |

Note that if the database parameter _db_ is **not** specified, by default the plugin will start reading each target file from the beginning.
