fluent · nigels-com · Nov 6, 2019 · Nov 6, 2019 · nigels-com · Nov 27, 2019
@@ -13,4 +13,5 @@ Note that all configuration files use a specific fixed and strict schema, please
 * [TLS / SSL](tls_ssl.md)
 * [Backpressure](backpressure.md)
 * [Memory Usage](memory_usage.md)
+* [UTF-8 Encoding](encoder.md)
 
@@ -0,0 +1,31 @@
+# Encoding input to UTF-8
+
+Some input plugins can convert input to UTF-8 from a specified source encoding. The currently supported encodings are:
+
+  * [iso-8859-1](https://en.wikipedia.org/wiki/ISO/IEC_8859-1) Latin-1 Western European
+  * [iso-8859-2](https://en.wikipedia.org/wiki/ISO/IEC_8859-2) Latin-2 East European
+  * [iso-8859-3](https://en.wikipedia.org/wiki/ISO/IEC_8859-3) Latin-3 South European
+  * [iso-8859-4](https://en.wikipedia.org/wiki/ISO/IEC_8859-4) Latin-4 North European
+  * [iso-8859-5](https://en.wikipedia.org/wiki/ISO/IEC_8859-5) Part 5: Latin/Cyrillic
+  * [iso-8859-6](https://en.wikipedia.org/wiki/ISO/IEC_8859-6) Part 6: Latin/Arabic
+  * [iso-8859-7](https://en.wikipedia.org/wiki/ISO/IEC_8859-7) Part 7: Latin/Greek
+  * [iso-8859-8](https://en.wikipedia.org/wiki/ISO/IEC_8859-8) Part 8: Latin/Hebrew
+  * [iso-8859-9](https://en.wikipedia.org/wiki/ISO/IEC_8859-9) Latin-5 Turkish
+  * [iso-8859-10](https://en.wikipedia.org/wiki/ISO/IEC_8859-10) Latin-6 Nordic
+  * [iso-8859-11](https://en.wikipedia.org/wiki/ISO/IEC_8859-11) Part 11: Latin/Thai
+  * [iso-8859-13](https://en.wikipedia.org/wiki/ISO/IEC_8859-13) Latin-7 Baltic Rim
+  * [iso-8859-14](https://en.wikipedia.org/wiki/ISO/IEC_8859-14) Latin-8 Celtic
+  * [iso-8859-15](https://en.wikipedia.org/wiki/ISO/IEC_8859-15) Latin-9 Western European
+  * [iso-8859-16](https://en.wikipedia.org/wiki/ISO/IEC_8859-16) Latin-10 South-Eastern European
+
+  * [windows-1250](https://en.wikipedia.org/wiki/Windows-1250) Central European and Eastern European
+  * [windows-1251](https://en.wikipedia.org/wiki/Windows-1251) Cyrillic
+  * [windows-1252](https://en.wikipedia.org/wiki/Windows-1252) Western European
+  * [windows-1253](https://en.wikipedia.org/wiki/Windows-1253) Greek
+  * [windows-1254](https://en.wikipedia.org/wiki/Windows-1254) Turkish
+  * [windows-1255](https://en.wikipedia.org/wiki/Windows-1255) Hebrew
+  * [windows-1256](https://en.wikipedia.org/wiki/Windows-1256) Arabic
+  * [windows-1257](https://en.wikipedia.org/wiki/Windows-1257) Baltic
+  * [windows-1258](https://en.wikipedia.org/wiki/Windows-1258) Vietnamese
+
+The input plugins that currently support conversion to UTF-8 are [head](../input/head.md), [tail](../input/tail.md) and [syslog](../input/syslog.md). Each one enables it through the `Encoder` parameter, whose value names the encoding of the incoming data.
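+
+As an illustration, the following is a minimal configuration sketch for the tail plugin, assuming the monitored files are written in ISO-8859-1. The path is hypothetical, and the `stdout` output is included only so the converted records can be inspected:
+
+```
+[INPUT]
+    Name     tail
+    # Hypothetical path; point this at your own log files
+    Path     /var/log/legacy/*.log
+    # Source encoding of the files; records are converted to UTF-8
+    Encoder  iso-8859-1
+
+[OUTPUT]
+    Name   stdout
+    Match  *
+```
+
+If `Encoder` is omitted, no conversion is applied, since the parameter is optional.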
@@ -16,6 +16,7 @@ The plugin supports the following configuration parameters:
 | Key | Rename a key. Default: head. |
 | Lines | Line number to read. If the number N is set, in\_head reads first N lines like head\(1\) -n. |
 | Split\_line | If enabled, in\_head generates key-value pair per line. |
+| Encoder | Optionally convert input to UTF-8 from the given source encoding, e.g. `iso-8859-1`. |
 
 ### Split Line Mode
 

@@ -16,6 +16,7 @@ The plugin supports the following configuration parameters:
 | Parser | Specify an alternative parser for the message. By default, the plugin uses the parser _syslog-rfc3164_. If your syslog messages have fractional seconds set this Parser value to _syslog-rfc5424_ instead. |  |
 | Buffer\_Chunk\_Size | By default the buffer to store the incoming Syslog messages, do not allocate the maximum memory allowed, instead it allocate memory when is required. The rounds of allocations are set by _Buffer\_Chunk\_Size_ in KB. If not set, _Buffer\_Chunk\_Size_ is equal to 32 \(32KB\). Read considerations below when using _udp_ or _unix\_udp_ mode. |  |
 | Buffer\_Max_Size | Specify the maximum buffer size in KB to receive a Syslog message. If not set, the default size will be the value of _Buffer\_Chunk\_Size_. |  |
+| Encoder | Optionally convert input to UTF-8 from the given source encoding, e.g. `iso-8859-1`. | |
 
 ### Considerations
 

@@ -1,6 +1,6 @@
 # Tail
 
-The **tail** input plugin allows to monitor one or several text files. It has a similar behavior like _tail -f_ shell command.
+The **tail** input plugin monitors one or more text files. It behaves like the _tail -f_ shell command.
 
 The plugin reads every matched file in the _Path_ pattern and for every new line found \(separated by a \n\), it generates a new record. Optionally a database file can be used so the plugin can have a history of tracked files and a state of offsets, this is very useful to resume a state if the service is restarted.
 
@@ -34,6 +34,7 @@ The plugin supports the following configuration parameters:
 | Key | When a message is unstructured \(no parser applied\), it's appended as a string under the key name _log_. This option allows to define an alternative name for that key. | log |
 | Tag | Set a tag (with regex-extract fields) that will be placed on lines read. E.g. `kube.<namespace_name>.<pod_name>.<container_name>` | |
 | Tag_Regex | Set a regex to exctract fields from the file. E.g. `(?<pod_name>[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace_name>[^_]+)_(?<container_name>.+)-` | |
+| Encoder | Optionally convert input to UTF-8 from the given source encoding, e.g. `iso-8859-1`. | |
 
 Note that if the database parameter _db_ is **not** specified, by default the plugin will start reading each target file from the beginning.