Improve docs and tests for convert processor
#133160
@@ -6,16 +6,48 @@ mapped_pages:

# Convert processor [convert-processor]

Converts a field in the currently ingested document to a different type, such as converting a string to an integer. If the field value is an array, all members will be converted.

The supported types include: `integer`, `long`, `float`, `double`, `string`, `boolean`, `ip`, and `auto`.

Specifying a target `type` of `integer` supports inputs which are `Integer` values, `Long` values in 32-bit signed
integer range, or `String` values representing an integer in 32-bit signed integer range in either decimal format
(without a decimal point) or hex format (e.g. `"123"` or `"0x7b"`).

Specifying `long` supports inputs which are `Integer` values, `Long` values, or `String` values representing an integer
in 64-bit signed integer range in either decimal format (without a decimal point) or hex format (e.g. `"123"` or
`"0x7b"`).

Specifying `float` supports inputs which are `Integer` values, `Long` values (conversions from either `Integer` or
`Long` may lose precision for absolute values greater than 2^24), `Float` values, `Double` values (may lose precision),
`String` values representing a floating point number in decimal, scientific, or hex format (e.g. `"123.0"`, `"123.45"`,
`"1.23e2"`, or `"0x1.ecp6"`) or an integer (conversions from `String` may lose precision, and will give positive or
negative infinity if out of range for a 32-bit floating point value).

Specifying `double` supports inputs which are `Integer` values, `Long` values (may lose precision for absolute values
greater than 2^53), `Float` values, `Double` values, `String` values representing a floating point number in decimal,
scientific, or hex format (e.g. `"123.0"`, `"123.45"`, `"1.23e2"`, or `"0x1.ecp6"`) or an integer (conversions from
`String` may lose precision, and will give positive or negative infinity if out of range for a 64-bit floating point
value).

Specifying `boolean` will set the field to true if its string value is equal to `true` (ignore case), to false if its string value is equal to `false` (ignore case), or it will throw an exception otherwise.

Specifying `ip` will set the target field to the value of `field` if it contains a valid IPv4 or IPv6 address that can be indexed into an [IP field type](/reference/elasticsearch/mapping-reference/ip.md).

Specifying `auto` will attempt to convert the string-valued `field` into the closest non-string, non-IP type. For example, a field whose value is `"true"` will be converted to its respective boolean type: `true`. Do note that float takes precedence of double in `auto`. A value of `"242.15"` will "automatically" be converted to `242.15` of type `float`. If a provided field cannot be appropriately converted, the processor will still process successfully and leave the field value as-is. In such a case, `target_field` will be updated with the unconverted field value.

Specifying `auto` will attempt to convert a string-valued `field` into the closest non-string, non-IP type. For example,
a field whose value is `"true"` will be converted to its respective boolean type: `true`. A string representing an
integer in decimal or hex format (e.g. `"123"` or `"0x7b"`) will be converted to an `Integer` if the number fits in a
32-bit signed integer, else to a `Long` if it fits in a 64-bit signed integer, else to a `Float` (in which case it may
lose precision, and will give positive or negative infinity if out of range for a 32-bit floating point value). A string
representing a floating point number in decimal, scientific, or hex format (e.g. `"123.0"`, `"123.45"`, `"1.23e2"`, or
`"0x1.ecp6"`) will be converted to a `Float` (and may lose precision, and will give positive or negative infinity if out
of range for a 32-bit floating point value). If a provided field is either not a `String` or a `String` which cannot be
converted, the processor will still process successfully and leave the field value as-is. In such a case, `target_field`
will be updated with the unconverted field value.

Suggested change:

Specifying `auto` will attempt to convert a string-valued `field` into the closest non-string, non-IP type.
For example:
* A field whose value is `"true"` will be converted to its respective boolean type: `true`
* A string representing an integer in decimal or hex format (e.g. `"123"` or `"0x7b"`) will be converted to an `Integer` if the number fits in a 32-bit signed integer, else to a `Long` if it fits in a 64-bit signed integer, else to a `Float` (in which case it may lose precision, and will give positive or negative infinity if out of range for a 32-bit floating point value).
* A string representing a floating point number in decimal, scientific, or hex format (e.g. `"123.0"`, `"123.45"`, `"1.23e2"`, or `"0x1.ecp6"`) will be converted to a `Float` (and may lose precision, and will give positive or negative infinity if out of range for a 32-bit floating point value).

If a provided field is either not a `String` or a `String` which cannot be converted, the processor will still process successfully and leave the field value as-is. In such a case, `target_field` will be updated with the unconverted field value.

Good idea. I've pushed a commit with a version of this. I've reworded slightly to avoid the 'For example' since it's actually an exhaustive list.
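
For readers following the thread, the `auto` rules listed above can be sketched in Python. This is only an illustration of the order of attempts described in the diff, not the processor's actual implementation. The names `auto_convert`, `parse_integer`, `parse_float`, and `to_float32` are hypothetical, and matching `true`/`false` case-insensitively is an assumption borrowed from the `boolean` type's description. Python has no `Integer`/`Long` split, so both cases come back as a plain `int`.

```python
import math
import re
import struct

# Decimal or hex integer with an optional sign, e.g. "123" or "0x7b".
_INTEGER = re.compile(r"[+-]?(0[xX][0-9a-fA-F]+|[0-9]+)")


def to_float32(value):
    """Round to the nearest 32-bit float; out-of-range values become +/- infinity."""
    try:
        return struct.unpack("f", struct.pack("f", float(value)))[0]
    except OverflowError:
        return math.inf if value > 0 else -math.inf


def parse_integer(text):
    """Return an int for a decimal or hex integer string, else None."""
    if not _INTEGER.fullmatch(text):
        return None
    sign = -1 if text.startswith("-") else 1
    digits = text.lstrip("+-")
    base = 16 if digits[:2].lower() == "0x" else 10
    return sign * int(digits, base)


def parse_float(text):
    """Return a float for a decimal, scientific, or hex float string, else None."""
    body = text.lstrip("+-")
    try:
        if body[:2].lower() == "0x":
            return float.fromhex(text)
        return float(text)
    except ValueError:
        return None
    except OverflowError:
        return -math.inf if text.startswith("-") else math.inf


def auto_convert(value):
    """Sketch of the `auto` rules: boolean, then integer/long, then float, else unchanged."""
    if not isinstance(value, str):
        return value  # non-string inputs are left as-is
    # Assumption: case-insensitive, as described for the `boolean` type.
    if value.lower() == "true":
        return True
    if value.lower() == "false":
        return False
    number = parse_integer(value)
    if number is not None:
        if -2**63 <= number < 2**63:
            return number  # would be an Integer or a Long in the processor
        return to_float32(number)  # too large for a Long: falls back to Float
    parsed = parse_float(value)
    if parsed is not None:
        return to_float32(parsed)
    return value  # unconvertible strings are left as-is
```

Under these assumptions, `auto_convert("0x7b")` gives `123`, `auto_convert("0x1.ecp6")` gives `123.0`, `auto_convert("1e40")` gives positive infinity, and `auto_convert("hello")` returns the string unchanged.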
consider making this a table that indicates the supported inputs with notes. e.g.
or alternatively lists with sub-items
reason: reading this in the preview is very very very dense. see screenshot.
Thanks... As it happens, I did initially try putting the new content in a table, but then I realized there was a disconnect between the new content (for the numeric types) and the existing content (`boolean`, `ip`, and `auto` types). Would you suggest moving the content for all the types into a table?

Incidentally, I found the markdown kind of horrible for tables with long paragraphs of text, although luckily my IDE took care of the worst of it. I assume that's something we live with, on the basis that the readability of the output is far more important?
I'd suggest that we refactor the content for all of the types for consistency.
totally agree about the pain of formatting in markdown tables. this is a pain that we're currently living with, and I would prioritize a big increase in user readability over raw content readability because hopefully this will not need a TON of regular maintenance. if this was something we needed to tweak all the time, we might think twice.
Okay, I've pushed a commit which moves most of this content into a table. Please let me know what you think.
I've kept the paragraph about `auto` out of the table. This is because that content is a little different: for the other types, we are describing the accepted inputs (and the provisos attached), whereas for the `auto` type all inputs are accepted and we're describing the logic around figuring out the output type. It also has the benefit of avoiding the widest of the columns as that's the longest paragraph.
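
As an aside on the provisos attached to the numeric types, the 2^24 and 2^53 precision thresholds quoted in the diff for `float` and `double` can be checked with a short Python sketch. The helper names `round_trips_as_float32` and `round_trips_as_float64` are hypothetical, and the sketch only shows the underlying floating point arithmetic, not the processor's code.

```python
import struct


def round_trips_as_float32(n: int) -> bool:
    """True if the integer n survives conversion to a 32-bit float unchanged."""
    as_float32 = struct.unpack("f", struct.pack("f", float(n)))[0]
    return as_float32 == n


def round_trips_as_float64(n: int) -> bool:
    """True if the integer n survives conversion to a 64-bit float unchanged."""
    return float(n) == n


# Every integer up to 2^24 in magnitude is exact as a 32-bit float; 2^24 + 1 is not.
print(round_trips_as_float32(2**24), round_trips_as_float32(2**24 + 1))
# Every integer up to 2^53 in magnitude is exact as a 64-bit float; 2^53 + 1 is not.
print(round_trips_as_float64(2**53), round_trips_as_float64(2**53 + 1))
```

Both lines print `True False`, which is why the `float` notes mention 2^24 and the `double` notes mention 2^53.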