Merged
36 changes: 34 additions & 2 deletions docs/reference/enrich-processor/convert-processor.md
@@ -6,16 +6,48 @@ mapped_pages:

# Convert processor [convert-processor]


Converts a field in the currently ingested document to a different type, such as converting a string to an integer. If the field value is an array, all members will be converted.

The supported types include: `integer`, `long`, `float`, `double`, `string`, `boolean`, `ip`, and `auto`.

Specifying a target `type` of `integer` supports inputs which are `Integer` values, `Long` values in 32-bit signed
Contributor

consider making this a table that indicates the supported inputs with notes. e.g.

| Type | Supported input | Notes |
| --- | --- | --- |
| Integer | Long | 32-bit signed integer range |
| | String | Must represent an integer in a 32-bit signed ... |

or alternatively lists with sub-items

reason: reading this in the preview is very very very dense. see screenshot.

image

Member Author

Thanks... As it happens, I did initially try putting the new content in a table, but then I realized there was a disconnect between the new content (for the numeric types) and the existing content (boolean, ip, and auto types). Would you suggest moving the content for all the types into a table?

Incidentally, I found the markdown kind of horrible for tables with long paragraphs of text, although luckily my IDE took care of the worst of it. I assume that's something we live with, on the basis that the readability of the output is far more important?

Contributor (@shainaraskas, Aug 20, 2025)

I'd suggest that we refactor the content for all of the types for consistency.

totally agree about the pain of formatting in markdown tables. this is a pain that we're currently living with, and I would prioritize a big increase in user readability over raw content readability because hopefully this will not need a TON of regular maintenance. if this was something we needed to tweak all the time, we might think twice.

Member Author

Okay, I've pushed a commit which moves most of this content into a table. Please let me know what you think.

I've kept the paragraph about auto out of the table. This is because that content is a little different: for the other types, we are describing the accepted inputs (and the provisos attached), whereas for the auto type all inputs are accepted and we're describing the logic around figuring out the output type. It also has the benefit of avoiding the widest of the columns as that's the longest paragraph.

integer range, or `String` values representing an integer in 32-bit signed integer range in either decimal format
(without a decimal point) or hex format (e.g. `"123"` or `"0x7b"`).

Specifying `long` supports inputs which are `Integer` values, `Long` values, or `String` values representing an integer
in 64-bit signed integer range in either decimal format (without a decimal point) or hex format (e.g. `"123"` or
`"0x7b"`).
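The `integer` and `long` string rules above can be sketched in Python as follows. This is only a model of the documented behaviour, not the processor's implementation: `parse_integer_string` and its `bits` parameter are hypothetical names, the handling of signed hex strings is not covered by the text and is left out, and Python's `int()` is somewhat more lenient than the description (it tolerates surrounding whitespace and underscores).

```python
# Signed ranges that the documentation refers to.
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1


def parse_integer_string(text: str, bits: int = 32) -> int:
    """Parse a decimal or 0x-prefixed hex string and enforce a signed range.

    Hypothetical helper: bits=32 models `integer`, bits=64 models `long`.
    """
    lo, hi = (INT32_MIN, INT32_MAX) if bits == 32 else (INT64_MIN, INT64_MAX)
    # Strings such as "0x7b" are read as hex, everything else as decimal.
    base = 16 if text.lower().startswith("0x") else 10
    # int() raises ValueError for strings with a decimal point, e.g. "123.0".
    value = int(text, base)
    if not lo <= value <= hi:
        raise ValueError(f"{text!r} is out of range for a {bits}-bit signed integer")
    return value


print(parse_integer_string("123"))                  # decimal input
print(parse_integer_string("0x7b"))                 # hex input, same value
print(parse_integer_string("2147483648", bits=64))  # too large for 32 bits, fine for 64
```

Passing `"2147483648"` with the default `bits=32` raises `ValueError` in this sketch, which stands in for the processor rejecting an out-of-range input.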

Specifying `float` supports inputs which are `Integer` values, `Long` values (conversions from either `Integer` or
`Long` may lose precision for absolute values greater than 2^24), `Float` values, `Double` values (may lose precision),
`String` values representing a floating point number in decimal, scientific, or hex format (e.g. `"123.0"`, `"123.45"`,
`"1.23e2"`, or `"0x1.ecp6"`) or an integer (conversions from `String` may lose precision, and will give positive or
Member Author

N.B. I have been vague about which integer formats are accepted. The current situation is that it accepts decimal but not hex, but I was reluctant to document that since it feels like a bug.

Contributor

Any indication on how the Beats or Logstash processors handle this? Might indicate if it's a bug or intended behaviour.

Member Author

Yeah, we're going to follow up with them, and hopefully we'll be able to clarify the docs one way or another then.

negative infinity if out of range for a 32-bit floating point value).

Specifying `double` supports inputs which are `Integer` values, `Long` values (may lose precision for absolute values
greater than 2^53), `Float` values, `Double` values, `String` values representing a floating point number in decimal,
scientific, or hex format (e.g. `"123.0"`, `"123.45"`, `"1.23e2"`, or `"0x1.ecp6"`) or an integer (conversions from
`String` may lose precision, and will give positive or negative infinity if out of range for a 64-bit floating point
value).
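The precision and range caveats for `float` and `double` can be illustrated with a small Python sketch. Python's built-in `float` is a 64-bit value, so it stands in for `double`; `to_float32` is a hypothetical helper that rounds to a 32-bit value through the `struct` module. None of this is the processor's code, it only demonstrates the IEEE 754 behaviour the text describes.

```python
import math
import struct


def to_float32(value: float) -> float:
    """Round a 64-bit float to the nearest 32-bit float (hypothetical helper)."""
    try:
        return struct.unpack("<f", struct.pack("<f", value))[0]
    except OverflowError:
        # struct refuses finite values that do not fit in 32 bits;
        # model the documented result of positive or negative infinity.
        return math.copysign(math.inf, value)


# Integers above 2^24 in absolute value may not survive a 32-bit float.
print(to_float32(float(2**24 + 1)) == float(2**24))

# Integers above 2^53 in absolute value may not survive a 64-bit float.
print(float(2**53 + 1) == float(2**53))

# The hex floating point example from the text denotes 123.0.
print(float.fromhex("0x1.ecp6"))

# Out-of-range strings give infinity rather than an error.
print(to_float32(float("1e40")))
print(float("-1e400"))
```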

Specifying `boolean` will set the field to `true` if its string value is equal to `true` (ignoring case), to `false` if its string value is equal to `false` (ignoring case), and will throw an exception otherwise.
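A minimal Python sketch of the `boolean` rule, assuming a hypothetical helper named `convert_boolean` and using `ValueError` to stand in for the exception the processor throws:

```python
def convert_boolean(value: str) -> bool:
    """Model the documented rule: only "true" and "false" are accepted, ignoring case."""
    lowered = value.lower()
    if lowered == "true":
        return True
    if lowered == "false":
        return False
    # Anything else is rejected rather than guessed at.
    raise ValueError(f"{value!r} is not convertible to a boolean")


print(convert_boolean("TRUE"))
print(convert_boolean("False"))
```

In this sketch, inputs such as `"yes"` or `"1"` raise, mirroring the "throw an exception otherwise" wording.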

Specifying `ip` will set the target field to the value of `field` if it contains a valid IPv4 or IPv6 address that can be indexed into an [IP field type](/reference/elasticsearch/mapping-reference/ip.md).
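The `ip` rule can be approximated in Python with the standard `ipaddress` module. This is a sketch under assumptions: `convert_ip` is a hypothetical name, raising `ValueError` for invalid input is an assumption (the text above only describes the valid case), and Python's notion of a valid address may differ in edge cases from what an IP field accepts.

```python
import ipaddress


def convert_ip(value: str) -> str:
    """Return the value unchanged if it parses as an IPv4 or IPv6 address."""
    # ip_address() raises ValueError for anything that is not a valid address.
    ipaddress.ip_address(value)
    return value


print(convert_ip("192.168.0.1"))
print(convert_ip("::1"))
```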

Specifying `auto` will attempt to convert the string-valued `field` into the closest non-string, non-IP type. For example, a field whose value is `"true"` will be converted to its respective boolean type: `true`. Do note that float takes precedence of double in `auto`. A value of `"242.15"` will "automatically" be converted to `242.15` of type `float`. If a provided field cannot be appropriately converted, the processor will still process successfully and leave the field value as-is. In such a case, `target_field` will be updated with the unconverted field value.
Specifying `auto` will attempt to convert a string-valued `field` into the closest non-string, non-IP type. For example,
a field whose value is `"true"` will be converted to its respective boolean type: `true`. A string representing an
integer in decimal or hex format (e.g. `"123"` or `"0x7b"`) will be converted to an `Integer` if the number fits in a
32-bit signed integer, else to a `Long` if it fits in a 64-bit signed integer, else to a `Float` (in which case it may
lose precision, and will give positive or negative infinity if out of range for a 32-bit floating point value). A string
representing a floating point number in decimal, scientific, or hex format (e.g. `"123.0"`, `"123.45"`, `"1.23e2"`, or
`"0x1.ecp6"`) will be converted to a `Float` (and may lose precision, and will give positive or negative infinity if out
of range for a 32-bit floating point value). If a provided field is either not a `String` or a `String` which cannot be
converted, the processor will still process successfully and leave the field value as-is. In such a case, `target_field`
will be updated with the unconverted field value.
Contributor

breaking this out into its own section outside of the table is perfect. I was actually going to suggest it but then didn't for some reason, so you read my mind

this could be helped with a couple of bullets. could refine even more but this is a lightweight way to make it easier to scan.

Suggested change
Specifying `auto` will attempt to convert a string-valued `field` into the closest non-string, non-IP type.
For example:
* A field whose value is `"true"` will be converted to its respective boolean type: `true`
* A string representing an integer in decimal or hex format (e.g. `"123"` or `"0x7b"`) will be converted to an `Integer` if the number fits in a 32-bit signed integer, else to a `Long` if it fits in a 64-bit signed integer, else to a `Float` (in which case it may lose precision, and will give positive or negative infinity if out of range for a 32-bit floating point value).
* A string representing a floating point number in decimal, scientific, or hex format (e.g. `"123.0"`, `"123.45"`, `"1.23e2"`, or `"0x1.ecp6"`) will be converted to a `Float` (and may lose precision, and will give positive or negative infinity if out of range for a 32-bit floating point value).

If a provided field is either not a `String` or a `String` which cannot be converted, the processor will still process successfully and leave the field value as-is. In such a case, `target_field` will be updated with the unconverted field value.

Member Author

Good idea. I've pushed a commit with a version of this. I've reworded slightly to avoid the 'For example' since it's actually an exhaustive list.
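
The resolution order for `auto` discussed in this thread can be sketched in Python as below. It is a model of the documented rules only: `resolve_auto` and the underscore-prefixed helpers are hypothetical names, treating `"true"`/`"false"` case-insensitively is an assumption borrowed from the `boolean` type, and Python's parsers are slightly more lenient than the description. The function returns a `(type_name, value)` pair so that the `Integer`/`Long` distinction stays visible even though Python has a single `int` type.

```python
import math
import struct

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1


def _to_float32(value: float) -> float:
    """Round to a 32-bit float, giving infinity when out of range."""
    try:
        return struct.unpack("<f", struct.pack("<f", value))[0]
    except OverflowError:
        return math.copysign(math.inf, value)


def _parse_int(text: str):
    """Decimal or 0x-prefixed hex integer, or None if the string is neither."""
    base = 16 if text.lower().startswith("0x") else 10
    try:
        return int(text, base)
    except ValueError:
        return None


def _parse_float(text: str):
    """Decimal, scientific, or hex floating point, or None if unparseable."""
    try:
        return float(text)
    except ValueError:
        pass
    if text.lower().startswith("0x"):
        try:
            return float.fromhex(text)
        except OverflowError:
            return math.inf
        except ValueError:
            return None
    return None


def resolve_auto(value):
    """Return (type_name, converted_value) following the documented order."""
    if not isinstance(value, str):
        return ("unchanged", value)  # non-strings are left as-is
    if value.lower() in ("true", "false"):  # case-insensitivity is an assumption
        return ("boolean", value.lower() == "true")
    as_int = _parse_int(value)
    if as_int is not None:
        if INT32_MIN <= as_int <= INT32_MAX:
            return ("integer", as_int)
        if INT64_MIN <= as_int <= INT64_MAX:
            return ("long", as_int)
        try:
            wide = float(as_int)
        except OverflowError:  # integer too large even for a 64-bit float
            wide = math.inf if as_int > 0 else -math.inf
        return ("float", _to_float32(wide))
    as_float = _parse_float(value)
    if as_float is not None:
        return ("float", _to_float32(as_float))
    return ("unchanged", value)  # unconvertible strings are left as-is


for sample in ["true", "0x7b", "2147483648", "1.23e2", "0x1.ecp6", "hello", 42]:
    print(repr(sample), "->", resolve_auto(sample))
```

The integer check runs before the floating point check so that `"123"` resolves to an integer type rather than a `Float`, matching the order of the bullets.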


N.B. If conversions other than those provided by this processor are required, the
[`script`](/reference/enrich-processor/script-processor.md) processor may be used to implement the desired behaviour.
(The performance of the `script` processor should be as good as or better than that of the `convert` processor.)

$$$convert-options$$$
