From 8698e455695bd8314576c90b77b79bb64649a832 Mon Sep 17 00:00:00 2001 From: Parker Timmins Date: Wed, 1 Oct 2025 16:09:17 -0500 Subject: [PATCH 01/22] First version of pattern_text docs --- .../elasticsearch/mapping-reference/text.md | 79 +++++++++++++++++++ 1 file changed, 79 insertions(+) diff --git a/docs/reference/elasticsearch/mapping-reference/text.md b/docs/reference/elasticsearch/mapping-reference/text.md index 612ce067cd4b3..cdd83b4b151ed 100644 --- a/docs/reference/elasticsearch/mapping-reference/text.md +++ b/docs/reference/elasticsearch/mapping-reference/text.md @@ -11,6 +11,7 @@ The text family includes the following field types: * [`text`](#text-field-type), the traditional field type for full-text content such as the body of an email or the description of a product. * [`match_only_text`](#match-only-text-field-type), a space-optimized variant of `text` that disables scoring and performs slower on queries that need positions. It is best suited for indexing log messages. +* [`pattern_text`](#pattern-text-text-field-type), a variant of `text` with improved space efficiency when storing log messages. ## Text field type [text-field-type] @@ -341,3 +342,81 @@ The following mapping parameters are accepted: : Metadata about the field. +## Pattern text field type [pattern-text-field-type] +::::{warning} + +This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features. +:::: + +A variant of [`text`](#text-field-type) with improved space efficiency for log data. +Internally, it decomposed values into static parts that are likely to be shared between many values, and dynamic parts that tend to vary between values. +The static parts will usually come from the explanatory text of a log message, and the dynamic parts will be the variables which were interpolated into the logs. 
+This decomposition allows for improved compression on log-like data. + +We call the static portion of the value, the `template`. +Though the `template` cannot be accessed directly, a separate field called `.template_id` is accessible. +This field is a hash of the `template` and can be used to group similar values. +As this feature is in technical preview, the internal structure of the `template` is subject to change. +Because of this, the `template_id` is also subject to future changes. + +Unlike most mapping types, `pattern_text` does not support multiple values for a given field per document. +If a document is created with multiple values for a pattern text field, an error will be returned. + +Analysis is configurable, but defaults to a delimiter-based analyzer. +This analyzer applies a lowercase filter then splits on whitespace, and the followings delimiters: `=, ?, :, [, ], {, }, ", \, '`. + +[span queries](/reference/query-languages/query-dsl/span-queries.md) are not supported with this field, use [interval queries](/reference/query-languages/query-dsl/query-dsl-intervals-query.md) instead, or the [`text`](#text-field-type) field type if you absolutely need span queries. + +Like `text`, `pattern_text` does not support sorting and has only limited support for aggregations. + +### Phrase matching +Pattern text supports an `index_options` parameter with valid values of `docs` and `positions`. +The default values is `docs`, which makes `pattern_text` behave similarly to `match_only_text` for phrase queries. +Specifically, positions are not stored, which reduces the index size at the cost of slowing down phrase queries. +If `index_options` is set to `positions`, positions are stored and `pattern_text` will support fast phrase queries. +In both case, all queries return a constant score of 1.0. + +### Index sorting for improved compression + +The compression provided by `pattern_text` can be improved significantly if the index is sorted by the `template_id`. 
+For example, of typical approach would be to sort first by `message.template_id`, then by `@timestamp`, as in the following example. + + +```console +PUT logs +{ + "settings": { + "index": { + "sort": { + "field": ["message.template_id", "@timestamp"], + "order": ["asc", "desc"] + } + } + } + "mappings": { + "properties": { + "@timestamp": { + "type": "date" + }, + "message": { + "type": "pattern_text" + } + } + } +} +``` + + +### Parameters for pattern text fields [pattern-text-params] + +The following mapping parameters are accepted: + +[`analyzer`](/reference/elasticsearch/mapping-reference/analyzer.md) +: The [analyzer](docs-content://manage-data/data-store/text-analysis.md) which should be used for the `text` field, both at index-time and at search-time (unless overridden by the [`search_analyzer`](/reference/elasticsearch/mapping-reference/search-analyzer.md)). Defaults to a custom delimiter-based analyzer. This analyzer applies a lowercase filter then splits on whitespace, and the followings character: `=, ?, :, [, ], {, }, ", \, '`. + +[`index_options`](/reference/elasticsearch/mapping-reference/index-options.md) +: What information should be stored in the index, for search and highlighting purposes. Valid values are `docs` and `positions`. Defaults to `docs`. + +[`meta`](/reference/elasticsearch/mapping-reference/mapping-field-meta.md) +: Metadata about the field. 
+ From 657670c43a44771f54d582a2345c56ad7ca5a2fd Mon Sep 17 00:00:00 2001 From: Parker Timmins Date: Thu, 2 Oct 2025 09:02:48 -0500 Subject: [PATCH 02/22] Some syntax fixes --- .../elasticsearch/mapping-reference/text.md | 33 +++++++++---------- 1 file changed, 16 insertions(+), 17 deletions(-) diff --git a/docs/reference/elasticsearch/mapping-reference/text.md b/docs/reference/elasticsearch/mapping-reference/text.md index cdd83b4b151ed..2251599dd8492 100644 --- a/docs/reference/elasticsearch/mapping-reference/text.md +++ b/docs/reference/elasticsearch/mapping-reference/text.md @@ -349,21 +349,21 @@ This functionality is in technical preview and may be changed or removed in a fu :::: A variant of [`text`](#text-field-type) with improved space efficiency for log data. -Internally, it decomposed values into static parts that are likely to be shared between many values, and dynamic parts that tend to vary between values. -The static parts will usually come from the explanatory text of a log message, and the dynamic parts will be the variables which were interpolated into the logs. +Internally, it decomposes values into static parts that are likely to be shared among many values, and dynamic parts that tend to vary. +The static parts usually come from the explanatory text of a log message, while the dynamic parts are the variables that were interpolated into the logs. This decomposition allows for improved compression on log-like data. -We call the static portion of the value, the `template`. -Though the `template` cannot be accessed directly, a separate field called `.template_id` is accessible. -This field is a hash of the `template` and can be used to group similar values. -As this feature is in technical preview, the internal structure of the `template` is subject to change. -Because of this, the `template_id` is also subject to future changes. +We call the static portion of the value the `template`. 
+Although the template cannot be accessed directly, a separate field called `.template_id` is accessible. +This field is a hash of the template and can be used to group similar values. +As this feature is in technical preview, the internal structure of the template is subject to change. +Because of this, `.template_id` is also subject to future changes. Unlike most mapping types, `pattern_text` does not support multiple values for a given field per document. -If a document is created with multiple values for a pattern text field, an error will be returned. +If a document is created with multiple values for a pattern_text field, an error will be returned. -Analysis is configurable, but defaults to a delimiter-based analyzer. -This analyzer applies a lowercase filter then splits on whitespace, and the followings delimiters: `=, ?, :, [, ], {, }, ", \, '`. +Analysis is configurable but defaults to a delimiter-based analyzer. +This analyzer applies a lowercase filter and then splits on whitespace and the following delimiters: `=`, `?`, `:`, `[`, `]`, `{`, `}`, `"`, `\`, `'`. [span queries](/reference/query-languages/query-dsl/span-queries.md) are not supported with this field, use [interval queries](/reference/query-languages/query-dsl/query-dsl-intervals-query.md) instead, or the [`text`](#text-field-type) field type if you absolutely need span queries. @@ -371,16 +371,14 @@ Like `text`, `pattern_text` does not support sorting and has only limited suppor ### Phrase matching Pattern text supports an `index_options` parameter with valid values of `docs` and `positions`. -The default values is `docs`, which makes `pattern_text` behave similarly to `match_only_text` for phrase queries. +The default value is `docs`, which makes `pattern_text` behave similarly to `match_only_text` for phrase queries. Specifically, positions are not stored, which reduces the index size at the cost of slowing down phrase queries. 
If `index_options` is set to `positions`, positions are stored and `pattern_text` will support fast phrase queries. -In both case, all queries return a constant score of 1.0. +In both cases, all queries return a constant score of 1.0. ### Index sorting for improved compression - -The compression provided by `pattern_text` can be improved significantly if the index is sorted by the `template_id`. -For example, of typical approach would be to sort first by `message.template_id`, then by `@timestamp`, as in the following example. - +The compression provided by `pattern_text` can be significantly improved if the index is sorted by the `template_id` field. +For example, a typical approach would be to sort first by `message.template_id`, then by `@timestamp`, as shown in the following example. ```console PUT logs @@ -412,7 +410,8 @@ PUT logs The following mapping parameters are accepted: [`analyzer`](/reference/elasticsearch/mapping-reference/analyzer.md) -: The [analyzer](docs-content://manage-data/data-store/text-analysis.md) which should be used for the `text` field, both at index-time and at search-time (unless overridden by the [`search_analyzer`](/reference/elasticsearch/mapping-reference/search-analyzer.md)). Defaults to a custom delimiter-based analyzer. This analyzer applies a lowercase filter then splits on whitespace, and the followings character: `=, ?, :, [, ], {, }, ", \, '`. +: The [analyzer](docs-content://manage-data/data-store/text-analysis.md) which should be used for the `pattern_text` field, both at index-time and at search-time (unless overridden by the [`search_analyzer`](/reference/elasticsearch/mapping-reference/search-analyzer.md)). Defaults to a custom delimiter-based analyzer. +This analyzer applies a lowercase filter and then splits on whitespace and the following delimiters: `=`, `?`, `:`, `[`, `]`, `{`, `}`, `"`, `\`, `'`. 
[`index_options`](/reference/elasticsearch/mapping-reference/index-options.md) : What information should be stored in the index, for search and highlighting purposes. Valid values are `docs` and `positions`. Defaults to `docs`. From 9ec090fe4251ef42322d8a1f350f921e0e8c332f Mon Sep 17 00:00:00 2001 From: Parker Timmins Date: Thu, 2 Oct 2025 09:07:38 -0500 Subject: [PATCH 03/22] Incorrect sort settings --- docs/reference/elasticsearch/mapping-reference/text.md | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/docs/reference/elasticsearch/mapping-reference/text.md b/docs/reference/elasticsearch/mapping-reference/text.md index 2251599dd8492..4216b992eec63 100644 --- a/docs/reference/elasticsearch/mapping-reference/text.md +++ b/docs/reference/elasticsearch/mapping-reference/text.md @@ -385,12 +385,10 @@ PUT logs { "settings": { "index": { - "sort": { - "field": ["message.template_id", "@timestamp"], - "order": ["asc", "desc"] - } + "sort.field": [ "message.template_id", "@timestamp" ], + "sort.order": [ "asc", "desc" ] } - } + }, "mappings": { "properties": { "@timestamp": { From db7491fcc1a6ebfe1baa5bedd11576fa53b103c4 Mon Sep 17 00:00:00 2001 From: Parker Timmins Date: Thu, 2 Oct 2025 10:47:37 -0500 Subject: [PATCH 04/22] broken link --- docs/reference/elasticsearch/mapping-reference/text.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/reference/elasticsearch/mapping-reference/text.md b/docs/reference/elasticsearch/mapping-reference/text.md index 4216b992eec63..458c8e8a8ecf0 100644 --- a/docs/reference/elasticsearch/mapping-reference/text.md +++ b/docs/reference/elasticsearch/mapping-reference/text.md @@ -11,7 +11,7 @@ The text family includes the following field types: * [`text`](#text-field-type), the traditional field type for full-text content such as the body of an email or the description of a product. 
* [`match_only_text`](#match-only-text-field-type), a space-optimized variant of `text` that disables scoring and performs slower on queries that need positions. It is best suited for indexing log messages. -* [`pattern_text`](#pattern-text-text-field-type), a variant of `text` with improved space efficiency when storing log messages. +* [`pattern_text`](#pattern-text-field-type), a variant of `text` with improved space efficiency when storing log messages. ## Text field type [text-field-type] From fa2d7313a03ff761aade696387f104318cc0e810 Mon Sep 17 00:00:00 2001 From: Parker Timmins Date: Thu, 2 Oct 2025 11:31:44 -0500 Subject: [PATCH 05/22] Add applies_to badge --- docs/reference/elasticsearch/mapping-reference/text.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/docs/reference/elasticsearch/mapping-reference/text.md b/docs/reference/elasticsearch/mapping-reference/text.md index 458c8e8a8ecf0..a48160dffd528 100644 --- a/docs/reference/elasticsearch/mapping-reference/text.md +++ b/docs/reference/elasticsearch/mapping-reference/text.md @@ -343,8 +343,12 @@ The following mapping parameters are accepted: ## Pattern text field type [pattern-text-field-type] -::::{warning} +```{applies_to} +serverless: preview +stack: preview 9.2 +``` +::::{warning} This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features. 
:::: From 67aa9dd7c2b25cee46be9bf24d199ee13d6175e4 Mon Sep 17 00:00:00 2001 From: Parker Timmins Date: Tue, 7 Oct 2025 08:46:41 -0500 Subject: [PATCH 06/22] Update docs/reference/elasticsearch/mapping-reference/text.md Co-authored-by: Liam Thompson --- docs/reference/elasticsearch/mapping-reference/text.md | 3 --- 1 file changed, 3 deletions(-) diff --git a/docs/reference/elasticsearch/mapping-reference/text.md b/docs/reference/elasticsearch/mapping-reference/text.md index a48160dffd528..18aa22963b481 100644 --- a/docs/reference/elasticsearch/mapping-reference/text.md +++ b/docs/reference/elasticsearch/mapping-reference/text.md @@ -348,9 +348,6 @@ serverless: preview stack: preview 9.2 ``` -::::{warning} -This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features. -:::: A variant of [`text`](#text-field-type) with improved space efficiency for log data. Internally, it decomposes values into static parts that are likely to be shared among many values, and dynamic parts that tend to vary. From cd99011507a53ecbc949af2df53e48d812e4a9c4 Mon Sep 17 00:00:00 2001 From: Parker Timmins Date: Tue, 7 Oct 2025 08:47:28 -0500 Subject: [PATCH 07/22] Update docs/reference/elasticsearch/mapping-reference/text.md Co-authored-by: Liam Thompson --- docs/reference/elasticsearch/mapping-reference/text.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/reference/elasticsearch/mapping-reference/text.md b/docs/reference/elasticsearch/mapping-reference/text.md index 18aa22963b481..716ac8c0e6acd 100644 --- a/docs/reference/elasticsearch/mapping-reference/text.md +++ b/docs/reference/elasticsearch/mapping-reference/text.md @@ -349,7 +349,7 @@ stack: preview 9.2 ``` -A variant of [`text`](#text-field-type) with improved space efficiency for log data. 
+The `pattern_text` field type is a variant of [`text`](#text-field-type) with improved space efficiency for log data. Internally, it decomposes values into static parts that are likely to be shared among many values, and dynamic parts that tend to vary. The static parts usually come from the explanatory text of a log message, while the dynamic parts are the variables that were interpolated into the logs. This decomposition allows for improved compression on log-like data. From 180ba1ecbe30aa410ca9a005df2ed1e77f1bf3b9 Mon Sep 17 00:00:00 2001 From: Parker Timmins Date: Tue, 7 Oct 2025 08:48:07 -0500 Subject: [PATCH 08/22] Update docs/reference/elasticsearch/mapping-reference/text.md Co-authored-by: Liam Thompson --- docs/reference/elasticsearch/mapping-reference/text.md | 1 - 1 file changed, 1 deletion(-) diff --git a/docs/reference/elasticsearch/mapping-reference/text.md b/docs/reference/elasticsearch/mapping-reference/text.md index 716ac8c0e6acd..2fec92f5520e2 100644 --- a/docs/reference/elasticsearch/mapping-reference/text.md +++ b/docs/reference/elasticsearch/mapping-reference/text.md @@ -357,7 +357,6 @@ This decomposition allows for improved compression on log-like data. We call the static portion of the value the `template`. Although the template cannot be accessed directly, a separate field called `.template_id` is accessible. This field is a hash of the template and can be used to group similar values. -As this feature is in technical preview, the internal structure of the template is subject to change. Because of this, `.template_id` is also subject to future changes. Unlike most mapping types, `pattern_text` does not support multiple values for a given field per document. 
From fe11cdf1689a9b7246a90f571e613aff05bfd7b8 Mon Sep 17 00:00:00 2001 From: Parker Timmins Date: Tue, 7 Oct 2025 08:48:15 -0500 Subject: [PATCH 09/22] Update docs/reference/elasticsearch/mapping-reference/text.md Co-authored-by: Liam Thompson --- docs/reference/elasticsearch/mapping-reference/text.md | 1 - 1 file changed, 1 deletion(-) diff --git a/docs/reference/elasticsearch/mapping-reference/text.md b/docs/reference/elasticsearch/mapping-reference/text.md index 2fec92f5520e2..8c193d2285d44 100644 --- a/docs/reference/elasticsearch/mapping-reference/text.md +++ b/docs/reference/elasticsearch/mapping-reference/text.md @@ -357,7 +357,6 @@ This decomposition allows for improved compression on log-like data. We call the static portion of the value the `template`. Although the template cannot be accessed directly, a separate field called `.template_id` is accessible. This field is a hash of the template and can be used to group similar values. -Because of this, `.template_id` is also subject to future changes. Unlike most mapping types, `pattern_text` does not support multiple values for a given field per document. If a document is created with multiple values for a pattern_text field, an error will be returned. From 75aa09de3a44d388427f9fd34a302067a73f9855 Mon Sep 17 00:00:00 2001 From: Parker Timmins Date: Tue, 7 Oct 2025 08:49:21 -0500 Subject: [PATCH 10/22] Update docs/reference/elasticsearch/mapping-reference/text.md Co-authored-by: Liam Thompson --- docs/reference/elasticsearch/mapping-reference/text.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/docs/reference/elasticsearch/mapping-reference/text.md b/docs/reference/elasticsearch/mapping-reference/text.md index 8c193d2285d44..49fcb35212283 100644 --- a/docs/reference/elasticsearch/mapping-reference/text.md +++ b/docs/reference/elasticsearch/mapping-reference/text.md @@ -358,6 +358,8 @@ We call the static portion of the value the `template`. 
Although the template cannot be accessed directly, a separate field called `.template_id` is accessible. This field is a hash of the template and can be used to group similar values. +### Limitations + Unlike most mapping types, `pattern_text` does not support multiple values for a given field per document. If a document is created with multiple values for a pattern_text field, an error will be returned. From 54269848ba3633977a76c6cb8e6dc5027d479474 Mon Sep 17 00:00:00 2001 From: Parker Timmins Date: Tue, 7 Oct 2025 08:55:45 -0500 Subject: [PATCH 11/22] Reorder so limitations are in separate section --- docs/reference/elasticsearch/mapping-reference/text.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/reference/elasticsearch/mapping-reference/text.md b/docs/reference/elasticsearch/mapping-reference/text.md index 49fcb35212283..f7593e33770d3 100644 --- a/docs/reference/elasticsearch/mapping-reference/text.md +++ b/docs/reference/elasticsearch/mapping-reference/text.md @@ -358,14 +358,14 @@ We call the static portion of the value the `template`. Although the template cannot be accessed directly, a separate field called `.template_id` is accessible. This field is a hash of the template and can be used to group similar values. +Analysis is configurable but defaults to a delimiter-based analyzer. +This analyzer applies a lowercase filter and then splits on whitespace and the following delimiters: `=`, `?`, `:`, `[`, `]`, `{`, `}`, `"`, `\`, `'`. + ### Limitations Unlike most mapping types, `pattern_text` does not support multiple values for a given field per document. If a document is created with multiple values for a pattern_text field, an error will be returned. -Analysis is configurable but defaults to a delimiter-based analyzer. -This analyzer applies a lowercase filter and then splits on whitespace and the following delimiters: `=`, `?`, `:`, `[`, `]`, `{`, `}`, `"`, `\`, `'`. 
- [span queries](/reference/query-languages/query-dsl/span-queries.md) are not supported with this field, use [interval queries](/reference/query-languages/query-dsl/query-dsl-intervals-query.md) instead, or the [`text`](#text-field-type) field type if you absolutely need span queries. Like `text`, `pattern_text` does not support sorting and has only limited support for aggregations. From 4e6f29cb4c630f588aee6f5970ab1d931df7aedb Mon Sep 17 00:00:00 2001 From: Parker Timmins Date: Tue, 7 Oct 2025 08:56:18 -0500 Subject: [PATCH 12/22] Add mention of subscription --- docs/reference/elasticsearch/mapping-reference/text.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/docs/reference/elasticsearch/mapping-reference/text.md b/docs/reference/elasticsearch/mapping-reference/text.md index f7593e33770d3..7135a31362ac9 100644 --- a/docs/reference/elasticsearch/mapping-reference/text.md +++ b/docs/reference/elasticsearch/mapping-reference/text.md @@ -347,7 +347,9 @@ The following mapping parameters are accepted: serverless: preview stack: preview 9.2 ``` - +:::{note} +This feature requires a [subscription](https://www.elastic.co/subscriptions). +::: The `pattern_text` field type is a variant of [`text`](#text-field-type) with improved space efficiency for log data. Internally, it decomposes values into static parts that are likely to be shared among many values, and dynamic parts that tend to vary. 
From 5fd13e8c06e48347f95012cd51e74755aaed2d26 Mon Sep 17 00:00:00 2001 From: Parker Timmins Date: Tue, 7 Oct 2025 09:01:34 -0500 Subject: [PATCH 13/22] mention standard analyzer is supported --- docs/reference/elasticsearch/mapping-reference/text.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/docs/reference/elasticsearch/mapping-reference/text.md b/docs/reference/elasticsearch/mapping-reference/text.md index 7135a31362ac9..fc4308e5aefcc 100644 --- a/docs/reference/elasticsearch/mapping-reference/text.md +++ b/docs/reference/elasticsearch/mapping-reference/text.md @@ -411,8 +411,9 @@ PUT logs The following mapping parameters are accepted: [`analyzer`](/reference/elasticsearch/mapping-reference/analyzer.md) -: The [analyzer](docs-content://manage-data/data-store/text-analysis.md) which should be used for the `pattern_text` field, both at index-time and at search-time (unless overridden by the [`search_analyzer`](/reference/elasticsearch/mapping-reference/search-analyzer.md)). Defaults to a custom delimiter-based analyzer. -This analyzer applies a lowercase filter and then splits on whitespace and the following delimiters: `=`, `?`, `:`, `[`, `]`, `{`, `}`, `"`, `\`, `'`. +: The [analyzer](docs-content://manage-data/data-store/text-analysis.md) which should be used for the `pattern_text` field, both at index-time and at search-time (unless overridden by the [`search_analyzer`](/reference/elasticsearch/mapping-reference/search-analyzer.md)). +Supports a delimiter-based analyzer and the standard analyzer, as is used in `match_only_text` mappings. +Defaults to the delimiter-based analyzer, which applies a lowercase filter and then splits on whitespace and the following delimiters: `=`, `?`, `:`, `[`, `]`, `{`, `}`, `"`, `\`, `'`. [`index_options`](/reference/elasticsearch/mapping-reference/index-options.md) : What information should be stored in the index, for search and highlighting purposes. Valid values are `docs` and `positions`. 
Defaults to `docs`. From 64b2bd50cb088ad325d70a23fd7cd40c14abdf8c Mon Sep 17 00:00:00 2001 From: Parker Timmins Date: Tue, 7 Oct 2025 09:39:24 -0500 Subject: [PATCH 14/22] Split text family types into separate docs pages --- .../mapping-reference/field-data-types.md | 4 +- .../mapping-reference/match-only-text.md | 43 ++++++++++ .../mapping-reference/pattern-text.md | 85 +++++++++++++++++++ .../mapping-reference/text-type-family.md | 15 ++++ .../elasticsearch/mapping-reference/text.md | 10 --- 5 files changed, 145 insertions(+), 12 deletions(-) create mode 100644 docs/reference/elasticsearch/mapping-reference/match-only-text.md create mode 100644 docs/reference/elasticsearch/mapping-reference/pattern-text.md create mode 100644 docs/reference/elasticsearch/mapping-reference/text-type-family.md diff --git a/docs/reference/elasticsearch/mapping-reference/field-data-types.md b/docs/reference/elasticsearch/mapping-reference/field-data-types.md index 0265546eedbec..a941af41fcc0a 100644 --- a/docs/reference/elasticsearch/mapping-reference/field-data-types.md +++ b/docs/reference/elasticsearch/mapping-reference/field-data-types.md @@ -77,8 +77,8 @@ Dates ### Text search types [text-search-types] -[`text` fields](/reference/elasticsearch/mapping-reference/text.md) -: The text family, including `text` and `match_only_text`. Analyzed, unstructured text. +[`text` fields](/reference/elasticsearch/mapping-reference/text-type-family.md) +: The text family, including `text`, `match_only_text`, and `pattern_text`. Analyzed, unstructured text. [`annotated-text`](/reference/elasticsearch-plugins/mapper-annotated-text.md) : Text containing special markup. Used for identifying named entities. 
diff --git a/docs/reference/elasticsearch/mapping-reference/match-only-text.md b/docs/reference/elasticsearch/mapping-reference/match-only-text.md new file mode 100644 index 0000000000000..4b563969119e0 --- /dev/null +++ b/docs/reference/elasticsearch/mapping-reference/match-only-text.md @@ -0,0 +1,43 @@ +--- +navigation_title: "Match Only Text" +mapped_pages: + - https://www.elastic.co/guide/en/elasticsearch/reference/current/match-only-text.html +--- + +## Match-only text field type [match-only-text-field-type] + +A variant of [`text`](#text-field-type) that trades scoring and efficiency of positional queries for space efficiency. This field effectively stores data the same way as a `text` field that only indexes documents (`index_options: docs`) and disables norms (`norms: false`). Term queries perform as fast if not faster as on `text` fields, however queries that need positions such as the [`match_phrase` query](/reference/query-languages/query-dsl/query-dsl-match-query-phrase.md) perform slower as they need to look at the `_source` document to verify whether a phrase matches. All queries return constant scores that are equal to 1.0. + +Analysis is not configurable: text is always analyzed with the [default analyzer](docs-content://manage-data/data-store/text-analysis/specify-an-analyzer.md#specify-index-time-default-analyzer) ([`standard`](/reference/text-analysis/analysis-standard-analyzer.md) by default). + +[span queries](/reference/query-languages/query-dsl/span-queries.md) are not supported with this field, use [interval queries](/reference/query-languages/query-dsl/query-dsl-intervals-query.md) instead, or the [`text`](#text-field-type) field type if you absolutely need span queries. + +Other than that, `match_only_text` supports the same queries as `text`. And like `text`, it does not support sorting and has only limited support for aggregations. 
+ +```console +PUT logs +{ + "mappings": { + "properties": { + "@timestamp": { + "type": "date" + }, + "message": { + "type": "match_only_text" + } + } + } +} +``` + + +## Parameters for match-only text fields [match-only-text-params] + +The following mapping parameters are accepted: + +[`fields`](/reference/elasticsearch/mapping-reference/multi-fields.md) +: Multi-fields allow the same string value to be indexed in multiple ways for different purposes, such as one field for search and a multi-field for sorting and aggregations, or the same string value analyzed by different analyzers. + +[`meta`](/reference/elasticsearch/mapping-reference/mapping-field-meta.md) +: Metadata about the field. + diff --git a/docs/reference/elasticsearch/mapping-reference/pattern-text.md b/docs/reference/elasticsearch/mapping-reference/pattern-text.md new file mode 100644 index 0000000000000..f67a6be9b214a --- /dev/null +++ b/docs/reference/elasticsearch/mapping-reference/pattern-text.md @@ -0,0 +1,85 @@ +--- +navigation_title: "Pattern Text" +mapped_pages: + - https://www.elastic.co/guide/en/elasticsearch/reference/current/pattern-text.html +--- + +## Pattern text field type [pattern-text-field-type] +```{applies_to} +serverless: preview +stack: preview 9.2 +``` +:::{note} +This feature requires a [subscription](https://www.elastic.co/subscriptions). +::: + +The `pattern_text` field type is a variant of [`text`](#text-field-type) with improved space efficiency for log data. +Internally, it decomposes values into static parts that are likely to be shared among many values, and dynamic parts that tend to vary. +The static parts usually come from the explanatory text of a log message, while the dynamic parts are the variables that were interpolated into the logs. +This decomposition allows for improved compression on log-like data. + +We call the static portion of the value the `template`. 
+Although the template cannot be accessed directly, a separate field called `.template_id` is accessible. +This field is a hash of the template and can be used to group similar values. + +Analysis is configurable but defaults to a delimiter-based analyzer. +This analyzer applies a lowercase filter and then splits on whitespace and the following delimiters: `=`, `?`, `:`, `[`, `]`, `{`, `}`, `"`, `\`, `'`. + +## Limitations + +Unlike most mapping types, `pattern_text` does not support multiple values for a given field per document. +If a document is created with multiple values for a `pattern_text` field, an error will be returned. + +[Span queries](/reference/query-languages/query-dsl/span-queries.md) are not supported with this field; use [interval queries](/reference/query-languages/query-dsl/query-dsl-intervals-query.md) instead, or the [`text`](#text-field-type) field type if you absolutely need span queries. + +Like `text`, `pattern_text` does not support sorting and has only limited support for aggregations. + +## Phrase matching +Pattern text supports an `index_options` parameter with valid values of `docs` and `positions`. +The default value is `docs`, which makes `pattern_text` behave similarly to `match_only_text` for phrase queries. +Specifically, positions are not stored, which reduces the index size at the cost of slowing down phrase queries. +If `index_options` is set to `positions`, positions are stored and `pattern_text` will support fast phrase queries. +In both cases, all queries return a constant score of 1.0. + +## Index sorting for improved compression +The compression provided by `pattern_text` can be significantly improved if the index is sorted by the `template_id` field. +For example, a typical approach would be to sort first by `message.template_id`, then by `@timestamp`, as shown in the following example. 
+ +```console +PUT logs +{ + "settings": { + "index": { + "sort.field": [ "message.template_id", "@timestamp" ], + "sort.order": [ "asc", "desc" ] + } + }, + "mappings": { + "properties": { + "@timestamp": { + "type": "date" + }, + "message": { + "type": "pattern_text" + } + } + } +} +``` + + +## Parameters for pattern text fields [pattern-text-params] + +The following mapping parameters are accepted: + +[`analyzer`](/reference/elasticsearch/mapping-reference/analyzer.md) +: The [analyzer](docs-content://manage-data/data-store/text-analysis.md) which should be used for the `pattern_text` field, both at index-time and at search-time (unless overridden by the [`search_analyzer`](/reference/elasticsearch/mapping-reference/search-analyzer.md)). +Supports a delimiter-based analyzer and the standard analyzer, as is used in `match_only_text` mappings. +Defaults to the delimiter-based analyzer, which applies a lowercase filter and then splits on whitespace and the following delimiters: `=`, `?`, `:`, `[`, `]`, `{`, `}`, `"`, `\`, `'`. + +[`index_options`](/reference/elasticsearch/mapping-reference/index-options.md) +: What information should be stored in the index, for search and highlighting purposes. Valid values are `docs` and `positions`. Defaults to `docs`. + +[`meta`](/reference/elasticsearch/mapping-reference/mapping-field-meta.md) +: Metadata about the field. 
+ diff --git a/docs/reference/elasticsearch/mapping-reference/text-type-family.md b/docs/reference/elasticsearch/mapping-reference/text-type-family.md new file mode 100644 index 0000000000000..8e6780b8ce413 --- /dev/null +++ b/docs/reference/elasticsearch/mapping-reference/text-type-family.md @@ -0,0 +1,15 @@ +--- +navigation_title: "Text type family" +mapped_pages: + - https://www.elastic.co/guide/en/elasticsearch/reference/current/text-type-family.html +--- + +# Text type family [text] + + +The text family includes the following field types: + +* [`text`](/reference/elasticsearch/mapping-reference/text.md), the traditional field type for full-text content such as the body of an email or the description of a product. +* [`match_only_text`](/reference/elasticsearch/mapping-reference/match-only-text.md), a space-optimized variant of `text` that disables scoring and performs slower on queries that need positions. It is best suited for indexing log messages. +* [`pattern_text`](/reference/elasticsearch/mapping-reference/pattern-text.md), a variant of `text` with improved space efficiency when storing log messages. + diff --git a/docs/reference/elasticsearch/mapping-reference/text.md b/docs/reference/elasticsearch/mapping-reference/text.md index fc4308e5aefcc..95800f8add15e 100644 --- a/docs/reference/elasticsearch/mapping-reference/text.md +++ b/docs/reference/elasticsearch/mapping-reference/text.md @@ -4,16 +4,6 @@ mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/text.html --- -# Text type family [text] - - -The text family includes the following field types: - -* [`text`](#text-field-type), the traditional field type for full-text content such as the body of an email or the description of a product. -* [`match_only_text`](#match-only-text-field-type), a space-optimized variant of `text` that disables scoring and performs slower on queries that need positions. It is best suited for indexing log messages. 
-* [`pattern_text`](#pattern-text-field-type), a variant of `text` with improved space efficiency when storing log messages. - - ## Text field type [text-field-type] A field to index full-text values, such as the body of an email or the description of a product. These fields are `analyzed`, that is they are passed through an [analyzer](docs-content://manage-data/data-store/text-analysis.md) to convert the string into a list of individual terms before being indexed. The analysis process allows Elasticsearch to search for individual words *within* each full text field. Text fields are not used for sorting and seldom used for aggregations (although the [significant text aggregation](/reference/aggregations/search-aggregations-bucket-significanttext-aggregation.md) is a notable exception). From 1f2c619050494793d3cb6fa0ee21e773799f2526 Mon Sep 17 00:00:00 2001 From: Parker Timmins Date: Tue, 7 Oct 2025 09:41:01 -0500 Subject: [PATCH 15/22] Remove match_only_text and pattern_text from text docs page --- .../elasticsearch/mapping-reference/text.md | 119 ------------------ 1 file changed, 119 deletions(-) diff --git a/docs/reference/elasticsearch/mapping-reference/text.md b/docs/reference/elasticsearch/mapping-reference/text.md index 95800f8add15e..4ee53aef222c5 100644 --- a/docs/reference/elasticsearch/mapping-reference/text.md +++ b/docs/reference/elasticsearch/mapping-reference/text.md @@ -292,122 +292,3 @@ PUT my-index-000001 } } ``` - - -## Match-only text field type [match-only-text-field-type] - -A variant of [`text`](#text-field-type) that trades scoring and efficiency of positional queries for space efficiency. This field effectively stores data the same way as a `text` field that only indexes documents (`index_options: docs`) and disables norms (`norms: false`). 
Term queries perform as fast if not faster as on `text` fields, however queries that need positions such as the [`match_phrase` query](/reference/query-languages/query-dsl/query-dsl-match-query-phrase.md) perform slower as they need to look at the `_source` document to verify whether a phrase matches. All queries return constant scores that are equal to 1.0. - -Analysis is not configurable: text is always analyzed with the [default analyzer](docs-content://manage-data/data-store/text-analysis/specify-an-analyzer.md#specify-index-time-default-analyzer) ([`standard`](/reference/text-analysis/analysis-standard-analyzer.md) by default). - -[span queries](/reference/query-languages/query-dsl/span-queries.md) are not supported with this field, use [interval queries](/reference/query-languages/query-dsl/query-dsl-intervals-query.md) instead, or the [`text`](#text-field-type) field type if you absolutely need span queries. - -Other than that, `match_only_text` supports the same queries as `text`. And like `text`, it does not support sorting and has only limited support for aggregations. - -```console -PUT logs -{ - "mappings": { - "properties": { - "@timestamp": { - "type": "date" - }, - "message": { - "type": "match_only_text" - } - } - } -} -``` - - -### Parameters for match-only text fields [match-only-text-params] - -The following mapping parameters are accepted: - -[`fields`](/reference/elasticsearch/mapping-reference/multi-fields.md) -: Multi-fields allow the same string value to be indexed in multiple ways for different purposes, such as one field for search and a multi-field for sorting and aggregations, or the same string value analyzed by different analyzers. - -[`meta`](/reference/elasticsearch/mapping-reference/mapping-field-meta.md) -: Metadata about the field. 
- - -## Pattern text field type [pattern-text-field-type] -```{applies_to} -serverless: preview -stack: preview 9.2 -``` -:::{note} -This feature requires a [subscription](https://www.elastic.co/subscriptions). -::: - -The `pattern_text` field type is a variant of [`text`](#text-field-type) with improved space efficiency for log data. -Internally, it decomposes values into static parts that are likely to be shared among many values, and dynamic parts that tend to vary. -The static parts usually come from the explanatory text of a log message, while the dynamic parts are the variables that were interpolated into the logs. -This decomposition allows for improved compression on log-like data. - -We call the static portion of the value the `template`. -Although the template cannot be accessed directly, a separate field called `.template_id` is accessible. -This field is a hash of the template and can be used to group similar values. - -Analysis is configurable but defaults to a delimiter-based analyzer. -This analyzer applies a lowercase filter and then splits on whitespace and the following delimiters: `=`, `?`, `:`, `[`, `]`, `{`, `}`, `"`, `\`, `'`. - -### Limitations - -Unlike most mapping types, `pattern_text` does not support multiple values for a given field per document. -If a document is created with multiple values for a pattern_text field, an error will be returned. - -[span queries](/reference/query-languages/query-dsl/span-queries.md) are not supported with this field, use [interval queries](/reference/query-languages/query-dsl/query-dsl-intervals-query.md) instead, or the [`text`](#text-field-type) field type if you absolutely need span queries. - -Like `text`, `pattern_text` does not support sorting and has only limited support for aggregations. - -### Phrase matching -Pattern text supports an `index_options` parameter with valid values of `docs` and `positions`. 
-The default value is `docs`, which makes `pattern_text` behave similarly to `match_only_text` for phrase queries. -Specifically, positions are not stored, which reduces the index size at the cost of slowing down phrase queries. -If `index_options` is set to `positions`, positions are stored and `pattern_text` will support fast phrase queries. -In both cases, all queries return a constant score of 1.0. - -### Index sorting for improved compression -The compression provided by `pattern_text` can be significantly improved if the index is sorted by the `template_id` field. -For example, a typical approach would be to sort first by `message.template_id`, then by `@timestamp`, as shown in the following example. - -```console -PUT logs -{ - "settings": { - "index": { - "sort.field": [ "message.template_id", "@timestamp" ], - "sort.order": [ "asc", "desc" ] - } - }, - "mappings": { - "properties": { - "@timestamp": { - "type": "date" - }, - "message": { - "type": "pattern_text" - } - } - } -} -``` - - -### Parameters for pattern text fields [pattern-text-params] - -The following mapping parameters are accepted: - -[`analyzer`](/reference/elasticsearch/mapping-reference/analyzer.md) -: The [analyzer](docs-content://manage-data/data-store/text-analysis.md) which should be used for the `pattern_text` field, both at index-time and at search-time (unless overridden by the [`search_analyzer`](/reference/elasticsearch/mapping-reference/search-analyzer.md)). -Supports a delimiter-based analyzer and the standard analyzer, as is used in `match_only_text` mappings. -Defaults to the delimiter-based analyzer, which applies a lowercase filter and then splits on whitespace and the following delimiters: `=`, `?`, `:`, `[`, `]`, `{`, `}`, `"`, `\`, `'`. - -[`index_options`](/reference/elasticsearch/mapping-reference/index-options.md) -: What information should be stored in the index, for search and highlighting purposes. Valid values are `docs` and `positions`. Defaults to `docs`. 
- -[`meta`](/reference/elasticsearch/mapping-reference/mapping-field-meta.md) -: Metadata about the field. - From dd93135e0aec58b907d929243111adc2482e7433 Mon Sep 17 00:00:00 2001 From: Parker Timmins Date: Tue, 7 Oct 2025 10:06:52 -0500 Subject: [PATCH 16/22] Fix a few build errors --- .../elasticsearch/mapping-reference/match-only-text.md | 2 +- .../elasticsearch/mapping-reference/pattern-text.md | 2 +- docs/reference/elasticsearch/mapping-reference/text.md | 2 +- .../elasticsearch/rest-apis/highlighting-settings.md | 6 +++--- 4 files changed, 6 insertions(+), 6 deletions(-) diff --git a/docs/reference/elasticsearch/mapping-reference/match-only-text.md b/docs/reference/elasticsearch/mapping-reference/match-only-text.md index 4b563969119e0..5a05d5e98f0eb 100644 --- a/docs/reference/elasticsearch/mapping-reference/match-only-text.md +++ b/docs/reference/elasticsearch/mapping-reference/match-only-text.md @@ -4,7 +4,7 @@ mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/match-only-text.html --- -## Match-only text field type [match-only-text-field-type] +# Match-only text field type [match-only-text-field-type] A variant of [`text`](#text-field-type) that trades scoring and efficiency of positional queries for space efficiency. This field effectively stores data the same way as a `text` field that only indexes documents (`index_options: docs`) and disables norms (`norms: false`). Term queries perform as fast if not faster as on `text` fields, however queries that need positions such as the [`match_phrase` query](/reference/query-languages/query-dsl/query-dsl-match-query-phrase.md) perform slower as they need to look at the `_source` document to verify whether a phrase matches. All queries return constant scores that are equal to 1.0. 
diff --git a/docs/reference/elasticsearch/mapping-reference/pattern-text.md b/docs/reference/elasticsearch/mapping-reference/pattern-text.md index f67a6be9b214a..8abf847195c5d 100644 --- a/docs/reference/elasticsearch/mapping-reference/pattern-text.md +++ b/docs/reference/elasticsearch/mapping-reference/pattern-text.md @@ -4,7 +4,7 @@ mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/pattern-text.html --- -## Pattern text field type [pattern-text-field-type] +# Pattern text field type [pattern-text-field-type] ```{applies_to} serverless: preview stack: preview 9.2 diff --git a/docs/reference/elasticsearch/mapping-reference/text.md b/docs/reference/elasticsearch/mapping-reference/text.md index 4ee53aef222c5..962e4a15e8482 100644 --- a/docs/reference/elasticsearch/mapping-reference/text.md +++ b/docs/reference/elasticsearch/mapping-reference/text.md @@ -4,7 +4,7 @@ mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/text.html --- -## Text field type [text-field-type] +# Text field type [text-field-type] A field to index full-text values, such as the body of an email or the description of a product. These fields are `analyzed`, that is they are passed through an [analyzer](docs-content://manage-data/data-store/text-analysis.md) to convert the string into a list of individual terms before being indexed. The analysis process allows Elasticsearch to search for individual words *within* each full text field. Text fields are not used for sorting and seldom used for aggregations (although the [significant text aggregation](/reference/aggregations/search-aggregations-bucket-significanttext-aggregation.md) is a notable exception). 
diff --git a/docs/reference/elasticsearch/rest-apis/highlighting-settings.md b/docs/reference/elasticsearch/rest-apis/highlighting-settings.md index a1874cc637f9d..ff3fe35f45439 100644 --- a/docs/reference/elasticsearch/rest-apis/highlighting-settings.md +++ b/docs/reference/elasticsearch/rest-apis/highlighting-settings.md @@ -51,10 +51,10 @@ encoder : Indicates if the snippet should be HTML encoded: `default` (no encoding) or `html` (HTML-escape the snippet text and then insert the highlighting tags) fields -: Specifies the fields to retrieve highlights for. You can use wildcards to specify fields. For example, you could specify `comment_*` to get highlights for all [text](/reference/elasticsearch/mapping-reference/text.md), [match_only_text](/reference/elasticsearch/mapping-reference/text.md#match-only-text-field-type), and [keyword](/reference/elasticsearch/mapping-reference/keyword.md) fields that start with `comment_`. +: Specifies the fields to retrieve highlights for. You can use wildcards to specify fields. For example, you could specify `comment_*` to get highlights for all [text](/reference/elasticsearch/mapping-reference/text.md), [match_only_text](/reference/elasticsearch/mapping-reference/match-only-text.md), [pattern_text](/reference/elasticsearch/mapping-reference/pattern-text.md), and [keyword](/reference/elasticsearch/mapping-reference/keyword.md) fields that start with `comment_`. ::::{note} - Only text, match_only_text, and keyword fields are highlighted when you use wildcards. If you use a custom mapper and want to highlight on a field anyway, you must explicitly specify that field name. + Only text, match_only_text, pattern_text, and keyword fields are highlighted when you use wildcards. If you use a custom mapper and want to highlight on a field anyway, you must explicitly specify that field name. :::: $$$fragmenter$$$ @@ -147,4 +147,4 @@ tags_schema $$$highlighter-type$$$ type -: The highlighter to use: `unified`, `plain`, or `fvh`. 
Defaults to `unified`. \ No newline at end of file +: The highlighter to use: `unified`, `plain`, or `fvh`. Defaults to `unified`. From 90933cab528e8585e0e585747d9dba03bef3911d Mon Sep 17 00:00:00 2001 From: Parker Timmins Date: Tue, 7 Oct 2025 13:53:44 -0500 Subject: [PATCH 17/22] Add redirects, fix some anchors --- docs/redirects.yml | 5 +++++ .../elasticsearch/mapping-reference/match-only-text.md | 4 ++-- .../elasticsearch/mapping-reference/pattern-text.md | 2 +- 3 files changed, 8 insertions(+), 3 deletions(-) diff --git a/docs/redirects.yml b/docs/redirects.yml index 5d0c161511b8a..f316cdc943a99 100644 --- a/docs/redirects.yml +++ b/docs/redirects.yml @@ -105,3 +105,8 @@ redirects: 'reference/query-languages/esql/kibana/docs/functions/st_geohash_to_long.md': 'reference/query-languages/esql/esql-functions-operators.md' 'reference/query-languages/esql/kibana/docs/functions/st_geotile_to_long.md': 'reference/query-languages/esql/esql-functions-operators.md' 'reference/query-languages/esql/kibana/docs/functions/st_geohex_to_long.md': 'reference/query-languages/esql/esql-functions-operators.md' + + 'reference/elasticsearch/mapping-reference/text.md': + anchors: + 'match-only-text-field-type': 'reference/elasticsearch/mapping-reference/match-only-text.md#match-only-text-field-type' + 'match-only-text-params': 'reference/elasticsearch/mapping-reference/match-only-text.md#match-only-text-params' diff --git a/docs/reference/elasticsearch/mapping-reference/match-only-text.md b/docs/reference/elasticsearch/mapping-reference/match-only-text.md index 5a05d5e98f0eb..8db554d046488 100644 --- a/docs/reference/elasticsearch/mapping-reference/match-only-text.md +++ b/docs/reference/elasticsearch/mapping-reference/match-only-text.md @@ -6,11 +6,11 @@ mapped_pages: # Match-only text field type [match-only-text-field-type] -A variant of [`text`](#text-field-type) that trades scoring and efficiency of positional queries for space efficiency. 
This field effectively stores data the same way as a `text` field that only indexes documents (`index_options: docs`) and disables norms (`norms: false`). Term queries perform as fast if not faster as on `text` fields, however queries that need positions such as the [`match_phrase` query](/reference/query-languages/query-dsl/query-dsl-match-query-phrase.md) perform slower as they need to look at the `_source` document to verify whether a phrase matches. All queries return constant scores that are equal to 1.0. +A variant of [`text`](/reference/elasticsearch/mapping-reference/text.md) that trades scoring and efficiency of positional queries for space efficiency. This field effectively stores data the same way as a `text` field that only indexes documents (`index_options: docs`) and disables norms (`norms: false`). Term queries perform as fast if not faster as on `text` fields, however queries that need positions such as the [`match_phrase` query](/reference/query-languages/query-dsl/query-dsl-match-query-phrase.md) perform slower as they need to look at the `_source` document to verify whether a phrase matches. All queries return constant scores that are equal to 1.0. Analysis is not configurable: text is always analyzed with the [default analyzer](docs-content://manage-data/data-store/text-analysis/specify-an-analyzer.md#specify-index-time-default-analyzer) ([`standard`](/reference/text-analysis/analysis-standard-analyzer.md) by default). -[span queries](/reference/query-languages/query-dsl/span-queries.md) are not supported with this field, use [interval queries](/reference/query-languages/query-dsl/query-dsl-intervals-query.md) instead, or the [`text`](#text-field-type) field type if you absolutely need span queries. 
+[span queries](/reference/query-languages/query-dsl/span-queries.md) are not supported with this field, use [interval queries](/reference/query-languages/query-dsl/query-dsl-intervals-query.md) instead, or the [`text`](/reference/elasticsearch/mapping-reference/text.md) field type if you absolutely need span queries. Other than that, `match_only_text` supports the same queries as `text`. And like `text`, it does not support sorting and has only limited support for aggregations. diff --git a/docs/reference/elasticsearch/mapping-reference/pattern-text.md b/docs/reference/elasticsearch/mapping-reference/pattern-text.md index 8abf847195c5d..7b914cf37c1ba 100644 --- a/docs/reference/elasticsearch/mapping-reference/pattern-text.md +++ b/docs/reference/elasticsearch/mapping-reference/pattern-text.md @@ -13,7 +13,7 @@ stack: preview 9.2 This feature requires a [subscription](https://www.elastic.co/subscriptions). ::: -The `pattern_text` field type is a variant of [`text`](#text-field-type) with improved space efficiency for log data. +The `pattern_text` field type is a variant of [`text`](/reference/elasticsearch/mapping-reference/text.md) with improved space efficiency for log data. Internally, it decomposes values into static parts that are likely to be shared among many values, and dynamic parts that tend to vary. The static parts usually come from the explanatory text of a log message, while the dynamic parts are the variables that were interpolated into the logs. This decomposition allows for improved compression on log-like data. 
From db1ede374486e294298d4763b21bfb5fc6074fe9 Mon Sep 17 00:00:00 2001 From: Parker Timmins Date: Tue, 7 Oct 2025 14:20:56 -0500 Subject: [PATCH 18/22] Change redirect syntax --- docs/redirects.yml | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/docs/redirects.yml b/docs/redirects.yml index f316cdc943a99..860d4ac06ed97 100644 --- a/docs/redirects.yml +++ b/docs/redirects.yml @@ -107,6 +107,8 @@ redirects: 'reference/query-languages/esql/kibana/docs/functions/st_geohex_to_long.md': 'reference/query-languages/esql/esql-functions-operators.md' 'reference/elasticsearch/mapping-reference/text.md': - anchors: - 'match-only-text-field-type': 'reference/elasticsearch/mapping-reference/match-only-text.md#match-only-text-field-type' - 'match-only-text-params': 'reference/elasticsearch/mapping-reference/match-only-text.md#match-only-text-params' + to: 'reference/elasticsearch/mapping-reference/text.md' + anchors: { } # pass-through unlisted anchors in the `many` ruleset + many: + - to: 'reference/elasticsearch/mapping-reference/match-only-text.md' + anchors: { 'match-only-text-field-type', 'match-only-text-params' } From acb57e2a22e848e7c9eb6150bf021effb41767a0 Mon Sep 17 00:00:00 2001 From: Parker Timmins Date: Tue, 7 Oct 2025 14:30:36 -0500 Subject: [PATCH 19/22] Add to toc.yml --- docs/reference/elasticsearch/toc.yml | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/docs/reference/elasticsearch/toc.yml b/docs/reference/elasticsearch/toc.yml index fa1e66aa2364c..c1e67d31ff9eb 100644 --- a/docs/reference/elasticsearch/toc.yml +++ b/docs/reference/elasticsearch/toc.yml @@ -174,7 +174,11 @@ toc: - file: mapping-reference/semantic-text.md - file: mapping-reference/shape.md - file: mapping-reference/sparse-vector.md - - file: mapping-reference/text.md + - file: mapping-reference/text-type-family.md + children: + - file: mapping-reference/text.md + - file: mapping-reference/pattern-text.md + - file: 
mapping-reference/match-only-text-text.md - file: mapping-reference/token-count.md - file: mapping-reference/unsigned-long.md - file: mapping-reference/version.md From 823863f0fbe48fb8f5765a1622eacf92b3bfae25 Mon Sep 17 00:00:00 2001 From: Parker Timmins Date: Tue, 7 Oct 2025 14:33:06 -0500 Subject: [PATCH 20/22] Fix incorrect file name --- docs/reference/elasticsearch/toc.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/reference/elasticsearch/toc.yml b/docs/reference/elasticsearch/toc.yml index c1e67d31ff9eb..748929ea120a4 100644 --- a/docs/reference/elasticsearch/toc.yml +++ b/docs/reference/elasticsearch/toc.yml @@ -178,7 +178,7 @@ toc: children: - file: mapping-reference/text.md - file: mapping-reference/pattern-text.md - - file: mapping-reference/match-only-text-text.md + - file: mapping-reference/match-only-text.md - file: mapping-reference/token-count.md - file: mapping-reference/unsigned-long.md - file: mapping-reference/version.md From ab3d63fc252b1112690c122515b59c152e8a207f Mon Sep 17 00:00:00 2001 From: Parker Timmins Date: Tue, 7 Oct 2025 14:37:39 -0500 Subject: [PATCH 21/22] Another incorrect anchor --- docs/reference/elasticsearch/mapping-reference/pattern-text.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/reference/elasticsearch/mapping-reference/pattern-text.md b/docs/reference/elasticsearch/mapping-reference/pattern-text.md index 7b914cf37c1ba..3bf04523cee66 100644 --- a/docs/reference/elasticsearch/mapping-reference/pattern-text.md +++ b/docs/reference/elasticsearch/mapping-reference/pattern-text.md @@ -30,7 +30,7 @@ This analyzer applies a lowercase filter and then splits on whitespace and the f Unlike most mapping types, `pattern_text` does not support multiple values for a given field per document. If a document is created with multiple values for a pattern_text field, an error will be returned. 
-[span queries](/reference/query-languages/query-dsl/span-queries.md) are not supported with this field, use [interval queries](/reference/query-languages/query-dsl/query-dsl-intervals-query.md) instead, or the [`text`](#text-field-type) field type if you absolutely need span queries. +[span queries](/reference/query-languages/query-dsl/span-queries.md) are not supported with this field, use [interval queries](/reference/query-languages/query-dsl/query-dsl-intervals-query.md) instead, or the [`text`](/reference/elasticsearch/mapping-reference/text.md) field type if you absolutely need span queries. Like `text`, `pattern_text` does not support sorting and has only limited support for aggregations. From a6d1c80358076332d16f83ad0bf380e752637f09 Mon Sep 17 00:00:00 2001 From: Parker Timmins Date: Tue, 7 Oct 2025 14:57:49 -0500 Subject: [PATCH 22/22] Change pattern_text description --- .../elasticsearch/mapping-reference/text-type-family.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/reference/elasticsearch/mapping-reference/text-type-family.md b/docs/reference/elasticsearch/mapping-reference/text-type-family.md index 8e6780b8ce413..d6a6d85553ff5 100644 --- a/docs/reference/elasticsearch/mapping-reference/text-type-family.md +++ b/docs/reference/elasticsearch/mapping-reference/text-type-family.md @@ -11,5 +11,5 @@ The text family includes the following field types: * [`text`](/reference/elasticsearch/mapping-reference/text.md), the traditional field type for full-text content such as the body of an email or the description of a product. * [`match_only_text`](/reference/elasticsearch/mapping-reference/match-only-text.md), a space-optimized variant of `text` that disables scoring and performs slower on queries that need positions. It is best suited for indexing log messages. -* [`pattern_text`](/reference/elasticsearch/mapping-reference/pattern-text.md), a variant of `text` with improved space efficiency when storing log messages. 
+* [`pattern_text`](/reference/elasticsearch/mapping-reference/pattern-text.md), a variant of `text` optimized for log messages that contain sequences shared among many messages. By compressing these shared sequences, `pattern_text` provides improved space efficiency relative to `match_only_text`.
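
A reviewer's sketch for the "Phrase matching" section that this series adds to `pattern-text.md`: the docs state that `index_options` accepts `docs` (the default) and `positions`, but only the default mapping is shown. A mapping that opts into stored positions might look like the following. The index name `logs-positions` is hypothetical, the request shape mirrors the `PUT logs` example already in the patch, and it has not been run against a cluster.

```console
PUT logs-positions
{
  "mappings": {
    "properties": {
      "@timestamp": {
        "type": "date"
      },
      "message": {
        "type": "pattern_text",
        "index_options": "positions"
      }
    }
  }
}
```

Per the patch's own description, this setting stores positions so that phrase queries are fast, at the cost of a larger index. Queries still return a constant score of 1.0.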