Open
23 commits
8698e45
First version of pattern_text docs
parkertimmins Oct 1, 2025
657670c
Some syntax fixes
parkertimmins Oct 2, 2025
9ec090f
Incorrect sort settings
parkertimmins Oct 2, 2025
db7491f
broken link
parkertimmins Oct 2, 2025
fa2d731
Add applies_to badge
parkertimmins Oct 2, 2025
67aa9dd
Update docs/reference/elasticsearch/mapping-reference/text.md
parkertimmins Oct 7, 2025
cd99011
Update docs/reference/elasticsearch/mapping-reference/text.md
parkertimmins Oct 7, 2025
180ba1e
Update docs/reference/elasticsearch/mapping-reference/text.md
parkertimmins Oct 7, 2025
fe11cdf
Update docs/reference/elasticsearch/mapping-reference/text.md
parkertimmins Oct 7, 2025
75aa09d
Update docs/reference/elasticsearch/mapping-reference/text.md
parkertimmins Oct 7, 2025
5426984
Reorder so limitations are in separate section
parkertimmins Oct 7, 2025
4e6f29c
Add mention of subscription
parkertimmins Oct 7, 2025
5fd13e8
mention standard analyzer is supported
parkertimmins Oct 7, 2025
64b2bd5
Split text family types into separate docs pages
parkertimmins Oct 7, 2025
1f2c619
Remove match_only_text and pattern_text from text docs page
parkertimmins Oct 7, 2025
438454b
Merge branch 'main' into parker/pattern-text-docs
parkertimmins Oct 7, 2025
dd93135
Fix a few build errors
parkertimmins Oct 7, 2025
90933ca
Add redirects, fix some anchors
parkertimmins Oct 7, 2025
db1ede3
Change redirect syntax
parkertimmins Oct 7, 2025
acb57e2
Add to toc.yml
parkertimmins Oct 7, 2025
823863f
Fix incorrect file name
parkertimmins Oct 7, 2025
ab3d63f
Another incorrect anchor
parkertimmins Oct 7, 2025
a6d1c80
Change pattern_text description
parkertimmins Oct 7, 2025
7 changes: 7 additions & 0 deletions docs/redirects.yml
Original file line number Diff line number Diff line change
Expand Up @@ -105,3 +105,10 @@ redirects:
  'reference/query-languages/esql/kibana/docs/functions/st_geohash_to_long.md': 'reference/query-languages/esql/esql-functions-operators.md'
  'reference/query-languages/esql/kibana/docs/functions/st_geotile_to_long.md': 'reference/query-languages/esql/esql-functions-operators.md'
  'reference/query-languages/esql/kibana/docs/functions/st_geohex_to_long.md': 'reference/query-languages/esql/esql-functions-operators.md'

  'reference/elasticsearch/mapping-reference/text.md':
    to: 'reference/elasticsearch/mapping-reference/text.md'
    anchors: { } # pass-through unlisted anchors in the `many` ruleset
    many:
      - to: 'reference/elasticsearch/mapping-reference/match-only-text.md'
        anchors: { 'match-only-text-field-type', 'match-only-text-params' }
Original file line number Diff line number Diff line change
Expand Up @@ -77,8 +77,8 @@ Dates

### Text search types [text-search-types]

[`text` fields](/reference/elasticsearch/mapping-reference/text.md)
: The text family, including `text` and `match_only_text`. Analyzed, unstructured text.
[`text` fields](/reference/elasticsearch/mapping-reference/text-type-family.md)
: The text family, including `text`, `match_only_text`, and `pattern_text`. Analyzed, unstructured text.

[`annotated-text`](/reference/elasticsearch-plugins/mapper-annotated-text.md)
: Text containing special markup. Used for identifying named entities.
43 changes: 43 additions & 0 deletions docs/reference/elasticsearch/mapping-reference/match-only-text.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,43 @@
---
navigation_title: "Match Only Text"
mapped_pages:
- https://www.elastic.co/guide/en/elasticsearch/reference/current/match-only-text.html
---

# Match-only text field type [match-only-text-field-type]

A variant of [`text`](/reference/elasticsearch/mapping-reference/text.md) that trades scoring and efficiency of positional queries for space efficiency. This field effectively stores data the same way as a `text` field that only indexes documents (`index_options: docs`) and disables norms (`norms: false`). Term queries perform as fast as, if not faster than, they do on `text` fields. However, queries that need positions, such as the [`match_phrase` query](/reference/query-languages/query-dsl/query-dsl-match-query-phrase.md), perform slower because they need to look at the `_source` document to verify whether a phrase matches. All queries return a constant score of 1.0.

Analysis is not configurable: text is always analyzed with the [default analyzer](docs-content://manage-data/data-store/text-analysis/specify-an-analyzer.md#specify-index-time-default-analyzer) ([`standard`](/reference/text-analysis/analysis-standard-analyzer.md) by default).

[Span queries](/reference/query-languages/query-dsl/span-queries.md) are not supported with this field. Use [interval queries](/reference/query-languages/query-dsl/query-dsl-intervals-query.md) instead, or use the [`text`](/reference/elasticsearch/mapping-reference/text.md) field type if you absolutely need span queries.

Other than that, `match_only_text` supports the same queries as `text`. And like `text`, it does not support sorting and has only limited support for aggregations.

```console
PUT logs
{
  "mappings": {
    "properties": {
      "@timestamp": {
        "type": "date"
      },
      "message": {
        "type": "match_only_text"
      }
    }
  }
}
```
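
As a rough illustration of the phrase-query behavior described above, the following sketch indexes one document into the `logs` index and then runs a `match_phrase` query against the `message` field. The document contents are made up for illustration. The query is expected to match, but because positions are not indexed, the phrase is verified against `_source`, which is slower than on a `text` field.

```console
POST logs/_doc?refresh
{
  "@timestamp": "2025-10-07T12:00:00Z",
  "message": "Connection to host 10.0.0.5 timed out"
}

GET logs/_search
{
  "query": {
    "match_phrase": {
      "message": "timed out"
    }
  }
}
```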


## Parameters for match-only text fields [match-only-text-params]

The following mapping parameters are accepted:

[`fields`](/reference/elasticsearch/mapping-reference/multi-fields.md)
: Multi-fields allow the same string value to be indexed in multiple ways for different purposes, such as one field for search and a multi-field for sorting and aggregations, or the same string value analyzed by different analyzers.

[`meta`](/reference/elasticsearch/mapping-reference/mapping-field-meta.md)
: Metadata about the field.

85 changes: 85 additions & 0 deletions docs/reference/elasticsearch/mapping-reference/pattern-text.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,85 @@
---
navigation_title: "Pattern Text"
mapped_pages:
- https://www.elastic.co/guide/en/elasticsearch/reference/current/pattern-text.html
---

# Pattern text field type [pattern-text-field-type]
```{applies_to}
serverless: preview
stack: preview 9.2
```
:::{note}
This feature requires a [subscription](https://www.elastic.co/subscriptions).
:::

The `pattern_text` field type is a variant of [`text`](/reference/elasticsearch/mapping-reference/text.md) with improved space efficiency for log data.
Internally, it decomposes values into static parts that are likely to be shared among many values, and dynamic parts that tend to vary.
The static parts usually come from the explanatory text of a log message, while the dynamic parts are the variables that were interpolated into the logs.
This decomposition allows for improved compression on log-like data.

We call the static portion of the value the `template`.
Although the template cannot be accessed directly, a separate field called `<field_name>.template_id` is accessible.
This field is a hash of the template and can be used to group similar values.
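
A minimal sketch of grouping by template follows. It assumes an index named `logs` whose `message` field is mapped as `pattern_text`, and it assumes that `message.template_id` can be used in a `terms` aggregation like a keyword field. The text above suggests this, but the sketch has not been verified against a running cluster.

```console
GET logs/_search
{
  "size": 0,
  "aggs": {
    "by_template": {
      "terms": {
        "field": "message.template_id",
        "size": 10
      }
    }
  }
}
```

Each bucket would then correspond to one template, with `doc_count` showing how many messages share it.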

Analysis is configurable but defaults to a delimiter-based analyzer.
This analyzer applies a lowercase filter and then splits on whitespace and the following delimiters: `=`, `?`, `:`, `[`, `]`, `{`, `}`, `"`, `\`, `'`.

## Limitations

Unlike most mapping types, `pattern_text` does not support multiple values for a given field per document.
If a document is created with multiple values for a `pattern_text` field, an error is returned.
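
To make the single-value limitation concrete, the following sketch shows a request that is expected to be rejected. It assumes an index named `logs` whose `message` field is mapped as `pattern_text`. The message values are hypothetical, and the exact error response is not shown here.

```console
POST logs/_doc
{
  "@timestamp": "2025-10-07T12:00:00Z",
  "message": [ "first log line", "second log line" ]
}
```

Sending each message as its own document avoids the error.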

[Span queries](/reference/query-languages/query-dsl/span-queries.md) are not supported with this field. Use [interval queries](/reference/query-languages/query-dsl/query-dsl-intervals-query.md) instead, or use the [`text`](/reference/elasticsearch/mapping-reference/text.md) field type if you absolutely need span queries.

Like `text`, `pattern_text` does not support sorting and has only limited support for aggregations.
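
As a sketch of the interval-query alternative to span queries mentioned above, the following request looks for the terms `connection`, `timed`, and `out` in order, allowing up to three intervening terms. It assumes an index named `logs` whose `message` field is mapped as `pattern_text`. The query text is made up for illustration.

```console
GET logs/_search
{
  "query": {
    "intervals": {
      "message": {
        "match": {
          "query": "connection timed out",
          "max_gaps": 3,
          "ordered": true
        }
      }
    }
  }
}
```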

## Phrase matching
Pattern text supports an `index_options` parameter with valid values of `docs` and `positions`.
The default value is `docs`, which makes `pattern_text` behave similarly to `match_only_text` for phrase queries.
Specifically, positions are not stored, which reduces the index size at the cost of slowing down phrase queries.
If `index_options` is set to `positions`, positions are stored and `pattern_text` will support fast phrase queries.
In both cases, all queries return a constant score of 1.0.
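
The following sketch shows how `index_options` might be set to `positions` for faster phrase queries. The index name `logs-positions` and the query text are hypothetical. Only the `index_options` parameter and its values come from the description above.

```console
PUT logs-positions
{
  "mappings": {
    "properties": {
      "@timestamp": {
        "type": "date"
      },
      "message": {
        "type": "pattern_text",
        "index_options": "positions"
      }
    }
  }
}

GET logs-positions/_search
{
  "query": {
    "match_phrase": {
      "message": "timed out"
    }
  }
}
```

With the default `docs` setting, the same `match_phrase` query would still work but would be slower, in exchange for a smaller index.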

## Index sorting for improved compression
The compression provided by `pattern_text` can be significantly improved if the index is sorted by the `template_id` field.
For example, a typical approach would be to sort first by `message.template_id`, then by `@timestamp`, as shown in the following example.

```console
PUT logs
{
  "settings": {
    "index": {
      "sort.field": [ "message.template_id", "@timestamp" ],
      "sort.order": [ "asc", "desc" ]
    }
  },
  "mappings": {
    "properties": {
      "@timestamp": {
        "type": "date"
      },
      "message": {
        "type": "pattern_text"
      }
    }
  }
}
```


## Parameters for pattern text fields [pattern-text-params]

The following mapping parameters are accepted:

[`analyzer`](/reference/elasticsearch/mapping-reference/analyzer.md)
: The [analyzer](docs-content://manage-data/data-store/text-analysis.md) which should be used for the `pattern_text` field, both at index-time and at search-time (unless overridden by the [`search_analyzer`](/reference/elasticsearch/mapping-reference/search-analyzer.md)).
Supports a delimiter-based analyzer and the `standard` analyzer, which is the analyzer used in `match_only_text` mappings.
Defaults to the delimiter-based analyzer, which applies a lowercase filter and then splits on whitespace and the following delimiters: `=`, `?`, `:`, `[`, `]`, `{`, `}`, `"`, `\`, `'`.

[`index_options`](/reference/elasticsearch/mapping-reference/index-options.md)
: What information should be stored in the index, for search and highlighting purposes. Valid values are `docs` and `positions`. Defaults to `docs`.

[`meta`](/reference/elasticsearch/mapping-reference/mapping-field-meta.md)
: Metadata about the field.

15 changes: 15 additions & 0 deletions docs/reference/elasticsearch/mapping-reference/text-type-family.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
---
navigation_title: "Text type family"
mapped_pages:
- https://www.elastic.co/guide/en/elasticsearch/reference/current/text-type-family.html
---

# Text type family [text]


The text family includes the following field types:

* [`text`](/reference/elasticsearch/mapping-reference/text.md), the traditional field type for full-text content such as the body of an email or the description of a product.
* [`match_only_text`](/reference/elasticsearch/mapping-reference/match-only-text.md), a space-optimized variant of `text` that disables scoring and performs slower on queries that need positions. It is best suited for indexing log messages.
Member

What do you think about this?

Suggested change
* [`match_only_text`](/reference/elasticsearch/mapping-reference/match-only-text.md), a space-optimized variant of `text` that disables scoring and performs slower on queries that need positions. It is best suited for indexing log messages.
* [`match_only_text`](/reference/elasticsearch/mapping-reference/match-only-text.md), a variant of the `text` field type with limited functionality. Scoring is always disabled and the `standard` analyzer is always used. It is suited for match-only free-text use cases, meaning that the fact that there is a match is important, but scoring and where the match happens aren't relevant. Note that positional queries are possible, but slow.

* [`pattern_text`](/reference/elasticsearch/mapping-reference/pattern-text.md), a variant of `text` which is optimized for log messages which contain sequences that are shared between many messages. By compressing these shared sequences, `pattern_text` provides improved space efficiency relative to `match_only_text`.
Contributor Author

@jordan-powers and @martijnvg What do y'all think of this blurb? I'm having trouble coming up with a description that is succinct and describes the difference between `pattern_text` and `match_only_text`.

Member

Match-only text is targeted at match-only cases, where relevance and where matches happen are not important. This can be achieved with the `text` field type as well, so it is more of a pre-configured instance of `text` for specific use cases. This means less storage is needed, at the cost of losing normalization and fast positional queries.

The `pattern_text` field type is better suited to indexing short, repeating messages such as log messages. That gives a real space-saving benefit.

So I would document the differences with these things in mind.


51 changes: 1 addition & 50 deletions docs/reference/elasticsearch/mapping-reference/text.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,16 +4,7 @@ mapped_pages:
- https://www.elastic.co/guide/en/elasticsearch/reference/current/text.html
---

# Text type family [text]


The text family includes the following field types:

* [`text`](#text-field-type), the traditional field type for full-text content such as the body of an email or the description of a product.
* [`match_only_text`](#match-only-text-field-type), a space-optimized variant of `text` that disables scoring and performs slower on queries that need positions. It is best suited for indexing log messages.


## Text field type [text-field-type]
# Text field type [text-field-type]

A field to index full-text values, such as the body of an email or the description of a product. These fields are `analyzed`, that is they are passed through an [analyzer](docs-content://manage-data/data-store/text-analysis.md) to convert the string into a list of individual terms before being indexed. The analysis process allows Elasticsearch to search for individual words *within* each full text field. Text fields are not used for sorting and seldom used for aggregations (although the [significant text aggregation](/reference/aggregations/search-aggregations-bucket-significanttext-aggregation.md) is a notable exception).

Expand Down Expand Up @@ -301,43 +292,3 @@ PUT my-index-000001
}
}
```


## Match-only text field type [match-only-text-field-type]

A variant of [`text`](#text-field-type) that trades scoring and efficiency of positional queries for space efficiency. This field effectively stores data the same way as a `text` field that only indexes documents (`index_options: docs`) and disables norms (`norms: false`). Term queries perform as fast if not faster as on `text` fields, however queries that need positions such as the [`match_phrase` query](/reference/query-languages/query-dsl/query-dsl-match-query-phrase.md) perform slower as they need to look at the `_source` document to verify whether a phrase matches. All queries return constant scores that are equal to 1.0.

Analysis is not configurable: text is always analyzed with the [default analyzer](docs-content://manage-data/data-store/text-analysis/specify-an-analyzer.md#specify-index-time-default-analyzer) ([`standard`](/reference/text-analysis/analysis-standard-analyzer.md) by default).

[span queries](/reference/query-languages/query-dsl/span-queries.md) are not supported with this field, use [interval queries](/reference/query-languages/query-dsl/query-dsl-intervals-query.md) instead, or the [`text`](#text-field-type) field type if you absolutely need span queries.

Other than that, `match_only_text` supports the same queries as `text`. And like `text`, it does not support sorting and has only limited support for aggregations.

```console
PUT logs
{
  "mappings": {
    "properties": {
      "@timestamp": {
        "type": "date"
      },
      "message": {
        "type": "match_only_text"
      }
    }
  }
}
```


### Parameters for match-only text fields [match-only-text-params]

The following mapping parameters are accepted:

[`fields`](/reference/elasticsearch/mapping-reference/multi-fields.md)
: Multi-fields allow the same string value to be indexed in multiple ways for different purposes, such as one field for search and a multi-field for sorting and aggregations, or the same string value analyzed by different analyzers.

[`meta`](/reference/elasticsearch/mapping-reference/mapping-field-meta.md)
: Metadata about the field.


Original file line number Diff line number Diff line change
Expand Up @@ -51,10 +51,10 @@ encoder
: Indicates if the snippet should be HTML encoded: `default` (no encoding) or `html` (HTML-escape the snippet text and then insert the highlighting tags)

fields
: Specifies the fields to retrieve highlights for. You can use wildcards to specify fields. For example, you could specify `comment_*` to get highlights for all [text](/reference/elasticsearch/mapping-reference/text.md), [match_only_text](/reference/elasticsearch/mapping-reference/text.md#match-only-text-field-type), and [keyword](/reference/elasticsearch/mapping-reference/keyword.md) fields that start with `comment_`.
: Specifies the fields to retrieve highlights for. You can use wildcards to specify fields. For example, you could specify `comment_*` to get highlights for all [text](/reference/elasticsearch/mapping-reference/text.md), [match_only_text](/reference/elasticsearch/mapping-reference/match-only-text.md), [pattern_text](/reference/elasticsearch/mapping-reference/pattern-text.md), and [keyword](/reference/elasticsearch/mapping-reference/keyword.md) fields that start with `comment_`.

::::{note}
Only text, match_only_text, and keyword fields are highlighted when you use wildcards. If you use a custom mapper and want to highlight on a field anyway, you must explicitly specify that field name.
Only text, match_only_text, pattern_text, and keyword fields are highlighted when you use wildcards. If you use a custom mapper and want to highlight on a field anyway, you must explicitly specify that field name.
::::
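
As a sketch of the wildcard behavior described above, the following request asks for highlights on every eligible field whose name starts with `comment_`. The index name `my-index` and the field `comment_body` are hypothetical.

```console
GET my-index/_search
{
  "query": {
    "match": {
      "comment_body": "timeout"
    }
  },
  "highlight": {
    "fields": {
      "comment_*": {}
    }
  }
}
```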

$$$fragmenter$$$
Expand Down Expand Up @@ -147,4 +147,4 @@ tags_schema
$$$highlighter-type$$$

type
: The highlighter to use: `unified`, `plain`, or `fvh`. Defaults to `unified`.
: The highlighter to use: `unified`, `plain`, or `fvh`. Defaults to `unified`.
6 changes: 5 additions & 1 deletion docs/reference/elasticsearch/toc.yml
Original file line number Diff line number Diff line change
Expand Up @@ -174,7 +174,11 @@ toc:
- file: mapping-reference/semantic-text.md
- file: mapping-reference/shape.md
- file: mapping-reference/sparse-vector.md
- file: mapping-reference/text.md
- file: mapping-reference/text-type-family.md
  children:
    - file: mapping-reference/text.md
    - file: mapping-reference/pattern-text.md
    - file: mapping-reference/match-only-text.md
- file: mapping-reference/token-count.md
- file: mapping-reference/unsigned-long.md
- file: mapping-reference/version.md