Skip to content
Merged
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
67 changes: 33 additions & 34 deletions docs/reference/elasticsearch/mapping-reference/semantic-text.md
Original file line number Diff line number Diff line change
Expand Up @@ -112,32 +112,11 @@ to create the endpoint. If not specified, the {{infer}} endpoint defined by
`inference_id` will be used at both index and query time.

`index_options`
: (Optional, string) Specifies the index options to override default values
: (Optional, object) Specifies the index options to override default values
for the field. Currently, `dense_vector` index options are supported.
For text embeddings, `index_options` may match any allowed
[dense_vector index options](/reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-index-options).

An example of how to set index_options for a `semantic_text` field:

```console
PUT my-index-000004
{
"mappings": {
"properties": {
"inference_field": {
"type": "semantic_text",
"inference_id": "my-text-embedding-endpoint",
"index_options": {
"dense_vector": {
"type": "int4_flat"
}
}
}
}
}
}
```

`chunking_settings`
: (Optional, object) Settings for chunking text into smaller passages.
If specified, these will override the chunking settings set in the {{infer-cap}}
Expand Down Expand Up @@ -315,18 +294,38 @@ specified. It enables you to quickstart your semantic search by providing
automatic {{infer}} and a dedicated query so you don’t need to provide further
details.

In case you want to customize data indexing, use the [
`sparse_vector`](/reference/elasticsearch/mapping-reference/sparse-vector.md)
or [`dense_vector`](/reference/elasticsearch/mapping-reference/dense-vector.md)
field types and create an ingest pipeline with
an [{{infer}} processor](/reference/enrich-processor/inference-processor.md) to
generate the
embeddings. [This tutorial](docs-content://solutions/search/semantic-search/semantic-search-inference.md)
walks you through the process. In these cases - when you use `sparse_vector` or
`dense_vector` field types instead of the `semantic_text` field type to
customize indexing - using the [
`semantic_query`](/reference/query-languages/query-dsl/query-dsl-semantic-query.md)
is not supported for querying the field data.
If you want to override those defaults and customize the embeddings that
`semantic_text` stores, you can do so by modifying <<semantic-text-params,
parameters>>:

- Use `index_options` to specify alternate index options such as specific
`dense_vector` quantization methods
- Use `chunking_settings` to override the chunking strategy associated with the
{{infer}} endpoint, or completely disable chunking using the `none` type

Here is an example of how to set these parameters for a text embedding endpoint:

```console
PUT my-index-000004
{
"mappings": {
"properties": {
"inference_field": {
"type": "semantic_text",
"inference_id": "my-text-embedding-endpoint",
"index_options": {
"dense_vector": {
"type": "int4_flat"
}
},
"chunking_settings": {
"type": "none"
}
}
}
}
}
```

## Updates to `semantic_text` fields [update-script]

Expand Down
Loading