|
2 | 2 | navigation_title: "Semantic text"
|
3 | 3 | mapped_pages:
|
4 | 4 | - https://www.elastic.co/guide/en/elasticsearch/reference/current/semantic-text.html
|
| 5 | +applies_to: |
| 6 | + stack: ga 9.0 |
| 7 | + serverless: ga |
5 | 8 | ---
|
6 | 9 |
|
7 | 10 | # Semantic text field type [semantic-text]
|
@@ -29,7 +32,8 @@ service.
|
29 | 32 | Using `semantic_text`, you won’t need to specify how to generate embeddings for
|
30 | 33 | your data, or how to index it. The {{infer}} endpoint automatically determines
|
31 | 34 | the embedding generation, indexing, and query to use.
|
32 |
| -Newly created indices with `semantic_text` fields using dense embeddings will be |
| 35 | + |
| 36 | +{applies_to}`stack: ga 9.1` Newly created indices with `semantic_text` fields using dense embeddings will be |
33 | 37 | [quantized](/reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-quantization)
|
34 | 38 | to `bbq_hnsw` automatically.
|
35 | 39 |
|
@@ -182,6 +186,15 @@ For more details on chunking and how to configure chunking settings,
|
182 | 186 | see [Configuring chunking](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-inference)
|
183 | 187 | in the Inference API documentation.
|
184 | 188 |
|
| 189 | +Refer |
| 190 | +to [this tutorial](docs-content://solutions/search/semantic-search/semantic-search-semantic-text.md) |
| 191 | +to learn more about semantic search using `semantic_text`. |
| 192 | + |
| 193 | +### Pre-chunking [pre-chunking] |
| 194 | +```{applies_to} |
| 195 | +stack: ga 9.1 |
| 196 | +``` |
| 197 | + |
185 | 198 | You can pre-chunk the input by sending it to Elasticsearch as an array of
|
186 | 199 | strings.
|
187 | 200 | Example:
|
@@ -228,10 +241,6 @@ PUT test-index/_doc/1
|
228 | 241 | * Others (such as `elastic` and `elasticsearch`) will automatically truncate
|
229 | 242 | the input.
|
230 | 243 |
|
231 |
| -Refer |
232 |
| -to [this tutorial](docs-content://solutions/search/semantic-search/semantic-search-semantic-text.md) |
233 |
| -to learn more about semantic search using `semantic_text`. |
234 |
| - |
235 | 244 | ## Extracting relevant fragments from semantic text [semantic-text-highlighting]
|
236 | 245 |
|
237 | 246 | You can extract the most relevant fragments from a semantic text field by using
|
@@ -295,6 +304,11 @@ specified. It enables you to quickstart your semantic search by providing
|
295 | 304 | automatic {{infer}} and a dedicated query so you don’t need to provide further
|
296 | 305 | details.
|
297 | 306 |
|
| 307 | +### Customizing using `semantic_text` parameters [custom-by-parameters] |
| 308 | +```{applies_to} |
| 309 | +stack: ga 9.1 |
| 310 | +``` |
| 311 | + |
298 | 312 | If you want to override those defaults and customize the embeddings that
|
299 | 313 | `semantic_text` indexes, you can do so by
|
300 | 314 | modifying [parameters](#semantic-text-params):
|
@@ -328,6 +342,24 @@ PUT my-index-000004
|
328 | 342 | }
|
329 | 343 | ```
|
330 | 344 |
|
| 345 | +### Customizing using ingest pipelines [custom-by-pipelines] |
| 346 | +```{applies_to} |
| 347 | +stack: ga 9.0 |
| 348 | +``` |
| 349 | + |
| 350 | +To customize data indexing, use the |
| 351 | +[`sparse_vector`](/reference/elasticsearch/mapping-reference/sparse-vector.md) |
| 352 | +or [`dense_vector`](/reference/elasticsearch/mapping-reference/dense-vector.md) |
| 353 | +field types and create an ingest pipeline with an |
| 354 | +[{{infer}} processor](/reference/enrich-processor/inference-processor.md) to |
| 355 | +generate the embeddings. |
| 356 | +[This tutorial](docs-content://solutions/search/semantic-search/semantic-search-inference.md) |
| 357 | +walks you through the process. When you use `sparse_vector` or |
| 358 | +`dense_vector` field types instead of the `semantic_text` field type to |
| 359 | +customize indexing, the |
| 360 | +[`semantic` query](/reference/query-languages/query-dsl/query-dsl-semantic-query.md) |
| 361 | +is not supported for querying the field data. |
| 362 | + |
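The pipeline-based approach described above could look roughly like the following sketch. It is illustrative only and not part of this change: the index name `my-sparse-index`, the pipeline name `my-embeddings-pipeline`, the field names, and the inference endpoint ID `my-elser-endpoint` are all hypothetical, and the endpoint is assumed to already exist and to produce sparse embeddings.

```console
# Hypothetical index: embeddings go into a sparse_vector field instead of semantic_text
PUT my-sparse-index
{
  "mappings": {
    "properties": {
      "content": { "type": "text" },
      "content_embedding": { "type": "sparse_vector" }
    }
  }
}

# Ingest pipeline with an inference processor that generates the embeddings
# ("my-elser-endpoint" is an assumed, pre-existing inference endpoint)
PUT _ingest/pipeline/my-embeddings-pipeline
{
  "processors": [
    {
      "inference": {
        "model_id": "my-elser-endpoint",
        "input_output": {
          "input_field": "content",
          "output_field": "content_embedding"
        }
      }
    }
  ]
}

# The semantic query is not available for this field; query it with sparse_vector
GET my-sparse-index/_search
{
  "query": {
    "sparse_vector": {
      "field": "content_embedding",
      "inference_id": "my-elser-endpoint",
      "query": "example search text"
    }
  }
}
```

Documents would need to be indexed through the pipeline, for example by adding `?pipeline=my-embeddings-pipeline` to the index request, so that `content_embedding` is populated before searching.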
331 | 363 | ## Updates to `semantic_text` fields [update-script]
|
332 | 364 |
|
333 | 365 | For indices containing `semantic_text` fields, updates that use scripts have the
|
|