diff --git a/explore-analyze/machine-learning/nlp/inference-processing.md b/explore-analyze/machine-learning/nlp/inference-processing.md
index e9f668b529..6452b7cee7 100644
--- a/explore-analyze/machine-learning/nlp/inference-processing.md
+++ b/explore-analyze/machine-learning/nlp/inference-processing.md
@@ -5,7 +5,7 @@ mapped_pages:
# Inference processing [ingest-pipeline-search-inference]
-When you create an index through the **Content** UI, a set of default ingest pipelines are also created, including a ML inference pipeline. The [ML inference pipeline](../../../solutions/search/ingest-for-search.md#ingest-pipeline-search-details-specific-ml-reference) uses inference processors to analyze fields and enrich documents with the output. Inference processors use ML trained models, so you need to use a built-in model or [deploy a trained model in your cluster^](ml-nlp-deploy-models.md) to use this feature.
+When you create an index through the **Content** UI, a set of default ingest pipelines is also created, including an ML inference pipeline. The [ML inference pipeline](/solutions/search/search-pipelines.md#ingest-pipeline-search-details-specific-ml-reference) uses inference processors to analyze fields and enrich documents with the output. Inference processors use trained ML models, so you need to use a built-in model or [deploy a trained model in your cluster^](ml-nlp-deploy-models.md) to use this feature.
This guide focuses on the ML inference pipeline, its use, and how to manage it.
@@ -129,7 +129,7 @@ To ensure the ML inference pipeline will be run when ingesting documents, you mu
## Learn More [ingest-pipeline-search-inference-learn-more]
-* See [Overview](../../../solutions/search/ingest-for-search.md#ingest-pipeline-search-in-enterprise-search) for information on the various pipelines that are created.
+* See [Overview](/solutions/search/search-pipelines.md#ingest-pipeline-search-in-enterprise-search) for information on the various pipelines that are created. * Learn about [ELSER](ml-nlp-elser.md), Elastic’s proprietary retrieval model for semantic search with sparse vectors. * [NER HuggingFace Models](https://huggingface.co/models?library=pytorch&pipeline_tag=token-classification&sort=downloads) * [Text Classification HuggingFace Models](https://huggingface.co/models?library=pytorch&pipeline_tag=text-classification&sort=downloads) diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/es-ingestion-overview.md b/raw-migrated-files/elasticsearch/elasticsearch-reference/es-ingestion-overview.md deleted file mode 100644 index 5f6cac5c92..0000000000 --- a/raw-migrated-files/elasticsearch/elasticsearch-reference/es-ingestion-overview.md +++ /dev/null @@ -1,41 +0,0 @@ -# Add data to {{es}} [es-ingestion-overview] - -There are multiple ways to ingest data into {{es}}. The option that you choose depends on whether you’re working with timestamped data or non-timestamped data, where the data is coming from, its complexity, and more. - -::::{tip} -You can load [sample data](../../../manage-data/ingest.md#_add_sample_data) into your {{es}} cluster using {{kib}}, to get started quickly. - -:::: - - - -## General content [es-ingestion-overview-general-content] - -General content is data that does not have a timestamp. This could be data like vector embeddings, website content, product catalogs, and more. For general content, you have the following options for adding data to {{es}} indices: - -* [API](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs.html): Use the {{es}} [Document APIs](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs.html) to index documents directly, using the Dev Tools [Console](../../../explore-analyze/query-filter/tools/console.md), or cURL. 
- - If you’re building a website or app, then you can call Elasticsearch APIs using an [{{es}} client](https://www.elastic.co/guide/en/elasticsearch/client/index.html) in the programming language of your choice. If you use the Python client, then check out the `elasticsearch-labs` repo for various [example notebooks](https://github.com/elastic/elasticsearch-labs/tree/main/notebooks/search/python-examples). - -* [File upload](../../../manage-data/ingest.md#upload-data-kibana): Use the {{kib}} file uploader to index single files for one-off testing and exploration. The GUI guides you through setting up your index and field mappings. -* [Web crawler](https://github.com/elastic/crawler): Extract and index web page content into {{es}} documents. -* [Connectors](https://www.elastic.co/guide/en/elasticsearch/reference/current/es-connectors.html): Sync data from various third-party data sources to create searchable, read-only replicas in {{es}}. - - -## Timestamped data [es-ingestion-overview-timestamped] - -Timestamped data in {{es}} refers to datasets that include a timestamp field. If you use the [Elastic Common Schema (ECS)](https://www.elastic.co/guide/en/ecs/{{ecs_version}}/ecs-reference.html), this field is named `@timestamp`. This could be data like logs, metrics, and traces. - -For timestamped data, you have the following options for adding data to {{es}} data streams: - -* [Elastic Agent and Fleet](https://www.elastic.co/guide/en/fleet/current/fleet-overview.html): The preferred way to index timestamped data. Each Elastic Agent based integration includes default ingestion rules, dashboards, and visualizations to start analyzing your data right away. You can use the Fleet UI in {{kib}} to centrally manage Elastic Agents and their policies. -* [Beats](https://www.elastic.co/guide/en/beats/libbeat/current/beats-reference.html): If your data source isn’t supported by Elastic Agent, use Beats to collect and ship data to Elasticsearch. 
You install a separate Beat for each type of data to collect. -* [Logstash](https://www.elastic.co/guide/en/logstash/current/introduction.html): Logstash is an open source data collection engine with real-time pipelining capabilities that supports a wide variety of data sources. You might use this option because neither Elastic Agent nor Beats supports your data source. You can also use Logstash to persist incoming data, or if you need to send the data to multiple destinations. -* [Language clients](../../../manage-data/ingest/ingesting-data-from-applications.md): The linked tutorials demonstrate how to use {{es}} programming language clients to ingest data from an application. In these examples, {{es}} is running on Elastic Cloud, but the same principles apply to any {{es}} deployment. - -::::{tip} -If you’re interested in data ingestion pipelines for timestamped data, use the decision tree in the [Elastic Cloud docs](../../../manage-data/ingest.md#ec-data-ingest-pipeline) to understand your options. 
- -:::: - - diff --git a/raw-migrated-files/toc.yml b/raw-migrated-files/toc.yml index 7e36e3a085..1e2a87c775 100644 --- a/raw-migrated-files/toc.yml +++ b/raw-migrated-files/toc.yml @@ -603,7 +603,6 @@ toc: - file: elasticsearch/elasticsearch-reference/document-level-security.md - file: elasticsearch/elasticsearch-reference/documents-indices.md - file: elasticsearch/elasticsearch-reference/elasticsearch-intro-deploy.md - - file: elasticsearch/elasticsearch-reference/es-ingestion-overview.md - file: elasticsearch/elasticsearch-reference/es-security-principles.md - file: elasticsearch/elasticsearch-reference/esql-examples.md - file: elasticsearch/elasticsearch-reference/esql-getting-started.md @@ -621,7 +620,6 @@ toc: - file: elasticsearch/elasticsearch-reference/index-modules-analysis.md - file: elasticsearch/elasticsearch-reference/index-modules-mapper.md - file: elasticsearch/elasticsearch-reference/ingest-enriching-data.md - - file: elasticsearch/elasticsearch-reference/ingest-pipeline-search.md - file: elasticsearch/elasticsearch-reference/ingest.md - file: elasticsearch/elasticsearch-reference/install-elasticsearch.md - file: elasticsearch/elasticsearch-reference/ip-filtering.md diff --git a/solutions/search/ingest-for-search.md b/solutions/search/ingest-for-search.md index a1190d32aa..0fc7cd3129 100644 --- a/solutions/search/ingest-for-search.md +++ b/solutions/search/ingest-for-search.md @@ -6,35 +6,45 @@ mapped_urls: - https://www.elastic.co/guide/en/serverless/current/elasticsearch-ingest-your-data.html --- -# Ingest for search +# Ingest for search use cases -% What needs to be done: Lift-and-shift +% ---- +% navigation_title: "Ingest for search use cases" +% ---- -% Scope notes: guidance on what ingest options you might want to use for search - connectors, crawler ... +$$$elasticsearch-ingest-time-series-data$$$ +::::{note} +This page covers ingest methods specifically for search use cases. 
If you're working with a different use case, refer to the [ingestion overview](/manage-data/ingest.md) for more options. +:::: -% Use migrated content from existing pages that map to this page: +Search use cases usually focus on general **content**, typically text-heavy data that does not have a timestamp. This could be data like knowledge bases, website content, product catalogs, and more. -% - [ ] ./raw-migrated-files/elasticsearch/elasticsearch-reference/es-ingestion-overview.md -% - [ ] ./raw-migrated-files/docs-content/serverless/elasticsearch-ingest-data-through-api.md -% - [ ] ./raw-migrated-files/elasticsearch/elasticsearch-reference/ingest-pipeline-search.md -% - [ ] ./raw-migrated-files/docs-content/serverless/elasticsearch-ingest-your-data.md +Once you've decided how to [deploy Elastic](/deploy-manage/index.md), the next step is getting your content into {{es}}. Your choice of ingestion method depends on where your content lives and how you need to access it. -% Internal links rely on the following IDs being on this page (e.g. as a heading ID, paragraph ID, etc): +There are several methods to ingest data into {{es}} for search use cases. Choose one or more based on your requirements. -$$$elasticsearch-ingest-time-series-data$$$ +::::{tip} +If you just want to do a quick test, you can load [sample data](/manage-data/ingest/sample-data.md) into your {{es}} cluster using the UI. +:::: + +## Use APIs [es-ingestion-overview-apis] -$$$ingest-pipeline-search-details-specific-ml-reference$$$ +You can use the [`_bulk` API](https://www.elastic.co/docs/api/doc/elasticsearch/v8/group/endpoint-document) to add data to your {{es}} indices, using any HTTP client, including the [{{es}} client libraries](/solutions/search/site-or-app/clients.md). -$$$ingest-pipeline-search-in-enterprise-search$$$ +While the {{es}} APIs can be used for any data type, Elastic provides specialized tools that optimize ingestion for specific use cases. 
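+For example, a minimal `_bulk` request in Dev Tools Console syntax might look like the following sketch; the index name `my-search-index` and the fields shown are illustrative placeholders, not defaults:
+
+```console
+POST _bulk
+{ "index": { "_index": "my-search-index", "_id": "1" } }
+{ "title": "Returns policy", "body": "A knowledge base article about returns." }
+{ "index": { "_index": "my-search-index", "_id": "2" } }
+{ "title": "Shipping FAQ", "body": "Website content and product catalog entries are indexed the same way." }
+```
+
+The request body is newline-delimited JSON: each action line is followed by the document source on the next line, and the body must end with a newline.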
-$$$ingest-pipeline-search-details-generic-reference$$$
+## Use specialized tools [es-ingestion-overview-general-content]
-$$$ingest-pipeline-search-details-specific-custom-reference$$$
+You can use these specialized tools to add general content to {{es}} indices.
-$$$ingest-pipeline-search-details-specific-reference-processors$$$
+| Method | Description | Notes |
+|--------|-------------|-------|
+| [**Web crawler**](https://github.com/elastic/crawler) | Programmatically discover and index content from websites and knowledge bases | Crawl public-facing web content or internal sites accessible via HTTP proxy |
+| [**Search connectors**](https://www.elastic.co/guide/en/elasticsearch/reference/current/es-connectors.html) | Third-party integrations to popular content sources like databases, cloud storage, and business applications | Choose from a range of Elastic-built connectors or build your own in Python using the Elastic connector framework |
+| [**File upload**](/manage-data/ingest/tools/upload-data-files.md) | One-off manual uploads through the UI | Useful for testing or very small-scale use cases, but not recommended for production workflows |
-$$$ingest-pipeline-search-details-specific$$$
+### Process data at ingest time
-$$$ingest-pipeline-search-pipeline-settings-using-the-api$$$
+You can also transform and enrich your content at ingest time using [ingest pipelines](/manage-data/ingest/transform-enrich/ingest-pipelines.md).
-$$$ingest-pipeline-search-pipeline-settings$$$ \ No newline at end of file
+The Elastic UI has a set of tools for creating and managing indices optimized for search use cases. You can also manage your ingest pipelines in this UI. Learn more in [](search-pipelines.md).
\ No newline at end of file diff --git a/raw-migrated-files/elasticsearch/elasticsearch-reference/ingest-pipeline-search.md b/solutions/search/search-pipelines.md similarity index 81% rename from raw-migrated-files/elasticsearch/elasticsearch-reference/ingest-pipeline-search.md rename to solutions/search/search-pipelines.md index 5e27a802f0..78ede152ea 100644 --- a/raw-migrated-files/elasticsearch/elasticsearch-reference/ingest-pipeline-search.md +++ b/solutions/search/search-pipelines.md @@ -1,9 +1,8 @@ -# Ingest pipelines in Search [ingest-pipeline-search] +# Ingest pipelines for search use cases [ingest-pipeline-search] You can manage ingest pipelines through Elasticsearch APIs or Kibana UIs. -The **Content** UI under **Search** has a set of tools for creating and managing indices optimized for search use cases (non time series data). You can also manage your ingest pipelines in this UI. - +The **Content** UI under **Search** has a set of tools for creating and managing indices optimized for search use cases (non-time series data). You can also manage your ingest pipelines in this UI. ## Find pipelines in Content UI [ingest-pipeline-search-where] @@ -18,12 +17,11 @@ To find this tab in the Kibana UI: The tab is highlighted in this screenshot: -:::{image} ../../../images/elasticsearch-reference-ingest-pipeline-ent-search-ui.png +:::{image} /images/elasticsearch-reference-ingest-pipeline-ent-search-ui.png :alt: ingest pipeline ent search ui :class: screenshot ::: - ## Overview [ingest-pipeline-search-in-enterprise-search] These tools can be particularly helpful by providing a layer of customization and post-processing of documents. For example: @@ -34,11 +32,11 @@ These tools can be particularly helpful by providing a layer of customization an It can be a lot of work to set up and manage production-ready pipelines from scratch. Considerations such as error handling, conditional execution, sequencing, versioning, and modularization must all be taken into account. 
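+To illustrate those considerations, even a single-processor pipeline created by hand needs its conditional execution and error handling spelled out explicitly. A sketch, using a hypothetical pipeline name and field:
+
+```console
+PUT _ingest/pipeline/my-title-cleanup
+{
+  "version": 1,
+  "processors": [
+    {
+      "trim": {
+        "field": "title",
+        "if": "ctx.title != null",
+        "on_failure": [
+          { "set": { "field": "error.message", "value": "trim processor failed" } }
+        ]
+      }
+    }
+  ]
+}
+```
+
+The managed pipelines described on this page bundle this kind of handling for you.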
-To this end, when you create indices for search use cases, (including [Elastic web crawler](https://www.elastic.co/guide/en/enterprise-search/current/crawler.html), [connectors](https://www.elastic.co/guide/en/elasticsearch/reference/current/es-connectors.html). , and API indices), each index already has a pipeline set up with several processors that optimize your content for search.
+To this end, when you create indices for search use cases (including web crawler, search connectors, and API indices), each index already has a pipeline set up with several processors that optimize your content for search.
-This pipeline is called `search-default-ingestion`. While it is a "managed" pipeline (meaning it should not be tampered with), you can view its details via the Kibana UI or the Elasticsearch API. You can also [read more about its contents below](../../../solutions/search/ingest-for-search.md#ingest-pipeline-search-details-generic-reference).
+This pipeline is called `search-default-ingestion`. While it is a "managed" pipeline (meaning it should not be tampered with), you can view its details via the Kibana UI or the Elasticsearch API. You can also [read more about its contents below](#ingest-pipeline-search-details-generic-reference).
-You can control whether you run some of these processors. While all features are enabled by default, they are eligible for opt-out. For [Elastic crawler](https://www.elastic.co/guide/en/enterprise-search/current/crawler.html) and [connectors](https://www.elastic.co/guide/en/elasticsearch/reference/current/es-connectors.html). , you can opt out (or back in) per index, and your choices are saved. For API indices, you can opt out (or back in) by including specific fields in your documents. [See below for details](../../../solutions/search/ingest-for-search.md#ingest-pipeline-search-pipeline-settings-using-the-api).
+You can control whether you run some of these processors.
While all features are enabled by default, they are eligible for opt-out. For [Elastic crawler](https://www.elastic.co/guide/en/enterprise-search/current/crawler.html) and [connectors](https://www.elastic.co/guide/en/elasticsearch/reference/current/es-connectors.html), you can opt out (or back in) per index, and your choices are saved. For API indices, you can opt out (or back in) by including specific fields in your documents. [See below for details](#ingest-pipeline-search-pipeline-settings-using-the-api).
At the deployment level, you can change the default settings for all new indices. This will not affect existing indices.
@@ -48,7 +46,7 @@ Each index also provides the capability to easily create index-specific ingest p
2. `@custom`
3. `@ml-inference`
-Like `search-default-ingestion`, the first of these is "managed", but the other two can and should be modified to fit your needs. You can view these pipelines using the platform tools (Kibana UI, Elasticsearch API), and can also [read more about their content below](../../../solutions/search/ingest-for-search.md#ingest-pipeline-search-details-specific).
+Like `search-default-ingestion`, the first of these is "managed", but the other two can and should be modified to fit your needs. You can view these pipelines using the platform tools (Kibana UI, Elasticsearch API), and can also [read more about their content below](#ingest-pipeline-search-details-specific).
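+For example, assuming a hypothetical index named `my-index`, you could inspect the managed pipeline and the index-specific pipelines with the get pipeline API, which accepts wildcards:
+
+```console
+GET _ingest/pipeline/search-default-ingestion
+GET _ingest/pipeline/my-index*
+```
+
+The second request would typically return the `my-index`, `my-index@custom`, and `my-index@ml-inference` pipelines, if they have been created.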
## Pipeline Settings [ingest-pipeline-search-pipeline-settings] @@ -97,10 +95,10 @@ If the pipeline is not specified, the underscore-prefixed fields will actually b ### `search-default-ingestion` Reference [ingest-pipeline-search-details-generic-reference] -You can access this pipeline with the [Elasticsearch Ingest Pipelines API](https://www.elastic.co/guide/en/elasticsearch/reference/current/get-pipeline-api.html) or via Kibana’s [Stack Management > Ingest Pipelines](../../../manage-data/ingest/transform-enrich/ingest-pipelines.md#create-manage-ingest-pipelines) UI. +You can access this pipeline with the [Elasticsearch Ingest Pipelines API](https://www.elastic.co/guide/en/elasticsearch/reference/current/get-pipeline-api.html) or via Kibana’s [Stack Management > Ingest Pipelines](/manage-data/ingest/transform-enrich/ingest-pipelines.md#create-manage-ingest-pipelines) UI. ::::{warning} -This pipeline is a "managed" pipeline. That means that it is not intended to be edited. Editing/updating this pipeline manually could result in unintended behaviors, or difficulty in upgrading in the future. If you want to make customizations, we recommend you utilize index-specific pipelines (see below), specifically [the `@custom` pipeline](../../../solutions/search/ingest-for-search.md#ingest-pipeline-search-details-specific-custom-reference). +This pipeline is a "managed" pipeline. That means that it is not intended to be edited. Editing/updating this pipeline manually could result in unintended behaviors, or difficulty in upgrading in the future. If you want to make customizations, we recommend you utilize index-specific pipelines (see below), specifically [the `@custom` pipeline](#ingest-pipeline-search-details-specific-custom-reference). :::: @@ -118,12 +116,12 @@ This pipeline is a "managed" pipeline. 
That means that it is not intended to be #### Control flow parameters [ingest-pipeline-search-details-generic-reference-params] -The `search-default-ingestion` pipeline does not always run all processors. It utilizes a feature of ingest pipelines to [conditionally run processors](../../../manage-data/ingest/transform-enrich/ingest-pipelines.md#conditionally-run-processor) based on the contents of each individual document. +The `search-default-ingestion` pipeline does not always run all processors. It utilizes a feature of ingest pipelines to [conditionally run processors](/manage-data/ingest/transform-enrich/ingest-pipelines.md#conditionally-run-processor) based on the contents of each individual document. * `_extract_binary_content` - if this field is present and has a value of `true` on a source document, the pipeline will attempt to run the `attachment`, `set_body`, and `remove_replacement_chars` processors. Note that the document will also need an `_attachment` field populated with base64-encoded binary data in order for the `attachment` processor to have any output. If the `_extract_binary_content` field is missing or `false` on a source document, these processors will be skipped. * `_reduce_whitespace` - if this field is present and has a value of `true` on a source document, the pipeline will attempt to run the `remove_extra_whitespace` and `trim` processors. These processors only apply to the `body` field. If the `_reduce_whitespace` field is missing or `false` on a source document, these processors will be skipped. -Crawler, Native Connectors, and Connector Clients will automatically add these control flow parameters based on the settings in the index’s Pipeline tab. To control what settings any new indices will have upon creation, see the deployment wide content settings. See [Pipeline Settings](../../../solutions/search/ingest-for-search.md#ingest-pipeline-search-pipeline-settings). 
+Crawler, Native Connectors, and Connector Clients will automatically add these control flow parameters based on the settings in the index’s Pipeline tab. To control what settings any new indices will have upon creation, see the deployment-wide content settings. See [Pipeline Settings](#ingest-pipeline-search-pipeline-settings).
### Index-specific ingest pipelines [ingest-pipeline-search-details-specific]
@@ -139,7 +137,7 @@ The "copy and customize" button is not available at all Elastic subscription lev
#### `<index-name>` Reference [ingest-pipeline-search-details-specific-reference]
-This pipeline looks and behaves a lot like the [`search-default-ingestion` pipeline](../../../solutions/search/ingest-for-search.md#ingest-pipeline-search-details-generic-reference), but with [two additional processors](../../../solutions/search/ingest-for-search.md#ingest-pipeline-search-details-specific-reference-processors).
+This pipeline looks and behaves a lot like the [`search-default-ingestion` pipeline](#ingest-pipeline-search-details-generic-reference), but with [two additional processors](#ingest-pipeline-search-details-specific-reference-processors).
::::{warning}
You should not rename this pipeline.
::::
::::{warning}
-This pipeline is a "managed" pipeline. That means that it is not intended to be edited. Editing/updating this pipeline manually could result in unintended behaviors, or difficulty in upgrading in the future. If you want to make customizations, we recommend you utilize [the `@custom` pipeline](../../../solutions/search/ingest-for-search.md#ingest-pipeline-search-details-specific-custom-reference).
+This pipeline is a "managed" pipeline. That means that it is not intended to be edited. Editing/updating this pipeline manually could result in unintended behaviors, or difficulty in upgrading in the future.
If you want to make customizations, we recommend you utilize [the `@custom` pipeline](#ingest-pipeline-search-details-specific-custom-reference). :::: @@ -156,7 +154,7 @@ This pipeline is a "managed" pipeline. That means that it is not intended to be ##### Processors [ingest-pipeline-search-details-specific-reference-processors] -In addition to the processors inherited from the [`search-default-ingestion` pipeline](../../../solutions/search/ingest-for-search.md#ingest-pipeline-search-details-generic-reference), the index-specific pipeline also defines: +In addition to the processors inherited from the [`search-default-ingestion` pipeline](#ingest-pipeline-search-details-generic-reference), the index-specific pipeline also defines: * `index_ml_inference_pipeline` - this uses the [Pipeline](https://www.elastic.co/guide/en/elasticsearch/reference/current/pipeline-processor.html) processor to run the `@ml-inference` pipeline. This processor will only be run if the source document includes a `_run_ml_inference` field with the value `true`. * `index_custom_pipeline` - this uses the [Pipeline](https://www.elastic.co/guide/en/elasticsearch/reference/current/pipeline-processor.html) processor to run the `@custom` pipeline. @@ -168,7 +166,7 @@ Like the `search-default-ingestion` pipeline, the `` pipeline does n * `_run_ml_inference` - if this field is present and has a value of `true` on a source document, the pipeline will attempt to run the `index_ml_inference_pipeline` processor. If the `_run_ml_inference` field is missing or `false` on a source document, this processor will be skipped. -Crawler, Native Connectors, and Connector Clients will automatically add these control flow parameters based on the settings in the index’s Pipeline tab. To control what settings any new indices will have upon creation, see the deployment wide content settings. See [Pipeline Settings](../../../solutions/search/ingest-for-search.md#ingest-pipeline-search-pipeline-settings). 
+Crawler, Native Connectors, and Connector Clients will automatically add these control flow parameters based on the settings in the index’s Pipeline tab. To control what settings any new indices will have upon creation, see the deployment-wide content settings. See [Pipeline Settings](#ingest-pipeline-search-pipeline-settings).
#### `@ml-inference` Reference [ingest-pipeline-search-details-specific-ml-reference]
@@ -194,7 +192,7 @@ The `monitor_ml` Elasticsearch cluster permission is required in order to manage
This pipeline is empty to start (no processors), but can be added to via the Kibana UI either through the Pipelines tab of your index, or from the **Stack Management > Ingest Pipelines** page. Unlike the `search-default-ingestion` pipeline and the `<index-name>` pipeline, this pipeline is NOT "managed".
-You are encouraged to make additions and edits to this pipeline, provided its name remains the same. This provides a convenient hook from which to add custom processing and transformations for your data. Be sure to read the [docs for ingest pipelines](../../../manage-data/ingest/transform-enrich/ingest-pipelines.md) to see what options are available.
+You are encouraged to make additions and edits to this pipeline, provided its name remains the same. This provides a convenient hook from which to add custom processing and transformations for your data. Be sure to read the [docs for ingest pipelines](/manage-data/ingest/transform-enrich/ingest-pipelines.md) to see what options are available.
::::{warning}
You should not rename this pipeline.
::::
## Upgrading notes [ingest-pipeline-search-upgrading-notes]
::::{dropdown} Expand to see upgrading notes
-* `app_search_crawler` - Since 8.3, {{app-search-crawler}} has utilized this pipeline to power its binary content extraction.
You can read more about this pipeline and its usage in the [App Search Guide](https://www.elastic.co/guide/en/app-search/current/web-crawler-reference.html#web-crawler-reference-binary-content-extraction). When upgrading from 8.3 to 8.5+, be sure to note any changes that you made to the `app_search_crawler` pipeline. These changes should be re-applied to each index’s `@custom` pipeline in order to ensure a consistent data processing experience. In 8.5+, the [index setting to enable binary content](../../../solutions/search/ingest-for-search.md#ingest-pipeline-search-pipeline-settings) is required **in addition** to the configurations mentioned in the [App Search Guide](https://www.elastic.co/guide/en/app-search/current/web-crawler-reference.html#web-crawler-reference-binary-content-extraction). -* `ent_search_crawler` - Since 8.4, the Elastic web crawler has utilized this pipeline to power its binary content extraction. You can read more about this pipeline and its usage in the [Elastic web crawler Guide](https://www.elastic.co/guide/en/enterprise-search/current/crawler-managing.html#crawler-managing-binary-content). When upgrading from 8.4 to 8.5+, be sure to note any changes that you made to the `ent_search_crawler` pipeline. These changes should be re-applied to each index’s `@custom` pipeline in order to ensure a consistent data processing experience. In 8.5+, the [index setting to enable binary content](../../../solutions/search/ingest-for-search.md#ingest-pipeline-search-pipeline-settings) is required **in addition** to the configurations mentioned in the [Elastic web crawler Guide](https://www.elastic.co/guide/en/enterprise-search/current/crawler-managing.html#crawler-managing-binary-content). +* `app_search_crawler` - Since 8.3, {{app-search-crawler}} has utilized this pipeline to power its binary content extraction. 
You can read more about this pipeline and its usage in the [App Search Guide](https://www.elastic.co/guide/en/app-search/current/web-crawler-reference.html#web-crawler-reference-binary-content-extraction). When upgrading from 8.3 to 8.5+, be sure to note any changes that you made to the `app_search_crawler` pipeline. These changes should be re-applied to each index’s `@custom` pipeline in order to ensure a consistent data processing experience. In 8.5+, the [index setting to enable binary content](#ingest-pipeline-search-pipeline-settings) is required **in addition** to the configurations mentioned in the [App Search Guide](https://www.elastic.co/guide/en/app-search/current/web-crawler-reference.html#web-crawler-reference-binary-content-extraction). +* `ent_search_crawler` - Since 8.4, the Elastic web crawler has utilized this pipeline to power its binary content extraction. You can read more about this pipeline and its usage in the [Elastic web crawler Guide](https://www.elastic.co/guide/en/enterprise-search/current/crawler-managing.html#crawler-managing-binary-content). When upgrading from 8.4 to 8.5+, be sure to note any changes that you made to the `ent_search_crawler` pipeline. These changes should be re-applied to each index’s `@custom` pipeline in order to ensure a consistent data processing experience. In 8.5+, the [index setting to enable binary content](#ingest-pipeline-search-pipeline-settings) is required **in addition** to the configurations mentioned in the [Elastic web crawler Guide](https://www.elastic.co/guide/en/enterprise-search/current/crawler-managing.html#crawler-managing-binary-content). * `ent-search-generic-ingestion` - Since 8.5, Native Connectors, Connector Clients, and new (>8.4) Elastic web crawler indices all made use of this pipeline by default. This pipeline evolved into the `search-default-ingestion` pipeline. -* `search-default-ingestion` - Since 9.0, Connectors have made use of this pipeline by default. 
You can [read more about this pipeline](../../../solutions/search/ingest-for-search.md#ingest-pipeline-search-details-generic-reference) above. As this pipeline is "managed", any modifications that were made to `app_search_crawler` and/or `ent_search_crawler` should NOT be made to `search-default-ingestion`. Instead, if such customizations are desired, you should utilize [Index-specific ingest pipelines](../../../solutions/search/ingest-for-search.md#ingest-pipeline-search-details-specific), placing all modifications in the `@custom` pipeline(s). +* `search-default-ingestion` - Since 9.0, Connectors have made use of this pipeline by default. You can [read more about this pipeline](#ingest-pipeline-search-details-generic-reference) above. As this pipeline is "managed", any modifications that were made to `app_search_crawler` and/or `ent_search_crawler` should NOT be made to `search-default-ingestion`. Instead, if such customizations are desired, you should utilize [Index-specific ingest pipelines](#ingest-pipeline-search-details-specific), placing all modifications in the `@custom` pipeline(s). -:::: +:::: \ No newline at end of file diff --git a/solutions/toc.yml b/solutions/toc.yml index dd37ca2d89..237052b942 100644 --- a/solutions/toc.yml +++ b/solutions/toc.yml @@ -619,6 +619,8 @@ toc: - file: search/building-search-in-your-app-or-site.md - file: search/search-templates.md - file: search/ingest-for-search.md + children: + - file: search/search-pipelines.md - file: search/full-text.md children: - file: search/full-text/search-with-synonyms.md