diff --git a/docs/docset.yml b/docs/docset.yml index 15bd674a5fb5e..831b83809a381 100644 --- a/docs/docset.yml +++ b/docs/docset.yml @@ -111,3 +111,6 @@ subs: feat-imp: "feature importance" feat-imp-cap: "Feature importance" nlp: "natural language processing" + index-manage-app: "Index Management" + connectors-app: "Connectors" + ingest-pipelines-app: "Ingest Pipelines" \ No newline at end of file diff --git a/docs/reference/search-connectors/api-tutorial.md b/docs/reference/search-connectors/api-tutorial.md index 7b691f44e9c64..dfb5fc5dee4ac 100644 --- a/docs/reference/search-connectors/api-tutorial.md +++ b/docs/reference/search-connectors/api-tutorial.md @@ -6,34 +6,31 @@ applies_to: elasticsearch: ga mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/es-connectors-tutorial-api.html +description: Use APIs to synchronize data from a PostgreSQL data source into Elasticsearch. --- # Connector API tutorial [es-connectors-tutorial-api] -Learn how to set up a self-managed connector using the [{{es}} Connector APIs](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-connector). +Learn how to set up a self-managed connector using the [{{es}} connector APIs]({{es-apis}}group/endpoint-connector). -For this example we’ll use the connectors-postgresql,PostgreSQL connector to sync data from a PostgreSQL database to {{es}}. We’ll spin up a simple PostgreSQL instance in Docker with some example data, create a connector, and sync the data to {{es}}. You can follow the same steps to set up a connector for another data source. +For this example we’ll use the [PostgreSQL connector](/reference/search-connectors/es-connectors-postgresql.md) to sync data from a PostgreSQL database to {{es}}. We’ll spin up a simple PostgreSQL instance in Docker with some example data, create a connector, and sync the data to {{es}}. You can follow the same steps to set up a connector for another data source. ::::{tip} -This tutorial focuses on running a self-managed connector on your own infrastructure, and managing syncs using the Connector APIs. See connectors for an overview of how connectors work. +This tutorial focuses on running a self-managed connector on your own infrastructure, and managing syncs using the connector APIs. If you’re just getting started with {{es}}, this tutorial might be a bit advanced. Refer to [quickstart](docs-content://solutions/search/get-started.md) for a more beginner-friendly introduction to {{es}}. -If you’re just getting started with connectors, you might want to start in the UI first. Check out this tutorial that focuses on managing connectors using the UI: - -* [Self-managed connector tutorial](/reference/search-connectors/es-postgresql-connector-client-tutorial.md). Set up a self-managed PostgreSQL connector. +If you’re just getting started with connectors, you might want to start in the UI first. Check out this tutorial that focuses on managing connectors using the UI: [](/reference/search-connectors/es-postgresql-connector-client-tutorial.md). :::: - -### Prerequisites [es-connectors-tutorial-api-prerequisites] +## Prerequisites [es-connectors-tutorial-api-prerequisites] * You should be familiar with how connectors, connectors work, to understand how the API calls relate to the overall connector setup. * You need to have [Docker Desktop](https://www.docker.com/products/docker-desktop/) installed. * You need to have {{es}} running, and an API key to access it. Refer to the next section for details, if you don’t have an {{es}} deployment yet. 
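+
+If you already have a deployment and an API key, you can confirm that the key can reach it before continuing. A minimal check from the Dev Tools Console (any endpoint the key is allowed to read would work just as well):
+
+```console
+GET _cluster/health
+```
+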
- -### Set up {{es}} [es-connectors-tutorial-api-setup-es] +## Set up {{es}} [es-connectors-tutorial-api-setup-es] If you already have an {{es}} deployment on Elastic Cloud (*Hosted deployment* or *Serverless project*), you’re good to go. To spin up {{es}} in local dev mode in Docker for testing purposes, open the collapsible section below. @@ -73,7 +70,8 @@ Note: With {{es}} running locally, you will need to pass the username and passwo ::::{admonition} Running API calls -You can run API calls using the [Dev Tools Console](docs-content://explore-analyze/query-filter/tools/console.md) in Kibana, using `curl` in your terminal, or with our programming language clients. Our example widget allows you to copy code examples in both Dev Tools Console syntax and curl syntax. To use curl, you’ll need to add authentication headers to your request. +You can run API calls using the [Dev Tools Console](docs-content://explore-analyze/query-filter/tools/console.md) in Kibana, using `curl` in your terminal, or with our programming language clients. +To use curl, you’ll need to add authentication headers to your request. Here’s an example of how to do that. Note that if you want the connector ID to be auto-generated, use the `POST _connector` endpoint. @@ -88,13 +86,11 @@ curl -s -X PUT http://localhost:9200/_connector/my-connector-id \ }' ``` -Refer to connectors-tutorial-api-create-api-key for instructions on creating an API key. +Refer to [](/reference/search-connectors/es-postgresql-connector-client-tutorial.md) for instructions on creating an API key. :::: - - -### Run PostgreSQL instance in Docker (optional) [es-connectors-tutorial-api-setup-postgres] +## Run PostgreSQL instance in Docker (optional) [es-connectors-tutorial-api-setup-postgres] For this tutorial, we’ll set up a PostgreSQL instance in Docker with some example data. Of course, you can **skip this step and use your own existing PostgreSQL instance** if you have one. Keep in mind that using a different instance might require adjustments to the connector configuration described in the next steps. @@ -105,7 +101,7 @@ Let’s launch a PostgreSQL container with a user and password, exposed at port docker run --name postgres -e POSTGRES_USER=myuser -e POSTGRES_PASSWORD=mypassword -p 5432:5432 -d postgres ``` -**Download and import example data** +### Download and import example data Next we need to create a directory to store our example dataset for this tutorial. In your terminal, run the following command: @@ -145,10 +141,9 @@ This tutorial uses a very basic setup. To use advanced functionality such as fil Now it’s time for the real fun! We’ll set up a connector to create a searchable mirror of our PostgreSQL data in {{es}}. +## Create a connector [es-connectors-tutorial-api-create-connector] -### Create a connector [es-connectors-tutorial-api-create-connector] - -We’ll use the [Create connector API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-connector-put) to create a PostgreSQL connector instance. +We’ll use the [create connector API]({{es-apis}}operation/operation-connector-put) to create a PostgreSQL connector instance. Run the following API call, using the [Dev Tools Console](docs-content://explore-analyze/query-filter/tools/console.md) or `curl`: @@ -171,10 +166,9 @@ Note that we specified the `my-connector-id` ID as a part of the `PUT` request. If you’d prefer to use an autogenerated ID, replace `PUT _connector/my-connector-id` with `POST _connector`. 
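+
+For reference, the autogenerated-ID variant of the request might look like the following sketch. It reuses this tutorial’s `music` index and `postgresql` service type; the `name` value is just an illustrative label.
+
+```console
+POST _connector
+{
+  "index_name": "music",
+  "name": "Music catalog PostgreSQL connector",
+  "service_type": "postgresql"
+}
+```
+
+The response includes the generated `id`, which you would then use in place of `my-connector-id` in the rest of this tutorial.
+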
+## Set up the connector service [es-connectors-tutorial-api-deploy-connector] -### Run connector service [es-connectors-tutorial-api-deploy-connector] - -Now we’ll run the connector service so we can start syncing data from our PostgreSQL instance to {{es}}. We’ll use the steps outlined in connectors-run-from-docker. +Now we’ll run the connector service so we can start syncing data from our PostgreSQL instance to {{es}}. We’ll use the steps outlined in [](/reference/search-connectors/es-connectors-run-from-docker.md). When running the connectors service on your own infrastructure, you need to provide a configuration file with the following details: @@ -183,10 +177,9 @@ When running the connectors service on your own infrastructure, you need to prov * Your third-party data source type (`service_type`) * Your connector ID (`connector_id`) +### Create an API key [es-connectors-tutorial-api-create-api-key] -#### Create an API key [es-connectors-tutorial-api-create-api-key] - -If you haven’t already created an API key to access {{es}}, you can use the [_security/api_key](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-security-create-api-key) endpoint. +If you haven’t already created an API key to access {{es}}, you can use the [_security/api_key]({{es-apis}}operation/operation-security-create-api-key) endpoint. Here, we assume your target {{es}} index name is `music`. If you use a different index name, adjust the request body accordingly. @@ -225,9 +218,7 @@ You can also create an API key in the {{kib}} and Serverless UIs. :::: - - -#### Prepare the configuration file [es-connectors-tutorial-api-prepare-configuration-file] +### Prepare the configuration file [es-connectors-tutorial-api-prepare-configuration-file] Let’s create a directory and a `config.yml` file to store the connector configuration: @@ -249,8 +240,7 @@ connectors: We provide an [example configuration file](https://raw.githubusercontent.com/elastic/connectors/main/config.yml.example) in the `elastic/connectors` repository for reference. - -#### Run the connector service [es-connectors-tutorial-api-run-connector-service] +### Run the service [es-connectors-tutorial-api-run-connector-service] Now that we have the configuration file set up, we can run the connector service locally. This will point your connector instance at your {{es}} deployment. @@ -273,12 +263,11 @@ Verify your connector is connected by getting the connector status (should be `n GET _connector/my-connector-id ``` - -### Configure connector [es-connectors-tutorial-api-update-connector-configuration] +## Configure the connector [es-connectors-tutorial-api-update-connector-configuration] Now our connector instance is up and running, but it doesn’t yet know *where* to sync data from. The final piece of the puzzle is to configure our connector with details about our PostgreSQL instance. When setting up a connector in the Elastic Cloud or Serverless UIs, you’re prompted to add these details in the user interface. -But because this tutorial is all about working with connectors *programmatically*, we’ll use the [Update connector configuration API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-connector-update-configuration) to add our configuration details. +But because this tutorial is all about working with connectors *programmatically*, we’ll use the [update connector configuration API]({{es-apis}}operation/operation-connector-update-configuration) to add our configuration details. 
::::{tip} Before configuring the connector, ensure that the configuration schema is registered by the service. For self-managed connectors, the schema registers on service startup (once the `config.yml` is populated). @@ -310,9 +299,7 @@ Configuration details are specific to the connector type. The keys and values wi :::: - - -### Sync data [es-connectors-tutorial-api-sync] +## Sync your data [es-connectors-tutorial-api-sync] We’re now ready to sync our PostgreSQL data to {{es}}. Run the following API call to start a full sync job: @@ -327,15 +314,13 @@ POST _connector/_sync_job To store data in {{es}}, the connector needs to create an index. When we created the connector, we specified the `music` index. The connector will create and configure this {{es}} index before launching the sync job. ::::{tip} -In the approach we’ve used here, the connector will use [dynamic mappings](docs-content://manage-data/data-store/mapping.md#mapping-dynamic) to automatically infer the data types of your fields. In a real-world scenario you would use the {{es}} [Create index API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-create) to first create the index with the desired field mappings and index settings. Defining your own mappings upfront gives you more control over how your data is indexed. +In the approach we’ve used here, the connector will use [dynamic mappings](docs-content://manage-data/data-store/mapping.md#mapping-dynamic) to automatically infer the data types of your fields. In a real-world scenario you would use the {{es}} [create index API]({{es-apis}}operation/operation-indices-create) to first create the index with the desired field mappings and index settings. Defining your own mappings upfront gives you more control over how your data is indexed. :::: +### Check sync status [es-connectors-tutorial-api-check-sync-status] - -#### Check sync status [es-connectors-tutorial-api-check-sync-status] - -Use the [Get sync job API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-connector-sync-job-get) to track the status and progress of the sync job. By default, the most recent job statuses are returned first. Run the following API call to check the status of the sync job: +Use the [get sync job API]({{es-apis}}operation/operation-connector-sync-job-get) to track the status and progress of the sync job. By default, the most recent job statuses are returned first. Run the following API call to check the status of the sync job: ```console GET _connector/_sync_job?connector_id=my-connector-id&size=1 @@ -345,6 +330,8 @@ The job document will be updated as the sync progresses, you can check it as oft Once the job completes, the status should be `completed` and `indexed_document_count` should be **622**. +## Query your data + Verify that data is present in the `music` index with the following API call: ```console @@ -357,8 +344,7 @@ GET music/_count GET music/_search ``` - -## Troubleshooting [es-connectors-tutorial-api-troubleshooting] +## Troubleshoot [es-connectors-tutorial-api-troubleshooting] Use the following command to inspect the latest sync job’s status: @@ -369,7 +355,7 @@ GET _connector/_sync_job?connector_id=my-connector-id&size=1 If the connector encountered any errors during the sync, you’ll find these in the `error` field. 
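+
+If a job appears stuck, you can cancel it and start a fresh sync. A sketch, where `my-sync-job-id` is a placeholder for the `id` returned by the previous call:
+
+```console
+PUT _connector/_sync_job/my-sync-job-id/_cancel
+```
+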
-### Cleaning up [es-connectors-tutorial-api-cleanup] +## Clean up [es-connectors-tutorial-api-cleanup] To delete the connector and its associated sync jobs run this command: @@ -397,13 +383,12 @@ docker stop docker rm ``` +## Next steps [es-connectors-tutorial-api-next-steps] -### Next steps [es-connectors-tutorial-api-next-steps] - -Congratulations! You’ve successfully set up a self-managed connector using the Connector APIs. +Congratulations! You’ve successfully set up a self-managed connector using the connector APIs. Here are some next steps to explore: -* Learn more about the [Connector APIs](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-connector). +* Learn more about the [connector APIs]({{es-apis}}group/endpoint-connector). * Learn how to deploy {{es}}, {{kib}}, and the connectors service using Docker Compose in our [quickstart guide](https://github.com/elastic/connectors/tree/main/scripts/stack#readme). diff --git a/docs/reference/search-connectors/connectors-ui-in-kibana.md b/docs/reference/search-connectors/connectors-ui-in-kibana.md index c5b4d72bad60c..c15fbd2f67284 100644 --- a/docs/reference/search-connectors/connectors-ui-in-kibana.md +++ b/docs/reference/search-connectors/connectors-ui-in-kibana.md @@ -11,10 +11,10 @@ mapped_pages: This document describes operations available to connectors using the UI. -In the Kibana or Serverless UI, find Connectors using the [global search field](docs-content://explore-analyze/query-filter/filtering.md#_finding_your_apps_and_objects). Here, you can view a summary of all your connectors and sync jobs, and to create new connectors. +In the Kibana or Serverless UI, find **{{connectors-app}}** using the [global search field](docs-content://explore-analyze/query-filter/filtering.md#_finding_your_apps_and_objects). Here, you can view a summary of all your connectors and sync jobs, and to create new connectors. ::::{tip} -In 8.12 we introduced a set of [Connector APIs](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-connector) to create and manage Elastic connectors and sync jobs, along with a [CLI tool](https://github.com/elastic/connectors/blob/main/docs/CLI.md). Use these tools if you’d like to work with connectors and sync jobs programmatically, without using the UI. +In 8.12 we introduced a set of [connector APIs]({{es-apis}}group/endpoint-connector) to create and manage Elastic connectors and sync jobs, along with a [CLI tool](https://github.com/elastic/connectors/blob/main/docs/CLI.md). Use these tools if you’d like to work with connectors and sync jobs programmatically, without using the UI. :::: @@ -24,13 +24,13 @@ In 8.12 we introduced a set of [Connector APIs](https://www.elastic.co/docs/api/ You connector writes data to an {{es}} index. -To create self-managed [**self-managed connector**](/reference/search-connectors/self-managed-connectors.md), use the buttons under **Search > Content > Connectors**. Once you’ve chosen the data source type you’d like to sync, you’ll be prompted to create an {{es}} index. +To create [self-managed connectors](/reference/search-connectors/self-managed-connectors.md), use the buttons under **{{es}} > Content > {{connectors-app}}**. Once you’ve chosen the data source type you’d like to sync, you’ll be prompted to create an {{es}} index. ## Manage connector indices [es-connectors-usage-indices] View and manage all Elasticsearch indices managed by connectors. 
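+
+This information is also available programmatically: listing connectors returns each connector’s attached `index_name` along with its status. A quick sketch using the connector APIs mentioned in the tip above:
+
+```console
+GET _connector?from=0&size=20
+```
+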
-In the {{kib}} UI, navigate to **Search > Content > Connectors** from the main menu, or use the [global search field](docs-content://explore-analyze/query-filter/filtering.md#_finding_your_apps_and_objects). Here, you can view a list of connector indices and their attributes, including connector type health and ingestion status. +In the {{kib}} UI, navigate to **{{es}} > Content > {{connectors-app}}** from the main menu, or use the [global search field](docs-content://explore-analyze/query-filter/filtering.md#_finding_your_apps_and_objects). Here, you can view a list of connector indices and their attributes, including connector type health and ingestion status. Within this interface, you can choose to view the details for each existing index or delete an index. Or, you can [create a new connector index](#es-connectors-usage-index-create). @@ -41,21 +41,21 @@ These operations require access to Kibana and additional index privileges. {{es}} stores your data as documents in an index. Each index is made up of a set of fields and each field has a type (such as `keyword`, `boolean`, or `date`). -**Mapping** is the process of defining how a document, and the fields it contains, are stored and indexed. Connectors use [dynamic mapping](docs-content://manage-data/data-store/mapping/dynamic-field-mapping.md) to automatically create mappings based on the data fetched from the source. +Mapping is the process of defining how a document, and the fields it contains, are stored and indexed. Connectors use [dynamic mapping](docs-content://manage-data/data-store/mapping/dynamic-field-mapping.md) to automatically create mappings based on the data fetched from the source. -Index **settings** are configurations that can be adjusted on a per-index basis. They control things like the index’s performance, the resources it uses, and how it should handle operations. +Index settings are configurations that can be adjusted on a per-index basis. They control things like the index’s performance, the resources it uses, and how it should handle operations. -When you create an index with a connector, the index is created with *default* search-optimized field template mappings and index settings. Mappings for specific fields are then dynamically created based on the data fetched from the source. +When you create an index with a connector, the index is created with default search-optimized field template mappings and index settings. Mappings for specific fields are then dynamically created based on the data fetched from the source. You can inspect your index mappings in the following ways: -* **In the {{kib}} UI**: Navigate to **Search > Content > Indices > *YOUR-INDEX* > Index Mappings** -* **By API**: Use the [Get mapping API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-get-mapping) +* In the {{kib}} UI: Navigate to **{{es}} > Content > Indices > *YOUR-INDEX* > Index Mappings**. +* By API: Use the [get mapping API]({{es-apis}}operation/operation-indices-get-mapping). You can manually **edit** the mappings and settings via the {{es}} APIs: -* Use the [Put mapping API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-mapping) to update index mappings. -* Use the [Update index settings API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-settings) to update index settings. +* Use the [put mapping API]({{es-apis}}operation/operation-indices-put-mapping) to update index mappings. 
+* Use the [update index settings API]({{es-apis}}operation/operation-indices-put-settings) to update index settings. It’s important to note that these updates are more complex when the index already contains data. @@ -69,12 +69,12 @@ Updating mappings and settings is simpler when your index has no data. If you cr ### Customize mappings and settings after syncing data [es-connectors-usage-index-create-configure-existing-index-have-data] -Once data has been added to {{es}} using dynamic mappings, you can’t directly update existing field mappings. If you’ve already synced data into an index and want to change the mappings, you’ll need to [reindex your data](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-reindex). +Once data has been added to {{es}} using dynamic mappings, you can’t directly update existing field mappings. If you’ve already synced data into an index and want to change the mappings, you’ll need to [reindex your data]({{es-apis}}operation/operation-reindex). The workflow for these updates is as follows: -1. [Create](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-create) a new index with the desired mappings and settings. -2. [Reindex](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-reindex) your data from the old index into this new index. +1. [Create]({{es-apis}}operation/operation-indices-create) a new index with the desired mappings and settings. +2. [Reindex]({{es-apis}}operation/operation-reindex) your data from the old index into this new index. 3. Delete the old index. 4. (Optional) Use an [alias](docs-content://manage-data/data-store/aliases.md), if you want to retain the old index name. 5. Attach your connector to the new index or alias. @@ -84,9 +84,9 @@ The workflow for these updates is as follows: After creating an index to be managed by a connector, you can configure automatic, recurring syncs. -In the {{kib}} UI, navigate to **Search > Content > Connectors** from the main menu, or use the [global search field](docs-content://explore-analyze/query-filter/filtering.md#_finding_your_apps_and_objects). +In the {{kib}} UI, navigate to **{{es}} > Content > {{connectors-app}}** from the main menu, or use the [global search field](docs-content://explore-analyze/query-filter/filtering.md#_finding_your_apps_and_objects). -Choose the index to configure, and then choose the **Scheduling** tab. +Choose the connector and then the **Scheduling** tab. Within this interface, you can enable or disable scheduled: @@ -107,9 +107,9 @@ After you enable recurring syncs or sync once, the first sync will begin. (There After creating the index to be managed by a connector, you can request a single sync at any time. -In the {{kib}} UI, navigate to **Search > Content > Elasticsearch indices** from the main menu, or use the [global search field](docs-content://explore-analyze/query-filter/filtering.md#_finding_your_apps_and_objects). +In the {{kib}} UI, navigate to **{{es}} > Content > {{connectors-app}}** from the main menu, or use the [global search field](docs-content://explore-analyze/query-filter/filtering.md#_finding_your_apps_and_objects). -Then choose the index to sync. +Then choose the connector to sync. Regardless of which tab is active, the **Sync** button is always visible in the top right. Choose this button to reveal sync options: @@ -117,7 +117,7 @@ Regardless of which tab is active, the **Sync** button is always visible in the 2. Incremental content (if supported) 3. 
Access control (if supported) -Choose one of the options to request a sync. (There may be a short delay before the connector service begins the sync.) +Choose one of the options to request a sync. There may be a short delay before the connector service begins the sync. This operation requires access to Kibana and the `write` [indices privilege^](/reference/elasticsearch/security-privileges.md) for the `.elastic-connectors` index. @@ -126,9 +126,9 @@ This operation requires access to Kibana and the `write` [indices privilege^](/r After a sync has started, you can cancel the sync before it completes. -In the {{kib}} UI, navigate to **Search > Content > Elasticsearch indices** from the main menu, or use the [global search field](docs-content://explore-analyze/query-filter/filtering.md#_finding_your_apps_and_objects). +In the {{kib}} UI, navigate to **{{es}} > Content > {{connectors-app}}** from the main menu, or use the [global search field](docs-content://explore-analyze/query-filter/filtering.md#_finding_your_apps_and_objects). -Then choose the index with the running sync. +Then choose the connector with the running sync. Regardless of which tab is active, the **Sync** button is always visible in the top right. Choose this button to reveal sync options, and choose **Cancel Syncs** to cancel active syncs. This will cancel the running job, and marks all *pending* and *suspended* jobs as canceled as well. (There may be a short delay before the connector service cancels the syncs.) @@ -139,9 +139,9 @@ This operation requires access to Kibana and the `write` [indices privilege^](/r View the index details to see a variety of information that communicate the status of the index and connector. -In the {{kib}} UI, navigate to **Search > Content > Elasticsearch indices** from the main menu, or use the [global search field](docs-content://explore-analyze/query-filter/filtering.md#_finding_your_apps_and_objects). +In the {{kib}} UI, navigate to **{{es}} > Content > {{connectors-app}}** from the main menu, or use the [global search field](docs-content://explore-analyze/query-filter/filtering.md#_finding_your_apps_and_objects). -Then choose the index to view. +Then choose the connector to view. The **Overview** tab presents a variety of information, including: @@ -150,7 +150,7 @@ The **Overview** tab presents a variety of information, including: * The current ingestion status (see below for possible values). * The current document count. -Possible values of ingestion status: +Possible values of ingestion status include: * Incomplete - A connector that is not configured yet. * Configured - A connector that is configured. @@ -159,9 +159,8 @@ Possible values of ingestion status: * Connector failure - A connector that has not seen any update for more than 30 minutes. * Sync failure - A connector that failed in the last sync job. -This tab also displays the recent sync history, including sync status (see below for possible values). - -Possible values of sync status: +This tab also displays the recent sync history, including sync status. +Possible values of sync status include: * Sync pending - The initial job status, the job is pending to be picked up. * Sync in progress - The job is running. @@ -186,11 +185,9 @@ This operation requires access to Kibana and the `read` [indices privilege^](/re View the documents the connector has synced from the data. Additionally view the index mappings to determine the current document schema. 
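+
+If you’d rather inspect this from Dev Tools, a search plus a get mapping request against the connector’s attached index returns the same documents and inferred schema. In this sketch, `my-connector-index` is a placeholder for your own index name:
+
+```console
+GET my-connector-index/_search?size=5
+
+GET my-connector-index/_mapping
+```
+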
-In the {{kib}} UI, navigate to **Search > Content > Elasticsearch indices** from the main menu, or use the [global search field](docs-content://explore-analyze/query-filter/filtering.md#_finding_your_apps_and_objects). - -Then choose the index to view. +In the {{kib}} UI, navigate to **{{es}} > Content > {{connectors-app}}** from the main menu, or use the [global search field](docs-content://explore-analyze/query-filter/filtering.md#_finding_your_apps_and_objects). -Choose the **Documents** tab to view the synced documents. Choose the **Index Mappings** tab to view the index mappings that were created by the connector. +Select the connector then the **Documents** tab to view the synced documents. Choose the **Mappings** tab to view the index mappings that were created by the connector. When setting up a new connector, ensure you are getting the documents and fields you were expecting from the data source. If not, see [Troubleshooting](/reference/search-connectors/es-connectors-troubleshooting.md) for help. @@ -203,7 +200,7 @@ See [Security](/reference/search-connectors/es-connectors-security.md) for secur Use [sync rules](/reference/search-connectors/es-sync-rules.md) to limit which documents are fetched from the data source, or limit which fetched documents are stored in Elastic. -In the {{kib}} UI, navigate to **Search > Content > Elasticsearch indices** from the main menu, or use the [global search field](docs-content://explore-analyze/query-filter/filtering.md#_finding_your_apps_and_objects). +In the {{kib}} UI, navigate to **{{es}} > Content > Elasticsearch indices** from the main menu, or use the [global search field](docs-content://explore-analyze/query-filter/filtering.md#_finding_your_apps_and_objects). Then choose the index to manage and choose the **Sync rules** tab. @@ -212,7 +209,5 @@ Then choose the index to manage and choose the **Sync rules** tab. Use [ingest pipelines](docs-content://solutions/search/ingest-for-search.md) to transform fetched data before it is stored in Elastic. -In the {{kib}} UI, navigate to **Search > Content > Elasticsearch indices** from the main menu, or use the [global search field](docs-content://explore-analyze/query-filter/filtering.md#_finding_your_apps_and_objects). - -Then choose the index to manage and choose the **Pipelines** tab. - +In the {{kib}} UI, navigate to **{{es}} > Content > {{connectors-app}}** from the main menu, or use the [global search field](docs-content://explore-analyze/query-filter/filtering.md#_finding_your_apps_and_objects). +Then choose the connector and view its **Pipelines** tab. diff --git a/docs/reference/search-connectors/es-postgresql-connector-client-tutorial.md b/docs/reference/search-connectors/es-postgresql-connector-client-tutorial.md index 54a275e8184fb..8cc8f9aee771b 100644 --- a/docs/reference/search-connectors/es-postgresql-connector-client-tutorial.md +++ b/docs/reference/search-connectors/es-postgresql-connector-client-tutorial.md @@ -3,28 +3,33 @@ navigation_title: "Tutorial" mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/es-postgresql-connector-client-tutorial.html - https://www.elastic.co/guide/en/starting-with-the-elasticsearch-platform-and-its-solutions/current/getting-started-appsearch.html +applies_to: + stack: ga + serverless: + elasticsearch: ga +description: Synchronize data from a PostgreSQL data source into Elasticsearch. 
--- -# PostgreSQL self-managed connector tutorial [es-postgresql-connector-client-tutorial] +# Set up a self-managed PostgreSQL connector - -This tutorial walks you through the process of creating a self-managed connector for a PostgreSQL data source. You’ll be using the [self-managed connector](/reference/search-connectors/self-managed-connectors.md) workflow in the Kibana UI. This means you’ll be deploying the connector on your own infrastructure. Refer to the [Elastic PostgreSQL connector reference](/reference/search-connectors/es-connectors-postgresql.md) for more information about this connector. - -In this exercise, you’ll be working in both the terminal (or your IDE) and the Kibana UI. +Elastic connectors enable you to create searchable, read-only replicas of your data sources in {{es}}. +This tutorial walks you through the process of creating a self-managed connector for a PostgreSQL data source. If you want to deploy a self-managed connector for another data source, use this tutorial as a blueprint. Refer to the list of available [connectors](/reference/search-connectors/index.md). ::::{tip} -Want to get started quickly testing a self-managed connector using Docker Compose? Refer to this [guide](https://github.com/elastic/connectors/tree/main/scripts/stack#readme) in the `elastic/connectors` repo for more information. +Want to get started quickly testing a self-managed connector and a self-managed cluster using Docker Compose? Refer to the [readme](https://github.com/elastic/connectors/tree/main/scripts/stack#readme) in the `elastic/connectors` repo for more information. :::: ## Prerequisites [es-postgresql-connector-client-tutorial-prerequisites] - ### Elastic prerequisites [es-postgresql-connector-client-tutorial-prerequisites-elastic] -First, ensure you satisfy the [prerequisites](/reference/search-connectors/self-managed-connectors.md#es-build-connector-prerequisites) for self-managed connectors. +You must satisfy the [prerequisites](/reference/search-connectors/self-managed-connectors.md#es-build-connector-prerequisites) for self-managed connectors. + ### PostgreSQL prerequisites [es-postgresql-connector-client-tutorial-postgresql-prerequisites] @@ -47,142 +52,119 @@ Then restart the PostgreSQL server. :::: +## Set up the connector +:::::{stepper} +::::{step} Create an Elasticsearch index +To store data in {{es}}, the connector needs to create an index. +By default, connectors use [dynamic mappings](docs-content://manage-data/data-store/mapping.md#mapping-dynamic) to automatically infer the data types of your fields. +If you [use APIs](/reference/search-connectors/api-tutorial.md) or {{es-serverless}}, you can create an index with the desired field mappings and index settings before you create the connector. +Defining your own mappings upfront gives you more control over how your data is indexed. -## Steps [es-postgresql-connector-client-tutorial-steps] - -To complete this tutorial, you’ll need to complete the following steps: - -1. [Create an Elasticsearch index](#es-postgresql-connector-client-tutorial-create-index) -2. [Set up the connector](#es-postgresql-connector-client-tutorial-setup-connector) -3. [Run the `connectors` connector service](#es-postgresql-connector-client-tutorial-run-connector-service) -4. 
[Sync your PostgreSQL data source](#es-postgresql-connector-client-tutorial-sync-data-source) - - -## Create an Elasticsearch index [es-postgresql-connector-client-tutorial-create-index] - -Elastic connectors enable you to create searchable, read-only replicas of your data sources in Elasticsearch. The first step in setting up your self-managed connector is to create an index. - -In the [Kibana^](docs-content://get-started/the-stack.md) UI, navigate to **Search > Content > Elasticsearch indices** from the main menu, or use the [global search field](docs-content://explore-analyze/query-filter/filtering.md#_finding_your_apps_and_objects). - -Create a new connector index: - -1. Under **Select an ingestion method** choose **Connector**. -2. Choose **PostgreSQL** from the list of connectors. -3. Name your index and optionally change the language analyzer to match the human language of your data source. (The index name you provide is automatically prefixed with `search-`.) -4. Save your changes. - -The index is created and ready to configure. - -::::{admonition} Gather Elastic details -:name: es-postgresql-connector-client-tutorial-gather-elastic-details - -Before you can configure the connector, you need to gather some details about your Elastic deployment: - -* **Elasticsearch endpoint**. - - * If you’re an Elastic Cloud user, find your deployment’s Elasticsearch endpoint in the Cloud UI under **Cloud > Deployments > > Elasticsearch**. - * If you’re running your Elastic deployment and the connector service in Docker, the default Elasticsearch endpoint is `http://host.docker.internal:9200`. - -* **API key.** You’ll need this key to configure the connector. Use an existing key or create a new one. -* **Connector ID**. Your unique connector ID is automatically generated when you create the connector. Find this in the Kibana UI. - +Navigate to **{{index-manage-app}}** or use the [global search field](docs-content://explore-analyze/find-and-organize/find-apps-and-objects.md). +Follow the index creation workflow then optionally define field mappings. +For example, to add semantic search capabilities, you could add an extra field that stores your vectors for semantic search. :::: +::::{step} Create the connector +Navigate to **{{connectors-app}}** or use the global search field. +Follow the connector creation process in the UI. For example: + +1. Select **PostgreSQL** from the list of connectors. +1. Edit the name and description for the connector. This will help your team identify the connector. +1. Gather configuration details. + Before you can proceed to the next step, you need to gather some details about your Elastic deployment: + + * Elasticsearch endpoint: + * If you’re an Elastic Cloud user, find your deployment’s Elasticsearch endpoint in the Cloud UI under **Cloud > Deployments > > Elasticsearch**. + * If you’re running your Elastic deployment and the connector service in Docker, the default Elasticsearch endpoint is `http://host.docker.internal:9200`. + * API key: You’ll need this key to configure the connector. Use an existing key or create a new one. + * Connector ID: Your unique connector ID is automatically generated when you create the connector. +:::: +::::{step} Run the connector service +You must run the connector code on your own infrastructure and link it to {{es}}. +You have two options: [Run with Docker](/reference/search-connectors/es-connectors-run-from-docker.md) and [Run from source](/reference/search-connectors/es-connectors-run-from-source.md). 
+For this example, we’ll use the latter method: + +1. Clone or fork the repository locally with the following command: `git clone https://github.com/elastic/connectors`. +1. Open the `config.yml` configuration file in your editor of choice. +1. Replace the values for `host`, `api_key`, and `connector_id` with the values you gathered earlier. Use the `service_type` value `postgresql` for this connector. + + :::{dropdown} Expand to see an example config.yml file + Replace the values for `host`, `api_key`, and `connector_id` with your own values. + Use the `service_type` value `postgresql` for this connector. + + ```yaml + elasticsearch: + host: ">" # Your Elasticsearch endpoint + api_key: "" # Your top-level Elasticsearch API key + ... + connectors: + - + connector_id: "" + api_key: "" # Your scoped connector index API key (optional). If not provided, the top-level API key is used. + service_type: "postgresql" + + sources: + # mongodb: connectors.sources.mongo:MongoDataSource + # s3: connectors.sources.s3:S3DataSource + # dir: connectors.sources.directory:DirectoryDataSource + # mysql: connectors.sources.mysql:MySqlDataSource + # network_drive: connectors.sources.network_drive:NASDataSource + # google_cloud_storage: connectors.sources.google_cloud_storage:GoogleCloudStorageDataSource + # azure_blob_storage: connectors.sources.azure_blob_storage:AzureBlobStorageDataSource + postgresql: connectors.sources.postgresql:PostgreSQLDataSource + # oracle: connectors.sources.oracle:OracleDataSource + # sharepoint: connectors.sources.sharepoint:SharepointDataSource + # mssql: connectors.sources.mssql:MSSQLDataSource + # jira: connectors.sources.jira:JiraDataSource + ``` + +1. Now that you’ve configured the connector code, you can run the connector service. In your terminal or IDE: + + 1. `cd` into the root of your `connectors` clone/fork. + 1. Run the following command: `make run`. + +The connector service should now be running. +The UI will let you know that the connector has successfully connected to {{es}}. + +:::{tip} +Here we’re working locally. In production setups, you’ll deploy the connector service to your own infrastructure. +::: +:::: +::::{step} Add your data source details + +Now your connector instance is up and running, but it doesn’t yet know where to sync data from. +The final piece of the puzzle is to configure your connector with details about the PostgreSQL instance. + +Return to **{{connectors-app}}** to complete the connector creation process in the UI. +Enter the following PostgreSQL instance details: + +* **Host**: The server host address for your PostgreSQL instance. +* **Port**: The port number for your PostgreSQL instance. +* **Username**: The username of the PostgreSQL account. +* **Password**: The password for that user. +* **Database**: The name of the PostgreSQL database. +* **Schema**: The schema of the PostgreSQL database. +* **Comma-separated list of tables**: `*` will fetch data from all tables in the configured database. + +:::{note} +Configuration details are specific to the connector type. +The keys and values will differ depending on which third-party data source you’re connecting to. +Refer to the [](/reference/search-connectors/es-connectors-postgresql.md) for more details. +::: +:::: +::::{step} Link your index +If you [use APIs](/reference/search-connectors/api-tutorial.md) or {{es-serverless}}, you can create an index or choose an existing index for use by the connector. +Otherwise, the index is created for you and uses dynamic mappings for the fields. 
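+
+If you go the API route, a minimal sketch of creating an index with explicit mappings before attaching the connector might look like this (the index name and fields are illustrative, not tied to any particular data source):
+
+```console
+PUT my-postgresql-content
+{
+  "mappings": {
+    "properties": {
+      "title":      { "type": "text" },
+      "body":       { "type": "text" },
+      "updated_at": { "type": "date" }
+    }
+  }
+}
+```
+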
+:::: +::::: +## Sync your data [es-postgresql-connector-client-tutorial-sync-data-source] +In the **{{connectors-app}}** page, you can launch a sync on-demand or on a schedule. +The connector will traverse the database and synchronize documents to your index. -## Set up the connector [es-postgresql-connector-client-tutorial-setup-connector] - -Once you’ve created an index, you can set up the connector. You will be guided through this process in the UI. - -1. **Edit the name and description for the connector.** This will help your team identify the connector. -2. **Clone and edit the connector service code.** For this example, we’ll use the [Python framework](https://github.com/elastic/connectors/tree/main). Follow these steps: - - * Clone or fork that repository locally with the following command: `git clone https://github.com/elastic/connectors`. - * Open the `config.yml` configuration file in your editor of choice. - * Replace the values for `host`, `api_key`, and `connector_id` with the values you gathered [earlier](#es-postgresql-connector-client-tutorial-gather-elastic-details). Use the `service_type` value `postgresql` for this connector. - - ::::{dropdown} Expand to see an example config.yml file - Replace the values for `host`, `api_key`, and `connector_id` with your own values. Use the `service_type` value `postgresql` for this connector. - - ```yaml - elasticsearch: - host: > # Your Elasticsearch endpoint - api_key: '' # Your top-level Elasticsearch API key - ... - connectors: - - - connector_id: "" - api_key: "'" # Your scoped connector index API key (optional). If not provided, the top-level API key is used. - service_type: "postgresql" - - - - # Self-managed connector settings - connector_id: '' # Your connector ID - service_type: 'postgresql' # The service type for your connector - - sources: - # mongodb: connectors.sources.mongo:MongoDataSource - # s3: connectors.sources.s3:S3DataSource - # dir: connectors.sources.directory:DirectoryDataSource - # mysql: connectors.sources.mysql:MySqlDataSource - # network_drive: connectors.sources.network_drive:NASDataSource - # google_cloud_storage: connectors.sources.google_cloud_storage:GoogleCloudStorageDataSource - # azure_blob_storage: connectors.sources.azure_blob_storage:AzureBlobStorageDataSource - postgresql: connectors.sources.postgresql:PostgreSQLDataSource - # oracle: connectors.sources.oracle:OracleDataSource - # sharepoint: connectors.sources.sharepoint:SharepointDataSource - # mssql: connectors.sources.mssql:MSSQLDataSource - # jira: connectors.sources.jira:JiraDataSource - ``` - - :::: - - - -## Run the connector service [es-postgresql-connector-client-tutorial-run-connector-service] - -Now that you’ve configured the connector code, you can run the connector service. - -In your terminal or IDE: - -1. `cd` into the root of your `connectors` clone/fork. -2. Run the following command: `make run`. - -The connector service should now be running. The UI will let you know that the connector has successfully connected to Elasticsearch. - -Here we’re working locally. In production setups, you’ll deploy the connector service to your own infrastructure. If you prefer to use Docker, refer to the [repo docs](https://github.com/elastic/connectors/tree/main/docs/DOCKER.md) for instructions. 
- - -## Sync your PostgreSQL data source [es-postgresql-connector-client-tutorial-sync-data-source] - - -### Enter your PostgreSQL data source details [es-postgresql-connector-client-tutorial-sync-data-source-details] - -Once you’ve configured the connector, you can use it to index your data source. - -You can now enter your PostgreSQL instance details in the Kibana UI. - -Enter the following information: - -* **Host**. Server host address for your PostgreSQL instance. -* **Port**. Port number for your PostgreSQL instance. -* **Username**. Username of the PostgreSQL account. -* **Password**. Password for that user. -* **Database**. Name of the PostgreSQL database. -* **Comma-separated list of tables**. `*` will fetch data from all tables in the configured database. - -Once you’ve entered all these details, select **Save configuration**. - - -### Launch a sync [es-postgresql-connector-client-tutorial-sync-data-source-launch-sync] - -If you navigate to the **Overview** tab in the Kibana UI, you can see the connector’s *ingestion status*. This should now have changed to **Configured**. - -It’s time to launch a sync by selecting the **Sync** button. - -If you navigate to the terminal window where you’re running the connector service, you should see output like the following: +If you navigate to the terminal window where you’re running the connector service, after a sync occurs you should see output like the following: ```shell [FMWK][13:22:26][INFO] Fetcher @@ -193,14 +175,20 @@ If you navigate to the terminal window where you’re running the connector serv (27 seconds) ``` -This confirms the connector has fetched records from your PostgreSQL table(s) and transformed them into documents in your Elasticsearch index. +This confirms the connector has fetched records from your PostgreSQL tables and transformed them into documents in your {{es}} index. + +If you verify your {{es}} documents and you’re happy with the results, set a recurring sync schedule. +This will ensure your searchable data in {{es}} is always up to date with changes to your PostgreSQL data source. -Verify your Elasticsearch documents in the **Documents** tab in the Kibana UI. +In **{{connectors-app}}**, click on the connector, and then click **Scheduling**. +For example, you can schedule your content to be synchronized at the top of every hour, as long as the connector is up and running. -If you’re happy with the results, set a recurring sync schedule in the **Scheduling** tab. This will ensure your *searchable* data in Elasticsearch is always up to date with changes to your PostgreSQL data source. +## Next steps +You just learned how to synchronize data from an external database to {{es}}. +For an overview of how to start searching and analyzing your data in Kibana, go to [Explore and analyze](docs-content://explore-analyze/index.md). 
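+
+If you’d like a quick look at the synced documents from Dev Tools before moving on, a simple search works; replace `my-postgresql-index` with the index attached to your connector:
+
+```console
+GET my-postgresql-index/_search
+{
+  "query": { "match_all": {} },
+  "size": 5
+}
+```
+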
-## Learn more [es-postgresql-connector-client-tutorial-learn-more] +Learn more: * [Overview of self-managed connectors and frameworks](/reference/search-connectors/self-managed-connectors.md) * [Elastic connector framework repository](https://github.com/elastic/connectors/tree/main) diff --git a/docs/reference/search-connectors/self-managed-connectors.md b/docs/reference/search-connectors/self-managed-connectors.md index 7892569d01714..a863a6a9790e3 100644 --- a/docs/reference/search-connectors/self-managed-connectors.md +++ b/docs/reference/search-connectors/self-managed-connectors.md @@ -18,7 +18,8 @@ Self-managed [Elastic connectors](/reference/search-connectors/index.md) are run ## Availability and Elastic prerequisites [es-build-connector-prerequisites] ::::{note} -Self-managed connectors currently don’t support Windows. Use this [compatibility matrix](https://www.elastic.co/support/matrix#matrix_os) to check which operating systems are supported by self-managed connectors. Find this information under **self-managed connectors** on that page. +Self-managed connectors currently don’t support Windows. Use this [compatibility matrix](https://www.elastic.co/support/matrix#matrix_os) to check which operating systems are supported by self-managed connectors. +% Find this information under **self-managed connectors** on that page. :::: @@ -28,7 +29,7 @@ Your Elastic deployment must include the following Elastic services: * **Elasticsearch** * **Kibana** -(A new Elastic Cloud deployment includes these services by default.) +A new {{ech}} deployment or {{es-serverless}} project includes these services by default. To run self-managed connectors, your self-deployed connector service version must match your Elasticsearch version. For example, if you’re running Elasticsearch 8.10.1, your connector service should be version 8.10.1.x. Elastic does not support deployments running mismatched versions (except during upgrades).
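+
+To check which version to match, you can ask the deployment directly; the `version.number` field in the response is the {{es}} version your connector service checkout or image should align with:
+
+```console
+GET /
+```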