Skip to content

Commit 5665878

Browse files
Merge pull request #269979 from aahill/elasticsearch-data-source
adding Elasticsearch as a data source
2 parents 2545758 + 324cd05 commit 5665878

File tree

3 files changed

+66
-1
lines changed

3 files changed

+66
-1
lines changed

articles/ai-services/openai/concepts/use-your-data.md

Lines changed: 62 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -68,6 +68,7 @@ When you choose the following data sources, your data is ingested into an Azure
6868
|Upload files (preview) | Upload files from your local machine to be stored in an Azure Blob Storage database, and ingested into Azure AI Search. |
6969
|URL/Web address (preview) | Web content from the URLs is stored in Azure Blob Storage. |
7070
|Azure Blob Storage (preview) | Upload files from Azure Blob Storage to be ingested into an Azure AI Search index. |
71+
|Elasticsearch (preview) | Use an existing [Elasticsearch](https://www.elastic.co/guide/en/elasticsearch/reference/current/elasticsearch-intro.html) vector database.|
7172

7273
# [Azure AI Search](#tab/ai-search)
7374

@@ -101,7 +102,7 @@ If you're using your own index, you can customize the [field mapping](#index-fie
101102
|---------------------|------------------------|---------------------| -------- |
102103
| *keyword* | Keyword search | No additional pricing. |Performs fast and flexible query parsing and matching over searchable fields, using terms or phrases in any supported language, with or without operators.|
103104
| *semantic* | Semantic search | Additional pricing for [semantic search](/azure/search/semantic-search-overview#availability-and-pricing) usage. |Improves the precision and relevance of search results by using a reranker (with AI models) to understand the semantic meaning of query terms and documents returned by the initial search ranker|
104-
| *vector* | Vector search | No additional pricing |Enables you to find documents that are similar to a given query input based on the vector embeddings of the content. |
105+
| *vector* | Vector search | [Additional pricing](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/) on your Azure OpenAI account from calling the embedding model. |Enables you to find documents that are similar to a given query input based on the vector embeddings of the content. |
105106
| *hybrid (vector + keyword)* | A hybrid of vector search and keyword search | [Additional pricing](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/) on your Azure OpenAI account from calling the embedding model. |Performs similarity search over vector fields using vector embeddings, while also supporting flexible query parsing and full text search over alphanumeric fields using term queries.|
106107
| *hybrid (vector + keyword) + semantic* | A hybrid of vector search, semantic search, and keyword search. | [Additional pricing](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/) on your Azure OpenAI account from calling the embedding model, and additional pricing for [semantic search](/azure/search/semantic-search-overview#availability-and-pricing) usage. |Uses vector embeddings, language understanding, and flexible query parsing to create rich search experiences and generative AI apps that can handle complex and diverse information retrieval scenarios. |
107108

@@ -231,6 +232,66 @@ You can paste URLs and the service will store the webpage content, using it when
231232

232233
Once you have added the URL/web address for data ingestion, the web pages from your URL are fetched and saved to Azure Blob Storage with a container name: `webpage-<index name>`. Each URL will be saved into a different container within the account. Then the files are indexed into an Azure AI Search index, which is used for retrieval when you’re chatting with the model.
233234

235+
# [Elasticsearch (preview)](#tab/elasticsearch)
236+
237+
You can connect to your [Elasticsearch vector database](https://www.elastic.co/guide/en/elasticsearch/reference/current/elasticsearch-intro.html) and chat with your data.
238+
239+
### Prerequisites
240+
241+
* An Elasticsearch database
242+
* An embedding model. You can:
243+
* Use an existing Azure OpenAI `text-embedding-ada-002` embedding model, or
244+
* Bring your own embedding model hosted on Elasticsearch.
245+
* Prepare your data using the python notebook available on [GitHub](https://github.com/microsoft/sample-app-aoai-chatGPT/blob/main/notebooks/AzureOpenAI_OnYourData_Elasticsearch.ipynb).
246+
247+
### Request access
248+
249+
Using the Elasticsearch data source is a preview feature which is subject to the Limited Access Service terms in the [service-specific terms](https://www.microsoft.com/licensing/terms/productoffering/MicrosoftAzure/EAEAS) for Azure AI services. You must fill out and submit a [request form](https://aka.ms/aoaioydelasticsearchrequest) to request access to the Elasticsearch data source. The form requests information about your company and the scenario for which you plan to use the Elasticsearch data source. After you submit the form, the Azure AI services team will review it and email you with a decision within 10 business days.
250+
251+
### Connect Elasticsearch to Azure OpenAI On Your Data
252+
253+
1. Set up [Elasticsearch](https://www.elastic.co/guide/en/elasticsearch/reference/current/setup.html) and get your connection information.
254+
255+
You need to enter your [Elasticsearch endpoint](https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-request-elasticsearch-endpoint.html) and encoded API key to connect with your Elasticsearch database. Then, click **verify connection**.
256+
257+
258+
:::image type="content" source="../media/use-your-data/connect-elasticsearch.png" alt-text="A screenshot showing the connection screen for Elasticsearch." lightbox="../media/use-your-data/connect-elasticsearch.png":::
259+
260+
1. Select the index you want to connect with.
261+
262+
1. (optional) use a custom field mapping.
263+
264+
You can [customize the field mapping](#index-field-mapping-2) when you add your data source to define the fields that will get mapped when answering questions, or use the default values.
265+
266+
1. Choose the [search type](#search-types). Azure OpenAI On Your Data provides the following search types you can use when you add your data source.
267+
268+
1. Continue through the screens that appear and select **Save and close**.
269+
270+
### Search types
271+
272+
Azure OpenAI On Your Data provides the following search types you can use when you add your data source.
273+
274+
* [Keyword search](/azure/search/search-lucene-query-architecture)
275+
* [Vector search](/azure/search/vector-search-overview)
276+
277+
To enable vector search, you need an existing embedding model deployed in your Azure OpenAI resource or hosted on Elasticsearch. Select your embedding deployment when connecting your data, then select one of the vector search types under **Data management**.
278+
279+
| Search option | Retrieval type | Additional pricing? |Benefits|
280+
|---------------------|------------------------|---------------------| -------- |
281+
| *keyword* | Keyword search | No additional pricing. |Performs fast and flexible query parsing and matching over searchable fields, using terms or phrases in any supported language, with or without operators.|
282+
| *vector* | Vector search | [Additional pricing](https://azure.microsoft.com/pricing/details/cognitive-services/openai-service/) on your Azure OpenAI account from calling the embedding model. |Enables you to find documents that are similar to a given query input based on the vector embeddings of the content. |
283+
284+
285+
### Index field mapping
286+
287+
You can customize the [field mapping](#index-field-mapping) when you add your data source to define the fields that will get mapped when answering questions. To customize field mapping, select **Use custom field mapping** on the **Data Source** page when adding your data source. You can provide multiple fields for *content data*, and should include all fields that have text pertaining to your use case.
288+
289+
Mapping these fields correctly helps ensure the model has better response and citation quality. You can additionally configure this [in the API](../references/elasticsearch.md#fields-mapping-options) using the `fields_mapping` parameter.
290+
291+
### Use Elasticsearch as a data source via API
292+
293+
Along with using Elasticsearch databases in Azure OpenAI Studio, you can also use your Elasticsearch database using the [API](../references/elasticsearch.md).
294+
234295
---
235296

236297
### How data is ingested into Azure AI search
87.4 KB
Loading

articles/ai-services/openai/whats-new.md

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -18,6 +18,10 @@ recommendations: false
1818

1919
## March 2024
2020

21+
### Elasticsearch database support for Azure OpenAI On Your Data
22+
23+
- You can now connect to an Elasticsearch vector database to be used with [Azure OpenAI On Your Data](./concepts/use-your-data.md?tabs=elasticsearch#supported-data-sources).
24+
2125
### 2024-02-01 general availability (GA) API released
2226

2327
This is the latest GA API release and is the replacement for the previous `2023-05-15` GA release. This release adds support for the latest Azure OpenAI GA features like Whisper, DALLE-3, fine-tuning, on your data, etc.

0 commit comments

Comments
 (0)