articles/ai-services/openai/concepts/use-your-data.md
Lines changed: 21 additions & 8 deletions
Original file line number
Diff line number
Diff line change
@@ -8,7 +8,7 @@ ms.service: azure-ai-openai
8
8
ms.topic: quickstart
9
9
author: aahill
10
10
ms.author: aahi
11
-
ms.date: 02/26/2024
11
+
ms.date: 04/08/2024
12
12
recommendations: false
13
13
ms.custom: references_regions
14
14
---
@@ -58,13 +58,19 @@ There's an [upload limit](../quotas-limits.md), and there are some caveats about
58
58
59
59
## Supported data sources
60
60
61
-
You need to connect to a data source to upload your data. When you want to use your data to chat with an Azure OpenAI model, your data is chunked in a search index so that relevant data can be found based on user queries. For some data sources such as uploading files from your local machine (preview) or data contained in a blob storage account (preview), Azure AI Search is used.
61
+
You need to connect to a data source to upload your data. When you want to use your data to chat with an Azure OpenAI model, your data is chunked in a search index so that relevant data can be found based on user queries.
62
62
63
-
When you choose the following data sources, your data is ingested into an Azure AI Search index.
63
+
The [Integrated Vector Database in Azure Cosmos DB for MongoDB](/azure/cosmos-db/mongodb/vcore/vector-search) natively supports integration with Azure OpenAI On Your Data.
64
+
65
+
For some data sources such as uploading files from your local machine (preview) or data contained in a blob storage account (preview), Azure AI Search is used. When you choose the following data sources, your data is ingested into an Azure AI Search index.
66
+
67
+
>[!TIP]
68
+
>If you use Azure Cosmos DB (except for its vCore-based API for MongoDB), you may be eligible for the [Azure AI Advantage offer](/azure/cosmos-db/ai-advantage), which provides the equivalent of up to $6,000 in Azure Cosmos DB throughput credits.
64
69
65
70
|Data source | Description |
66
71
|---------|---------|
67
72
|[Azure AI Search](/azure/search/search-what-is-azure-search)| Use an existing Azure AI Search index with Azure OpenAI On Your Data. |
73
+
|[Azure Cosmos DB](/azure/cosmos-db/introduction)| Azure Cosmos DB's API for PostgreSQL and vCore-based API for MongoDB have natively integrated vector indexing and don't require Azure AI Search. Its other APIs do require Azure AI Search for vector indexing. Azure Cosmos DB for NoSQL will offer a natively integrated vector database by mid-2024. |
68
74
|Upload files (preview) | Upload files from your local machine to be stored in an Azure Blob Storage database, and ingested into Azure AI Search. |
69
75
|URL/Web address (preview) | Web content from the URLs is stored in Azure Blob Storage. |
70
76
|Azure Blob Storage (preview) | Upload files from Azure Blob Storage to be ingested into an Azure AI Search index. |
@@ -133,15 +139,15 @@ If you want to implement additional value-based criteria for query execution, yo
# [Azure Cosmos DB for MongoDB vCore](#tab/mongo-db)
142
+
# [Vector Database in Azure Cosmos DB for MongoDB vCore](#tab/mongo-db)
137
143
138
144
### Prerequisites
139
145
* [Azure Cosmos DB for MongoDB vCore](/azure/cosmos-db/mongodb/vcore/introduction) account
140
146
* A deployed [embedding model](../concepts/understand-embeddings.md)
141
147
142
148
### Limitations
143
149
* Only Azure Cosmos DB for MongoDB vCore is supported.
144
-
* The search type is limited to [Azure Cosmos DB for MongoDB vCore vector search](/azure/cosmos-db/mongodb/vcore/vector-search) with an Azure OpenAI embedding model.
150
+
* The search type is limited to [Integrated Vector Database in Azure Cosmos DB for MongoDB vCore](/azure/cosmos-db/mongodb/vcore/vector-search) with an Azure OpenAI embedding model.
145
151
* This implementation works best on unstructured and spatial data.
146
152
147
153
### Data preparation
@@ -350,6 +356,10 @@ You can modify the following additional settings in the **Data parameters** sect
350
356
|**Retrieved documents**| This parameter is an integer that can be set to 3, 5, 10, or 20, and controls the number of document chunks provided to the large language model for formulating the final response. By default, this is set to 5. The search process can be noisy and sometimes, due to chunking, relevant information might be spread across multiple chunks in the search index. Selecting a top-K number, like 5, ensures that the model can extract relevant information, despite the inherent limitations of search and chunking. However, increasing the number too high can potentially distract the model. Additionally, the maximum number of documents that can be effectively used depends on the version of the model, as each has a different context size and capacity for handling documents. If you find that responses are missing important context, try increasing this parameter. This is the `topNDocuments` parameter in the API, and is 5 by default. |
351
357
| **Strictness** | Determines the system's aggressiveness in filtering search documents based on their similarity scores. The system queries Azure Search or other document stores, then decides which documents to provide to large language models like ChatGPT. Filtering out irrelevant documents can significantly enhance the performance of the end-to-end chatbot. Some documents are excluded from the top-K results if they have low similarity scores before forwarding them to the model. This is controlled by an integer value ranging from 1 to 5. Setting this value to 1 means that the system will minimally filter documents based on search similarity to the user query. Conversely, a setting of 5 indicates that the system will aggressively filter out documents, applying a very high similarity threshold. If you find that the chatbot omits relevant information, lower the filter's strictness (set the value closer to 1) to include more documents. Conversely, if irrelevant documents distract the responses, increase the threshold (set the value closer to 5). This is the `strictness` parameter in the API, and set to 3 by default. |
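
The two runtime parameters described in the table above can be sketched as a small validation helper. This is a minimal illustration, not part of the service's SDK: the function name `build_runtime_parameters` and the `ALLOWED_TOP_N` constant are hypothetical, and only the `topNDocuments` and `strictness` names, their defaults, and their documented ranges come from the table. The API itself may accept values beyond those listed here.

```python
# Hypothetical helper: only the parameter names `topNDocuments` and
# `strictness`, their defaults, and their ranges come from the table above.
ALLOWED_TOP_N = (3, 5, 10, 20)  # values the table lists for Retrieved documents


def build_runtime_parameters(top_n_documents=5, strictness=3):
    """Return a dict of validated retrieval settings.

    Defaults mirror the documented defaults (5 documents, strictness 3).
    """
    if top_n_documents not in ALLOWED_TOP_N:
        raise ValueError(
            f"top_n_documents must be one of {ALLOWED_TOP_N}, got {top_n_documents}"
        )
    if not isinstance(strictness, int) or not 1 <= strictness <= 5:
        raise ValueError(f"strictness must be an integer from 1 to 5, got {strictness}")
    return {"topNDocuments": top_n_documents, "strictness": strictness}


# If responses miss relevant context: retrieve more chunks and filter less.
more_context = build_runtime_parameters(top_n_documents=10, strictness=2)
```

Validating on the client side like this is a design choice for catching typos early. The service remains the authority on which values it accepts.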
352
358
359
+
### Uncited references
360
+
361
+
It's possible for the model to return `"TYPE":"UNCITED_REFERENCE"` instead of `"TYPE":"CONTENT"` in the API response for documents that are retrieved from the data source but not included in the citation. This can be useful for debugging, and you can control this behavior by modifying the **strictness** and **retrieved documents** runtime parameters described above.
362
+
353
363
### System message
354
364
355
365
You can define a system message to steer the model's reply when using Azure OpenAI On Your Data. This message allows you to customize your replies on top of the retrieval augmented generation (RAG) pattern that Azure OpenAI On Your Data uses. The system message is used in addition to an internal base prompt to provide the experience. To support this, we truncate the system message after a specific [number of tokens](#token-usage-estimation-for-azure-openai-on-your-data) to ensure the model can answer questions using your data. If you are defining extra behavior on top of the default experience, ensure that your system prompt is detailed and explains the exact expected customization.
To troubleshoot a failed job, always look out for errors or warnings specified either in the API response or Azure OpenAI studio. Here are some of the common errors and warnings:
563
+
To troubleshoot failed operations, check for errors or warnings reported in the API response or in Azure OpenAI Studio. Here are some of the common errors and warnings:
556
564
565
+
### Failed ingestion jobs
557
566
558
567
**Quota Limitations Issues**
559
568
@@ -583,6 +592,10 @@ Resolution:
583
592
584
593
This means the storage account isn't accessible with the given credentials. In this case, please review the storage account credentials passed to the API and ensure the storage account isn't hidden behind a private endpoint (if a private endpoint isn't configured for this resource).
585
594
595
+
### 503 errors when sending queries with Azure AI Search
596
+
597
+
Each user message can translate to multiple search queries, all of which get sent to the search resource in parallel. This can produce throttling behavior when the number of search replicas and partitions is low. The maximum number of queries per second that a single partition and single replica can support may not be sufficient. In this case, consider increasing your replicas and partitions, or adding sleep/retry logic in your application. See the [Azure AI Search documentation](../../../search/performance-benchmarks.md) for more information.
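
The sleep/retry suggestion above can be sketched as a retry loop with exponential backoff and jitter. This is a minimal sketch under stated assumptions: `ThrottledError` is a hypothetical stand-in for however your client code surfaces an HTTP 503, and `send_query` is any callable that sends one request. Neither is part of an Azure SDK.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for an HTTP 503 surfaced by your client code."""


def call_with_retry(send_query, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send_query(), retrying on ThrottledError with exponential backoff.

    The delay doubles on each attempt and adds random jitter so that parallel
    callers don't all retry at the same moment.
    """
    for attempt in range(max_attempts):
        try:
            return send_query()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller handle the failure
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
```

The `sleep` argument is injectable so that the loop can be exercised in tests without actually waiting. Retrying only eases transient throttling. If 503 errors persist, adding replicas and partitions is the more durable fix.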
598
+
586
599
## Regional availability and model support
587
600
588
601
You can use Azure OpenAI On Your Data with an Azure OpenAI resource in the following regions:
> Using GPT-35-Turbo models with the completion endpoint as described in this article remains in preview and is only possible with gpt-35-turbo verision (0301). We strongly recommend using the GA Chat Completion API/endpoint. The Chat Completion API is the recommended method of interacting with the GPT-35-Turbo models. The Chat Completion API is also the only way to access the GPT-4 models.
17
+
> Using GPT-3.5-Turbo models with the completion endpoint as described in this article remains in preview and is only possible with `gpt-35-turbo` version (0301) which is [slated for retirement as early as June 13th, 2024](../concepts/model-retirements.md#current-models). We strongly recommend using the [GA Chat Completion API/endpoint](./chatgpt.md). The Chat Completion API is the recommended method of interacting with the GPT-3.5-Turbo models. The Chat Completion API is also the only way to access the GPT-4 models.
18
18
19
-
The following code snippet shows the most basic way to use the GPT-35-Turbo models with ChatML. If this is your first time using these models programmatically we recommend starting with our [GPT-35-Turbo & GPT-4 Quickstart](../chatgpt-quickstart.md).
19
+
The following code snippet shows the most basic way to use the GPT-3.5-Turbo models with ChatML. If this is your first time using these models programmatically we recommend starting with our [GPT-35-Turbo & GPT-4 Quickstart](../chatgpt-quickstart.md).
20
+
21
+
> [!NOTE]
22
+
> In the Azure OpenAI documentation, we refer to GPT-3.5-Turbo and GPT-35-Turbo interchangeably. The official name of the model on OpenAI is `gpt-3.5-turbo`, but for Azure OpenAI, due to Azure-specific character constraints, the underlying model name is `gpt-35-turbo`.
> The following parameters aren't available with the gpt-35-turbo model: `logprobs`, `best_of`, and `echo`. If you set any of these parameters, you'll get an error.
42
45
43
-
The `<|im_end|>` token indicates the end of a message. We recommend including`<|im_end|>` token as a stop sequence to ensure that the model stops generating text when it reaches the end of the message.
46
+
The `<|im_end|>` token indicates the end of a message. When using ChatML, it is recommended to include the `<|im_end|>` token as a stop sequence to ensure that the model stops generating text when it reaches the end of the message.
44
47
45
48
Consider setting `max_tokens` to a slightly higher value than normal such as 300 or 500. This ensures that the model doesn't stop generating text before it reaches the end of the message.
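
The stop-sequence and `max_tokens` guidance above can be sketched as follows. This is a minimal illustration that only builds the request payload and doesn't call the service. The helper names `build_chatml_prompt` and `build_completion_request` are hypothetical, and the `<|im_start|>` token is assumed from the ChatML format rather than taken from the text above.

```python
IM_START = "<|im_start|>"  # assumed from the ChatML format
IM_END = "<|im_end|>"      # end-of-message token described above


def build_chatml_prompt(system_message, turns):
    """Format a system message and (role, content) turns as a ChatML prompt.

    The prompt ends with an open assistant message so that the model writes
    the next reply and then emits the end-of-message token.
    """
    parts = [f"{IM_START}system\n{system_message}\n{IM_END}"]
    for role, content in turns:
        parts.append(f"{IM_START}{role}\n{content}\n{IM_END}")
    parts.append(f"{IM_START}assistant\n")
    return "\n".join(parts)


def build_completion_request(prompt, max_tokens=300):
    """Return completion parameters with the end-of-message stop sequence."""
    return {"prompt": prompt, "max_tokens": max_tokens, "stop": [IM_END]}


prompt = build_chatml_prompt("You are a helpful assistant.", [("user", "Hello!")])
request = build_completion_request(prompt)
```

The resulting dictionary holds the `prompt`, `max_tokens`, and `stop` values you would pass to the completion endpoint through whichever client you use.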
46
49
@@ -53,8 +56,6 @@ Unlike previous GPT-3 and GPT-3.5 models, the `gpt-35-turbo` model as well as th
53
56
54
57
You can find the model retirement dates for these models on our [models](../concepts/models.md) page.
articles/ai-services/openai/how-to/chatgpt.md
Lines changed: 5 additions & 25 deletions
@@ -7,35 +7,15 @@ ms.author: mbullwin #delegenz
7
7
ms.service: azure-ai-openai
8
8
ms.custom: build-2023, build-2023-dataai
9
9
ms.topic: how-to
10
-
ms.date: 03/29/2024
10
+
ms.date: 04/05/2024
11
11
manager: nitinme
12
12
keywords: ChatGPT
13
-
zone_pivot_groups: openai-chat
14
13
---
15
14
16
-
# Learn how to work with the GPT-35-Turbo and GPT-4 models
15
+
# Learn how to work with the GPT-3.5-Turbo and GPT-4 models
17
16
18
-
The GPT-35-Turbo and GPT-4 models are language models that are optimized for conversational interfaces. The models behave differently than the older GPT-3 models. Previous models were text-in and text-out, meaning they accepted a prompt string and returned a completion to append to the prompt. However, the GPT-35-Turbo and GPT-4 models are conversation-in and message-out. The models expect input formatted in a specific chat-like transcript format, and return a completion that represents a model-written message in the chat. While this format was designed specifically for multi-turn conversations, you'll find it can also work well for non-chat scenarios too.
17
+
The GPT-3.5-Turbo and GPT-4 models are language models that are optimized for conversational interfaces. The models behave differently than the older GPT-3 models. Previous models were text-in and text-out, meaning they accepted a prompt string and returned a completion to append to the prompt. However, the GPT-3.5-Turbo and GPT-4 models are conversation-in and message-out. The models expect input formatted in a specific chat-like transcript format, and return a completion that represents a model-written message in the chat. While this format was designed specifically for multi-turn conversations, you'll find it can also work well for non-chat scenarios too.
19
18
20
-
In Azure OpenAI there are two different options for interacting with these type of models:
19
+
This article walks you through getting started with the GPT-3.5-Turbo and GPT-4 models. It's important to use the techniques described here to get the best results. If you try to interact with the models the same way you did with the older model series, the models will often be verbose and provide less useful responses.
21
20
22
-
- Chat Completion API.
23
-
- Completion API with Chat Markup Language (ChatML).
24
-
25
-
The Chat Completion API is a new dedicated API for interacting with the GPT-35-Turbo and GPT-4 models. This API is the preferred method for accessing these models. **It is also the only way to access the new GPT-4 models**.
26
-
27
-
ChatML uses the same [completion API](../reference.md#completions) that you use for other models like text-davinci-002, it requires a unique token based prompt format known as Chat Markup Language (ChatML). This provides lower level access than the dedicated Chat Completion API, but also requires additional input validation, only supports gpt-35-turbo models, and **the underlying format is more likely to change over time**.
28
-
29
-
This article walks you through getting started with the GPT-35-Turbo and GPT-4 models. It's important to use the techniques described here to get the best results. If you try to interact with the models the same way you did with the older model series, the models will often be verbose and provide less useful responses.
30
-
31
-
::: zone pivot="programming-language-chat-completions"
articles/ai-services/openai/how-to/use-web-app.md
Lines changed: 2 additions & 0 deletions
@@ -52,6 +52,8 @@ Sample source code for the web app is available on [GitHub](https://github.com/m
52
52
53
53
We recommend pulling changes from the `main` branch for the web app's source code frequently to ensure you have the latest bug fixes, API version, and improvements. Additionally, the web app must be synchronized every time the API version being used is [retired](../api-version-deprecation.md#retiring-soon).
54
54
55
+
Consider clicking either the **watch** or **star** button on the web app's [GitHub](https://github.com/microsoft/sample-app-aoai-chatGPT) repo to be notified about changes and updates to the source code.
articles/ai-services/openai/how-to/use-your-data-securely.md
Lines changed: 2 additions & 2 deletions
@@ -8,7 +8,7 @@ ms.service: azure-ai-openai
8
8
ms.topic: how-to
9
9
author: aahill
10
10
ms.author: aahi
11
-
ms.date: 02/13/2024
11
+
ms.date: 04/05/2024
12
12
recommendations: false
13
13
---
14
14
@@ -332,7 +332,7 @@ Make sure your sign-in credential has `Cognitive Services OpenAI Contributor` ro
332
332
### Ingestion API
333
333
334
334
335
-
See the [ingestion API reference article](/azure/ai-services/openai/reference#start-an-ingestion-job) for details on the request and response objects used by the ingestion API.
335
+
See the [ingestion API reference article](/rest/api/azureopenai/ingestion-jobs?context=/azure/ai-services/openai/context/context) for details on the request and response objects used by the ingestion API.