Support for containers is currently available with Document Intelligence version `2022-08-31 (GA)` for all models and `2023-07-31 (GA)` for Read and Layout only:

* [REST API `2022-08-31 (GA)`](https://westus.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-2022-08-31/operations/AnalyzeDocument)
* [REST API `2023-07-31 (GA)`](https://westus.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-2023-07-31/operations/AnalyzeDocument)
* [SDKs targeting `REST API 2022-08-31 (GA)`](../sdk-overview-v3-0.md)
* [SDKs targeting `REST API 2023-07-31 (GA)`](../sdk-overview-v3-1.md)

✔️ See [**Configure Document Intelligence v3.0 containers**](?view=doc-intel-3.0.0&preserve-view=true) for supported container documentation.
With Document Intelligence containers, you can build an application architecture optimized to take advantage of both robust cloud capabilities and edge locality. Containers provide a minimalist, isolated environment that can be easily deployed on-premises and in the cloud. In this article, we show you how to configure the Document Intelligence container run-time environment by using the `docker compose` command arguments. Document Intelligence features are supported by seven Document Intelligence feature containers: **Read**, **Layout**, **Business Card**, **ID Document**, **Receipt**, **Invoice**, and **Custom**. These containers have both required and optional settings. For a few examples, see the [Example docker-compose.yml file](#example-docker-composeyml-file) section.
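As a minimal sketch of the kind of `docker-compose.yml` this article builds toward: the image path and tag below are assumptions (look up the current image in the Microsoft Artifact Registry), and the endpoint and key values are placeholders you must supply from your own resource.

```yaml
version: "3.9"
services:
  azure-document-intelligence-layout:
    container_name: azure-document-intelligence-layout
    # Image path/tag is an assumption; verify the current Layout image on MCR.
    image: mcr.microsoft.com/azure-cognitive-services/form-recognizer/layout-3.0
    environment:
      # Required billing settings for Azure AI containers.
      - EULA=accept
      - billing=<your-document-intelligence-endpoint>
      - apiKey=<your-document-intelligence-key>
    ports:
      - "5000:5000"  # expose the container's REST endpoint on the host
```

Required and optional settings vary by feature container, so treat this as a starting point rather than a complete configuration.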
Support for containers is currently available with Document Intelligence version `2022-08-31 (GA)` for all models and `2023-07-31 (GA)` for Read and Layout only:

* [REST API `2022-08-31 (GA)`](https://westus.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-2022-08-31/operations/AnalyzeDocument)
* [REST API `2023-07-31 (GA)`](https://westus.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-2023-07-31/operations/AnalyzeDocument)
* [SDKs targeting `REST API 2022-08-31 (GA)`](../sdk-overview-v3-0.md)
* [SDKs targeting `REST API 2023-07-31 (GA)`](../sdk-overview-v3-1.md)

✔️ See [**Document Intelligence v3.0 containers in disconnected environments**](?view=doc-intel-3.0.0&preserve-view=true) for supported container documentation.
Support for containers is currently available with Document Intelligence version `2022-08-31 (GA)` for all models and `2023-07-31 (GA)` for Read and Layout only:

* [REST API `2022-08-31 (GA)`](https://westus.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-2022-08-31/operations/AnalyzeDocument)
* [REST API `2023-07-31 (GA)`](https://westus.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-2023-07-31/operations/AnalyzeDocument)
* [SDKs targeting `REST API 2022-08-31 (GA)`](../sdk-overview-v3-0.md)
* [SDKs targeting `REST API 2023-07-31 (GA)`](../sdk-overview-v3-1.md)

✔️ See [**Document Intelligence container image tags**](?view=doc-intel-3.0.0&preserve-view=true) for supported container documentation.
Document Intelligence container images can be found in the [**Microsoft Artifact Registry** (also known as the Microsoft Container Registry (MCR))](https://mcr.microsoft.com/catalog?search=document%20intelligence), the primary registry for all Microsoft-published container images.
The following containers support Document Intelligence v3.0 models and features:
`articles/ai-services/document-intelligence/faq.yml`
FOTT has a version that uses local storage, which needs to be installed on a Windows machine. You can install it from [this location](https://github.com/microsoft/OCR-Form-Tools/releases/download/v2.1-ga/oflt-2.1.3-win32.exe). On the project page, specify the Label folder URI as `/shared`, or `/shared/sub-dir` if your labeling files are in a subdirectory. All other Document Intelligence Sample Labeling Tool behavior is the same as the hosted service.
- question: |
    What is the best practice for scaling up?
  answer: |
    - For asynchronous calls, you can run multiple containers with shared storage. The container processing the POST analyze call stores the output in the storage. Then, any other container can fetch the results from the storage and serve the GET calls. Thus, the request ID is not tied to a container.

    - For synchronous calls, multiple containers can be run, and only one container serves a given request. Because it is a blocking call, any container from the pool can serve the request and send the response. Here, only one container is tied to a request at a time and no polling is required.
- question: |
    How can I set up containers with shared storage?
  answer: |
    The containers use the `Mounts:Shared` property at startup to specify the shared storage for processing files. To see how this property is used, refer to the [documentation](https://learn.microsoft.com/azure/ai-services/document-intelligence/containers/install-run?view=doc-intel-3.0.0&tabs=custom#create-a-folder-for-storing-internal-processing-shared-between-the-containers).
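The asynchronous scaling pattern described above can be sketched in a few lines. This is an illustrative simulation, not real container code: a dict stands in for the shared storage mount, and the function names are hypothetical.

```python
import uuid

# Illustrative sketch of the async pattern: a dict stands in for the shared
# storage both containers mount, so the request ID is not tied to the
# container that handled the POST.
shared_storage = {}

def post_analyze(document):
    """Container A: accept the POST analyze call and store the output."""
    request_id = str(uuid.uuid4())
    shared_storage[request_id] = {"status": "succeeded", "analyzeResult": document.upper()}
    return request_id

def get_result(request_id):
    """Container B: serve the GET call by reading from the shared storage."""
    return shared_storage[request_id]

request_id = post_analyze("invoice text")
result = get_result(request_id)  # any container sharing the storage can serve this
```

Because results live in the shared storage rather than in per-container memory, a load balancer can route the GET to any healthy container in the pool.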
`articles/ai-services/openai/concepts/models.md`
## Model summary table and region availability
### GPT-4 and GPT-4 Turbo Preview models
GPT-4, GPT-4-32k, and GPT-4 Turbo with Vision are now available to all Azure OpenAI Service customers. Availability varies by region. If you don't see GPT-4 in your region, please check back later.
These models can only be used with the Chat Completion API.
GPT-4 version 0314 is the first version of the model released. Version 0613 is the second version of the model and adds function calling support.
See [model versions](../concepts/model-versions.md) to learn about how Azure OpenAI Service handles model version upgrades, and [working with models](../how-to/working-with-models.md) to learn how to view and configure the model version settings of your GPT-4 deployments.
> [!NOTE]
> Version `0314` of `gpt-4` and `gpt-4-32k` will be retired no earlier than July 5, 2024. Version `0613` of `gpt-4` and `gpt-4-32k` will be retired no earlier than September 30, 2024. See [model updates](../how-to/working-with-models.md#model-updates) for model upgrade behavior.
GPT-4 version 0125-preview is an updated version of the GPT-4 Turbo preview previously released as version 1106-preview. GPT-4 version 0125-preview completes tasks such as code generation more completely compared to gpt-4-1106-preview. Because of this, depending on the task, customers may find that GPT-4-0125-preview generates more output compared to gpt-4-1106-preview. We recommend customers compare the outputs of the new model. GPT-4-0125-preview also addresses bugs in gpt-4-1106-preview with UTF-8 handling for non-English languages.
> [!IMPORTANT]
>
> - `gpt-4` version 0125-preview replaces version 1106-preview. Deployments of `gpt-4` version 1106-preview set to "Auto-update to default" and "Upgrade when expired" will start to be upgraded on February 20, 2024, and will complete upgrades within two weeks. Deployments of `gpt-4` version 1106-preview set to "No autoupgrade" will stop working starting February 20, 2024. If you have a deployment of `gpt-4` version 1106-preview, you can test version `0125-preview` in the available regions below.
| Model ID | Max Request (tokens) | Training Data (up to) |
| --- | :--- | :---: |
**<sup>1</sup>** GPT-4 Turbo Preview = `gpt-4` (0125-preview). To deploy this model, under **Deployments** select model **gpt-4**. For **Model version** select **0125-preview**.
**<sup>2</sup>** GPT-4 Turbo with Vision Preview = `gpt-4` (vision-preview). To deploy this model, under **Deployments** select model **gpt-4**. For **Model version** select **vision-preview**.
> [!CAUTION]
> We don't recommend using preview models in production. We will upgrade all deployments of preview models to future preview versions and a stable version. Models designated preview do not follow the standard Azure OpenAI model lifecycle.
> [!NOTE]
> Regions where GPT-4 (0314) & (0613) are listed as available have access to both the 8K and 32K versions of the model.
| gpt-4 (0314) || East US <br> France Central <br> South Central US <br> UK South |
| gpt-4 (0613) | Australia East <br> Canada East <br> France Central <br> Sweden Central <br> Switzerland North | East US <br> East US 2 <br> Japan East <br> UK South |
| gpt-4 (1106-preview) | Australia East <br> Canada East <br> East US 2 <br> France Central <br> Norway East <br> South India <br> Sweden Central <br> UK South <br> West US ||
| gpt-4 (0125-preview) | East US <br> North Central US <br> South Central US | |
| gpt-4 (vision-preview) | Sweden Central <br> West US <br> Japan East| Switzerland North <br> Australia East |
`articles/ai-services/openai/concepts/use-your-data.md`
---
## Ingestion parameters
You can use the following parameter to change how your data is ingested in Azure OpenAI Studio, Azure AI Studio, and the ingestion API. Changing the parameter requires re-ingesting your data into Azure Search.
|Parameter name | Description |
|---------|---------|
|**Chunk size**| Azure OpenAI on your data processes your documents by splitting them into chunks before indexing them in Azure Search. The chunk size is the maximum number of tokens for any chunk in the search index. The default chunk size is 1024 tokens. However, given the uniqueness of your data, you may find a different chunk size (such as 256, 512, or 1536 tokens) more effective. Adjusting the chunk size can enhance the performance of the chat bot. While finding the optimal chunk size requires some trial and error, start by considering the nature of your dataset. A smaller chunk size is generally better for datasets with direct facts and less context, while a larger chunk size might be beneficial for more contextual information, though it can affect retrieval performance. This is the `chunkSize` parameter in the API.|
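To make the chunk-size arithmetic concrete, here is a minimal sketch of fixed-size chunking over a token sequence. Treating tokens as list elements is an assumption for illustration; the service's actual tokenizer and splitting logic are not described here.

```python
def chunk_tokens(tokens, chunk_size=1024):
    """Split a token sequence into chunks of at most chunk_size tokens,
    mirroring the kind of splitting described for ingestion."""
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]

# A 2500-token document with the default chunk size of 1024 yields three
# chunks: two full chunks and one 452-token remainder. Halving the chunk
# size to 512 yields five chunks instead, each carrying less context.
document_tokens = ["token"] * 2500
default_chunks = chunk_tokens(document_tokens)        # chunk size 1024
smaller_chunks = chunk_tokens(document_tokens, 512)   # chunk size 512
```

The trade-off in the table follows directly: smaller chunks isolate direct facts but multiply the number of chunks a relevant passage may be spread across.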
## Runtime parameters
You can modify the following additional settings in the **Data parameters** section in Azure OpenAI Studio and [the API](../reference.md#completions-extensions). You do not need to re-ingest your data when you update these parameters.
|Parameter name | Description |
|---------|---------|
|**Limit responses to your data**| This flag configures the chatbot's approach to handling queries unrelated to the data source or when search documents are insufficient for a complete answer. When this setting is disabled, the model supplements its responses with its own knowledge in addition to your documents. When this setting is enabled, the model attempts to only rely on your documents for responses. This is the `inScope` parameter in the API. |
|**Top K Documents** | This parameter is an integer that can be set to 3, 5, 10, or 20, and controls the number of document chunks provided to the large language model for formulating the final response. By default, this is set to 5. The search process can be noisy and sometimes, due to chunking, relevant information may be spread across multiple chunks in the search index. Selecting a top-K number, like 5, ensures that the model can extract relevant information, despite the inherent limitations of search and chunking. However, increasing the number too high can potentially distract the model. Additionally, the maximum number of documents that can be effectively used depends on the version of the model, as each has a different context size and capacity for handling documents. If you find that responses are missing important context, try increasing this parameter. Conversely, if you think the model is providing irrelevant information alongside useful data, consider decreasing it. When experimenting with the [chunk size](#ingestion-parameters), we recommend adjusting the top-K parameter to achieve the best performance. Usually, it is beneficial to change the top-K value in the opposite direction of your chunk size adjustment. For example, if you decrease the chunk size from the default of 1024, you might want to increase the top-K value to 10 or 20. This ensures a similar amount of information is provided to the model, as reducing the chunk size decreases the amount of information in the 5 documents given to the model. This is the `topNDocuments` parameter in the API. |
| **Strictness** | Determines the system's aggressiveness in filtering search documents based on their similarity scores. The system queries Azure Search or other document stores, then decides which documents to provide to large language models like ChatGPT. Filtering out irrelevant documents can significantly enhance the performance of the end-to-end chatbot. Some documents are excluded from the top-K results if they have low similarity scores before forwarding them to the model. This is controlled by an integer value ranging from 1 to 5. Setting this value to 1 means that the system will minimally filter documents based on search similarity to the user query. Conversely, a setting of 5 indicates that the system will aggressively filter out documents, applying a very high similarity threshold. If you find that the chatbot omits relevant information, lower the filter's strictness (set the value closer to 1) to include more documents. Conversely, if irrelevant documents distract the responses, increase the threshold (set the value closer to 5). This is the `strictness` parameter in the API. |
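As a sketch of where these runtime parameters sit in a request, the body below uses the API parameter names given in the table (`inScope`, `topNDocuments`, `strictness`). The surrounding envelope, field names, and placeholder values are assumptions; consult the API reference for the exact request shape.

```python
# Sketch of a chat request body using the runtime parameter names from the
# table above. The dataSources envelope and placeholder values are assumptions;
# only inScope, topNDocuments, and strictness come from this article.
request_body = {
    "messages": [
        {"role": "user", "content": "What does the handbook say about vacation?"}
    ],
    "dataSources": [
        {
            "type": "AzureCognitiveSearch",
            "parameters": {
                "endpoint": "https://<search-resource>.search.windows.net",
                "indexName": "<index-name>",
                "inScope": True,      # Limit responses to your data
                "topNDocuments": 10,  # raised from the default 5, e.g. to pair with a smaller chunk size
                "strictness": 3,      # 1 = minimal filtering, 5 = aggressive filtering
            },
        }
    ],
}
```

Following the guidance above, `topNDocuments` and chunk size are usually adjusted in opposite directions so a similar total amount of context reaches the model.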