Commit bc44b9e

Merge pull request #260078 from MicrosoftDocs/main
12/1/2023 PM Publish
2 parents 510e995 + 5808c48 commit bc44b9e

File tree

118 files changed

+2245
-542
lines changed

.openpublishing.publish.config.json

Lines changed: 6 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -176,6 +176,12 @@
176176
"branch": "master",
177177
"branch_mapping": {}
178178
},
179+
{
180+
"path_to_root": "azure_powershell_scripts",
181+
"url": "https://github.com/Azure-Samples/azure-docs-powershell-samples",
182+
"branch": "main",
183+
"branch_mapping": {}
184+
},
179185
{
180186
"path_to_root": "powershell_scripts",
181187
"url": "https://github.com/Azure/azure-docs-powershell-samples",

articles/ai-services/document-intelligence/toc.yml

Lines changed: 15 additions & 16 deletions
@@ -18,7 +18,7 @@ items:
1818
- name: Changelog and release history
1919
displayName: latest, update, beta, package, preview, version
2020
href: changelog-release-history.md
21-
- name: "SDK targets: REST API 2023–10–31-preview"
21+
- name: "🆕 SDK targets: REST API 2023–10–31-preview"
2222
displayName: get started, installation, downloads, documentAnalysisClient, document analysis client, Azure AD, Azure Active Directory, identity, changelog, package, version,AzureKeyCredential, Azure key credential, key, endpoint
2323
href: sdk-overview-v4-0.md
2424
- name: "SDK targets: REST API 2023–7–31 latest (GA)"
@@ -107,20 +107,6 @@ items:
107107
- name: Back up and recover models
108108
displayName: disaster, recovery, region, copy, modelId, model ID
109109
href: disaster-recovery.md
110-
- name: Use containers
111-
items:
112-
- name: Install and run containers
113-
displayName: docker, docker-compose, container, compose, packages, az, acr, cli, import, pull, Azure AI Vision, kubernetes, cpu, EULA, billing
114-
href: containers/install-run.md
115-
- name: Configure containers
116-
displayName: docker, docker-compose, compose, containers, run, billing, keys, logging, EULA, telemetry, proxy
117-
href: containers/configuration.md
118-
- name: Container image tags
119-
displayName: registry, mcr
120-
href: containers/image-tags.md
121-
- name: Disconnected containers
122-
displayName: docker, commitment, plan, environment, usage
123-
href: containers/disconnected.md
124110
- name: Configure secure communications
125111
items:
126112
- name: Set up Virtual Networks
@@ -138,6 +124,20 @@ items:
138124
- name: Use Microsoft Entra authentication
139125
displayName: headers, subscription, access token, azure active directory, subdomain, role, service principal
140126
href: ../../ai-services/authentication.md?context=/azure/ai-services/document-intelligence/context/context
127+
- name: Containers
128+
items:
129+
- name: Install and run containers
130+
displayName: docker, docker-compose, container, compose, packages, az, acr, cli, import, pull, Azure AI Vision, kubernetes, cpu, EULA, billing
131+
href: containers/install-run.md
132+
- name: Configure containers
133+
displayName: docker, docker-compose, compose, containers, run, billing, keys, logging, EULA, telemetry, proxy
134+
href: containers/configuration.md
135+
- name: Container image tags
136+
displayName: registry, mcr
137+
href: containers/image-tags.md
138+
- name: Disconnected containers
139+
displayName: docker, commitment, plan, environment, usage
140+
href: containers/disconnected.md
141141
- name: Samples
142142
items:
143143
- name: "C#/.NET"
@@ -244,7 +244,6 @@ items:
244244
displayName: connectors, automate, automation, workflows
245245
href: tutorial-logic-apps.md
246246
- name: "Create Document Intelligence workflows with AI Builder"
247-
expanded: true
248247
items:
249248
- name: Custom
250249
href: /ai-builder/create-form-processing-model?toc=/azure/ai-services/document-intelligence/toc.json&bc=/azure/ai-services/document-intelligence/breadcrumb/toc.json

articles/ai-services/openai/concepts/use-your-data.md

Lines changed: 41 additions & 18 deletions
@@ -237,26 +237,15 @@ To add a new data source to Azure OpenAI on your data, you need the following Az
237237
| [Contributor](/azure/role-based-access-control/built-in-roles#contributor) | Your subscription, to access Azure Resource Manager. | You want to deploy a web app. |
238238
| [Cognitive Services Contributor Role](/azure/role-based-access-control/built-in-roles#cognitive-services-contributor) | The Azure AI Search resource, to access Azure OpenAI resource. | You want to deploy a [web app](#using-the-web-app). |
239239

240-
## Virtual network support & private endpoint support (Azure AI Search only)
240+
## Virtual network support & private endpoint support
241241

242-
> [!TIP]
243-
> For instructions on setting up your resources to work on a virtual private network or private endpoint, see [Use Azure OpenAI on your data securely](../how-to/use-your-data-securely.md)
244-
245-
### Azure OpenAI resources
246-
247-
You can protect Azure OpenAI resources in [virtual networks and private endpoints](/azure/ai-services/cognitive-services-virtual-networks) the same way as any Azure AI service.
248-
249-
### Azure AI Search resources
250-
251-
If you have an Azure AI Search resource protected by a private network, and want to allow Azure OpenAI on your data to access your search service, complete [an application form](https://aka.ms/applyacsvpnaoaioyd). The application will be reviewed in ten business days and you will be contacted via email about the results. If you are eligible, we will send a private endpoint request to your search service, and you will need to approve the request.
242+
* For instructions on setting up your resources to work on a virtual private network or private endpoint, see [Use Azure OpenAI on your data securely](../how-to/use-your-data-securely.md)
243+
* Azure OpenAI, Azure AI Search, and Azure Storage Accounts can be protected under private endpoints and virtual private networks.
252244

253-
:::image type="content" source="../media/use-your-data/approve-private-endpoint.png" alt-text="A screenshot showing private endpoint approval screen." lightbox="../media/use-your-data/approve-private-endpoint.png":::
245+
## Document-level access control
254246

255-
Learn more about the [manual approval workflow](/azure/private-link/private-endpoint-overview#access-to-a-private-link-resource-using-approval-workflow).
256-
257-
After you approve the request in your search service, you can start using the [chat completions extensions API](/azure/ai-services/openai/reference#completions-extensions). Public network access can be disabled for that search service.
258-
259-
## Document-level access control (Azure AI Search only)
247+
> [!NOTE]
248+
> Document-level access control is supported for Azure AI Search only.
260249
261250
Azure OpenAI on your data lets you restrict the documents that can be used in responses for different users with Azure AI Search [security filters](/azure/search/search-security-trimming-for-azure-search-with-aad). When you enable document level access, the search results returned from Azure AI Search and used to generate a response will be trimmed based on user Microsoft Entra group membership. You can only enable document-level access on existing Azure AI Search indexes. To enable document-level access:
262251

@@ -308,7 +297,10 @@ When using the API, pass the `filter` parameter in each API request. For example
308297
* `my_group_ids` is the field name that you selected for **Permitted groups** during [fields mapping](#index-field-mapping).
309298
* `group_id1, group_id2` are groups attributed to the logged in user. The client application can retrieve and cache users' groups.
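
As a rough illustration of how a client application might assemble that `filter` value from a user's cached group IDs, the sketch below builds an OData expression in the `search.in` form used by Azure AI Search security filters. The helper name `build_group_filter` is hypothetical, and the exact expression shape is an assumption inferred from the two bullets above, because the request example itself falls outside this hunk.

```python
from typing import Iterable


def build_group_filter(field_name: str, group_ids: Iterable[str]) -> str:
    """Build an OData filter that matches any of the user's groups.

    Hypothetical helper: field_name is the index field mapped to
    "Permitted groups"; group_ids are the signed-in user's group IDs.
    """
    ids = [g.strip() for g in group_ids if g and g.strip()]
    if not ids:
        raise ValueError("at least one group ID is required")
    for g in ids:
        # Quotes, commas and spaces would break the quoted, delimited list.
        if any(c in g for c in "', "):
            raise ValueError(f"unsupported character in group ID: {g!r}")
    joined = ", ".join(ids)
    return f"{field_name}/any(g:search.in(g, '{joined}'))"


group_filter = build_group_filter("my_group_ids", ["group_id1", "group_id2"])
print(group_filter)
```

Validating the IDs before joining them keeps a malformed group name from silently widening or breaking the filter; a real client would likely also cache the result per user, as the bullet above suggests.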
310299

311-
## Schedule automatic index refreshes (Azure AI Search only)
300+
## Schedule automatic index refreshes
301+
302+
> [!NOTE]
303+
> Automatic index refreshing is supported for Azure Blob storage only.
312304
313305
To keep your Azure AI Search index up-to-date with your latest data, you can schedule a refresh for it that runs automatically rather than manually updating it every time your data is updated. Automatic index refresh is only available when you choose **blob storage** as the data source. To enable an automatic index refresh:
314306

@@ -542,6 +534,37 @@ When you chat with a model, providing a history of the chat will help the model
542534
}
543535
```
544536

537+
## Token usage estimation for Azure OpenAI on your data
538+
539+
540+
| Model | Total tokens available | Max tokens for system message | Max tokens for model response |
541+
|-------------------------|------------------------|------------------------------------|------------------------------------|
542+
| ChatGPT Turbo (0301) 8k | 8000 | 400 | 1500 |
543+
| ChatGPT Turbo 16k | 16000 | 1000 | 3200 |
544+
| GPT-4 (8k) | 8000 | 400 | 1500 |
545+
| GPT-4 32k | 32000 | 2000 | 6400 |
546+
547+
The table above shows the total number of tokens available for each model type, along with the maximum number of tokens that can be used for the [system message](#system-message) and the model response. The following also consume tokens:
548+
549+
550+
* The meta prompt (MP): If you limit responses from the model to the grounding data content (`inScope=True` in the API), the maximum is 4036 tokens. Otherwise (for example, if `inScope=False`), the maximum is 3444 tokens. The actual number varies with the token length of the user question and conversation history. This estimate includes the base prompt as well as the query rewriting prompts for retrieval.
551+
* User question and history: Variable but capped at 2000 tokens.
552+
* Retrieved documents (chunks): The number of tokens used by the retrieved document chunks depends on multiple factors. The upper bound is the number of retrieved document chunks multiplied by the chunk size. It will, however, be truncated based on the tokens available for the specific model being used after counting the rest of the fields.
553+
554+
20% of the available tokens are reserved for the model response. The remaining 80% of available tokens include the meta prompt, the user question and conversation history, and the system message. The remaining token budget is used by the retrieved document chunks.
555+
556+
```python
557+
import tiktoken
558+
559+
class TokenEstimator(object):
560+
561+
    GPT2_TOKENIZER = tiktoken.get_encoding("gpt2")
562+
563+
    def estimate_tokens(self, text: str) -> int:
564+
        return len(self.GPT2_TOKENIZER.encode(text))
565+
566+
token_output = TokenEstimator().estimate_tokens(input_text)  # instantiate first; input_text is the caller's string
567+
```
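
To make the budget arithmetic described above concrete, here is a minimal sketch that estimates how many tokens are left for retrieved document chunks. The per-model limits are copied from the table and the 2000-token cap from the list; the names `ModelLimits` and `chunk_budget`, and the simple clamp-and-subtract accounting, are assumptions for illustration and not the service's actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelLimits:
    total_tokens: int
    max_system_message: int
    max_response: int


# Values copied from the table above.
LIMITS = {
    "ChatGPT Turbo (0301) 8k": ModelLimits(8000, 400, 1500),
    "ChatGPT Turbo 16k": ModelLimits(16000, 1000, 3200),
    "GPT-4 (8k)": ModelLimits(8000, 400, 1500),
    "GPT-4 32k": ModelLimits(32000, 2000, 6400),
}

# Cap on the user question plus conversation history, as stated above.
MAX_QUESTION_AND_HISTORY = 2000


def chunk_budget(model: str, meta_prompt_tokens: int,
                 system_message_tokens: int,
                 question_history_tokens: int) -> int:
    """Estimate the tokens left for retrieved chunks (assumed accounting)."""
    limits = LIMITS[model]
    # Clamp the variable parts to their documented maximums.
    system = min(system_message_tokens, limits.max_system_message)
    history = min(question_history_tokens, MAX_QUESTION_AND_HISTORY)
    used = limits.max_response + meta_prompt_tokens + system + history
    # Never report a negative budget.
    return max(limits.total_tokens - used, 0)


remaining = chunk_budget("GPT-4 32k", meta_prompt_tokens=4036,
                         system_message_tokens=500,
                         question_history_tokens=1200)
print(remaining)
```

With the 8k models the same inputs leave very little room for chunks, which is why the retrieved content is truncated to whatever budget remains.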
545568

546569
## Next steps
547570
* [Get started using your data with Azure OpenAI](../use-your-data-quickstart.md)

articles/ai-services/openai/faq.yml

Lines changed: 1 addition & 1 deletion
@@ -212,7 +212,7 @@ sections:
212212
- question: |
213213
How can I update my index?
214214
answer:
215-
You can [schedule an automatic index refresh](concepts/use-your-data.md#schedule-automatic-index-refreshes-azure-ai-search-only), or upload additional data to your Azure Blob Container and use it as your data source when you create a new index. The new index will include all of the data in your container.
215+
You can [schedule an automatic index refresh](concepts/use-your-data.md#schedule-automatic-index-refreshes), or upload additional data to your Azure Blob Container and use it as your data source when you create a new index. The new index will include all of the data in your container.
216216
- question: |
217217
What file types does Azure OpenAI on your data support?
218218
answer:

articles/ai-services/openai/how-to/use-your-data-securely.md

Lines changed: 8 additions & 2 deletions
@@ -27,7 +27,7 @@ When you ingest data into Azure OpenAI on your data, the following process is us
2727

2828
1. The ingestion process is started when a client sends data to be processed.
2929
1. Ingestion assets (indexers, indexes, data sources, a [custom skill](/azure/search/cognitive-search-custom-skill-interface) and container in the search resource) are created in the Azure AI Search resource and Azure storage account.
30-
1. If the ingestion is triggered by a [scheduled refresh](../concepts/use-your-data.md#schedule-automatic-index-refreshes-azure-ai-search-only), the ingestion process starts at `[3]`.
30+
1. If the ingestion is triggered by a [scheduled refresh](../concepts/use-your-data.md#schedule-automatic-index-refreshes), the ingestion process starts at `[3]`.
3131
1. Azure OpenAI's `preprocessing-jobs` API implements the [Azure AI Search custom skill web API protocol](/azure/search/cognitive-search-custom-skill-web-api), and processes the documents in a queue.
3232
1. Azure OpenAI:
3333
1. Internally uses the indexer created earlier to crack the documents.
@@ -110,6 +110,8 @@ To set the managed identities via the management API, see [the management API re
110110

111111
## Security support for Azure AI Search
112112

113+
You can protect Azure OpenAI resources in [virtual networks and private endpoints](/azure/ai-services/cognitive-services-virtual-networks) the same way as any Azure AI service.
114+
113115
### Inbound security: authentication
114116
Because Azure OpenAI uses managed identity to access Azure AI Search, you need to enable Azure AD based authentication in your Azure AI Search resource. To do this, select **Both** in the **Keys** tab in the Azure portal.
115117

@@ -133,7 +135,11 @@ To use Azure OpenAI Studio, you can't disable the API key based authentication f
133135
134136
### Inbound security: networking
135137

136-
Use **Selected networks** in the Azure portal. Azure AI Search doesn't support bypassing trusted services, so it is the most complex part in the setup. Create a private endpoint for theAzure OpenAI on your data (as a multitenant service managed by Microsoft), and link it to your Azure AI Search resource. This requires you to submit an [application form](https://aka.ms/applyacsvpnaoaioyd).
138+
Use **Selected networks** in the Azure portal. Azure AI Search doesn't support bypassing trusted services, so it is the most complex part in the setup. Create a private endpoint for the Azure OpenAI on your data resource (as a multitenant service managed by Microsoft), and link it to your Azure AI Search resource. This requires you to submit an [application form](https://aka.ms/applyacsvpnaoaioyd). The application will be reviewed in ten business days and you will be contacted via email about the results. If you are eligible, we will send a private endpoint request to your search service, and you will need to approve the request.
139+
140+
:::image type="content" source="../media/use-your-data/approve-private-endpoint.png" alt-text="A screenshot showing private endpoint approval screen." lightbox="../media/use-your-data/approve-private-endpoint.png":::
141+
142+
Learn more about the [manual approval workflow](/azure/private-link/private-endpoint-overview#access-to-a-private-link-resource-using-approval-workflow).
137143

138144
> [!NOTE]
139145
> To use Azure OpenAI Studio, you cannot disable public network access, and you need to add your local IP to the IP rules, because Azure AI Studio calls the search API from your browser to list available indexes.
