
Commit a69ff80

Merge pull request #1789 from MicrosoftDocs/main
12/3 11:00 AM IST Publish
2 parents 3b6e85f + e1ca57f commit a69ff80

2 files changed: +11 −5 lines changed


articles/ai-services/openai/quotas-limits.md

Lines changed: 4 additions & 4 deletions
@@ -24,7 +24,7 @@ The following sections provide you with a quick guide to the default quotas and
 
 | Limit Name | Limit Value |
 |--|--|
-| OpenAI resources per region per Azure subscription | 30 |
+| Azure OpenAI resources per region per Azure subscription | 30 |
 | Default DALL-E 2 quota limits | 2 concurrent requests |
 | Default DALL-E 3 quota limits| 2 capacity units (6 requests per minute)|
 | Default Whisper quota limits | 3 requests per minute |
@@ -44,8 +44,8 @@ The following sections provide you with a quick guide to the default quotas and
 | Max number of `/chat/completions` functions | 128 |
 | Max number of `/chat completions` tools | 128 |
 | Maximum number of Provisioned throughput units per deployment | 100,000 |
-| Max files per Assistant/thread | 10,000 when using the API or AI Foundry. 20 when using Azure OpenAI Studio.|
-| Max file size for Assistants & fine-tuning | 512 MB |
+| Max files per Assistant/thread | 10,000 when using the API or Azure AI Foundry portal. In Azure OpenAI Studio the limit was 20.|
+| Max file size for Assistants & fine-tuning | 512 MB<br/><br/>200 MB via Azure AI Foundry portal |
 | Max size for all uploaded files for Assistants |100 GB |
 | Assistants token limit | 2,000,000 token limit |
 | GPT-4o max images per request (# of images in the messages array/conversation history) | 50 |
@@ -181,7 +181,7 @@ To minimize issues related to rate limits, it's a good idea to use the following
 
 ### How to request increases to the default quotas and limits
 
-Quota increase requests can be submitted from the [Quotas](./how-to/quota.md) page of Azure AI Foundry. Due to high demand, quota increase requests are being accepted and will be filled in the order they're received. Priority is given to customers who generate traffic that consumes the existing quota allocation, and your request might be denied if this condition isn't met.
+Quota increase requests can be submitted from the [Quotas](./how-to/quota.md) page in the Azure AI Foundry portal. Due to high demand, quota increase requests are being accepted and will be filled in the order they're received. Priority is given to customers who generate traffic that consumes the existing quota allocation, and your request might be denied if this condition isn't met.
 
 For other rate limits, [submit a service request](../cognitive-services-support-options.md?context=/azure/ai-services/openai/context/context).
 

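The hunk context above references the page's guidance on minimizing rate-limit issues. A minimal client-side sketch of one such practice, retrying with exponential backoff on 429 responses, assuming the `openai` Python package (v1 or later); the endpoint, API version, and deployment name are placeholders, not values from the source:

```python
import time

from openai import AzureOpenAI, RateLimitError

# Placeholder connection details; substitute your own resource values.
client = AzureOpenAI(
    api_key="<your-api-key>",
    api_version="2024-02-01",
    azure_endpoint="https://<your-resource>.openai.azure.com",
)


def chat_with_backoff(messages, max_retries=5):
    """Call chat completions, backing off exponentially on 429 responses."""
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="<your-deployment-name>",  # Azure deployment name
                messages=messages,
            )
        except RateLimitError:
            # Wait 1s, 2s, 4s, ... before retrying.
            time.sleep(2 ** attempt)
    raise RuntimeError("Still rate limited after all retries")
```

The same pattern applies to other endpoints; honoring the `retry-after` header, when present, is a common refinement.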
articles/machine-learning/how-to-auto-train-forecast.md

Lines changed: 7 additions & 1 deletion
@@ -1314,7 +1314,13 @@ For a more detailed example, see the [demand forecasting with many models notebo
 
 #### Training considerations for a many models run
 
-The many models training and inference components conditionally partition your data according to the `partition_column_names` setting so each partition is in its own file. This process can be very slow or fail when data is very large. The recommendation is to partition your data manually before you run many models training or inference.
+- The many models training and inference components conditionally partition your data according to the `partition_column_names` setting so that each partition ends up in its own file. This process can be very slow or fail when the data is very large. The recommendation is to partition your data manually before you run many models training or inference (a minimal partitioning sketch follows this diff).
+
+- During many models training, models are automatically registered in the workspace, so manual registration isn't required. Models are named based on the partition they were trained on, and neither the names nor the tags are customizable; these properties are used to automatically detect models during inference.
+
+- Deploying each model individually doesn't scale, so `PipelineComponentBatchDeployment` is provided to ease the deployment process. See the [demand forecasting with many models notebook](https://github.com/Azure/azureml-examples/blob/main/sdk/python/jobs/pipelines/1k_demand_forecast_pipeline/aml-demand-forecast-mm-pipeline/aml-demand-forecast-mm-pipeline.ipynb) for this in action.
+
+- During inference, the appropriate models (latest version) are automatically selected based on the partition present in the inference data. By default, the latest models are selected from an experiment by providing `training_experiment_name`, but you can override this to select models from a particular training run by also providing `train_run_id`.
 
 > [!NOTE]
 > The default parallelism limit for a many models run within a subscription is set to 320. If your workload requires a higher limit, you can contact Microsoft support.
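
The first bullet above recommends partitioning data manually before a many models run. A minimal sketch of one way to do that with pandas, assuming a hypothetical `sales.csv` input and placeholder partition columns `store` and `brand` (none of these names come from the source):

```python
from pathlib import Path

import pandas as pd

# Hypothetical input file and partition columns for illustration.
partition_column_names = ["store", "brand"]
df = pd.read_csv("sales.csv", parse_dates=["date"])

out_dir = Path("partitioned_data")
out_dir.mkdir(exist_ok=True)

# Write one file per unique combination of the partition columns,
# mirroring what the many models components would otherwise do at run time.
for keys, part in df.groupby(partition_column_names):
    keys = (keys,) if not isinstance(keys, tuple) else keys
    name = "_".join(str(k) for k in keys)
    part.to_csv(out_dir / f"{name}.csv", index=False)
```

Pointing the training input at the resulting folder of per-partition files avoids the slow conditional partitioning step described above.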
