
Commit 45e8b68

Merge branch 'main' into release-openai-audio-models
2 parents ab6fb05 + 05f6b58 commit 45e8b68

22 files changed: +183 −79 lines changed

articles/ai-foundry/concepts/model-lifecycle-retirement.md

Lines changed: 6 additions & 0 deletions

@@ -80,6 +80,12 @@ The following tables list the timelines for models that are on track for retirem
  | [Cohere-rerank-v3-english](https://ai.azure.com/explore/models/Cohere-rerank-v3-english/version/1/registry/azureml-cohere) | February 28, 2025 | March 31, 2025 | June 30, 2025 | [Cohere-rerank-v3.5-english](https://ai.azure.com/explore/models/Cohere-rerank-v3.5/version/1/registry/azureml-cohere) |
  | [Cohere-rerank-v3-multilingual](https://ai.azure.com/explore/models/Cohere-rerank-v3-multilingual/version/1/registry/azureml-cohere) | February 28, 2025 | March 31, 2025 | June 30, 2025 | [Cohere-rerank-v3.5-multilingual](https://ai.azure.com/explore/models/Cohere-rerank-v3.5/version/1/registry/azureml-cohere) |

+ #### DeepSeek
+
+ | Model | Legacy date (UTC) | Deprecation date (UTC) | Retirement date (UTC) | Suggested replacement model |
+ |-------|-------------------|------------------------|-----------------------|-----------------------------|
+ | [DeepSeek-V3](https://aka.ms/azureai/landing/DeepSeek-V3) | April 10, 2025 | May 31, 2025 | August 31, 2025 | [DeepSeek-V3-0324](https://aka.ms/azureai/landing/DeepSeek-V3-0324) |
+
  #### Meta

  | Model | Legacy date (UTC) | Deprecation date (UTC) | Retirement date (UTC) | Suggested replacement model |

articles/ai-foundry/concepts/models-featured.md

Lines changed: 3 additions & 2 deletions

@@ -141,11 +141,12 @@ For more examples of how to use Jais models, see the following examples:

  ## DeepSeek

- DeepSeek family of models includes DeepSeek-R1, which excels at reasoning tasks using a step-by-step training process, such as language, scientific reasoning, and coding tasks, and DeepSeek-V3, a Mixture-of-Experts (MoE) language model.
+ The DeepSeek family of models includes DeepSeek-R1, which excels at reasoning tasks (language, scientific reasoning, and coding) through a step-by-step training process; DeepSeek-V3-0324, a Mixture-of-Experts (MoE) language model; and more.

  | Model | Type | Capabilities |
  | ------ | ---- | --- |
- | [DeepSeek-V3](https://ai.azure.com/explore/models/deepseek-v3/version/1/registry/azureml-deepseek) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (131,072 tokens) <br /> - **Output:** text (131,072 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text, JSON |
+ | [DeepSeek-V3-0324](https://ai.azure.com/explore/models/deepseek-v3-0324/version/1/registry/azureml-deepseek) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (131,072 tokens) <br /> - **Output:** text (131,072 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text, JSON |
+ | [DeepSeek-V3](https://ai.azure.com/explore/models/deepseek-v3/version/1/registry/azureml-deepseek) <br />(Legacy) | [chat-completion](../model-inference/how-to/use-chat-completions.md?context=/azure/ai-foundry/context/context) | - **Input:** text (131,072 tokens) <br /> - **Output:** text (131,072 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text, JSON |
  | [DeepSeek-R1](https://ai.azure.com/explore/models/deepseek-r1/version/1/registry/azureml-deepseek) | [chat-completion with reasoning content](../model-inference/how-to/use-chat-reasoning.md?context=/azure/ai-foundry/context/context) | - **Input:** text (163,840 tokens) <br /> - **Output:** text (163,840 tokens) <br /> - **Tool calling:** No <br /> - **Response formats:** Text. |

  For a tutorial on DeepSeek-R1, see [Tutorial: Get started with DeepSeek-R1 reasoning model in Azure AI model inference](../model-inference/tutorials/get-started-deepseek-r1.md?context=/azure/ai-foundry/context/context).
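
As a quick illustration of the chat-completion capability listed in the table, here's a minimal sketch that calls a DeepSeek-V3-0324 serverless deployment with the `azure-ai-inference` Python package. The endpoint URL and key are placeholders, and the model name must match your own deployment:

```python
# Minimal sketch: chat completion against a DeepSeek-V3-0324 deployment.
# Endpoint and key are placeholders; adjust them to your resource.
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential

client = ChatCompletionsClient(
    endpoint="https://YOUR_RESOURCE_NAME.services.ai.azure.com/models",
    credential=AzureKeyCredential("YOUR_API_KEY"),
)

response = client.complete(
    model="DeepSeek-V3-0324",
    messages=[
        SystemMessage(content="You are a helpful assistant."),
        UserMessage(content="Summarize what a Mixture-of-Experts model is."),
    ],
)
print(response.choices[0].message.content)
```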

articles/ai-foundry/includes/region-availability-maas.md

Lines changed: 1 addition & 0 deletions

@@ -41,6 +41,7 @@ Cohere Embed v3 - Multilingual | [Microsoft Managed Countries/Regions](/par

  | Model | Offer Availability Region | Hub/Project Region for Deployment | Hub/Project Region for Fine tuning |
  |---------|---------|---------|---------|
+ DeepSeek-V3-0324 | Not applicable | East US <br> East US 2 <br> North Central US <br> South Central US <br> West US <br> West US 3 | Not available |
  DeepSeek-V3 | Not applicable | East US <br> East US 2 <br> North Central US <br> South Central US <br> West US <br> West US 3 | Not available |
  DeepSeek-R1 | Not applicable | East US <br> East US 2 <br> North Central US <br> South Central US <br> West US <br> West US 3 | Not available |

articles/ai-foundry/model-inference/concepts/models.md

Lines changed: 2 additions & 1 deletion

@@ -110,7 +110,8 @@ DeepSeek family of models includes DeepSeek-R1, which excels at reasoning tasks
  | Model | Type | Tier | Capabilities |
  | ------ | ---- | --- | ------------ |
  | [DeepSeek-R1](https://ai.azure.com/explore/models/deepseek-r1/version/1/registry/azureml-deepseek) | chat-completion <br /> [(with reasoning content)](../how-to/use-chat-reasoning.md) | Global standard | - **Input:** text (163,840 tokens) <br /> - **Output:** text (163,840 tokens) <br /> - **Languages:** `en` and `zh` <br /> - **Tool calling:** No <br /> - **Response formats:** Text. |
- | [DeepSeek-V3](https://ai.azure.com/explore/models/deepseek-v3/version/1/registry/azureml-deepseek) | chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** text (131,072 tokens) <br /> - **Languages:** `en` and `zh` <br /> - **Tool calling:** No <br /> - **Response formats:** Text, JSON |
+ | [DeepSeek-V3](https://ai.azure.com/explore/models/deepseek-v3/version/1/registry/azureml-deepseek) <br />(Legacy) | chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** text (131,072 tokens) <br /> - **Languages:** `en` and `zh` <br /> - **Tool calling:** No <br /> - **Response formats:** Text, JSON |
+ | [DeepSeek-V3-0324](https://ai.azure.com/explore/models/deepseek-v3-0324/version/1/registry/azureml-deepseek) | chat-completion | Global standard | - **Input:** text (131,072 tokens) <br /> - **Output:** text (131,072 tokens) <br /> - **Languages:** `en` and `zh` <br /> - **Tool calling:** No <br /> - **Response formats:** Text, JSON |

  For a tutorial on DeepSeek-R1, see [Tutorial: Get started with DeepSeek-R1 reasoning model in Azure AI model inference](../tutorials/get-started-deepseek-r1.md).

articles/ai-foundry/model-inference/quotas-limits.md

Lines changed: 3 additions & 3 deletions

@@ -32,9 +32,9 @@ Azure uses quotas and limits to prevent budget overruns due to fraud, and to hon
  | -------------------- | ------------------- | ----------- |
  | Tokens per minute | Azure OpenAI models | Varies per model and SKU. See [limits for Azure OpenAI](../../ai-services/openai/quotas-limits.md). |
  | Requests per minute | Azure OpenAI models | Varies per model and SKU. See [limits for Azure OpenAI](../../ai-services/openai/quotas-limits.md). |
- | Tokens per minute | DeepSeek-R1 | 5,000,000 |
- | Requests per minute | DeepSeek-R1 | 5,000 |
- | Concurrent requests | DeepSeek-R1 | 300 |
+ | Tokens per minute | DeepSeek-R1<br />DeepSeek-V3-0324 | 5,000,000 |
+ | Requests per minute | DeepSeek-R1<br />DeepSeek-V3-0324 | 5,000 |
+ | Concurrent requests | DeepSeek-R1<br />DeepSeek-V3-0324 | 300 |
  | Tokens per minute | Rest of models | 400,000 |
  | Requests per minute | Rest of models | 1,000 |
  | Concurrent requests | Rest of models | 300 |
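
Clients that exceed the per-minute limits above receive HTTP 429 responses. As one possible mitigation, here's a sketch of exponential backoff with jitter; `call_endpoint` is a hypothetical stand-in for whatever SDK or HTTP call you use:

```python
# Hedged sketch: retry a request with exponential backoff when rate limited.
# Assumes `call_endpoint` returns a requests-style response object.
import random
import time

def with_backoff(call_endpoint, max_retries: int = 5):
    for attempt in range(max_retries):
        response = call_endpoint()
        if response.status_code != 429:
            return response
        # Prefer the service-provided Retry-After header if it exists.
        retry_after = response.headers.get("Retry-After")
        delay = float(retry_after) if retry_after else (2 ** attempt) + random.random()
        time.sleep(delay)
    raise RuntimeError("Rate limited after retries")
```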

articles/ai-services/openai/includes/batch/batch-python.md

Lines changed: 34 additions & 13 deletions

@@ -75,7 +75,7 @@ The `custom_id` is required to allow you to identify which individual batch requ

  ### Create input file

- For this article we'll create a file named `test.jsonl` and will copy the contents from standard input code block above to the file. You will need to modify and add your global batch deployment name to each line of the file. Save this file in the same directory that you're executing your Jupyter Notebook.
+ For this article, we'll create a file named `test.jsonl` and copy the contents from the standard input code block above into it. You'll need to add your global batch deployment name to each line of the file. Save this file in the same directory where you're running your Jupyter Notebook.

  ## Upload batch file

@@ -101,10 +101,15 @@ client = AzureOpenAI(
  # Upload a file with a purpose of "batch"
  file = client.files.create(
      file=open("test.jsonl", "rb"),
-     purpose="batch"
+     purpose="batch",
+     #extra_body={"expires_after":{"seconds": 1209600, "anchor": "created_at"}} # Optional: set between 1209600 and 2592000 seconds (14-30 days)
  )

  print(file.model_dump_json(indent=2))
+
+ #print(f"File expiration: {datetime.fromtimestamp(file.expires_at) if file.expires_at is not None else 'Not set'}")
+
  file_id = file.id
  ```

@@ -125,30 +130,41 @@ client = AzureOpenAI(
  # Upload a file with a purpose of "batch"
  file = client.files.create(
      file=open("test.jsonl", "rb"),
-     purpose="batch"
+     purpose="batch",
+     #extra_body={"expires_after":{"seconds": 1209600, "anchor": "created_at"}} # Optional: set between 1209600 and 2592000 seconds (14-30 days)
  )

  print(file.model_dump_json(indent=2))
+
+ #print(f"File expiration: {datetime.fromtimestamp(file.expires_at) if file.expires_at is not None else 'Not set'}")
+
  file_id = file.id
  ```

  ---

+ By uncommenting `extra_body={"expires_after":{"seconds": 1209600, "anchor": "created_at"}}`, you set your uploaded file to expire in 14 days. There's a maximum of 500 batch files per resource when no expiration is set; setting an expiration raises the limit to 10,000 files per resource. This feature isn't currently available in all regions. Output when file upload expiration is set:

  **Output:**

  ```json
  {
-   "id": "file-9f3a81d899b4442f98b640e4bc3535dd",
-   "bytes": 815,
-   "created_at": 1722476551,
+   "id": "file-655111ec9cfc44489d9af078f08116ef",
+   "bytes": 176064,
+   "created_at": 1743391067,
    "filename": "test.jsonl",
    "object": "file",
    "purpose": "batch",
-   "status": null,
+   "status": "processed",
+   "expires_at": 1744600667,
    "status_details": null
  }
+ File expiration: 2025-04-13 23:17:47
  ```
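
For illustration, here's a sketch of the upload with the optional expiration uncommented, assuming the same `AzureOpenAI` client as above and an `openai` SDK version that forwards `extra_body`:

```python
# Sketch: upload with a 14-day expiration enabled (assumes `client` from above).
from datetime import datetime

file = client.files.create(
    file=open("test.jsonl", "rb"),
    purpose="batch",
    extra_body={"expires_after": {"seconds": 1209600, "anchor": "created_at"}},  # 14 days
)

# expires_at is a Unix timestamp when an expiration is set, otherwise None.
print(f"File expiration: {datetime.fromtimestamp(file.expires_at) if file.expires_at is not None else 'Not set'}")
file_id = file.id
```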
  ## Create batch job

  Once your file has uploaded successfully you can submit the file for batch processing.

@@ -159,16 +175,21 @@ batch_response = client.batches.create(
      input_file_id=file_id,
      endpoint="/chat/completions",
      completion_window="24h",
+     #extra_body={"output_expires_after":{"seconds": 1209600, "anchor": "created_at"}} # Optional: set between 1209600 and 2592000 seconds (14-30 days)
  )

  # Save batch ID for later use
  batch_id = batch_response.id

  print(batch_response.model_dump_json(indent=2))
  ```

+ The default 500 max file limit per resource also applies to output files. You can uncomment the line above to add `extra_body={"output_expires_after":{"seconds": 1209600, "anchor": "created_at"}}` so that your output files expire in 14 days. Setting an expiration raises the limit to 10,000 batch files per resource. This feature isn't currently available in all regions.

  > [!NOTE]
- > Currently the completion window must be set to 24h. If you set any other value than 24h your job will fail. Jobs taking longer than 24 hours will continue to execute until canceled.
+ > Currently the completion window must be set to `24h`. If you set any value other than `24h`, your job will fail. Jobs taking longer than 24 hours will continue to execute until canceled.
  **Output:**

@@ -178,7 +199,7 @@ print(batch_response.model_dump_json(indent=2))
    "completion_window": "24h",
    "created_at": 1722476583,
    "endpoint": null,
-   "input_file_id": "file-9f3a81d899b4442f98b640e4bc3535dd",
+   "input_file_id": "file-655111ec9cfc44489d9af078f08116ef",
    "object": "batch",
    "status": "validating",
    "cancelled_at": null,

@@ -201,7 +222,7 @@ print(batch_response.model_dump_json(indent=2))
  }
  ```

- If your batch jobs are so large that you are hitting the enqueued token limit even after maxing out the quota for your deployment, certain regions now support a new [fail fast](#queueing-batch-jobs) feature that allows you to queue multiple batch jobs with exponential backoff so once one large batch job completes the next can be kicked off automatically. To learn more about what regions support this feature and how to adapt your code to take advantage of it, see [queuing batch jobs](#queueing-batch-jobs).
+ If your batch jobs are so large that you're hitting the enqueued token limit even after maxing out the quota for your deployment, certain regions now support a new [fail fast](#queueing-batch-jobs) feature that lets you queue multiple batch jobs with exponential backoff, so that once one large batch job completes the next can be kicked off automatically. To learn more about which regions support this feature and how to adapt your code to take advantage of it, see [queuing batch jobs](#queueing-batch-jobs).
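
Putting the optional pieces together, here's a sketch of job creation with output-file expiration enabled, followed by a single status check. It assumes the `client` and `file_id` from the upload step:

```python
# Sketch: create a batch job whose output files expire in 14 days,
# then check its status once (assumes `client` and `file_id` from above).
batch_response = client.batches.create(
    input_file_id=file_id,
    endpoint="/chat/completions",
    completion_window="24h",
    extra_body={"output_expires_after": {"seconds": 1209600, "anchor": "created_at"}},  # 14 days
)

batch = client.batches.retrieve(batch_response.id)
print(batch.status)  # for example: "validating", "in_progress", "completed"
```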

  ## Track batch job progress

@@ -311,7 +332,7 @@ if output_file_id:

  **Output:**

- For brevity, we are only including a single chat completion response of output. If you follow the steps in this article you should have three responses similar to the one below:
+ For brevity, we're including only a single chat completion response in the output. If you follow the steps in this article, you should have three responses similar to the one below:

  ```json
  {
@@ -429,7 +450,7 @@ print(all_jobs)

  Use the REST API to list all batch jobs with additional sorting/filtering options.

- In the examples below we are providing the `generate_time_filter` function to make constructing the filter easier. If you don't wish to use this function the format of the filter string would look like `created_at gt 1728860560 and status eq 'Completed'`.
+ In the examples below, we provide the `generate_time_filter` function to make constructing the filter easier. If you prefer not to use this function, the filter string format looks like `created_at gt 1728860560 and status eq 'Completed'`.
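
The actual `generate_time_filter` implementation appears in the tabs that follow. As a rough idea of its shape, a hypothetical version might build the filter string like this (the name, signature, and parameters here are illustrative, not the article's exact code):

```python
# Hypothetical sketch of a generate_time_filter-style helper: builds a filter
# string such as "created_at gt 1728860560 and status eq 'Completed'".
import time
from typing import Optional

def generate_time_filter(days: int, status: Optional[str] = None) -> str:
    cutoff = int(time.time()) - days * 24 * 60 * 60
    clauses = [f"created_at gt {cutoff}"]
    if status:
        clauses.append(f"status eq '{status}'")
    return " and ".join(clauses)

print(generate_time_filter(7, "Completed"))
```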

  # [Python (Microsoft Entra ID)](#tab/python-secure)

@@ -626,7 +647,7 @@ else:

  ## Queueing batch jobs

- If your batch jobs are so large that you are hitting the enqueued token limit even after maxing out the quota for your deployment, certain regions now support a new fail fast feature that allows you to queue multiple batch jobs with exponential backoff. Once one large batch job completes and your enqueued token quota is once again available, the next batch job can be created and kicked off automatically.
+ If your batch jobs are so large that you're hitting the enqueued token limit even after maxing out the quota for your deployment, certain regions now support a new fail fast feature that lets you queue multiple batch jobs with exponential backoff. Once one large batch job completes and your enqueued token quota is once again available, the next batch job can be created and kicked off automatically.

  **Old behavior:**
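A minimal sketch of that pattern follows. The token-limit detection is illustrative; match it to the actual error your SDK version raises:

```python
# Sketch: queue batch jobs with exponential backoff until quota frees up.
# The token-limit check below is a hypothetical placeholder.
import time

def submit_with_backoff(client, file_id, max_retries: int = 8):
    delay = 5
    for _ in range(max_retries):
        try:
            return client.batches.create(
                input_file_id=file_id,
                endpoint="/chat/completions",
                completion_window="24h",
            )
        except Exception as err:
            if "token_limit_exceeded" not in str(err):  # hypothetical error code
                raise
            time.sleep(delay)
            delay = min(delay * 2, 300)  # cap the backoff at 5 minutes
    raise RuntimeError("Could not enqueue batch job after retries")
```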

articles/ai-services/openai/includes/batch/batch-rest.md

Lines changed: 26 additions & 11 deletions

@@ -65,7 +65,7 @@ The `custom_id` is required to allow you to identify which individual batch requ

  ### Create input file

- For this article we'll create a file named `test.jsonl` and will copy the contents from standard input code block above to the file. You will need to modify and add your global batch deployment name to each line of the file.
+ For this article, we'll create a file named `test.jsonl` and copy the contents from the standard input code block above into it. You'll need to add your global batch deployment name to each line of the file.

  ## Upload batch file

@@ -78,21 +78,29 @@ curl -X POST https://YOUR_RESOURCE_NAME.openai.azure.com/openai/files?api-versio
    -H "Content-Type: multipart/form-data" \
    -H "api-key: $AZURE_OPENAI_API_KEY" \
    -F "purpose=batch" \
-   -F "file=@C:\\batch\\test.jsonl;type=application/json"
+   -F "file=@C:\\batch\\test.jsonl;type=application/json" \
+   -F "expires_after.seconds=1209600" \
+   -F "expires_after.anchor=created_at"
  ```

  The above code assumes a particular file path for your test.jsonl file. Adjust this file path as necessary for your local system.

+ By adding the optional `"expires_after.seconds=1209600"` and `"expires_after.anchor=created_at"` parameters, you set your uploaded file to expire in 14 days. You can set a value between 1209600 and 2592000 seconds (14-30 days). There's a maximum of 500 batch files per resource when no expiration is set; setting an expiration raises the limit to 10,000 files per resource. This feature isn't currently available in all regions.

  **Output:**

  ```json
  {
-   "status": "pending",
-   "bytes": 686,
+   "status": "processed",
+   "bytes": 817,
    "purpose": "batch",
    "filename": "test.jsonl",
-   "id": "file-21006e70789246658b86a1fc205899a4",
-   "created_at": 1721408291,
+   "expires_at": 1744607747,
+   "id": "file-7733bc35e32841e297a62a9ee50b3461",
+   "created_at": 1743398147,
    "object": "file"
  }
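
The same upload can be scripted from Python with the `requests` package; a sketch, with the resource name and API version left as placeholders:

```python
# Sketch: REST file upload with optional expiration, mirroring the curl example.
import os
import requests

url = "https://YOUR_RESOURCE_NAME.openai.azure.com/openai/files?api-version=YOUR_API_VERSION"
headers = {"api-key": os.environ["AZURE_OPENAI_API_KEY"]}
files = {"file": ("test.jsonl", open("test.jsonl", "rb"), "application/json")}
data = {
    "purpose": "batch",
    "expires_after.seconds": "1209600",    # optional: 14 days
    "expires_after.anchor": "created_at",  # optional
}

response = requests.post(url, headers=headers, files=files, data=data)
print(response.json())
```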

@@ -116,7 +124,8 @@ curl https://YOUR_RESOURCE_NAME.openai.azure.com/openai/files/{file-id}?api-vers
    "bytes": 686,
    "purpose": "batch",
    "filename": "test.jsonl",
-   "id": "file-21006e70789246658b86a1fc205899a4",
+   "expires_at": 1744607747,
+   "id": "file-7733bc35e32841e297a62a9ee50b3461",
    "created_at": 1721408291,
    "object": "file"
  }
@@ -134,12 +143,18 @@ curl -X POST https://YOUR_RESOURCE_NAME.openai.azure.com/openai/batches?api-vers
  -d '{
    "input_file_id": "file-abc123",
    "endpoint": "/chat/completions",
-   "completion_window": "24h"
+   "completion_window": "24h",
+   "output_expires_after": {
+     "seconds": 1209600
+   },
+   "anchor": "created_at"
  }'
  ```

+ The default 500 max file limit per resource also applies to output files. You can optionally add `"output_expires_after": {"seconds": 1209600}` and `"anchor": "created_at"` so that your output files expire in 14 days. Setting an expiration raises the limit to 10,000 batch files per resource. The file expiration feature isn't currently available in all regions.

  > [!NOTE]
- > Currently the completion window must be set to 24h. If you set any other value than 24h your job will fail. Jobs taking longer than 24 hours will continue to execute until canceled.
+ > Currently the completion window must be set to `24h`. If you set any value other than `24h`, your job will fail. Jobs taking longer than 24 hours will continue to execute until canceled.

  **Output:**

@@ -221,7 +236,7 @@ The following status values are possible:
  | `in_progress`|The input file was successfully validated and the batch is currently running. |
  | `finalizing`|The batch has completed and the results are being prepared. |
  | `completed`|The batch has been completed and the results are ready. |
- | `expired`|The batch was not able to be completed within the 24-hour time window.|
+ | `expired`|The batch wasn't able to be completed within the 24-hour time window.|
  | `cancelling`|The batch is being `cancelled`. (This can take up to 10 minutes to go into effect.) |
  | `cancelled`|The batch was `cancelled`.|
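
To watch for these status transitions programmatically, here's a sketch that polls the batch endpoint until a terminal state is reached (resource name, API version, and batch ID are placeholders):

```python
# Sketch: poll a batch job via the REST API until it reaches a terminal status.
import os
import time
import requests

url = "https://YOUR_RESOURCE_NAME.openai.azure.com/openai/batches/{batch-id}?api-version=YOUR_API_VERSION"
headers = {"api-key": os.environ["AZURE_OPENAI_API_KEY"]}

while True:
    status = requests.get(url, headers=headers).json()["status"]
    print(status)
    if status in ("completed", "failed", "expired", "cancelled"):
        break
    time.sleep(60)  # batch jobs are long-running; poll sparingly
```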
