Commit 496a52a

Merge pull request #5147 from PatrickFarley/sora-build: Sora build

2 parents 42be054 + 4ec8a95

File tree

6 files changed: +214 -13 lines changed

articles/ai-services/openai/concepts/models.md

Lines changed: 27 additions & 2 deletions
@@ -244,6 +244,19 @@ Once access has been granted, you will need to create a deployment for the model

|`dall-e-3` | East US<br>Australia East<br>Sweden Central|
|`gpt-image-1` | West US 3 (Global Standard) <br> UAE North (Global Standard) |

## Video generation models

Sora is an AI model from OpenAI that can create realistic and imaginative video scenes from text instructions. Sora is in public preview.

### Region availability

| Model | Region |
|---|---|
|`sora` | East US 2|

## Audio models

Audio models in Azure OpenAI are available via the `realtime`, `completions`, and `audio` APIs.
@@ -434,13 +447,25 @@ These models can only be used with Embedding API requests.

[!INCLUDE [Image Generation](../includes/model-matrix/standard-image-generation.md)]

| Model ID | Max Request (characters) |
| --- | :---: |
| gpt-image-1 | 4,000 |
| dall-e-3 | 4,000 |

# [Video Generation](#tab/standard-video-generations)

### Video generation models

| **Region** | **sora** |
|:-----------------|:---------------------:|
| eastus2 | ✅ |

| Model ID | Max Request (characters) |
| --- | :---: |
| sora | 4,000 |

# [Audio](#tab/standard-audio)

### Audio models
articles/ai-services/openai/concepts/video-generation.md

Lines changed: 51 additions & 0 deletions

@@ -0,0 +1,51 @@
---
title: Sora video generation overview (preview)
description: Learn about Sora, an AI model for generating realistic and imaginative video scenes from text instructions, including safety, limitations, and supported features.
author: PatrickFarley
ms.author: pafarley
manager: nitinme
ms.service: azure-ai-openai
ms.topic: conceptual
ms.date: 05/22/2025
---

# Sora video generation (preview)

Sora is an AI model from OpenAI that can create realistic and imaginative video scenes from text instructions. The model can generate a wide range of video content, including realistic scenes, animations, and special effects. Several video resolutions and durations are supported.

## Supported features

Sora can generate complex scenes with multiple characters, diverse motions, and detailed backgrounds. The model interprets prompts with contextual and physical-world understanding, enabling accurate scene composition and character persistence across multiple shots. Sora demonstrates strong language comprehension for prompt interpretation and emotional character generation.

## How it works

Video generation is an asynchronous process. You create a job request with your text prompt and video format specifications, and the model processes the request in the background. You can check the status of the video generation job and, once it finishes, retrieve the generated video via a download URL.
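The create-poll-retrieve flow can be sketched as a small helper that's independent of the REST calls. The following is an illustrative sketch only, not part of the Sora API surface: `fetch_status` is a hypothetical callable (for example, a wrapper around an HTTP GET on the job's status URL), and the status strings mirror the flow described above.

```python
import time

def poll_job(fetch_status, interval=5, timeout=600):
    """Poll an asynchronous job until it reaches a terminal state.

    fetch_status: a zero-argument callable returning the job's current
    status string (for example, a wrapper around an HTTP GET on the
    job's status URL).
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status()
        if status == "succeeded":
            return status
        if status in ("failed", "cancelled"):
            raise RuntimeError(f"Job did not succeed: {status}")
        time.sleep(interval)  # wait before polling again
    raise TimeoutError("Job did not finish within the timeout.")

# Example with a stubbed status sequence in place of a real HTTP call:
statuses = iter(["queued", "running", "succeeded"])
result = poll_job(lambda: next(statuses), interval=0)  # returns "succeeded"
```

After the job succeeds, the video itself is fetched in a separate request, as the quickstart shows.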
## Best practices for prompts

For the best video generation performance, write text prompts in English or other Latin-script languages.
## Limitations

### Content quality limitations

Sora might have difficulty with complex physics, causal relationships (for example, bite marks on a cookie), spatial reasoning (for example, knowing left from right), and precise time-based event sequencing such as camera movement.

### Technical limitations

Sora supports the following output resolutions (width x height): 480x480, 480x854, 854x480, 720x720, 720x1280, 1280x720, 1080x1080, 1080x1920, 1920x1080.

Sora supports the following video durations: 5, 10, 15, and 20 seconds.

You can request multiple video variants in a single job: for 1080p resolutions, this feature is disabled; for 720p, the maximum is two variants; for other resolutions, the maximum is four variants.

You can have up to two video creation jobs running at the same time. You must wait for one of them to finish before you can create another.
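As a quick sanity check before submitting a job, the resolution, duration, and variant limits listed above can be encoded in a small helper. This is an illustrative sketch based only on the documented limits; the function names are hypothetical and not part of the API.

```python
# Limits as documented for the Sora preview.
SUPPORTED_RESOLUTIONS = {
    (480, 480), (480, 854), (854, 480),
    (720, 720), (720, 1280), (1280, 720),
    (1080, 1080), (1080, 1920), (1920, 1080),
}
SUPPORTED_DURATIONS = {5, 10, 15, 20}  # seconds

def max_variants(width, height):
    """Return the documented variant limit for a resolution."""
    if 1080 in (width, height):
        return 1  # multiple variants are disabled for 1080p resolutions
    if 720 in (width, height):
        return 2
    return 4

def validate_request(width, height, duration, n_variants):
    """Raise ValueError if the request exceeds a documented Sora limit."""
    if (width, height) not in SUPPORTED_RESOLUTIONS:
        raise ValueError(f"Unsupported resolution: {width}x{height}")
    if duration not in SUPPORTED_DURATIONS:
        raise ValueError(f"Unsupported duration: {duration}s")
    if n_variants > max_variants(width, height):
        raise ValueError(
            f"At most {max_variants(width, height)} variant(s) for {width}x{height}")
```

For example, `validate_request(720, 1280, 5, 3)` raises an error because 720p jobs allow at most two variants.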
## Responsible AI

Sora has a robust safety stack including content filtering, abuse monitoring, sensitive content blocking, and safety classifiers.

Sora doesn't generate scenes with acts of violence but can generate adjacent content, such as realistic war-like footage.

articles/ai-services/openai/includes/model-matrix/standard-image-generation.md

Lines changed: 7 additions & 5 deletions
@@ -9,8 +9,10 @@ ms.custom: references_regions

ms.date: 02/06/2025
---

| **Region** | **dall-e-3**, **3.0** | **gpt-image-1** |
|:-----------------|:---------------------:|:---------------:|
| australiaeast | ✅ | |
| eastus | ✅ | |
| swedencentral | ✅ | |
| westus3 | | ✅ |
| uaenorth | | ✅ |

articles/ai-services/openai/toc.yml

Lines changed: 4 additions & 0 deletions
@@ -36,6 +36,8 @@ items:

  href: gpt-v-quickstart.md
- name: Image generation
  href: dall-e-quickstart.md
- name: Video generation
  href: video-generation-quickstart.md
- name: Use your data
  href: use-your-data-quickstart.md
- name: Realtime API for speech and audio (preview)
@@ -90,6 +92,8 @@ items:

  href: ./concepts/fine-tuning-considerations.md
- name: Vision-enabled models
  href: ./concepts/gpt-with-vision.md
- name: Video generation (preview)
  href: ./concepts/video-generation.md
- name: Red teaming large language models (LLMs)
  href: ./concepts/red-teaming.md
- name: Content credentials
articles/ai-services/openai/video-generation-quickstart.md

Lines changed: 113 additions & 0 deletions

@@ -0,0 +1,113 @@
---
title: 'Quickstart: Generate video with Sora'
titleSuffix: Azure OpenAI
description: Learn how to get started generating video clips with Azure OpenAI.
manager: nitinme
ms.service: azure-ai-openai
ms.topic: quickstart
author: PatrickFarley
ms.author: pafarley
ms.date: 05/22/2025
---
# Quickstart: Generate a video with Sora (preview)

In this quickstart, you generate video clips using the Azure OpenAI service. The example uses the Sora model, a video generation model that creates realistic and imaginative video scenes from text instructions. This guide shows you how to create a video generation job, poll for its status, and retrieve the generated video.

## Prerequisites

- An Azure subscription. <a href="https://azure.microsoft.com/free/ai-services" target="_blank">Create one for free</a>.
- <a href="https://www.python.org/" target="_blank">Python 3.8 or later version</a>.
- An Azure OpenAI resource created in a supported region. See [Region availability](/azure/ai-services/openai/concepts/models#model-summary-table-and-region-availability).
- A `sora` model deployed to your Azure OpenAI resource. For more information, see [Create a resource and deploy a model with Azure OpenAI](./how-to/create-resource.md).
## Setup

### Retrieve key and endpoint

To successfully call the Azure OpenAI APIs, you need the following information about your Azure OpenAI resource:

| Variable | Name | Value |
|---|---|---|
| **Endpoint** | `api_base` | The endpoint value is located under **Keys and Endpoint** for your resource in the Azure portal. You can also find the endpoint via the **Deployments** page in the Azure AI Foundry portal. An example endpoint is: `https://docs-test-001.openai.azure.com/`. |
| **Key** | `api_key` | The key value is also located under **Keys and Endpoint** for your resource in the Azure portal. Azure generates two keys for your resource. You can use either value. |

Go to your resource in the Azure portal. On the navigation pane, select **Keys and Endpoint** under **Resource Management**. Copy the **Endpoint** value and an access key value. You can use either the **KEY 1** or **KEY 2** value. Having two keys allows you to securely rotate and regenerate keys without causing a service disruption.

:::image type="content" source="./media/quickstarts/endpoint.png" alt-text="Screenshot that shows the Keys and Endpoint page for an Azure OpenAI resource in the Azure portal." lightbox="./media/quickstarts/endpoint.png":::

[!INCLUDE [environment-variables](./includes/environment-variables.md)]
## Create a new Python application

1. Create a new Python file named `quickstart.py`. Open it in your preferred editor or IDE.
1. Replace the contents of `quickstart.py` with the following code. Change the value of `prompt` to your preferred text.

    ```python
    import os
    import time
    import requests

    # Read your resource's configuration from environment variables
    endpoint = os.environ["AZURE_OPENAI_ENDPOINT"]  # e.g., "https://docs-test-001.openai.azure.com"
    api_key = os.environ["AZURE_OPENAI_KEY"]
    access_token = os.environ.get("AZURE_OPENAI_TOKEN")  # Optional: if using Azure AD auth

    headers = {
        "Content-Type": "application/json",
        "api-key": api_key,
    }
    if access_token:
        headers["Authorization"] = f"Bearer {access_token}"

    # 1. Create a video generation job
    create_url = f"{endpoint}/openai/v1/video/generations/jobs?api-version=preview"
    payload = {
        "prompt": "A cat playing piano in a jazz bar.",
        "model": "sora"
    }
    response = requests.post(create_url, headers=headers, json=payload)
    response.raise_for_status()
    job_id = response.json()["body"]["id"]
    print(f"Job created: {job_id}")

    # 2. Poll for job status
    status_url = f"{endpoint}/openai/v1/video/generations/jobs/{job_id}?api-version=preview"
    while True:
        status_response = requests.get(status_url, headers=headers)
        status_response.raise_for_status()
        status = status_response.json()["body"]["status"]
        print(f"Job status: {status}")
        if status == "succeeded":
            generations = status_response.json()["body"].get("generations", [])
            if not generations:
                raise Exception("No generations found in job result.")
            generation_id = generations[0]["id"]
            break
        elif status in ("failed", "cancelled"):
            raise Exception(f"Job did not succeed. Status: {status}")
        time.sleep(5)  # Wait before polling again

    # 3. Retrieve and save the generated video
    video_url = f"{endpoint}/openai/v1/video/generations/{generation_id}/content/video?api-version=preview"
    video_response = requests.get(video_url, headers=headers)
    video_response.raise_for_status()
    with open("output.mp4", "wb") as file:
        file.write(video_response.content)
    print("Generated video saved as output.mp4")
    ```

1. Run the application with the `python` command:

    ```console
    python quickstart.py
    ```

Wait a few moments to get the response.
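Step 1 of the script submits only a prompt and model name, so the service uses default video settings. You can also request a specific output format in the job payload. The following sketch is illustrative and hedged: the field names `width`, `height`, and `n_seconds` are assumptions for this example; confirm them against the current API reference before use.

```python
# Hypothetical job payload requesting a 1280x720, 10-second clip.
# Field names other than "prompt" and "model" are assumptions; check
# the current API reference before relying on them.
payload = {
    "model": "sora",
    "prompt": "A cat playing piano in a jazz bar.",
    "width": 1280,
    "height": 720,
    "n_seconds": 10,
}
```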
---

articles/ai-services/openai/whats-new.md

Lines changed: 12 additions & 6 deletions
@@ -20,6 +20,12 @@ This article provides a summary of the latest releases and major documentation u

## May 2025

### Sora video generation released (preview)

Sora (2025-05-02) is a video generation model from OpenAI that can create realistic and imaginative video scenes from text instructions.

Follow the [Video generation quickstart](/azure/ai-services/openai/video-generation-quickstart) to get started. For more information, see the [Video generation concepts](./concepts/video-generation.md) guide.

### PII detection content filter

Personally identifiable information (PII) detection is now available as a built-in content filter. This feature allows you to identify and block sensitive information in LLM outputs, enhancing data privacy. For more information, see the [PII detection](./concepts/content-filter-personal-information.md) documentation.
@@ -147,7 +153,7 @@ For more information, see the [GPT-4o real-time audio quickstart](realtime-audio

### o1 reasoning model released for limited access

The latest `o1` model is now available for API access and model deployment. **Registration is required, and access will be granted based on Microsoft's eligibility criteria**. Customers who previously applied and received access to `o1-preview` don't need to reapply, as they're automatically on the wait-list for the latest model.

Request access: [limited access model application](https://aka.ms/OAI/o1access)

@@ -161,7 +167,7 @@ To learn more about the advanced `o1` series models see, [getting started with o

### Preference fine-tuning (preview)

[Direct preference optimization (DPO)](./how-to/fine-tuning-direct-preference-optimization.md) is a new alignment technique for large language models, designed to adjust model weights based on human preferences. Unlike reinforcement learning from human feedback (RLHF), DPO doesn't require fitting a reward model and uses simpler data (binary preferences) for training. This method is computationally lighter and faster, making it equally effective at alignment while being more efficient. DPO is especially useful in scenarios where subjective elements like tone, style, or specific content preferences are important. We're excited to announce the public preview of DPO in Azure OpenAI, starting with the `gpt-4o-2024-08-06` model.

For fine-tuning model region availability, see the [models page](./concepts/models.md#fine-tuning-models).

@@ -199,7 +205,7 @@ For fine-tuning model region availability, see the [models page](./concepts/mode

### NEW AI abuse monitoring

We're introducing new forms of abuse monitoring that leverage LLMs to improve the efficiency of detecting potentially abusive use of Azure OpenAI and to enable abuse monitoring without the need for human review of prompts and completions. To learn more, see [Abuse monitoring](/azure/ai-services/openai/concepts/abuse-monitoring).

Prompts and completions that are flagged through content classification and/or identified to be part of a potentially abusive pattern of use are subjected to an additional review process to help confirm the system's analysis and inform actioning decisions. Our abuse monitoring systems have been expanded to enable review by LLM by default and by humans when necessary and appropriate.

@@ -312,13 +318,13 @@ OpenAI has incorporated additional safety measures into the `o1` models, includi

### Availability

The `o1-preview` and `o1-mini` models are available in the East US2 region for limited access through the [Azure AI Foundry portal](https://ai.azure.com) early access playground. Data processing for the `o1` models might occur in a different region than where they're available for use.

To try the `o1-preview` and `o1-mini` models in the early access playground, **registration is required, and access will be granted based on Microsoft's eligibility criteria.**

Request access: [limited access model application](https://aka.ms/oai/modelaccess)

Once access has been granted, you'll need to:

1. Navigate to https://ai.azure.com/resources and select a resource in the `eastus2` region. If you don't have an Azure OpenAI resource in this region, you'll need to [create one](https://portal.azure.com/#create/Microsoft.CognitiveServicesOpenAI).
2. Once the `eastus2` Azure OpenAI resource is selected, in the upper-left panel under **Playgrounds**, select **Early access playground (preview)**.
@@ -375,7 +381,7 @@ Unlike the previous early access playground, the [Azure AI Foundry portal](https

> [!NOTE]
> Prompts and completions made through the early access playground (preview) might be processed in any Azure OpenAI region, and are currently subject to a limit of 10 requests per minute per Azure subscription. This limit might change in the future.
>
> Azure OpenAI abuse monitoring is enabled for all early access playground users even if approved for modification; default content filters are enabled and can't be modified.

To test out GPT-4o `2024-08-06`, sign in to the Azure AI early access playground (preview) using this [link](https://aka.ms/oai/docs/earlyaccessplayground).
