Commit 96503a9

Merge pull request #6091 from MicrosoftDocs/main
Auto Publish – main to live - 2025-07-17 11:00 UTC
2 parents 138136c + da0c871 commit 96503a9

16 files changed, +296 -85 lines changed

articles/ai-foundry/openai/concepts/models.md

Lines changed: 15 additions & 15 deletions
@@ -96,7 +96,7 @@ Once access has been granted, you will need to create a deployment for the model
 
 | Model ID | Description | Context Window | Max Output Tokens | Training Data (up to) |
 | --- | :--- |:--- |:---|:---: |
-| `computer-use-preview` (2025-03-11) | Specialized model for use with the [Responses API](../how-to/responses.md) computer use tool <br> <br>-Tools <br>-Streaming<br>-Text(input/output)<br>- Image(input) | 8,192 | 1,024 | Oct 2023 |
+| `computer-use-preview` (2025-03-11) | Specialized model for use with the [Responses API](../how-to/responses.md) computer use tool <br> <br>-Tools <br>-Streaming<br>-Text(input/output)<br>- Image(input) | 8,192 | 1,024 | October 2023 |
 
 
 ## GPT-4.5 Preview
@@ -111,7 +111,7 @@ Once access has been granted, you will need to create a deployment for the model
 
 | Model ID | Description | Context Window | Max Output Tokens | Training Data (up to) |
 | --- | :--- |:--- |:---|:---: |
-| `gpt-4.5-preview` (2025-02-27) <br> **GPT-4.5 Preview** | [GPT 4.1](#gpt-41-series) is the recommended replacement for this model. Excels at diverse text and image tasks. <br>- Structured outputs <br>- Prompt caching <br>- Tools <br>- Streaming<br>- Text(input/output)<br>- Image(input) | 128,000 | 16,384 | Oct 2023 |
+| `gpt-4.5-preview` (2025-02-27) <br> **GPT-4.5 Preview** | [GPT 4.1](#gpt-41-series) is the recommended replacement for this model. Excels at diverse text and image tasks. <br>- Structured outputs <br>- Prompt caching <br>- Tools <br>- Streaming<br>- Text(input/output)<br>- Image(input) | 128,000 | 16,384 | October 2023 |
 
 > [!NOTE]
 > It is expected behavior that the model cannot answer questions about itself. If you want to know when the knowledge cutoff for the model's training data is, or other details about the model you should refer to the model documentation above.
@@ -126,10 +126,10 @@ The Azure OpenAI o<sup>&#42;</sup> series models are specifically designed to ta
 | `o3-pro` (2025-06-10) | - [Responses API](../how-to/responses.md) <br>- Structured outputs<br> - Text, image processing <br> - Functions/Tools<br> [Full summary of capabilities](../how-to/reasoning.md) | Input: 200,000 <br> Output: 100,000 | May 31, 2024 |
 | `o4-mini` (2025-04-16) | - **NEW** reasoning model, offering [enhanced reasoning abilities](../how-to/reasoning.md). <br><br> - Chat Completions API <br> - [Responses API](../how-to/responses.md) <br>- Structured outputs<br> - Text, image processing <br> - Functions/Tools<br> [Full summary of capabilities](../how-to/reasoning.md) | Input: 200,000 <br> Output: 100,000 | May 31, 2024 |
 | `o3` (2025-04-16) | - **NEW** reasoning model, offering [enhanced reasoning abilities](../how-to/reasoning.md). <br> <br> - Chat Completions API <br> - [Responses API](../how-to/responses.md) <br> - Structured outputs<br> - Text, image processing <br> - Functions/Tools/Parallel tool calling <br> [Full summary of capabilities](../how-to/reasoning.md) | Input: 200,000 <br> Output: 100,000 | May 31, 2024 |
-| `o3-mini` (2025-01-31) | - [Enhanced reasoning abilities](../how-to/reasoning.md). <br> - Structured outputs<br> - Text-only processing <br> - Functions/Tools | Input: 200,000 <br> Output: 100,000 | Oct 2023 |
-| `o1` (2024-12-17) | - [Enhanced reasoning abilities](../how-to/reasoning.md). <br> - Structured outputs<br> - Text, image processing <br> - Functions/Tools | Input: 200,000 <br> Output: 100,000 | Oct 2023 |
-|`o1-preview` (2024-09-12) | Older preview version | Input: 128,000 <br> Output: 32,768 | Oct 2023 |
-| `o1-mini` (2024-09-12) | A faster and more cost-efficient option in the o1 series, ideal for coding tasks requiring speed and lower resource consumption. <br><br> Global standard deployment available by default. <br> <br> Standard (regional) deployments are currently only available for select customers who received access as part of the `o1-preview` limited access release. | Input: 128,000 <br> Output: 65,536 | Oct 2023 |
+| `o3-mini` (2025-01-31) | - [Enhanced reasoning abilities](../how-to/reasoning.md). <br> - Structured outputs<br> - Text-only processing <br> - Functions/Tools | Input: 200,000 <br> Output: 100,000 | October 2023 |
+| `o1` (2024-12-17) | - [Enhanced reasoning abilities](../how-to/reasoning.md). <br> - Structured outputs<br> - Text, image processing <br> - Functions/Tools | Input: 200,000 <br> Output: 100,000 | October 2023 |
+|`o1-preview` (2024-09-12) | Older preview version | Input: 128,000 <br> Output: 32,768 | October 2023 |
+| `o1-mini` (2024-09-12) | A faster and more cost-efficient option in the o1 series, ideal for coding tasks requiring speed and lower resource consumption. <br><br> Global standard deployment available by default. <br> <br> Standard (regional) deployments are currently only available for select customers who received access as part of the `o1-preview` limited access release. | Input: 128,000 <br> Output: 65,536 | October 2023 |
 
 ### Availability

@@ -187,10 +187,10 @@ See [model versions](../concepts/model-versions.md) to learn about how Azure Ope
 
 | Model ID | Description | Max Request (tokens) | Training Data (up to) |
 | --- | :--- |:--- |:---: |
-| `gpt-4o` (2024-11-20) <br> **GPT-4o (Omni)** | **Latest large GA model** <br> - Structured outputs<br> - Text, image processing <br> - JSON Mode <br> - parallel function calling <br> - Enhanced accuracy and responsiveness <br> - Parity with English text and coding tasks compared to GPT-4 Turbo with Vision <br> - Superior performance in non-English languages and in vision tasks. <br> - **Enhanced creative writing ability** | Input: 128,000 <br> Output: 16,384 | Oct 2023 |
-|`gpt-4o` (2024-08-06) <br> **GPT-4o (Omni)** | - Structured outputs<br> - Text, image processing <br> - JSON Mode <br> - parallel function calling <br> - Enhanced accuracy and responsiveness <br> - Parity with English text and coding tasks compared to GPT-4 Turbo with Vision <br> - Superior performance in non-English languages and in vision tasks |Input: 128,000 <br> Output: 16,384 | Oct 2023 |
-|`gpt-4o-mini` (2024-07-18) <br> **GPT-4o mini** | **Latest small GA model** <br> - Fast, inexpensive, capable model ideal for replacing GPT-3.5 Turbo series models. <br> - Text, image processing <br>- JSON Mode <br> - parallel function calling | Input: 128,000 <br> Output: 16,384 | Oct 2023 |
-|`gpt-4o` (2024-05-13) <br> **GPT-4o (Omni)** | Text, image processing <br> - JSON Mode <br> - parallel function calling <br> - Enhanced accuracy and responsiveness <br> - Parity with English text and coding tasks compared to GPT-4 Turbo with Vision <br> - Superior performance in non-English languages and in vision tasks |Input: 128,000 <br> Output: 4,096| Oct 2023 |
+| `gpt-4o` (2024-11-20) <br> **GPT-4o (Omni)** | **Latest large GA model** <br> - Structured outputs<br> - Text, image processing <br> - JSON Mode <br> - parallel function calling <br> - Enhanced accuracy and responsiveness <br> - Parity with English text and coding tasks compared to GPT-4 Turbo with Vision <br> - Superior performance in non-English languages and in vision tasks. <br> - **Enhanced creative writing ability** | Input: 128,000 <br> Output: 16,384 | October 2023 |
+|`gpt-4o` (2024-08-06) <br> **GPT-4o (Omni)** | - Structured outputs<br> - Text, image processing <br> - JSON Mode <br> - parallel function calling <br> - Enhanced accuracy and responsiveness <br> - Parity with English text and coding tasks compared to GPT-4 Turbo with Vision <br> - Superior performance in non-English languages and in vision tasks |Input: 128,000 <br> Output: 16,384 | October 2023 |
+|`gpt-4o-mini` (2024-07-18) <br> **GPT-4o mini** | **Latest small GA model** <br> - Fast, inexpensive, capable model ideal for replacing GPT-3.5 Turbo series models. <br> - Text, image processing <br>- JSON Mode <br> - parallel function calling | Input: 128,000 <br> Output: 16,384 | October 2023 |
+|`gpt-4o` (2024-05-13) <br> **GPT-4o (Omni)** | Text, image processing <br> - JSON Mode <br> - parallel function calling <br> - Enhanced accuracy and responsiveness <br> - Parity with English text and coding tasks compared to GPT-4 Turbo with Vision <br> - Superior performance in non-English languages and in vision tasks |Input: 128,000 <br> Output: 4,096| October 2023 |
 | `gpt-4` (turbo-2024-04-09) <br>**GPT-4 Turbo with Vision** | **New GA model** <br> - Replacement for all previous GPT-4 preview models (`vision-preview`, `1106-Preview`, `0125-Preview`). <br> - [**Feature availability**](#gpt-4o-and-gpt-4-turbo) is currently different depending on method of input, and deployment type. | Input: 128,000 <br> Output: 4,096 | Dec 2023 |
 | `gpt-4-32k` (0613) | **Older GA model** <br> - Basic function calling with tools | 32,768 | Sep 2021 |
 | `gpt-4` (0613) | **Older GA model** <br> - Basic function calling with tools | 8,192 | Sep 2021 |
@@ -281,11 +281,11 @@ Details about maximum request tokens and training data are available in the foll
 
 | Model ID | Description | Max Request (tokens) | Training Data (up to) |
 |---|---|---|---|
-|`gpt-4o-mini-audio-preview` (2024-12-17) <br> **GPT-4o audio** | **Audio model** for audio and text generation. |Input: 128,000 <br> Output: 4,096 | Oct 2023 |
-|`gpt-4o-mini-realtime-preview` (2024-12-17) <br> **GPT-4o audio** | **Audio model** for real-time audio processing. |Input: 128,000 <br> Output: 4,096 | Oct 2023 |
-|`gpt-4o-audio-preview` (2024-12-17) <br> **GPT-4o audio** | **Audio model** for audio and text generation. |Input: 128,000 <br> Output: 4,096 | Oct 2023 |
-|`gpt-4o-realtime-preview` (2024-12-17) <br> **GPT-4o audio** | **Audio model** for real-time audio processing. |Input: 128,000 <br> Output: 4,096 | Oct 2023 |
-|`gpt-4o-mini-realtime-preview` (2024-12-17) <br> **GPT-4o audio** | **Audio model** for real-time audio processing. |Input: 128,000 <br> Output: 4,096 | Oct 2023 |
+|`gpt-4o-mini-audio-preview` (2024-12-17) <br> **GPT-4o audio** | **Audio model** for audio and text generation. |Input: 128,000 <br> Output: 16,384 | September 2023 |
+|`gpt-4o-audio-preview` (2024-12-17) <br> **GPT-4o audio** | **Audio model** for audio and text generation. |Input: 128,000 <br> Output: 16,384 | September 2023 |
+|`gpt-4o-realtime-preview` (2025-06-03) <br> **GPT-4o audio** | **Audio model** for real-time audio processing. |Input: 128,000 <br> Output: 4,096 | October 2023 |
+|`gpt-4o-realtime-preview` (2024-12-17) <br> **GPT-4o audio** | **Audio model** for real-time audio processing. |Input: 128,000 <br> Output: 4,096 | October 2023 |
+|`gpt-4o-mini-realtime-preview` (2024-12-17) <br> **GPT-4o audio** | **Audio model** for real-time audio processing. |Input: 128,000 <br> Output: 4,096 | October 2023 |
 
 To compare the availability of GPT-4o audio models across all regions, see the [models table](#global-standard-model-availability).

articles/ai-services/speech-service/get-started-stt-diarization.md

Lines changed: 4 additions & 4 deletions
@@ -7,9 +7,9 @@ manager: nitinme
 ms.service: azure-ai-speech
 ms.custom: devx-track-extended-java, devx-track-go, devx-track-js, devx-track-python
 ms.topic: quickstart
-ms.date: 3/10/2025
+ms.date: 7/16/2025
 ms.author: eur
-zone_pivot_groups: programming-languages-speech-services
+zone_pivot_groups: programming-languages-speech-diarization
 keywords: speech to text, speech to text software
 #customer intent: As a developer, I want to create speech to text applications that use diarization to identify speakers in multiple person conversations.
 ---
@@ -52,8 +52,8 @@ keywords: speech to text, speech to text software
 [!INCLUDE [REST include](includes/quickstarts/stt-diarization/rest.md)]
 ::: zone-end
 
-::: zone pivot="programming-language-cli"
-[!INCLUDE [CLI include](includes/quickstarts/stt-diarization/cli.md)]
+::: zone pivot="programming-language-typescript"
+[!INCLUDE [TypeScript include](includes/quickstarts/stt-diarization/typescript.md)]
 ::: zone-end
 
 ## Next step

articles/ai-services/speech-service/includes/quickstarts/speech-to-text-basics/javascript.md

Lines changed: 32 additions & 24 deletions
@@ -47,33 +47,41 @@ To transcribe speech from a file:
 1. Create a new file named *transcription.js* with the following content:
 
 ```javascript
-import { readFileSync } from "fs";
-import { SpeechConfig, AudioConfig, SpeechRecognizer, ResultReason, CancellationDetails, CancellationReason } from "microsoft-cognitiveservices-speech-sdk";
+import { readFileSync, createReadStream } from "fs";
+import { SpeechConfig, AudioConfig, ConversationTranscriber, AudioInputStream } from "microsoft-cognitiveservices-speech-sdk";
 // This example requires environment variables named "SPEECH_KEY" and "SPEECH_REGION"
 const speechConfig = SpeechConfig.fromSubscription(process.env.SPEECH_KEY, process.env.SPEECH_REGION);
-speechConfig.speechRecognitionLanguage = "en-US";
 function fromFile() {
-    const audioConfig = AudioConfig.fromWavFileInput(readFileSync("YourAudioFile.wav"));
-    const speechRecognizer = new SpeechRecognizer(speechConfig, audioConfig);
-    speechRecognizer.recognizeOnceAsync((result) => {
-        switch (result.reason) {
-            case ResultReason.RecognizedSpeech:
-                console.log(`RECOGNIZED: Text=${result.text}`);
-                break;
-            case ResultReason.NoMatch:
-                console.log("NOMATCH: Speech could not be recognized.");
-                break;
-            case ResultReason.Canceled:
-                const cancellation = CancellationDetails.fromResult(result);
-                console.log(`CANCELED: Reason=${cancellation.reason}`);
-                if (cancellation.reason === CancellationReason.Error) {
-                    console.log(`CANCELED: ErrorCode=${cancellation.ErrorCode}`);
-                    console.log(`CANCELED: ErrorDetails=${cancellation.errorDetails}`);
-                    console.log("CANCELED: Did you set the speech resource key and region values?");
-                }
-                break;
-        }
-        speechRecognizer.close();
+    const filename = "katiesteve.wav";
+    const audioConfig = AudioConfig.fromWavFileInput(readFileSync(filename));
+    const conversationTranscriber = new ConversationTranscriber(speechConfig, audioConfig);
+    const pushStream = AudioInputStream.createPushStream();
+    createReadStream(filename).on('data', function (chunk) {
+        pushStream.write(chunk.slice());
+    }).on('end', function () {
+        pushStream.close();
+    });
+    console.log("Transcribing from: " + filename);
+    conversationTranscriber.sessionStarted = function (s, e) {
+        console.log("SessionStarted event");
+        console.log("SessionId:" + e.sessionId);
+    };
+    conversationTranscriber.sessionStopped = function (s, e) {
+        console.log("SessionStopped event");
+        console.log("SessionId:" + e.sessionId);
+        conversationTranscriber.stopTranscribingAsync();
+    };
+    conversationTranscriber.canceled = function (s, e) {
+        console.log("Canceled event");
+        console.log(e.errorDetails);
+        conversationTranscriber.stopTranscribingAsync();
+    };
+    conversationTranscriber.transcribed = function (s, e) {
+        console.log("TRANSCRIBED: Text=" + e.result.text + " Speaker ID=" + e.result.speakerId);
+    };
+    // Start conversation transcription
+    conversationTranscriber.startTranscribingAsync(function () { }, function (err) {
+        console.trace("err - starting transcription: " + err);
     });
 }
 fromFile();

articles/ai-services/speech-service/includes/quickstarts/stt-diarization/cli.md

Lines changed: 0 additions & 9 deletions
This file was deleted.

articles/ai-services/speech-service/includes/quickstarts/stt-diarization/cpp.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 author: eric-urban
 ms.service: azure-ai-speech
 ms.topic: include
-ms.date: 3/10/2025
+ms.date: 7/16/2025
 ms.author: eur
 ---

articles/ai-services/speech-service/includes/quickstarts/stt-diarization/csharp.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 author: eric-urban
 ms.service: azure-ai-speech
 ms.topic: include
-ms.date: 3/10/2025
+ms.date: 7/16/2025
 ms.author: eur
 ---

articles/ai-services/speech-service/includes/quickstarts/stt-diarization/go.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 author: eric-urban
 ms.service: azure-ai-speech
 ms.topic: include
-ms.date: 3/10/2025
+ms.date: 7/16/2025
 ms.author: eur
 ---

articles/ai-services/speech-service/includes/quickstarts/stt-diarization/intro.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 author: eric-urban
 ms.service: azure-ai-speech
 ms.topic: include
-ms.date: 3/10/2025
+ms.date: 7/16/2025
 ms.author: eur
 ---

articles/ai-services/speech-service/includes/quickstarts/stt-diarization/java.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 author: eric-urban
 ms.service: azure-ai-speech
 ms.topic: include
-ms.date: 3/10/2025
+ms.date: 7/16/2025
 ms.author: eur
 ---
