
Commit 7c09fc5

Merge pull request #983 from diberry/diberry/1022-audio

Quickstart JS - Audio - Entra for JS & TS

2 parents 761b750 + 4bd60fa

File tree: 7 files changed (+537, −73 lines)


articles/ai-services/openai/includes/text-to-speech-javascript.md

Lines changed: 24 additions & 44 deletions
````diff
@@ -14,24 +14,12 @@ recommendations: false
 
 ## Prerequisites
 
-#### [JavaScript](#tab/javascript)
-
-- An Azure subscription - [Create one for free](https://azure.microsoft.com/free/cognitive-services?azure-portal=true)
-- [LTS versions of Node.js](https://github.com/nodejs/release#release-schedule)
-- An Azure OpenAI resource created in a supported region (see [Region availability](/azure/ai-services/openai/concepts/models#model-summary-table-and-region-availability)). For more information, see [Create a resource and deploy a model with Azure OpenAI](../how-to/create-resource.md).
-
-
-#### [TypeScript](#tab/typescript)
-
 - An Azure subscription - [Create one for free](https://azure.microsoft.com/free/cognitive-services?azure-portal=true)
 - [LTS versions of Node.js](https://github.com/nodejs/release#release-schedule)
-- [TypeScript](https://www.typescriptlang.org/download/)
+- [Azure CLI](/cli/azure/install-azure-cli) used for passwordless authentication in a local development environment, create the necessary context by signing in with the Azure CLI.
 - An Azure OpenAI resource created in a supported region (see [Region availability](/azure/ai-services/openai/concepts/models#model-summary-table-and-region-availability)). For more information, see [Create a resource and deploy a model with Azure OpenAI](../how-to/create-resource.md).
 
 
-
----
-
 ## Set up
 
 ### Retrieve key and endpoint
````
````diff
@@ -104,32 +92,35 @@ Your app's _package.json_ file will be updated with the dependencies.
 
 ## Create a speech file
 
+
 
-
-#### [JavaScript](#tab/javascript)
+#### [Microsoft Entra ID](#tab/javascript-keyless)
 
 1. Create a new file named _Text-to-speech.js_ and open it in your preferred code editor. Copy the following code into the _Text-to-speech.js_ file:
 
 ```javascript
-require("dotenv/config");
 const { writeFile } = require("fs/promises");
 const { AzureOpenAI } = require("openai");
+const { DefaultAzureCredential, getBearerTokenProvider } = require("@azure/identity");
 require("openai/shims/node");
 
 // You will need to set these environment variables or edit the following values
 const endpoint = process.env["AZURE_OPENAI_ENDPOINT"] || "<endpoint>";
-const apiKey = process.env["AZURE_OPENAI_API_KEY"] || "<api key>";
-const speechFilePath =
-  process.env["SPEECH_FILE_PATH"] || "<path to save the speech file>";
+const speechFilePath = "<path to save the speech file>";
 
 // Required Azure OpenAI deployment name and API version
 const deploymentName = "tts";
 const apiVersion = "2024-08-01-preview";
 
+// keyless authentication
+const credential = new DefaultAzureCredential();
+const scope = "https://cognitiveservices.azure.com/.default";
+const azureADTokenProvider = getBearerTokenProvider(credential, scope);
+
 function getClient() {
   return new AzureOpenAI({
     endpoint,
-    apiKey,
+    azureADTokenProvider,
     apiVersion,
     deployment: deploymentName,
   });
````
````diff
@@ -169,30 +160,26 @@ Your app's _package.json_ file will be updated with the dependencies.
 ```console
 node Text-to-speech.js
 ```
-
 
-#### [TypeScript](#tab/typescript)
+#### [API key](#tab/javascript-key)
 
-1. Create a new file named _Text-to-speech.ts_ and open it in your preferred code editor. Copy the following code into the _Text-to-speech.ts_ file:
+1. Create a new file named _Text-to-speech.js_ and open it in your preferred code editor. Copy the following code into the _Text-to-speech.js_ file:
 
-```typescript
-import "dotenv/config";
-import { writeFile } from "fs/promises";
-import { AzureOpenAI } from "openai";
-import type { SpeechCreateParams } from "openai/resources/audio/speech";
-import "openai/shims/node";
+```javascript
+const { writeFile } = require("fs/promises");
+const { AzureOpenAI } = require("openai");
+require("openai/shims/node");
 
 // You will need to set these environment variables or edit the following values
 const endpoint = process.env["AZURE_OPENAI_ENDPOINT"] || "<endpoint>";
 const apiKey = process.env["AZURE_OPENAI_API_KEY"] || "<api key>";
-const speechFilePath =
-  process.env["SPEECH_FILE_PATH"] || "<path to save the speech file>";
+const speechFilePath = "<path to save the speech file>";
 
 // Required Azure OpenAI deployment name and API version
 const deploymentName = "tts";
 const apiVersion = "2024-08-01-preview";
 
-function getClient(): AzureOpenAI {
+function getClient() {
   return new AzureOpenAI({
     endpoint,
     apiKey,
````
````diff
@@ -202,9 +189,9 @@ Your app's _package.json_ file will be updated with the dependencies.
 }
 
 async function generateAudioStream(
-  client: AzureOpenAI,
-  params: SpeechCreateParams
-): Promise<NodeJS.ReadableStream> {
+  client,
+  params
+) {
   const response = await client.audio.speech.create(params);
   if (response.ok) return response.body;
   throw new Error(`Failed to generate audio stream: ${response.statusText}`);
````
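The `generateAudioStream` helper above guards on `response.ok` before returning the stream body. A minimal sketch of that control flow, using stand-in client objects rather than the real `AzureOpenAI` SDK (no network calls, values are placeholders):

```javascript
// Same guard as the sample's helper: return the body on success,
// otherwise surface the HTTP status text as an error.
async function generateAudioStream(client, params) {
  const response = await client.audio.speech.create(params);
  if (response.ok) return response.body;
  throw new Error(`Failed to generate audio stream: ${response.statusText}`);
}

// Stand-in clients (not the real SDK) to exercise both branches.
const okClient = {
  audio: { speech: { create: async () => ({ ok: true, body: "<audio stream>" }) } },
};
const failingClient = {
  audio: { speech: { create: async () => ({ ok: false, statusText: "Unauthorized" }) } },
};

async function demo() {
  // Success branch: the body is returned as-is.
  console.log(await generateAudioStream(okClient, {}));
  // Failure branch: the status text ends up in the error message.
  try {
    await generateAudioStream(failingClient, {});
  } catch (err) {
    console.log(err.message);
  }
}

demo();
```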
````diff
@@ -229,19 +216,12 @@ Your app's _package.json_ file will be updated with the dependencies.
 });
 
 ```
-
-The import of `"openai/shims/node"` is necessary when running the code in a Node.js environment. It ensures that the output type of the `client.audio.speech.create` method is correctly set to `NodeJS.ReadableStream`.
-
-1. Build the application with the following command:
-
-```console
-tsc
-```
 
-1. Run the application with the following command:
+1. Run the script with the following command:
 
 ```console
 node Text-to-speech.js
 ```
+
 
 ---
````
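The diff above swaps the client's `apiKey` option for an `azureADTokenProvider`. A minimal sketch of the two option shapes side by side (no network calls; the placeholder token provider stands in for the one the sample builds with `getBearerTokenProvider`):

```javascript
// The old configuration authenticates with a static API key.
const keyOptions = {
  endpoint: "<endpoint>",
  apiKey: "<api key>",
  apiVersion: "2024-08-01-preview",
  deployment: "tts",
};

// The new configuration replaces the key with an async token provider.
// The provider here is a placeholder; the sample builds the real one from
// DefaultAzureCredential and the Cognitive Services scope.
const keylessOptions = {
  endpoint: "<endpoint>",
  azureADTokenProvider: async () => "<bearer token>",
  apiVersion: "2024-08-01-preview",
  deployment: "tts",
};

console.log("key-based options:", Object.keys(keyOptions).join(", "));
console.log("keyless options:", Object.keys(keylessOptions).join(", "));
```

Everything except the credential option stays the same, which is why the rest of each sample is unchanged.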
Lines changed: 247 additions & 0 deletions
````diff
@@ -0,0 +1,247 @@
+---
+ms.topic: include
+manager: nitinme
+ms.service: azure-ai-openai
+ms.topic: include
+ms.date: 09/12/2024
+ms.reviewer: v-baolianzou
+ms.author: eur
+author: eric-urban
+recommendations: false
+---
+
+[Source code](https://github.com/openai/openai-node) | [Package (npm)](https://www.npmjs.com/package/openai) | [Samples](https://github.com/Azure/azure-sdk-for-js/tree/main/sdk/openai/openai/samples)
+
+## Prerequisites
+
+- An Azure subscription - [Create one for free](https://azure.microsoft.com/free/cognitive-services?azure-portal=true)
+- [LTS versions of Node.js](https://github.com/nodejs/release#release-schedule)
+- [TypeScript](https://www.typescriptlang.org/download/)
+- [Azure CLI](/cli/azure/install-azure-cli) used for passwordless authentication in a local development environment, create the necessary context by signing in with the Azure CLI.
+- An Azure OpenAI resource created in a supported region (see [Region availability](/azure/ai-services/openai/concepts/models#model-summary-table-and-region-availability)). For more information, see [Create a resource and deploy a model with Azure OpenAI](../how-to/create-resource.md).
+
+
+## Set up
+
+### Retrieve key and endpoint
+
+To successfully make a call against Azure OpenAI, you need an **endpoint** and a **key**.
+
+|Variable name | Value |
+|--------------------------|-------------|
+| `AZURE_OPENAI_ENDPOINT` | This value can be found in the **Keys & Endpoint** section when examining your resource from the Azure portal. Alternatively, you can find the value in the **Azure OpenAI Studio** > **Playground** > **Code View**. An example endpoint is: `https://aoai-docs.openai.azure.com/`.|
+| `AZURE_OPENAI_API_KEY` | This value can be found in the **Keys & Endpoint** section when examining your resource from the Azure portal. You can use either `KEY1` or `KEY2`.|
+
+Go to your resource in the Azure portal. The **Endpoint and Keys** can be found in the **Resource Management** section. Copy your endpoint and access key as you need both for authenticating your API calls. You can use either `KEY1` or `KEY2`. Always having two keys allows you to securely rotate and regenerate keys without causing a service disruption.
+
+:::image type="content" source="../media/quickstarts/endpoint.png" alt-text="Screenshot of the overview UI for an Azure OpenAI resource in the Azure portal with the endpoint & access keys location highlighted." lightbox="../media/quickstarts/endpoint.png":::
+
+### Environment variables
+
+Create and assign persistent environment variables for your key and endpoint.
+
+[!INCLUDE [Azure key vault](~/reusable-content/ce-skilling/azure/includes/ai-services/security/azure-key-vault.md)]
+
+# [Command Line](#tab/command-line)
+
+```CMD
+setx AZURE_OPENAI_API_KEY "REPLACE_WITH_YOUR_KEY_VALUE_HERE"
+```
+
+```CMD
+setx AZURE_OPENAI_ENDPOINT "REPLACE_WITH_YOUR_ENDPOINT_HERE"
+```
+
+# [PowerShell](#tab/powershell)
+
+```powershell
+[System.Environment]::SetEnvironmentVariable('AZURE_OPENAI_API_KEY', 'REPLACE_WITH_YOUR_KEY_VALUE_HERE', 'User')
+```
+
+```powershell
+[System.Environment]::SetEnvironmentVariable('AZURE_OPENAI_ENDPOINT', 'REPLACE_WITH_YOUR_ENDPOINT_HERE', 'User')
+```
+
+# [Bash](#tab/bash)
+
+```Bash
+echo export AZURE_OPENAI_API_KEY="REPLACE_WITH_YOUR_KEY_VALUE_HERE" >> /etc/environment && source /etc/environment
+```
+
+```Bash
+echo export AZURE_OPENAI_ENDPOINT="REPLACE_WITH_YOUR_ENDPOINT_HERE" >> /etc/environment && source /etc/environment
+```
+---
+
+## Create a Node application
+
+In a console window (such as cmd, PowerShell, or Bash), create a new directory for your app, and navigate to it. Then run the `npm init` command to create a node application with a _package.json_ file.
+
+```console
+npm init
+```
+
+## Install the client library
+
+Install the client libraries with:
+
+```console
+npm install openai @azure/identity
+```
+
+Your app's _package.json_ file will be updated with the dependencies.
+
+## Create a speech file
+
+
+
+#### [Microsoft Entra ID](#tab/typescript-keyless)
+
+1. Create a new file named _Text-to-speech.ts_ and open it in your preferred code editor. Copy the following code into the _Text-to-speech.ts_ file:
+
+```typescript
+import { writeFile } from "fs/promises";
+import { AzureOpenAI } from "openai";
+import { DefaultAzureCredential, getBearerTokenProvider } from "@azure/identity";
+import type { SpeechCreateParams } from "openai/resources/audio/speech";
+import "openai/shims/node";
+
+// You will need to set these environment variables or edit the following values
+const endpoint = process.env["AZURE_OPENAI_ENDPOINT"] || "<endpoint>";
+const speechFilePath = "<path to save the speech file>";
+
+// Required Azure OpenAI deployment name and API version
+const deploymentName = "tts";
+const apiVersion = "2024-08-01-preview";
+
+// keyless authentication
+const credential = new DefaultAzureCredential();
+const scope = "https://cognitiveservices.azure.com/.default";
+const azureADTokenProvider = getBearerTokenProvider(credential, scope);
+
+function getClient(): AzureOpenAI {
+  return new AzureOpenAI({
+    endpoint,
+    azureADTokenProvider,
+    apiVersion,
+    deployment: deploymentName,
+  });
+}
+
+async function generateAudioStream(
+  client: AzureOpenAI,
+  params: SpeechCreateParams
+): Promise<NodeJS.ReadableStream> {
+  const response = await client.audio.speech.create(params);
+  if (response.ok) return response.body;
+  throw new Error(`Failed to generate audio stream: ${response.statusText}`);
+}
+export async function main() {
+  console.log("== Text to Speech Sample ==");
+
+  const client = getClient();
+  const streamToRead = await generateAudioStream(client, {
+    model: deploymentName,
+    voice: "alloy",
+    input: "the quick brown chicken jumped over the lazy dogs",
+  });
+
+  console.log(`Streaming response to ${speechFilePath}`);
+  await writeFile(speechFilePath, streamToRead);
+  console.log("Finished streaming");
+}
+
+main().catch((err) => {
+  console.error("The sample encountered an error:", err);
+});
+
+```
+
+The import of `"openai/shims/node"` is necessary when running the code in a Node.js environment. It ensures that the output type of the `client.audio.speech.create` method is correctly set to `NodeJS.ReadableStream`.
+
+1. Build the application with the following command:
+
+```console
+tsc
+```
+
+1. Run the application with the following command:
+
+```console
+node Text-to-speech.js
+```
+
+
+#### [API key](#tab/typescript-key)
+
+1. Create a new file named _Text-to-speech.ts_ and open it in your preferred code editor. Copy the following code into the _Text-to-speech.ts_ file:
+
+```typescript
+import { writeFile } from "fs/promises";
+import { AzureOpenAI } from "openai";
+import type { SpeechCreateParams } from "openai/resources/audio/speech";
+import "openai/shims/node";
+
+// You will need to set these environment variables or edit the following values
+const endpoint = "<endpoint>";
+const apiKey = process.env["AZURE_OPENAI_API_KEY"] || "<api key>";
+const speechFilePath =
+  process.env["SPEECH_FILE_PATH"] || "<path to save the speech file>";
+
+// Required Azure OpenAI deployment name and API version
+const deploymentName = "tts";
+const apiVersion = "2024-08-01-preview";
+
+function getClient(): AzureOpenAI {
+  return new AzureOpenAI({
+    endpoint,
+    apiKey,
+    apiVersion,
+    deployment: deploymentName,
+  });
+}
+
+async function generateAudioStream(
+  client: AzureOpenAI,
+  params: SpeechCreateParams
+): Promise<NodeJS.ReadableStream> {
+  const response = await client.audio.speech.create(params);
+  if (response.ok) return response.body;
+  throw new Error(`Failed to generate audio stream: ${response.statusText}`);
+}
+export async function main() {
+  console.log("== Text to Speech Sample ==");
+
+  const client = getClient();
+  const streamToRead = await generateAudioStream(client, {
+    model: deploymentName,
+    voice: "alloy",
+    input: "the quick brown chicken jumped over the lazy dogs",
+  });
+
+  console.log(`Streaming response to ${speechFilePath}`);
+  await writeFile(speechFilePath, streamToRead);
+  console.log("Finished streaming");
+}
+
+main().catch((err) => {
+  console.error("The sample encountered an error:", err);
+});
+
+```
+
+The import of `"openai/shims/node"` is necessary when running the code in a Node.js environment. It ensures that the output type of the `client.audio.speech.create` method is correctly set to `NodeJS.ReadableStream`.
+
+1. Build the application with the following command:
+
+```console
+tsc
+```
+
+1. Run the application with the following command:
+
+```console
+node Text-to-speech.js
+```
+
+---
````
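Both new samples build an `azureADTokenProvider` by passing a credential and a scope to `getBearerTokenProvider`. A rough sketch of that contract with a stand-in credential (the real `@azure/identity` implementation also caches and refreshes tokens; `fakeCredential` is not a real Azure credential):

```javascript
// Sketch of the contract: given a credential and a scope, return an async
// function that yields a bearer token string for each call.
function getBearerTokenProvider(credential, scope) {
  return async () => {
    const { token } = await credential.getToken(scope);
    return token;
  };
}

// Stand-in for DefaultAzureCredential, which would normally resolve the
// Azure CLI sign-in context on a development machine.
const fakeCredential = {
  getToken: async (scope) => ({ token: `fake-token-for:${scope}` }),
};

const provider = getBearerTokenProvider(
  fakeCredential,
  "https://cognitiveservices.azure.com/.default"
);

provider().then((token) => console.log(token));
```

The `AzureOpenAI` client calls the provider whenever it needs a fresh bearer token, which is why no key ever appears in the keyless samples.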

0 commit comments
