Commit 3e4008e

use audio content creation in ai foundry
1 parent 201f17b commit 3e4008e

3 files changed: +305 -178 lines changed

articles/ai-services/speech-service/how-to-audio-content-creation.md

Lines changed: 10 additions & 178 deletions
@@ -8,11 +8,12 @@ ms.service: azure-ai-speech
 ms.topic: how-to
 ms.date: 7/31/2025
 ms.author: eur
+zone_pivot_groups: foundry-speech-studio
 ---
 
 # Text to speech with the Audio Content Creation tool
 
-You can use the [Audio Content Creation](https://speech.microsoft.com/portal/audiocontentcreation) tool in Speech Studio for text to speech without writing any code. The Audio Content Creation tool might provide the final speech audio that you want. You can use the output audio as-is, or as a starting point for further customization.
+You can use the Audio Content Creation tool in [Azure AI Foundry portal](https://ai.azure.com/?cid=learnDocs) or [Speech Studio](https://speech.microsoft.com/portal/audiocontentcreation) for text to speech without writing any code. The Audio Content Creation tool might provide the final speech audio that you want. You can use the output audio as-is, or as a starting point for further customization.
 
 Build highly natural audio content for various scenarios, such as audiobooks, news broadcasts, video narrations, and chat bots. With Audio Content Creation, you can efficiently fine-tune text to speech voices and design customized audio experiences.
 
@@ -23,186 +24,17 @@ The tool is based on [Speech Synthesis Markup Language (SSML)](speech-synthesis-
 
 You have easy access to a broad portfolio of [languages and voices](language-support.md?tabs=tts). These voices include state-of-the-art standard voices and your custom voice, if you built one.
 
-## Prerequisites
+The Audio Content Creation tool is free to access; you pay only for Speech service usage.
 
-- An active Azure subscription. [Create one for free](https://azure.microsoft.com/free/ai-services).
-- Permission to create resources in your subscription.
-- A Speech resource. Create one in the [Azure portal](https://portal.azure.com/#create/Microsoft.CognitiveServicesSpeechServices) or [Speech Studio](https://aka.ms/speechstudio).
+::: zone pivot="ai-foundry-portal"
+[!INCLUDE [AI Foundry include](includes/how-to/audio-content-creation/ai-foundry.md)]
+::: zone-end
 
-> [!NOTE]
-> The [AI Foundry resource type](../multi-service-resource.md) isn't supported in Speech Studio.
+::: zone pivot="speech-studio"
+[!INCLUDE [Speech Studio include](includes/how-to/audio-content-creation/speech-studio.md)]
+::: zone-end
 
-Sign in with your Azure account to use the Audio Content Creation tool. The tool is free to access; you pay only for Speech service usage.
-
-## Use the Audio Content Creation tool
-
-The following diagram displays the process for fine-tuning the text to speech outputs.
-
-:::image type="content" source="media/audio-content-creation/audio-content-creation-diagram.jpg" alt-text="Diagram of the sequence of steps for fine-tuning text to speech outputs.":::
-
-To use the Audio Content Creation tool, do the following:
-
-1. Sign in to [Speech Studio](https://aka.ms/speechstudio/), and then select **Audio Content Creation**.
-
-1. Select the Azure subscription and the Speech resource you want to work with, and then select **Use resource**.
-
-> [!NOTE]
-> If you're returning to Audio Content Creation, you can select a different Speech resource that you want to work with. Go to your account settings at the top right corner of the page.
-
-1. [Create an audio tuning file](#create-an-audio-tuning-file) by using plain text or SSML scripts. Enter or upload your content into Audio Content Creation.
-1. Choose the voice and the language for your script content. Audio Content Creation includes all of the [standard text to speech voices](language-support.md?tabs=tts). You can use standard voices or a custom voice.
-
-> [!NOTE]
-> Gated access is available for custom voice, which allows you to create high-definition voices that are similar to natural-sounding speech. For more information, see [Gating process](./text-to-speech.md).
-
-1. Select the content you want to preview, and then select **Play** (via the triangle icon) to preview the default synthesis output.
-
-If you make any changes to the text, select the **Stop** icon, and then select **Play** again to regenerate the audio with changed scripts.
-
-Improve the output by adjusting pronunciation, break, pitch, rate, intonation, voice style, and more. For a complete list of options, see [Speech Synthesis Markup Language](speech-synthesis-markup.md).
-
-For more information about adjusting the speech output, see the [how to convert text to speech video on YouTube](https://youtu.be/ygApYuOOG6w). However, the video might not be available in all regions and might not be up to date by the time you watch it.
-
-1. Save and [export your tuned audio](#export-tuned-audio).
-
-When you save the tuning track in the system, you can continue to work and iterate on the output. When you're satisfied with the output, you can create an audio creation task with the export feature. You can observe the status of the export task and download the output for use with your apps and products.
-
-## Create an audio tuning file
-
-You can get your content into the Audio Content Creation tool in either of two ways:
-
-### Option 1: Create a new audio tuning file
-
-1. Select **New** > **Text file** to create a new audio tuning file.
-
-1. Enter or paste your content into the editing window. The allowable number of characters for each file is 20,000 or fewer. If your script contains more than 20,000 characters, you can use Option 2 to automatically split your content into multiple files.
-
-1. Select **Save**.
-
-### Option 2: Upload an audio tuning file
-
-1. Select **Upload** > **Text file** to import one or more text files. Both plain text and SSML are supported.
-
-If your script file is more than 20,000 characters, split the content by paragraphs, by characters, or by regular expressions.
-
-1. When you upload your text files, make sure that they meet these requirements:
-
-| Property | Description |
-|----------|---------------|
-| File format | Plain text (.txt) or SSML text (.txt)<br/><br/>Zip files aren't supported. |
-| Encoding format | UTF-8 |
-| File name | Each file must have a unique name. Duplicate files aren't supported. |
-| Text length | Character limit is 20,000. If your files exceed the limit, split them according to the instructions in the tool. |
-| SSML restrictions | Each SSML file can contain only a single piece of SSML. |
-
-
-Here's a plain text example:
-
-```txt
-Welcome to use Audio Content Creation to customize audio output for your products.
-```
-
-Here's an SSML example:
-
-```xml
-<speak xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="http://www.w3.org/2001/mstts" version="1.0" xml:lang="en-US">
-<voice name="en-US-AvaMultilingualNeural">
-Welcome to use Audio Content Creation <break time="10ms" />to customize audio output for your products.
-</voice>
-</speak>
-```
-
-## Export tuned audio
-
-After you review your audio output and are satisfied with your tuning and adjustment, you can export the audio.
-
-1. Select **Export** to create an audio creation task.
-
-We recommend **Export to Audio library** to easily store, find, and search audio output in the cloud. You can better integrate with your applications through Azure blob storage. You can also download the audio to your local disk directly.
-
-1. Choose the output format for your tuned audio. The **supported audio formats and sample rates** are listed in the following table:
-
-| Format | 8 kHz sample rate | 16 kHz sample rate | 24 kHz sample rate | 48 kHz sample rate |
-|--- |--- |--- |--- |--- |
-| wav | riff-8khz-16bit-mono-pcm | riff-16khz-16bit-mono-pcm | riff-24khz-16bit-mono-pcm |riff-48khz-16bit-mono-pcm |
-| mp3 | N/A | audio-16khz-128kbitrate-mono-mp3 | audio-24khz-160kbitrate-mono-mp3 |audio-48khz-192kbitrate-mono-mp3 |
-
-1. To view the status of the task, select the **Task list** tab.
-
-If the task fails, see the detailed information page for a full report.
-
-1. When the task is complete, your audio is available for download on the **Audio library** pane.
-
-1. Select the file you want to download and **Download**.
-
-Now you're ready to use your custom tuned audio in your apps or products.
-
-## Configure BYOS and anonymous public read access for blobs
-
-If you lose access permission to your Bring Your Own Storage (BYOS), you can't view, create, edit, or delete files. To resume your access, you need to remove the current storage and reconfigure the BYOS in the [Azure portal](https://portal.azure.com/#allservices). To learn more about how to configure BYOS, see [Mount Azure Storage as a local share in App Service](/azure/app-service/configure-connect-to-azure-storage?pivots=container-linux&tabs=portal).
-
-After configuring the BYOS permission, you need to configure anonymous public read access for related containers and blobs. Otherwise, blob data isn't available for public access and your lexicon file in the blob is inaccessible. By default, a container’s public access setting is disabled. To grant anonymous users read access to a container and its blobs, first set **Allow Blob anonymous access** to **Enabled** to allow public access for the storage account, then set the container's (named **acc-public-files**) public access level (**anonymous read access for blobs only**). To learn more about how to configure anonymous public read access, see [Configure anonymous public read access for containers and blobs](/azure/storage/blobs/anonymous-read-access-configure?tabs=portal).
-
-## Add or remove Audio Content Creation users
-
-If more than one user wants to use Audio Content Creation, you can grant them access to the Azure subscription and the Speech resource. If you add users to an Azure subscription, they can access all the resources under the Azure subscription. But if you add users to a Speech resource only, they only have access to the Speech resource and not to other resources under this Azure subscription. Users with access to the Speech resource can use the Audio Content Creation tool.
-
-The users you grant access to need to set up a [Microsoft account](https://account.microsoft.com/account). If they don' have a Microsoft account, they can create one in just a few minutes. They can use their existing email and link it to a Microsoft account, or they can create and use an Outlook email address as a Microsoft account.
-
-### Add users to a Speech resource
-
-To add users to a Speech resource so that they can use Audio Content Creation, do the following:
-
-1. In the [Azure portal](https://portal.azure.com/), select **All services** from the left pane, and then search for **Azure AI services** or **Speech**.
-1. Select your Speech resource.
-
-> [!NOTE]
-> You can also set up Azure RBAC for whole resource groups, subscriptions, or management groups. Do this by selecting the desired scope level and then navigating to the desired item (for example, selecting **Resource groups** and then selecting your resource group).
-
-1. Select **Access control (IAM)** on the left pane.
-1. Select **Add** > **Add role assignment**.
-1. On the **Role** tab on the next screen, select a role (such as **Owner**) that you want to add.
-1. On the **Members** tab, enter a user's email address and select the user's name in the directory. The email address must be linked to a Microsoft account that's trusted by Microsoft Entra ID. Users can easily sign up for a [Microsoft account](https://account.microsoft.com/account) by using their personal email address.
-1. On the **Review + assign** tab, select **Review + assign** to assign the role.
-
-Here's what happens next:
-
-1. An email invitation is automatically sent to users.
-
-> [!NOTE]
-> If users don't receive the invitation email, you can search for their account under **Role assignments** and go into their profile. Look for **Identity** > **Invitation accepted**, and select **(manage)** to resend the email invitation. You can also copy and send the invitation link to them.
-
-1. They can accept it by selecting **Accept invitation** > **Accept to join Azure** in their email.
-1. They're then redirected to the Azure portal. They don't need to take further action in the Azure portal.
-1. After a few moments, users are assigned the role at the Speech resource scope, which gives them access to this Speech resource.
-
-Users now visit or refresh the [Audio Content Creation](https://aka.ms/audiocontentcreation) product page, and sign in with their Microsoft account. They select **Audio Content Creation** block among all speech products. They choose the Speech resource in the pop-up window or in the settings at the upper right.
-
-If they can't find the available Speech resource, they can check to ensure that they're in the right directory. To do so, they select the account profile at the upper right and then select **Switch** next to **Current directory**. If there's more than one directory available, it means they have access to multiple directories. They can switch to different directories and go to **Settings** to see whether the right Speech resource is available.
-
-Users who are in the same Speech resource see each other's work in the Audio Content Creation tool. If you want each individual user to have a unique and private workplace in Audio Content Creation, create a new Speech resource.
-
-### Remove users from a Speech resource
-
-To remove a user's permission from a Speech resource, do the following:
-1. Search for **Azure AI services** in the Azure portal, select the Speech resource that you want to remove users from.
-1. Select **Access control (IAM)**, and then select the **Role assignments** tab to view all the role assignments for this Speech resource.
-1. Select the users you want to remove, select **Remove**, and then select **OK**.
-
-:::image type="content" source="media/audio-content-creation/remove-user.png" alt-text="Screenshot of the 'Remove' button on the 'Remove role assignments' pane.":::
-
-### Enable users to grant access to others
-
-If you want to allow a user to grant access to other users, you need to assign them the owner role for the Speech resource and set the user as the Azure directory reader.
-1. Add the user as the owner of the Speech resource. For more information, see [Add users to a Speech resource](#add-users-to-a-speech-resource).
-
-:::image type="content" source="media/audio-content-creation/add-role.png" alt-text="Screenshot showing the 'Owner' role on the 'Add role assignment' pane. ":::
-
-1. In the [Azure portal](https://portal.azure.com/), select the collapsed menu at the upper left, select **Microsoft Entra ID**, and then select **Users**.
-1. Search for the user's Microsoft account, go to their detail page, and then select **Assigned roles**.
-1. Select **Add assignments** > **Directory Readers**. If the **Add assignments** button is unavailable, it means that you don't have access. You must have the role of **Owner** or **User Access Administrator** to assign roles to users.
-
-## Next steps
+## Related content
 
 - [Speech Synthesis Markup Language (SSML)](speech-synthesis-markup.md)
 - [Batch synthesis](batch-synthesis.md)
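
Note on the upload guidance that this diff moves out of the main article: it caps each text file at 20,000 characters and suggests splitting longer scripts by paragraphs or by characters. The following is a minimal Python sketch of that pre-splitting step, not the tool's own splitter. The function name `split_script`, the blank-line paragraph separator, and the fallback of cutting an oversized paragraph by characters are all assumptions made for illustration. It also assumes that Python's `len` counts characters the same way the tool does, which the article doesn't confirm.

```python
MAX_CHARS = 20_000  # per-file character limit stated in the moved guidance


def split_script(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters.

    Hypothetical helper: paragraphs (separated by blank lines) are packed
    greedily into chunks; a single paragraph longer than `limit` is cut
    by characters.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    chunks: list[str] = []
    current = ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        # Oversized paragraph: flush what we have, then cut by characters.
        while len(paragraph) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(paragraph[:limit])
            paragraph = paragraph[limit:]
        # Try to append the paragraph to the chunk being built.
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk could then be written to its own uniquely named `.txt` file with `encoding="utf-8"`, matching the file-name and encoding requirements in the removed table.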
