Commit bcf0325

Merge pull request #212458 from eric-urban/eur/audio-content-creation
ACC no code approach for TTS
2 parents 613feda + 26e2a94 commit bcf0325

4 files changed: +21 -10 lines changed

articles/cognitive-services/Speech-Service/how-to-audio-content-creation.md

Lines changed: 11 additions & 6 deletions
@@ -1,30 +1,35 @@
 ---
 title: Audio Content Creation - Speech service
 titleSuffix: Azure Cognitive Services
-description: Audio Content Creation is an online tool that allows you to customize and fine-tune Microsoft's text-to-speech output for your apps and products.
+description: Audio Content Creation is an online tool that allows you to run text-to-speech synthesis without writing any code.
 services: cognitive-services
 author: eric-urban
 manager: nitinme
 ms.service: cognitive-services
 ms.subservice: speech-service
 ms.topic: how-to
-ms.date: 01/23/2022
+ms.date: 09/25/2022
 ms.author: eur
 ---

-# Improve synthesis with the Audio Content Creation tool
+# Speech synthesis with the Audio Content Creation tool

-[Audio Content Creation](https://aka.ms/audiocontentcreation) is an easy-to-use and powerful tool that lets you build highly natural audio content for a variety of scenarios, such as audiobooks, news broadcasts, video narrations, and chat bots. With Audio Content Creation, you can fine-tune text-to-speech voices and design customized audio experiences in an efficient and low-cost way.
+You can use the [Audio Content Creation](https://aka.ms/audiocontentcreation) tool in Speech Studio for text-to-speech synthesis without writing any code. You can use the output audio as-is, or as a starting point for further customization.
+
+Build highly natural audio content for a variety of scenarios, such as audiobooks, news broadcasts, video narrations, and chat bots. With Audio Content Creation, you can efficiently fine-tune text-to-speech voices and design customized audio experiences.

 The tool is based on [Speech Synthesis Markup Language (SSML)](speech-synthesis-markup.md). It allows you to adjust text-to-speech output attributes in real time or batch synthesis, such as voice characters, voice styles, speaking speed, pronunciation, and prosody.

+- No-code approach: You can use the Audio Content Creation tool for text-to-speech synthesis without writing any code. The output audio might be the final deliverable that you want. For example, you can use the output audio for a podcast or a video narration.
+- Developer-friendly: You can listen to the output audio and adjust the SSML to improve speech synthesis. Then you can use the [Speech SDK](speech-sdk.md) or [Speech CLI](spx-basics.md) to integrate the SSML into your applications. For example, you can use the SSML for building a chat bot.
+
 You have easy access to a broad portfolio of [languages and voices](language-support.md?tabs=stt-tts). These voices include state-of-the-art prebuilt neural voices and your custom neural voice, if you've built one.

-To learn more, view the [Audio Content Creation tutorial video](https://youtu.be/ygApYuOOG6w).
+To learn more, view the Audio Content Creation tutorial video [on YouTube](https://youtu.be/ygApYuOOG6w).

 ## Get started

-Audio Content Creation is a free tool, but you'll pay for the Speech service that you consume. To work with the tool, you need to sign in with an Azure account and create a Speech resource. For each Azure account, you have free monthly speech quotas, which include 0.5 million characters for prebuilt neural voices (referred to as *Neural* on the [pricing page](https://aka.ms/speech-pricing)). The monthly allotted amount is usually enough for a small content team of around 3-5 people.
+The Audio Content Creation tool in Speech Studio is free to access, but you'll pay for Speech service usage. To work with the tool, you need to sign in with an Azure account and create a Speech resource. For each Azure account, you have free monthly speech quotas, which include 0.5 million characters for prebuilt neural voices (referred to as *Neural* on the [pricing page](https://aka.ms/speech-pricing)). The monthly allotted amount is usually enough for a small content team of around 3-5 people.

 The next sections cover how to create an Azure account and get a Speech resource.

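The "developer-friendly" path described in the added bullets (tune the SSML in the tool, then reuse it from your own application) might look roughly like the following minimal sketch. It only assembles an SSML string of the kind the tool works with and checks that it is well-formed XML; it does not call the Speech service. The helper name `build_ssml`, the voice name `en-US-JennyNeural`, and the prosody values are illustrative assumptions, not part of this commit.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

# Namespace defined by the W3C SSML 1.0 recommendation.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(text, voice="en-US-JennyNeural", rate="default",
               pitch="default", lang="en-US"):
    """Return an SSML document as a string (hypothetical helper).

    The voice name and prosody defaults are assumptions chosen for
    illustration; substitute whatever you settled on in the tool.
    """
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" xml:lang={quoteattr(lang)}>'
        f'<voice name={quoteattr(voice)}>'
        f'<prosody rate={quoteattr(rate)} pitch={quoteattr(pitch)}>'
        f'{escape(text)}'  # escape &, <, > so the document stays well formed
        f'</prosody></voice></speak>'
    )


if __name__ == "__main__":
    ssml = build_ssml("Welcome to the show.", rate="-10%", pitch="+5%")
    ET.fromstring(ssml)  # raises ParseError if the markup is malformed
    print(ssml)
```

A string built this way could then be handed to the Speech SDK or Speech CLI, which the changed text names as the integration route.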
articles/cognitive-services/Speech-Service/speech-studio-overview.md

Lines changed: 2 additions & 2 deletions
@@ -8,7 +8,7 @@ manager: nitinme
 ms.service: cognitive-services
 ms.subservice: speech-service
 ms.topic: overview
-ms.date: 05/13/2022
+ms.date: 09/25/2022
 ms.author: eur
 ---

@@ -30,7 +30,7 @@ In Speech Studio, the following Speech service features are available as project

 * [Custom Voice](https://aka.ms/speechstudio/customvoice): Create custom, one-of-a-kind voices for text-to-speech. You supply audio files and create matching transcriptions in Speech Studio, and then use the custom voices in your applications. To create and use custom voices via endpoints, see [Create and use your voice model](how-to-custom-voice-create-voice.md).

-* [Audio Content Creation](https://aka.ms/speechstudio/audiocontentcreation): Build highly natural audio content for a variety of scenarios, such as audiobooks, news broadcasts, video narrations, and chat bots, with the easy-to-use [Audio Content Creation](how-to-audio-content-creation.md) tool. With Speech Studio, you can export these audio files to use in your applications.
+* [Audio Content Creation](https://aka.ms/speechstudio/audiocontentcreation): A no-code approach for text-to-speech synthesis. You can use the output audio as-is, or as a starting point for further customization. You can build highly natural audio content for a variety of scenarios, such as audiobooks, news broadcasts, video narrations, and chat bots. For more information, see the [Audio Content Creation](how-to-audio-content-creation.md) documentation.

 * [Custom Keyword](https://aka.ms/speechstudio/customkeyword): A custom keyword is a word or short phrase that you can use to voice-activate a product. You create a custom keyword in Speech Studio, and then generate a binary file to [use with the Speech SDK](custom-keyword-basics.md) in your applications.

articles/cognitive-services/Speech-Service/speech-synthesis-markup.md

Lines changed: 3 additions & 0 deletions
@@ -18,6 +18,9 @@ ms.custom: "devx-track-js, devx-track-csharp"

 Speech Synthesis Markup Language (SSML) is an XML-based markup language that lets developers specify how input text is converted into synthesized speech by using text-to-speech. Compared to plain text, SSML allows developers to fine-tune the pitch, pronunciation, speaking rate, volume, and more of the text-to-speech output. Normal punctuation, such as pausing after a period, or using the correct intonation when a sentence ends with a question mark are automatically handled.

+> [!TIP]
+> Author plain text and SSML using the [Audio Content Creation](https://aka.ms/audiocontentcreation) tool in Speech Studio. You can listen to the output audio and adjust the SSML to improve speech synthesis. For more information, see [Speech synthesis with the Audio Content Creation tool](how-to-audio-content-creation.md).
+
 The Speech service implementation of SSML is based on the World Wide Web Consortium's [Speech Synthesis Markup Language Version 1.0](https://www.w3.org/TR/2004/REC-speech-synthesis-20040907/).

 > [!IMPORTANT]

articles/cognitive-services/Speech-Service/text-to-speech.md

Lines changed: 5 additions & 2 deletions
@@ -8,7 +8,7 @@ manager: nitinme
 ms.service: cognitive-services
 ms.subservice: speech-service
 ms.topic: overview
-ms.date: 09/16/2022
+ms.date: 09/25/2022
 ms.author: eur
 ms.custom: cog-serv-seo-aug-2020
 keywords: text to speech
@@ -51,7 +51,7 @@ Here's more information about neural text-to-speech features in the Speech servi

 * **Fine-tuning text-to-speech output with SSML**: Speech Synthesis Markup Language (SSML) is an XML-based markup language that's used to customize text-to-speech outputs. With SSML, you can adjust pitch, add pauses, improve pronunciation, change speaking rate, adjust volume, and attribute multiple voices to a single document.

-You can use SSML to define your own lexicons or switch to different speaking styles. With the [multilingual voices](https://techcommunity.microsoft.com/t5/azure-ai/azure-text-to-speech-updates-at-build-2021/ba-p/2382981), you can also adjust the speaking languages via SSML. To fine-tune the voice output for your scenario, see [Improve synthesis with Speech Synthesis Markup Language](speech-synthesis-markup.md).
+You can use SSML to define your own lexicons or switch to different speaking styles. With the [multilingual voices](https://techcommunity.microsoft.com/t5/azure-ai/azure-text-to-speech-updates-at-build-2021/ba-p/2382981), you can also adjust the speaking languages via SSML. To fine-tune the voice output for your scenario, see [Improve synthesis with Speech Synthesis Markup Language](speech-synthesis-markup.md) and [Speech synthesis with the Audio Content Creation tool](how-to-audio-content-creation.md).

 * **Visemes**: [Visemes](how-to-speech-synthesis-viseme.md) are the key poses in observed speech, including the position of the lips, jaw, and tongue in producing a particular phoneme. Visemes have a strong correlation with voices and phonemes.

@@ -66,6 +66,9 @@ Here's more information about neural text-to-speech features in the Speech servi

 To get started with text-to-speech, see the [quickstart](get-started-text-to-speech.md). Text-to-speech is available via the [Speech SDK](speech-sdk.md), the [REST API](rest-text-to-speech.md), and the [Speech CLI](spx-overview.md).

+> [!TIP]
+> To convert text-to-speech with a no-code approach, try the [Audio Content Creation](how-to-audio-content-creation.md) tool in [Speech Studio](https://aka.ms/speechstudio/audiocontentcreation).
+
 ## Sample code

 Sample code for text-to-speech is available on GitHub. These samples cover text-to-speech conversion in most popular programming languages:
