
Commit 4b7adf5

committed
initial toc structure
1 parent b9fc3d5 commit 4b7adf5

6 files changed, +236 -0 lines changed

Lines changed: 16 additions & 0 deletions

@@ -0,0 +1,16 @@
---
author: eric-urban
ms.service: cognitive-services
ms.subservice: speech-service
ms.date: 02/28/2023
ms.topic: include
ms.author: eur
---

> [!div class="checklist"]
> * Azure subscription - [Create one for free](https://azure.microsoft.com/free/cognitive-services)
> * [Create an Azure OpenAI resource](https://portal.azure.com/#create/Microsoft.CognitiveServicesOpenAI) in the Azure portal.
> * Deploy a [model](/azure/cognitive-services/openai/concepts/models) in your Azure OpenAI Service resource. For more information about model deployment, see the Azure OpenAI [resource deployment guide](/azure/cognitive-services/openai/how-to/create-resource).
> * Get the Azure OpenAI resource key and endpoint. After your Azure OpenAI resource is deployed, select **Go to resource** to view and manage keys. For more information about Cognitive Services resources, see [Get the keys for your resource](~/articles/cognitive-services/cognitive-services-apis-create-account.md#get-the-keys-for-your-resource).
> * [Create a Speech resource](https://portal.azure.com/#create/Microsoft.CognitiveServicesSpeechServices) in the Azure portal.
> * Get the Speech resource key and region. After your Speech resource is deployed, select **Go to resource** to view and manage keys. For more information about Cognitive Services resources, see [Get the keys for your resource](~/articles/cognitive-services/cognitive-services-apis-create-account.md#get-the-keys-for-your-resource).
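
If you prefer the command line to the portal, you can also list a resource's keys with the Azure CLI. This is a hedged sketch rather than part of the checklist include: the resource name and resource group are placeholders that you replace with your own values.

```console
az cognitiveservices account keys list --name YOUR_RESOURCE_NAME --resource-group YOUR_RESOURCE_GROUP
```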

Lines changed: 10 additions & 0 deletions

@@ -0,0 +1,10 @@
---
author: eric-urban
ms.service: cognitive-services
ms.subservice: speech-service
ms.topic: include
ms.date: 01/25/2022
ms.author: eur
---

You can use the [Azure portal](~/articles/cognitive-services/cognitive-services-apis-create-account.md#clean-up-resources) or [Azure Command Line Interface (CLI)](~/articles/cognitive-services/cognitive-services-apis-create-account-cli.md#clean-up-resources) to remove the Azure OpenAI and Speech resources you created.
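
As a hedged sketch of the CLI path (the resource names and resource group are placeholders, and these commands aren't part of the include itself), deleting the two resources might look like this:

```console
az cognitiveservices account delete --name YOUR_OPENAI_RESOURCE_NAME --resource-group YOUR_RESOURCE_GROUP
az cognitiveservices account delete --name YOUR_SPEECH_RESOURCE_NAME --resource-group YOUR_RESOURCE_GROUP
```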

Lines changed: 17 additions & 0 deletions

@@ -0,0 +1,17 @@
---
author: eric-urban
ms.service: cognitive-services
ms.topic: include
ms.date: 02/28/2023
ms.author: eur
---

You can run this Python script to converse with Azure OpenAI. Although the experience is a back-and-forth exchange, Azure OpenAI doesn't remember the context of your conversation.

Speak into the microphone to start a conversation with OpenAI:
- Azure Cognitive Services Speech recognizes your speech and converts it into text (speech-to-text).
- Your request is sent as text to the Azure OpenAI service.
- Azure Cognitive Services Speech synthesizes the response from Azure OpenAI (text-to-speech) and plays it through the default speaker.

> [!IMPORTANT]
> To complete the steps in this guide, your Azure subscription must be granted access to the Azure OpenAI service. Currently, access to this service is granted only by application. You can apply for access by completing the form at [https://aka.ms/oai/access](https://aka.ms/oai/access). If you run into a problem, open an issue on this repo to contact us.

Lines changed: 164 additions & 0 deletions

@@ -0,0 +1,164 @@
---
author: eric-urban
ms.service: cognitive-services
ms.topic: include
ms.date: 02/28/2023
ms.author: eur
---

[!INCLUDE [Header](../../common/python.md)]

[!INCLUDE [Introduction](intro.md)]

## Prerequisites

[!INCLUDE [Prerequisites](../../common/azure-prerequisites-openai.md)]

## Set up the environment

The Speech SDK for Python is available as a [Python Package Index (PyPI) module](https://pypi.org/project/azure-cognitiveservices-speech/) and is compatible with Windows, Linux, and macOS.
- On Windows, you must install the [Microsoft Visual C++ Redistributable for Visual Studio 2015, 2017, 2019, and 2022](/cpp/windows/latest-supported-vc-redist?view=msvc-170&preserve-view=true) for your platform. Installing this package for the first time might require a restart.
- On Linux, you must use the x64 target architecture.

Install a version of [Python from 3.7 to 3.10](https://www.python.org/downloads/). First check the [SDK installation guide](../../../quickstarts/setup-platform.md?pivots=programming-language-python) for any further requirements.

This guide also uses the `os`, `requests`, and `json` Python libraries. The `os` and `json` modules are part of the Python standard library; you can install `requests` with `pip install requests` if it isn't already present.

### Set environment variables

[!INCLUDE [Environment variables](../../common/environment-variables.md)]
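
The environment variables include isn't shown in this diff. As a hedged sketch (the variable names come from the code sample below, and the values are placeholders), on Windows you might set them with `setx`; on Linux and macOS, use `export` in your shell profile instead:

```console
setx OPEN_AI_KEY your-openai-key
setx OPEN_AI_ENDPOINT https://YOUR_OPEN_AI_RESOURCE_NAME.openai.azure.com/
setx SPEECH_KEY your-speech-key
setx SPEECH_REGION your-speech-region
```

After you run `setx`, open a new console window so the new values are picked up.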

## Recognize speech from a microphone

Follow these steps to create a new console application.

1. Open a command prompt where you want the new project, and create a new file named `openai-speech.py`.
1. Run this command to install the Speech SDK:
    ```console
    pip install azure-cognitiveservices-speech
    ```
1. Run this command to install the OpenAI SDK:
    ```console
    pip install openai
    ```
    > [!NOTE]
    > This library is maintained by OpenAI and is currently in preview. Refer to the [release history](https://github.com/openai/openai-python/releases) or the [version.py commit history](https://github.com/openai/openai-python/commits/main/openai/version.py) to track the latest updates to the library.
1. Copy the following code into `openai-speech.py`:

    ```Python
    import os
    import azure.cognitiveservices.speech as speechsdk
    import openai

    # This example requires environment variables named "OPEN_AI_KEY" and "OPEN_AI_ENDPOINT"
    # Your endpoint should look like the following: https://YOUR_OPEN_AI_RESOURCE_NAME.openai.azure.com/
    openai.api_key = os.environ.get('OPEN_AI_KEY')
    openai.api_base = os.environ.get('OPEN_AI_ENDPOINT')
    openai.api_type = 'azure'
    openai.api_version = '2022-12-01'

    # This corresponds to the custom name you chose for your deployment when you deployed a model.
    deployment_id='text-davinci-002'

    # This example requires environment variables named "SPEECH_KEY" and "SPEECH_REGION"
    speech_config = speechsdk.SpeechConfig(subscription=os.environ.get('SPEECH_KEY'), region=os.environ.get('SPEECH_REGION'))
    audio_output_config = speechsdk.audio.AudioOutputConfig(use_default_speaker=True)
    audio_config = speechsdk.audio.AudioConfig(use_default_microphone=True)

    # Should be the locale for the speaker's language.
    speech_config.speech_recognition_language="en-US"
    speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)

    # The language of the voice that responds on behalf of OpenAI.
    speech_config.speech_synthesis_voice_name='en-US-JennyMultilingualNeural'
    speech_synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=audio_output_config)

    # Prompts OpenAI with a request and synthesizes the response.
    def ask_openai(prompt):

        # Ask OpenAI
        print('OpenAI prompt:' + prompt)
        response = openai.Completion.create(engine=deployment_id, prompt=prompt, max_tokens=100)
        text = response['choices'][0]['text'].replace('\n', '').replace(' .', '.').strip()
        print('OpenAI response:' + text)

        # Azure text-to-speech output
        speech_synthesis_result = speech_synthesizer.speak_text_async(text).get()

        # Check result
        if speech_synthesis_result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
            print("Speech synthesized to speaker for text [{}]".format(text))
        elif speech_synthesis_result.reason == speechsdk.ResultReason.Canceled:
            cancellation_details = speech_synthesis_result.cancellation_details
            print("Speech synthesis canceled: {}".format(cancellation_details.reason))
            if cancellation_details.reason == speechsdk.CancellationReason.Error:
                print("Error details: {}".format(cancellation_details.error_details))

    # Continuously listens for speech input to recognize and send as text to Azure OpenAI
    def chat_with_open_ai():
        while True:
            print("OpenAI is listening. Ctrl-Z to exit")
            try:
                # Get audio from the microphone and then send it to the speech to text service.
                speech_recognition_result = speech_recognizer.recognize_once_async().get()

                # If speech is recognized, send it to OpenAI and listen for the response.
                if speech_recognition_result.reason == speechsdk.ResultReason.RecognizedSpeech:
                    ask_openai(speech_recognition_result.text)
                elif speech_recognition_result.reason == speechsdk.ResultReason.NoMatch:
                    print("No speech could be recognized: {}".format(speech_recognition_result.no_match_details))
                    break
                elif speech_recognition_result.reason == speechsdk.ResultReason.Canceled:
                    cancellation_details = speech_recognition_result.cancellation_details
                    print("Speech Recognition canceled: {}".format(cancellation_details.reason))
                    if cancellation_details.reason == speechsdk.CancellationReason.Error:
                        print("Error details: {}".format(cancellation_details.error_details))
                        print("Did you set the speech resource key and region values?")
            except EOFError:
                break

    # Main

    try:
        chat_with_open_ai()
    except Exception as err:
        print("Encountered exception. {}".format(err))
    ```
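
The sample sends each utterance to Azure OpenAI as an independent completion request, so the service doesn't see earlier turns. The following is only a hedged sketch of how you could carry context yourself, not part of the published sample: the `history` list and `ask_openai_with_history` function are hypothetical, and `deployment_id` is assumed to be the value defined in the sample above.

```Python
import openai

# Illustrative only: keep a rolling transcript and prepend it to each new prompt so the
# completion model can see earlier turns. Trimming old turns to stay within the model's
# token limit is omitted for brevity.
history = []

def ask_openai_with_history(prompt, deployment_id):
    history.append("User: " + prompt)
    full_prompt = "\n".join(history) + "\nAssistant:"
    response = openai.Completion.create(engine=deployment_id, prompt=full_prompt, max_tokens=100)
    text = response['choices'][0]['text'].strip()
    history.append("Assistant: " + text)
    return text
```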

Run your new console application to start speech recognition from a microphone:

```console
python openai-speech.py
```

> [!IMPORTANT]
> Make sure that you set the `OPEN_AI_KEY`, `OPEN_AI_ENDPOINT`, `SPEECH_KEY`, and `SPEECH_REGION` environment variables as described [above](#set-environment-variables). If you don't set these variables, the sample will fail with an error message.
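
As an optional, hedged sketch (not part of the sample), you could fail fast by adding a short check near the top of `openai-speech.py` before the clients are created:

```Python
import os

# Verify that the required environment variables are set before continuing.
for name in ("OPEN_AI_KEY", "OPEN_AI_ENDPOINT", "SPEECH_KEY", "SPEECH_REGION"):
    if not os.environ.get(name):
        raise RuntimeError("Missing environment variable: {}".format(name))
```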

Speak into your microphone when prompted. The console output will include the prompt for you to begin speaking, then your request as text, and then the Azure OpenAI response as text. The Azure OpenAI response should be converted from text to speech and output to the default speaker.

```console
PS C:\dev\openai\python> python.exe .\openai-speech.py
OpenAI is listening. Ctrl-Z to exit
OpenAI prompt:How can artificial intelligence help technical writers?
OpenAI response:Artificial intelligence (AI) can help improve the efficiency of a technical writer's workflow in several ways. For example, AI can be used to generate information maps of documentation sets, analyze customer feedback to identify areas that need improvement, or suggest new topics for documentation based on customer queries. Additionally, AI-powered search engines can help technical writers find relevant information more quickly and easily.
Speech synthesized to speaker for text [Artificial intelligence (AI) can help improve the efficiency of a technical writer's workflow in several ways. For example, AI can be used to generate information maps of documentation sets, analyze customer feedback to identify areas that need improvement, or suggest new topics for documentation based on customer queries. Additionally, AI-powered search engines can help technical writers find relevant information more quickly and easily.]
OpenAI is listening. Ctrl-Z to exit
OpenAI prompt:Write a tagline for an ice cream shop.
OpenAI response:Your favorite ice cream, made fresh!
Speech synthesized to speaker for text [Your favorite ice cream, made fresh!]
OpenAI is listening. Ctrl-Z to exit
No speech could be recognized: NoMatchDetails(reason=NoMatchReason.InitialSilenceTimeout)
PS C:\dev\openai\python>
```

## Remarks

Now that you've completed the quickstart, here are some additional considerations:

- To change the speech recognition language, replace `en-US` with another [supported language](~/articles/cognitive-services/speech-service/supported-languages.md). For example, `es-ES` for Spanish (Spain). The default language is `en-US` if you don't specify a language. For details about how to identify one of multiple languages that might be spoken, see [language identification](~/articles/cognitive-services/speech-service/language-identification.md) and the sketch after this list.
- To change the voice that you hear, replace `en-US-JennyMultilingualNeural` with another [supported voice](~/articles/cognitive-services/speech-service/supported-languages.md#prebuilt-neural-voices). If the voice doesn't speak the language of the text returned from Azure OpenAI, the Speech service doesn't output synthesized audio.
- The Azure OpenAI Service also performs content moderation on the prompt inputs and generated outputs. The prompts or responses might be filtered if harmful content is detected. For more information, see the [content filtering](/azure/cognitive-services/openai/concepts/content-filter) article.
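
The language identification article isn't part of this diff; the following is only a hedged sketch of what at-start language identification could look like with the same Speech SDK. The candidate languages are illustrative, and `speech_config` and `audio_config` are assumed to be the objects created in the sample above.

```Python
import azure.cognitiveservices.speech as speechsdk

# Let the Speech service pick among a few candidate languages instead of
# hard-coding speech_config.speech_recognition_language.
auto_detect_config = speechsdk.languageconfig.AutoDetectSourceLanguageConfig(
    languages=["en-US", "es-ES"])

speech_recognizer = speechsdk.SpeechRecognizer(
    speech_config=speech_config,
    auto_detect_source_language_config=auto_detect_config,
    audio_config=audio_config)

result = speech_recognizer.recognize_once_async().get()
detected_language = speechsdk.AutoDetectSourceLanguageResult(result).language
print("Detected language: {}".format(detected_language))
```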

## Clean up resources

[!INCLUDE [Delete resource](../../common/delete-resource.md)]

Lines changed: 27 additions & 0 deletions

@@ -0,0 +1,27 @@
---
title: "Azure OpenAI speech to speech chat - Speech service"
titleSuffix: Azure Cognitive Services
description: In this how-to guide, you can use Speech to converse with OpenAI. Although the experience is a back-and-forth exchange, OpenAI doesn't remember the context of your conversation.
services: cognitive-services
author: eric-urban
manager: nitinme
ms.service: cognitive-services
ms.subservice: speech-service
ms.topic: how-to
ms.date: 02/23/2023
ms.author: eur
ms.devlang: python
keywords: speech to text, openai
---

# Azure OpenAI speech to speech chat

::: zone pivot="programming-language-python"
[!INCLUDE [Python include](./includes/quickstarts/openai-speech/python.md)]
::: zone-end

## Next steps

> [!div class="nextstepaction"]
> [Learn more about Speech](overview.md)
> [Learn more about Azure OpenAI](/azure/cognitive-services/openai/overview)

articles/cognitive-services/Speech-Service/toc.yml

Lines changed: 2 additions & 0 deletions

@@ -238,6 +238,8 @@ items:
  - name: How to use Pronunciation Assessment
    href: how-to-pronunciation-assessment.md
    displayName: pronounce, learn language, assess pron
+ - name: Azure OpenAI speech to speech chat
+   href: openai-speech.md
  - name: Conversation transcription
    items:
    - name: Conversation Transcription overview