
Commit 96ba211

Merge pull request #77921 from trrwilson/travisw-msft-pr/add-android-assistant-quickstart
[Cog svcs] Speech SDK: add a quickstart for Android voice-first virtual assistants
2 parents 440a986 + 14818a5 commit 96ba211

8 files changed: +351 −50 lines changed

articles/cognitive-services/Speech-Service/quickstart-java-android.md

Lines changed: 1 addition & 41 deletions

```diff
@@ -29,47 +29,7 @@ You need a Speech Services subscription key to complete this Quickstart. You can

 ## Create and configure a project

-1. Launch Android Studio, and choose **Start a new Android Studio project** in the Welcome window.
-
-   ![Screenshot of Android Studio Welcome window](media/sdk/qs-java-android-01-start-new-android-studio-project.png)
-
-1. The **Choose your project** wizard appears. Select **Phone and Tablet** and **Empty Activity** in the activity selection box, and then select **Next**.
-
-   ![Screenshot of Choose your project wizard](media/sdk/qs-java-android-02-target-android-devices.png)
-
-1. In the **Configure your project** screen, enter **Quickstart** as **Name**, **samples.speech.cognitiveservices.microsoft.com** as **Package name**, and choose a project directory. For **Minimum API level**, pick **API 23: Android 6.0 (Marshmallow)**, leave all other checkboxes unchecked, and select **Finish**.
-
-   ![Screenshot of Configure your project wizard](media/sdk/qs-java-android-03-create-android-project.png)
-
-   Android Studio takes a moment to prepare your new Android project. Next, configure the project to know about the Speech SDK and to use Java 8.
-
-[!INCLUDE [License Notice](../../../includes/cognitive-services-speech-service-license-notice.md)]
-
-The current version of the Cognitive Services Speech SDK is `1.5.1`.
-
-The Speech SDK for Android is packaged as an [AAR (Android Library)](https://developer.android.com/studio/projects/android-library), which includes the necessary libraries and required Android permissions.
-It is hosted in a Maven repository at https:\//csspeechstorage.blob.core.windows.net/maven/.
-
-Set up your project to use the Speech SDK. Open the Project Structure window by choosing **File** > **Project Structure** from the Android Studio menu bar. In the Project Structure window, make the following changes:
-
-1. In the list on the left side of the window, select **Project**. Edit the **Default Library Repository** settings by appending a comma and our Maven repository URL enclosed in single quotes: 'https:\//csspeechstorage.blob.core.windows.net/maven/'.
-
-   ![Screenshot of Project Structure window](media/sdk/qs-java-android-06-add-maven-repository.png)
-
-1. In the same screen, on the left side, select **app**. Then select the **Dependencies** tab at the top of the window. Select the green plus sign (+), and choose **Library dependency** from the drop-down menu.
-
-   ![Screenshot of Project Structure window](media/sdk/qs-java-android-07-add-module-dependency.png)
-
-1. In the window that comes up, enter the name and version of our Speech SDK for Android, `com.microsoft.cognitiveservices.speech:client-sdk:1.5.1`. Then select **OK**.
-   The Speech SDK should now appear in the list of dependencies, as shown below:
-
-   ![Screenshot of Project Structure window](media/sdk/qs-java-android-08-dependency-added-1.0.0.png)
-
-1. Select the **Properties** tab. For both **Source Compatibility** and **Target Compatibility**, select **1.8**.
-
-   ![Screenshot of Project Structure window](media/sdk/qs-java-android-09-dependency-added.png)
-
-1. Select **OK** to close the Project Structure window and apply your changes to the project.
+[!INCLUDE [](../../../includes/cognitive-services-speech-service-quickstart-java-android-create-proj.md)]

 ## Create user interface
```
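As an aside, the Project Structure steps covered by this diff can equivalently be expressed directly in Gradle build files. This is an illustrative sketch, not part of the commit; it assumes standard Gradle syntax and reuses the repository URL and Maven coordinates named in those steps:

```gradle
// Top-level build.gradle: make the Speech SDK Maven repository visible to all modules
allprojects {
    repositories {
        google()
        jcenter()
        maven { url 'https://csspeechstorage.blob.core.windows.net/maven/' }
    }
}

// Module-level app/build.gradle: target Java 8 and depend on the Speech SDK AAR
android {
    compileOptions {
        sourceCompatibility JavaVersion.VERSION_1_8
        targetCompatibility JavaVersion.VERSION_1_8
    }
}

dependencies {
    implementation 'com.microsoft.cognitiveservices.speech:client-sdk:1.5.1'
}
```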

articles/cognitive-services/Speech-Service/quickstart-virtual-assistant-csharp-uwp.md

Lines changed: 5 additions & 5 deletions

````diff
@@ -27,11 +27,11 @@ In this article, you'll develop a C# Universal Windows Platform (UWP) application
 This quickstart requires:

 * [Visual Studio 2017](https://visualstudio.microsoft.com/downloads/)
-* An Azure subscription key for the Speech Service. [Get one for free](get-started.md).
+* An Azure subscription key for the Speech Services in the **westus2** region. Create this subscription on the [Azure portal](https://portal.azure.com).
 * A previously created bot configured with the [Direct Line Speech channel](https://docs.microsoft.com/azure/bot-service/bot-service-channel-connect-directlinespeech)

 > [!NOTE]
-> In preview, the Direct Line Speech channel currently supports only the **westus2** region.
+> Direct Line Speech (Preview) is currently only available in the **westus2** region.

 > [!NOTE]
 > The 30-day trial for the standard pricing tier described in [Try Speech Services for free](get-started.md) is restricted to **westus** (not **westus2**) and is thus not compatible with Direct Line Speech. Free and standard tier **westus2** subscriptions are compatible.
@@ -249,9 +249,9 @@ This quickstart will describe, step by step, how to make a simple client application
 ```csharp
 // create a BotConnectorConfig by providing a bot secret key and Cognitive Services subscription key
 // the RecoLanguage property is optional (default en-US); note that only en-US is supported in Preview
-const string channelSecret = "YourChannelSecret";
-const string speechSubscriptionKey = "YourSpeechSubscriptionKey";
-const string region = "YourServiceRegion"; // note: this is assumed as westus2 for preview
+const string channelSecret = "YourChannelSecret"; // Your channel secret
+const string speechSubscriptionKey = "YourSpeechSubscriptionKey"; // Your subscription key
+const string region = "YourServiceRegion"; // Your subscription service region. Note: only 'westus2' is currently supported

 var botConnectorConfig = BotConnectorConfig.FromSecretKey(channelSecret, speechSubscriptionKey, region);
 botConnectorConfig.SetProperty(PropertyId.SpeechServiceConnection_RecoLanguage, "en-US");
````
Lines changed: 289 additions & 0 deletions

@@ -0,0 +1,289 @@

---
title: 'Quickstart: Custom voice-first virtual assistant (Preview), Java (Android) - Speech Services'
titleSuffix: Azure Cognitive Services
description: Learn how to create a voice-first virtual assistant application in Java on Android using the Speech SDK
services: cognitive-services
author: trrwilson
manager: nitinme
ms.service: cognitive-services
ms.subservice: speech-service
ms.topic: quickstart
ms.date: 5/24/2019
ms.author: travisw
---

# Quickstart: Create a voice-first virtual assistant in Java on Android by using the Speech SDK

A quickstart is also available for [speech-to-text](quickstart-java-android.md).

In this article, you'll build a voice-first virtual assistant with Java for Android using the [Speech SDK](speech-sdk.md). The application connects to a bot that you've already authored and configured with the [Direct Line Speech channel](https://docs.microsoft.com/azure/bot-service/bot-service-channel-connect-directlinespeech). It then sends a voice request to the bot and presents a voice-enabled response activity.

This application is built with the Speech SDK Maven package and Android Studio 3.3. The Speech SDK is currently compatible with Android devices with 32/64-bit ARM or Intel x86/x64 processors.

> [!NOTE]
> For the Speech Devices SDK and the Roobo device, see [Speech Devices SDK](speech-devices-sdk.md).

## Prerequisites

* An Azure subscription key for Speech Services in the **westus2** region. Create this subscription on the [Azure portal](https://portal.azure.com).
* A previously created bot configured with the [Direct Line Speech channel](https://docs.microsoft.com/azure/bot-service/bot-service-channel-connect-directlinespeech)
* [Android Studio](https://developer.android.com/studio/) v3.3 or later

> [!NOTE]
> Direct Line Speech (Preview) is currently only available in the **westus2** region.

> [!NOTE]
> The 30-day trial for the standard pricing tier described in [Try Speech Services for free](get-started.md) is restricted to **westus** (not **westus2**) and is thus not compatible with Direct Line Speech. Free and standard tier **westus2** subscriptions are compatible.

## Create and configure a project

[!INCLUDE [](../../../includes/cognitive-services-speech-service-quickstart-java-android-create-proj.md)]

## Create user interface

In this section, we'll create a basic user interface (UI) for the application. Let's start by opening the main activity's layout file, `activity_main.xml`. The basic template includes a title bar with the application's name and a `TextView` with the message "Hello world!".

Next, replace the contents of `activity_main.xml` with the following code:

```xml
<?xml version="1.0" encoding="utf-8"?>
<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    android:orientation="vertical"
    tools:context=".MainActivity">

    <Button
        android:id="@+id/button"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:layout_gravity="center"
        android:onClick="onBotButtonClicked"
        android:text="Talk to your bot" />

    <TextView
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:text="Recognition Data"
        android:textSize="18dp"
        android:textStyle="bold" />

    <TextView
        android:id="@+id/recoText"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:text=" \n(Recognition goes here)\n" />

    <TextView
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:text="Activity Data"
        android:textSize="18dp"
        android:textStyle="bold" />

    <TextView
        android:id="@+id/activityText"
        android:layout_width="match_parent"
        android:layout_height="match_parent"
        android:scrollbars="vertical"
        android:text=" \n(Activities go here)\n" />

</LinearLayout>
```

This XML defines a simple UI to interact with your bot.

* The `button` element initiates an interaction and invokes the `onBotButtonClicked` method when clicked.
* The `recoText` element displays the speech-to-text results as you talk to your bot.
* The `activityText` element displays the JSON payload of the latest Bot Framework activity from your bot.

The text and graphical representation of your UI should now look like this:

![Screenshot of the completed layout in the Android Studio designer](media/sdk/qs-java-android-assistant-designer-ui.png)

## Add sample code

1. Open `MainActivity.java`, and replace the contents with the following code:

    ```java
    package samples.speech.cognitiveservices.microsoft.com;

    import android.media.AudioFormat;
    import android.media.AudioManager;
    import android.media.AudioTrack;
    import android.support.v4.app.ActivityCompat;
    import android.support.v7.app.AppCompatActivity;
    import android.os.Bundle;
    import android.text.method.ScrollingMovementMethod;
    import android.view.View;
    import android.widget.TextView;

    import com.microsoft.cognitiveservices.speech.audio.AudioConfig;
    import com.microsoft.cognitiveservices.speech.audio.PullAudioOutputStream;
    import com.microsoft.cognitiveservices.speech.dialog.BotConnectorConfig;
    import com.microsoft.cognitiveservices.speech.dialog.SpeechBotConnector;

    import org.json.JSONException;
    import org.json.JSONObject;

    import static android.Manifest.permission.*;

    public class MainActivity extends AppCompatActivity {
        // Replace below with your bot's own Direct Line Speech channel secret
        private static String channelSecret = "YourChannelSecret";
        // Replace below with your own speech subscription key
        private static String speechSubscriptionKey = "YourSpeechSubscriptionKey";
        // Replace below with your own speech service region (note: only 'westus2' is currently supported)
        private static String serviceRegion = "YourSpeechServiceRegion";

        private SpeechBotConnector botConnector;

        @Override
        protected void onCreate(Bundle savedInstanceState) {
            super.onCreate(savedInstanceState);
            setContentView(R.layout.activity_main);

            TextView recoText = (TextView) this.findViewById(R.id.recoText);
            TextView activityText = (TextView) this.findViewById(R.id.activityText);
            recoText.setMovementMethod(new ScrollingMovementMethod());
            activityText.setMovementMethod(new ScrollingMovementMethod());

            // Note: we need to request permissions for audio input and network access
            int requestCode = 5; // unique code for the permission request
            ActivityCompat.requestPermissions(MainActivity.this, new String[]{RECORD_AUDIO, INTERNET}, requestCode);
        }

        public void onBotButtonClicked(View v) {
            // Recreate the SpeechBotConnector on each button press, ensuring that the existing one is closed
            if (botConnector != null) {
                botConnector.close();
                botConnector = null;
            }

            // Create the SpeechBotConnector from the channel and speech subscription information
            BotConnectorConfig config = BotConnectorConfig.fromSecretKey(channelSecret, speechSubscriptionKey, serviceRegion);
            botConnector = new SpeechBotConnector(config, AudioConfig.fromDefaultMicrophoneInput());

            // Optional step: preemptively connect to reduce first interaction latency
            botConnector.connectAsync();

            // Register the SpeechBotConnector's event listeners
            registerEventListeners();

            // Begin sending audio to your bot
            botConnector.listenOnceAsync();
        }

        private void registerEventListeners() {
            TextView recoText = (TextView) this.findViewById(R.id.recoText); // 'recoText' is the ID of your text view
            TextView activityText = (TextView) this.findViewById(R.id.activityText); // 'activityText' is the ID of your text view

            // Recognizing will provide the intermediate recognized text while an audio stream is being processed
            botConnector.recognizing.addEventListener((o, recoArgs) -> {
                recoText.setText(" Recognizing: " + recoArgs.getResult().getText());
            });

            // Recognized will provide the final recognized text once audio capture is completed
            botConnector.recognized.addEventListener((o, recoArgs) -> {
                recoText.setText(" Recognized: " + recoArgs.getResult().getText());
            });

            // SessionStarted will notify when audio begins flowing to the service for a turn
            botConnector.sessionStarted.addEventListener((o, sessionArgs) -> {
                recoText.setText("Listening...");
            });

            // SessionStopped will notify when a turn is complete and it's safe to begin listening again
            botConnector.sessionStopped.addEventListener((o, sessionArgs) -> {
            });

            // Canceled will be signaled when a turn is aborted or experiences an error condition
            botConnector.canceled.addEventListener((o, canceledArgs) -> {
                recoText.setText("Canceled (" + canceledArgs.getReason().toString() + ") error details: " + canceledArgs.getErrorDetails());
                botConnector.disconnectAsync();
            });

            // ActivityReceived is the main way your bot will communicate with the client and uses Bot Framework activities
            botConnector.activityReceived.addEventListener((o, activityArgs) -> {
                try {
                    // Here we use JSONObject only to "pretty print" the condensed activity JSON
                    String rawActivity = activityArgs.getActivity().serialize();
                    String formattedActivity = new JSONObject(rawActivity).toString(2);
                    activityText.setText(formattedActivity);
                } catch (JSONException e) {
                    activityText.setText("Couldn't format activity text: " + e.getMessage());
                }

                if (activityArgs.hasAudio()) {
                    // Text-to-speech audio associated with the activity is 16 kHz 16-bit mono PCM data
                    final int sampleRate = 16000;
                    int bufferSize = AudioTrack.getMinBufferSize(sampleRate, AudioFormat.CHANNEL_OUT_MONO, AudioFormat.ENCODING_PCM_16BIT);

                    AudioTrack track = new AudioTrack(
                            AudioManager.STREAM_MUSIC,
                            sampleRate,
                            AudioFormat.CHANNEL_OUT_MONO,
                            AudioFormat.ENCODING_PCM_16BIT,
                            bufferSize,
                            AudioTrack.MODE_STREAM);

                    track.play();

                    PullAudioOutputStream stream = activityArgs.getAudio();

                    // Audio is streamed as it becomes available; play it as it arrives
                    byte[] buffer = new byte[bufferSize];
                    long bytesRead = 0;

                    do {
                        bytesRead = stream.read(buffer);
                        track.write(buffer, 0, (int) bytesRead);
                    } while (bytesRead == bufferSize);

                    track.release();
                }
            });
        }
    }
    ```

    * The `onCreate` method includes code that requests microphone and internet permissions.
    * The method `onBotButtonClicked` is, as noted earlier, the button click handler. A button press triggers a single interaction ("turn") with your bot.
    * The `registerEventListeners` method demonstrates the events used by the `SpeechBotConnector` and basic handling of incoming activities.

1. In the same file, replace the configuration strings to match your resources:

    * Replace `YourChannelSecret` with the Direct Line Speech channel secret for your bot.
    * Replace `YourSpeechSubscriptionKey` with your subscription key.
    * Replace `YourSpeechServiceRegion` with the [region](regions.md) associated with your subscription (note: only **westus2** is currently supported).
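One detail of the playback loop in the sample above is worth calling out: it keeps reading until a read returns fewer bytes than the buffer size, because the SDK's `PullAudioOutputStream.read` returns the byte count (zero or short at end of stream) rather than the `-1` that `java.io.InputStream.read` returns. The same drain pattern can be illustrated with a plain `InputStream`; this is a standalone sketch, not Speech SDK code, and `drain` is a hypothetical helper name:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class DrainLoopDemo {
    // Read a stream to exhaustion in fixed-size chunks, as the activity audio loop does.
    // Note: InputStream.read returns -1 at end of stream, while the Speech SDK's
    // PullAudioOutputStream.read returns the byte count (0 at end), which is why the
    // quickstart can loop on (bytesRead == bufferSize) instead of checking for -1.
    public static byte[] drain(InputStream stream, int bufferSize) {
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        byte[] buffer = new byte[bufferSize];
        try {
            int bytesRead;
            while ((bytesRead = stream.read(buffer)) > 0) {
                collected.write(buffer, 0, bytesRead); // in the quickstart: track.write(...)
            }
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        return collected.toByteArray();
    }

    public static void main(String[] args) {
        // 16 kHz, 16-bit mono PCM is 32,000 bytes per second; simulate half a second of audio
        byte[] pcm = new byte[16000];
        byte[] played = drain(new ByteArrayInputStream(pcm), 1024);
        System.out.println("Drained " + played.length + " bytes"); // prints: Drained 16000 bytes
    }
}
```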
## Build and run the app

1. Connect your Android device to your development PC. Make sure you have enabled [development mode and USB debugging](https://developer.android.com/studio/debug/dev-options) on the device.

1. To build the application, press Ctrl+F9, or choose **Build** > **Make Project** from the menu bar.

1. To launch the application, press Shift+F10, or choose **Run** > **Run 'app'**.

1. In the deployment target window that appears, choose your Android device.

    ![Screenshot of Select Deployment Target window](media/sdk/qs-java-android-12-deploy.png)

Once the application and its activity have launched, click the button to begin talking to your bot. Transcribed text will appear as you speak, and the latest activity you've received from your bot will appear as it arrives. If your bot is configured to provide spoken responses, the text-to-speech audio will play automatically.

![Screenshot of the Android application](media/sdk/qs-java-android-assistant-completed-turn.png)

## Next steps

> [!div class="nextstepaction"]
> [Explore Java samples on GitHub](https://aka.ms/csspeech/samples)
> [Connect Direct Line Speech to your bot](https://docs.microsoft.com/azure/bot-service/bot-service-channel-connect-directlinespeech)

## See also

- [About voice-first virtual assistants](voice-first-virtual-assistants.md)
- [Custom wake words](speech-devices-sdk-create-kws.md)
