
Commit 54d4ff7

Merge pull request #195473 from eric-urban/eur/sdk-1-21
Eur/sdk-1-21
2 parents f8617d7 + 44375a9 commit 54d4ff7

16 files changed: +126 -83 lines changed

articles/cognitive-services/Speech-Service/how-to-use-custom-entity-pattern-matching.md

Lines changed: 6 additions & 13 deletions
@@ -29,17 +29,14 @@ In this guide, you use the Speech SDK to develop a console application that deri
 > - Add custom entities via the Speech SDK API
 > - Use asynchronous, event-driven continuous recognition
 
-## When should you use this?
+## When to use pattern matching
 
-Use this sample code if:
+Use this sample code if:
+
+* You're only interested in matching strictly what the user said. These patterns match more aggressively than LUIS.
+* You don't have access to a [LUIS](../LUIS/index.yml) app, but still want intents.
+* You can't or don't want to create a [LUIS](../LUIS/index.yml) app but you still want some voice-commanding capability.
 
-- You are only interested in matching very strictly what the user said. These patterns match more aggressively than LUIS.
-- You do not have access to a [LUIS](../LUIS/index.yml) app, but still want intents. This can be helpful since it is embedded within the SDK.
-- You cannot or do not want to create a LUIS app but you still want some voice-commanding capability.
-
-If you do not have access to a [LUIS](../LUIS/index.yml) app, but still want intents, this can be helpful since it is embedded within the SDK.
-
-For supported locales see [here](./language-support.md?tabs=IntentRecognitionPatternMatcher).
+For more information, see the [pattern matching overview](./pattern-matching-overview.md).
 
 ## Prerequisites
 
@@ -48,10 +45,6 @@ Be sure you have the following items before you begin this guide:
 - A [Cognitive Services Azure resource](https://portal.azure.com/#create/Microsoft.CognitiveServicesSpeechServices) or a [Unified Speech resource](https://portal.azure.com/#create/Microsoft.CognitiveServicesSpeechServices)
 - [Visual Studio 2019](https://visualstudio.microsoft.com/downloads/) (any edition).
 
-## Pattern Matching Model overview
-
-[!INCLUDE [Pattern Matching Overview](includes/pattern-matching-overview.md)]
-
 ::: zone pivot="programming-language-csharp"
 [!INCLUDE [csharp](includes/how-to/intent-recognition/csharp/pattern-matching.md)]
 ::: zone-end

articles/cognitive-services/Speech-Service/how-to-use-simple-language-pattern-matching.md

Lines changed: 13 additions & 12 deletions
@@ -8,7 +8,7 @@ manager: travisw
 ms.service: cognitive-services
 ms.subservice: speech-service
 ms.topic: how-to
-ms.date: 11/15/2021
+ms.date: 04/19/2022
 ms.author: chschrae
 zone_pivot_groups: programming-languages-set-nine
 ms.custom: devx-track-cpp, devx-track-csharp, mode-other
@@ -28,16 +28,14 @@ In this guide, you use the Speech SDK to develop a C++ console application that
 > - Recognize speech from a microphone
 > - Use asynchronous, event-driven continuous recognition
 
-## When should you use this?
+## When to use pattern matching
 
 Use this sample code if:
-* You are only interested in matching very strictly what the user said. These patterns match more aggressively than LUIS.
-* You do not have access to a [LUIS](../LUIS/index.yml) app, but still want intents. This can be helpful since it is embedded within the SDK.
-* You cannot or do not want to create a [LUIS](../LUIS/index.yml) app but you still want some voice-commanding capability.
+* You're only interested in matching strictly what the user said. These patterns match more aggressively than LUIS.
+* You don't have access to a [LUIS](../LUIS/index.yml) app, but still want intents.
+* You can't or don't want to create a [LUIS](../LUIS/index.yml) app but you still want some voice-commanding capability.
 
-If you do not have access to a [LUIS](../LUIS/index.yml) app, but still want intents, this can be helpful since it is embedded within the SDK.
-
-For supported locales see [here](./language-support.md?tabs=IntentRecognitionPatternMatcher).
+For more information, see the [pattern matching overview](./pattern-matching-overview.md).
 
 ## Prerequisites
 
@@ -50,14 +48,12 @@ Be sure you have the following items before you begin this guide:
 
 The simple patterns are a feature of the Speech SDK and need a Cognitive Services resource or a Unified Speech resource.
 
-A pattern is a phrase that includes an Entity somewhere within it. An Entity is defined by wrapping a word in curly brackets. For example:
+A pattern is a phrase that includes an Entity somewhere within it. An Entity is defined by wrapping a word in curly brackets. This example defines an Entity with the ID "floorName", which is case-sensitive:
 
 ```
 Take me to the {floorName}
 ```
 
-This defines an Entity with the ID "floorName" which is case-sensitive.
-
 All other special characters and punctuation will be ignored.
 
 Intents will be added using calls to the IntentRecognizer->AddIntent() API.
@@ -68,4 +64,9 @@ Intents will be added using calls to the IntentRecognizer->AddIntent() API.
 
 ::: zone pivot="programming-language-cpp"
 [!INCLUDE [cpp](includes/how-to/intent-recognition/cpp/simple-pattern-matching.md)]
-::: zone-end
+::: zone-end
+
+## Next steps
+
+* Improve your pattern matching by using [custom entities](how-to-use-custom-entity-pattern-matching.md).
+* Look through our [GitHub samples](https://github.com/Azure-Samples/cognitive-services-speech-sdk).
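As context for the article changed above, here is a minimal C# sketch of the simple-pattern flow it describes: an `IntentRecognizer` with a pattern phrase registered through `AddIntent`. The key, region, and pattern text are placeholders, and error handling is omitted; this is an illustrative sketch, not content from the commit.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Intent;

class Program
{
    static async Task Main()
    {
        // Placeholder credentials for your Speech resource.
        var config = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");

        // Uses the default microphone as input.
        using var recognizer = new IntentRecognizer(config);

        // "{floorName}" marks an entity slot inside the pattern phrase.
        recognizer.AddIntent("Take me to floor {floorName}.", "ChangeFloors");

        Console.WriteLine("Say something ...");
        var result = await recognizer.RecognizeOnceAsync();

        Console.WriteLine($"RECOGNIZED: Text={result.Text}");
        if (result.Reason == ResultReason.RecognizedIntent)
        {
            Console.WriteLine($"Intent Id={result.IntentId}");
        }
    }
}
```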

articles/cognitive-services/Speech-Service/includes/get-speech-sdk-java.md

Lines changed: 3 additions & 3 deletions
@@ -8,7 +8,7 @@ ms.author: eur
 
 :::row:::
 :::column span="3":::
-The Java SDK for Android is packaged as an <a href="https://developer.android.com/studio/projects/android-library" target="_blank">AAR (Android Library)</a>, which includes the necessary libraries and required Android permissions. It's hosted in a Maven repository at `https://csspeechstorage.blob.core.windows.net/maven/` as package `com.microsoft.cognitiveservices.speech:client-sdk:1.19.0`. Make sure 1.19.0 is the latest version by [searching our GitHub repo](https://github.com/Azure-Samples/cognitive-services-speech-sdk/search?q=com.microsoft.cognitiveservices.speech%3Aclient-sdk).
+The Java SDK for Android is packaged as an <a href="https://developer.android.com/studio/projects/android-library" target="_blank">AAR (Android Library)</a>, which includes the necessary libraries and required Android permissions. It's hosted in a Maven repository at `https://azureai.azureedge.net/maven/` as package `com.microsoft.cognitiveservices.speech:client-sdk:1.19.0`. Make sure 1.19.0 is the latest version by [searching our GitHub repo](https://github.com/Azure-Samples/cognitive-services-speech-sdk/search?q=com.microsoft.cognitiveservices.speech%3Aclient-sdk).
 :::column-end:::
 :::column:::
 <br>
@@ -22,12 +22,12 @@ To consume the package from your Android Studio project, make the following chan
 
 1. In the project-level *build.gradle* file, add the following to the `repositories` section:
 ```gradle
-maven { url 'https://csspeechstorage.blob.core.windows.net/maven/' }
+maven { url 'https://azureai.azureedge.net/maven/' }
 ```
 
 1. In the module-level *build.gradle* file, add the following to the `dependencies` section:
 ```gradle
-implementation 'com.microsoft.cognitiveservices.speech:client-sdk:1.19.0'
+implementation 'com.microsoft.cognitiveservices.speech:client-sdk:1.21.0'
 ```
 
 #### Additional resources

articles/cognitive-services/Speech-Service/includes/get-speech-sdk-macos.md

Lines changed: 1 addition & 1 deletion
@@ -36,7 +36,7 @@ platform :osx, 10.14
 use_frameworks!
 
 target 'MyApp' do
-  pod 'MicrosoftCognitiveServicesSpeech', '~> 1.20.0'
+  pod 'MicrosoftCognitiveServicesSpeech', '~> 1.21.0'
 end
 ```
 

articles/cognitive-services/Speech-Service/includes/how-to/intent-recognition/cpp/simple-pattern-matching.md

Lines changed: 1 addition & 6 deletions
@@ -239,9 +239,4 @@ Say something ...
 RECOGNIZED: Text = Take me to floor 7.
 Intent Id = ChangeFloors
 Floor name: = 7
-```
-
-## Next steps
-
-> Improve your pattern matching by using [custom entities](../../../../how-to-use-custom-entity-pattern-matching.md).
-
+```

articles/cognitive-services/Speech-Service/includes/how-to/intent-recognition/csharp/simple-pattern-matching.md

Lines changed: 1 addition & 6 deletions
@@ -269,9 +269,4 @@ Say something ...
 RECOGNIZED: Text= Take me to floor 7.
 Intent Id= ChangeFloors
 FloorName= 7
-```
-
-## Next steps
-
-> Improve your pattern matching by using [custom entities](../../../../how-to-use-custom-entity-pattern-matching.md).
-
+```

articles/cognitive-services/Speech-Service/includes/how-to/remote-conversation/java/examples.md

Lines changed: 1 addition & 1 deletion
@@ -103,7 +103,7 @@ You can obtain **remote-conversation** by editing your pom.xml file as follows.
 <repository>
 <id>maven-cognitiveservices-speech</id>
 <name>Microsoft Cognitive Services Speech Maven Repository</name>
-<url>https://csspeechstorage.blob.core.windows.net/maven/</url>
+<url>https://azureai.azureedge.net/maven/</url>
 </repository>
 </repositories>
 ```

articles/cognitive-services/Speech-Service/includes/pattern-matching-overview.md

Lines changed: 35 additions & 25 deletions
@@ -14,7 +14,7 @@ Pattern matching can be customized to group together pattern intents and entitie
 
 For supported locales see [here](../language-support.md?tabs=IntentRecognitionPatternMatcher).
 
-### Patterns vs. Exact Phrases
+## Patterns vs. Exact Phrases
 
 There are two types of strings used in the pattern matcher: "exact phrases" and "patterns". It's important to understand the differences.
 
@@ -26,25 +26,25 @@ A pattern is a phrase that contains a marked entity. Entities are marked with "{
 
 > "Take me to floor {floorName}"
 
-### Outline of a PatternMatchingModel
+## Outline of a PatternMatchingModel
 
 The ``PatternMatchingModel`` contains an ID to reference that model by, a list of ``PatternMatchingIntent`` objects, and a list of ``PatternMatchingEntity`` objects.
 
-#### Pattern Matching Intents
+### Pattern Matching Intents
 
 ``PatternMatchingIntent`` objects represent a collection of phrases that will be used to evaluate speech or text in the ``IntentRecognizer``. If the phrases are matched, the ``IntentRecognitionResult`` returned will have the ID of the ``PatternMatchingIntent`` that was matched.
 
-#### Pattern Matching Entities
+### Pattern Matching Entities
 
 ``PatternMatchingEntity`` objects represent an individual entity reference and its corresponding properties that tell the ``IntentRecognizer`` how to treat it. All ``PatternMatchingEntity`` objects must have an ID that is present in a phrase or else it will never be matched.
 
-##### Entity Naming restrictions
+#### Entity Naming restrictions
 
 Entity names containing ':' characters will assign a role to an entity. (See below)
 
-### Types of Entities
+## Types of Entities
 
-#### Any Entity
+### Any Entity
 
 The "Any" entity will match any text that appears in that slot regardless of the text it contains. If we consider our previous example using the pattern "Take me to floor {floorName}", the user might say something like:
 
@@ -60,7 +60,7 @@ In this case, the utterance "Take me to the floor parking 2" would match and ret
 
 It may be tricky to handle extra text if it's captured. Perhaps the user kept talking and the utterance captured more than their command. "Take me to floor parking 2 yes Janice I heard about that let's". In this case the floorName1 would be correct, but floorName2 would = "2 yes Janice I heard about that let's". It's important to be aware of the way the Entities will match and adjust to your scenario appropriately. The Any entity type is the most basic and least precise.
 
-#### List Entity
+### List Entity
 
 The "List" entity is made up of a list of phrases that will guide the engine on how to match it. The "List" entity has two modes. "Strict" and "Fuzzy".
 
@@ -79,13 +79,13 @@ It's important to note that the Intent will not match, not just the entity.
 When an entity is of type "List" and is in "Fuzzy" mode, the engine will still match the Intent, and will return the text that appeared in the slot in the utterance even if it's not in the list. This is useful behind the scenes to help make the speech recognition better.
 
 > [!WARNING]
-> Fuzzy list entities are not currently implemented.
+> Fuzzy list entities are implemented, but not integrated into the speech recognition part. Therefore, they will match entities, but not improve speech recognition.
 
-#### Prebuilt Integer Entity
+### Prebuilt Integer Entity
 
 The "PrebuiltInteger" entity is used when you expect to get an integer in that slot. It won't match the intent if an integer cannot be found. The return value is a string representation of the number.
 
-#### Examples of a valid match and return values
+### Examples of a valid match and return values
 
 > "Two thousand one hundred and fifty-five" -> "2155"
 
@@ -97,7 +97,7 @@ The "PrebuiltInteger" entity is used when you expect to get an integer in that s
 
 If there's text that is not recognizable as a number, the entity and intent will not match.
 
-#### Examples of an invalid match
+### Examples of an invalid match
 
 > "the third"
 
@@ -113,48 +113,52 @@ Consider our elevator example.
 
 If "floorName" is a prebuilt integer entity, the expectation is that whatever text is inside the slot will represent an integer. Here a floor number would match well, but a floor with a name such as "lobby" would not.
 
-### Optional items and grouping
+## Grouping required and optional items
 
-In the pattern it's allowed to include words or entities that may be present in the utterance or not. This is especially useful for determiners like "the", "a", or "an". This doesn't have any functional difference from hard coding out the many combinations, but can help reduce the number of patterns needed. Indicate optional items with "[" and "]". You may include multiple items in the same group by separating them with a '|' character.
+In the pattern it's allowed to include words or entities that may be present in the utterance or not. This is especially useful for determiners like "the", "a", or "an". This doesn't have any functional difference from hard coding out the many combinations, but can help reduce the number of patterns needed. Indicate optional items with "[" and "]". Indicate required items with "(" and ")". You may include multiple items in the same group by separating them with a '|' character.
 
 To see how this would reduce the number of patterns needed consider the following set.
 
-> "Take me to the {floorName}"
+> "Take me to {floorName}"
 
 > "Take me the {floorName}"
 
-> "Take me to {floorName}"
+> "Take me {floorName}"
 
-> "take me {floorName}"
+> "Take me to {floorName} please"
 
-> "Take me to the {floorName}" please
+> "Take me the {floorName} please"
 
-> "Take me the {floorName}" please
+> "Take me {floorName} please"
 
-> "Take me to {floorName}" please
+> "Bring me {floorName} please"
 
-> "take me {floorName}" please
+> "Bring me to {floorName} please"
 
-These can all be reduced to a single pattern with optional items.
+These can all be reduced to a single pattern with grouping and optional items. First, it is possible to group "to" and "the" together as optional words like so: "[to | the]", and second we can make the "please" optional as well. Last, we can group the "bring" and "take" as required.
 
->"Take me [to | the] {floorName} [please]"
+>"(Bring | Take) me [to | the] {floorName} [please]"
 
 It's also possible to include optional entities. Imagine there are multiple parking levels and you want to match the word before the {floorName}. You could do so with a pattern like this:
 
 >"Take me to [{floorType}] {floorName}"
 
+Optionals are also very useful if you might be using keyword recognition and a push-to-talk function. This means sometimes the keyword will be present, and sometimes it won't. Assuming your keyword was "computer" your pattern would look something like this.
+
+>"[Computer] Take me to {floorName}"
+
 > [!NOTE]
 > While it's helpful to use optional items, it increases the chances of pattern collisions. This is where two patterns can match the same-spoken phrase. If this occurs, it can sometimes be solved by separating out the optional items into separate patterns.
 
-### Entity roles
+## Entity roles
 
 Inside the pattern, there may be a scenario where you want to use the same entity multiple times. Consider the scenario of booking a flight from one city to another. In this case the list of cities is the same, but it's necessary to know which city is the user coming from and which city is the destination. To accomplish this, you can use a role assigned to an entity using a ':'.
 
 > "Book a flight from {city:from} to {city:destination}"
 
 Given a pattern like this, there will be two entities in the result labeled "city:from" and "city:destination" but they'll both be referencing the "city" entity for matching purposes.
 
-### Intent Matching Priority
+## Intent Matching Priority
 
 Sometimes multiple patterns may match the same utterance. In this case, the engine will give priority to patterns as follows.
 
@@ -163,3 +167,9 @@ Sometimes multiple patterns may match the same utterance. In this case, the engi
 3. Patterns with Integer Entities.
 4. Patterns with List Entities.
 5. Patterns with Any Entities.
+
+## Next steps
+
+* Start with [simple pattern matching](../how-to-use-simple-language-pattern-matching.md).
+* Improve your pattern matching by using [custom entities](../how-to-use-custom-entity-pattern-matching.md).
+* Look through our [GitHub samples](https://github.com/Azure-Samples/cognitive-services-speech-sdk).
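To tie the overview above to code, here is a rough C# sketch that builds a model from the ``PatternMatchingModel``, ``PatternMatchingIntent``, and ``PatternMatchingEntity`` types named in this include, using the grouped pattern from the example. The key, region, model ID, and list values are placeholders, and the constructor and factory signatures shown are assumptions that should be checked against the 1.21.0 SDK reference.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Intent;

class Program
{
    static async Task Main()
    {
        // Placeholder credentials for your Speech resource.
        var config = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");
        using var recognizer = new IntentRecognizer(config);

        // The model ID is a placeholder; it only identifies this model in results.
        var model = new PatternMatchingModel("ElevatorModel");

        // "( )" groups required alternatives, "[ ]" marks optional items, "{ }" marks entities.
        model.Intents.Add(new PatternMatchingIntent(
            "ChangeFloors",
            "(Bring | Take) me [to | the] {floorName} [please]",
            "Take me to [{floorType}] {floorName}"));

        // A Strict list entity limits what {floorName} can match; these values are placeholders.
        model.Entities.Add(PatternMatchingEntity.CreateListEntity(
            "floorName", EntityMatchMode.Strict, "lobby", "ground floor", "one", "two", "three"));

        // Apply the model to the recognizer before recognizing.
        var models = new LanguageUnderstandingModelCollection();
        models.Add(model);
        recognizer.ApplyLanguageModels(models);

        var result = await recognizer.RecognizeOnceAsync();
        Console.WriteLine($"Intent Id={result.IntentId}");
    }
}
```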

articles/cognitive-services/Speech-Service/includes/quickstarts/platform/java-jre.md

Lines changed: 3 additions & 3 deletions
@@ -13,7 +13,7 @@ ms.custom: devx-track-java
 ms.author: eur
 ---
 
-This guide shows how to install the [Speech SDK](~/articles/cognitive-services/speech-service/speech-sdk.md) for Java. If you just want the package name to get started on your own, the Java SDK is not available in the Maven central repository. Whether you're using Gradle or a *pom.xml* dependency file, you need to add a custom repository that points to `https://csspeechstorage.blob.core.windows.net/maven/`. (See below for the package name.)
+This guide shows how to install the [Speech SDK](~/articles/cognitive-services/speech-service/speech-sdk.md) for Java. If you just want the package name to get started on your own, the Java SDK is not available in the Maven central repository. Whether you're using Gradle or a *pom.xml* dependency file, you need to add a custom repository that points to `https://azureai.azureedge.net/maven/`. (See below for the package name.)
 
 [!INCLUDE [License Notice](~/includes/cognitive-services-speech-service-license-notice.md)]
 
@@ -44,12 +44,12 @@ Gradle configurations require both a custom repository and an explicit reference
 
 repositories {
 maven {
-url "https://csspeechstorage.blob.core.windows.net/maven/"
+url "https://azureai.azureedge.net/maven/"
 }
 }
 
 dependencies {
-implementation group: 'com.microsoft.cognitiveservices.speech', name: 'client-sdk', version: "1.19.0", ext: "jar"
+implementation group: 'com.microsoft.cognitiveservices.speech', name: 'client-sdk', version: "1.21.0", ext: "jar"
 }
 ```
 
articles/cognitive-services/Speech-Service/includes/release-notes/release-notes-cli.md

Lines changed: 20 additions & 0 deletions
@@ -6,6 +6,26 @@ ms.date: 01/08/2022
 ms.author: eur
 ---
 
+### Speech CLI 1.21.0: April 2022 release
+
+Uses Speech SDK 1.21.0.
+
+#### New features
+- WEBVTT Caption generation
+  - Added `--output vtt` support to `spx translate`
+  - Supports `--output vtt file FILENAME` to override default VTT FILENAME
+  - Supports `--output vtt file -` to write to standard output
+  - Individual VTT files are created for each target language (e.g. `--target en;de;fr`)
+- SRT Caption generation
+  - Added `--output srt` support to `spx recognize`, `spx intent`, and `spx translate`
+  - Supports `--output srt file FILENAME` to override default SRT FILENAME
+  - Supports `--output srt file -` to write to standard output
+  - For `spx translate` individual SRT files are created for each target language (e.g. `--target en;de;fr`)
+
+#### Bug fixes
+- Corrected WEBVTT timespan output to properly use `hh:mm:ss.fff` format
+
 ### Speech CLI 1.20.0: January 2022 release
 
 #### New features
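For illustration, the new caption options might be combined as shown below. The `--output` and `--target` options are taken from the notes above; the `--source` and `--file` flags and the audio file name are assumptions added only to make the commands complete.

```console
# Write WebVTT captions for each target language to standard output (illustrative input file).
spx translate --source en-US --target en;de;fr --file call.wav --output vtt file -

# Write SRT captions to a custom file name instead of the default.
spx recognize --file call.wav --output srt file captions.srt
```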
