Commit 1d7350f

Merge pull request #213036 from eric-urban/eur/call-center-qs

Clarify language flexibility and output details

2 parents bafaa3b + 1d12853

5 files changed: +174 −50 lines changed
articles/cognitive-services/Speech-Service/includes/quickstarts/call-center/azure-prerequisites.md

Lines changed: 4 additions & 2 deletions
@@ -2,7 +2,7 @@
 author: eric-urban
 ms.service: cognitive-services
 ms.subservice: speech-service
-ms.date: 06/30/2022
+ms.date: 09/29/2022
 ms.topic: include
 ms.author: eur
 ---
@@ -13,4 +13,6 @@ ms.author: eur
 > * Get the resource key and region. After your Cognitive Services resource is deployed, select **Go to resource** to view and manage keys. For more information about Cognitive Services resources, see [Get the keys for your resource](~/articles/cognitive-services/cognitive-services-apis-create-account.md#get-the-keys-for-your-resource).
 
 > [!IMPORTANT]
-> This quickstart requires access to [conversation summarization](/azure/cognitive-services/language-service/summarization/how-to/conversation-summarization). To get access, you must submit an [online request](https://aka.ms/applyforconversationsummarization/) and have it approved.
+> This quickstart requires access to [conversation summarization](/azure/cognitive-services/language-service/summarization/how-to/conversation-summarization). To get access, you must submit an [online request](https://aka.ms/applyforconversationsummarization/) and have it approved.
+>
+> The `--languageKey` and `--languageEndpoint` values in this quickstart must correspond to a resource that's in one of the regions supported by the [conversation summarization API](https://aka.ms/convsumregions).

articles/cognitive-services/Speech-Service/includes/quickstarts/call-center/csharp.md

Lines changed: 12 additions & 44 deletions
@@ -31,58 +31,26 @@ Follow these steps to run post-call transcription analysis from an audio file.
    ```dotnetcli
    dotnet build
    ```
-1. Run the application with your preferred command line arguments. See [usage and arguments](#usage-and-arguments) for the available options. Here is an example:
+1. Run the application with your preferred command line arguments. See [usage and arguments](#usage-and-arguments) for the available options.
+
+   Here's an example that transcribes from an example audio file at [GitHub](https://github.com/Azure-Samples/cognitive-services-speech-sdk/raw/master/scenarios/call-center/sampledata/Call1_separated_16k_health_insurance.wav):
    ```dotnetcli
-   dotnet run --languageKey YourResourceKey --languageEndpoint YourResourceEndpoint --speechKey YourResourceKey --speechRegion YourResourceRegion --input "https://github.com/Azure-Samples/cognitive-services-speech-sdk/raw/master/scenarios/call-center/sampledata/Call1_separated_16k_health_insurance.wav" --stereo --output summary.txt
+   dotnet run --languageKey YourResourceKey --languageEndpoint YourResourceEndpoint --speechKey YourResourceKey --speechRegion YourResourceRegion --input "https://github.com/Azure-Samples/cognitive-services-speech-sdk/raw/master/scenarios/call-center/sampledata/Call1_separated_16k_health_insurance.wav" --stereo --output summary.json
    ```
+
+   If you already have a transcription for input, here's an example that only requires a Language resource:
+   ```dotnetcli
+   dotnet run --languageKey YourResourceKey --languageEndpoint YourResourceEndpoint --jsonInput "YourTranscriptionFile.json" --stereo --output summary.json
+   ```
+
    Replace `YourResourceKey` with your Cognitive Services resource key, replace `YourResourceRegion` with your Cognitive Services resource [region](~/articles/cognitive-services/speech-service/regions.md) (such as `eastus`), and replace `YourResourceEndpoint` with your Cognitive Services endpoint. Make sure that the paths specified by `--input` and `--output` are valid. Otherwise you must change the paths.
-
    > [!IMPORTANT]
    > Remember to remove the key from your code when you're done, and never post it publicly. For production, use a secure way of storing and accessing your credentials like [Azure Key Vault](../../../../../key-vault/general/overview.md). See the Cognitive Services [security](../../../../cognitive-services-security.md) article for more information.
 
+
 ## Check results
 
-The console output shows the full conversation and summary. Here's an example of the overall summary:
-
-```output
-Conversation summary:
-Issue: Customer wants to sign up for insurance.
-Resolution: Helped customer to sign up for insurance.
-```
-
-If you specify `--output FILE`, a JSON version of the results are written to the file. The file output is a combination of the JSON responses from the [batch transcription](/azure/cognitive-services/speech-service/batch-transcription) (Speech), [sentiment](/azure/cognitive-services/language-service/sentiment-opinion-mining/overview) (Language), and [conversation summarization](/azure/cognitive-services/language-service/summarization/overview?tabs=conversation-summarization) (Language) APIs.
-
-The `transcription` property contains a JSON object with the results of sentiment analysis merged with batch transcription. Here's an example, with redactions for brevity:
-```json
-{
-    "source": "https://github.com/Azure-Samples/cognitive-services-speech-sdk/raw/master/scenarios/call-center/sampledata/Call1_separated_16k_health_insurance.wav",
-    // Example results redacted for brevity
-    "nBest": [
-        {
-            "confidence": 0.77464247,
-            "lexical": "hello thank you for calling contoso who am i speaking with today",
-            "itn": "hello thank you for calling contoso who am i speaking with today",
-            "maskedITN": "hello thank you for calling contoso who am i speaking with today",
-            "display": "Hello, thank you for calling Contoso. Who am I speaking with today?",
-            "sentiment": {
-                "positive": 0.78,
-                "neutral": 0.21,
-                "negative": 0.01
-            }
-        },
-    ]
-    // Example results redacted for brevity
-}
-```
-
-The `conversationAnalyticsResults` property contains a JSON object with the results of the conversation summarization analysis. Here's an example, with redactions for brevity:
-```json
-{
-    "conversationSummaryResults": {
-    }
-    // Example results redacted for brevity
-}
-```
+[!INCLUDE [Example output](example-output.md)]
 
 ## Usage and arguments
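The `transcription` output that this quickstart writes with `--output` is plain JSON, so the sentiment-merged phrases can be post-processed in any language. Here's a minimal, hypothetical Python sketch that picks the highest-confidence `nBest` alternative and its dominant sentiment label; the excerpt mirrors the redacted example from the docs, and a real output file may nest these fields more deeply:

```python
import json

# Excerpt shaped like the redacted `transcription` example in the docs;
# a real file comes from running the quickstart with `--output summary.json`.
transcription = json.loads("""
{
  "source": "https://github.com/Azure-Samples/cognitive-services-speech-sdk/raw/master/scenarios/call-center/sampledata/Call1_separated_16k_health_insurance.wav",
  "nBest": [
    {
      "confidence": 0.77464247,
      "display": "Hello, thank you for calling Contoso. Who am I speaking with today?",
      "sentiment": {"positive": 0.78, "neutral": 0.21, "negative": 0.01}
    }
  ]
}
""")

# Pick the highest-confidence alternative, then its dominant sentiment label.
best = max(transcription["nBest"], key=lambda alt: alt["confidence"])
label = max(best["sentiment"], key=best["sentiment"].get)
print(f"[{label}] {best['display']}")
# prints: [positive] Hello, thank you for calling Contoso. Who am I speaking with today?
```

The same traversal works on each recognized phrase when the file contains the full batch transcription response rather than this trimmed excerpt.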
articles/cognitive-services/Speech-Service/includes/quickstarts/call-center/example-output.md

Lines changed: 154 additions & 0 deletions

@@ -0,0 +1,154 @@
+---
+author: eric-urban
+ms.service: cognitive-services
+ms.subservice: speech-service
+ms.date: 09/29/2022
+ms.topic: include
+ms.author: eur
+---
+
+The console output shows the full conversation and summary. Here's an example of the overall summary, with redactions for brevity:
+
+```output
+Conversation summary:
+issue: Customer wants to sign up for insurance.
+resolution: Customer was advised that customer would be contacted by the insurance company.
+```
+
+If you specify the `--output FILE` optional [argument](/azure/cognitive-services/speech-service/call-center-quickstart#usage-and-arguments), a JSON version of the results is written to the file. The file output is a combination of the JSON responses from the [batch transcription](/azure/cognitive-services/speech-service/batch-transcription) (Speech), [sentiment](/azure/cognitive-services/language-service/sentiment-opinion-mining/overview) (Language), and [conversation summarization](/azure/cognitive-services/language-service/summarization/overview?tabs=conversation-summarization) (Language) APIs.
+
+The `transcription` property contains a JSON object with the results of sentiment analysis merged with batch transcription. Here's an example, with redactions for brevity:
+```json
+{
+    "source": "https://github.com/Azure-Samples/cognitive-services-speech-sdk/raw/master/scenarios/call-center/sampledata/Call1_separated_16k_health_insurance.wav",
+    // Example results redacted for brevity
+    "nBest": [
+        {
+            "confidence": 0.77464247,
+            "lexical": "hello thank you for calling contoso who am i speaking with today",
+            "itn": "hello thank you for calling contoso who am i speaking with today",
+            "maskedITN": "hello thank you for calling contoso who am i speaking with today",
+            "display": "Hello, thank you for calling Contoso. Who am I speaking with today?",
+            "sentiment": {
+                "positive": 0.78,
+                "neutral": 0.21,
+                "negative": 0.01
+            }
+        },
+    ]
+    // Example results redacted for brevity
+}
+```
+
+The `conversationAnalyticsResults` property contains a JSON object with the results of the conversation summarization analysis. Here's an example, with redactions for brevity:
+```json
+{
+    "conversationAnalyticsResults": {
+        "conversationSummaryResults": {
+            "conversations": [
+                {
+                    "id": "conversation1",
+                    "summaries": [
+                        {
+                            "aspect": "issue",
+                            "text": "Customer wants to sign up for insurance"
+                        },
+                        {
+                            "aspect": "resolution",
+                            "text": "Customer was advised that customer would be contacted by the insurance company"
+                        }
+                    ],
+                    "warnings": []
+                }
+            ],
+            "errors": [],
+            "modelVersion": "2022-05-15-preview"
+        },
+        "conversationPiiResults": {
+            "combinedRedactedContent": [
+                {
+                    "channel": "0",
+                    "display": "Hello, thank you for calling Contoso. Who am I speaking with today? Hi, ****. Uh, are you calling because you need health insurance?", // Example results redacted for brevity
+                    "itn": "hello thank you for calling contoso who am i speaking with today hi **** uh are you calling because you need health insurance", // Example results redacted for brevity
+                    "lexical": "hello thank you for calling contoso who am i speaking with today hi **** uh are you calling because you need health insurance" // Example results redacted for brevity
+                },
+                {
+                    "channel": "1",
+                    "display": "Hi, my name is **********. I'm trying to enroll myself with Contoso. Yes. Yeah, I'm calling to sign up for insurance.", // Example results redacted for brevity
+                    "itn": "hi my name is ********** i'm trying to enroll myself with contoso yes yeah i'm calling to sign up for insurance", // Example results redacted for brevity
+                    "lexical": "hi my name is ********** i'm trying to enroll myself with contoso yes yeah i'm calling to sign up for insurance" // Example results redacted for brevity
+                }
+            ],
+            "conversations": [
+                {
+                    "id": "conversation1",
+                    "conversationItems": [
+                        {
+                            "id": "0",
+                            "redactedContent": {
+                                "itn": "hello thank you for calling contoso who am i speaking with today",
+                                "lexical": "hello thank you for calling contoso who am i speaking with today",
+                                "text": "Hello, thank you for calling Contoso. Who am I speaking with today?"
+                            },
+                            "entities": [],
+                            "channel": "0",
+                            "offset": "PT0.77S"
+                        },
+                        {
+                            "id": "1",
+                            "redactedContent": {
+                                "itn": "hi my name is ********** i'm trying to enroll myself with contoso",
+                                "lexical": "hi my name is ********** i'm trying to enroll myself with contoso",
+                                "text": "Hi, my name is **********. I'm trying to enroll myself with Contoso."
+                            },
+                            "entities": [
+                                {
+                                    "text": "Mary Rondo",
+                                    "category": "Name",
+                                    "offset": 15,
+                                    "length": 10,
+                                    "confidenceScore": 0.97
+                                }
+                            ],
+                            "channel": "1",
+                            "offset": "PT4.55S"
+                        },
+                        {
+                            "id": "2",
+                            "redactedContent": {
+                                "itn": "hi **** uh are you calling because you need health insurance",
+                                "lexical": "hi **** uh are you calling because you need health insurance",
+                                "text": "Hi, ****. Uh, are you calling because you need health insurance?"
+                            },
+                            "entities": [
+                                {
+                                    "text": "Mary",
+                                    "category": "Name",
+                                    "offset": 4,
+                                    "length": 4,
+                                    "confidenceScore": 0.93
+                                }
+                            ],
+                            "channel": "0",
+                            "offset": "PT9.55S"
+                        },
+                        {
+                            "id": "3",
+                            "redactedContent": {
+                                "itn": "yes yeah i'm calling to sign up for insurance",
+                                "lexical": "yes yeah i'm calling to sign up for insurance",
+                                "text": "Yes. Yeah, I'm calling to sign up for insurance."
+                            },
+                            "entities": [],
+                            "channel": "1",
+                            "offset": "PT13.09S"
+                        },
+                        // Example results redacted for brevity
+                    ],
+                    "warnings": []
+                }
+            ]
+        }
+    }
+}
+```
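Because `conversationAnalyticsResults` is ordinary JSON, the summary aspects and redacted utterances above can be pulled out with a few lines of code. Here's a hypothetical Python sketch; the dictionary literal is trimmed from the example output, and the exact schema should be treated as illustrative, since it comes from a preview model version:

```python
# Trimmed to the fields used below; shaped like the example-output.md JSON.
analytics = {
    "conversationSummaryResults": {
        "conversations": [
            {
                "id": "conversation1",
                "summaries": [
                    {"aspect": "issue", "text": "Customer wants to sign up for insurance"},
                    {"aspect": "resolution", "text": "Customer was advised that customer would be contacted by the insurance company"},
                ],
            }
        ]
    },
    "conversationPiiResults": {
        "combinedRedactedContent": [
            {"channel": "0", "display": "Hello, thank you for calling Contoso. Who am I speaking with today? Hi, ****."},
            {"channel": "1", "display": "Hi, my name is **********. I'm trying to enroll myself with Contoso."},
        ]
    },
}

# Flatten the per-conversation summaries into an aspect -> text map.
summary = {
    s["aspect"]: s["text"]
    for conv in analytics["conversationSummaryResults"]["conversations"]
    for s in conv["summaries"]
}
print("issue:", summary["issue"])
print("resolution:", summary["resolution"])

# The PII results carry one redacted transcript per audio channel.
for item in analytics["conversationPiiResults"]["combinedRedactedContent"]:
    print(f"channel {item['channel']}: {item['display']}")
```

The aspect keys (`issue`, `resolution`) match the console summary shown at the top of the example output.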

articles/cognitive-services/Speech-Service/includes/quickstarts/call-center/intro.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 author: eric-urban
 ms.service: cognitive-services
 ms.topic: include
-ms.date: 09/18/2022
+ms.date: 09/29/2022
 ms.author: eur
 ---
 

articles/cognitive-services/Speech-Service/includes/quickstarts/call-center/usage-arguments.md

Lines changed: 3 additions & 3 deletions
@@ -2,7 +2,7 @@
 author: eric-urban
 ms.service: cognitive-services
 ms.topic: include
-ms.date: 09/18/2022
+ms.date: 09/29/2022
 ms.author: eur
 ---
 
@@ -20,7 +20,7 @@ Connection options include:
 Input options include:
 
 - `--input URL`: Input audio from URL. You must set either the `--input` or `--jsonInput` option.
-- `--jsonInput FILE`: Input an existing batch transcription JSON result from FILE. Use this option to process a transcription result that was previously generated by the Speech service. With this option, you don't need an audio file. Overrides `--input`, `--speechKey`, and `--speechRegion`. You must set either the `--input` or `--jsonInput` option.
+- `--jsonInput FILE`: Input an existing batch transcription JSON result from FILE. With this option, you only need a Language resource to process a transcription that you already have; you don't need an audio file or a Speech resource. Overrides `--input`. You must set either the `--input` or `--jsonInput` option.
 - `--stereo`: Use stereo audio format. If stereo isn't specified, then mono 16khz 16 bit PCM wav files are assumed. Diarization of mono files is used to separate multiple speakers. Diarization of stereo files isn't supported, since 2-channel stereo files should already have one speaker per channel.
 - `--certificate`: The PEM certificate file. Required for C++.
 
@@ -32,4 +32,4 @@ Language options include:
 Output options include:
 
 - `--help`: Show the usage help and stop
-- `--output FILE`: Output the transcription, sentiment, and conversation summaries in JSON format to a text file.
+- `--output FILE`: Output the transcription, sentiment, and conversation summaries in JSON format to a text file. For more information, see [output examples](/azure/cognitive-services/speech-service/call-center-quickstart#check-results).
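The input rules above (you must set either `--input` or `--jsonInput`, and `--jsonInput` overrides `--input`) can be captured with standard argument parsing. Here's a hypothetical Python sketch of just that contract; it covers only a subset of the documented options and is not the quickstart's actual C# implementation:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Subset of the documented options; names mirror the quickstart's CLI.
    parser = argparse.ArgumentParser(description="Call center argument contract (sketch)")
    parser.add_argument("--input", metavar="URL", help="Input audio from URL.")
    parser.add_argument("--jsonInput", metavar="FILE",
                        help="Existing batch transcription JSON result; overrides --input.")
    parser.add_argument("--stereo", action="store_true",
                        help="Use stereo audio format; mono is assumed otherwise.")
    parser.add_argument("--output", metavar="FILE",
                        help="Write transcription, sentiment, and summaries as JSON.")
    return parser

def resolve_source(args: argparse.Namespace) -> tuple:
    # Enforce the documented rule: one of --input / --jsonInput is required.
    if not (args.input or args.jsonInput):
        raise SystemExit("You must set either the --input or --jsonInput option.")
    # --jsonInput wins when both are given, matching the documented override.
    return ("json", args.jsonInput) if args.jsonInput else ("audio", args.input)

if __name__ == "__main__":
    args = build_parser().parse_args(["--input", "call.wav", "--jsonInput", "transcript.json"])
    print(resolve_source(args))  # ('json', 'transcript.json')
```

Passing both options resolves to the JSON file, which is why a Speech resource isn't needed in the `--jsonInput` path.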
