articles/cognitive-services/Speech-Service/rest-speech-to-text.md (6 additions, 6 deletions)
@@ -30,12 +30,12 @@ Use REST API v3.0 to:
 - Request the manifest of the models that you create, to set up on-premises containers.

 REST API v3.0 includes such features as:
-- **Notifications-Webhooks**: All running processes of the service now support webhook notifications. REST API v3.0 provides the calls to enable you to register your webhooks where notifications are sent.
+- **Webhook notifications**: All running processes of the service now support webhook notifications. REST API v3.0 provides the calls to enable you to register your webhooks where notifications are sent.
 - **Updating models behind endpoints**
 - **Model adaptation with multiple datasets**: Adapt a model by using multiple dataset combinations of acoustic, language, and pronunciation data.
 - **Bring your own storage**: Use your own storage accounts for logs, transcription files, and other data.

-For examples on using REST API v3.0 with batch transcription, see [How to use batch transcription](batch-transcription.md).
+For examples of using REST API v3.0 with batch transcription, see [How to use batch transcription](batch-transcription.md).

 For information about migrating to the latest version of the speech-to-text REST API, see [Migrate code from v2.0 to v3.0 of the REST API](./migrate-v2-to-v3.md).
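For context on the batch transcription and webhook features mentioned in this hunk, here is a minimal sketch of creating a batch transcription job against the v3.0 REST API with Python's `requests`. The region, subscription key, and audio URL are placeholders, and the full payload option set is documented in [How to use batch transcription](batch-transcription.md).

```python
import requests

# Placeholders: substitute your own region, subscription key, and audio blob URL.
region = "westus"
subscription_key = "YOUR_SUBSCRIPTION_KEY"

# v3.0 batch transcription endpoint (see batch-transcription.md for all properties).
url = f"https://{region}.api.cognitive.microsoft.com/speechtotext/v3.0/transcriptions"

body = {
    "displayName": "My transcription",
    "locale": "en-US",
    "contentUrls": ["https://example.com/audio.wav"],  # SAS URL(s) to your audio
}

response = requests.post(
    url,
    json=body,
    headers={"Ocp-Apim-Subscription-Key": subscription_key},
)
response.raise_for_status()

# The created transcription resource; poll its "self" URL until the status is "Succeeded".
print(response.json()["self"])
```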
@@ -81,7 +81,7 @@ These parameters might be included in the query string of the REST request:
 |-----------|-------------|---------------------|
 |`language`| Identifies the spoken language that's being recognized. See [Supported languages](language-support.md#speech-to-text). | Required |
 |`format`| Specifies the result format. Accepted values are `simple` and `detailed`. Simple results include `RecognitionStatus`, `DisplayText`, `Offset`, and `Duration`. Detailed responses include four different representations of display text. The default setting is `simple`. | Optional |
-|`profanity`| Specifies how to handle profanity in recognition results. Accepted values are: <br><br>`masked`, which replaces profanity with asterisks. <br>`removed`, which removes all profanity from the result. <br>`raw`, which includes the profanity in the result. <br><br>The default setting is `masked`. | Optional |
+|`profanity`| Specifies how to handle profanity in recognition results. Accepted values are: <br><br>`masked`, which replaces profanity with asterisks. <br>`removed`, which removes all profanity from the result. <br>`raw`, which includes profanity in the result. <br><br>The default setting is `masked`. | Optional |
 |`cid`| When you're using the [Custom Speech portal](./custom-speech-overview.md) to create custom models, you can take advantage of the **Endpoint ID** value from the **Deployment** page. Use the **Endpoint ID** value as the argument to the `cid` query string parameter. | Optional |

 ### Request headers
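To show how the `language`, `format`, and `profanity` query parameters in this hunk fit into a request, here is a minimal Python sketch. It assumes the short-audio endpoint pattern `https://<region>.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1` and a 16-kHz PCM WAV file; the region, key, and file name are placeholders.

```python
import requests

region = "westus"                           # placeholder
subscription_key = "YOUR_SUBSCRIPTION_KEY"  # placeholder

url = f"https://{region}.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1"
params = {"language": "en-US", "format": "detailed", "profanity": "masked"}

with open("sample.wav", "rb") as audio:
    response = requests.post(
        url,
        params=params,
        headers={
            "Ocp-Apim-Subscription-Key": subscription_key,
            "Content-Type": "audio/wav; codecs=audio/pcm; samplerate=16000",
        },
        data=audio,  # stream the audio body
    )

response.raise_for_status()
# With format=detailed, the body also carries an NBest list of candidate results.
print(response.json()["RecognitionStatus"])
```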
@@ -118,8 +118,8 @@ This table lists required and optional parameters for pronunciation assessment:
 |-----------|-------------|---------------------|
 |`ReferenceText`| The text that the pronunciation will be evaluated against. | Required |
 |`GradingSystem`| The point system for score calibration. The `FivePoint` system gives a 0-5 floating point score, and `HundredMark` gives a 0-100 floating point score. Default: `FivePoint`. | Optional |
-|`Granularity`| The evaluation granularity. Accepted values are:<br><br> `Phoneme`, which shows the score on the fulltext, word, and phoneme levels.<br>`Word`, which shows the score on the fulltext and word levels. <br>`FullText`, which shows the score on the fulltext level only.<br><br> The default setting is `Phoneme`. | Optional |
-|`Dimension`| Defines the output criteria. Accepted values are:<br><br> `Basic`, which shows the accuracy score only. <br>`Comprehensive`, which shows scores on more dimensions (for example, fluency score and completeness score on the fulltext level, and error type on the word level).<br><br> To see definitions of different score dimensions and word error types, see [Response parameters](#response-parameters). The default setting is `Basic`. | Optional |
+|`Granularity`| The evaluation granularity. Accepted values are:<br><br> `Phoneme`, which shows the score on the full-text, word, and phoneme levels.<br>`Word`, which shows the score on the full-text and word levels. <br>`FullText`, which shows the score on the full-text level only.<br><br> The default setting is `Phoneme`. | Optional |
+|`Dimension`| Defines the output criteria. Accepted values are:<br><br> `Basic`, which shows the accuracy score only. <br>`Comprehensive`, which shows scores on more dimensions (for example, fluency score and completeness score on the full-text level, and error type on the word level).<br><br> To see definitions of different score dimensions and word error types, see [Response parameters](#response-parameters). The default setting is `Basic`. | Optional |
 |`EnableMiscue`| Enables miscue calculation. With this parameter enabled, the pronounced words will be compared to the reference text. They'll be marked with omission or insertion based on the comparison. Accepted values are `False` and `True`. The default setting is `False`. | Optional |
 |`ScenarioId`| A GUID that indicates a customized point system. | Optional |
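A minimal sketch of assembling these pronunciation assessment parameters in Python, assuming they're sent as base64-encoded JSON in a `Pronunciation-Assessment` request header; the parameter values below are illustrative, not required defaults.

```python
import base64
import json

# Example pronunciation assessment parameters (values here are illustrative).
assessment_params = {
    "ReferenceText": "Good morning.",
    "GradingSystem": "HundredMark",
    "Granularity": "FullText",
    "Dimension": "Comprehensive",
    "EnableMiscue": "True",
}

# Assumption: the parameters travel as base64-encoded JSON in this header.
header_value = base64.b64encode(
    json.dumps(assessment_params).encode("utf-8")
).decode("ascii")

headers = {
    "Pronunciation-Assessment": header_value,
    # ...plus the usual Ocp-Apim-Subscription-Key, Content-Type, and Accept headers.
}
print(headers["Pronunciation-Assessment"])
```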
@@ -251,7 +251,7 @@ The object in the `NBest` list can include:
 |`ITN`| The inverse-text-normalized (ITN) or canonical form of the recognized text, with phone numbers, numbers, abbreviations ("doctor smith" to "dr smith"), and other transformations applied. |
 |`MaskedITN`| The ITN form with profanity masking applied, if requested. |
 |`Display`| The display form of the recognized text, with punctuation and capitalization added. This parameter is the same as what `DisplayText` provides when the format is set to `simple`. |
-|`AccuracyScore`| Pronunciation accuracy of the speech. Accuracy indicates how closely the phonemes match a native speaker's pronunciation. The accuracy score at the word and fulltext levels is aggregated from the accuracy score at the phoneme level. |
+|`AccuracyScore`| Pronunciation accuracy of the speech. Accuracy indicates how closely the phonemes match a native speaker's pronunciation. The accuracy score at the word and full-text levels is aggregated from the accuracy score at the phoneme level. |
 |`FluencyScore`| Fluency of the provided speech. Fluency indicates how closely the speech matches a native speaker's use of silent breaks between words. |
 |`CompletenessScore`| Completeness of the speech, determined by calculating the ratio of pronounced words to reference text input. |
 |`PronScore`| Overall score that indicates the pronunciation quality of the provided speech. This score is aggregated from `AccuracyScore`, `FluencyScore`, and `CompletenessScore` with weight. |
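As a quick illustration of reading these `NBest` fields, here is a small Python helper that walks a parsed `detailed` response. It assumes `response_json` already holds the decoded JSON body; the pronunciation score fields are present only when assessment was requested, so they're checked before printing.

```python
def summarize_best_result(response_json: dict) -> None:
    """Print the top NBest entry from a detailed recognition response."""
    best = response_json["NBest"][0]  # candidates are ordered by confidence

    print("Display:  ", best["Display"])
    print("ITN:      ", best["ITN"])
    print("MaskedITN:", best["MaskedITN"])

    # Pronunciation assessment scores appear only when assessment was requested.
    for score in ("AccuracyScore", "FluencyScore", "CompletenessScore", "PronScore"):
        if score in best:
            print(f"{score}: {best[score]}")
```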
articles/cognitive-services/Speech-Service/rest-text-to-speech.md (1 addition, 1 deletion)
@@ -241,7 +241,7 @@ ogg-48khz-16bit-mono-opus
 ```

 > [!NOTE]
-> If your selected voice and output format have different bit rates, the audio is resampled as necessary. You can decode the ogg-24khz-16bit-mono-opus format by using the [Opus codec](https://opus-codec.org/downloads/).
+> If your selected voice and output format have different bit rates, the audio is resampled as necessary. You can decode the `ogg-24khz-16bit-mono-opus` format by using the [Opus codec](https://opus-codec.org/downloads/).
-When you're using the `Authorization: Bearer` header, you're required to make a request to the `issueToken` endpoint. In this request, you exchange your subscription key for an access token that's valid for 10 minutes. In the next few sections, you'll learn how to get a token and use a token.
+When you're using the `Authorization: Bearer` header, you're required to make a request to the `issueToken` endpoint. In this request, you exchange your subscription key for an access token that's valid for 10 minutes.
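A minimal sketch of that token exchange in Python, assuming the `issueToken` endpoint follows the `https://<region>.api.cognitive.microsoft.com/sts/v1.0/issueToken` pattern; the region and subscription key are placeholders.

```python
import requests

region = "westus"                           # placeholder
subscription_key = "YOUR_SUBSCRIPTION_KEY"  # placeholder

# Exchange the subscription key for an access token that's valid for 10 minutes.
token_url = f"https://{region}.api.cognitive.microsoft.com/sts/v1.0/issueToken"
response = requests.post(token_url, headers={"Ocp-Apim-Subscription-Key": subscription_key})
response.raise_for_status()
access_token = response.text

# Send the token on later requests instead of the subscription key.
auth_header = {"Authorization": f"Bearer {access_token}"}
print(auth_header)
```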