
Commit a2939d6

committed
edit pass: text-to-speech-to-text-2
1 parent e2d1c13 commit a2939d6

3 files changed: 10 additions & 10 deletions

articles/cognitive-services/Speech-Service/rest-speech-to-text.md

Lines changed: 6 additions & 6 deletions
@@ -30,12 +30,12 @@ Use REST API v3.0 to:
- Request the manifest of the models that you create, to set up on-premises containers.

REST API v3.0 includes such features as:
-- **Notifications-Webhooks**: All running processes of the service now support webhook notifications. REST API v3.0 provides the calls to enable you to register your webhooks where notifications are sent.
+- **Webhook notifications**: All running processes of the service now support webhook notifications. REST API v3.0 provides the calls to enable you to register your webhooks where notifications are sent.
- **Updating models behind endpoints**
- **Model adaptation with multiple datasets**: Adapt a model by using multiple dataset combinations of acoustic, language, and pronunciation data.
- **Bring your own storage**: Use your own storage accounts for logs, transcription files, and other data.

-For examples on using REST API v3.0 with batch transcription, see [How to use batch transcription](batch-transcription.md).
+For examples of using REST API v3.0 with batch transcription, see [How to use batch transcription](batch-transcription.md).

For information about migrating to the latest version of the speech-to-text REST API, see [Migrate code from v2.0 to v3.0 of the REST API](./migrate-v2-to-v3.md).
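To ground the v3.0 features listed above, here's a minimal sketch (not part of this commit) that lists batch transcriptions over REST; the `/speechtotext/v3.0` base path, the region, and the key are assumptions to verify against the v3.0 reference.

```python
import requests

# Assumed placeholders: the Speech resource region (for example, "westus") and its key.
# The /speechtotext/v3.0 base path is assumed here; confirm it against the v3.0 reference.
region = "<your-region>"
subscription_key = "YOUR_SUBSCRIPTION_KEY"
base_url = f"https://{region}.api.cognitive.microsoft.com/speechtotext/v3.0"

# List the batch transcriptions owned by this resource.
response = requests.get(
    f"{base_url}/transcriptions",
    headers={"Ocp-Apim-Subscription-Key": subscription_key},
)
response.raise_for_status()
for transcription in response.json().get("values", []):
    print(transcription.get("displayName"), transcription.get("status"))
```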

@@ -81,7 +81,7 @@ These parameters might be included in the query string of the REST request:
|-----------|-------------|---------------------|
| `language` | Identifies the spoken language that's being recognized. See [Supported languages](language-support.md#speech-to-text). | Required |
| `format` | Specifies the result format. Accepted values are `simple` and `detailed`. Simple results include `RecognitionStatus`, `DisplayText`, `Offset`, and `Duration`. Detailed responses include four different representations of display text. The default setting is `simple`. | Optional |
-| `profanity` | Specifies how to handle profanity in recognition results. Accepted values are: <br><br>`masked`, which replaces profanity with asterisks. <br>`removed`, which removes all profanity from the result. <br>`raw`, which includes the profanity in the result. <br><br>The default setting is `masked`. | Optional |
+| `profanity` | Specifies how to handle profanity in recognition results. Accepted values are: <br><br>`masked`, which replaces profanity with asterisks. <br>`removed`, which removes all profanity from the result. <br>`raw`, which includes profanity in the result. <br><br>The default setting is `masked`. | Optional |
| `cid` | When you're using the [Custom Speech portal](./custom-speech-overview.md) to create custom models, you can take advantage of the **Endpoint ID** value from the **Deployment** page. Use the **Endpoint ID** value as the argument to the `cid` query string parameter. | Optional |

### Request headers
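As a companion to the query-parameter table above, here's a minimal sketch (not part of this commit) of a short-audio recognition request that sets `language`, `format`, and `profanity`; the regional `stt.speech.microsoft.com` host, the audio file, and the key are assumed placeholders.

```python
import requests

# Assumed placeholders: region, key, and a 16-kHz mono PCM WAV file.
region = "<your-region>"
subscription_key = "YOUR_SUBSCRIPTION_KEY"
url = f"https://{region}.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1"

with open("sample.wav", "rb") as audio_file:
    response = requests.post(
        url,
        params={"language": "en-US", "format": "detailed", "profanity": "masked"},
        headers={
            "Ocp-Apim-Subscription-Key": subscription_key,
            "Content-Type": "audio/wav; codecs=audio/pcm; samplerate=16000",
        },
        data=audio_file,  # raw audio bytes in the request body
    )

response.raise_for_status()
print(response.json()["RecognitionStatus"])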
@@ -118,8 +118,8 @@ This table lists required and optional parameters for pronunciation assessment:
|-----------|-------------|---------------------|
| `ReferenceText` | The text that the pronunciation will be evaluated against. | Required |
| `GradingSystem` | The point system for score calibration. The `FivePoint` system gives a 0-5 floating point score, and `HundredMark` gives a 0-100 floating point score. Default: `FivePoint`. | Optional |
-| `Granularity` | The evaluation granularity. Accepted values are:<br><br> `Phoneme`, which shows the score on the full text, word, and phoneme levels.<br>`Word`, which shows the score on the full text and word levels. <br>`FullText`, which shows the score on the full text level only.<br><br> The default setting is `Phoneme`. | Optional |
-| `Dimension` | Defines the output criteria. Accepted values are:<br><br> `Basic`, which shows the accuracy score only. <br>`Comprehensive`, which shows scores on more dimensions (for example, fluency score and completeness score on the full text level, and error type on the word level).<br><br> To see definitions of different score dimensions and word error types, see [Response parameters](#response-parameters). The default setting is `Basic`. | Optional |
+| `Granularity` | The evaluation granularity. Accepted values are:<br><br> `Phoneme`, which shows the score on the full-text, word, and phoneme levels.<br>`Word`, which shows the score on the full-text and word levels. <br>`FullText`, which shows the score on the full-text level only.<br><br> The default setting is `Phoneme`. | Optional |
+| `Dimension` | Defines the output criteria. Accepted values are:<br><br> `Basic`, which shows the accuracy score only. <br>`Comprehensive`, which shows scores on more dimensions (for example, fluency score and completeness score on the full-text level, and error type on the word level).<br><br> To see definitions of different score dimensions and word error types, see [Response parameters](#response-parameters). The default setting is `Basic`. | Optional |
| `EnableMiscue` | Enables miscue calculation. With this parameter enabled, the pronounced words will be compared to the reference text. They'll be marked with omission or insertion based on the comparison. Accepted values are `False` and `True`. The default setting is `False`. | Optional |
| `ScenarioId` | A GUID that indicates a customized point system. | Optional |
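One plausible way to package the pronunciation-assessment parameters above, sketched here rather than taken from this commit, is to serialize them to JSON, base64-encode the result, and send it in a `Pronunciation-Assessment` request header; that transport and the sample values are assumptions.

```python
import base64
import json

# Example values for the parameters in the table above.
assessment_params = {
    "ReferenceText": "Good morning.",
    "GradingSystem": "HundredMark",
    "Granularity": "Phoneme",
    "Dimension": "Comprehensive",
    "EnableMiscue": "True",
}

# Assumption: the parameters travel as base64-encoded JSON in a
# Pronunciation-Assessment header on the recognition request.
header_value = base64.b64encode(
    json.dumps(assessment_params).encode("utf-8")
).decode("ascii")

extra_headers = {"Pronunciation-Assessment": header_value}
print(extra_headers)
```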

@@ -251,7 +251,7 @@ The object in the `NBest` list can include:
| `ITN` | The inverse-text-normalized (ITN) or canonical form of the recognized text, with phone numbers, numbers, abbreviations ("doctor smith" to "dr smith"), and other transformations applied. |
| `MaskedITN` | The ITN form with profanity masking applied, if requested. |
| `Display` | The display form of the recognized text, with punctuation and capitalization added. This parameter is the same as what `DisplayText` provides when the format is set to `simple`. |
-| `AccuracyScore` | Pronunciation accuracy of the speech. Accuracy indicates how closely the phonemes match a native speaker's pronunciation. The accuracy score at the word and full text levels is aggregated from the accuracy score at the phoneme level. |
+| `AccuracyScore` | Pronunciation accuracy of the speech. Accuracy indicates how closely the phonemes match a native speaker's pronunciation. The accuracy score at the word and full-text levels is aggregated from the accuracy score at the phoneme level. |
| `FluencyScore` | Fluency of the provided speech. Fluency indicates how closely the speech matches a native speaker's use of silent breaks between words. |
| `CompletenessScore` | Completeness of the speech, determined by calculating the ratio of pronounced words to reference text input. |
| `PronScore` | Overall score that indicates the pronunciation quality of the provided speech. This score is aggregated from `AccuracyScore`, `FluencyScore`, and `CompletenessScore` with weight. |
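Assuming a `format=detailed` response has already been parsed into a dictionary, a small sketch (illustrative values, not from this commit) of reading the `NBest` fields above:

```python
# `result` stands in for a parsed format=detailed recognition response;
# the field names come from the table above, the values are illustrative.
result = {
    "RecognitionStatus": "Success",
    "NBest": [
        {
            "ITN": "dr smith is at 555-0100",
            "MaskedITN": "dr smith is at 555-0100",
            "Display": "Dr. Smith is at 555-0100.",
            "AccuracyScore": 92.0,       # score fields appear when pronunciation
            "FluencyScore": 88.0,        # assessment is requested
            "CompletenessScore": 100.0,
            "PronScore": 91.0,
        }
    ],
}

best = result["NBest"][0]  # first candidate in the NBest list
print(best["Display"])
print("Pronunciation:", best["PronScore"])
```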

articles/cognitive-services/Speech-Service/rest-text-to-speech.md

Lines changed: 1 addition & 1 deletion
@@ -241,7 +241,7 @@ ogg-48khz-16bit-mono-opus
```

> [!NOTE]
-> If your selected voice and output format have different bit rates, the audio is resampled as necessary. You can decode the ogg-24khz-16bit-mono-opus format by using the [Opus codec](https://opus-codec.org/downloads/).
+> If your selected voice and output format have different bit rates, the audio is resampled as necessary. You can decode the `ogg-24khz-16bit-mono-opus` format by using the [Opus codec](https://opus-codec.org/downloads/).

### Request body
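For the output-format note above, here's a minimal sketch (not part of this commit) of a synthesis request that asks for `ogg-48khz-16bit-mono-opus`; the regional `tts.speech.microsoft.com` host, the `en-US-JennyNeural` voice, and the key are assumptions.

```python
import requests

# Assumed placeholders: region, key, and voice name.
region = "<your-region>"
subscription_key = "YOUR_SUBSCRIPTION_KEY"
url = f"https://{region}.tts.speech.microsoft.com/cognitiveservices/v1"

ssml = (
    "<speak version='1.0' xml:lang='en-US'>"
    "<voice xml:lang='en-US' name='en-US-JennyNeural'>"
    "Hello, world."
    "</voice></speak>"
)

response = requests.post(
    url,
    headers={
        "Ocp-Apim-Subscription-Key": subscription_key,
        "Content-Type": "application/ssml+xml",
        "X-Microsoft-OutputFormat": "ogg-48khz-16bit-mono-opus",
        "User-Agent": "speech-rest-sample",
    },
    data=ssml.encode("utf-8"),
)
response.raise_for_status()

with open("output.ogg", "wb") as audio_out:
    audio_out.write(response.content)  # Opus-in-Ogg audio bytes
```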

includes/cognitive-services-speech-service-rest-auth.md

Lines changed: 3 additions & 3 deletions
@@ -12,16 +12,16 @@ Each request requires an authorization header. This table illustrates which head

| Supported authorization header | Speech-to-text | Text-to-speech |
|------------------------|----------------|----------------|
-| Ocp-Apim-Subscription-Key | Yes | Yes |
-| Authorization: Bearer | Yes | Yes |
+| `Ocp-Apim-Subscription-Key` | Yes | Yes |
+| `Authorization: Bearer` | Yes | Yes |

When you're using the `Ocp-Apim-Subscription-Key` header, you're only required to provide your subscription key. For example:

```http
'Ocp-Apim-Subscription-Key': 'YOUR_SUBSCRIPTION_KEY'
```

-When you're using the `Authorization: Bearer` header, you're required to make a request to the `issueToken` endpoint. In this request, you exchange your subscription key for an access token that's valid for 10 minutes. In the next few sections, you'll learn how to get a token and use a token.
+When you're using the `Authorization: Bearer` header, you're required to make a request to the `issueToken` endpoint. In this request, you exchange your subscription key for an access token that's valid for 10 minutes.

### How to get an access token
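To round out the `Authorization: Bearer` flow above, a minimal sketch (not part of this commit) of exchanging a subscription key for an access token at the `issueToken` endpoint; the regional hostname is an assumption to confirm against the article's endpoint list.

```python
import requests

# Assumed placeholders: region and key.
region = "<your-region>"
subscription_key = "YOUR_SUBSCRIPTION_KEY"

# Exchange the subscription key for an access token (valid for 10 minutes).
token_response = requests.post(
    f"https://{region}.api.cognitive.microsoft.com/sts/v1.0/issueToken",
    headers={"Ocp-Apim-Subscription-Key": subscription_key},
)
token_response.raise_for_status()
access_token = token_response.text

# Use the token on subsequent requests instead of the key.
auth_headers = {"Authorization": f"Bearer {access_token}"}
print(auth_headers["Authorization"][:24], "...")
```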
