Commit 720410b

Merge pull request #95943 from wolfma61/master
change description of rest api audio duration. Update to the current …
2 parents b3a06ca + 02b0a53 commit 720410b

1 file changed (+13, -12 lines)

articles/cognitive-services/Speech-Service/rest-speech-to-text.md

Lines changed: 13 additions & 12 deletions
@@ -14,21 +14,22 @@ ms.author: erhopf
1414

1515
# Speech-to-text REST API
1616

17-
As an alternative to the [Speech SDK](speech-sdk.md), Speech Services allow you to convert speech-to-text using a REST API. Each accessible endpoint is associated with a region. Your application requires a subscription key for the endpoint you plan to use.
17+
As an alternative to the [Speech SDK](speech-sdk.md), Speech Services allows you to convert speech-to-text using a REST API. Each accessible endpoint is associated with a region. Your application requires a subscription key for the endpoint you plan to use.
1818

1919
Before using the speech-to-text REST API, understand:
20-
* Requests that use the REST API can only contain 10 seconds of recorded audio.
20+
21+
* Requests that use the REST API and transmit audio directly can only contain up to 60 seconds of audio.
2122
* The speech-to-text REST API only returns final results. Partial results are not provided.
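
The 60-second limit in the updated bullet can be checked on the client before a request is sent. The following is a minimal sketch, assuming the audio is an uncompressed WAV file that Python's standard `wave` module can read. The helper names (`wav_duration_seconds`, `fits_rest_limit`, `make_silent_wav`) are hypothetical and are not part of the Speech Services API.

```python
import io
import wave

# Limit stated in the updated article text for audio transmitted directly.
MAX_SECONDS = 60


def wav_duration_seconds(source):
    """Return the duration in seconds of a WAV file (path or file-like object)."""
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def fits_rest_limit(source, max_seconds=MAX_SECONDS):
    """Return True if the audio is short enough to send in a single REST request."""
    return wav_duration_seconds(source) <= max_seconds


def make_silent_wav(seconds, rate=16000):
    """Build an in-memory 16-bit mono PCM WAV of silence (demonstration only)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    buffer.seek(0)
    return buffer
```

Audio that fails this check would be a candidate for the Speech SDK or batch transcription, as the article suggests.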
2223

23-
If sending longer audio is a requirement for your application, consider using the [Speech SDK](speech-sdk.md) or [batch transcription](batch-transcription.md).
24+
If sending longer audio is a requirement for your application, consider using the [Speech SDK](speech-sdk.md) or a file-based REST API, like [batch transcription](batch-transcription.md).
2425

2526
[!INCLUDE [](../../../includes/cognitive-services-speech-service-rest-auth.md)]
2627

2728
## Regions and endpoints
2829

2930
These regions are supported for speech-to-text transcription using the REST API. Make sure that you select the endpoint that matches your subscription region.
3031

31-
[!INCLUDE [](../../../includes/cognitive-services-speech-service-endpoints-speech-to-text.md)]
32+
[!INCLUDE]
3233

3334
## Query parameters
3435

@@ -38,7 +39,7 @@ These parameters may be included in the query string of the REST request.
3839
|-----------|-------------|---------------------|
3940
| `language` | Identifies the spoken language that is being recognized. See [Supported languages](language-support.md#speech-to-text). | Required |
4041
| `format` | Specifies the result format. Accepted values are `simple` and `detailed`. Simple results include `RecognitionStatus`, `DisplayText`, `Offset`, and `Duration`. Detailed responses include multiple results with confidence values and four different representations. The default setting is `simple`. | Optional |
41-
| `profanity` | Specifies how to handle profanity in recognition results. Accepted values are `masked`, which replaces profanity with asterisks, `removed`, which remove all profanity from the result, or `raw`, which includes the profanity in the result. The default setting is `masked`. | Optional |
42+
| `profanity` | Specifies how to handle profanity in recognition results. Accepted values are `masked`, which replaces profanity with asterisks, `removed`, which removes all profanity from the result, or `raw`, which includes the profanity in the result. The default setting is `masked`. | Optional |
4243

4344
## Request headers
4445

@@ -50,8 +51,8 @@ This table lists required and optional headers for speech-to-text requests.
5051
| `Authorization` | An authorization token preceded by the word `Bearer`. For more information, see [Authentication](#authentication). | Either this header or `Ocp-Apim-Subscription-Key` is required. |
5152
| `Content-type` | Describes the format and codec of the provided audio data. Accepted values are `audio/wav; codecs=audio/pcm; samplerate=16000` and `audio/ogg; codecs=opus`. | Required |
5253
| `Transfer-Encoding` | Specifies that chunked audio data is being sent, rather than a single file. Only use this header if chunking audio data. | Optional |
53-
| `Expect` | If using chunked transfer, send `Expect: 100-continue`. The Speech Services acknowledge the initial request and awaits additional data.| Required if sending chunked audio data. |
54-
| `Accept` | If provided, it must be `application/json`. The Speech Services provide results in JSON. Some Web request frameworks provide an incompatible default value if you do not specify one, so it is good practice to always include `Accept`. | Optional, but recommended. |
54+
| `Expect` | If using chunked transfer, send `Expect: 100-continue`. The Speech Services acknowledges the initial request and awaits additional data.| Required if sending chunked audio data. |
55+
| `Accept` | If provided, it must be `application/json`. The Speech Services provides results in JSON. Some request frameworks provide an incompatible default value. It is good practice to always include `Accept`. | Optional, but recommended. |
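
The query parameters and request headers in the tables above could be assembled along these lines. This is a sketch only: `build_url` and `build_headers` are hypothetical helpers, the base URL is a placeholder for the regional endpoint that matches your subscription, and the path is copied from the sample request in this article.

```python
from urllib.parse import urlencode

# Path copied from the article's sample request.
RECOGNITION_PATH = "/speech/recognition/conversation/cognitiveservices/v1"
ALLOWED_FORMATS = {"simple", "detailed"}
ALLOWED_PROFANITY = {"masked", "removed", "raw"}
ALLOWED_CONTENT_TYPES = {
    "audio/wav; codecs=audio/pcm; samplerate=16000",
    "audio/ogg; codecs=opus",
}


def build_url(base_url, language, result_format="simple", profanity="masked"):
    """Return the request URL; base_url is a placeholder for your regional endpoint."""
    if result_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {result_format}")
    if profanity not in ALLOWED_PROFANITY:
        raise ValueError(f"unsupported profanity setting: {profanity}")
    query = urlencode(
        {"language": language, "format": result_format, "profanity": profanity}
    )
    return f"{base_url.rstrip('/')}{RECOGNITION_PATH}?{query}"


def build_headers(content_type, subscription_key=None, token=None, chunked=False):
    """Return request headers; either a token or a subscription key is required."""
    if content_type not in ALLOWED_CONTENT_TYPES:
        raise ValueError(f"unsupported content type: {content_type}")
    headers = {"Content-type": content_type, "Accept": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    elif subscription_key:
        headers["Ocp-Apim-Subscription-Key"] = subscription_key
    else:
        raise ValueError("provide a subscription key or an authorization token")
    if chunked:
        # Both headers accompany chunked audio, per the table above.
        headers["Transfer-Encoding"] = "chunked"
        headers["Expect"] = "100-continue"
    return headers
```

For example, `build_url("https://example.invalid", "en-US", "detailed")` yields a URL ending in `?language=en-US&format=detailed&profanity=masked`. `Accept` is always set because the table recommends it.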
5556

5657
## Audio formats
5758

@@ -67,7 +68,7 @@ Audio is sent in the body of the HTTP `POST` request. It must be in one of the f
6768
6869
## Sample request
6970

70-
This is a typical HTTP request. The sample below includes the hostname and required headers. It's important to note that the service also expects audio data, which is not included in this sample. As mentioned earlier, chunking is recommended, however, not required.
71+
The sample below includes the hostname and required headers. It's important to note that the service also expects audio data, which is not included in this sample. As mentioned earlier, chunking is recommended, however, not required.
7172

7273
```HTTP
7374
POST speech/recognition/conversation/cognitiveservices/v1?language=en-US&format=detailed HTTP/1.1
@@ -87,13 +88,13 @@ The HTTP status code for each response indicates success or common errors.
8788
|------------------|-------------|-----------------|
8889
| 100 | Continue | The initial request has been accepted. Proceed with sending the rest of the data. (Used with chunked transfer.) |
8990
| 200 | OK | The request was successful; the response body is a JSON object. |
90-
| 400 | Bad request | Language code not provided or is not a supported language; invalid audio file. |
91+
| 400 | Bad request | Language code not provided, not a supported language, invalid audio file, etc. |
9192
| 401 | Unauthorized | Subscription key or authorization token is invalid in the specified region, or invalid endpoint. |
9293
| 403 | Forbidden | Missing subscription key or authorization token. |
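
A client might translate the status codes in this table into actions roughly as follows. This is an illustrative sketch: `SpeechRequestError` and `check_status` are hypothetical names, and the descriptions are paraphrased from the table rather than taken from any service library.

```python
# Status codes and descriptions paraphrased from the table above.
STATUS_TABLE = {
    100: ("Continue", "initial request accepted; send the rest of the data"),
    200: ("OK", "request succeeded; the response body is a JSON object"),
    400: ("Bad request", "missing or unsupported language code, or invalid audio file"),
    401: ("Unauthorized", "invalid key or token for the region, or invalid endpoint"),
    403: ("Forbidden", "missing subscription key or authorization token"),
}


class SpeechRequestError(Exception):
    """Hypothetical exception type for failed speech-to-text requests."""


def check_status(code):
    """Return 'done' for 200, 'continue' for 100, and raise for anything else."""
    if code == 200:
        return "done"
    if code == 100:
        return "continue"
    name, reason = STATUS_TABLE.get(
        code, ("Unknown", "status code not listed in the table")
    )
    raise SpeechRequestError(f"{code} {name}: {reason}")
```

Treating 100 as a non-error matters only for chunked transfer, where the service acknowledges the initial request before the audio is sent.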
9394

9495
## Chunked transfer
9596

96-
Chunked transfer (`Transfer-Encoding: chunked`) can help reduce recognition latency because it allows the Speech Services to begin processing the audio file while it's being transmitted. The REST API does not provide partial or interim results. This option is intended solely to improve responsiveness.
97+
Chunked transfer (`Transfer-Encoding: chunked`) can help reduce recognition latency. It allows the Speech Services to begin processing the audio file while it is transmitted. The REST API does not provide partial or interim results.
9798

9899
This code sample shows how to send audio in chunks. Only the first chunk should contain the audio file's header. `request` is an HTTPWebRequest object connected to the appropriate REST endpoint. `audioFile` is the path to an audio file on disk.
99100

@@ -172,7 +173,7 @@ Each object in the `NBest` list includes:
172173

173174
## Sample responses
174175

175-
This is a typical response for `simple` recognition.
176+
A typical response for `simple` recognition:
176177

177178
```json
178179
{
@@ -183,7 +184,7 @@ This is a typical response for `simple` recognition.
183184
}
184185
```
185186

186-
This is a typical response for `detailed` recognition.
187+
A typical response for `detailed` recognition:
187188

188189
```json
189190
{
