
Commit 137a671

Merge pull request #263805 from eric-urban/eur/jan-freshness-7: freshness and editorial pass

2 parents 99552b1 + b5e2616


43 files changed: +372 −373 lines changed

articles/ai-services/speech-service/sovereign-clouds.md
Lines changed: 3 additions & 3 deletions

@@ -7,7 +7,7 @@ manager: nitinme
 ms.service: azure-ai-speech
 ms.topic: how-to
 ms.custom: references_regions
-ms.date: 11/17/2023
+ms.date: 1/21/2024
 ms.author: alexeyo
 ---

@@ -64,7 +64,7 @@ Replace `<REGION_IDENTIFIER>` with the identifier matching the region of your su
 #### Speech SDK

-For [Speech SDK](speech-sdk.md) in sovereign clouds you need to use "from host / with host" instantiation of `SpeechConfig` class or `--host` option of [Speech CLI](spx-overview.md). (You may also use "from endpoint / with endpoint" instantiation and `--endpoint` Speech CLI option).
+For [Speech SDK](speech-sdk.md) in sovereign clouds, you need to use "from host / with host" instantiation of the `SpeechConfig` class or the `--host` option of the [Speech CLI](spx-overview.md). (You can also use "from endpoint / with endpoint" instantiation and the `--endpoint` Speech CLI option.)

 `SpeechConfig` class should be instantiated like this:

@@ -161,7 +161,7 @@ Replace `<REGION_IDENTIFIER>` with the identifier matching the region of your su
 #### Speech SDK

-For [Speech SDK](speech-sdk.md) in sovereign clouds you need to use "from host / with host" instantiation of `SpeechConfig` class or `--host` option of [Speech CLI](spx-overview.md). (You may also use "from endpoint / with endpoint" instantiation and `--endpoint` Speech CLI option).
+For [Speech SDK](speech-sdk.md) in sovereign clouds, you need to use "from host / with host" instantiation of the `SpeechConfig` class or the `--host` option of the [Speech CLI](spx-overview.md). (You can also use "from endpoint / with endpoint" instantiation and the `--endpoint` Speech CLI option.)

 `SpeechConfig` class should be instantiated like this:
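The "from host" pattern above amounts to pointing the SDK at a sovereign-cloud host URL instead of a public region. A minimal sketch of building that URL, assuming the `wss://<REGION_IDENTIFIER>.stt.speech.azure.us` host shape implied by the article's placeholder; the actual SDK call is shown only as a comment because it requires the Speech SDK and a sovereign-cloud subscription:

```python
def sovereign_host(region_identifier: str, suffix: str = "stt.speech.azure.us") -> str:
    """Build a 'from host' URL; the wss scheme and suffix are assumptions for illustration."""
    return f"wss://{region_identifier}.{suffix}"

host = sovereign_host("<REGION_IDENTIFIER>")

# With the Speech SDK installed, instantiation would look roughly like:
#   import azure.cognitiveservices.speech as speechsdk
#   speech_config = speechsdk.SpeechConfig(host=host, subscription="YourSubscriptionKey")
print(host)
```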

articles/ai-services/speech-service/speaker-recognition-overview.md
Lines changed: 4 additions & 4 deletions

@@ -6,7 +6,7 @@ author: eric-urban
 manager: nitinme
 ms.service: azure-ai-speech
 ms.topic: overview
-ms.date: 01/08/2022
+ms.date: 1/21/2024
 ms.author: eur
 ms.custom: cog-serv-seo-aug-2020, ignite-fall-2021
 keywords: speaker recognition, voice biometry

@@ -54,9 +54,9 @@ Enrollment for speaker identification is text-independent. There are no restrict
 Speaker enrollment data is stored in a secured system, including the speech audio for enrollment and the voice signature features. The speech audio for enrollment is only used when the algorithm is upgraded, and the features need to be extracted again. The service doesn't retain the speech recording or the extracted voice features that are sent to the service during the recognition phase.

-You control how long data should be retained. You can create, update, and delete enrollment data for individual speakers through API calls. When the subscription is deleted, all the speaker enrollment data associated with the subscription will also be deleted.
+You control how long data should be retained. You can create, update, and delete enrollment data for individual speakers through API calls. When the subscription is deleted, all the speaker enrollment data associated with the subscription is also deleted.

-As with all of the Azure AI services resources, developers who use the speaker recognition feature must be aware of Microsoft policies on customer data. You should ensure that you have received the appropriate permissions from the users. You can find more details in [Data and privacy for speaker recognition](/legal/cognitive-services/speech-service/speaker-recognition/data-privacy-speaker-recognition). For more information, see the [Azure AI services page](https://azure.microsoft.com/support/legal/cognitive-services-compliance-and-privacy/) on the Microsoft Trust Center.
+As with all of the Azure AI services resources, developers who use the speaker recognition feature must be aware of Microsoft policies on customer data. You should ensure that you received the appropriate permissions from the users. You can find more details in [Data and privacy for speaker recognition](/legal/cognitive-services/speech-service/speaker-recognition/data-privacy-speaker-recognition). For more information, see the [Azure AI services page](https://azure.microsoft.com/support/legal/cognitive-services-compliance-and-privacy/) on the Microsoft Trust Center.

 ## Common questions and solutions

@@ -72,7 +72,7 @@ As with all of the Azure AI services resources, developers who use the speaker r
 ## Responsible AI

-An AI system includes not only the technology, but also the people who will use it, the people who will be affected by it, and the environment in which it is deployed. Read the transparency notes to learn about responsible AI use and deployment in your systems.
+An AI system includes not only the technology, but also the people who use it, the people who are affected by it, and the environment in which it's deployed. Read the transparency notes to learn about responsible AI use and deployment in your systems.

 * [Transparency note and use cases](/legal/cognitive-services/speech-service/speaker-recognition/transparency-note-speaker-recognition?context=/azure/ai-services/speech-service/context/context)
 * [Characteristics and limitations](/legal/cognitive-services/speech-service/speaker-recognition/characteristics-and-limitations-speaker-recognition?context=/azure/ai-services/speech-service/context/context)

articles/ai-services/speech-service/speech-container-batch-processing.md
Lines changed: 17 additions & 17 deletions

@@ -6,7 +6,7 @@ author: eric-urban
 manager: nitinme
 ms.service: azure-ai-speech
 ms.topic: how-to
-ms.date: 10/22/2020
+ms.date: 1/21/2024
 ms.author: eur
 ---

@@ -16,15 +16,15 @@ Use the batch processing kit to complement and scale out workloads on Speech con
 :::image type="content" source="media/containers/general-diagram.png" alt-text="A diagram showing an example batch-kit container workflow.":::

-The batch kit container is available for free on [GitHub](https://github.com/microsoft/batch-processing-kit) and [Docker hub](https://hub.docker.com/r/batchkit/speech-batch-kit/tags). You are only [billed](speech-container-overview.md#billing) for the Speech containers you use.
+The batch kit container is available for free on [GitHub](https://github.com/microsoft/batch-processing-kit) and [Docker Hub](https://hub.docker.com/r/batchkit/speech-batch-kit/tags). You're only [billed](speech-container-overview.md#billing) for the Speech containers you use.

 | Feature | Description |
 |---------|---------|
 | Batch audio file distribution | Automatically dispatch large numbers of files to on-premises or cloud-based Speech container endpoints. Files can be on any POSIX-compliant volume, including network filesystems. |
 | Speech SDK integration | Pass common flags to the Speech SDK, including: n-best hypotheses, diarization, language, profanity masking. |
 | Run modes | Run the batch client once, continuously in the background, or create HTTP endpoints for audio files. |
 | Fault tolerance | Automatically retry and continue transcription without losing progress, and differentiate between which errors can, and can't be retried on. |
-| Endpoint availability detection | If an endpoint becomes unavailable, the batch client will continue transcribing, using other container endpoints. After becoming available again, the client will automatically begin using the endpoint. |
+| Endpoint availability detection | If an endpoint becomes unavailable, the batch client continues transcribing, using other container endpoints. When the endpoint is available again, the client automatically begins using it. |
 | Endpoint hot-swapping | Add, remove, or modify Speech container endpoints during runtime without interrupting the batch progress. Updates are immediate. |
 | Real-time logging | Real-time logging of attempted requests, timestamps, and failure reasons, with Speech SDK log files for each audio file. |

@@ -72,7 +72,7 @@ MyContainer3:
 This yaml example specifies three speech containers on three hosts. The first host is specified by an IPv4 address, the second is running on the same VM as the batch-client, and the third container is specified by the DNS hostname of another VM. The `concurrency` value specifies the maximum concurrent file transcriptions that can run on the same container. The `rtf` (Real-Time Factor) value is optional and can be used to tune performance.

-The batch client can dynamically detect if an endpoint becomes unavailable (for example, due to a container restart or networking issue), and when it becomes available again. Transcription requests will not be sent to containers that are unavailable, and the client will continue using other available containers. You can add, remove, or edit endpoints at any time without interrupting the progress of your batch.
+The batch client can dynamically detect if an endpoint becomes unavailable (for example, due to a container restart or networking issue), and when it becomes available again. Transcription requests won't be sent to containers that are unavailable, and the client continues using other available containers. You can add, remove, or edit endpoints at any time without interrupting the progress of your batch.
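For reference, a hypothetical `config.yaml` matching the description above. The `MyContainer3` name comes from the hunk context, and `concurrency` and `rtf` are named in the paragraph; the hosts, ports, and values are placeholders, not the article's actual example:

```yaml
MyContainer1:
  concurrency: 5
  host: 192.168.0.100        # first host: an IPv4 address
  port: 5000
  rtf: 3                     # optional Real-Time Factor tuning
MyContainer2:
  concurrency: 5
  host: localhost            # running on the same VM as the batch client
  port: 5000
MyContainer3:
  concurrency: 5
  host: othervm.contoso.com  # DNS hostname of another VM
  port: 5000
```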

@@ -538,10 +538,10 @@ The batch client can dynamically detect if an endpoint becomes unavailable (for
 > [!NOTE]
 > * This example uses the same directory (`/my_nfs`) for the configuration file and the inputs, outputs, and logs directories. You can use hosted or NFS-mounted directories for these folders.
-> * Running the client with `–h` will list the available command-line parameters, and their default values.
+> * Running the client with the `-h` flag lists the available command-line parameters and their default values.
 > * The batch processing container is only supported on Linux.

-Use the Docker `run` command to start the container. This will start an interactive shell inside the container.
+Use the Docker `run` command to start the container. This command starts an interactive shell inside the container.

@@ -943,7 +943,7 @@ docker run --network host --rm -ti -v /mnt/my_nfs:/my_nfs docker.io/batchkit/spe

-The client will start running. If an audio file has already been transcribed in a previous run, the client will automatically skip the file. Files are sent with an automatic retry if transient errors occur, and you can differentiate between which errors you want to the client to retry on. On a transcription error, the client will continue transcription, and can retry without losing progress.
+The client starts running. If an audio file was transcribed in a previous run, the client automatically skips the file. Files are sent with an automatic retry if transient errors occur, and you can differentiate between which errors you want the client to retry on. On a transcription error, the client continues transcription, and can retry without losing progress.

 ## Run modes

@@ -955,9 +955,9 @@ The batch processing kit offers three modes, using the `--run-mode` parameter.
 :::image type="content" source="media/containers/batch-oneshot-mode.png" alt-text="A diagram showing the batch-kit container processing files in oneshot mode.":::

-1. Define the Speech container endpoints that the batch client will use in the `config.yaml` file.
+1. Define the Speech container endpoints that the batch client uses in the `config.yaml` file.
 2. Place audio files for transcription in an input directory.
-3. Invoke the container on the directory, which will begin processing the files. If the audio file has already been transcribed in a previous run with the same output directory (same file name and checksum), the client will skip the file.
+3. Invoke the container on the directory to begin processing the files. If the audio file was already transcribed in a previous run with the same output directory (same file name and checksum), the client skips the file.
 4. The files are dispatched to the container endpoints from step 1.
 5. Logs and the Speech container output are returned to the specified output directory.
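The name-plus-checksum skip rule in step 3 can be sketched in a few lines. This is illustrative bookkeeping only; the marker-file scheme and the MD5 choice are assumptions, not the kit's actual implementation:

```python
import hashlib
from pathlib import Path


def checksum(path: Path) -> str:
    """Hash the file contents (the kit's actual hash algorithm is an assumption here)."""
    return hashlib.md5(path.read_bytes()).hexdigest()


def should_skip(audio_file: Path, output_dir: Path) -> bool:
    """Skip when the output directory already records the same file name and checksum."""
    marker = output_dir / (audio_file.name + ".done")
    return marker.exists() and marker.read_text() == checksum(audio_file)


def mark_done(audio_file: Path, output_dir: Path) -> None:
    """Record a finished file so a later run with the same output directory skips it."""
    (output_dir / (audio_file.name + ".done")).write_text(checksum(audio_file))
```

Note that a file with the same name but changed contents would hash differently and be transcribed again, which matches the "same file name and checksum" wording.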

@@ -966,13 +966,13 @@ The batch processing kit offers three modes, using the `--run-mode` parameter.
 > [!TIP]
 > If multiple files are added to the input directory at the same time, you can improve performance by instead adding them in a regular interval.

-`DAEMON` mode transcribes existing files in a given folder, and continuously transcribes new audio files as they are added.
+`DAEMON` mode transcribes existing files in a given folder, and continuously transcribes new audio files as they're added.

 :::image type="content" source="media/containers/batch-daemon-mode.png" alt-text="A diagram showing batch-kit container processing files in daemon mode.":::

-1. Define the Speech container endpoints that the batch client will use in the `config.yaml` file.
-2. Invoke the container on an input directory. The batch client will begin monitoring the directory for incoming files.
-3. Set up continuous audio file delivery to the input directory. If the audio file has already been transcribed in a previous run with the same output directory (same file name and checksum), the client will skip the file.
+1. Define the Speech container endpoints that the batch client uses in the `config.yaml` file.
+2. Invoke the container on an input directory. The batch client begins monitoring the directory for incoming files.
+3. Set up continuous audio file delivery to the input directory. If the audio file was transcribed in a previous run with the same output directory (same file name and checksum), the client skips the file.
 4. Once a file write or POSIX signal is detected, the container is triggered to respond.
 5. The files are dispatched to the container endpoints from step 1.
 6. Logs and the Speech container output are returned to the specified output directory.
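The directory monitoring in step 2 can be approximated with a simple polling loop. This is a simplified sketch only; as step 4 notes, the real client reacts to file writes and POSIX signals rather than polling on a timer:

```python
import time
from pathlib import Path
from typing import Callable, Optional


def watch_directory(
    input_dir: Path,
    handle: Callable[[Path], None],
    poll_seconds: float = 1.0,
    max_polls: Optional[int] = None,
) -> None:
    """Dispatch each new *.wav file in input_dir exactly once to handle()."""
    seen: set = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        for audio in sorted(input_dir.glob("*.wav")):
            if audio not in seen:
                seen.add(audio)
                handle(audio)  # e.g. enqueue for a container endpoint
        polls += 1
        if max_polls is None or polls < max_polls:
            time.sleep(poll_seconds)
```

In a real daemon the loop would run until stopped; `max_polls` is included here only so the sketch can terminate.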
@@ -983,16 +983,16 @@ The batch processing kit offers three modes, using the `--run-mode` parameter.
 :::image type="content" source="media/containers/batch-rest-api-mode.png" alt-text="A diagram showing the batch-kit container processing files in REST mode.":::

-1. Define the Speech container endpoints that the batch client will use in the `config.yaml` file.
+1. Define the Speech container endpoints that the batch client uses in the `config.yaml` file.
 2. Send an HTTP request to one of the API server's endpoints.

 |Endpoint |Description |
 |---------|---------|
 |`/submit` | Endpoint for creating new batch requests. |
-|`/status` | Endpoint for checking the status of a batch request. The connection will stay open until the batch completes. |
+|`/status` | Endpoint for checking the status of a batch request. The connection stays open until the batch completes. |
 |`/watch` | Endpoint for using HTTP long polling until the batch completes. |

-3. Audio files are uploaded from the input directory. If the audio file has already been transcribed in a previous run with the same output directory (same file name and checksum), the client will skip the file.
+3. Audio files are uploaded from the input directory. If the audio file was transcribed in a previous run with the same output directory (same file name and checksum), the client skips the file.
 4. If a request is sent to the `/submit` endpoint, the files are dispatched to the container endpoints from step 1.
 5. Logs and the Speech container output are returned to the specified output directory.
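A minimal client sketch against two of the endpoints above, using only the standard library. The JSON request and response shapes (`files`, `batch_id`, the `batch` query parameter) are invented for illustration and aren't part of the documented API:

```python
import json
import urllib.request


def submit_batch(base_url: str, files: list) -> str:
    """POST a file list to /submit; assumes a JSON body and a {'batch_id': ...} reply."""
    request = urllib.request.Request(
        base_url + "/submit",
        data=json.dumps({"files": files}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["batch_id"]


def wait_for_batch(base_url: str, batch_id: str) -> dict:
    """Call /watch; per the table above, the request completes when the batch does."""
    with urllib.request.urlopen(f"{base_url}/watch?batch={batch_id}") as response:
        return json.load(response)
```

Because `/watch` is long polling, `wait_for_batch` simply blocks inside `urlopen` until the server responds.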

@@ -1004,7 +1004,7 @@ The batch processing kit offers three modes, using the `--run-mode` parameter.
 The client creates a *run.log* file in the directory specified by the `-log_folder` argument in the docker `run` command. Logs are captured at the DEBUG level by default. The same logs are sent to the `stdout/stderr`, and filtered depending on the `-file_log_level` or `console_log_level` arguments. This log is only necessary for debugging, or if you need to send a trace for support. The logging folder also contains the Speech SDK logs for each audio file.

-The output directory specified by `-output_folder` will contain a *run_summary.json* file, which is periodically rewritten every 30 seconds or whenever new transcriptions are finished. You can use this file to check on progress as the batch proceeds. It will also contain the final run statistics and final status of every file when the batch is completed. The batch is completed when the process has a clean exit.
+The output directory specified by `-output_folder` contains a *run_summary.json* file, which is rewritten every 30 seconds or whenever new transcriptions finish. You can use this file to check on progress as the batch proceeds. It also contains the final run statistics and final status of every file when the batch is completed. The batch is completed when the process has a clean exit.
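Checking progress then reduces to re-reading *run_summary.json* between rewrites. A sketch, with a made-up per-file schema since the file's actual fields aren't shown in this diff:

```python
import json
from pathlib import Path


def batch_progress(output_folder: str) -> str:
    """Summarize progress from run_summary.json (the 'files'/'status' schema is hypothetical)."""
    summary = json.loads((Path(output_folder) / "run_summary.json").read_text())
    statuses = [entry["status"] for entry in summary["files"]]
    done = sum(1 for status in statuses if status in ("succeeded", "failed"))
    return f"{done}/{len(statuses)} files finished"
```

Because the file is rewritten rather than appended to, each read gives a consistent snapshot of the batch.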

 ## Next steps
