Commit 122b936

Merge pull request #234248 from eric-urban/eur/input-stream
audio input stream
2 parents 19360d5 + 1a9c3f0

File tree: 1 file changed (+51 −44)


articles/cognitive-services/Speech-Service/how-to-use-audio-input-streams.md

Lines changed: 51 additions & 44 deletions
@@ -3,13 +3,13 @@ title: Speech SDK audio input stream concepts
 titleSuffix: Azure Cognitive Services
 description: An overview of the capabilities of the Speech SDK audio input stream API.
 services: cognitive-services
-author: fmegen
+author: eric-urban
 manager: nitinme
 ms.service: cognitive-services
 ms.subservice: speech-service
 ms.topic: how-to
-ms.date: 06/13/2022
-ms.author: fmegen
+ms.date: 04/12/2023
+ms.author: eur
 ms.devlang: csharp
 ms.custom: devx-track-csharp
 ---
@@ -18,64 +18,71 @@ ms.custom: devx-track-csharp

 The Speech SDK provides a way to stream audio into the recognizer as an alternative to microphone or file input.

-The following steps are required when you use audio input streams:
+This guide describes how to use audio input streams. It also describes some of the requirements and limitations of the audio input stream.

-- Identify the format of the audio stream. The format must be supported by the Speech SDK and the Azure Cognitive Services Speech service. Currently, only the following configuration is supported:
+See more examples of speech-to-text recognition with audio input stream on [GitHub](https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/samples/csharp/sharedcontent/console/speech_recognition_samples.cs).

-Audio samples are:
+## Identify the format of the audio stream

-- PCM format (int-16)
-- One channel
-- 16 bits per sample, 8,000 or 16,000 samples per second (16,000 bytes or 32,000 bytes per second)
-- Two-block aligned (16 bit including padding for a sample)
+Identify the format of the audio stream. The format must be supported by the Speech SDK and the Azure Cognitive Services Speech service.

-The corresponding code in the SDK to create the audio format looks like this example:
+Supported audio samples are:

-```csharp
-byte channels = 1;
-byte bitsPerSample = 16;
-int samplesPerSecond = 16000; // or 8000
-var audioFormat = AudioStreamFormat.GetWaveFormatPCM(samplesPerSecond, bitsPerSample, channels);
-```
+- PCM format (int-16)
+- One channel
+- 16 bits per sample, 8,000 or 16,000 samples per second (16,000 bytes or 32,000 bytes per second)
+- Two-block aligned (16 bit including padding for a sample)

-- Make sure that your code provides the RAW audio data according to these specifications. Also, make sure that 16-bit samples arrive in little-endian format. Signed samples are also supported. If your audio source data doesn't match the supported formats, the audio must be transcoded into the required format.
+The corresponding code in the SDK to create the audio format looks like this example:

-- Create your own audio input stream class derived from `PullAudioInputStreamCallback`. Implement the `Read()` and `Close()` members. The exact function signature is language-dependent, but the code looks similar to this code sample:
+```csharp
+byte channels = 1;
+byte bitsPerSample = 16;
+int samplesPerSecond = 16000; // or 8000
+var audioFormat = AudioStreamFormat.GetWaveFormatPCM(samplesPerSecond, bitsPerSample, channels);
+```

-```csharp
-public class ContosoAudioStream : PullAudioInputStreamCallback {
-    ContosoConfig config;
+Make sure that your code provides the RAW audio data according to these specifications. Also, make sure that 16-bit samples arrive in little-endian format. Signed samples are also supported. If your audio source data doesn't match the supported formats, the audio must be transcoded into the required format.

-    public ContosoAudioStream(const ContosoConfig& config) {
-        this.config = config;
-    }
+## Create your own audio input stream class

-    public int Read(byte[] buffer, uint size) {
-        // Returns audio data to the caller.
-        // E.g., return read(config.YYY, buffer, size);
-    }
+You can create your own audio input stream class derived from `PullAudioInputStreamCallback`. Implement the `Read()` and `Close()` members. The exact function signature is language-dependent, but the code looks similar to this code sample:

-    public void Close() {
-        // Close and clean up resources.
-    }
-};
-```
+```csharp
+public class ContosoAudioStream : PullAudioInputStreamCallback {
+    ContosoConfig config;

-- Create an audio configuration based on your audio format and input stream. Pass in both your regular speech configuration and the audio input configuration when you create your recognizer. For example:
+    public ContosoAudioStream(const ContosoConfig& config) {
+        this.config = config;
+    }

-```csharp
-var audioConfig = AudioConfig.FromStreamInput(new ContosoAudioStream(config), audioFormat);
+    public int Read(byte[] buffer, uint size) {
+        // Returns audio data to the caller.
+        // E.g., return read(config.YYY, buffer, size);
+    }

-var speechConfig = SpeechConfig.FromSubscription(...);
-var recognizer = new SpeechRecognizer(speechConfig, audioConfig);
+    public void Close() {
+        // Close and clean up resources.
+    }
+};
+```

-// Run stream through recognizer.
-var result = await recognizer.RecognizeOnceAsync();
+Create an audio configuration based on your audio format and input stream. Pass in both your regular speech configuration and the audio input configuration when you create your recognizer. For example:
+
+```csharp
+var audioConfig = AudioConfig.FromStreamInput(new ContosoAudioStream(config), audioFormat);
+
+var speechConfig = SpeechConfig.FromSubscription(...);
+var recognizer = new SpeechRecognizer(speechConfig, audioConfig);
+
+// Run stream through recognizer.
+var result = await recognizer.RecognizeOnceAsync();
+
+var text = result.GetText();
+```

-var text = result.GetText();
-```

 ## Next steps

 - [Create a free Azure account](https://azure.microsoft.com/free/cognitive-services/)
-- [See how to recognize speech in C#](./get-started-speech-to-text.md?pivots=programming-language-csharp&tabs=dotnet)
+- [See how to recognize speech in C#](./get-started-speech-to-text.md?pivots=programming-language-csharp&tabs=dotnet)
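For reference, here is how the pieces added in this change fit together end to end. This is a minimal, illustrative sketch rather than part of the committed article: the `ContosoAudioStream` below pulls raw 16 kHz, 16-bit, mono PCM from a hypothetical file, and the file name, subscription key, and region are placeholders to replace with your own values. In C#, the recognized text is read from `result.Text`.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

// Pull-stream callback that feeds raw 16 kHz, 16-bit, mono PCM to the recognizer.
// Reading from a file here stands in for whatever source the article's ContosoConfig abstracts.
public class ContosoAudioStream : PullAudioInputStreamCallback
{
    private readonly Stream source;

    public ContosoAudioStream(string rawPcmFilePath)
    {
        source = File.OpenRead(rawPcmFilePath);
    }

    // Fills 'buffer' with up to 'size' bytes of audio; returning 0 signals the end of the stream.
    public override int Read(byte[] buffer, uint size)
    {
        return source.Read(buffer, 0, (int)size);
    }

    public override void Close()
    {
        // Close and clean up resources.
        source.Dispose();
    }
}

public static class Program
{
    public static async Task Main()
    {
        // One channel, 16 bits per sample, 16,000 samples per second, per the format requirements above.
        var audioFormat = AudioStreamFormat.GetWaveFormatPCM(16000, 16, 1);

        // Placeholder Speech resource key and region.
        var speechConfig = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");

        using var callback = new ContosoAudioStream("hello-world.raw"); // hypothetical raw PCM file
        using var audioConfig = AudioConfig.FromStreamInput(callback, audioFormat);
        using var recognizer = new SpeechRecognizer(speechConfig, audioConfig);

        // Run the stream through the recognizer and print the first recognized utterance.
        var result = await recognizer.RecognizeOnceAsync();
        Console.WriteLine($"RECOGNIZED: {result.Text}");
    }
}
```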
