| 1 | +--- |
| 2 | +title: Speech Endpoints |
| 3 | +description: Processing speech with AI Server |
| 4 | +--- |
| 5 | + |
| 6 | +AI Server provides endpoints for speech-related tasks, including Speech-to-Text and Text-to-Speech conversion. These endpoints use AI models to process audio and text data. |
| 7 | + |
| 8 | +The following tasks are available for speech processing: |
| 9 | + |
| 10 | +- **Speech to Text**: Convert audio input to text output. |
| 11 | +- **Text to Speech**: Convert text input to audio output. |
| 12 | + |
| 13 | +## Using Speech Endpoints |
| 14 | + |
| 15 | +These endpoints work like other AI Server endpoints. You can provide a RefId and Tag to help categorize the request. For Queue requests, you can also provide a ReplyTo URL, which receives a POST request when the request is complete. |
| 16 | + |
| 17 | +### Speech to Text {#speech-to-text} |
| 18 | + |
| 19 | +The Speech to Text endpoint converts audio input into text. It provides two types of output: |
| 20 | + |
| 21 | +1. Text with timestamps: JSON format with `start` and `end` timestamps for each segment. |
| 22 | +2. Plain text: The full transcription without timestamps. |
| 23 | + |
| 24 | +Both outputs are returned in the `TextOutputs` array. The timestamped output is a JSON string that must be parsed to extract each segment's text and timestamps. |
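
The parsing step described above can be sketched as follows. This is a minimal illustration, not AI Server's documented schema: only the `start` and `end` fields are named in the text, so the `text` field, the optional `segments` wrapper, the seconds unit, and the `Segment` and `parse_segments` names are all assumptions made for the example.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    start: float  # segment start time (seconds is an assumed unit)
    end: float    # segment end time
    text: str     # transcribed text for this segment (assumed field name)


def parse_segments(raw_json: str) -> List[Segment]:
    """Parse a timestamped transcription JSON string into Segment objects."""
    data = json.loads(raw_json)
    # Accept either a bare list of segments or an object wrapping them
    # under a "segments" key; both shapes are assumptions.
    items = data.get("segments", []) if isinstance(data, dict) else data
    return [
        Segment(
            start=float(item["start"]),
            end=float(item["end"]),
            text=str(item.get("text", "")).strip(),
        )
        for item in items
    ]


# Hypothetical payload for illustration only.
sample = (
    '{"segments": ['
    '{"start": 0.0, "end": 1.5, "text": " Hello"}, '
    '{"start": 1.5, "end": 3.0, "text": " world"}]}'
)
segments = parse_segments(sample)
transcript = " ".join(s.text for s in segments)
```

If the actual payload uses different field names, only the lookups inside `parse_segments` need to change; the rest of the calling code can keep working with `Segment` objects.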
| 25 | + |
| 26 | +::include ai-server/cs/speech-to-text-1.cs.md:: |
| 27 | + |
| 28 | +### Queue Speech to Text {#queue-speech-to-text} |
| 29 | + |
| 30 | +For longer audio files or when you want to process the request asynchronously, you can use the Queue Speech to Text endpoint. |
| 31 | + |
| 32 | +::include ai-server/cs/queue-speech-to-text-1.cs.md:: |
| 33 | + |
| 34 | +### Text to Speech {#text-to-speech} |
| 35 | + |
| 36 | +The Text to Speech endpoint converts text input into audio output. |
| 37 | + |
| 38 | +::include ai-server/cs/text-to-speech-1.cs.md:: |
| 39 | + |
| 40 | +### Queue Text to Speech {#queue-text-to-speech} |
| 41 | + |
| 42 | +For generating longer audio files or when you want to process the request asynchronously, you can use the Queue Text to Speech endpoint. |
| 43 | + |
| 44 | +::include ai-server/cs/queue-text-to-speech-1.cs.md:: |
| 45 | + |