doc/docs/config/asr.md: 11 additions & 0 deletions
@@ -6,15 +6,23 @@ sidebar_position: 2

 The EchoKit server supports popular ASR providers.

+| Platform | URL example | Notes |
+| ------------- | ------------- | ---- |
+|`openai`|`https://api.openai.com/v1/audio/transcriptions`| Supports endpoint URLs from any OpenAI-compatible service, such as Groq and Open Router. |
+|`paraformer_v2`|`wss://dashscope.aliyuncs.com/api-ws/v1/inference`| A WebSocket streaming ASR service endpoint provided by Alibaba Cloud. |
+

 ## OpenAI and compatible services

 The OpenAI `/v1/audio/transcriptions` API is supported by OpenAI, Open Router, Groq, Azure, AWS and many other providers.
+This is a non-streaming service endpoint, meaning that the EchoKit server must determine when the user has finished
+talking (via a VAD service) and then submit the entire audio to get a transcription.
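
The non-streaming flow described in this hunk can be sketched as follows. This is a minimal illustration, not EchoKit's implementation: the function names, the `is_speech` callback standing in for the VAD service, the `transcribe` callback standing in for the HTTP request, and the three-frame silence threshold are all assumptions made for the example.

```python
from typing import Callable, Iterable


def collect_utterance(
    frames: Iterable[bytes],
    is_speech: Callable[[bytes], bool],
    max_trailing_silence: int = 3,  # hypothetical threshold, in frames
) -> bytes:
    """Buffer audio until speech is followed by enough consecutive silent frames."""
    buffered = []
    heard_speech = False
    silence_run = 0
    for frame in frames:
        if is_speech(frame):
            heard_speech = True
            silence_run = 0
            buffered.append(frame)
        elif heard_speech:
            # Keep short pauses so words are not clipped mid-utterance.
            silence_run += 1
            buffered.append(frame)
            if silence_run >= max_trailing_silence:
                break  # the (hypothetical) VAD says the user is done talking
        # Silence that arrives before any speech is dropped.
    return b"".join(buffered)


def transcribe_utterance(
    frames: Iterable[bytes],
    is_speech: Callable[[bytes], bool],
    transcribe: Callable[[bytes], str],
) -> str:
    """Submit the whole utterance in one request, as a non-streaming endpoint requires."""
    audio = collect_utterance(frames, is_speech)
    if not audio:
        return ""  # nothing was said, so no request is made
    return transcribe(audio)
```

The point of the sketch is the ordering: no audio reaches the transcription endpoint until the end of speech has been detected, which is why a non-streaming platform needs a VAD step in front of it.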
doc/docs/config/intro.md: 13 additions & 5 deletions
@@ -45,6 +45,12 @@ The rest of the `config.toml` specifies how to use different AI services. Each s
 * The `[llm]` section configures the [large language model](llm.md) services, including [tools](llm-tools.md) and [MCP actions](mcp.md).
 * The `[tts]` section configures the [text-to-voice](tts.md) services.

+Each of these sections has the following fields.
+
+* A `platform` field that designates the service protocol. A common example is `openai` for OpenAI-compatible API endpoints.
+* A `url` field for the service endpoint URL. It is typically an `https://` or `wss://` URL. The latter is the WebSocket address for streaming services.
+* Optional fields that are specific to the `platform`, such as `api_key`, `model`, and others.
+
 ## Complete Configuration Example

 You will need a free [API key from Groq](https://console.groq.com/keys).
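
The field layout added in this hunk (a `platform`, a `url`, and platform-specific optional fields) can be illustrated with a small check over an already-parsed section. This is a sketch under assumptions: the function name `describe_service` is hypothetical, treating `platform` and `url` as required is inferred from the list above rather than from the server code, and the `api_key` and `model` values are placeholders.

```python
from urllib.parse import urlparse

REQUIRED_FIELDS = ("platform", "url")  # assumption drawn from the list above


def describe_service(section: dict) -> dict:
    """Split one config section into its protocol, endpoint, and optional fields."""
    missing = [name for name in REQUIRED_FIELDS if name not in section]
    if missing:
        raise ValueError(f"missing field(s): {', '.join(missing)}")

    scheme = urlparse(section["url"]).scheme
    return {
        "platform": section["platform"],
        "url": section["url"],
        # wss:// (or its unencrypted variant ws://) is a WebSocket address,
        # which the docs associate with streaming services.
        "streaming": scheme in ("ws", "wss"),
        # Everything else is platform-specific, e.g. api_key or model.
        "options": {k: v for k, v in section.items() if k not in REQUIRED_FIELDS},
    }


# Example sections as they might look after parsing config.toml (values are placeholders).
asr_openai = {
    "platform": "openai",
    "url": "https://api.groq.com/openai/v1/audio/transcriptions",
    "api_key": "gsk_placeholder",
    "model": "whisper-large-v3",
}
asr_paraformer = {
    "platform": "paraformer_v2",
    "url": "wss://dashscope.aliyuncs.com/api-ws/v1/inference",
}
```

With these inputs, `describe_service(asr_openai)["streaming"]` is `False` and `describe_service(asr_paraformer)["streaming"]` is `True`, matching the `https://` versus `wss://` distinction in the text.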
@@ -54,23 +60,25 @@ You will need a free [API key from Groq](https://console.groq.com/keys).
 addr = "0.0.0.0:8080"
 hello_wav = "hello.wav"

-# Speech recognition
+# Speech recognition using the OpenAI transcriptions API, but hosted by Groq (instead of OpenAI)
doc/docs/get-started/echokit-server.md: 2 additions & 0 deletions
@@ -19,6 +19,8 @@ docker run --rm \
 The required `config.toml` file for the local EchoKit server could be the following. You will need
 free [Groq](https://console.groq.com/keys) and [ElevenLabs](https://elevenlabs.io/app/settings/api-keys) API keys.

+> The `platform = "openai"` setting in the configuration refers to OpenAI-compatible service endpoints. Groq provides its inference service through the OpenAI protocol.
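
To make the note about `platform = "openai"` concrete, the sketch below shows that the protocol name stays the same while only the host changes. The helper `transcriptions_url` is hypothetical, and the Groq base URL comes from Groq's public OpenAI-compatibility documentation rather than from this change, so verify it against the provider's docs before relying on it.

```python
# Base URLs for two services that speak the same OpenAI protocol.
OPENAI_BASE = "https://api.openai.com/v1"
GROQ_BASE = "https://api.groq.com/openai/v1"  # assumption: taken from Groq's docs


def transcriptions_url(base_url: str) -> str:
    """Append the OpenAI transcriptions path to an OpenAI-compatible base URL."""
    return base_url.rstrip("/") + "/audio/transcriptions"


# The platform value is identical in both cases; only the url differs.
openai_hosted = {"platform": "openai", "url": transcriptions_url(OPENAI_BASE)}
groq_hosted = {"platform": "openai", "url": transcriptions_url(GROQ_BASE)}
```

In other words, `platform` names the wire protocol and `url` names who serves it, which is why a Groq-hosted endpoint is still configured with `platform = "openai"`.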