Commit db608ce

fix: enhance README structure and clarity
1 parent d4af8bc commit db608ce

1 file changed

Lines changed: 19 additions & 7 deletions

README.md
@@ -1,13 +1,16 @@
 # mcp-listen
 
-Give your AI agents the ability to listen.
+## The first MCP server that can hear
 
-Microphone capture and speech-to-text tools for MCP-compatible agents. Powered by [decibri](https://decibri.dev).
+[![npm version](https://img.shields.io/npm/v/mcp-listen)](https://www.npmjs.com/package/mcp-listen)
+[![license](https://img.shields.io/badge/license-Apache--2.0-blue)](LICENSE)
+
+Give your AI agents the ability to listen. Microphone capture and speech-to-text tools for MCP-compatible agents. Powered by [decibri](https://decibri.dev).
 
 ## Tools
 
 | Tool | Description |
-|------|-------------|
+| ------ | ------------- |
 | `list_audio_devices` | List available microphone input devices |
 | `capture_audio` | Record audio from the microphone and save as WAV |
 | `voice_query` | Capture, transcribe (whisper.cpp), and query a local LLM (Ollama) |
@@ -17,7 +20,7 @@ Microphone capture and speech-to-text tools for MCP-compatible agents. Powered b
 ### Claude Code
 
 ```bash
-claude mcp add mcp-listen -- npx mcp-listen
+claude mcp add mcp-listen npx mcp-listen
 ```
 
 ### Claude Desktop / ChatGPT Desktop / Cursor / Windsurf / VS Code
@@ -35,7 +38,7 @@ Add to your MCP configuration:
 }
 ```
 
-Works with any MCP-compatible client: Claude, ChatGPT, Cursor, GitHub Copilot, Windsurf, VS Code, Gemini, Zed, and more.
+Compatible with Claude Desktop, ChatGPT Desktop, Cursor, GitHub Copilot, Windsurf, VS Code, Gemini, Zed, and any MCP-compatible client.
 
 ### Global Install
 
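For context on the hunk above: the configuration block itself (old lines 28-37) falls outside the diff context, but "Add to your MCP configuration" conventionally refers to the standard `mcpServers` shape. A minimal sketch under that assumption; the server key name is illustrative and the config file location varies by client:

```json
{
  "mcpServers": {
    "mcp-listen": {
      "command": "npx",
      "args": ["mcp-listen"]
    }
  }
}
```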
@@ -79,7 +82,7 @@ Records audio from the microphone and saves as a WAV file.
 **Parameters:**
 
 | Parameter | Type | Default | Description |
-|-----------|------|---------|-------------|
+| ---------- | ------ | --------- | ------------- |
 | `duration_ms` | number | 5000 | Recording duration in milliseconds (100-30000) |
 | `device` | number | system default | Device index from `list_audio_devices` |
 
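The parameter table above maps directly onto an MCP `tools/call` request. A sketch of the JSON-RPC payload a client would send to invoke `capture_audio` — the `id` and the argument values are illustrative, not taken from the README:

```python
import json

# Hypothetical MCP tools/call payload for capture_audio, built from the
# parameters documented in the table above. Values are illustrative.
request = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {
        "name": "capture_audio",
        "arguments": {
            "duration_ms": 3000,  # must fall in the documented 100-30000 range
            "device": 0,          # an index reported by list_audio_devices
        },
    },
}
payload = json.dumps(request)
print(payload)
```

An MCP client frames this over the stdio transport after the usual `initialize` handshake; the server replies with the path of the saved WAV file.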
@@ -102,7 +105,7 @@ Full voice pipeline: capture audio, transcribe with whisper.cpp, send to Ollama,
 **Parameters:**
 
 | Parameter | Type | Default | Description |
-|-----------|------|---------|-------------|
+| ----------- | ------ | --------- | ------------- |
 | `duration_ms` | number | 5000 | Recording duration in milliseconds (100-30000) |
 | `device` | number | system default | Device index from `list_audio_devices` |
 | `whisper_model` | string | ggml-base.en.bin | Path or filename of Whisper GGML model |
@@ -132,11 +135,20 @@ The `voice_query` tool replicates the pipeline from [voxagent](https://voxagent.
 
 The `voice_query` tool requires a Whisper GGML model file. Download one:
 
+**Linux / macOS:**
+
 ```bash
 mkdir -p ~/.mcp-listen/models
 curl -L -o ~/.mcp-listen/models/ggml-base.en.bin https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin
 ```
 
+**Windows (PowerShell):**
+
+```powershell
+mkdir "$env:USERPROFILE\.mcp-listen\models" -Force
+Invoke-WebRequest -Uri "https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin" -OutFile "$env:USERPROFILE\.mcp-listen\models\ggml-base.en.bin"
+```
+
 The model is ~150MB and downloads once. You can also set the `WHISPER_MODEL_PATH` environment variable to a custom directory.
 
 ## Ollama Setup
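The context line above notes that `WHISPER_MODEL_PATH` can redirect the model lookup to a custom directory. A sketch of how that resolution might work — the fallback to `~/.mcp-listen/models` matches the download commands in the hunk, but the exact lookup logic is an assumption, not the package's verified implementation:

```python
import os
from pathlib import Path

def resolve_whisper_model(model: str = "ggml-base.en.bin") -> Path:
    """Resolve a Whisper GGML model file (assumed logic): prefer the
    WHISPER_MODEL_PATH directory if set, else ~/.mcp-listen/models."""
    custom = os.environ.get("WHISPER_MODEL_PATH")
    models_dir = Path(custom) if custom else Path.home() / ".mcp-listen" / "models"
    return models_dir / model

print(resolve_whisper_model())
```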
