InnoFranceSpeakerDetect

A Kimi-Audio-powered single-speaker profile detector. Provide an audio clip that contains exactly one speaker and receive a JSON profile with the fields design_text and design_instruct.

Features

CLI usage
FastAPI service + Web UI
MCP Server (stdio / SSE)

Installation

cd InnoFranceSpeakerDetect
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Kimi-Audio dependency (required):

git clone https://github.com/MoonshotAI/Kimi-Audio
pip install -e Kimi-Audio

Configure the model path:

cp env.example .env
# edit .env and set the model path

CLI

python3 -m app.cli /path/to/speaker.wav -o speaker.json
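
To run the CLI over several clips from a script, one option is to build the same command shown above and call it with subprocess. The sketch below is only an illustration: the helper names build_cli_command and detect_to_file are hypothetical, and it assumes it is run from the repository root with the virtual environment active.

```python
import subprocess
import sys
from pathlib import Path


def build_cli_command(audio_path: str, output_path: str) -> list[str]:
    """Build the documented command: python -m app.cli <audio> -o <output>."""
    # sys.executable is the interpreter of the active virtual environment.
    return [sys.executable, "-m", "app.cli", audio_path, "-o", output_path]


def detect_to_file(audio_path: str, output_path: str) -> Path:
    """Run the CLI once and return the path of the JSON file it was asked to write."""
    # check=True raises CalledProcessError if the CLI exits with a non-zero status.
    subprocess.run(build_cli_command(audio_path, output_path), check=True)
    return Path(output_path)
```

Calling detect_to_file("/path/to/speaker.wav", "speaker.json") should behave like the command above.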

Run the service

uvicorn app.main:app --host 0.0.0.0 --port 8012

Open http://localhost:8012 for the Web UI.

The Web UI also supports an audio URL (.wav / .mp3).
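
If you generate audio URLs programmatically, a quick client-side check can catch unsupported links before they are submitted. The sketch below is not the service's own validation: the function name is hypothetical, and restricting the scheme to http/https is an assumption. Only the .wav and .mp3 extensions come from the note above.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions named in this README; anything else is rejected by this sketch.
SUPPORTED_SUFFIXES = {".wav", ".mp3"}


def looks_like_supported_audio_url(url: str) -> bool:
    """Return True if the URL is http(s) and its path ends in .wav or .mp3."""
    parsed = urlparse(url)
    # Assumption: only http and https URLs are worth submitting.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return False
    # Use only the path component so query strings do not affect the check.
    suffix = PurePosixPath(parsed.path).suffix.lower()
    return suffix in SUPPORTED_SUFFIXES
```

The check is case-insensitive and ignores query strings, so a link such as https://example.com/clips/speaker.WAV?token=abc passes.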

MCP Server

stdio mode:

python3 -m app.mcp_server --transport stdio

SSE mode:

python3 -m app.mcp_server --transport sse --host 127.0.0.1 --port 8013

Tools:

detect_speaker(audio_path, output_path=None, model_path=None)
detect_speaker_from_url(audio_url, output_path=None, model_path=None)

Output

The output is a JSON array of profile objects, each with a design_text and a design_instruct field. Example:

[
  {
    "design_text": "Host responsible for leading the conversation.",
    "design_instruct": "Female, around 30, medium pace, clear timbre, friendly tone."
  }
]
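
When consuming the output from another program, it can help to check the shape before using it. The following is a minimal sketch based only on the example above: parse_profiles is a hypothetical helper, and treating both fields as required strings is an assumption drawn from that example, not a documented schema.

```python
import json

# Field names taken from the example output in this README.
REQUIRED_KEYS = ("design_text", "design_instruct")


def parse_profiles(raw: str) -> list[dict[str, str]]:
    """Parse detector output and check it is an array of profile objects."""
    data = json.loads(raw)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of profiles")
    for i, item in enumerate(data):
        if not isinstance(item, dict):
            raise ValueError(f"profile {i} is not a JSON object")
        for key in REQUIRED_KEYS:
            # Assumption: both fields are always present and are strings.
            if not isinstance(item.get(key), str):
                raise ValueError(f"profile {i} is missing string field {key!r}")
    return data


example = """[
  {
    "design_text": "Host responsible for leading the conversation.",
    "design_instruct": "Female, around 30, medium pace, clear timbre, friendly tone."
  }
]"""
profiles = parse_profiles(example)
print(profiles[0]["design_instruct"])
```

To validate a file written by the CLI, read it first, for example parse_profiles(open("speaker.json", encoding="utf-8").read()).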
