FengD/InnoFranceSpeakerDetect

InnoFranceSpeakerDetect

A Kimi-Audio-powered single-speaker profile detector. Provide an audio clip that contains exactly one speaker and get back a JSON profile with two fields, design_text and design_instruct.

Features

  • CLI usage
  • FastAPI service + Web UI
  • MCP Server (stdio / SSE)

Installation

cd InnoFranceSpeakerDetect
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Kimi-Audio dependency (required):

git clone https://github.com/MoonshotAI/Kimi-Audio
pip install -e Kimi-Audio

Configure the model path:

cp env.example .env
# edit .env to set the model path

CLI

python3 -m app.cli /path/to/speaker.wav -o speaker.json
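
To process several clips, the CLI call above can be wrapped from Python. This is a minimal sketch, assuming it runs from the repository root inside the activated virtual environment. `build_cli_command` and `run_detection` are hypothetical helper names, not part of the project, and `sys.executable` stands in for `python3` so the same interpreter is reused.

```python
import subprocess
import sys
from pathlib import Path


def build_cli_command(audio_path, output_path):
    """Build the argument list for the documented CLI invocation."""
    return [
        sys.executable, "-m", "app.cli",
        str(Path(audio_path)),
        "-o", str(Path(output_path)),
    ]


def run_detection(audio_path, output_path):
    """Run the CLI; raises CalledProcessError if it exits non-zero."""
    subprocess.run(build_cli_command(audio_path, output_path), check=True)
    return Path(output_path)
```

Calling `run_detection("speaker.wav", "speaker.json")` would then be equivalent to the shell command shown above.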

Run the service

uvicorn app.main:app --host 0.0.0.0 --port 8012

Open http://localhost:8012 for the Web UI.

The Web UI also accepts an audio URL pointing to a .wav or .mp3 file.
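
Before submitting a URL, it can help to check locally that it looks like one of the two listed formats. The sketch below is only a client-side pre-check under the assumption that the file extension is what matters. How the service itself validates URLs is not documented here, and `looks_like_supported_audio_url` is a hypothetical helper.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions listed in this README for URL input.
SUPPORTED_SUFFIXES = {".wav", ".mp3"}


def looks_like_supported_audio_url(url):
    """Return True for http(s) URLs whose path ends in .wav or .mp3."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    # Only the path is inspected, so query strings do not affect the result.
    suffix = PurePosixPath(parsed.path).suffix.lower()
    return suffix in SUPPORTED_SUFFIXES
```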

MCP Server

stdio mode:

python3 -m app.mcp_server --transport stdio

SSE mode:

python3 -m app.mcp_server --transport sse --host 127.0.0.1 --port 8013

Tools:

  • detect_speaker(audio_path, output_path=None, model_path=None)
  • detect_speaker_from_url(audio_url, output_path=None, model_path=None)
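
For orientation, an MCP client invokes these tools with a JSON-RPC 2.0 `tools/call` request. The sketch below only builds that message for `detect_speaker`. It does not open a connection or perform the `initialize` handshake that a real client (for example, one built on an MCP SDK) must complete first. `build_tool_call` is a hypothetical helper, and the argument values are placeholders.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Return an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


request = build_tool_call(
    1,
    "detect_speaker",
    {"audio_path": "/path/to/speaker.wav", "output_path": "speaker.json"},
)
print(json.dumps(request, indent=2))
```

The same shape applies to `detect_speaker_from_url`, with `audio_url` in place of `audio_path`.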

Output

The output is a JSON array of profile objects, each with a design_text and a design_instruct field. Example:

[
  {
    "design_text": "Host responsible for leading the conversation.",
    "design_instruct": "Female, around 30, medium pace, clear timbre, friendly tone."
  }
]
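
Downstream code can load this output and check that both fields are present. This is a minimal sketch based only on the example above. `parse_profiles` is a hypothetical helper, and treating both keys as mandatory is an assumption.

```python
import json

# Field names taken from the example output above.
REQUIRED_KEYS = ("design_text", "design_instruct")


def parse_profiles(raw_json):
    """Parse detector output and verify each profile has both fields."""
    profiles = json.loads(raw_json)
    if not isinstance(profiles, list):
        raise ValueError("expected a JSON array of profiles")
    for index, profile in enumerate(profiles):
        if not isinstance(profile, dict):
            raise ValueError(f"profile {index} is not a JSON object")
        missing = [key for key in REQUIRED_KEYS if key not in profile]
        if missing:
            raise ValueError(f"profile {index} is missing {missing}")
    return profiles


example = """
[
  {
    "design_text": "Host responsible for leading the conversation.",
    "design_instruct": "Female, around 30, medium pace, clear timbre, friendly tone."
  }
]
"""
profiles = parse_profiles(example)
print(profiles[0]["design_instruct"])
```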
