MCP Voice Agent Demo

A demonstration of a realtime voice agent with MCP (Model Context Protocol) support, combining OpenAI's Realtime API with dynamic MCP tool integration. Entirely AI-coded, with human guidance.

Video (turn on sound)

mcp-realtime-poc.demo.mp4

🎯 Project Goal

This project demonstrates how to build a voice-controlled AI assistant that can dynamically discover and use tools from multiple MCP servers. The agent uses natural speech input to interact with various tools through OpenAI's Realtime API, showcasing the power of MCP for extending AI capabilities. Multiple servers can be configured via a standard mcp.json file, similar to Claude Desktop.

Note: currently only the stdio transport is supported.

🏗️ Architecture

The project consists of two main components:

MCP Voice Agent (Python)

Location: mcp-voice-agent/ directory
Technology: Python 3.11+ with OpenAI Agents SDK
Features:
- Realtime voice interaction via OpenAI's Realtime API
- Dynamic MCP tool discovery and integration from multiple servers
- Automatic function generation from MCP schemas
- Audio input/output handling
- Console character set verification for emoji support
- Multi-server configuration via mcp.json
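
As a rough illustration of the "automatic function generation from MCP schemas" feature, the sketch below builds a Python callable from a tool's JSON input schema. It is not the code in dynamic_tools.py: `make_tool_function`, the `call_tool` callback, and the synchronous call style are all assumptions made for this example (a real agent would most likely forward the call asynchronously to the MCP client).

```python
import inspect
from typing import Any, Callable

# Hypothetical mapping from JSON Schema types to Python annotations.
_JSON_TO_PY = {"number": float, "integer": int, "string": str, "boolean": bool}


def make_tool_function(
    name: str,
    description: str,
    input_schema: dict,
    call_tool: Callable[[str, dict], Any],
) -> Callable[..., Any]:
    """Create a keyword-only function whose signature mirrors the schema."""
    required = set(input_schema.get("required", []))
    params = [
        inspect.Parameter(
            prop_name,
            inspect.Parameter.KEYWORD_ONLY,
            # Required properties get no default; optional ones default to None.
            default=inspect.Parameter.empty if prop_name in required else None,
            annotation=_JSON_TO_PY.get(spec.get("type"), Any),
        )
        for prop_name, spec in input_schema.get("properties", {}).items()
    ]
    signature = inspect.Signature(params)

    def tool_function(**kwargs: Any) -> Any:
        # bind() raises TypeError for missing or unknown arguments.
        bound = signature.bind(**kwargs)
        return call_tool(name, dict(bound.arguments))

    tool_function.__name__ = name
    tool_function.__doc__ = description
    tool_function.__signature__ = signature
    return tool_function


# Usage with a stand-in for the MCP client call.
add_schema = {
    "type": "object",
    "properties": {"a": {"type": "number"}, "b": {"type": "number"}},
    "required": ["a", "b"],
}
add = make_tool_function(
    "calculator_add",
    "Add two numbers.",
    add_schema,
    lambda tool, arguments: arguments["a"] + arguments["b"],
)
```

Attaching a real signature and docstring matters because agent frameworks typically introspect the function to describe it to the model.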

CalculatorMcp Server (C#)

Location: CalculatorMcp/ directory
Technology: .NET 8 with MCP SDK
Features:
- 13 mathematical and utility tools
- MCP protocol implementation
- Stdio-based communication
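
To make "stdio-based communication" concrete: the MCP stdio transport exchanges JSON-RPC 2.0 messages as newline-delimited lines on the server's stdin and stdout. The sketch below shows only the framing of a `tools/list` request; the helper names are hypothetical, and in this project the official MCP SDK client handles these details.

```python
import json


def encode_message(message: dict) -> bytes:
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    # json.dumps without an indent never emits raw newlines, so the trailing
    # newline is the only one and can safely act as the message delimiter.
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


def decode_message(line: bytes) -> dict:
    """Parse one line read from the peer back into a message dict."""
    return json.loads(line.decode("utf-8"))


# Request asking a server which tools it offers.
list_tools_request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}
wire = encode_message(list_tools_request)
```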

🛠️ Available Tools

The CalculatorMcp server's tools include:

Math: add(a, b), multiply(a, b), circle_area(radius)
Numbers: random_between(min, max), is_even(number)
Strings: reverse_string(text), count_letter(text, letter), string_contains(text, substring)
Utilities: convert_temperature(temp, fromUnit, toUnit), delay(seconds), format_date(), days_until(date)
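
For readers who do not want to open the C# source, here is an illustrative Python equivalent of a few of the listed tools. The real implementations live in CalculatorTools.cs and may differ. In particular, the accepted unit names for `convert_temperature` ("celsius", "fahrenheit", "kelvin", case-insensitive) are an assumption of this sketch.

```python
import math


def add(a: float, b: float) -> float:
    return a + b


def circle_area(radius: float) -> float:
    return math.pi * radius ** 2


def is_even(number: int) -> bool:
    return number % 2 == 0


def reverse_string(text: str) -> str:
    return text[::-1]


def convert_temperature(temp: float, from_unit: str, to_unit: str) -> float:
    """Convert between units by going through Celsius (unit names assumed)."""
    to_celsius = {
        "celsius": lambda t: t,
        "fahrenheit": lambda t: (t - 32) * 5 / 9,
        "kelvin": lambda t: t - 273.15,
    }
    from_celsius = {
        "celsius": lambda c: c,
        "fahrenheit": lambda c: c * 9 / 5 + 32,
        "kelvin": lambda c: c + 273.15,
    }
    celsius = to_celsius[from_unit.lower()](temp)
    return from_celsius[to_unit.lower()](celsius)
```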

🔧 MCP Server Configuration

Configure multiple MCP servers using an mcp.json file in the project root. The format follows the convention used by Claude Desktop:

{
  "mcpServers": {
    "calculator": {
      "transport": "stdio",
      "command": "dotnet",
      "args": ["run", "--no-build", "--project", "CalculatorMcp/CalculatorMcp.csproj", "-v", "q"],
      "env": {}
    },
    "my-custom-server": {
      "transport": "stdio",
      "command": "python",
      "args": ["my_server.py"],
      "env": {
        "API_KEY": "your-key-here"
      }
    }
  }
}

Supported Transports:

stdio - Local process communication (currently supported)
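
A minimal sketch of how a configuration in this format could be parsed and checked for unsupported transports. `ServerConfig` and `parse_mcp_config` are hypothetical names for this example, not the project's MCPServerConfig from settings.py, and defaulting a missing "transport" to "stdio" is an assumption.

```python
import json
from dataclasses import dataclass, field

SUPPORTED_TRANSPORTS = {"stdio"}


@dataclass
class ServerConfig:
    name: str
    command: str
    args: list[str] = field(default_factory=list)
    env: dict[str, str] = field(default_factory=dict)
    transport: str = "stdio"


def parse_mcp_config(text: str) -> list[ServerConfig]:
    """Turn the contents of mcp.json into a list of server configurations."""
    raw = json.loads(text)
    servers = []
    for name, entry in raw.get("mcpServers", {}).items():
        # Assumption: a missing "transport" key means stdio.
        transport = entry.get("transport", "stdio")
        if transport not in SUPPORTED_TRANSPORTS:
            raise ValueError(f"{name}: unsupported transport {transport!r}")
        servers.append(
            ServerConfig(
                name=name,
                command=entry["command"],
                args=list(entry.get("args", [])),
                env=dict(entry.get("env", {})),
                transport=transport,
            )
        )
    return servers


example = """
{
  "mcpServers": {
    "calculator": {
      "transport": "stdio",
      "command": "dotnet",
      "args": ["run", "--no-build", "--project", "CalculatorMcp/CalculatorMcp.csproj", "-v", "q"],
      "env": {}
    }
  }
}
"""
servers = parse_mcp_config(example)
```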

Notes:

- Server names (keys) must be unique and contain only alphanumeric characters, hyphens, and underscores
- The voice agent automatically aggregates tools from all configured servers
- Tools are prefixed with their server name to avoid conflicts (e.g., calculator_add, my_server_custom_tool)
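
The naming rules above could be enforced along these lines. This is a sketch, not the project's code: both function names are hypothetical, and replacing hyphens with underscores in the prefix is an assumption inferred from the my_server_custom_tool example.

```python
import re

# Alphanumeric characters, hyphens, and underscores only.
_SERVER_NAME = re.compile(r"[A-Za-z0-9_-]+")


def validate_server_name(name: str) -> str:
    """Return the name unchanged, or raise if it contains other characters."""
    if not _SERVER_NAME.fullmatch(name):
        raise ValueError(f"invalid server name: {name!r}")
    return name


def prefixed_tool_name(server: str, tool: str) -> str:
    """Build a conflict-free tool name from a server name and a tool name."""
    # Assumption: hyphens become underscores so the result can be used as a
    # Python identifier.
    prefix = validate_server_name(server).replace("-", "_")
    return f"{prefix}_{tool}"
```

With this scheme, two servers that both expose a tool called add end up as distinct functions, for example calculator_add and another_server_add.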

📋 Prerequisites

- .NET 8 SDK (for the C# MCP server)
- Python 3.11+
- OpenAI API key in the OPENAI_API_KEY environment variable (PowerShell):

# Set permanently (takes effect in newly opened terminals, not the current one)
setx OPENAI_API_KEY "your-api-key-here"

# Or set for the current session only
$env:OPENAI_API_KEY="your-api-key-here"

- Microphone and speakers (default Windows audio devices)

🚀 Quick Start

1. Setup Environment

# Clone or extract the project
cd mcp-realtime-poc

# Create Python virtual environment
python -m venv .venv
.\.venv\Scripts\Activate.ps1

# Install Python dependencies
pip install -r .\mcp-voice-agent\requirements.txt

2. Configure MCP Servers

Create mcp.json in the project root (see configuration section above). A basic configuration for the included CalculatorMcp server:

{
  "mcpServers": {
    "calculator": {
      "transport": "stdio",
      "command": "dotnet",
      "args": ["run", "--no-build", "--project", "CalculatorMcp/CalculatorMcp.csproj", "-v", "q"],
      "env": {}
    }
  }
}

3. Build MCP Server(s)

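# Build the C# MCP server. Use the default Debug configuration: the
# "dotnet run --no-build" command in mcp.json also defaults to Debug and
# will not find a Release-only build.
dotnet build .\CalculatorMcp\CalculatorMcp.csproj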

4. Run the Voice Agent

# Single command - launches all configured MCP servers automatically
python .\mcp-voice-agent\main.py
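
Conceptually, launching a configured stdio server means starting its command as a child process with pipes attached and the configured "env" entries merged into the environment. The sketch below shows that idea with the standard library only; `launch_stdio_server` is a hypothetical helper, and the project itself relies on the official MCP SDK client rather than raw subprocess calls.

```python
import os
import subprocess
import sys


def launch_stdio_server(
    command: str, args: list[str], env: dict[str, str]
) -> subprocess.Popen:
    """Start one configured server with pipes on its stdin and stdout."""
    return subprocess.Popen(
        [command, *args],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        # Configured variables override inherited ones of the same name.
        env={**os.environ, **env},
    )


if __name__ == "__main__":
    # Stand-in for a real server: a child Python process that prints API_KEY.
    child = launch_stdio_server(
        sys.executable,
        ["-c", "import os; print(os.environ['API_KEY'])"],
        {"API_KEY": "your-key-here"},
    )
    output, _ = child.communicate(timeout=30)
    print(output.decode().strip())
```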

🎤 Usage Examples

Once running, speak naturally to the agent:

"Add 7 and 13" → Performs addition using calculator server
"Give me a random number between 10 and 20" → Generates random number
"Convert 50 Celsius to Fahrenheit" → Temperature conversion
"Reverse the word hello" → String manipulation
"What's the date in a nice format?" → Date formatting
"Wait for 2 seconds" → Delay execution

With multiple servers configured, you can access tools from any server:

"Use the calculator to multiply 5 and 8" → Explicitly calls calculator server
"Run my custom analysis on this data" → Calls tool from custom server

Press Ctrl+C to exit gracefully.

📁 Project Structure

mcp-realtime-poc/
├── mcp.json                 # MCP server configuration
├── README.md               # This file
├── CalculatorMcp/          # C# MCP server
│   ├── CalculatorMcp.csproj
│   ├── CalculatorTools.cs
│   └── Program.cs
├── mcp-voice-agent/        # Main Python application
│   ├── main.py            # Entry point
│   ├── requirements.txt   # Python dependencies
│   ├── mcp_voice_agent/       # MCP integration module
│   │   ├── mcp_client_sdk.py # Official MCP SDK client + MultiMCPClient
│   │   ├── dynamic_tools.py # Dynamic function generation
│   │   ├── audio.py       # Audio handling
│   │   └── settings.py    # Configuration + MCPServerConfig
│   └── tests/             # Unit tests
└── artifacts/              # Development files (ignored)

⚙️ Configuration

The voice agent automatically detects your console's character encoding and provides emoji feedback. If emojis don't display correctly, ensure your terminal is set to UTF-8:

chcp 65001  # Set console to UTF-8

🔧 Development

Running Tests

cd mcp-voice-agent
python -m pytest tests/

Adding New MCP Servers

1. Create or obtain a server that implements MCP.

2. Add the server configuration to mcp.json:

{
  "mcpServers": {
    "my-server": {
      "transport": "stdio",
      "command": "your-command",
      "args": ["arg1", "arg2"],
      "env": {"KEY": "value"}
    }
  }
}

3. Restart the voice agent. It will automatically discover and integrate tools from the new server.
4. Test the integration by asking the agent to use tools from your new server.

📚 Technical Details

- Audio: PCM16 (16-bit PCM) at 24 kHz, the sample rate the OpenAI Realtime API uses for its pcm16 format
- Communication: Stdio-based MCP transport between Python and configured servers
- Tool Generation: Dynamic Python function creation from MCP schemas with server prefixing
- Multi-Server Support: Tools aggregated from all configured servers; name conflicts are avoided by prefixing each tool with its server name
- Error Handling: Logging with emoji indicators
- Platform: Windows (PowerShell), with cross-platform potential
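
To give the audio format some numbers: at 24 kHz with 2 bytes per sample, one second of mono audio is 48,000 bytes. The sketch below computes chunk sizes and converts float samples to PCM16 bytes using only the standard library. The mono, little-endian layout is an assumption of this example, and the helper names are hypothetical (audio.py may handle this differently).

```python
import array
import sys

SAMPLE_RATE = 24_000   # samples per second
BYTES_PER_SAMPLE = 2   # 16-bit PCM, mono assumed


def chunk_size_bytes(duration_ms: int) -> int:
    """Number of bytes needed to hold duration_ms of mono PCM16 audio."""
    return SAMPLE_RATE * duration_ms // 1000 * BYTES_PER_SAMPLE


def float_to_pcm16(samples: list[float]) -> bytes:
    """Convert samples in [-1.0, 1.0] to little-endian 16-bit PCM bytes."""
    pcm = array.array(
        "h",
        # Clamp first so out-of-range input cannot overflow the 16-bit range.
        (int(max(-1.0, min(1.0, s)) * 32767) for s in samples),
    )
    if sys.byteorder == "big":
        pcm.byteswap()
    return pcm.tobytes()
```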

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.
