Skip to content

Latest commit

Β 

History

History
73 lines (54 loc) Β· 2.03 KB

File metadata and controls

73 lines (54 loc) Β· 2.03 KB

DeepSeek-R1-AI-Voice-Agent

This project enables real-time speech-to-text transcription using AssemblyAI, generates AI responses with DeepSeek R1 (7B model) via Ollama, and converts text responses into speech using ElevenLabs. The entire process happens in real-time, allowing for seamless interaction.

Disclaimer: Using the Assembly ai, you need to add your credit card


πŸš€ Features

  • Real-time speech-to-text using AssemblyAI
  • AI-powered responses with DeepSeek R1 (7B model) via Ollama
  • Instant text-to-speech conversion with ElevenLabs
  • Live audio streaming for an interactive experience

πŸ› οΈ Setup Instructions

Step 1: Sign Up & Install Dependencies

βœ… Get API Keys

βœ… Install Ollama

DeepSeek R1 is accessed via Ollama. Install Ollama from:
πŸ”— Download Ollama

βœ… Install PortAudio (Required for real-time transcription)

  • Debian/Ubuntu:

    apt install portaudio19-dev

    MacOS:

    brew install portaudio

####βœ… Install Python Libraries

Before running the script, install the required dependencies:

pip install "assemblyai[extras]"
pip install ollama
pip install elevenlabs

βœ… (MacOS Only) Install MPV for Audio Streaming

brew install mpv

Step 2: Download the DeepSeek R1 Model

Since this script uses DeepSeek R1 via Ollama, download the model locally by running:

ollama pull deepseek-r1:7b

πŸ› οΈ Setup with the install.sh script

Alternatively you could use our install.sh script to take care of the setup.

chmod +x install.sh
./install.sh

🎯 Running the Script

Once all dependencies are installed and the model is downloaded, simply run:

python AIVoiceAgent.py