Deepan-mn/voice-agent

voice-agent

🎙️ Voice Agent – RAG + Voice Chatbot

This application is a voice-enabled AI assistant that lets you talk to your documents.
Upload PDF or text files and speak your question: the bot transcribes your voice, retrieves the most relevant passages from your knowledge base using RAG (Retrieval-Augmented Generation), and replies in a natural, human-like voice.


✨ Features

  • 🎤 Voice Input: Speak instead of typing – powered by OpenAI Whisper.
  • 📚 Document Knowledge Base: Upload PDFs or TXT files to build a searchable knowledge base.
  • 🔍 Retrieval-Augmented Generation: Finds the most relevant info from your documents before answering.
  • 🗣️ Text-to-Speech: Natural-sounding audio replies using Kokoro TTS.
  • ⚡ Real-time Interaction: Smooth and quick responses in a friendly chat interface.

🛠️ Tech Stack

  • Frontend/UI: Streamlit
  • Speech-to-Text (ASR): Whisper
  • Text-to-Speech (TTS): Kokoro
  • Document Processing & RAG: LangChain + Vector Stores
  • Backend Language: Python 3.10
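
As an illustration of the Speech-to-Text entry, here is a minimal sketch of transcribing a recorded question with the openai-whisper package. The function name `transcribe_audio`, the injectable `model` parameter, and the "base" model size are assumptions made for this example; they are not taken from this repository's code.

```python
from typing import Any, Optional


def transcribe_audio(audio_path: str, model: Optional[Any] = None) -> str:
    """Return the transcript of an audio file as plain text.

    `model` is any object with a Whisper-style `transcribe(path)` method
    that returns a dict containing a "text" key. When it is omitted, the
    "base" Whisper model is loaded (an assumed default, not the repo's).
    """
    if model is None:
        # Imported lazily so the helper can be exercised without Whisper.
        import whisper  # from the openai-whisper package; needs FFmpeg

        model = whisper.load_model("base")
    result = model.transcribe(audio_path)
    return result["text"].strip()
```

Passing the model in explicitly lets an app load it once and reuse it across queries instead of reloading it for every recording.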

🚀 How It Works

  1. Upload Documents – PDF or TXT files via the sidebar.
  2. Process Knowledge Base – Files are chunked, embedded, and stored in a vector database.
  3. Ask via Voice – Speak your query into the mic.
  4. RAG Retrieval – Finds and ranks relevant chunks from your uploaded content.
  5. Answer Generation – Summarizes and formats the best answer.
  6. Voice Response – Converts the answer into natural speech and plays it.
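
Steps 2 and 4 above (chunking, embedding, and ranked retrieval) can be sketched with the standard library alone. This is a simplified stand-in, not the project's implementation: it swaps in word-count vectors and cosine similarity for the LangChain embeddings and vector store, and the names `chunk_text`, `embed`, `cosine`, and `retrieve` are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into character windows that overlap by `overlap` chars."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    chunks, start, step = [], 0, size - overlap
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reaches the end of the text
        start += step
    return chunks


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], k: int = 3) -> list[str]:
    """Return the k chunks most similar to the query, best first."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:k]
```

In a real pipeline the retrieved chunks would be passed to the answer-generation step; a learned embedding model would also match paraphrases, which word counts cannot.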

📦 Installation

git clone https://github.com/Deepan-mn/voice-agent.git
cd voice-agent
pip install -r requirements.txt

▶️ Run the App

streamlit run main.py

📌 Notes

  1. Install FFmpeg and make sure it is on your PATH; Whisper uses it to decode audio.
  2. A single session supports multiple uploaded files and multiple queries.
  3. Use clear audio with little background noise for the best transcription accuracy.
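
To catch a missing FFmpeg before the first transcription fails, a startup check along these lines may help. It is a small sketch using only the standard library; the helper name `ffmpeg_available` is hypothetical and does not come from this repository.

```python
import shutil


def ffmpeg_available(binary: str = "ffmpeg") -> bool:
    """Return True if the given executable can be found on PATH."""
    return shutil.which(binary) is not None


if __name__ == "__main__":
    if not ffmpeg_available():
        print("FFmpeg not found on PATH; Whisper will not be able to read audio.")
```

The app could call this once at startup and show a warning in the UI instead of printing.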
