himaenshuu/Multimodal-Audio-Video-RAG-System

🎯 Audio-Video RAG with Gemini & Qdrant

This project implements a Retrieval-Augmented Generation (RAG) system that accepts audio or video input, transcribes it, and performs question-answering using Google Gemini and Qdrant.

🚀 Features

🎯 Core Capabilities

  • RAG (Retrieval-Augmented Generation) pipeline using LangChain and Gemini
  • Support for both audio and video input
  • Semantic search backed by Qdrant vector database
  • Dual response generation:
    • 🔍 Search-based context
    • 🤖 LLM-powered Gemini response
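The dual-response idea above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: `build_prompt`, `dual_response`, and the `retrieve` / `generate` callables are hypothetical names, standing in for the Qdrant search and the Gemini call respectively.

```python
from typing import Callable, Dict, List


def build_prompt(question: str, chunks: List[str]) -> str:
    """Combine retrieved transcript chunks and the question into one prompt."""
    context = "\n\n".join(f"[{i + 1}] {chunk}" for i, chunk in enumerate(chunks))
    return (
        "Answer the question using only the transcript excerpts below.\n\n"
        f"Transcript excerpts:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def dual_response(
    question: str,
    retrieve: Callable[[str], List[str]],  # assumed: wraps the vector search
    generate: Callable[[str], str],        # assumed: wraps the LLM call
) -> Dict[str, object]:
    """Return both the raw search context and the LLM-generated answer."""
    chunks = retrieve(question)
    answer = generate(build_prompt(question, chunks))
    return {"search_context": chunks, "llm_answer": answer}
```

Passing the retriever and generator as plain callables keeps the sketch independent of any particular LangChain or Gemini version; in practice they would be thin wrappers around the real clients.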

🗣️ Audio Processing

  • Transcription of .mp3 or .wav files using Faster-Whisper
  • High-quality offline transcription with fast performance
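A minimal sketch of local transcription with Faster-Whisper, assuming the `faster-whisper` package is installed. The model size (`"base"`), CPU device, and `int8` compute type are illustrative choices, and `join_segments` / `transcribe` are hypothetical helper names rather than functions from this repository.

```python
from typing import Iterable


def join_segments(segments: Iterable) -> str:
    """Concatenate the text of transcription segments into one string."""
    texts = (seg.text.strip() for seg in segments)
    return " ".join(t for t in texts if t)


def transcribe(path: str, model_size: str = "base") -> str:
    """Transcribe an .mp3 or .wav file locally and return the full text."""
    # Imported lazily so join_segments works without the package installed.
    from faster_whisper import WhisperModel

    # Assumed settings: CPU inference with int8 quantization.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    # transcribe() returns a lazy generator of segments plus metadata.
    segments, _info = model.transcribe(path)
    return join_segments(segments)
```

Because the segments are produced lazily, the actual decoding happens while `join_segments` iterates over them.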

🎥 Video Handling

  • Automatic audio extraction from videos using FFmpeg
  • Full transcription of video content for downstream processing
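The audio-extraction step might look like the sketch below. It assumes `ffmpeg` is on the PATH; the mono, 16 kHz WAV output is a choice that suits Whisper-style models, not a setting confirmed by this repository, and both function names are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List


def build_ffmpeg_command(video_path: str, audio_path: str) -> List[str]:
    """Build an ffmpeg command that extracts the audio track of a video."""
    return [
        "ffmpeg",
        "-y",            # overwrite the output file if it exists
        "-i", video_path,
        "-vn",           # drop the video stream
        "-ac", "1",      # downmix to mono (assumed setting)
        "-ar", "16000",  # resample to 16 kHz (assumed setting)
        audio_path,
    ]


def extract_audio(video_path: str) -> str:
    """Write a .wav file next to the video and return its path."""
    audio_path = str(Path(video_path).with_suffix(".wav"))
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(build_ffmpeg_command(video_path, audio_path), check=True)
    return audio_path
```

The resulting `.wav` file can then go through the same transcription path as a directly uploaded audio file.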

🧠 Embedding and Search

  • Semantic embedding using Google Generative AI (Gemini embeddings)
  • Vector storage and similarity search using Qdrant
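To illustrate what similarity search does conceptually, here is a small pure-Python sketch that ranks stored vectors by cosine similarity to a query vector. It is a stand-in for explanation only: it is not Qdrant's API, and in the real pipeline the vectors would come from Gemini embeddings rather than hand-written lists.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    items: List[Tuple[str, Sequence[float]]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the k (text, score) pairs most similar to the query vector."""
    scored = [(text, cosine_similarity(query, vec)) for text, vec in items]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

A vector database performs the same ranking with approximate indexes, so it stays fast as the number of transcript chunks grows.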

📦 Lightweight & Offline-Friendly

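  • Transcription (Faster-Whisper) and audio extraction (FFmpeg) run locally, with no OpenAI dependency
  • Gemini embeddings and responses still require a Gemini API key and network access
  • Suited to constrained or privacy-sensitive environments where raw audio and video should stay on the local machine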

📚 Interactive Development

  • Built in Jupyter Notebook for transparency, experimentation, and research workflows

🧰 Tech Stack Used

Technology | Purpose
Python | Backend logic and orchestration
LangChain | RAG architecture and LLM interface
Gemini (Google Generative AI) | LLM and embeddings
Qdrant | Vector database for semantic search
Faster-Whisper | Fast local speech-to-text model
FFmpeg | Audio extraction from video files
Jupyter Notebook | Interactive development environment

🔧 Requirements

  • Python 3.10
  • FFmpeg (added to PATH)
  • Gemini API Key
  • Install dependencies:
      pip install -r requirements.txt
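Before opening the notebook, the requirements above can be checked with a short script such as this sketch. The environment variable name `GOOGLE_API_KEY` is an assumption (it is the name commonly read by Google's LangChain integration); adjust it to wherever the key is actually stored. `check_environment` is a hypothetical helper.

```python
import os
import shutil
import sys
from typing import Dict


def check_environment(api_key_var: str = "GOOGLE_API_KEY") -> Dict[str, bool]:
    """Report whether each listed requirement appears to be satisfied."""
    return {
        # The requirements list names Python 3.10.
        "python_3_10": sys.version_info[:2] == (3, 10),
        # shutil.which returns None when ffmpeg is not on the PATH.
        "ffmpeg_on_path": shutil.which("ffmpeg") is not None,
        # Assumed variable name; an empty value counts as missing.
        "api_key_set": bool(os.environ.get(api_key_var)),
    }


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```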

Appendices

🎬 FFmpeg Installation Guide


✅ Windows Installation

  1. Download FFmpeg:

  2. Extract the Archive:

    • Use 7-Zip or WinRAR to extract the file
    • Extract the contents to C:\ffmpeg
  3. Add to System PATH:

    • Open:
      • Control Panel → System → Advanced system settings → Environment Variables
    • Under System variables, find Path → Click Edit
    • Click New and enter: C:\ffmpeg\bin
    • Click OK to save
  4. Verify Installation:

    • Open Command Prompt and run:
      ffmpeg -version
      
