Real-time voice-enabled AI assistant with emotional intelligence, RAG, and WebSocket-based communication.


Samantha AI Conversational Assistant

Samantha AI is a voice-enabled conversational assistant designed to assist users with a variety of tasks, from scheduling and email management to emotional support and entertainment. Built with a focus on natural language understanding and empathetic responses, Samantha leverages advanced AI models and APIs to provide a seamless user experience.

Features

Samantha AI offers a wide range of features to cater to different user needs:

  • Voice and Text Interaction:

    • Accepts voice input via microphone (using speech recognition) and responds with synthesized speech.
    • Supports text-based input through a web frontend for flexible interaction.
  • Natural Language Understanding:

    • Powered by the Mistral-7B model for generating context-aware responses.
    • Maintains conversation history for coherent and personalized interactions.
  • Emotion Analysis:

    • Uses the SamLowe/roberta-base-go_emotions model to detect user emotions (e.g., happiness, sadness, anger).
    • Adjusts response tone based on detected emotions (e.g., empathetic for sadness, cheerful for happiness).
  • Scheduling and Calendar Management:

    • Integrates with Google Calendar API to schedule, modify, and delete events.
    • Provides daily event summaries (e.g., "What’s on my schedule today?").
  • Email Management:

    • Integrates with Gmail API to send, read, and search emails.
    • Example: "Send an email to John about the meeting" or "Read my recent emails."
  • Entertainment and Information:

    • Fetches random jokes and quotes from external APIs to entertain users.
    • Provides time and date information (e.g., "What’s the time?").
  • App Launching:

    • Launches applications on the user’s system (e.g., Notepad, Calculator, VS Code).
    • Example: "Open Notepad" or "Launch VS Code."
  • Recipe Guidance:

    • Offers step-by-step recipe instructions (currently hardcoded, with potential for API integration).
    • Example: "How do I make a sandwich?"
  • Memory and Context Awareness:

    • Stores conversation history in knowledge.json for context-aware responses.
    • Tracks user preferences (e.g., favorite games, travel destinations) to personalize interactions.
  • Web Frontend:

    • Provides a user-friendly React-based interface to interact with Samantha via text or voice.
    • Plays audio responses generated by the backend using ElevenLabs.
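The emotion-aware responses described above boil down to mapping a detected emotion label to a response tone. The sketch below is illustrative rather than the repo's actual code; the label names follow the go_emotions taxonomy used by the SamLowe/roberta-base-go_emotions classifier, and the tone strings are assumptions.

```python
# Illustrative sketch: pick a response tone from emotion-classifier
# scores. Labels follow the go_emotions taxonomy; the tone mapping
# itself is a made-up example, not the repo's actual code.
EMOTION_TO_TONE = {
    "joy": "cheerful",
    "sadness": "empathetic",
    "anger": "calm",
    "fear": "reassuring",
    "neutral": "friendly",
}

def pick_tone(scores: dict) -> str:
    """Choose a tone for the highest-scoring label, defaulting to 'friendly'."""
    top_label = max(scores, key=scores.get)
    return EMOTION_TO_TONE.get(top_label, "friendly")

# Example classifier output (scores invented for illustration):
print(pick_tone({"sadness": 0.81, "neutral": 0.12, "joy": 0.07}))  # empathetic
```

In the real pipeline the `scores` dict would come from a Hugging Face `text-classification` pipeline run over the user's utterance, and the chosen tone would steer both the generated text and the ElevenLabs voice settings.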

Technologies Used

Samantha AI pairs a Python/Flask backend with a React frontend, using the following technologies.

Backend

  • Python: Core programming language for the backend (sam11.py).
  • Flask: Lightweight web framework to create a REST API for frontend-backend communication.
  • Mistral-7B: A 7-billion-parameter language model from Hugging Face for natural language generation.
  • Transformers (Hugging Face): Library to load and use the Mistral-7B and emotion analysis models.
  • PyTorch: Deep learning framework for model inference, with 4-bit quantization to optimize performance.
  • SpeechRecognition: Python library for capturing voice input via microphone.
  • ElevenLabs API: Text-to-speech service for generating high-quality audio responses with mood variations.
  • Google Calendar API: For scheduling and managing calendar events.
  • Google Gmail API: For email-related tasks (send, read, search).
  • RoBERTa (SamLowe/roberta-base-go_emotions): Pre-trained model for emotion analysis.
  • Requests: HTTP library for fetching external data (e.g., jokes, quotes).
  • Pygame: For audio playback of ElevenLabs-generated responses.
  • python-dateutil: For parsing and manipulating dates in scheduling tasks.
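As a small example of where python-dateutil fits in, the calendar feature has to turn phrases like "3 PM tomorrow" into concrete datetimes. This sketch is one plausible approach, not the repo's exact parsing logic: dateutil handles the clock time fuzzily, while the relative day word is handled by hand.

```python
# Sketch: resolve "3 PM tomorrow" to a datetime for calendar scheduling.
# Assumes python-dateutil (listed above); illustrative, not the repo's code.
from datetime import datetime, timedelta
from dateutil import parser

def resolve_time(phrase: str, now: datetime) -> datetime:
    """Parse a clock time fuzzily; handle a trailing 'tomorrow' ourselves,
    since dateutil does not understand relative day words."""
    day_offset = timedelta(days=1) if "tomorrow" in phrase.lower() else timedelta()
    cleaned = phrase.lower().replace("tomorrow", "").strip()
    return parser.parse(cleaned, fuzzy=True, default=now) + day_offset

now = datetime(2024, 5, 1, 9, 0)
print(resolve_time("3 PM tomorrow", now))  # 2024-05-02 15:00:00
```

The resolved datetime would then be formatted as RFC 3339 when building the Google Calendar API event body.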

Frontend

  • React: JavaScript library for building the user interface.
  • JavaScript (ES6+): For frontend logic and API interactions.
  • HTML/CSS: For structuring and styling the web interface.
  • Web Speech API (assumed, based on typical voice-enabled React apps): For browser-based voice input (if implemented).
  • Axios (assumed): For making HTTP requests to the Flask backend.
  • Node.js/NPM: For managing frontend dependencies and running the development server.

Other Tools

  • Git: Version control system for managing the codebase.
  • GitHub: Hosting the repository at https://github.com/arpanmathur27/SamanthaAI-Conversational-AI-.

Setup Instructions

Prerequisites

  • Python 3.8+: For running the backend.
  • Node.js 16+ and NPM: For the frontend.
  • Git: For cloning the repository.
  • Google Cloud Account: For Calendar and Gmail API credentials (credentials.json).
  • ElevenLabs API Key: For text-to-speech functionality.
  • Hugging Face Token (optional): If using restricted models.

Backend

  1. Navigate to the backend/ directory:
    cd backend
  2. Create a virtual environment and activate it:
    python -m venv venv
    venv\Scripts\activate  # Windows
    # OR
    source venv/bin/activate  # Linux/Mac
  3. Install dependencies:
    pip install -r requirements.txt
  4. Set up environment variables in a .env file:
    HF_TOKEN=your_huggingface_token
    ELEVENLABS_API_KEY=your_elevenlabs_api_key
    
  5. Place credentials.json (from Google Cloud Console) in the backend/ directory.
  6. Run the backend:
    python sam11.py
    The Flask server will start on http://localhost:5000.
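For orientation, the backend exposes something like the endpoint sketched below. The route name and JSON shape here are assumptions for illustration; see sam11.py for the actual API.

```python
# Minimal sketch of the kind of Flask endpoint the React frontend calls.
# Route name and payload shape are assumed, not taken from sam11.py.
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/chat", methods=["POST"])
def chat():
    user_text = request.get_json().get("text", "")
    # The real backend would run the language model, emotion analysis,
    # and ElevenLabs TTS here; this sketch just echoes the input.
    return jsonify({"reply": f"You said: {user_text}"})

# To run standalone: app.run(port=5000)
```

The frontend then POSTs user input as JSON and renders the `reply` text alongside the returned audio.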

Frontend

  1. Navigate to the frontend/ directory:
    cd frontend
  2. Install dependencies:
    npm install
  3. Set up environment variables in a .env file:
    REACT_APP_API_URL=http://localhost:5000
    
  4. Start the frontend:
    npm start
    The React app will open in your browser at http://localhost:3000.

Usage

  • Open the frontend in your browser (http://localhost:3000).
  • Use voice or text input to interact with Samantha.
  • Example commands:
    • "Schedule a meeting at 3 PM tomorrow."
    • "Tell me a joke."
    • "What’s the time?"
    • "Send an email to alice@example.com about the project."
    • "Open VS Code."
    • "I’m feeling sad—can you help?"

Samantha will respond with text (displayed on the frontend) and audio (played through your browser).
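The example commands above imply some form of intent routing before the language model is consulted. A simple keyword-based version might look like this; it is purely illustrative, and the real routing in sam11.py may differ.

```python
# Illustrative keyword-based intent routing for the example commands.
# Intent names are made up; the repo's actual dispatch may differ.
def route(command: str) -> str:
    text = command.lower()
    if "schedule" in text or "meeting" in text:
        return "calendar"
    if "email" in text:
        return "gmail"
    if "joke" in text or "quote" in text:
        return "entertainment"
    if "time" in text or "date" in text:
        return "clock"
    if text.startswith(("open", "launch")):
        return "app_launcher"
    return "chat"  # fall through to the language model

print(route("Tell me a joke"))   # entertainment
print(route("Open VS Code"))     # app_launcher
print(route("I'm feeling sad"))  # chat
```

Commands that match no keyword fall through to the Mistral-7B conversational path, where the emotion analysis adjusts the tone of the reply.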

Project Structure

SamanthaAI-Conversational-AI-/
├── backend/
│   ├── sam11.py              # Main backend script with Flask API
│   └── requirements.txt      # Backend dependencies
├── frontend/
│   ├── src/                  # React source files
│   ├── public/               # Static assets
│   └── package.json          # Frontend dependencies
├── .gitignore                # Files to exclude from Git
└── README.md                 # Project documentation

Future Enhancements

  • Dynamic Recipe API: Integrate a recipe API (e.g., Spoonacular) for real-time recipe suggestions.
  • Weather Updates: Add weather information using an API like OpenWeatherMap.
  • Multilingual Support: Extend Samantha to support multiple languages using translation APIs.
  • Improved UI/UX: Enhance the frontend with better styling (e.g., Tailwind CSS) and animations.
  • CI/CD Pipeline: Set up GitHub Actions for automated testing and deployment.
