Samantha AI is a voice-enabled conversational assistant that helps with tasks ranging from scheduling and email management to emotional support and entertainment. It combines the Mistral-7B language model, an emotion classifier, and external APIs (Google Calendar, Gmail, ElevenLabs) to understand natural language and respond with an empathetic tone.
Samantha AI offers a wide range of features to cater to different user needs:
Voice and Text Interaction:
- Accepts voice input via microphone (using speech recognition) and responds with synthesized speech.
- Supports text-based input through a web frontend for flexible interaction.
Natural Language Understanding:
- Powered by the Mistral-7B model for generating context-aware responses.
- Maintains conversation history for coherent and personalized interactions.
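One way to keep a bounded conversation history and feed it back into the model is sketched below. This is an illustration only, not the project's code: the `ConversationHistory` class, the `max_turns` limit, and the plain `User:` / `Samantha:` prompt layout are all assumptions.

```python
from collections import deque


class ConversationHistory:
    """Keeps only the most recent turns so the prompt stays a bounded size."""

    def __init__(self, max_turns: int = 5):
        # Each entry is a (user_text, assistant_text) pair; the oldest pair
        # is dropped automatically once max_turns is exceeded.
        self.turns = deque(maxlen=max_turns)

    def add_turn(self, user_text: str, assistant_text: str) -> None:
        self.turns.append((user_text, assistant_text))

    def build_prompt(self, new_user_text: str) -> str:
        """Join past turns and the new message into a single prompt string."""
        lines = []
        for user_text, assistant_text in self.turns:
            lines.append(f"User: {user_text}")
            lines.append(f"Samantha: {assistant_text}")
        lines.append(f"User: {new_user_text}")
        lines.append("Samantha:")  # the model continues from here
        return "\n".join(lines)


history = ConversationHistory(max_turns=2)
history.add_turn("Hi", "Hello! How can I help?")
history.add_turn("Tell me a joke", "Why did the chicken cross the road?")
history.add_turn("Why?", "To get to the other side.")
prompt = history.build_prompt("Another one, please")
```

Capping the number of turns keeps the prompt within the model's context window at the cost of forgetting older exchanges.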
Emotion Analysis:
- Uses the `SamLowe/roberta-base-go_emotions` model to detect user emotions (e.g., happiness, sadness, anger).
- Adjusts response tone based on detected emotions (e.g., empathetic for sadness, cheerful for happiness).
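The step from detected emotion to response tone might look like the sketch below. It assumes the classifier scores arrive as a flat list of `{"label", "score"}` dictionaries (the shape a Transformers text-classification pipeline typically returns). The `TONE_BY_EMOTION` table, the `choose_tone` function, the 0.5 threshold, and the sample scores are all hypothetical. Note that the GoEmotions label set uses `joy` rather than "happiness".

```python
from typing import Dict, List

# Hypothetical mapping from GoEmotions labels to a response tone.
TONE_BY_EMOTION = {
    "sadness": "empathetic",
    "grief": "empathetic",
    "joy": "cheerful",
    "excitement": "cheerful",
    "anger": "calm",
    "annoyance": "calm",
}


def choose_tone(scores: List[Dict], threshold: float = 0.5) -> str:
    """Pick a tone from scores shaped like [{"label": ..., "score": ...}]."""
    if not scores:
        return "neutral"
    top = max(scores, key=lambda item: item["score"])
    # Fall back to a neutral tone when the classifier is not confident.
    if top["score"] < threshold:
        return "neutral"
    return TONE_BY_EMOTION.get(top["label"], "neutral")


# Illustrative input only; these are not real model outputs.
sample_scores = [
    {"label": "sadness", "score": 0.81},
    {"label": "neutral", "score": 0.10},
    {"label": "joy", "score": 0.02},
]
tone = choose_tone(sample_scores)
```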
Scheduling and Calendar Management:
- Integrates with Google Calendar API to schedule, modify, and delete events.
- Provides daily event summaries (e.g., "What’s on my schedule today?").
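As a rough sketch of the scheduling step, the snippet below builds an event body in the shape the Google Calendar API accepts for `events.insert`. The `build_event` helper, the one-hour default duration, and the `UTC` time zone are assumptions for illustration, and the actual API call is shown only as a comment because it needs an authorized client.

```python
from datetime import datetime, timedelta


def build_event(summary: str, start: datetime, duration_minutes: int = 60,
                time_zone: str = "UTC") -> dict:
    """Return an event body with summary, start, and end fields."""
    end = start + timedelta(minutes=duration_minutes)
    return {
        "summary": summary,
        "start": {"dateTime": start.isoformat(), "timeZone": time_zone},
        "end": {"dateTime": end.isoformat(), "timeZone": time_zone},
    }


event = build_event("Team meeting", datetime(2025, 1, 15, 15, 0))

# With an authorized client, e.g. one created through
# googleapiclient.discovery.build("calendar", "v3", credentials=creds),
# the event could be created with:
#   service.events().insert(calendarId="primary", body=event).execute()
```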
Email Management:
- Integrates with Gmail API to send, read, and search emails.
- Example: "Send an email to John about the meeting" or "Read my recent emails."
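Sending mail through the Gmail API requires the message to be base64url-encoded and wrapped in a `raw` field. The sketch below shows that encoding step using only the standard library. The `build_gmail_payload` helper and the sample message are hypothetical, and the send call appears as a comment because it needs an authorized client.

```python
import base64
from email.message import EmailMessage


def build_gmail_payload(to: str, subject: str, body: str) -> dict:
    """Encode an email as the {"raw": ...} body used by users.messages.send."""
    message = EmailMessage()
    message["To"] = to
    message["Subject"] = subject
    message.set_content(body)
    raw = base64.urlsafe_b64encode(message.as_bytes()).decode("ascii")
    return {"raw": raw}


payload = build_gmail_payload(
    "alice@example.com", "Project update", "Hi Alice, the report is ready."
)

# With an authorized Gmail client, the message could be sent with:
#   service.users().messages().send(userId="me", body=payload).execute()
```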
Entertainment and Information:
- Fetches random jokes and quotes from external APIs to entertain users.
- Provides time and date information (e.g., "What’s the time?").
App Launching:
- Launches applications on the user’s system (e.g., Notepad, Calculator, VS Code).
- Example: "Open Notepad" or "Launch VS Code."
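A minimal sketch of the app-launching step, assuming a fixed table of known applications. The `APP_COMMANDS` table, the executable names, and both helper functions are hypothetical; the real command names depend on the operating system and on what is available on the `PATH`.

```python
import subprocess
from typing import List, Optional

# Hypothetical table of spoken app names and the commands that start them.
APP_COMMANDS = {
    "notepad": ["notepad.exe"],
    "calculator": ["calc.exe"],
    "vs code": ["code"],
}


def resolve_app_command(utterance: str) -> Optional[List[str]]:
    """Map an utterance such as "Open Notepad" to a command, or None."""
    text = utterance.lower().strip().rstrip(".!?")
    for verb in ("open ", "launch "):
        if text.startswith(verb):
            app_name = text[len(verb):].strip()
            return APP_COMMANDS.get(app_name)
    return None


def launch(utterance: str) -> bool:
    """Start the requested app if it is in the table; report success."""
    command = resolve_app_command(utterance)
    if command is None:
        return False
    subprocess.Popen(command)  # start the app without waiting for it to exit
    return True
```

Looking commands up in a fixed table, rather than passing recognized speech to a shell, means that only known applications can be started.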
Recipe Guidance:
- Offers step-by-step recipe instructions (currently hardcoded, with potential for API integration).
- Example: "How do I make a sandwich?"
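A hardcoded recipe lookup could be as simple as the sketch below. The `RECIPES` table, its steps, and the `recipe_steps` function are illustrative assumptions rather than the project's data.

```python
# Hypothetical hardcoded recipes keyed by dish name.
RECIPES = {
    "sandwich": [
        "Lay out two slices of bread.",
        "Add your fillings to one slice.",
        "Top with the second slice and cut in half.",
    ],
}


def recipe_steps(question: str) -> str:
    """Return numbered steps for the first known dish named in the question."""
    text = question.lower()
    for dish, steps in RECIPES.items():
        if dish in text:
            numbered = [f"{i}. {step}" for i, step in enumerate(steps, start=1)]
            return "\n".join(numbered)
    return "Sorry, I don't have a recipe for that yet."


answer = recipe_steps("How do I make a sandwich?")
```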
Memory and Context Awareness:
- Stores conversation history in `knowledge.json` for context-aware responses.
- Tracks user preferences (e.g., favorite games, travel destinations) to personalize interactions.
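Persisting history and preferences to `knowledge.json` could work along these lines. The file layout (a `history` list plus a `preferences` dictionary) and the helper names are assumptions, since the actual schema is not documented here. The demo writes to a temporary directory so that it does not touch a real file.

```python
import json
import os
import tempfile


def load_knowledge(path: str) -> dict:
    """Read the knowledge file, or return an empty structure if it is missing."""
    if not os.path.exists(path):
        return {"history": [], "preferences": {}}
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def save_knowledge(path: str, knowledge: dict) -> None:
    """Write the knowledge structure back to disk as readable JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(knowledge, f, ensure_ascii=False, indent=2)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "knowledge.json")
    knowledge = load_knowledge(path)
    knowledge["history"].append({"user": "I love chess", "assistant": "Noted!"})
    knowledge["preferences"]["favorite_game"] = "chess"
    save_knowledge(path, knowledge)
    reloaded = load_knowledge(path)
```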
Web Frontend:
- Provides a user-friendly React-based interface to interact with Samantha via text or voice.
- Plays audio responses generated by the backend using ElevenLabs.
Samantha AI is built with the following backend, frontend, and tooling technologies:
- Python: Core programming language for the backend (`sam11.py`).
- Flask: Lightweight web framework to create a REST API for frontend-backend communication.
- Mistral-7B: A 7-billion-parameter language model from Mistral AI, loaded from the Hugging Face Hub, for natural language generation.
- Transformers (Hugging Face): Library to load and use the Mistral-7B and emotion analysis models.
- PyTorch: Deep learning framework for model inference, with 4-bit quantization to reduce memory use.
- SpeechRecognition: Python library for capturing voice input via microphone.
- ElevenLabs API: Text-to-speech service for generating high-quality audio responses with mood variations.
- Google Calendar API: For scheduling and managing calendar events.
- Google Gmail API: For email-related tasks (send, read, search).
- RoBERTa (SamLowe/roberta-base-go_emotions): Pre-trained model for emotion analysis.
- Requests: HTTP library for fetching external data (e.g., jokes, quotes).
- Pygame: For audio playback of ElevenLabs-generated responses.
- python-dateutil: For parsing and manipulating dates in scheduling tasks.
- React: JavaScript library for building the user interface.
- JavaScript (ES6+): For frontend logic and API interactions.
- HTML/CSS: For structuring and styling the web interface.
- Web Speech API (assumed, based on typical voice-enabled React apps): For browser-based voice input (if implemented).
- Axios (assumed): For making HTTP requests to the Flask backend.
- Node.js/NPM: For managing frontend dependencies and running the development server.
- Git: Version control system for managing the codebase.
- GitHub: Hosts the repository at https://github.com/arpanmathur27/SamanthaAI-Conversational-AI-.
- Python 3.8+: For running the backend.
- Node.js 16+ and NPM: For the frontend.
- Git: For cloning the repository.
- Google Cloud Account: For Calendar and Gmail API credentials (`credentials.json`).
- ElevenLabs API Key: For text-to-speech functionality.
- Hugging Face Token (optional): Needed only when using gated or otherwise restricted models.
- Navigate to the `backend/` directory: `cd backend`
- Create a virtual environment and activate it:
    python -m venv venv
    venv\Scripts\activate # Windows
    # OR
    source venv/bin/activate # Linux/Mac
- Install dependencies:
    pip install -r requirements.txt
- Set up environment variables in a `.env` file:
    HF_TOKEN=your_huggingface_token
    ELEVENLABS_API_KEY=your_elevenlabs_api_key
- Place `credentials.json` (from Google Cloud Console) in the `backend/` directory.
- Run the backend:
    python sam11.py
  The Flask server will start on `http://localhost:5000`.
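Before starting the server, it can help to check that the expected settings are present. The sketch below is one hypothetical way to do that: the `missing_settings` helper and the split into required and optional names are assumptions based on the prerequisites listed in this README. Values in a `.env` file are not loaded into the process environment automatically. A package such as python-dotenv (`load_dotenv()`) is a common way to load them, but whether `sam11.py` uses it is not stated here.

```python
import os
from typing import List, Mapping

# Assumed from this README: the ElevenLabs key is needed for speech output,
# while the Hugging Face token is optional.
REQUIRED_VARS = ("ELEVENLABS_API_KEY",)
OPTIONAL_VARS = ("HF_TOKEN",)


def missing_settings(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Example with an explicit mapping instead of the real environment.
missing = missing_settings({"HF_TOKEN": "hf_example"})
```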
- Navigate to the `frontend/` directory: `cd frontend`
- Install dependencies:
    npm install
- Set up environment variables in a `.env` file:
    REACT_APP_API_URL=http://localhost:5000
- Start the frontend:
    npm start
  The React app will open in your browser at `http://localhost:3000`.
- Open the frontend in your browser (`http://localhost:3000`).
- Use voice or text input to interact with Samantha.
- Example commands:
- "Schedule a meeting at 3 PM tomorrow."
- "Tell me a joke."
- "What’s the time?"
- "Send an email to alice@example.com about the project."
- "Open VS Code."
- "I’m feeling sad—can you help?"
Samantha will respond with text (displayed on the frontend) and audio (played through your browser).
SamanthaAI-Conversational-AI-/
├── backend/
│ ├── sam11.py # Main backend script with Flask API
│ └── requirements.txt # Backend dependencies
├── frontend/
│ ├── src/ # React source files
│ ├── public/ # Static assets
│ └── package.json # Frontend dependencies
├── .gitignore # Files to exclude from Git
└── README.md # Project documentation
- Dynamic Recipe API: Integrate a recipe API (e.g., Spoonacular) for real-time recipe suggestions.
- Weather Updates: Add weather information using an API like OpenWeatherMap.
- Multilingual Support: Extend Samantha to support multiple languages using translation APIs.
- Improved UI/UX: Enhance the frontend with better styling (e.g., Tailwind CSS) and animations.
- CI/CD Pipeline: Set up GitHub Actions for automated testing and deployment.