🍂 Autumn AI Assistant

A sophisticated AI assistant with advanced features including streaming responses, G2P-enhanced text-to-speech, and emotional intelligence.

✨ Latest Features (June 2025)

🚀 NEW: Streaming Responses

  • Real-time token generation like ChatGPT/Claude
  • 2.5x faster perceived speed with immediate response start
  • Sentence-level TTS integration for natural conversation flow
  • Comprehensive error handling with fallback to non-streaming
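
The fallback in the last bullet could look roughly like the sketch below. It is not the project's code: `stream_fn` and `complete_fn` are hypothetical callables standing in for whatever streaming and non-streaming clients the assistant actually uses.

```python
from typing import Callable, Iterable, Iterator


def stream_with_fallback(
    prompt: str,
    stream_fn: Callable[[str], Iterable[str]],  # hypothetical streaming client
    complete_fn: Callable[[str], str],          # hypothetical non-streaming client
) -> Iterator[str]:
    """Yield tokens as they arrive; if streaming fails early, yield one full reply."""
    emitted = False
    try:
        for token in stream_fn(prompt):
            emitted = True
            yield token
    except Exception:
        if emitted:
            # Part of the answer was already shown, so re-raise instead of
            # repeating the text with a second, complete reply.
            raise
        yield complete_fn(prompt)
```

Falling back only when nothing has been emitted yet is a design choice of this sketch; it avoids showing the same text twice.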

🎡 Enhanced TTS with G2P

  • Misaki G2P engine for improved pronunciation
  • Custom pronunciation dictionary for technical terms
  • Multi-engine TTS support (Kokoro, Edge-TTS, pyttsx3 fallbacks)
  • Emotion-aware voice selection with different speaking styles
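
A custom pronunciation dictionary can be applied as a plain text pass before the text reaches the G2P/TTS engine. The sketch below shows one way to do that. The dictionary entries and the function name are made up for illustration, and the Misaki engine itself is not called here.

```python
import re

# Hypothetical entries; the project's actual dictionary is not shown in this README.
PRONUNCIATIONS = {
    "SQLite": "sequel light",
    "PyQt6": "pie cute six",
    "Ollama": "oh llama",
}


def apply_pronunciations(text: str, table: dict = PRONUNCIATIONS) -> str:
    """Replace whole-word, case-insensitive matches with their spoken form."""
    if not table:
        return text
    lookup = {key.lower(): spoken for key, spoken in table.items()}
    # Longest keys first so a longer term wins over one of its prefixes.
    keys = sorted(lookup, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(k) for k in keys) + r")(?!\w)",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: lookup[m.group(1).lower()], text)
```

For example, `apply_pronunciations("Open the sqlite file")` returns `"Open the sequel light file"`.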

🔧 Technical Improvements

  • Unicode-safe logging - all emoji/special character issues resolved
  • Optimized AI model: Now using gemma3:1b-it-q4_K_M for better performance
  • Advanced error monitoring with automatic recovery
  • Memory optimizations and cleanup management

Features

🎭 Sophisticated Personality

  • Dual Modes: Serious (business) and Free (casual) personality modes
  • Intelligent Switching: Automatically detects context and switches modes
  • Sarcastic & Witty: Light humor and contextual sarcasm in casual mode
  • Professional: Efficient and formal communication in serious mode

🧠 Hybrid Intelligence

  • Local AI: Ollama + Phi-3 Mini for privacy and speed
  • Cloud Fallback: Google Gemini API for complex reasoning
  • Smart Routing: Automatically chooses best AI service for each task

💾 Incredible Memory

  • Conversation History: Remembers your chats and context
  • User Preferences: Learns and stores your preferences
  • Personal Facts: Remembers important information about you
  • Smart Retention: Automatic cleanup of old, unimportant memories

🎀 Voice Interaction

  • Speech-to-Text: OpenAI Whisper (local processing)
  • Text-to-Speech: pyttsx3 with personality-matched voice tones
  • Wake Words: "Hey Autumn" voice activation
  • Mode-Aware: Voice changes between serious and casual modes

🖥️ Beautiful GUI

  • Draggable Widget: Always-on-top, movable interface
  • Autumn Ball: Compact autumn leaf icon when minimized
  • Expandable Chat: Full chat interface when expanded
  • Modern Design: Autumn-themed colors and styling

🔗 Smart Integrations

  • Calendar: Schedule management and conflict detection
  • Web Search: Controlled access to weather, news, and information
  • Task Management: Proactive reminders and follow-ups

Hardware Requirements

Optimized for lightweight systems:

  • RAM: 4GB minimum, 8GB+ recommended
  • Storage: 5GB for models and dependencies
  • GPU: Optional (RTX 3050 4GB works perfectly)
  • CPU: Any modern processor (tested on Ryzen 5 5500H)

Resource Usage:

  • Memory: <200MB typical usage
  • CPU: <50% on dual-core systems
  • Storage: ~3GB for AI models

🚀 Quick Start

Prerequisites

  • Python 3.8+
  • Ollama installed and running
  • GPU recommended (CUDA/ROCm) for best performance

Installation

  1. Clone the repository

    git clone https://github.com/shhreyuuFTW/Autumn_AI_Assistant.git
    cd Autumn_AI_Assistant
  2. Install dependencies

    pip install -r requirements.txt
  3. Set up Ollama model

    ollama pull gemma3:1b-it-q4_K_M
  4. Configure API keys (optional)

    # Copy example config and edit with your API keys
    cp config/settings.py.example config/settings.py
    # Edit config/settings.py with your Gemini API key if using cloud features
  5. Run Autumn

    python app.py

🌊 Streaming Demo

Experience the real-time streaming responses:

python test_streaming_simple.py

This demonstrates:

  • Token-by-token generation (like ChatGPT)
  • Sentence detection for TTS integration
  • Performance comparison with traditional non-streaming
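
Sentence detection on a token stream can be sketched as a small buffer that emits a sentence as soon as its end is seen, so TTS can start speaking before the full reply is generated. This is an illustrative sketch, not the contents of `test_streaming_simple.py`; the function name and the end-of-sentence rule are assumptions.

```python
import re
from typing import Iterable, Iterator

# A sentence ends at '.', '!' or '?' followed by whitespace. This is a
# simplification: abbreviations such as "e.g. " will also split here.
_SENTENCE_END = re.compile(r"([.!?]+)\s+")


def sentences_from_tokens(tokens: Iterable[str]) -> Iterator[str]:
    """Buffer streamed tokens and yield each sentence once it is complete."""
    buffer = ""
    for token in tokens:
        buffer += token
        while True:
            match = _SENTENCE_END.search(buffer)
            if match is None:
                break
            yield buffer[: match.end(1)].strip()
            buffer = buffer[match.end():]
    if buffer.strip():
        # Flush whatever is left when the stream ends.
        yield buffer.strip()
```

Each yielded sentence can be handed to the TTS engine while later tokens are still arriving.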

🎡 G2P Enhanced TTS Demo

Test the improved pronunciation:

python demo_g2p_fixed.py

Compare traditional vs G2P-enhanced pronunciation with difficult words.

Project Structure

Autumn_AI_Assistant/
├── app.py                 # Main application entry point
├── requirements.txt       # Python dependencies
├── setup.py               # Automated setup script
├── .env                   # Environment configuration
├── config/
│   ├── personality.py     # Autumn's personality system
│   └── settings.py        # Application settings
├── core/
│   ├── brain.py           # AI intelligence and routing
│   ├── memory.py          # Memory and storage system
│   └── voice.py           # Voice input/output
├── services/
│   ├── gemini.py          # Gemini API integration
│   ├── web_search.py      # Web information access
│   └── calendar.py        # Calendar integration
├── gui/
│   └── widget.py          # PyQt6 GUI interface
└── data/
    └── autumn_memory.db   # SQLite memory database

Configuration

Environment Variables

# Required only for the cloud (Gemini) fallback
GEMINI_API_KEY=your_key_here

# Optional Performance
MEMORY_LIMIT_MB=200
CPU_LIMIT_PERCENT=50

# Optional Features
AUTO_START_VOICE=true
DEBUG=false
LOG_LEVEL=INFO
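
One way to read these variables in Python is sketched below. The `Settings` class and `load_settings` function are hypothetical (the real `config/settings.py` is not shown here), and using the example values above as defaults is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def _as_bool(value: str) -> bool:
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    gemini_api_key: Optional[str]
    memory_limit_mb: int
    cpu_limit_percent: int
    auto_start_voice: bool
    debug: bool
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables (defaults are assumed, see above)."""
    return Settings(
        gemini_api_key=env.get("GEMINI_API_KEY") or None,
        memory_limit_mb=int(env.get("MEMORY_LIMIT_MB", "200")),
        cpu_limit_percent=int(env.get("CPU_LIMIT_PERCENT", "50")),
        auto_start_voice=_as_bool(env.get("AUTO_START_VOICE", "true")),
        debug=_as_bool(env.get("DEBUG", "false")),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to test.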

Voice Settings

  • STT Model: Whisper Tiny (39MB, fast)
  • TTS Engine: pyttsx3 (local, customizable)
  • Wake Words: "autumn", "hey autumn"
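
Once speech has been transcribed, wake-word handling reduces to a check on the start of the transcript. The sketch below is illustrative and assumes the wake word must come first in the utterance; `strip_wake_word` is a hypothetical helper, not a function from `core/voice.py`.

```python
import re
from typing import Optional

# Longest phrase first so "hey autumn" is removed as a whole.
WAKE_WORDS = ("hey autumn", "autumn")


def strip_wake_word(transcript: str) -> Optional[str]:
    """Return the lowercased command after a leading wake word, or None if absent."""
    text = transcript.strip().lower()
    for wake in WAKE_WORDS:
        # \b stops "autumn" from matching inside words such as "autumnal".
        match = re.match(rf"{re.escape(wake)}\b[\s,.!?]*", text)
        if match:
            return text[match.end():]
    return None
```

An empty string means the wake word was heard with no command after it, which is the cue to answer and keep listening.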

AI Models

  • Primary: Ollama Phi-3 Mini (2.3GB, local)
  • Fallback: Google Gemini Flash (cloud)

Usage Examples

Text Interaction

You: "Schedule a meeting with John tomorrow at 2 PM"
Autumn: "I'll schedule that meeting for you! Let me check for any conflicts... Done! Meeting with John set for tomorrow at 2 PM. Anything else you need, sir? 😊"

Voice Interaction

You: "Hey Autumn"
Autumn: "Yes sir? How can I help you today?"
You: "What's my schedule like?"
Autumn: "Let me check your calendar... You have three meetings today: 9 AM with the team, 1 PM lunch with Sarah, and 4 PM project review. Looks like a busy but manageable day!"

Mode Switching

You: "Urgent: I need the quarterly report ASAP"
Autumn: [Serious Mode] "Understood. I'll locate the quarterly report immediately and have it ready for you. Processing now."

You: "Thanks, you're awesome!"
Autumn: [Free Mode] "Aww, thank you! Just doing my job... though I do it with exceptional style! 😏"
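
A very simple version of this switching is a keyword check on each message. The project's own detection logic is not shown in this README, so the cue words and the `detect_mode` function below are assumptions for illustration only.

```python
import re

# Hypothetical cue words that signal a business or urgent context.
SERIOUS_CUES = {"urgent", "asap", "deadline", "immediately", "report"}


def detect_mode(message: str) -> str:
    """Return "serious" if the message contains a cue word, otherwise "free"."""
    words = set(re.findall(r"[a-z']+", message.lower()))
    return "serious" if words & SERIOUS_CUES else "free"
```

With these cues, the first message in the example above maps to `"serious"` and the second to `"free"`.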

Technical Architecture

Local-First Design

  • Privacy: Voice and personal data stay on your machine
  • Speed: Local processing for instant responses
  • Reliability: Works offline for basic interactions
  • Efficiency: Optimized for minimal resource usage

Intelligent Routing

User Input → Mode Detection → Task Classification
                ↓
        ┌─── Simple Chat ──── Local AI (Phi-3)
        ├─── Complex Task ─── Cloud AI (Gemini)
        ├─── Web Info ─────── Web APIs
        └─── Calendar ─────── Calendar Service
                ↓
        Response + Personality → Voice/Text Output
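
The task classification step in the diagram could be approximated with a few keyword rules, as in the sketch below. The keyword lists, the 40-word cutoff, and the `Route` and `classify` names are all assumptions; `core/brain.py` may use a different approach.

```python
from enum import Enum


class Route(Enum):
    LOCAL_AI = "local"
    CLOUD_AI = "cloud"
    WEB = "web"
    CALENDAR = "calendar"


# Hypothetical keyword lists, for illustration only.
CALENDAR_WORDS = ("schedule", "meeting", "calendar", "appointment")
WEB_WORDS = ("weather", "news", "search")
COMPLEX_WORDS = ("explain", "analyze", "compare", "summarize")


def classify(message: str) -> Route:
    """Pick a backend for a message using simple substring checks."""
    text = message.lower()
    if any(word in text for word in CALENDAR_WORDS):
        return Route.CALENDAR
    if any(word in text for word in WEB_WORDS):
        return Route.WEB
    # Analytical or very long requests go to the cloud model (arbitrary cutoff).
    if any(word in text for word in COMPLEX_WORDS) or len(text.split()) > 40:
        return Route.CLOUD_AI
    return Route.LOCAL_AI
```

Substring matching is deliberately crude (for example, "research" contains "search"); a real router would need word boundaries or a classifier.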

Memory System

  • SQLite Database: Persistent storage
  • Smart Indexing: Fast retrieval of relevant memories
  • Automatic Cleanup: Configurable retention policies
  • Context Awareness: Uses memory to inform responses
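
The storage and cleanup ideas above can be sketched with the standard `sqlite3` module. The table layout, the `importance` score, and the retention thresholds are assumptions; the real schema of `data/autumn_memory.db` is not documented here.

```python
import sqlite3
import time
from typing import Optional


def open_memory(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a memory database with an index on creation time."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS memories (
               id INTEGER PRIMARY KEY,
               content TEXT NOT NULL,
               importance REAL NOT NULL DEFAULT 0.5,
               created_at REAL NOT NULL
           )"""
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_memories_created ON memories(created_at)"
    )
    return conn


def remember(conn: sqlite3.Connection, content: str,
             importance: float = 0.5, now: Optional[float] = None) -> None:
    conn.execute(
        "INSERT INTO memories (content, importance, created_at) VALUES (?, ?, ?)",
        (content, importance, time.time() if now is None else now),
    )
    conn.commit()


def cleanup(conn: sqlite3.Connection, max_age_days: float = 30.0,
            keep_importance: float = 0.8, now: Optional[float] = None) -> int:
    """Delete old, unimportant memories and return how many rows were removed."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    cur = conn.execute(
        "DELETE FROM memories WHERE created_at < ? AND importance < ?",
        (cutoff, keep_importance),
    )
    conn.commit()
    return cur.rowcount
```

Old memories with an importance at or above `keep_importance` survive cleanup, which is one way to model "smart retention".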

Development

Adding New Features

  1. New Services: Add to services/ directory
  2. Personality Traits: Modify config/personality.py
  3. Memory Types: Extend core/memory.py
  4. GUI Components: Update gui/widget.py

Testing

# Test voice system
python -c "from core.voice import AutumnVoice; import asyncio; asyncio.run(AutumnVoice().initialize())"

# Test AI brain
python -c "from core.brain import AutumnBrain; print('Brain module OK')"

# Test memory
python -c "from core.memory import AutumnMemory; print('Memory module OK')"

Debugging

# Enable debug mode
export DEBUG=true
export LOG_LEVEL=DEBUG

python app.py

Troubleshooting

Common Issues

PyAudio Installation (Windows)

# If PyAudio fails to install
pip install pipwin
pipwin install pyaudio

Ollama Connection

# Check if Ollama is running
ollama list
curl http://localhost:11434/api/tags
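
The same check can be done from Python with only the standard library. The helper names below are hypothetical; the endpoint is the `/api/tags` URL used in the `curl` command above, and the parser assumes the response is a JSON object with a `models` list whose entries have a `name` field.

```python
import json
import urllib.request
from typing import List, Optional


def parse_model_names(payload: str) -> List[str]:
    """Extract model names from the JSON body returned by /api/tags."""
    data = json.loads(payload)
    return [m["name"] for m in data.get("models", []) if "name" in m]


def list_ollama_models(base_url: str = "http://localhost:11434",
                       timeout: float = 2.0) -> Optional[List[str]]:
    """Return installed model names, or None if the server cannot be reached."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return parse_model_names(resp.read().decode("utf-8"))
    except (OSError, ValueError):
        # OSError covers connection failures (URLError is a subclass);
        # ValueError covers a body that is not valid JSON.
        return None
```

A result of `None` means Ollama is not reachable; a list without `gemma3:1b-it-q4_K_M` means the model still needs to be pulled.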

Voice Issues

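# Test microphone access (prints the default input device, or raises an error if none is found)
python -c "import pyaudio; p = pyaudio.PyAudio(); print('Microphone OK:', p.get_default_input_device_info()['name']); p.terminate()"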

Memory Issues

# Check database
sqlite3 data/autumn_memory.db ".tables"

Performance Optimization

  • Use Whisper "tiny" model for fastest STT
  • Limit conversation history to 50 turns
  • Enable automatic memory cleanup
  • Use local AI for simple interactions
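
Capping the history at 50 turns is straightforward with a bounded `deque`, which drops the oldest turn automatically. The `ConversationHistory` class below is a hypothetical sketch, not the project's memory implementation.

```python
from collections import deque
from typing import Deque, List, Tuple

MAX_TURNS = 50  # the limit suggested above


class ConversationHistory:
    """Keep only the most recent turns; older ones are discarded automatically."""

    def __init__(self, max_turns: int = MAX_TURNS) -> None:
        self._turns: Deque[Tuple[str, str]] = deque(maxlen=max_turns)

    def add(self, user: str, assistant: str) -> None:
        self._turns.append((user, assistant))

    def as_list(self) -> List[Tuple[str, str]]:
        return list(self._turns)
```

Because the bound is enforced on every `append`, no separate cleanup pass is needed for the in-memory history.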

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Test thoroughly
  5. Submit a pull request

License

MIT License - See LICENSE file for details

Acknowledgments

  • OpenAI Whisper for excellent local STT
  • Ollama for making local LLMs accessible
  • PyQt6 for the beautiful GUI framework
  • Google Gemini for powerful cloud AI capabilities

Built with ❤️ for productivity and personality

Autumn AI Assistant - Your sophisticated digital companion 🍂
