A sophisticated AI assistant with advanced features including streaming responses, G2P-enhanced text-to-speech, and emotional intelligence.
- Real-time token generation like ChatGPT/Claude
- 2.5x faster perceived speed with immediate response start
- Sentence-level TTS integration for natural conversation flow
- Comprehensive error handling with fallback to non-streaming
- Misaki G2P engine for improved pronunciation
- Custom pronunciation dictionary for technical terms
- Multi-engine TTS support (Kokoro, Edge-TTS, pyttsx3 fallbacks)
- Emotion-aware voice selection with different speaking styles
- Unicode-safe logging - all emoji/special character issues resolved
- Optimized AI model: Now using gemma3:1b-it-q4_K_M for better performance
- Advanced error monitoring with automatic recovery
- Memory optimizations and cleanup management
- Dual Modes: Serious (business) and Free (casual) personality modes
- Intelligent Switching: Automatically detects context and switches modes
- Sarcastic & Witty: Light humor and contextual sarcasm in casual mode
- Professional: Efficient and formal communication in serious mode
- Local AI: Ollama + Phi-3 Mini for privacy and speed
- Cloud Fallback: Google Gemini API for complex reasoning
- Smart Routing: Automatically chooses best AI service for each task
- Conversation History: Remembers your chats and context
- User Preferences: Learns and stores your preferences
- Personal Facts: Remembers important information about you
- Smart Retention: Automatic cleanup of old, unimportant memories
- Speech-to-Text: OpenAI Whisper (local processing)
- Text-to-Speech: pyttsx3 with personality-matched voice tones
- Wake Words: "Hey Autumn" voice activation
- Mode-Aware: Voice changes between serious and casual modes
- Draggable Widget: Always-on-top, movable interface
- Autumn Ball: Compact autumn leaf icon when minimized
- Expandable Chat: Full chat interface when expanded
- Modern Design: Autumn-themed colors and styling
- Calendar: Schedule management and conflict detection
- Web Search: Controlled access to weather, news, and information
- Task Management: Proactive reminders and follow-ups
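The automatic mode switching described above can be approximated with a keyword heuristic; the sketch below is illustrative (the cue lists and function name are assumptions, not the code in config/personality.py):

```python
import re

# Hypothetical cue lists for context-based mode detection; the real
# implementation in config/personality.py may work differently.
SERIOUS_CUES = {"urgent", "asap", "deadline", "meeting", "report", "schedule"}
CASUAL_CUES = {"lol", "haha", "thanks", "awesome", "hey"}

def detect_mode(message: str) -> str:
    """Return 'serious' or 'free' based on keyword cues in the message."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    serious_hits = len(words & SERIOUS_CUES)
    casual_hits = len(words & CASUAL_CUES)
    # Bias toward serious mode on ties so business requests are never joked at.
    return "serious" if serious_hits > 0 and serious_hits >= casual_hits else "free"
```

A real detector would also weigh conversation history, not just the current message.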
Optimized for lightweight systems:
- RAM: 4GB minimum, 8GB+ recommended
- Storage: 5GB for models and dependencies
- GPU: Optional (RTX 3050 4GB works perfectly)
- CPU: Any modern processor (tested on Ryzen 5 5500H)
Resource Usage:
- Memory: <200MB typical usage
- CPU: <50% on dual-core systems
- Storage: ~3GB for AI models
- Python 3.8+
- Ollama installed and running
- GPU recommended (CUDA/ROCm) for best performance
1. Clone the repository
   git clone https://github.com/shhreyuuFTW/Autumn_AI_Assistant.git
   cd Autumn_AI_Assistant

2. Install dependencies
   pip install -r requirements.txt

3. Set up the Ollama model
   ollama pull gemma3:1b-it-q4_K_M

4. Configure API keys (optional)
   # Copy the example config and edit it with your API keys
   cp config/settings.py.example config/settings.py
   # Add your Gemini API key to config/settings.py if using cloud features

5. Run Autumn
   python app.py
Experience the real-time streaming responses:
python test_streaming_simple.py
This demonstrates:
- Token-by-token generation (like ChatGPT)
- Sentence detection for TTS integration
- Performance comparison with traditional non-streaming
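The sentence detection step can be illustrated with a small generator that buffers streamed tokens and yields complete sentences for TTS; this is a sketch under assumed names, not the code in test_streaming_simple.py:

```python
import re
from typing import Iterable, Iterator

# Illustrative sketch: buffer streamed tokens and emit whole sentences,
# so TTS can start speaking before generation finishes.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")

def sentences_from_tokens(tokens: Iterable[str]) -> Iterator[str]:
    buffer = ""
    for token in tokens:
        buffer += token
        # Split off any complete sentences; keep the partial tail buffered.
        parts = _SENTENCE_END.split(buffer)
        for sentence in parts[:-1]:
            yield sentence.strip()
        buffer = parts[-1]
    if buffer.strip():
        yield buffer.strip()
```

Each yielded sentence can be handed to the TTS queue immediately, which is what makes the perceived latency drop.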
Test the improved pronunciation:
python demo_g2p_fixed.py
Compare traditional vs G2P-enhanced pronunciation with difficult words.
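A custom pronunciation dictionary like the one this feature describes can be approximated by rewriting known technical terms before text reaches the TTS engine; the entries and function below are illustrative, not Autumn's shipped dictionary:

```python
import re

# Illustrative pronunciation overrides for terms TTS engines often mangle;
# the real dictionary ships with the G2P integration and may differ.
PRONUNCIATIONS = {
    "SQLite": "ess cue lite",
    "PyQt6": "pie cute six",
    "API": "A P I",
}

def apply_pronunciations(text: str) -> str:
    """Replace known technical terms with phonetic spellings for TTS."""
    for term, spoken in PRONUNCIATIONS.items():
        text = re.sub(rf"\b{re.escape(term)}\b", spoken, text)
    return text
```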
Autumn_AI_Assistant/
├── app.py                # Main application entry point
├── requirements.txt      # Python dependencies
├── setup.py              # Automated setup script
├── .env                  # Environment configuration
├── config/
│   ├── personality.py    # Autumn's personality system
│   └── settings.py       # Application settings
├── core/
│   ├── brain.py          # AI intelligence and routing
│   ├── memory.py         # Memory and storage system
│   └── voice.py          # Voice input/output
├── services/
│   ├── gemini.py         # Gemini API integration
│   ├── web_search.py     # Web information access
│   └── calendar.py       # Calendar integration
├── gui/
│   └── widget.py         # PyQt6 GUI interface
└── data/
    └── autumn_memory.db  # SQLite memory database
# Required
GEMINI_API_KEY=your_key_here
# Optional Performance
MEMORY_LIMIT_MB=200
CPU_LIMIT_PERCENT=50
# Optional Features
AUTO_START_VOICE=true
DEBUG=false
LOG_LEVEL=INFO
- STT Model: Whisper Tiny (39MB, fast)
- TTS Engine: pyttsx3 (local, customizable)
- Wake Words: "autumn", "hey autumn"
- Primary: Ollama Phi-3 Mini (2.3GB, local)
- Fallback: Google Gemini Flash (cloud)
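The primary/fallback arrangement can be sketched as a wrapper that tries the local model first and falls back to the cloud on error; the client callables here are placeholders, not the actual APIs in core/brain.py or services/gemini.py:

```python
from typing import Callable

def ask_with_fallback(prompt: str,
                      local: Callable[[str], str],
                      cloud: Callable[[str], str]) -> str:
    """Try the local model first for privacy and speed; fall back to cloud."""
    try:
        return local(prompt)
    except Exception:
        # Local model unavailable (e.g. Ollama not running): use the cloud API.
        return cloud(prompt)
```

In practice, `local` would wrap an Ollama client call and `cloud` a Gemini client call.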
You: "Schedule a meeting with John tomorrow at 2 PM"
Autumn: "I'll schedule that meeting for you! Let me check for any conflicts... Done! Meeting with John set for tomorrow at 2 PM. Anything else you need, sir?"
You: "Hey Autumn"
Autumn: "Yes sir? How can I help you today?"
You: "What's my schedule like?"
Autumn: "Let me check your calendar... You have three meetings today: 9 AM with the team, 1 PM lunch with Sarah, and 4 PM project review. Looks like a busy but manageable day!"
You: "Urgent: I need the quarterly report ASAP"
Autumn: [Serious Mode] "Understood. I'll locate the quarterly report immediately and have it ready for you. Processing now."
You: "Thanks, you're awesome!"
Autumn: [Free Mode] "Aww, thank you! Just doing my job... though I do it with exceptional style!"
- Privacy: Voice and personal data stay on your machine
- Speed: Local processing for instant responses
- Reliability: Works offline for basic interactions
- Efficiency: Optimized for minimal resource usage
User Input → Mode Detection → Task Classification
                    ↓
    ├── Simple Chat ────→ Local AI (Phi-3)
    ├── Complex Task ───→ Cloud AI (Gemini)
    ├── Web Info ───────→ Web APIs
    └── Calendar ───────→ Calendar Service
                    ↓
Response + Personality → Voice/Text Output
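The task-classification step in this flow could be a lightweight keyword classifier; the sketch below mirrors the diagram's four routing targets, but the keyword lists are assumptions, not the logic in core/brain.py:

```python
def classify_task(message: str) -> str:
    """Map a message to one of the routing targets from the flow diagram."""
    text = message.lower()
    if any(w in text for w in ("schedule", "meeting", "calendar", "remind")):
        return "calendar"   # → Calendar Service
    if any(w in text for w in ("weather", "news", "search", "look up")):
        return "web"        # → Web APIs
    if any(w in text for w in ("analyze", "summarize", "explain", "write")):
        return "cloud"      # complex task → Gemini
    return "local"          # simple chat → Phi-3
```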
- SQLite Database: Persistent storage
- Smart Indexing: Fast retrieval of relevant memories
- Automatic Cleanup: Configurable retention policies
- Context Awareness: Uses memory to inform responses
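A minimal sketch of such a store, with importance-based retention cleanup, might look like this (the schema, table name, and thresholds are illustrative, not those of core/memory.py):

```python
import sqlite3
import time

# Illustrative memory store with importance-based retention;
# the schema and thresholds are assumptions, not Autumn's actual ones.
def open_memory(path: str = ":memory:") -> sqlite3.Connection:
    db = sqlite3.connect(path)
    db.execute("""CREATE TABLE IF NOT EXISTS memories (
        id INTEGER PRIMARY KEY,
        content TEXT NOT NULL,
        importance REAL NOT NULL,
        created_at REAL NOT NULL)""")
    # Index the timestamp column so retention queries stay fast.
    db.execute("CREATE INDEX IF NOT EXISTS idx_created ON memories(created_at)")
    return db

def remember(db: sqlite3.Connection, content: str, importance: float = 0.5) -> None:
    db.execute("INSERT INTO memories (content, importance, created_at) VALUES (?, ?, ?)",
               (content, importance, time.time()))

def cleanup(db: sqlite3.Connection, max_age_days: float = 30,
            min_importance: float = 0.7) -> int:
    """Delete old, unimportant memories; important ones are kept regardless of age."""
    cutoff = time.time() - max_age_days * 86400
    cur = db.execute("DELETE FROM memories WHERE created_at < ? AND importance < ?",
                     (cutoff, min_importance))
    return cur.rowcount
```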
- New Services: Add to the services/ directory
- Personality Traits: Modify config/personality.py
- Memory Types: Extend core/memory.py
- GUI Components: Update gui/widget.py
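A new module under services/ might follow a small common interface so the brain can route to it; this skeleton is a hypothetical convention, not one the repository documents:

```python
# Hypothetical skeleton for a new module under services/;
# the base-class convention is an assumption, not documented by the repo.
class Service:
    name: str = "base"

    def can_handle(self, message: str) -> bool:
        raise NotImplementedError

    def handle(self, message: str) -> str:
        raise NotImplementedError

class WeatherService(Service):
    name = "weather"

    def can_handle(self, message: str) -> bool:
        return "weather" in message.lower()

    def handle(self, message: str) -> str:
        # A real implementation would call a weather API here.
        return "It's sunny."
```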
# Test voice system
python -c "from core.voice import AutumnVoice; import asyncio; asyncio.run(AutumnVoice().initialize())"
# Test AI brain
python -c "from core.brain import AutumnBrain; print('Brain module OK')"
# Test memory
python -c "from core.memory import AutumnMemory; print('Memory module OK')"
# Enable debug mode
export DEBUG=true
export LOG_LEVEL=DEBUG
python app.py
PyAudio Installation (Windows)
# If PyAudio fails to install
pip install pipwin
pipwin install pyaudio
Ollama Connection
# Check if Ollama is running
ollama list
curl http://localhost:11434/api/tags
Voice Issues
# Test microphone access
python -c "import pyaudio; print('Microphone OK')"
Memory Issues
# Check database
sqlite3 data/autumn_memory.db ".tables"
- Use Whisper "tiny" model for fastest STT
- Limit conversation history to 50 turns
- Enable automatic memory cleanup
- Use local AI for simple interactions
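Capping conversation history, as suggested above, is a one-liner if turns are kept in a bounded deque; a minimal sketch (names are illustrative):

```python
from collections import deque

# Keep only the most recent turns; older ones fall off automatically.
MAX_TURNS = 50
history: deque = deque(maxlen=MAX_TURNS)

def add_turn(role: str, text: str) -> None:
    history.append((role, text))
```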
- Fork the repository
- Create a feature branch
- Make your changes
- Test thoroughly
- Submit a pull request
MIT License - See LICENSE file for details
- OpenAI Whisper for excellent local STT
- Ollama for making local LLMs accessible
- PyQt6 for the beautiful GUI framework
- Google Gemini for powerful cloud AI capabilities
Built with ❤️ for productivity and personality
Autumn AI Assistant - Your sophisticated digital companion