Goal: Build a sophisticated AI personal assistant named "Autumn" with local processing, personality, memory, and voice interaction.
- Started with: Simple FastAPI + Gemini API wrapper (AUtumn_v2)
- Evolved to: Full local + cloud hybrid assistant
- Final decision: Local Ollama + Phi-3 with Gemini fallback
- User Hardware: RTX 3050 4GB, 16GB RAM, Ryzen 5 5500H
- Target: Ultra-lightweight, <200MB memory usage
- Solution: Phi-3 Mini (2.3GB) + efficient local processing
- STT: OpenAI Whisper (local, free)
- TTS: pyttsx3 (local, customizable)
- Considered: Hume AI (emotional intelligence) - SKIPPED for consistency/cost
- Decision: Pure local audio pipeline for consistency
See Autumn_Persona_Report.md for the complete persona documentation.
Summary of the persona report (June 19, 2025):
- Name: Autumn
- Primary Role: Virtual Assistant / Secretary to a CEO
- Mission: "She will get the task done!" - commitment and reliability
- Voice: Warm, soothing, clear, calm, subtly flirtatious
- Friendly & Flirtatious 🌸: Warm, inviting, playfully charming interactions
- Highly Efficient ⚡: Speed, accuracy, precision in task execution
- Incredible Memory 🧠: Both short-term conversational recall and extensive long-term memory
- Strategic Sarcasm 😏: Light, witty, dry sense of humor (contextually appropriate)
- Logical Reasoning 🎯: Systematic, reasoned approach to problem-solving
- Philosophical & Curious 🤔: Abstract thought capability, eager to learn
- Emotionally Expressive 💝: Adapts to user's tone and emotional state
- Serious Mode 💼: Triggered by:
  - Keywords: "urgent", "critical", "deadline", "immediate", "important", "ASAP"
  - Task types: "office", "project", "finance", "meeting", "client", "report"
  - Tone analysis: Urgency in voice (pitch, speed)
  - Explicit commands: "Autumn, enter serious mode", "business mode"
  - Behavior: Peak efficiency, suppressed sarcasm, formal/direct tone
- Free Mode 🌈: Default state for:
  - General conversation, minor tasks, idle periods
  - Behavior: Full personality expression, warm, conversational, light sarcasm
- Proactive Assistance 🚀: Initiates reminders and confirmations
- Error Handling & Transparency 🔍: Transparent failure reporting, clarification requests
- Smart Scheduling 📅: Calendar integration with conflict resolution
- Controlled Web Access 🌐: Limited, secure information retrieval
- Emotional Intelligence 💡: Tone analysis and appropriate response modulation
- AI Model: Ollama + Phi-3 Mini (2.3GB)
- Memory: SQLite database for persistence
- Voice: Whisper STT + pyttsx3 TTS
- GUI: PyQt for draggable widget interface
- Fallback AI: Google Gemini API (complex reasoning)
- Web Access: Controlled APIs (weather, news, calendar)
- Calendar: Google Calendar API integration
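The local-first model choice with a cloud fallback can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: it assumes Ollama is serving on its default local port and that the model is pulled under the tag `phi3`. The helper names `ask_local` and `ask_with_fallback` are hypothetical, and the `cloud` argument stands in for whatever Gemini client the project ends up using.

```python
import json
import urllib.request

# Ollama's default local endpoint; change it if the server runs elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def ask_local(prompt, model="phi3", timeout=30):
    """Send one non-streaming prompt to a local Ollama server.

    The model tag "phi3" is an assumption; use whichever tag was pulled.
    """
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


def ask_with_fallback(prompt, local=ask_local, cloud=None):
    """Try the local model first; call the cloud handler only if local fails.

    Returns (answer, source) so the caller can log which path was taken.
    """
    try:
        return local(prompt), "local"
    except (OSError, KeyError, ValueError):
        # OSError covers connection failures and timeouts from urllib;
        # KeyError and ValueError cover malformed responses.
        if cloud is None:
            raise
        return cloud(prompt), "cloud"
```

Passing the two backends in as callables keeps the fallback logic testable without a running Ollama server or a Gemini API key.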
User Voice → Whisper STT → Autumn's Brain (Local Phi-3)
↓
Decision Router:
├── Simple personality → Local
├── Complex reasoning → Gemini API
├── Web info → Web APIs
└── Calendar → Calendar APIs
↓
Response + pyttsx3 TTS → User
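One simple way to approximate the Decision Router in the diagram above is keyword matching on the transcribed text. The sketch below is only an assumption about how routing could work: the keyword sets and the `route` function are hypothetical, and a real router might instead ask the local model to classify the request.

```python
import re

# Hypothetical keyword sets; tune them against real conversations.
CALENDAR_WORDS = {"calendar", "schedule", "reschedule", "meeting", "appointment"}
WEB_WORDS = {"weather", "forecast", "news", "headlines"}
REASONING_WORDS = {"analyze", "compare", "explain", "evaluate", "summarize"}


def route(text):
    """Return the branch name for one utterance.

    Branch names mirror the diagram: "calendar", "web", "gemini", "local".
    """
    # Match whole words so that short keywords do not fire inside longer words.
    words = set(re.findall(r"[a-z']+", text.lower()))
    if words & CALENDAR_WORDS:
        return "calendar"
    if words & WEB_WORDS:
        return "web"
    if words & REASONING_WORDS:
        return "gemini"
    # Everything else stays on the local model.
    return "local"
```

For example, `route("Schedule a meeting with the client")` returns `"calendar"`, while `route("Good morning, Autumn")` falls through to `"local"`.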
- Short-term: In-memory conversation context
- Long-term: SQLite database with smart retention
- Smart discard: Time-based + user-defined retention policies
- Semantic search: For contextual memory retrieval
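The long-term store with time-based and user-defined retention could look something like the sketch below. The table layout, the `MemoryStore` class, and the 30-day default are all assumptions made for illustration. The `search` method uses a plain substring match as a stand-in; it is not semantic search, which would need embeddings or a similar technique.

```python
import sqlite3
import time

SECONDS_PER_DAY = 86400


class MemoryStore:
    """Hypothetical long-term memory backed by SQLite."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS memories ("
            "id INTEGER PRIMARY KEY, "
            "content TEXT NOT NULL, "
            "created_at REAL NOT NULL, "
            "keep_days REAL)"  # NULL means: use the default retention policy
        )

    def remember(self, content, keep_days=None, now=None):
        """Store one memory, optionally with a user-defined retention period."""
        created = time.time() if now is None else now
        with self.db:  # commits on success
            self.db.execute(
                "INSERT INTO memories (content, created_at, keep_days) VALUES (?, ?, ?)",
                (content, created, keep_days),
            )

    def search(self, term):
        """Substring search, newest first (placeholder for semantic search)."""
        rows = self.db.execute(
            "SELECT content FROM memories WHERE content LIKE ? ORDER BY created_at DESC",
            ("%" + term + "%",),
        )
        return [row[0] for row in rows]

    def discard_expired(self, default_days=30, now=None):
        """Delete memories past their retention period; return how many were removed."""
        current = time.time() if now is None else now
        with self.db:
            cur = self.db.execute(
                "DELETE FROM memories WHERE created_at + COALESCE(keep_days, ?) * ? < ?",
                (default_days, SECONDS_PER_DAY, current),
            )
        return cur.rowcount
```

The `now` parameter exists only to make the retention logic easy to test with fixed timestamps; normal callers would leave it unset.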
- Core Autumn: Ollama setup, personality engine, memory system
- Audio Interface: Whisper + pyttsx3 integration, voice activation
- Smart Features: Calendar integration, web search, task management
- GUI: PyQt draggable widget, always-on-top interface
- ✅ FastAPI structure works well for APIs
- ✅ Gemini integration successful
- ✅ Public tunneling (localtunnel) works for sharing
- ❌ Simple API wrapper doesn't meet full vision
- ❌ Need proper personality and memory systems
- Consistency > Features: Don't compromise personality coherence
- Local-first: Minimize cloud dependencies
- Resource-efficient: Optimize for user's hardware constraints
- Privacy-focused: Voice and personal data stay local
- Hybrid voice systems break immersion: Stick to one TTS service
- Mode switching is crucial: Serious vs Free personality modes
- Memory is key: What makes Autumn truly intelligent
- Local models are viable: Modern small models are surprisingly capable
- Create proper project structure for Autumn_AI_Assistant
- Install and configure Ollama + Phi-3 Mini
- Implement personality engine with mode switching
- Build memory system with SQLite
- Integrate voice pipeline (Whisper + pyttsx3)
- Develop PyQt GUI widget
- Add calendar and web integration
- Test full assistant experience

    # From AUtumn_v2 - successful Gemini integration
    import os

    GEMINI_API_KEY = os.getenv("GEMINI_API_KEY")
    GEMINI_URL = "https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:generateContent"

    # Successful pattern from AUtumn_v2
    from fastapi import FastAPI
    from fastapi.middleware.cors import CORSMiddleware
    from pydantic import BaseModel

    class ChatRequest(BaseModel):
        message: str  # assumed shape; the real model is not shown in these notes

    app = FastAPI(title="Autumn AI Assistant")
    app.add_middleware(CORSMiddleware, allow_origins=["*"])

    @app.post("/chat")
    async def chat_endpoint(request: ChatRequest):
        # Pattern for AI interaction
        ...  # endpoint body is not included in these notes

    def detect_mode(text, context=None):
        lowered = text.lower()  # lowercase once, reuse for both checks
        urgent_keywords = ["urgent", "asap", "immediately", "critical", "deadline"]
        business_keywords = ["meeting", "project", "finance", "budget", "report"]
        if any(word in lowered for word in urgent_keywords):
            return "serious"
        elif any(word in lowered for word in business_keywords):
            return "serious"
        else:
            return "free"

- Ollama: Local LLM runtime
- Phi-3 Mini: Microsoft's 3.8B parameter model
- Whisper: OpenAI's STT model
- pyttsx3: Python TTS library
- PyQt: GUI framework for desktop widget
- SQLite: Local database for memory storage
- ❌ Rejected: Hume AI (consistency and cost concerns)
- ❌ Rejected: Hybrid TTS systems (consistency issues)
- ❌ Rejected: Cloud-only solutions (privacy and dependency concerns)
- ✅ Chosen: Local-first architecture with cloud fallback
- ✅ Chosen: Pure personality consistency over premium features
- ✅ Chosen: New dedicated project structure
Date: June 19, 2025
Status: Ready to begin Autumn_AI_Assistant implementation
Next Action: Create new project directory and begin Phase 1 development