Date: 2025-02-08
Status: ✅ COMPLETE
- ✅ Added `_fetch_google_docs_content()` method to Agent 0.0
- ✅ Uses `GoogleDriveClient` to fetch actual document content
- ✅ Extracts file IDs from document metadata
- ✅ Fetches content for top 10 most recent documents
- ✅ Truncates content to 5000 chars to avoid token limits
- ✅ Google Docs content included in Gemini analysis when available
- ✅ Content analysis provides deeper insights than metadata alone:
- Actual topics and concepts user works on
- Technical depth and complexity
- Learning patterns from document content
- Project identification from content
- ✅ Optional `google_access_token` parameter
- ✅ Placeholder for automatic token retrieval from database
- ✅ Graceful fallback to metadata-only if token not available
- Agent 0.0 Called → `POST /api/agents/persona-architect`
- Fetch Docs Metadata → From database
- Get Access Token → From request or database (future)
- Fetch Docs Content → Using Google Drive API
- Build Analysis Prompt → Includes content if available
- Gemini Analysis → Uses content for deeper insights
- Build Persona Card → Enhanced with content insights
For each document (top 10):
↓
Extract Google Doc file ID from metadata
↓
Call Google Drive API export_media()
↓
Get document content as text
↓
Truncate to 5000 chars (if needed)
↓
Include in analysis prompt
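
The per-document flow above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual `_fetch_google_docs_content()` implementation: the `export_text` callable, the `google_file_id` and `modified_time` metadata keys, and the constant names are all hypothetical.

```python
from typing import Callable

MAX_DOCS = 10      # only the 10 most recent documents are fetched
MAX_CHARS = 5000   # per-document truncation limit to stay within token limits


def fetch_docs_content(
    docs_metadata: list[dict],
    export_text: Callable[[str], str],
) -> list[dict]:
    """Fetch and truncate text for the most recent documents.

    `export_text` is a hypothetical callable mapping a file ID to plain text.
    In a real client it might wrap a Drive export call such as
    service.files().export_media(fileId=..., mimeType="text/plain").execute()
    (an assumption about how GoogleDriveClient works internally).
    """
    # Assumed metadata key: ISO-formatted "modified_time", newest first.
    recent = sorted(
        docs_metadata, key=lambda d: d.get("modified_time", ""), reverse=True
    )[:MAX_DOCS]

    results = []
    for doc in recent:
        file_id = doc.get("google_file_id")  # assumed metadata key
        if not file_id:
            continue  # no file ID in metadata, nothing to fetch
        try:
            text = export_text(file_id)
        except Exception:
            continue  # skip this document; its metadata can still be used
        results.append(
            {
                "file_id": file_id,
                "title": doc.get("title", ""),
                "content": text[:MAX_CHARS],
                "content_length": len(text),
                "truncated": len(text) > MAX_CHARS,
            }
        )
    return results
```

Injecting the export function keeps the loop testable without network access. A failed export skips only that document rather than aborting the whole analysis.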
POST /api/agents/persona-architect
{
"include_docs": true,
"google_access_token": "ya29.a0AfH6SMB..." # Optional
}

When content is fetched, the analysis prompt includes:
- Actual document text (up to 5000 chars per doc)
- Content length information
- File IDs for reference
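
As a rough illustration of how those three pieces could be assembled into a prompt section, consider the sketch below. The function name, the dictionary keys, and the exact wording of the section are assumptions; the real `_build_analysis_prompt()` may format things differently.

```python
def build_docs_prompt_section(docs: list[dict]) -> str:
    """Render fetched documents as a text block for the analysis prompt.

    Each entry is assumed to carry "file_id", "content_length" (length
    before truncation), and "content" (already truncated text).
    """
    if not docs:
        return ""  # metadata-only mode: no content section is added

    lines = ["Google Docs content:"]
    for doc in docs:
        lines.append(
            f"--- Document (file ID: {doc['file_id']}, "
            f"{doc['content_length']} chars) ---"
        )
        lines.append(doc["content"])
    return "\n".join(lines)
```

Returning an empty string when nothing was fetched lets the caller append the section unconditionally.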
If no token provided:
- ✅ Still works with metadata only
- ✅ Uses document titles and metadata
- ⚠️ Less accurate persona analysis
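
The fallback decision can be sketched as below. This is a hedged illustration: `get_stored_token` is a hypothetical stand-in for the future database lookup, and the mode names `"content"` and `"metadata_only"` are invented for the example.

```python
import asyncio
from typing import Optional


async def get_stored_token(user_id: str) -> Optional[str]:
    """Hypothetical stand-in for the database lookup; nothing is stored yet."""
    return None


async def resolve_analysis_mode(
    user_id: str, request_token: Optional[str] = None
) -> tuple[str, Optional[str]]:
    """Prefer the token from the request, then the stored one, else fall back."""
    token = request_token or await get_stored_token(user_id)
    if token:
        return "content", token
    # No token anywhere: continue with titles and metadata only.
    return "metadata_only", None


if __name__ == "__main__":
    print(asyncio.run(resolve_analysis_mode("user-1", "ya29.example")))
    print(asyncio.run(resolve_analysis_mode("user-1")))
```

The request never fails for lack of a token; it only loses the content-based part of the analysis.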
With Content:
- Analyze technical depth in documents
- Identify specific technologies/concepts used
- Determine actual skill level from code/examples
Without Content:
- Only titles and metadata
- Less accurate expertise assessment
With Content:
- Identify actual project topics from content
- See project structure and goals
- Understand project complexity
Without Content:
- Only document titles
- Less project detail
With Content:
- Extract actual topics from document text
- Identify recurring themes
- Understand learning focus areas
Without Content:
- Infer from titles only
- Less topic accuracy
With Content:
- Identify learning areas from tutorial content
- See questions/problems being solved
- Understand confusion patterns
Without Content:
- Infer from document types
- Less gap identification
- ✅ Only documents user has access to
- ✅ Content truncated to 5000 chars per doc
- ✅ Top 10 documents only
- ✅ Requires explicit access token
- ❌ Private documents without permission
- ❌ Full document content (truncated)
- ❌ Documents outside user's access
- ✅ `backend/agents/persona_architect.py`
  - Added `_fetch_google_docs_content()` method
  - Added `_get_user_google_token()` placeholder
  - Updated `process()` to fetch content
  - Enhanced `_build_analysis_prompt()` with content
- ✅ `backend/routes/agents.py`
  - Added `google_access_token` parameter
- Token must be passed explicitly in request
- Placeholder for database token storage
# TODO: When TOKENS table is created
async def _get_user_google_token(self, user_id: str):
    # Query TOKENS table for user's Google access token
    # Refresh token if expired
    # Return valid access token
    return None  # Placeholder until the TOKENS table exists; callers fall back to metadata-only

# 1. Get Google OAuth access token (from frontend)
# 2. Call Agent 0.0
curl -X POST http://localhost:8000/api/agents/persona-architect \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"google_access_token": "ya29.a0AfH6SMB...",
"include_docs": true
}'
# 3. Check response - should include content analysis

# Should still work with metadata only
curl -X POST http://localhost:8000/api/agents/persona-architect \
-H "Authorization: Bearer YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"include_docs": true
}'

✅ Agent 0.0 - Can fetch Google Docs content
✅ Google Drive Client - Ready to use
✅ Content Analysis - Included in persona building
- Create TOKENS Table - Store access tokens securely
- Auto Token Retrieval - Get token from database automatically
- Token Refresh - Handle expired tokens
- Content Caching - Cache document content to reduce API calls
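
For the content caching item, one possible shape is a small in-memory cache with a time-to-live, sketched below. Nothing like this exists in the codebase yet; the class name, the default TTL, and the injectable clock are all assumptions made for illustration.

```python
import time
from typing import Callable, Optional


class DocContentCache:
    """In-memory cache of document text keyed by file ID, with expiry."""

    def __init__(
        self,
        ttl_seconds: float = 900.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: dict[str, tuple[float, str]] = {}

    def get(self, file_id: str) -> Optional[str]:
        entry = self._entries.get(file_id)
        if entry is None:
            return None
        stored_at, content = entry
        if self._clock() - stored_at > self._ttl:
            del self._entries[file_id]  # expired: drop it and report a miss
            return None
        return content

    def put(self, file_id: str, content: str) -> None:
        self._entries[file_id] = (self._clock(), content)
```

A cache hit avoids one Drive API call per document. A persistent or shared cache would need separate thought about storing user document text securely.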
Status: ✅ READY - Agent 0.0 can analyze Google Docs content when access token is provided