Feature: Notes-Only Backup
Branch: 002-notes-only-backup
Date: 2025-12-28
This document consolidates technical research and design decisions for simplifying the Fellow.app backup tool to exclusively handle notes via the POST /api/v1/notes endpoint. All technical context is well-defined from the existing codebase, so this research focuses on design decisions for the simplification.
Decision: Use POST /api/v1/notes with pagination support for retrieving all notes.
Rationale:
- The spec explicitly mandates using only the POST /api/v1/notes endpoint (FR-002)
- Existing codebase in `src/services/fellow_api.py` already implements POST request handling with httpx
- Pagination is required per FR-003 to handle large note collections
- API response includes note id, content, author information (author_name, author_id), and timestamps (fellow_created_at, fellow_updated_at)
Implementation Details:
- Endpoint accepts pagination parameters (likely `page` and `per_page`, or `offset`/`limit`)
- Response includes notes array and pagination metadata
- Rate limiting must be respected (existing exponential backoff in fellow_api.py)
- Authentication via API token in headers (existing pattern in codebase)
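The pagination loop can be sketched as follows. This is illustrative only: the parameter names (`page`, `per_page`) and the `notes` response key are assumptions pending confirmation against the API, and `fetch_page` stands in for the existing httpx-based POST request in `fellow_api.py` (which already handles auth headers and backoff).

```python
from typing import Any, Callable, Dict, Iterator, List

def iter_notes(
    fetch_page: Callable[[Dict[str, Any]], Dict[str, Any]],
    per_page: int = 100,
) -> Iterator[Dict[str, Any]]:
    """Yield raw note dicts across all pages of POST /api/v1/notes.

    fetch_page performs one request with the given pagination params and
    returns the decoded JSON body (assumed to contain a "notes" list).
    """
    page = 1
    while True:
        body = fetch_page({"page": page, "per_page": per_page})
        notes: List[Dict[str, Any]] = body.get("notes", [])
        yield from notes
        if len(notes) < per_page:  # a short page signals the last page
            return
        page += 1
```

Because `fetch_page` is injected, this loop can be unit-tested with canned page dicts and needs no live API.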
Alternatives Considered:
- GET /api/v1/notes: Not mentioned in spec; POST is explicitly required
- Streaming API: Fellow.app doesn't provide streaming endpoints
- GraphQL: Fellow.app uses REST API
Decision: Create standalone notes table without foreign key dependencies; store author_name directly.
Rationale:
- Removes complexity of managing meetings, workspaces, and participants tables
- Author information (author_name, author_id) is embedded in note response - no separate lookup needed
- Eliminates cascading delete concerns and join queries
- Maintains data integrity by storing complete note records atomically
- Simpler backup/restore operations with single-table design
Schema Design:
CREATE TABLE notes (
    id VARCHAR(255) PRIMARY KEY,     -- Fellow.app note ID
    content TEXT NOT NULL,           -- Note text content
    author_name VARCHAR(500),        -- Author's display name
    author_id VARCHAR(255),          -- Fellow.app author/user ID
    fellow_created_at DATETIME,      -- Creation timestamp from Fellow.app
    fellow_updated_at DATETIME,      -- Last update timestamp from Fellow.app
    created_at DATETIME NOT NULL,    -- Local DB creation timestamp
    updated_at DATETIME NOT NULL,    -- Local DB update timestamp
    INDEX idx_fellow_updated (fellow_updated_at),
    INDEX idx_author_id (author_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

Alternatives Considered:
- Keep foreign key to participants table: Rejected - adds unnecessary complexity for notes-only feature
- Store author data in JSON column: Rejected - less queryable and doesn't simplify schema significantly
- Use MEDIUMTEXT or LONGTEXT for content: Rejected - TEXT (65KB) is sufficient for typical notes; can revisit if needed
Decision: Use fellow_updated_at timestamp comparison for incremental updates.
Rationale:
- Fellow.app provides fellow_updated_at timestamp for each note (FR-009)
- Query database for MAX(fellow_updated_at) to find last backup point
- Filter API requests to only retrieve notes modified since that timestamp
- Reduces API load and processing time for subsequent backups (target: <30% of full backup time per SC-003)
- Upsert pattern (INSERT ... ON DUPLICATE KEY UPDATE) handles both new and updated notes
Implementation:
-- Get last backup timestamp
SELECT MAX(fellow_updated_at) FROM notes;
-- Update logic in database service
INSERT INTO notes (id, content, author_name, author_id, fellow_created_at, fellow_updated_at, created_at, updated_at)
VALUES (?, ?, ?, ?, ?, ?, NOW(), NOW())
ON DUPLICATE KEY UPDATE
    content = VALUES(content),
    author_name = VALUES(author_name),
    fellow_updated_at = VALUES(fellow_updated_at),
    updated_at = NOW();

Alternatives Considered:
- Track deleted notes: Rejected - Fellow.app API doesn't provide deletion events; notes-only backup is for historical preservation
- Use batch timestamps in backup_metadata: Considered but MAX(fellow_updated_at) query is simpler and equally effective
- Version history: Rejected - out of scope; only latest version needed
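A minimal sketch of how the database service could wrap the two queries above. The function names and the `%s` placeholder style (typical of PyMySQL/mysql-connector) are assumptions, not the existing service's actual API; injecting a DB-API cursor keeps this testable without a live MySQL instance.

```python
from datetime import datetime
from typing import Optional

# Same statement as in the SQL above, with DB-API placeholders.
UPSERT_SQL = (
    "INSERT INTO notes (id, content, author_name, author_id, "
    "fellow_created_at, fellow_updated_at, created_at, updated_at) "
    "VALUES (%s, %s, %s, %s, %s, %s, NOW(), NOW()) "
    "ON DUPLICATE KEY UPDATE "
    "content = VALUES(content), author_name = VALUES(author_name), "
    "fellow_updated_at = VALUES(fellow_updated_at), updated_at = NOW()"
)

def last_backup_point(cursor) -> Optional[datetime]:
    """Return MAX(fellow_updated_at), or None for a first (full) backup."""
    cursor.execute("SELECT MAX(fellow_updated_at) FROM notes")
    row = cursor.fetchone()
    return row[0] if row else None

def upsert_note(cursor, note: dict) -> None:
    """Insert a new note or refresh an existing one by primary key."""
    cursor.execute(UPSERT_SQL, (
        note["id"], note["content"],
        note.get("author_name"), note.get("author_id"),
        note.get("fellow_created_at"), note.get("fellow_updated_at"),
    ))
```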
Decision: Remove all entity models except Note; simplify Note model to remove meeting_id.
Rationale:
- Spec explicitly requires removing meetings, workspaces, action items, and streams
- Note entity becomes standalone - no relationships to manage
- Simplified dataclass with 8 fields matching database schema
- Type safety maintained with Python dataclasses and Optional typing
Model Definition:
@dataclass
class Note:
    """Standalone note model for Fellow.app backup."""

    id: str
    content: str
    author_name: Optional[str] = None
    author_id: Optional[str] = None
    fellow_created_at: Optional[datetime] = None
    fellow_updated_at: Optional[datetime] = None
    created_at: Optional[datetime] = None
    updated_at: Optional[datetime] = None

Alternatives Considered:
- Keep Meeting model with nullable fields: Rejected - violates spec requirement to remove meeting references
- Use dict instead of dataclass: Rejected - loses type safety and IDE support
- Pydantic models: Rejected - unnecessary dependency; dataclasses sufficient for internal models
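Construction from an API response item might look like the sketch below. The ISO-8601 timestamp format and the `note_from_api` helper name are assumptions; the field names match the response fields listed earlier.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict, Optional

@dataclass
class Note:
    """Standalone note model for Fellow.app backup."""
    id: str
    content: str
    author_name: Optional[str] = None
    author_id: Optional[str] = None
    fellow_created_at: Optional[datetime] = None
    fellow_updated_at: Optional[datetime] = None
    created_at: Optional[datetime] = None
    updated_at: Optional[datetime] = None

def _parse_ts(value: Optional[str]) -> Optional[datetime]:
    # Assumes ISO-8601 timestamps (e.g. "2025-12-01T10:00:00Z");
    # adjust if the API uses a different format.
    return datetime.fromisoformat(value.replace("Z", "+00:00")) if value else None

def note_from_api(payload: Dict[str, Any]) -> Note:
    """Build a Note from one item of the POST /api/v1/notes response."""
    return Note(
        id=str(payload["id"]),
        content=payload.get("content", ""),
        author_name=payload.get("author_name"),
        author_id=payload.get("author_id"),
        fellow_created_at=_parse_ts(payload.get("fellow_created_at")),
        fellow_updated_at=_parse_ts(payload.get("fellow_updated_at")),
    )
```

Missing optional fields simply stay `None`, which maps directly onto the nullable columns in the schema.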
Decision: Simplify backup.py orchestration to single notes-only workflow; remove multi-entity coordination.
Rationale:
- Current backup.py orchestrates meetings → participants → notes → action items sequence
- Simplified version: authenticate → fetch notes (with pagination) → store notes → report
- Remove workspace and meeting retrieval logic
- Single entity type eliminates ordering concerns and dependency management
Service Flow:
- Authenticate with Fellow.app API (existing)
- Determine incremental backup start point (MAX(fellow_updated_at) or None for full backup)
- Fetch notes from POST /api/v1/notes with pagination
- For each note: parse response → create Note model → upsert to database
- Handle failures gracefully (log and continue)
- Generate summary report (counts, errors)
Alternatives Considered:
- Keep existing orchestration structure: Rejected - unnecessary abstraction for single entity
- Async batch processing: Considered but synchronous pagination simpler; can optimize later if needed
- Parallel API requests: Rejected initially - respect rate limits; optimize if performance inadequate
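The six-step flow above collapses to a few lines once the collaborators are injected. This is a sketch, not the actual `backup.py` code; all names are illustrative.

```python
from typing import Any, Callable, Dict, Iterable, Optional

def run_backup(
    last_point: Callable[[], Optional[Any]],
    fetch_notes: Callable[[Optional[Any]], Iterable[Dict[str, Any]]],
    store_note: Callable[[Dict[str, Any]], None],
    log_error: Callable[[str, Exception], None],
) -> Dict[str, Any]:
    """Simplified notes-only flow: find start point, fetch, upsert, report."""
    since = last_point()                 # MAX(fellow_updated_at) or None
    stored = failed = 0
    for raw in fetch_notes(since):       # paginated POST /api/v1/notes
        try:
            store_note(raw)              # upsert into notes table
            stored += 1
        except Exception as exc:         # per-note isolation (FR-016)
            log_error(str(raw.get("id", "?")), exc)
            failed += 1
    return {"stored": stored, "failed": failed, "since": since}
```

One failing note increments the error count and is logged, but never aborts the run, matching the graceful-failure requirement.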
Decision: Retain single backup command; remove meeting/workspace-specific options.
Rationale:
- Current CLI has commands for different entity types and workspaces
- Simplified CLI: `fellow-backup backup` (existing command) with options for --full, --incremental, --dry-run
- Remove workspace selection since notes are fetched globally
- Maintain existing logging and progress indicator patterns
Command Structure:
fellow-backup backup [OPTIONS]
Options:
--full Force full backup (ignore incremental timestamps)
--dry-run Show what would be backed up without writing to database
--verbose Enable detailed logging
--quiet Suppress progress output (errors only)
--json Output summary report in JSON format
--help       Show this message and exit

Alternatives Considered:
- Separate `backup-notes` command: Rejected - since notes are the only entity, generic `backup` is clearer
- Remove --full flag: Rejected - useful for testing and re-syncing
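The documented option surface translates to roughly the following. The real CLI may be built on Click or another framework; this argparse sketch only mirrors the flags listed above.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Illustrative parser matching the documented fellow-backup options."""
    parser = argparse.ArgumentParser(prog="fellow-backup")
    sub = parser.add_subparsers(dest="command", required=True)
    backup = sub.add_parser("backup", help="Back up all notes")
    backup.add_argument("--full", action="store_true",
                        help="Force full backup (ignore incremental timestamps)")
    backup.add_argument("--dry-run", action="store_true",
                        help="Show what would be backed up without writing to database")
    backup.add_argument("--verbose", action="store_true",
                        help="Enable detailed logging")
    backup.add_argument("--quiet", action="store_true",
                        help="Suppress progress output (errors only)")
    backup.add_argument("--json", action="store_true",
                        help="Output summary report in JSON format")
    return parser
```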
Decision: Continue using existing httpx retry logic with exponential backoff; add per-note error isolation.
Rationale:
- Fellow.app API rate limiting requires retry logic (FR-014)
- Existing fellow_api.py implements retry with exponential backoff
- Individual note failures must not stop entire backup (FR-016)
- Try/except around each note processing; log error with note ID and continue
Error Categories:
- API errors (4xx, 5xx): Retry with backoff; log and skip after max retries
- Rate limiting (429): Exponential backoff with jitter; automatic retry
- Database errors: Rollback transaction; log error; retry note or continue
- Parsing errors: Log note ID and raw response; continue to next note
Alternatives Considered:
- Stop on first error: Rejected - violates FR-016 and resilience principle
- Dead letter queue: Considered but overkill for one-time backup; error log sufficient
- Checkpointing every N notes: Considered but database transactions provide adequate safety
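The retry policy named above (exponential backoff with full jitter) can be sketched as below. The real implementation already lives in `fellow_api.py`; this version makes `sleep` injectable so the policy can be tested without actually waiting.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")

def with_backoff(
    call: Callable[[], T],
    retryable: Callable[[Exception], bool],
    max_retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `call` on retryable errors (e.g. 429/5xx), with full jitter."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except Exception as exc:
            if attempt == max_retries or not retryable(exc):
                raise  # out of retries, or a non-retryable error (e.g. 4xx)
            # exponential backoff with full jitter: uniform over [0, base * 2^n]
            sleep(random.uniform(0, base_delay * (2 ** attempt)))
    raise RuntimeError("unreachable")
```

Non-retryable errors (most 4xx responses, parse failures) propagate immediately so the per-note isolation layer can log them and move on.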
Decision: Contract tests for POST /api/v1/notes endpoint; integration tests for simplified schema; unit tests for note parsing.
Rationale:
- Contract tests verify API endpoint behavior and response structure
- Integration tests validate database upsert logic and incremental backup queries
- Unit tests for Note model creation from API response
- Simplified feature reduces test matrix (no cross-entity validation needed)
Test Coverage:
- Contract: POST /api/v1/notes with pagination, rate limiting, error responses
- Integration: notes table CRUD, incremental backup timestamp logic, UTF-8 content handling
- Unit: Note model instantiation, timestamp parsing, upsert SQL generation
- E2E: Full backup → verify counts → incremental backup → verify only new/updated notes
Alternatives Considered:
- Mock all API calls: Rejected for contract tests - need real endpoint verification
- Test against production API: Rejected - use test/sandbox environment or recorded responses
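The E2E expectation (incremental backup fetches only new/updated notes) can be pinned down with a pytest-style check against an in-memory store, so no MySQL is needed to exercise the timestamp logic. Names are illustrative; note the strictly-greater-than comparison, which the real implementation should confirm against how the API filters.

```python
from datetime import datetime, timezone

def incremental_window(store: dict, all_notes: list) -> list:
    """Notes an incremental run should fetch: those with fellow_updated_at
    strictly after the stored maximum (everything, for an empty store)."""
    if not store:
        return list(all_notes)
    last = max(n["fellow_updated_at"] for n in store.values())
    return [n for n in all_notes if n["fellow_updated_at"] > last]

def test_incremental_fetches_only_new_or_updated():
    t1 = datetime(2025, 1, 1, tzinfo=timezone.utc)
    t2 = datetime(2025, 6, 1, tzinfo=timezone.utc)
    store = {}
    # Full backup: everything is new.
    first = [{"id": "a", "fellow_updated_at": t1}]
    for n in incremental_window(store, first):
        store[n["id"]] = n
    # Incremental backup: only the later note should be selected.
    second = first + [{"id": "b", "fellow_updated_at": t2}]
    delta = incremental_window(store, second)
    assert [n["id"] for n in delta] == ["b"]
```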
All technical decisions are straightforward simplifications of the existing codebase. No new technologies or patterns required. The main work is removing code for unused entities while preserving the core backup orchestration, API client, and database service patterns. The notes-only design eliminates foreign key complexity and multi-entity coordination, resulting in a simpler, more focused tool.
None. All technical context is well-defined. Implementation can proceed to Phase 1 (design artifacts).