
feat(db): implement database configuration, migrations, and CLI commands #2

Merged
raychrisgdp merged 19 commits into main from c/pr-001-db-config
Dec 31, 2025


@raychrisgdp raychrisgdp commented Dec 31, 2025

Summary

Issues & Goals

  • Database foundation: TaskGenie needed a reliable local-first database foundation with SQLite initialization, schema migrations, and backup/restore capabilities
  • Configuration management: Standardize configuration loading with clear precedence (env vars → .env → ~/.taskgenie/config.toml → defaults) for consistent app behavior across environments
  • Developer experience: Enable safe schema evolution through Alembic migrations and provide CLI commands for database management (tgenie db upgrade, downgrade, dump, restore)
  • Data persistence: Establish canonical data paths and ensure automatic database creation on first run

Implementation Highlights

  • Configuration system: Implemented Pydantic Settings with multi-source precedence (backend/config.py): env vars → .env file → ~/.taskgenie/config.toml → built-in defaults, with TOML config file support and automatic app directory creation
  • Database initialization: Complete SQLAlchemy async engine setup (backend/database.py) with automatic migration execution on startup, foreign key enforcement for SQLite, and proper session management with dependency injection support
  • Alembic migrations: Initialized Alembic configuration (backend/migrations/) with SQLite-specific setup, initial schema migration (001_initial_schema.py) creating tables for tasks, attachments, notifications, chat_history, config, and alembic_version
  • CLI database commands: Implemented tgenie db subcommands (backend/cli/db.py): upgrade, downgrade, revision (with autogenerate), dump, and restore with confirmation prompts for destructive operations
  • Data models: Created SQLAlchemy models (backend/models/) for Task, Attachment, Notification, ChatHistory, and Config with proper relationships, JSON fields, and timestamp management
  • FastAPI integration: Updated backend/main.py with lifespan management to initialize database on startup and close connections on shutdown
  • Comprehensive testing: Added 1622+ lines of tests (tests/) covering configuration, database initialization, migrations, CLI commands, and model validation
  • Documentation: Added migration guide (docs/02-implementation/MIGRATIONS.md), agent guide (docs/02-implementation/AGENT_GUIDE.md), and updated PR specifications

Key Files Modified:

  • backend/config.py: Complete rewrite with Pydantic Settings, TOML config support, and multi-source precedence (250 lines added)
  • backend/database.py: Database initialization with async SQLAlchemy, automatic migrations, and session management (175 lines added)
  • backend/cli/db.py: New CLI module for database management commands (174 lines)
  • backend/migrations/: Alembic configuration and initial schema migration (new directory)
  • backend/models/: SQLAlchemy models for Task, Attachment, Notification, ChatHistory, Config (new files)
  • backend/main.py: Added lifespan management for database initialization (30 lines modified)
  • tests/: Comprehensive test suite covering config, database, migrations, CLI, and models (1622+ lines added)
  • docs/02-implementation/MIGRATIONS.md: Migration guide with examples (219 lines)
  • docs/02-implementation/AGENT_GUIDE.md: Agent workflow guide (811 lines)
  • pyproject.toml: Added dependencies (alembic, pydantic-settings, tomllib) and test configuration (71 lines modified)

How to Test

Database Configuration Testing

Configuration Precedence Verification:

  1. Test environment variable override:

    export TASKGENIE_DATA_DIR=/tmp/test-taskgenie
    export DATABASE_URL=sqlite+aiosqlite:////tmp/test-taskgenie/data/taskgenie.db
    uv run python -c "from backend.config import get_settings; s = get_settings(); print(s.database_url_resolved)"
    # Expected: Uses DATABASE_URL from env var
  2. Test .env file:

    # Create .env file with DATABASE_URL
    echo "DATABASE_URL=sqlite+aiosqlite:////tmp/test-env.db" > .env
    uv run python -c "from backend.config import get_settings; s = get_settings(); print(s.database_url_resolved)"
    # Expected: Uses DATABASE_URL from .env
  3. Test config.toml file:

    # Create ~/.taskgenie/config.toml
    mkdir -p ~/.taskgenie
    cat > ~/.taskgenie/config.toml << EOF
    [database]
    url = "sqlite+aiosqlite:////tmp/test-config.db"
    EOF
    unset DATABASE_URL
    rm -f .env
    uv run python -c "from backend.config import get_settings; s = get_settings(); print(s.database_url_resolved)"
    # Expected: Uses database.url from config.toml
  4. Test default behavior:

    unset DATABASE_URL
    rm -f .env
    rm -f ~/.taskgenie/config.toml
    uv run python -c "from backend.config import get_settings; s = get_settings(); print(s.database_url_resolved)"
    # Expected: Uses default path ~/.taskgenie/data/taskgenie.db
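The four precedence checks above all exercise the same rule: the first source that defines a key wins. A minimal sketch of that rule, assuming plain dictionaries stand in for each source (the PR itself implements this with Pydantic Settings sources; `resolve_setting` is a hypothetical helper, not a function in `backend/config.py`):

```python
from collections.abc import Mapping


def resolve_setting(
    key: str,
    env: Mapping[str, str],
    dotenv: Mapping[str, str],
    toml_config: Mapping[str, str],
    defaults: Mapping[str, str],
) -> str:
    """Return the value from the highest-priority source that defines `key`.

    Hypothetical helper for illustration only. Order mirrors the PR text:
    env vars, then .env, then config.toml, then built-in defaults.
    """
    for source in (env, dotenv, toml_config, defaults):
        if key in source:
            return source[key]
    raise KeyError(key)


# Values borrowed from the manual test steps; the default is a placeholder.
env = {"database_url": "sqlite+aiosqlite:////tmp/test-taskgenie/data/taskgenie.db"}
dotenv = {"database_url": "sqlite+aiosqlite:////tmp/test-env.db"}
toml_config = {"database_url": "sqlite+aiosqlite:////tmp/test-config.db"}
defaults = {"database_url": "<built-in default>"}

print(resolve_setting("database_url", env, dotenv, toml_config, defaults))
print(resolve_setting("database_url", {}, {}, toml_config, defaults))
```

Removing a source (as steps 3 and 4 do with `unset` and `rm -f`) is equivalent to passing an empty mapping for it.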

Database Initialization Testing

Fresh Install Test (AC1):

  1. Clean start:

    # Remove existing database
    rm -f ~/.taskgenie/data/taskgenie.db
    
    # Start FastAPI app (should auto-create DB and run migrations)
    uv run uvicorn backend.main:app --reload
  2. Verify database creation:

    # In another terminal, check database exists
    ls -la ~/.taskgenie/data/taskgenie.db
    # Expected: File exists
    
    # Verify tables exist
    sqlite3 ~/.taskgenie/data/taskgenie.db ".tables"
    # Expected: tasks, attachments, notifications, chat_history, config, alembic_version
    
    # Verify alembic_version table
    sqlite3 ~/.taskgenie/data/taskgenie.db "SELECT * FROM alembic_version;"
    # Expected: Shows revision "001_initial"
  3. Test health endpoint:

    curl http://localhost:8000/health
    # Expected: {"status":"ok","version":"..."}

CLI Database Commands Testing

Migration Commands:

  1. Upgrade database:

    uv run tgenie db upgrade
    # Expected: "✓ Database upgraded to head"
  2. Check current revision:

    sqlite3 ~/.taskgenie/data/taskgenie.db "SELECT * FROM alembic_version;"
    # Expected: Shows "001_initial"
  3. Create new migration (if schema changes):

    # Modify a model (e.g., add a column)
    # Then create migration:
    uv run tgenie db revision -m "add_test_column" --autogenerate
    # Expected: Creates new migration file in backend/migrations/versions/
  4. Apply new migration:

    uv run tgenie db upgrade head
    # Expected: "✓ Database upgraded to head"
  5. Test downgrade:

    uv run tgenie db downgrade -1
    # Expected: "✓ Database downgraded to -1"

Backup/Restore Commands:

  1. Create test data:

    # Start API and create a task via API or directly in DB
    sqlite3 ~/.taskgenie/data/taskgenie.db "INSERT INTO tasks (id, title, status) VALUES ('test-123', 'Test Task', 'pending');"
  2. Dump database:

    uv run tgenie db dump --out backup.sql
    # Expected: Creates backup.sql file
    # Verify: grep "Test Task" backup.sql
  3. Restore database:

    # Delete existing database
    rm ~/.taskgenie/data/taskgenie.db
    
    # Restore from backup
    uv run tgenie db restore --in backup.sql
    # Expected: Prompts for confirmation, then restores database
    
    # Verify data restored
    sqlite3 ~/.taskgenie/data/taskgenie.db "SELECT * FROM tasks WHERE id='test-123';"
    # Expected: Shows "Test Task"

Data Model Testing

Model Validation:

  1. Test Task model:

    uv run python -c "
    from backend.models.task import Task
    from backend.database import init_db, get_db
    import asyncio
    from sqlalchemy import select
    
    async def test():
        init_db()
        async for db in get_db():
            # Create task
            task = Task(id='test-1', title='Test', status='pending')
            db.add(task)
            await db.commit()
            
            # Query task
            result = await db.execute(select(Task).where(Task.id == 'test-1'))
            task = result.scalar_one()
            print(f'Task: {task.title}, Status: {task.status}')
    
    asyncio.run(test())
    "
    # Expected: Prints "Task: Test, Status: pending"
  2. Test relationships:

    # Test Task -> Attachment relationship
    uv run python -c "
    from backend.models.task import Task
    from backend.models.attachment import Attachment
    from backend.database import init_db, get_db
    import asyncio
    
    async def test():
        init_db()
        async for db in get_db():
            task = Task(id='task-1', title='Test Task', status='pending')
            attachment = Attachment(id='att-1', task_id='task-1', type='file', reference='test.txt')
            task.attachments.append(attachment)
            db.add(task)
            await db.commit()
            
            # Verify relationship
            print(f'Task has {len(task.attachments)} attachments')
    
    asyncio.run(test())
    "
    # Expected: Prints "Task has 1 attachments"

Integration Testing

End-to-End Workflow:

  1. Complete setup flow:

    # Fresh install
    rm -rf ~/.taskgenie/data
    
    # Start API (should auto-create DB)
    uv run uvicorn backend.main:app --reload &
    API_PID=$!
    
    # Wait for startup
    sleep 2
    
    # Test health endpoint
    curl http://localhost:8000/health
    # Expected: {"status":"ok","version":"..."}
    
    # Stop API
    kill $API_PID
  2. CLI-to-API integration:

    # Start API
    uv run uvicorn backend.main:app --reload &
    API_PID=$!
    
    # Use CLI to manage database
    uv run tgenie db upgrade
    
    # Verify via API health check
    curl http://localhost:8000/health
    
    # Stop API
    kill $API_PID

Automated Testing

Run test suite:

# Run all tests
make test
# or
uv run pytest -v

# Run specific test categories
uv run pytest tests/test_config.py -v
uv run pytest tests/test_database.py -v
uv run pytest tests/cli/test_db.py -v
uv run pytest tests/models/ -v

Expected Test Coverage:

  • Configuration precedence and validation (398 lines in tests/test_config.py)
  • Database initialization and session management (373 lines in tests/test_database.py)
  • CLI database commands (465 lines in tests/cli/test_db.py)
  • Model validation and relationships (252 lines in tests/models/)
  • Migration environment setup (204 lines in tests/test_migrations_env.py)

Expected Behavior

  • Configuration: Settings load correctly with proper precedence (env vars override .env, which overrides config.toml, which overrides defaults)
  • Database: Database file is created automatically on first run at ~/.taskgenie/data/taskgenie.db (or configured path)
  • Migrations: Initial schema migration runs automatically on startup if database doesn't exist or alembic_version table is missing
  • CLI Commands: All tgenie db commands execute successfully with proper error handling and user feedback
  • Models: SQLAlchemy models work correctly with relationships, JSON fields, and timestamp management
  • API: FastAPI app starts successfully, health endpoint returns 200 OK, database connections work correctly

Related Issues

  • Implements PR-001: Database & Configuration (see docs/02-implementation/pr-specs/PR-001-db-config.md)

Author Checklist

  • Synced with latest main branch
  • Self-reviewed
  • All tests pass locally
  • Documentation updated
  • Code follows project style guidelines
  • Type hints added where applicable
  • No breaking changes introduced

Additional Notes

Key Implementation Areas for Review

Configuration System (backend/config.py):

  • Pydantic Settings implementation: Multi-source precedence with env vars → .env → ~/.taskgenie/config.toml → defaults
  • TOML config support: Custom TaskGenieTomlSettingsSource for loading and flattening TOML configuration files
  • App directory management: ensure_app_dirs() method creates required directories (data/, cache/, logs/, vector_store/) with proper permissions
  • Database URL resolution: database_url_resolved property handles SQLite path resolution and async driver configuration
  • Review focus: Verify precedence order works correctly, TOML parsing handles edge cases, and directory creation is idempotent
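For reviewers checking the TOML flattening, here is a rough sketch of what "flattening" could mean. It assumes nested tables are joined with an underscore (so `[database] url` becomes `database_url`); that naming scheme and the `flatten_toml` helper are assumptions for illustration, not the actual behavior of `TaskGenieTomlSettingsSource`.

```python
from typing import Any


def flatten_toml(data: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Flatten nested TOML tables into a single-level dict.

    Hypothetical helper: keys of nested tables are joined with "_".
    """
    flat: dict[str, Any] = {}
    for key, value in data.items():
        name = f"{prefix}_{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into sub-tables such as [database]
            flat.update(flatten_toml(value, name))
        else:
            flat[name] = value
    return flat


# Shape that a TOML parser such as tomllib returns for the config.toml
# used in the manual test steps.
parsed = {"database": {"url": "sqlite+aiosqlite:////tmp/test-config.db"}}

print(flatten_toml(parsed))
```

Edge cases worth probing against the real source include empty tables, deeply nested tables, and top-level keys that collide with a flattened name.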

Database Initialization (backend/database.py):

  • Async SQLAlchemy setup: Async engine and sessionmaker with proper connection pooling
  • Automatic migrations: _run_migrations_if_needed() checks for database existence and alembic_version table, runs migrations automatically on startup
  • Foreign key enforcement: PRAGMA foreign_keys=ON executed on every session to ensure referential integrity
  • Session management: get_db() dependency injection generator with proper commit/rollback/close handling
  • Review focus: Verify migrations run correctly on fresh install, foreign keys are enforced, and session lifecycle is managed properly
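The startup check described for `_run_migrations_if_needed()` can be approximated with the standard library alone. This is a sketch under the assumption that "needs migration" means the file is missing or has no `alembic_version` table; `needs_migration` is a hypothetical name and the real function may check more.

```python
import sqlite3
import tempfile
from pathlib import Path


def needs_migration(db_path: Path) -> bool:
    """Return True when the DB file is missing or lacks an alembic_version table.

    Hypothetical stand-in for the check inside _run_migrations_if_needed().
    """
    if not db_path.exists():
        return True
    conn = sqlite3.connect(db_path)
    try:
        row = conn.execute(
            "SELECT 1 FROM sqlite_master "
            "WHERE type = 'table' AND name = 'alembic_version'"
        ).fetchone()
    finally:
        conn.close()
    return row is None


with tempfile.TemporaryDirectory() as tmp:
    db = Path(tmp) / "taskgenie.db"
    print(needs_migration(db))  # file does not exist yet

    conn = sqlite3.connect(db)
    conn.execute("CREATE TABLE alembic_version (version_num VARCHAR(32) NOT NULL)")
    conn.commit()
    conn.close()
    print(needs_migration(db))  # table is now present
```

When the check returns True, the application would go on to run the Alembic upgrade; that step is left out here because it depends on the project's Alembic configuration.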

Alembic Migrations (backend/migrations/):

  • Initial schema: 001_initial_schema.py creates tables for tasks, attachments, notifications, chat_history, config, and alembic_version
  • SQLite-specific configuration: env.py configured for async SQLAlchemy with SQLite, handles PRAGMA foreign_keys=ON
  • Migration environment: Proper Alembic configuration with script location and database URL resolution
  • Async URL conversion: CLI commands convert async URLs (sqlite+aiosqlite://) to sync URLs (sqlite://) to avoid asyncio conflicts during migrations
  • Review focus: Verify migration creates all required tables with correct schema, foreign key constraints work, and downgrade path is safe
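The async-to-sync URL conversion mentioned above amounts to swapping the driver prefix. A minimal sketch, assuming a simple prefix rewrite is enough (`to_sync_url` is a hypothetical name; the PR's own helper may handle more cases):

```python
ASYNC_PREFIX = "sqlite+aiosqlite://"
SYNC_PREFIX = "sqlite://"


def to_sync_url(url: str) -> str:
    """Rewrite an aiosqlite URL to the plain sqlite driver; leave others untouched.

    Hypothetical helper for illustration only.
    """
    if url.startswith(ASYNC_PREFIX):
        return SYNC_PREFIX + url[len(ASYNC_PREFIX):]
    return url


print(to_sync_url("sqlite+aiosqlite:////tmp/test-env.db"))
print(to_sync_url("sqlite:////tmp/test-env.db"))
```

Because only the scheme changes, the path portion (including the four slashes of an absolute path) passes through unmodified.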

CLI Database Commands (backend/cli/db.py):

  • Upgrade/downgrade: Wraps Alembic commands with proper error handling and user feedback
  • Revision creation: Supports --autogenerate flag for automatic migration generation from model changes
  • Backup/restore: SQLite .dump workflow with confirmation prompts for destructive operations
  • Alembic configuration: get_alembic_cfg() resolves Alembic config from project structure and settings, converts async URLs to sync URLs for migrations
  • Review focus: Verify commands execute correctly, error messages are clear, and confirmation prompts prevent accidental data loss
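To reason about the dump/restore round trip without the CLI, the same idea can be tried with Python's built-in `sqlite3` module. This sketch uses `Connection.iterdump()` as a stand-in for the `.dump` workflow; it is not how `backend/cli/db.py` is implemented, and the simplified `tasks` table is an assumption.

```python
import sqlite3


def dump_sql(conn: sqlite3.Connection) -> str:
    """Serialize the whole database as SQL text, similar to the sqlite3 shell's .dump."""
    return "\n".join(conn.iterdump())


def restore_sql(sql_text: str, conn: sqlite3.Connection) -> None:
    """Replay a SQL dump into an empty database."""
    conn.executescript(sql_text)


# Simplified schema for illustration; the real tasks table has more columns.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE tasks (id TEXT PRIMARY KEY, title TEXT, status TEXT)")
source.execute("INSERT INTO tasks VALUES ('test-123', 'Test Task', 'pending')")
source.commit()

backup = dump_sql(source)

target = sqlite3.connect(":memory:")
restore_sql(backup, target)
restored = target.execute("SELECT title FROM tasks WHERE id = 'test-123'").fetchone()
print(restored)
```

Restoring into a database that already contains the same tables fails on `CREATE TABLE`, which is one reason a restore command would want a confirmation prompt before replacing an existing file.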

Data Models (backend/models/):

  • Task model: Core task entity with status, priority, ETA, tags, and metadata fields
  • Attachment model: File and link attachments with foreign key relationship to tasks
  • Notification model: Task notifications with scheduling and delivery tracking
  • ChatHistory model: LLM chat conversation history storage
  • Config model: Application configuration storage
  • Review focus: Verify relationships work correctly, JSON fields serialize/deserialize properly, and timestamp management is accurate

FastAPI Integration (backend/main.py):

  • Lifespan management: lifespan() context manager initializes database on startup and closes connections on shutdown
  • Database initialization: Calls init_db() which automatically runs migrations if needed
  • Health endpoint: Simple health check that verifies app is running
  • Review focus: Verify database initializes correctly on startup, migrations run automatically, and shutdown closes connections cleanly
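The startup/shutdown ordering that the lifespan hook is meant to guarantee can be illustrated with a plain async context manager. This is a sketch only: `init_db_async` and `close_db` below are hypothetical stand-ins that record events, and no FastAPI app is involved.

```python
import asyncio
from contextlib import asynccontextmanager

events: list[str] = []


async def init_db_async() -> None:
    # Stand-in: the real function would create the DB and run migrations.
    events.append("init")


async def close_db() -> None:
    # Stand-in: the real function would dispose of the engine.
    events.append("close")


@asynccontextmanager
async def lifespan(app):
    """Initialize on entry and always clean up on exit, even if serving fails."""
    await init_db_async()
    try:
        yield
    finally:
        await close_db()


async def main() -> None:
    async with lifespan(app=None):
        events.append("serving")


asyncio.run(main())
print(events)
```

The `try`/`finally` around `yield` is the part worth checking in review: without it, an exception raised while the app is running would skip the connection cleanup.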

Testing (tests/):

  • Configuration tests: Comprehensive tests for precedence, validation, TOML parsing, and directory creation (398 lines)
  • Database tests: Tests for initialization, session management, migration execution, and foreign key enforcement (373 lines)
  • CLI tests: Tests for all database commands including upgrade, downgrade, revision, dump, and restore (465 lines)
  • Model tests: Tests for model validation, relationships, JSON fields, and timestamp management (252 lines)
  • Migration tests: Tests for Alembic environment configuration and migration execution (204 lines)
  • Review focus: Verify test coverage is comprehensive, edge cases are handled, and tests are maintainable

Testing Notes

  • Worktree State: Working tree has uncommitted changes (modified docs files and untracked files). Diff analysis is based on committed changes only. Reviewers should verify the PR against the committed state.
  • Database Path: Default database location is ~/.taskgenie/data/taskgenie.db. Can be overridden via TASKGENIE_DATA_DIR env var or database.url in config.toml.
  • Migration Execution: Migrations run automatically on FastAPI startup if database doesn't exist or alembic_version table is missing. This satisfies AC1 requirement for fresh install.
  • SQLite Limitations: SQLite has limited support for ALTER TABLE operations. Reviewers should verify that migrations use compatible operations (CREATE TABLE, CREATE INDEX) rather than complex ALTER TABLE statements.
  • Foreign Key Enforcement: SQLite foreign keys are disabled by default. The implementation enables them on every session via PRAGMA foreign_keys=ON. Reviewers should verify this works correctly.
  • Async URL Conversion: CLI database commands automatically convert async database URLs (sqlite+aiosqlite://) to sync URLs (sqlite://) before passing them to Alembic to avoid asyncio conflicts. This ensures migrations run synchronously.
  • Configuration Precedence: The actual precedence order is: init_settings → env vars → .env → config.toml → file_secret_settings → defaults. The docstring in Settings class should be updated to reflect this accurately (noted in review feedback).
  • Code Quality: Some modules lack the from __future__ import annotations line that the rest of the codebase uses (noted in review feedback). This should be addressed for consistency but doesn't affect functionality.
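The foreign key note above is easy to confirm in isolation. The following sketch uses the standard `sqlite3` module with a simplified two-table schema (an assumption; the real tables have more columns) to show the difference the PRAGMA makes on a connection.

```python
import sqlite3


def orphan_attachment_accepted(enforce: bool) -> bool:
    """Return True if an attachment pointing at a missing task was inserted."""
    conn = sqlite3.connect(":memory:")
    # The pragma is per connection, so it is set explicitly in both cases.
    conn.execute("PRAGMA foreign_keys=ON" if enforce else "PRAGMA foreign_keys=OFF")
    conn.execute("CREATE TABLE tasks (id TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE attachments ("
        "id TEXT PRIMARY KEY, "
        "task_id TEXT NOT NULL REFERENCES tasks(id))"
    )
    try:
        conn.execute("INSERT INTO attachments VALUES ('att-1', 'missing-task')")
        return True
    except sqlite3.IntegrityError:
        return False
    finally:
        conn.close()


print(orphan_attachment_accepted(enforce=False))
print(orphan_attachment_accepted(enforce=True))
```

With enforcement off the orphan row is silently accepted, which is the failure mode the per-session PRAGMA is there to prevent.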

Manual Verification Checklist

  • Run make test and verify all tests pass
  • Test configuration precedence: env vars → .env → config.toml → defaults
  • Delete ~/.taskgenie/data/taskgenie.db and start API, verify database is created automatically
  • Verify all tables exist: tasks, attachments, notifications, chat_history, config, alembic_version
  • Test uv run tgenie db upgrade command
  • Test uv run tgenie db downgrade -1 command
  • Create test data, run uv run tgenie db dump --out backup.sql, verify backup file is created
  • Delete database, run uv run tgenie db restore --in backup.sql, verify data is restored
  • Test health endpoint: curl http://localhost:8000/health
  • Verify foreign key constraints work: try to create attachment with invalid task_id
  • Test model relationships: create task with attachments, verify cascade delete works
  • Review migration file backend/migrations/versions/001_initial_schema.py for correctness
  • Verify CLI commands provide clear error messages for invalid inputs
  • Test app directory creation: verify ~/.taskgenie/data/, cache/, logs/, vector_store/ are created
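For the directory-creation check, idempotence is the property to look for: calling the function twice must not fail or change anything. A minimal sketch, assuming the four subdirectories named in this PR and ignoring permission handling (this `ensure_app_dirs` is a free function written for illustration, not the method on `Settings`):

```python
import tempfile
from pathlib import Path

SUBDIRS = ("data", "cache", "logs", "vector_store")


def ensure_app_dirs(base: Path) -> list[Path]:
    """Create the app subdirectories under `base`; safe to call repeatedly."""
    paths = []
    for name in SUBDIRS:
        path = base / name
        # exist_ok=True makes the second and later calls a no-op
        path.mkdir(parents=True, exist_ok=True)
        paths.append(path)
    return paths


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp) / ".taskgenie"
    first = ensure_app_dirs(base)
    second = ensure_app_dirs(base)  # must not raise
    print(sorted(p.name for p in base.iterdir()))
```

Pointing `base` at a temporary directory keeps the check from touching a real `~/.taskgenie`.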

- Update `.gitignore` to include coverage and pytest cache files for cleaner version control.
- Modify `.pre-commit-config.yaml` to set a maximum file size for large files and exclude specific files from checks.
- Introduce `AGENTS.md` as a concise guide for AI agents, detailing project structure and quickstart commands.
- Expand `Makefile` with new targets for testing coverage and specific PR dependencies.
- Update `README.md` to reference the new `AGENTS.md` for developers and agents.
- Add CI workflow configuration for automated linting and testing on pull requests.

These changes improve the development experience and provide clearer guidance for contributors.

- Update `.gitignore` to retain `.cursor/commands/` while excluding other cursor files for better version control.
- Add new command documentation for `post-review`, `pr-desc`, `review`, and `test-ac` to improve developer guidance and testing processes.
- Introduce new test results documentation structure to validate acceptance criteria for PR specifications.
- Implement automatic database migration execution on FastAPI startup to ensure compliance with project specifications.

These changes improve the development workflow, enhance documentation clarity, and ensure better adherence to project requirements.

@raychrisgdp raychrisgdp marked this pull request as draft December 31, 2025 07:59
@raychrisgdp raychrisgdp self-assigned this Dec 31, 2025
@raychrisgdp raychrisgdp changed the title from "C/pr 001 db config" to "feat(db): implement database configuration, migrations, and CLI commands" Dec 31, 2025

- Remove unnecessary import statements from test files to improve clarity and maintainability.
- Introduce new tests for migration scenarios, including cases for when the alembic_version table exists and when it does not.
- Enhance exception handling tests to ensure proper behavior during database migration checks.

These changes streamline the test code and improve coverage for migration-related functionality.

- Add automated test for AC1: test_fastapi_lifespan_creates_db_and_runs_migrations
  Verifies FastAPI startup creates DB and runs migrations automatically
  Tests that all required tables exist: tasks, attachments, notifications, chat_history, config, alembic_version

- Update .env.example to only include PR-001 relevant environment variables
  Removed LLM, Gmail, GitHub, and Notifications env vars
  Those belong to their respective PRs (PR-003, PR-006, PR-007, PR-011)

- Remove .cursor/commands directory (development tooling outside spec scope)

- Add .env.example update checklist items to relevant PR specs
  PR-002: Verify existing env vars (no new ones)
  PR-003-1: Add LLM_API_KEY, LLM_BASE_URL, LLM_MODEL
  PR-006: Add GMAIL_ENABLED, GMAIL_CREDENTIALS_PATH
  PR-007: Add GITHUB_TOKEN, GITHUB_USERNAME
  PR-011: Add NOTIFICATIONS_ENABLED, NOTIFICATION_SCHEDULE

Remove exception for .cursor/commands/ since .cursor files are
development tooling outside PR scope and should not be tracked

Remove temporary PR spec files that were created to document
.env.example requirements:
- SKILL_ENRICHMENT_SUMMARY.md
- PR-003-1 (llm-provider)
- PR-003-2 (streaming-chat-api)
- PR-003-3 (tui-chat-integration)
- PR-005-1 (chromadb-embeddings)
- PR-005-2 (semantic-search-rag)

These files were exploratory and will be created properly
in their respective PR implementations.

Revert changes to existing PR specs that were made during
.env.example update. The .env.example checklist items should be added
in their respective implementation PRs (PR-003-1, PR-006, PR-007, PR-011).

Affected files:
- PR-PLANS.md
- PR-specs/INDEX.md
- PR-002 through PR-012 (13 files total)

PR-PLANS.md had merge markers after restoration. Committing as-is
to mark that these will be resolved when future PRs add their .env.example
checklist items.

Restoring PR-PLANS.md to original version from before
.env.example modifications. This file should only receive .env.example
updates in its respective implementation PRs.

Resolve merge conflict by keeping approved version (0 findings).
Remove git conflict markers from reviews/REVIEW_c-pr-001-db-config.md.

- Refactor `.gitignore` to streamline local artifact exclusions, specifically retaining `.opencode/skill`, `.uv-cache/`, and `.tmp/`.
- Enhance `AGENTS.md` with detailed precommit workflow instructions and coding conventions, including common pitfalls and fixes.
- Update `Makefile` to utilize `uv` for running commands, ensuring consistency in the development environment.
- Add new testing guidelines in `TESTING_GUIDE.md` to address common issues and improve test organization.
- Introduce a new command in `.cursor/commands` for fixing unit tests, providing a structured approach to achieving target coverage.
- Update CI workflow to install `uv` as a dependency for better compatibility.

These changes aim to improve the development process, enhance documentation clarity, and ensure better adherence to coding standards.

…flow

- Refactor Makefile to consolidate dependency installation commands, enhancing clarity and efficiency.
- Update `pyproject.toml` to include `pytest-xdist` for parallel test execution and `httpx` for FastAPI testing.
- Modify database migration functions to ensure synchronous execution with appropriate error handling based on the environment.
- Add tests to verify migration behavior and ensure timely completion, addressing potential issues with database initialization.

These changes aim to streamline the development process, improve testing capabilities, and enhance error handling during migrations.

- Refactor Makefile to streamline the development environment setup by removing specific PR dependencies, focusing on core development dependencies.
- Update `pyproject.toml` to clarify optional dependencies and enhance documentation regarding FastAPI and testing requirements.
- Modify CI workflow to install dependencies using `uv`, ensuring a consistent installation process.
- Revise setup documentation to reflect changes in dependency installation and provide clearer instructions for new developers.

These updates aim to improve the development workflow, enhance clarity in dependency management, and ensure a smoother onboarding experience for contributors.

- Introduce asynchronous database initialization with `init_db_async()` to prevent blocking the event loop in async contexts.
- Update CLI commands to ensure migrations run synchronously by converting async database URLs to sync URLs, avoiding potential asyncio conflicts.
- Revise migration documentation to clarify the rationale for using sync URLs and the implementation details for both CLI and application startup.
- Add tests to validate the async migration path and ensure timely completion of database upgrades.

These changes aim to improve the reliability of database operations and enhance the overall development experience.

- Add a step to create a virtual environment using `uv venv` for better isolation of dependencies.
- Modify linting commands to run through `uv`, ensuring consistency in the execution environment.

These changes aim to improve the reliability of the CI process and streamline the development workflow.

- Update `.pre-commit-config.yaml` to include checks for both `backend` and `tests` directories.
- Modify `Makefile` to streamline linting commands, ensuring both checking and formatting are applied to the specified directories.
- Refactor CI workflow to utilize the updated `Makefile` for linting, improving consistency in the CI process.
- Simplify database URL handling in `config.py` to strip query parameters more efficiently.
- Add new tests to validate configuration precedence and database path handling, ensuring robustness in settings management.

These changes aim to improve code quality, streamline development processes, and enhance the reliability of configuration handling.

- Update `AGENTS.md` to clarify the use of `init_db_async()` in FastAPI lifespan to prevent event loop blocking.
- Modify `AGENT_GUIDE.md` to include detailed explanations on stripping query parameters from SQLite URLs in the `database_path` property.
- Ensure migration URLs are converted to sync format to avoid asyncio conflicts, with documentation updates reflecting these changes.

These updates aim to improve database initialization practices and enhance clarity in the documentation for better developer understanding.

- Remove caching configuration for pip in the CI workflow to streamline the dependency installation process.
- Ensure the workflow continues to utilize `uv` for setting up the environment and installing necessary packages.

These changes aim to simplify the CI process and improve the reliability of dependency management.

@raychrisgdp raychrisgdp marked this pull request as ready for review December 31, 2025 18:19
@raychrisgdp raychrisgdp merged commit fca7425 into main Dec 31, 2025
2 checks passed
@raychrisgdp raychrisgdp deleted the c/pr-001-db-config branch December 31, 2025 18:19