Skip to content

Restart alembic versions with the table creation #513

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 27 commits into
base: main
Choose a base branch
from

Conversation

madhav165
Copy link
Collaborator

@madhav165 madhav165 commented Jul 18, 2025

🐛 Bug-fix PR

📌 Summary

Identified the right way to do alembic migration. First migration script to have table creation commands.

💡 Fix Description

  • Base.metadata.create_all() replaced with command.upgrade(cfg, "head") set in gunicorn.config.py.
  • Also fixes tests that create multiple randomly named databases

🧪 Verification

Check Command Status
Lint suite make lint pass
Unit tests make test pass

📐 MCP Compliance (if relevant)

  • Matches current MCP spec
  • No breaking change to MCP clients

✅ Checklist

  • Code formatted (make black isort pre-commit)
  • No secrets/credentials committed

Signed-off-by: Madhav Kandukuri <[email protected]>
@madhav165 madhav165 requested a review from crivetimihai as a code owner July 18, 2025 07:03
@crivetimihai crivetimihai changed the title Pending changes for alembic Signed-off-by: Madhav Kandukuri <[email protected]> Pending changes for alembic Jul 18, 2025
@madhav165 madhav165 marked this pull request as draft July 18, 2025 07:06
@madhav165 madhav165 marked this pull request as ready for review July 21, 2025 09:05
@madhav165 madhav165 self-assigned this Jul 21, 2025
@madhav165 madhav165 marked this pull request as draft July 21, 2025 14:38
madhav165 added 13 commits July 21, 2025 14:46
Signed-off-by: Madhav Kandukuri <[email protected]>
Signed-off-by: Madhav Kandukuri <[email protected]>
Signed-off-by: Madhav Kandukuri <[email protected]>
Signed-off-by: Madhav Kandukuri <[email protected]>
Signed-off-by: Madhav Kandukuri <[email protected]>
Signed-off-by: Madhav Kandukuri <[email protected]>
Signed-off-by: Madhav Kandukuri <[email protected]>
Signed-off-by: Madhav Kandukuri <[email protected]>
Signed-off-by: Madhav Kandukuri <[email protected]>
Signed-off-by: Madhav Kandukuri <[email protected]>
Signed-off-by: Madhav Kandukuri <[email protected]>
Signed-off-by: Madhav Kandukuri <[email protected]>
@madhav165 madhav165 marked this pull request as ready for review July 21, 2025 19:22
@madhav165 madhav165 requested a review from kevalmahajan as a code owner July 21, 2025 19:22
@madhav165 madhav165 changed the title Pending changes for alembic Restart alembic versions with the table creation Jul 21, 2025
@madhav165 madhav165 marked this pull request as draft July 21, 2025 19:33
@madhav165
Copy link
Collaborator Author

Having alembic migration on gunicorn startup works, tested with empty SQLite and PostgreSQL, however not optimal for CI/CD. Need to explore how to do it better for CI/CD.

Signed-off-by: Madhav Kandukuri <[email protected]>
@madhav165 madhav165 added this to the Release 0.5.0 milestone Jul 23, 2025
@crivetimihai
Copy link
Member

Fix the migration to be idempotent

In the migration file, wrap column additions in try/except or check existence

from sqlalchemy import inspect

def upgrade():
    bind = op.get_bind()
    inspector = inspect(bind)
    
    # Check if column exists before adding
    columns = [col['name'] for col in inspector.get_columns('gateways')]
    if 'slug' not in columns:
        op.add_column("gateways", sa.Column("slug", sa.String(), nullable=True))

Getting this:

sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) duplicate column name: slug
[SQL: ALTER TABLE gateways ADD COLUMN slug VARCHAR]
(Background on this error at: https://sqlalche.me/e/20/e3q8)

@madhav165 madhav165 marked this pull request as ready for review July 31, 2025 08:33
@crivetimihai crivetimihai requested a review from rakdutta August 1, 2025 09:05
@crivetimihai
Copy link
Member

This pull request makes several changes across multiple files, focusing on database migration management, container setup, and test improvements.


1. Migration & Database Setup

  • Alembic Migration Files:
    • A new migration file 9c07b4bc5774_auto_migration.py is added. This file consolidates the schema creation for all major tables (gateways, servers, tools, resources, prompts, metrics, etc.).
    • Old migration files (b77ca9d2de7e_uuid_pk_and_slug_refactor.py, e4fc04d1a442_add_annotations_to_tables.py, e75490e949b1_add_improved_status_to_tables.py) are removed.
  • Bootstrap Logic:
    • The call to bootstrap_db in main.py is removed, indicating a shift to managing DB schema entirely via Alembic migrations rather than runtime creation.
  • Docker/Helm Setup:
    • The Alembic migration command is updated everywhere to use a config file: alembic -c mcpgateway/alembic.ini upgrade head.
    • Docker build and compose files are updated to use Containerfile.lite for the migration container instead of the default Dockerfile.
    • Health checks for the migration service in compose files are re-enabled and improved.

2. Codebase Clean-up

  • Alembic Env.py:
    • Several commented-out code sections are removed, cleaning up the Alembic environment script.
  • Gunicorn Config:
    • Imports for Alembic are added (possibly for future DB migration hooks), but not used yet.
    • Removes two empty lines for tidiness.

3. Tests

  • Test Setup Changes:
    • The test client fixture in test_ui_version.py is scoped to use a temporary database via app_with_temp_db, improving test isolation.
    • Some test documentation is reworded for clarity.
    • Unused or commented-out test code is removed.
  • Gateway Service Test:
    • Removes a placeholder/pass statement from a test, indicating further improvements in test completeness.

4. Container & Compose Files

  • Travis CI Build:
    • The build script is updated to use Containerfile.lite for lighter container builds.
    • Health endpoint checks are commented out, possibly to avoid flakiness in CI.
  • Docker Compose:
    • Migration service is enabled and set to run with the new image/build context.
    • Healthcheck for the main service is removed.

General Assessment

Pros:

  • Migrating to Alembic for all schema management is a best practice for production systems.
  • Removing runtime schema creation reduces risk and increases reliability.
  • The new migration file appears to be comprehensive and well-structured.
  • Docker/compose/Helm updates improve clarity and ensure the migration container uses the proper config.
  • Test improvements make the suite more robust and isolated.

Cons/Risks:

  • Removing the bootstrap logic means environments must always be migrated before runtime—this should be clearly documented for contributors.
  • Removal of health checks may reduce visibility into service health in CI/CD.
  • If any migration file removal is premature, there could be upgrade path issues for existing installations—ensure that migration history is not lost for users upgrading from older versions.

Recommendations

  • Ensure upgrade documentation is updated—users must run Alembic migrations manually now.
  • Double-check that the new migration covers all schema evolution from previous versions.
  • Consider re-enabling health checks in CI/CD if possible, to catch runtime issues early.
  • Test upgrades from previous versions to this one to verify the new migration works as intended.

.travis.yml Outdated
curl -fsSL http://localhost:4444/health || {
echo "❌ Health check failed"; docker logs mcpgateway; exit 1;
}
# echo "🔍 Hitting health endpoint..."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should re-enable healthchecks

@@ -60,16 +60,8 @@ services:
condition: service_healthy # ▶ wait for DB
redis:
condition: service_started
# migration:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't remove healthcheck

@rakdutta
Copy link
Collaborator

rakdutta commented Aug 5, 2025

Testing Summary:
Docker

  1. make container-build

  2. make compose-up (not working)

  3. docker-compose up its start the below container successfully
    docker ps -a --format "{{.Image}} {{.Status}}" python:3.11-slim Created mcpgateway/mcpgateway:latest Up 6 minutes (unhealthy) redis/redisinsight:latest Up 7 minutes dpage/pgadmin4:latest Up 6 minutes mcpgateway/mcpgateway:latest Exited (0) 6 minutes ago postgres:17 Up 7 minutes (healthy) redis:latest Up 7 minutes ghcr.io/ibm/fast-time-server:latest Up 7 minutes

  4. Connect from admin ui

  5. Not able to add new gateway server. Getting below error. However, others are working (add/update)
    _```
    172.19.0.1:46312 - "GET /.well-known/appspecific/com.chrome.devtools.json HTTP/1.1" 404
    06:30:21 - mcpgateway.services.gateway_service - ERROR - GatewayConnectionError in group: (GatewayConnectionError('Failed to initialize gateway at http://localhost:8080/sse'),)
    172.19.0.1:34698 - "POST /admin/gateways HTTP/1.1" 502

6) Able to logging successfully into PostgreSQL below tables has been created successfully

                  List of relations
 Schema |            Name             | Type  |  Owner
--------+-----------------------------+-------+----------
 public | alembic_version             | table | postgres
 public | gateways                    | table | postgres
 public | mcp_messages                | table | postgres
 public | mcp_sessions                | table | postgres
 public | prompt_metrics              | table | postgres
 public | prompts                     | table | postgres
 public | resource_metrics            | table | postgres
 public | resource_subscriptions      | table | postgres
 public | resources                   | table | postgres
 public | server_metrics              | table | postgres
 public | server_prompt_association   | table | postgres
 public | server_resource_association | table | postgres
 public | server_tool_association     | table | postgres
 public | servers                     | table | postgres
 public | tool_metrics                | table | postgres
 public | tools                       | table | postgres

**Local:**
1) delete my existing mcp.db
2) make server(existing mcp db working fine)
getting below error.
`   cursor.execute(statement, parameters)
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such table: gateways
[SQL: SELECT gateways.id AS gateways_id, gateways.name AS gateways_name, gateways.slug AS gateways_slug, gateways.url AS gateways_url, gateways.description AS gateways_description, gateways.transport AS gateways_transport, gateways.capabilities AS gateways_capabilities, gateways.created_at AS gateways_created_at, gateways.updated_at AS gateways_updated_at, gateways.enabled AS gateways_enabled, gateways.reachable AS gateways_reachable, gateways.last_seen AS gateways_last_seen, gateways.auth_type AS gateways_auth_type, gateways.auth_value AS gateways_auth_value
FROM gateways]
(Background on this error at: https://sqlalche.me/e/20/e3q8)`



Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants