Skip to content

Conversation

@jason-lynch
Copy link
Member

@jason-lynch jason-lynch commented Jan 14, 2026

Summary

Adds a migrate package with a mechanism to run an ordered set of migrations on startup, such as Etcd data migrations. The migrations runner uses the election package to ensure that only one server instance is running the migrations at a time. Other instances will block startup until the migrations are complete, so we should aim to keep migrations as lightweight as possible. As stated in the package documentation, these migrations should be idempotent and, where possible, non-destructive.

Testing

This PR does not add any migrations, so there are no functional changes. You should see a log message control-plane db is up to date, no migrations to run from one of the hosts.

PLAT-347

Summary by CodeRabbit

  • New Features

    • Added a migration framework that executes startup-blocking data migrations with distributed coordination, ensuring single execution across multiple server instances
    • Implemented automatic migration status tracking and revision management for safe, idempotent updates
    • Migrations now run before service startup with comprehensive error handling and failure recovery
  • Tests

    • Added comprehensive test coverage for migration execution, concurrent coordination, and failure scenarios

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link

coderabbitai bot commented Jan 14, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

Walkthrough

This PR introduces a startup-blocking migration framework for the server. It includes distributed locking via election, status tracking persisted to etcd, and sequential execution of migrations before service initialization. The system tracks migration revisions and execution metadata across instances.

Changes

Cohort / File(s) Summary
Server Integration
server/cmd/root.go, server/internal/app/app.go
Wires migrate.Provide into root initialization; integrates migration execution in app startup before service initialization with error handling
Migration Package Foundation
server/internal/migrate/doc.go, server/internal/migrate/migration.go, server/internal/migrate/all_migrations.go
Defines migrate package documentation; establishes Migration interface with Identifier() and Run(ctx, injector) methods; provides ordered allMigrations() function
Storage & State Management
server/internal/migrate/store.go, server/internal/migrate/revision_store.go, server/internal/migrate/result_store.go
Implements Store composite holding RevisionStore and ResultStore; RevisionStore tracks latest applied migration revision; ResultStore persists per-migration execution metadata (success/failure, timestamps, host ID, version info) to etcd
Migration Execution Orchestration
server/internal/migrate/runner.go, server/internal/migrate/runner_test.go
Runner acquires distributed lock via election candidate, determines pending migrations, executes sequentially with error propagation, tracks revision updates, and persists results; comprehensive test suite validates lock contention, ordering, failure handling, and concurrent safety
Dependency Injection Wiring
server/internal/migrate/provide.go
Registers Store and Runner dependencies with etcd client, host ID, injector, logger, election-based locking (30-second TTL), and migrations; orchestrates dependency assembly with error propagation

Sequence Diagram

sequenceDiagram
    participant App as App Startup
    participant Runner as Migration Runner
    participant Election as Election/Lock
    participant Store as etcd Store
    participant Migration as Migration Handler
    
    App->>Runner: Run(ctx)
    Runner->>Store: Get current revision
    Store-->>Runner: Current revision
    Runner->>Election: Acquire lock with candidate
    Election-->>Runner: Lock acquired (leadership)
    Runner->>Runner: Determine pending migrations from revision
    loop For each pending migration
        Runner->>Migration: Run(ctx, injector)
        Migration->>Store: Update revision & status
        Store-->>Migration: Persisted
        Migration-->>Runner: Success/Error
        Runner->>Store: Record execution metadata
    end
    Runner->>Store: Watch for revision changes (other instances)
    Runner->>Election: Release lock
    Runner-->>App: Return result
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes


🐰 A runner hops through migrations with care,
Elections lock, ensuring work is fair,
Revisions stored in etcd's careful keep,
Each change recorded in the vault so deep,
Now services start when migrations complete!

🚥 Pre-merge checks | ✅ 1 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Description check ❓ Inconclusive The PR description covers the Summary and Testing sections well, but is missing the Changes, Checklist, and Notes for Reviewers sections from the template. Add the Changes section with a bulleted list of key changes, complete the Checklist items, and optionally add Notes for Reviewers to highlight important implementation details or areas needing reviewer attention.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat: add migrate package' clearly and concisely summarizes the main change—introducing a new migrate package—which aligns with the substantial additions across multiple files for migration infrastructure.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@jason-lynch jason-lynch force-pushed the feat/PLAT-347/election-pkg branch from 448017e to 847d42a Compare January 15, 2026 13:50
@jason-lynch jason-lynch force-pushed the feat/PLAT-347/migrations branch from 64e7057 to 24aad70 Compare January 15, 2026 13:50
@jason-lynch jason-lynch force-pushed the feat/PLAT-347/election-pkg branch from 847d42a to 9238c9f Compare January 15, 2026 13:57
@jason-lynch jason-lynch force-pushed the feat/PLAT-347/migrations branch from 24aad70 to 8b2d7e1 Compare January 15, 2026 13:59
@jason-lynch jason-lynch force-pushed the feat/PLAT-347/election-pkg branch from 9238c9f to 295b6d9 Compare January 15, 2026 14:02
@jason-lynch jason-lynch force-pushed the feat/PLAT-347/migrations branch from 8b2d7e1 to 278a4cb Compare January 15, 2026 14:03
@jason-lynch
Copy link
Member Author

@coderabbitai review

@coderabbitai
Copy link

coderabbitai bot commented Jan 15, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@jason-lynch jason-lynch force-pushed the feat/PLAT-347/migrations branch from 278a4cb to d35513b Compare January 15, 2026 17:35
Copy link
Contributor

@tsivaprasad tsivaprasad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good and has been verified through tests.

go test -v ./server/internal/migrate/                       
=== RUN   TestRunner
=== RUN   TestRunner/acquires_lock_and_runs_migrations
=== NAME  TestRunner
    writer.go:27: {"level":"debug","component":"migration_runner","message":"starting watch"}
    writer.go:27: {"level":"debug","component":"election_candidate","election_name":"migration_runner","candidate_id":"host-1","message":"attempting to claim leadership"}
    writer.go:27: {"level":"debug","component":"election_candidate","election_name":"migration_runner","candidate_id":"host-1","message":"successfully claimed leadership"}
    writer.go:27: {"level":"debug","component":"election_candidate","election_name":"migration_runner","candidate_id":"host-1","message":"i am the current leader"}
    writer.go:27: {"level":"debug","component":"election_candidate","election_name":"migration_runner","candidate_id":"host-1","message":"starting watch"}
    writer.go:27: {"level":"info","component":"migration_runner","migration":"test-migration","message":"running migration"}
=== RUN   TestRunner/multiple_runners
=== NAME  TestRunner
    writer.go:27: {"level":"debug","component":"migration_runner","message":"starting watch"}
    writer.go:27: {"level":"debug","component":"migration_runner","message":"starting watch"}
    writer.go:27: {"level":"debug","component":"election_candidate","election_name":"migration_runner","candidate_id":"host-1","message":"attempting to claim leadership"}
    writer.go:27: {"level":"debug","component":"election_candidate","election_name":"migration_runner","candidate_id":"host-2","message":"attempting to claim leadership"}
    writer.go:27: {"level":"debug","component":"election_candidate","election_name":"migration_runner","candidate_id":"host-1","message":"successfully claimed leadership"}
    writer.go:27: {"level":"debug","component":"election_candidate","election_name":"migration_runner","candidate_id":"host-1","message":"i am the current leader"}
    writer.go:27: {"level":"debug","component":"election_candidate","election_name":"migration_runner","candidate_id":"host-1","message":"starting watch"}
    writer.go:27: {"level":"info","component":"migration_runner","migration":"test-migration","message":"running migration"}
    writer.go:27: {"level":"debug","component":"election_candidate","election_name":"migration_runner","candidate_id":"host-2","message":"starting watch"}
{"level":"error","ts":"2026-01-16T17:51:22.191650+0530","caller":"embed/etcd.go:912","msg":"setting up serving from embedded etcd failed.","error":"http: Server closed","stacktrace":"go.etcd.io/etcd/server/v3/embed.(*Etcd).errHandler\n\t/Users/sivat/go/pkg/mod/go.etcd.io/etcd/server/[email protected]/embed/etcd.go:912\ngo.etcd.io/etcd/server/v3/embed.(*serveCtx).startHandler.func1\n\t/Users/sivat/go/pkg/mod/go.etcd.io/etcd/server/[email protected]/embed/serve.go:90"}
{"level":"error","ts":"2026-01-16T17:51:22.191637+0530","caller":"embed/etcd.go:912","msg":"setting up serving from embedded etcd failed.","error":"mux: server closed","stacktrace":"go.etcd.io/etcd/server/v3/embed.(*Etcd).errHandler\n\t/Users/sivat/go/pkg/mod/go.etcd.io/etcd/server/[email protected]/embed/etcd.go:912\ngo.etcd.io/etcd/server/v3/embed.(*serveCtx).startHandler.func1\n\t/Users/sivat/go/pkg/mod/go.etcd.io/etcd/server/[email protected]/embed/serve.go:90"}
{"level":"error","ts":"2026-01-16T17:51:22.191841+0530","caller":"embed/etcd.go:912","msg":"setting up serving from embedded etcd failed.","error":"accept tcp 127.0.0.1:58876: use of closed network connection","stacktrace":"go.etcd.io/etcd/server/v3/embed.(*Etcd).errHandler\n\t/Users/sivat/go/pkg/mod/go.etcd.io/etcd/server/[email protected]/embed/etcd.go:912\ngo.etcd.io/etcd/server/v3/embed.(*Etcd).startHandler.func1\n\t/Users/sivat/go/pkg/mod/go.etcd.io/etcd/server/[email protected]/embed/etcd.go:906"}
{"level":"error","ts":"2026-01-16T17:51:22.229545+0530","caller":"embed/etcd.go:912","msg":"setting up serving from embedded etcd failed.","error":"accept tcp 127.0.0.1:58877: use of closed network connection","stacktrace":"go.etcd.io/etcd/server/v3/embed.(*Etcd).errHandler\n\t/Users/sivat/go/pkg/mod/go.etcd.io/etcd/server/[email protected]/embed/etcd.go:912\ngo.etcd.io/etcd/server/v3/embed.(*Etcd).startHandler.func1\n\t/Users/sivat/go/pkg/mod/go.etcd.io/etcd/server/[email protected]/embed/etcd.go:906"}
--- PASS: TestRunner (0.57s)
    --- PASS: TestRunner/acquires_lock_and_runs_migrations (0.03s)
    --- PASS: TestRunner/multiple_runners (0.03s)
=== RUN   TestRunnerMigrationOrdering
=== RUN   TestRunnerMigrationOrdering/runs_migrations_in_order
=== RUN   TestRunnerMigrationOrdering/starts_from_current_revision
=== RUN   TestRunnerMigrationOrdering/stops_on_first_failure
=== RUN   TestRunnerMigrationOrdering/records_status_for_each_migration
=== RUN   TestRunnerMigrationOrdering/updates_revision_after_each_successful_migration
=== RUN   TestRunnerMigrationOrdering/does_not_update_revision_after_failed_migration
{"level":"error","ts":"2026-01-16T17:51:23.199168+0530","caller":"embed/etcd.go:912","msg":"setting up serving from embedded etcd failed.","error":"http: Server closed","stacktrace":"go.etcd.io/etcd/server/v3/embed.(*Etcd).errHandler\n\t/Users/sivat/go/pkg/mod/go.etcd.io/etcd/server/[email protected]/embed/etcd.go:912\ngo.etcd.io/etcd/server/v3/embed.(*serveCtx).startHandler.func1\n\t/Users/sivat/go/pkg/mod/go.etcd.io/etcd/server/[email protected]/embed/serve.go:90"}
{"level":"error","ts":"2026-01-16T17:51:23.199168+0530","caller":"embed/etcd.go:912","msg":"setting up serving from embedded etcd failed.","error":"mux: server closed","stacktrace":"go.etcd.io/etcd/server/v3/embed.(*Etcd).errHandler\n\t/Users/sivat/go/pkg/mod/go.etcd.io/etcd/server/[email protected]/embed/etcd.go:912\ngo.etcd.io/etcd/server/v3/embed.(*serveCtx).startHandler.func1\n\t/Users/sivat/go/pkg/mod/go.etcd.io/etcd/server/[email protected]/embed/serve.go:90"}
{"level":"error","ts":"2026-01-16T17:51:23.199328+0530","caller":"embed/etcd.go:912","msg":"setting up serving from embedded etcd failed.","error":"accept tcp 127.0.0.1:58881: use of closed network connection","stacktrace":"go.etcd.io/etcd/server/v3/embed.(*Etcd).errHandler\n\t/Users/sivat/go/pkg/mod/go.etcd.io/etcd/server/[email protected]/embed/etcd.go:912\ngo.etcd.io/etcd/server/v3/embed.(*Etcd).startHandler.func1\n\t/Users/sivat/go/pkg/mod/go.etcd.io/etcd/server/[email protected]/embed/etcd.go:906"}
{"level":"error","ts":"2026-01-16T17:51:23.227938+0530","caller":"embed/etcd.go:912","msg":"setting up serving from embedded etcd failed.","error":"accept tcp 127.0.0.1:58882: use of closed network connection","stacktrace":"go.etcd.io/etcd/server/v3/embed.(*Etcd).errHandler\n\t/Users/sivat/go/pkg/mod/go.etcd.io/etcd/server/[email protected]/embed/etcd.go:912\ngo.etcd.io/etcd/server/v3/embed.(*Etcd).startHandler.func1\n\t/Users/sivat/go/pkg/mod/go.etcd.io/etcd/server/[email protected]/embed/etcd.go:906"}
--- PASS: TestRunnerMigrationOrdering (1.00s)
    --- PASS: TestRunnerMigrationOrdering/runs_migrations_in_order (0.05s)
    --- PASS: TestRunnerMigrationOrdering/starts_from_current_revision (0.04s)
    --- PASS: TestRunnerMigrationOrdering/stops_on_first_failure (0.03s)
    --- PASS: TestRunnerMigrationOrdering/records_status_for_each_migration (0.03s)
    --- PASS: TestRunnerMigrationOrdering/updates_revision_after_each_successful_migration (0.03s)
    --- PASS: TestRunnerMigrationOrdering/does_not_update_revision_after_failed_migration (0.04s)
PASS
ok  	github.com/pgEdge/control-plane/server/internal/migrate	2.074s
➜  control-plane git:(feat/PLAT-347/migrations) 

@jason-lynch jason-lynch force-pushed the feat/PLAT-347/election-pkg branch from 295b6d9 to 51a12aa Compare January 16, 2026 14:21
Adds a `migrate` package with a mechanism to run an ordered set of
migrations on startup, such as Etcd data migrations. The migrations
runner uses the `election` package to ensure that only one server
instance is running the migrations at a time. Other instances will block
startup until the migrations are complete, so we should aim to keep
migrations as lightweight as possible. As stated in the package
documentation, these migrations should be idempotent and, where
possible, non-destructive.

PLAT-347
@jason-lynch jason-lynch force-pushed the feat/PLAT-347/migrations branch from d35513b to b81f4bf Compare January 16, 2026 14:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants