fix(vitess): recover late migration context #232
Pull request overview
This PR improves PlanetScale/Vitess progress enrichment by re-discovering a missing Vitess migration_context during Progress() polling (when the persisted resume state lacks it), using deploy/request timestamps to select the most relevant context and adding regression tests for the selection logic.
Changes:
- Add fallback migration context discovery in `Engine.Progress()` when `ResumeState.MigrationContext` is empty.
- Introduce timestamp-based context selection helpers (`migrationContextDiscoveryAfter`, `latestMigrationContextAfter`) and parse `requested_timestamp` from `SHOW VITESS_MIGRATIONS`.
- Add tests covering context cutoff selection and discovery "after" timestamp behavior.
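The timestamp-based selection listed above can be illustrated with a minimal sketch. It is not the PR's implementation: `migrationRow`, `pickContextAfter`, and `mustTime` are hypothetical names, and the `requested_timestamp` layout (`YYYY-MM-DD hh:mm:ss`, UTC) is an assumption. The sketch picks the newest context requested at or after a cutoff, and only falls back to an untimed context when exactly one candidate exists.

```go
package main

import (
	"fmt"
	"time"
)

// Assumed layout of requested_timestamp; the real column format may differ.
const timestampLayout = "2006-01-02 15:04:05"

// migrationRow is a hypothetical, trimmed-down view of one SHOW VITESS_MIGRATIONS row.
type migrationRow struct {
	MigrationContext   string
	RequestedTimestamp string // empty when the row carries no usable timestamp
}

// mustTime parses a timestamp in the assumed layout and panics on bad input.
func mustTime(s string) time.Time {
	t, err := time.Parse(timestampLayout, s)
	if err != nil {
		panic(err)
	}
	return t
}

// pickContextAfter returns the context with the most recent requested_timestamp
// at or after cutoff. If no timed row qualifies, it returns an untimed context
// only when exactly one distinct untimed context exists, so it never guesses.
func pickContextAfter(rows []migrationRow, cutoff time.Time) (string, bool) {
	var best string
	var bestAt time.Time
	untimed := map[string]bool{}
	for _, r := range rows {
		if r.MigrationContext == "" {
			continue
		}
		at, err := time.Parse(timestampLayout, r.RequestedTimestamp)
		if err != nil {
			untimed[r.MigrationContext] = true
			continue
		}
		if at.Before(cutoff) {
			continue // requested before the deploy request existed
		}
		if best == "" || at.After(bestAt) {
			best, bestAt = r.MigrationContext, at
		}
	}
	if best != "" {
		return best, true
	}
	if len(untimed) == 1 {
		for ctx := range untimed {
			return ctx, true
		}
	}
	return "", false
}

func main() {
	cutoff := mustTime("2024-05-01 12:00:00")
	rows := []migrationRow{
		{MigrationContext: "ctx-old", RequestedTimestamp: "2024-05-01 11:00:00"},
		{MigrationContext: "ctx-new", RequestedTimestamp: "2024-05-01 12:30:00"},
	}
	ctx, ok := pickContextAfter(rows, cutoff)
	fmt.Println(ctx, ok) // prints "ctx-new true"
}
```

Refusing to choose between several untimed contexts trades enrichment for safety: attaching the wrong context would show progress for an unrelated migration.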
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| pkg/engine/planetscale/progress.go | Adds “late” migration context discovery during polling and implements timestamp-based context selection. |
| pkg/engine/planetscale/progress_test.go | Adds regression tests for context cutoff selection and discovery-after timestamp computation. |
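As a rough illustration of the late discovery described for `progress.go`, the sketch below fills in a missing context during a poll and otherwise leaves the request alone. `ResumeState` and `ProgressRequest` are simplified stand-ins for the engine types, and `ensureMigrationContext`, `discover`, and `errNoContext` are hypothetical; returning `false` for a nil resume state is also an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// ResumeState and ProgressRequest are simplified, hypothetical stand-ins.
type ResumeState struct {
	MigrationContext string
	Metadata         string
}

type ProgressRequest struct {
	ResumeState *ResumeState
}

var errNoContext = errors.New("no migration context visible yet")

// ensureMigrationContext reports whether req carries a migration context after
// the call. It only runs discover when the stored context is empty, and it
// preserves Metadata when it rebuilds the resume state.
func ensureMigrationContext(req *ProgressRequest, discover func() (string, error)) bool {
	if req.ResumeState == nil {
		return false
	}
	if req.ResumeState.MigrationContext != "" {
		return true
	}
	ctx, err := discover()
	if err != nil || ctx == "" {
		return false // caller keeps the coarse deploy-request status
	}
	req.ResumeState = &ResumeState{
		MigrationContext: ctx,
		Metadata:         req.ResumeState.Metadata,
	}
	return true
}

func main() {
	req := &ProgressRequest{ResumeState: &ResumeState{Metadata: "deploy-request-metadata"}}

	// First poll: Vitess has not exposed the context yet.
	ok := ensureMigrationContext(req, func() (string, error) { return "", errNoContext })
	fmt.Println(ok, req.ResumeState.MigrationContext == "")

	// Later poll: the context is visible and gets attached.
	ok = ensureMigrationContext(req, func() (string, error) { return "ctx-123", nil })
	fmt.Println(ok, req.ResumeState.MigrationContext, req.ResumeState.Metadata)
}
```

A failed discovery is not treated as an error here, which matches the idea that the deploy-request state stays authoritative and the Vitess view is only enrichment.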
317afd5 to fd31453
Co-authored-by: Amp <amp@ampcode.com>
fd31453 to ca91b28
    req.ResumeState = &engine.ResumeState{
        MigrationContext: migrationContext,
        Metadata:         req.ResumeState.Metadata,
    }
Perhaps we should persist the recovered value somewhere, e.g. in a column that already exists (`migration_context`), so that the per-poll 9-query sweep becomes a one-time cost?
Example impact summary from my agent session:
Vitess DB with 8 keyspaces; Apply Z hit the late-context bug → stored migration_context = ""
Every CLI `schemabot status` / UI poll for Apply Z:
BuildResumeState(vad) → migration_context still "" (never persisted)
→ discoverMigrationContext:
ListKeyspaces (1 PlanetScale API call)
SHOW VITESS_MIGRATIONS × 8 keyspaces (8 vtgate queries)
→ result thrown away, not saved
→ next poll repeats the same 9 queries
Good call — this is resolved now in 7db975b2. LocalClient.Progress persists a recovered ResumeState.MigrationContext back into vitess_apply_data.migration_context when the stored value is missing/different, so the expensive context discovery sweep becomes a one-time recovery path instead of repeating on every poll. I also added a regression test for that path.
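To make the one-time cost concrete, here is a minimal in-memory sketch of the persist-on-recovery idea. It is a simplification rather than the code in 7db975b2: `contextStore`, `persistIfChanged`, and `progress` are hypothetical, a map stands in for `vitess_apply_data`, and discovery and persistence are merged into one function for brevity.

```go
package main

import "fmt"

// contextStore is a hypothetical stand-in for the vitess_apply_data table.
type contextStore struct {
	contexts map[int64]string // apply ID -> migration_context
	writes   int              // number of persisted updates
}

// persistIfChanged writes a recovered context only when it is non-empty and
// differs from the stored value. It reports whether a write happened.
func (s *contextStore) persistIfChanged(applyID int64, recovered string) bool {
	if recovered == "" || s.contexts[applyID] == recovered {
		return false
	}
	s.contexts[applyID] = recovered
	s.writes++
	return true
}

// progress simulates one poll. discover represents the expensive sweep
// (ListKeyspaces plus one SHOW VITESS_MIGRATIONS per keyspace) and runs only
// when no context is stored.
func progress(s *contextStore, applyID int64, discover func() string) string {
	ctx := s.contexts[applyID]
	if ctx == "" {
		ctx = discover()
		s.persistIfChanged(applyID, ctx)
	}
	return ctx
}

func main() {
	store := &contextStore{contexts: map[int64]string{}}
	sweeps := 0
	discover := func() string { sweeps++; return "ctx-123" }

	for i := 0; i < 3; i++ {
		progress(store, 1, discover)
	}
	fmt.Println(sweeps, store.writes) // prints "1 1"
}
```

Three polls trigger a single sweep and a single write; without the write-back, each poll would repeat the sweep.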
384004d to 7db975b
Why
SchemaBot can sometimes create/start a PlanetScale deploy request before Vitess exposes the deploy's `migration_context` in `SHOW VITESS_MIGRATIONS`.

When that happens, SchemaBot still knows the deploy request ID, but it cannot show the richer Vitess progress view: per-table/per-shard progress, row counts, ETA, and related migration details. Without a recovery path, progress can stay limited to coarse deploy-request status even after the Vitess rows appear.
What
- Persist resume state on `vitess_apply_data` so it survives normal progress polling and restarts.
- When resume state is missing `migration_context`, query `SHOW VITESS_MIGRATIONS` again and recover a context from current rows.
- Select a context whose `requested_timestamp` is at or after the deploy request creation time, and avoid guessing when multiple untimed new contexts exist.

Risk Assessment
Low. This only improves PlanetScale progress enrichment when resume state is missing `migration_context`; deploy-request state remains the source of truth for apply state.

The storage change is additive: a nullable JSON column on `vitess_apply_data`.

Generated with Amp