| 1 | +# pg-schema-diff |
| 2 | + |
| 3 | +A declarative schema migration tool for PostgreSQL. It computes the difference between two database schemas and generates minimal, optimized SQL to migrate from one to the other, with zero downtime where possible. |
| 4 | + |
| 5 | +## Project Overview |
| 6 | + |
| 7 | +**Problem Solved**: Developers declare their desired database schema in DDL files, and pg-schema-diff automatically generates safe, optimized migration SQL that minimizes downtime and locks. |
| 8 | + |
| 9 | +**Key Features**: |
| 10 | +- Computes diffs between schemas (DDL files, databases, or directories) |
| 11 | +- Generates SQL using native Postgres online operations (concurrent index builds, online constraint validation) |
| 12 | +- Provides hazard warnings for dangerous operations |
| 13 | +- Validates migration plans against temporary databases before execution |
| 14 | + |
| 15 | +## Directory Structure |
| 16 | + |
| 17 | +``` |
| 18 | +cmd/pg-schema-diff/ # CLI entry point (Cobra-based) |
| 19 | +├── plan_cmd.go # 'plan' subcommand - generates migration SQL |
| 20 | +├── apply_cmd.go # 'apply' subcommand - applies migrations |
| 21 | +├── flags.go # Flag parsing and DB connection handling |
| 22 | + |
| 23 | +pkg/ # Public API packages |
| 24 | +├── diff/ # Core diffing and plan generation (main library interface) |
| 25 | +├── tempdb/ # Temporary database factory for plan validation |
| 26 | +├── log/ # Logging interface |
| 27 | +├── schema/ # Public schema API wrapper |
| 28 | +├── sqldb/ # Database queryable interface |
| 29 | + |
| 30 | +internal/ # Internal implementation |
| 31 | +├── schema/ # Complete schema representation types (schema.go is 46KB) |
| 32 | +├── queries/ # SQL queries via sqlc for schema introspection |
| 33 | +├── migration_acceptance_tests/ # Comprehensive test suite (24 test files) |
| 34 | +├── pgengine/ # Postgres engine management for tests |
| 35 | +├── pgdump/ # pg_dump integration |
| 36 | +├── graph/ # Dependency graph for statement ordering |
| 37 | +``` |
| 38 | + |
| 39 | +## Key Packages |
| 40 | + |
| 41 | +### pkg/diff/ - Core Diffing Engine |
| 42 | +- `plan_generator.go`: Orchestrates plan generation and validation |
| 43 | +- `sql_generator.go`: Generates SQL statements for all object types (2,700+ lines) |
| 44 | +- `sql_graph.go`: Dependency graph for correct statement ordering |
| 45 | +- `schema_source.go`: Schema sources (DDL files, database, directories) |
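
To illustrate what "dependency graph for correct statement ordering" means in practice, here is a minimal sketch of a topological sort (Kahn's algorithm) over statement IDs. It is not the code in `sql_graph.go`; the function name `orderStatements`, the string IDs, and the map-based graph are all assumptions made for this example.

```go
package main

import (
	"fmt"
	"sort"
)

// orderStatements returns statement IDs ordered so that every dependency
// appears before the statements that depend on it. deps maps a statement ID
// to the IDs that must run before it. (Hypothetical helper, not library code.)
func orderStatements(deps map[string][]string) ([]string, error) {
	inDegree := make(map[string]int)
	dependents := make(map[string][]string)
	for id, before := range deps {
		if _, ok := inDegree[id]; !ok {
			inDegree[id] = 0
		}
		for _, b := range before {
			if _, ok := inDegree[b]; !ok {
				inDegree[b] = 0
			}
			inDegree[id]++
			dependents[b] = append(dependents[b], id)
		}
	}

	// Start with statements that have no prerequisites; sort for determinism.
	var ready []string
	for id, n := range inDegree {
		if n == 0 {
			ready = append(ready, id)
		}
	}
	sort.Strings(ready)

	var ordered []string
	for len(ready) > 0 {
		id := ready[0]
		ready = ready[1:]
		ordered = append(ordered, id)

		// Emitting id may unlock statements that were waiting on it.
		var unlocked []string
		for _, d := range dependents[id] {
			inDegree[d]--
			if inDegree[d] == 0 {
				unlocked = append(unlocked, d)
			}
		}
		sort.Strings(unlocked)
		ready = append(ready, unlocked...)
	}

	// Any statement left unordered is part of a cycle.
	if len(ordered) != len(inDegree) {
		return nil, fmt.Errorf("dependency cycle detected: ordered %d of %d statements", len(ordered), len(inDegree))
	}
	return ordered, nil
}

func main() {
	deps := map[string][]string{
		"create_table":    nil,
		"create_index":    {"create_table"},
		"add_foreign_key": {"create_table", "create_index"},
	}
	ordered, err := orderStatements(deps)
	if err != nil {
		panic(err)
	}
	fmt.Println(ordered)
}
```

Sorting the ready set keeps the output stable across runs, which matters when generated plans are compared against expected DDL in tests.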
| 46 | + |
| 47 | +### internal/schema/ - Schema Representation |
| 48 | +Core types in `schema.go`: |
| 49 | +- `Schema`: Top-level container for all database objects |
| 50 | +- `Table`: Tables with columns, constraints, policies, triggers |
| 51 | +- `Index`: Index definitions including partial indexes and expressions |
| 52 | +- `Column`, `ForeignKeyConstraint`, `CheckConstraint`, `View`, `Function`, etc. |
| 53 | + |
| 54 | +### internal/queries/ - Database Queries |
| 55 | +Uses **sqlc** for type-safe SQL queries. To modify: |
| 56 | +1. Edit `queries.sql` |
| 57 | +2. Run `make sqlc` to regenerate `queries.sql.go` |
| 58 | + |
| 59 | +## Development Commands |
| 60 | + |
| 61 | +```bash |
| 62 | +# Run all tests (requires Docker or local Postgres) |
| 63 | +go test -v -race ./... -timeout 30m |
| 64 | + |
| 65 | +# Run specific acceptance tests |
| 66 | +go test -v ./internal/migration_acceptance_tests/... -run TestIndexAcceptance |
| 67 | + |
| 68 | +# Lint |
| 69 | +make lint |
| 70 | + |
| 71 | +# Fix lint issues |
| 72 | +make lint_fix |
| 73 | + |
| 74 | +# Regenerate sqlc code |
| 75 | +make sqlc |
| 76 | + |
| 77 | +# Tidy dependencies |
| 78 | +make go_mod_tidy |
| 79 | +``` |
| 80 | + |
| 81 | +## Testing |
| 82 | + |
| 83 | +### Acceptance Tests |
| 84 | +Located in `internal/migration_acceptance_tests/`. Each test file covers specific features: |
| 85 | +- `index_cases_test.go`: Index operations |
| 86 | +- `table_cases_test.go`: Table operations |
| 87 | +- `column_cases_test.go`: Column operations |
| 88 | +- `check_constraint_cases_test.go`, `foreign_key_constraint_cases_test.go`: Constraints |
| 89 | +- `view_cases_test.go`, `function_cases_test.go`, `trigger_cases_test.go`, etc. |
| 90 | + |
| 91 | +Test case structure: |
| 92 | +```go |
| 93 | +acceptanceTestCase{ |
| 94 | + name: "test name", |
| 95 | + oldSchemaDDL: []string{"CREATE TABLE ..."}, |
| 96 | + newSchemaDDL: []string{"CREATE TABLE ... (modified)"}, |
| 97 | + expectedHazardTypes: []diff.MigrationHazardType{...}, |
| 98 | + expectedPlanDDL: []string{"ALTER TABLE ..."}, // optional: assert exact DDL |
| 99 | + expectEmptyPlan: false, // optional: assert no changes |
| 100 | + planOpts: []diff.PlanOpt{...}, // optional: custom plan options |
| 101 | +} |
| 102 | +``` |
| 103 | + |
| 104 | +### Running Tests with Docker |
| 105 | +```bash |
| 106 | +docker build -f build/Dockerfile.test --build-arg PG_MAJOR=15 -t pg-schema-diff-test . |
| 107 | +docker run pg-schema-diff-test |
| 108 | +``` |
| 109 | + |
| 110 | +## Key Concepts |
| 111 | + |
| 112 | +### Migration Hazards |
| 113 | +Operations are flagged with hazard types: |
| 114 | +- `MigrationHazardTypeAcquiresAccessExclusiveLock`: Full table lock |
| 115 | +- `MigrationHazardTypeDeletesData`: Potential data loss |
| 116 | +- `MigrationHazardTypeIndexBuild`: Performance impact during build |
| 117 | +- `MigrationHazardTypeIndexDropped`: Query performance may degrade |
| 118 | +- `MigrationHazardTypeCorrectness`: Potential correctness issues |
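
As a rough illustration of how hazard approval can gate a plan, the sketch below collects the hazard types that appear in a set of statements but were not explicitly allowed. The types `hazardType`, `hazard`, and `statement` are simplified stand-ins defined for this example, not the definitions in `pkg/diff`, and `unapprovedHazards` is a hypothetical helper. The hazard strings are the two shown with `--allow-hazards` in the CLI usage.

```go
package main

import (
	"fmt"
	"sort"
)

// Stand-in types for this sketch; they are not the library's definitions.
type hazardType string

type hazard struct {
	Type    hazardType
	Message string
}

type statement struct {
	DDL     string
	Hazards []hazard
}

// unapprovedHazards returns the sorted, de-duplicated hazard types that occur
// in stmts but are missing from allowed. (Hypothetical helper.)
func unapprovedHazards(stmts []statement, allowed []hazardType) []hazardType {
	allowedSet := make(map[hazardType]bool, len(allowed))
	for _, a := range allowed {
		allowedSet[a] = true
	}

	seen := make(map[hazardType]bool)
	var missing []hazardType
	for _, s := range stmts {
		for _, h := range s.Hazards {
			if !allowedSet[h.Type] && !seen[h.Type] {
				seen[h.Type] = true
				missing = append(missing, h.Type)
			}
		}
	}
	sort.Slice(missing, func(i, j int) bool { return missing[i] < missing[j] })
	return missing
}

func main() {
	stmts := []statement{
		{
			DDL:     "CREATE INDEX CONCURRENTLY foo_idx ON foo (bar)",
			Hazards: []hazard{{Type: "INDEX_BUILD", Message: "may slow the database during the build"}},
		},
		{
			DDL:     "ALTER TABLE foo ALTER COLUMN bar TYPE BIGINT",
			Hazards: []hazard{{Type: "ACQUIRES_ACCESS_EXCLUSIVE_LOCK", Message: "blocks reads and writes"}},
		},
	}
	missing := unapprovedHazards(stmts, []hazardType{"INDEX_BUILD"})
	if len(missing) > 0 {
		fmt.Println("refusing to apply; unapproved hazards:", missing)
	}
}
```

A caller would typically refuse to apply the plan whenever the returned slice is non-empty.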
| 119 | + |
| 120 | +### Plan and Statements |
| 121 | +```go |
| 122 | +type Plan struct { |
| 123 | + Statements []Statement |
| 124 | + CurrentSchemaHash string // For validation before applying |
| 125 | +} |
| 126 | + |
| 127 | +type Statement struct { |
| 128 | + DDL string // SQL to execute |
| 129 | + Timeout time.Duration // statement_timeout |
| 130 | + LockTimeout time.Duration // lock_timeout |
| 131 | + Hazards []MigrationHazard |
| 132 | +} |
| 133 | +``` |
| 134 | + |
| 135 | +### Online Migration Techniques |
| 136 | +- **Concurrent Index Building**: `CREATE INDEX CONCURRENTLY` |
| 137 | +- **Online Index Replacement**: Rename old, build new concurrently, drop old |
| 138 | +- **Online NOT NULL**: Uses check constraints temporarily |
| 139 | +- **Online Constraint Validation**: Add as `NOT VALID`, validate separately |
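
The online NOT NULL technique can be sketched as a four-statement sequence: add a `CHECK (col IS NOT NULL)` constraint as `NOT VALID`, validate it separately, set `NOT NULL` (on Postgres 12 and later the validated check constraint lets this step skip a full table scan), then drop the temporary constraint. The helper below only builds the SQL strings. The function names and the constraint naming scheme are assumptions for illustration, and the statements pg-schema-diff actually emits may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteIdent double-quotes a Postgres identifier, escaping embedded quotes.
func quoteIdent(s string) string {
	return `"` + strings.ReplaceAll(s, `"`, `""`) + `"`
}

// onlineNotNullStatements returns an illustrative statement sequence for
// adding NOT NULL to a column without a long exclusive lock.
// (Hypothetical helper; the constraint name is an assumption.)
func onlineNotNullStatements(table, column string) []string {
	t := quoteIdent(table)
	c := quoteIdent(column)
	chk := quoteIdent(table + "_" + column + "_not_null_check")
	return []string{
		// 1. Add the check without scanning existing rows.
		fmt.Sprintf("ALTER TABLE %s ADD CONSTRAINT %s CHECK (%s IS NOT NULL) NOT VALID", t, chk, c),
		// 2. Validate existing rows under a weaker lock.
		fmt.Sprintf("ALTER TABLE %s VALIDATE CONSTRAINT %s", t, chk),
		// 3. Set NOT NULL; the validated check proves no NULLs exist.
		fmt.Sprintf("ALTER TABLE %s ALTER COLUMN %s SET NOT NULL", t, c),
		// 4. Drop the now-redundant temporary check.
		fmt.Sprintf("ALTER TABLE %s DROP CONSTRAINT %s", t, chk),
	}
}

func main() {
	for _, stmt := range onlineNotNullStatements("users", "email") {
		fmt.Println(stmt + ";")
	}
}
```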
| 140 | + |
| 141 | +## CLI Usage |
| 142 | + |
| 143 | +```bash |
| 144 | +# Generate migration plan (from database to DDL files) |
| 145 | +pg-schema-diff plan \ |
| 146 | + --from-dsn "postgres://user:pass@localhost:5432/mydb" \ |
| 147 | + --to-dir ./schema |
| 148 | + |
| 149 | +# Generate plan between two databases |
| 150 | +pg-schema-diff plan \ |
| 151 | + --from-dsn "postgres://..." \ |
| 152 | + --to-dsn "postgres://..." |
| 153 | + |
| 154 | +# Apply migration (requires hazard approval) |
| 155 | +pg-schema-diff apply \ |
| 156 | + --from-dsn "postgres://user:pass@localhost:5432/mydb" \ |
| 157 | + --to-dir ./schema \ |
| 158 | + --allow-hazards INDEX_BUILD,ACQUIRES_ACCESS_EXCLUSIVE_LOCK |
| 159 | + |
| 160 | +# Output formats: sql (default), json, pretty |
| 161 | +pg-schema-diff plan --from-dsn "..." --to-dir ./schema --output-format json |
| 162 | +``` |
| 163 | + |
| 164 | +## Library Usage |
| 165 | + |
| 166 | +```go |
| 167 | +import ( |
| 168 | + "github.com/stripe/pg-schema-diff/pkg/diff" |
| 169 | + "github.com/stripe/pg-schema-diff/pkg/tempdb" |
| 170 | +) |
| 171 | + |
| 172 | +// Create temp database factory for plan validation |
| 173 | +tempDbFactory, _ := tempdb.NewOnInstanceFactory(ctx, func(ctx context.Context, dbName string) (*sql.DB, error) { |
| 174 | + return sql.Open("postgres", fmt.Sprintf(".../%s", dbName)) |
| 175 | +}) |
| 176 | + |
| 177 | +// Define schema sources |
| 178 | +currentSchema := diff.DBSchemaSource(db) // db is *sql.DB or sqldb.Queryable |
| 179 | +targetSchema, _ := diff.DirSchemaSource([]string{"./schema"}) // returns (SchemaSource, error) |
| 180 | + |
| 181 | +// Generate plan |
| 182 | +plan, _ := diff.Generate(ctx, currentSchema, targetSchema, |
| 183 | + diff.WithTempDbFactory(tempDbFactory), |
| 184 | +) |
| 185 | + |
| 186 | +// Apply statements (set timeouts before each statement) |
| 187 | +for _, stmt := range plan.Statements { |
| 188 | + conn.ExecContext(ctx, fmt.Sprintf("SET SESSION statement_timeout = %d", stmt.Timeout.Milliseconds())) |
| 189 | + conn.ExecContext(ctx, fmt.Sprintf("SET SESSION lock_timeout = %d", stmt.LockTimeout.Milliseconds())) |
| 190 | + conn.ExecContext(ctx, stmt.ToSQL()) |
| 191 | +} |
| 192 | +``` |
| 193 | + |
| 194 | +## Code Patterns |
| 195 | + |
| 196 | +### Adding New Schema Object Support |
| 197 | +1. Add type to `internal/schema/schema.go` |
| 198 | +2. Add query to `internal/queries/queries.sql`, run `make sqlc` |
| 199 | +3. Update schema fetching logic, schema structs, and tests in `internal/schema` |
| 200 | +4. Add diffing logic in `pkg/diff/diff.go` |
| 201 | +5. Add SQL generation logic in `pkg/diff/x_sql_generator.go` |
| 202 | +6. Add acceptance tests in `internal/migration_acceptance_tests/` |
| 203 | + |
| 204 | +### Error Handling |
| 205 | +Use `fmt.Errorf` with `%w` for error wrapping. Functions return `error` as last return value. |
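
A minimal sketch of this convention follows. The names `errSchemaNotFound`, `loadSchema`, `planFor`, and `isNotFound` are hypothetical and exist only for this example. The point is that wrapping with `%w` adds context while keeping the original error discoverable via `errors.Is`.

```go
package main

import (
	"errors"
	"fmt"
)

// errSchemaNotFound is a sentinel error used only in this sketch.
var errSchemaNotFound = errors.New("schema not found")

// loadSchema returns the error as its last return value, per the convention.
func loadSchema(name string) (string, error) {
	if name == "" {
		return "", errSchemaNotFound
	}
	return "CREATE TABLE " + name + " (id INT)", nil
}

// planFor wraps the underlying error with %w so callers keep the full chain.
func planFor(name string) (string, error) {
	ddl, err := loadSchema(name)
	if err != nil {
		return "", fmt.Errorf("loading schema %q: %w", name, err)
	}
	return ddl, nil
}

// isNotFound reports whether err wraps errSchemaNotFound anywhere in its chain.
func isNotFound(err error) bool {
	return errors.Is(err, errSchemaNotFound)
}

func main() {
	_, err := planFor("")
	fmt.Println(err)
	fmt.Println(isNotFound(err))
}
```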
| 206 | + |
| 207 | +### Testing Conventions |
| 208 | +- Use `testify/assert` and `testify/require` |
| 209 | +- Acceptance tests use shared Postgres via `pgengine` |
| 210 | +- Test cases are typically table-driven |