Skip to content

Conversation

@noetica-cloud
Copy link

@noetica-cloud noetica-cloud commented Aug 16, 2025

Concept

Checksum validation feature can help developers and teams be aware of wrong use of database schema migration tool by displaying or raising error when an applied migration file's content has changed since application.

Supported modes are:

  • NONE: Disable checksum validation (default).
  • LENIENT: Warn if a migration file has changed after being applied.
  • STRICT: Fail if a migration file has changed after being applied.

Why?

This feature is inspired from Liquibase features. I used Liquibase for years and I had some complains about it. I think dbmate is a simplier powerful alternative to it by removing complex abstraction layer and lightening the execution.

But one of cool features I found in Liquibase was the explicit wrong use of database schema migrations by performing a "checksum" validation, hashing the changeset content each time and comparing the resulting hash to corresponding persisted applied changeset's one.

This feature helps ensure that applied migrations remain unchanged, improving database integrity and team collaboration by explicitly raising bad practices using database schema migration tool.

What's changed

This change adds optional checksum validation for migrations.

  • New env var: DBMATE_CHECKSUM_MODE and CLI flag --checksum-mode with values NONE (default), LENIENT, STRICT.
  • dbmate add checksum column if it is not existing yet in migrations table (default schema_migrations).
  • The checksum stored is SHA-256 (hex) of the migration contents.
  • On startup, dbmate will validate applied migrations that have a stored checksum. On mismatch:
    • STRICT: migration run fails with an error.
    • LENIENT: mismatch is logged as a WARNING but migration proceeds.
    • NONE: no checks performed (default).

Past migrations will have NULL checksum in TABLE, the codebase should be sufficiently resilient to ignore past migration checksum validation.

Design notes

  • I intentionally consider version as unique file, and choose to simpliest hashing concept : hashing the overall file contents.
  • This is opt-in only: users must define a checksum mode to enable behavior; no change occurs by default except adding checksum column to migrations table.

PR notes

This is my first open-source project's pull request, as well as first Golang project. So please be strict in code review and explanatory in comments. I will do my best to fix what it have to be fixed.

I checked the "retro-compatibility" (column creation on already existing migrations table) on classic SQL Database engines and clickhouse but I couldn't check for BigQuery apparently due to emulator limitations or something I don't understand: After adding column (/field) to table (/schema), the updated metadata looks good but the next select fail saying the field "checksum" does not exist (yet?).

If someone knows more about BigQuery architecture or have an idea, please let me know.

@noetica-cloud noetica-cloud force-pushed the feat/checksum branch 2 times, most recently from 6f50b23 to 89fdceb Compare August 16, 2025 16:12
@noetica-cloud noetica-cloud marked this pull request as draft August 16, 2025 16:13
@noetica-cloud noetica-cloud force-pushed the feat/checksum branch 2 times, most recently from c9d6225 to a1496f2 Compare August 16, 2025 17:35
@noetica-cloud noetica-cloud marked this pull request as ready for review August 16, 2025 18:06
Copy link

@beanow-at-crabnebula beanow-at-crabnebula left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👏 I was missing this functionality as well.
Not a maintainer, but two points that stood out.

README.md Outdated
Comment on lines 618 to 620
- `NONE`: Disable checksum validation (default).
- `LENIENT`: Warn if a migration file has changed after being applied.
- `STRICT`: Fail if a migration file has changed after being applied.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- `NONE`: Disable checksum validation (default).
- `LENIENT`: Warn if a migration file has changed after being applied.
- `STRICT`: Fail if a migration file has changed after being applied.
- `NONE`: Disable checksum validation (default).
- `WARN`: Warn if a migration file has changed after being applied.
- `FAIL`: Fail if a migration file has changed after being applied.

Perhaps more self-explanatory.
Also, I would suggest WARN is a good default that strikes a balance between backwards-compatible behavior and helping people avoid this footgun.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can assume that it is more self-explanatory! My first intent was to describe the "checksum strictness" rather than the direct result of the checksum validation 👍
Maybe I will let maintainers judge what they want in repository


func (drv *Driver) HasChecksumColumn(db *sql.DB) (bool, error) {
exists := false
err := db.QueryRow(fmt.Sprintf("SELECT 1 FROM system.columns WHERE database = '%s' AND table = '%s' AND name = 'checksum'", drv.databaseName(), drv.quotedMigrationsTableName())).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keep in mind that reading system tables may require additional privileges when you're using access control.
It may be worth documenting, or considering alternative approaches.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A quick look at how the grafana datasource does this for enumerating fields in the query builder UI.
https://github.com/grafana/clickhouse-datasource/blob/c0fb25dce3b7d9888dba574de9d73e25f23684c8/src/data/CHDatasource.ts#L561
It uses the DESCRIBE TABLE statement.
But this does not allow filtering by column name.

Instead we could SHOW COLUMNS FROM "%s"."%s" WHERE field = 'checksum'

As far as privileges goes, this requires GRANT SHOW COLUMNS on the migration table.
Rather than GRANT SELECT system.columns.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes good point, I forgot this point!
I've never work with Clickhouse, I will try to test and fix that.
Thank you for the hint 🙏

@noetica-cloud noetica-cloud force-pushed the feat/checksum branch 2 times, most recently from b0add98 to 059b5d8 Compare October 4, 2025 08:01
@noetica-cloud
Copy link
Author

blocked by #679

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants