-
Notifications
You must be signed in to change notification settings - Fork 339
✨ feat: add checksum validation #672
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
6f50b23 to
89fdceb
Compare
c9d6225 to
a1496f2
Compare
beanow-at-crabnebula
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👏 I was missing this functionality as well.
Not a maintainer, but two points that stood out.
README.md
Outdated
| - `NONE`: Disable checksum validation (default). | ||
| - `LENIENT`: Warn if a migration file has changed after being applied. | ||
| - `STRICT`: Fail if a migration file has changed after being applied. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| - `NONE`: Disable checksum validation (default). | |
| - `LENIENT`: Warn if a migration file has changed after being applied. | |
| - `STRICT`: Fail if a migration file has changed after being applied. | |
| - `NONE`: Disable checksum validation (default). | |
| - `WARN`: Warn if a migration file has changed after being applied. | |
| - `FAIL`: Fail if a migration file has changed after being applied. |
Perhaps more self-explanatory.
Also, I would suggest WARN is a good default that strikes a balance between backwards-compatible behavior and helping people avoid this footgun.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can assume that it is more self-explanatory! My first intent was to describe the "checksum strictness" rather than the direct result of the checksum validation 👍
Maybe I will let maintainers judge what they want in repository
pkg/driver/clickhouse/clickhouse.go
Outdated
|
|
||
| func (drv *Driver) HasChecksumColumn(db *sql.DB) (bool, error) { | ||
| exists := false | ||
| err := db.QueryRow(fmt.Sprintf("SELECT 1 FROM system.columns WHERE database = '%s' AND table = '%s' AND name = 'checksum'", drv.databaseName(), drv.quotedMigrationsTableName())). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Keep in mind that reading system tables may require additional privileges when you're using access control.
It may be worth documenting, or considering alternative approaches.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A quick look at how the grafana datasource does this for enumerating fields in the query builder UI.
https://github.com/grafana/clickhouse-datasource/blob/c0fb25dce3b7d9888dba574de9d73e25f23684c8/src/data/CHDatasource.ts#L561
It uses the DESCRIBE TABLE statement.
But this does not allow filtering by column name.
Instead we could SHOW COLUMNS FROM "%s"."%s" WHERE field = 'checksum'
As far as privileges goes, this requires GRANT SHOW COLUMNS on the migration table.
Rather than GRANT SELECT system.columns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes good point, I forgot this point!
I've never work with Clickhouse, I will try to test and fix that.
Thank you for the hint 🙏
b0add98 to
059b5d8
Compare
|
blocked by #679 |
84b471b to
2800ac6
Compare
Concept
Checksum validation feature can help developers and teams be aware of wrong use of database schema migration tool by displaying or raising error when an applied migration file's content has changed since application.
Supported modes are:
NONE: Disable checksum validation (default).LENIENT: Warn if a migration file has changed after being applied.STRICT: Fail if a migration file has changed after being applied.Why?
This feature is inspired from Liquibase features. I used Liquibase for years and I had some complains about it. I think dbmate is a simplier powerful alternative to it by removing complex abstraction layer and lightening the execution.
But one of cool features I found in Liquibase was the explicit wrong use of database schema migrations by performing a "checksum" validation, hashing the changeset content each time and comparing the resulting hash to corresponding persisted applied changeset's one.
This feature helps ensure that applied migrations remain unchanged, improving database integrity and team collaboration by explicitly raising bad practices using database schema migration tool.
What's changed
This change adds optional checksum validation for migrations.
DBMATE_CHECKSUM_MODEand CLI flag--checksum-modewith values NONE (default), LENIENT, STRICT.schema_migrations).Past migrations will have NULL checksum in TABLE, the codebase should be sufficiently resilient to ignore past migration checksum validation.
Design notes
PR notes
This is my first open-source project's pull request, as well as first Golang project. So please be strict in code review and explanatory in comments. I will do my best to fix what it have to be fixed.
I checked the "retro-compatibility" (column creation on already existing migrations table) on classic SQL Database engines and clickhouse but I couldn't check for BigQuery apparently due to emulator limitations or something I don't understand: After adding column (/field) to table (/schema), the updated metadata looks good but the next select fail saying the field "checksum" does not exist (yet?).
If someone knows more about BigQuery architecture or have an idea, please let me know.