Conversation

jcnelson (Member)

This is an early draft of #6545 and #6546. It works by replacing the macro-generated FromSql and ToSql implementations for byte-array types with an alternative SqlEncoded trait, which attempts to decode stored bytes as either the current encoding (hex and JSON strings) or the new encoding (SIP003). Most of the line noise here comes from the fact that the affected types no longer have a &dyn ToSql cast available to them, and must instead rely on an explicit .sql_encoded(...) or .sqlhex() call (the latter of which wraps .sql_encoded(...)).
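For concreteness, here is a minimal sketch of what an SqlEncoded trait along these lines could look like. The Encoding enum, the Value-based signatures, and the method set are assumptions for illustration, not the PR's actual API:

```rust
use rusqlite::types::Value;

#[derive(Clone, Copy)]
pub enum Encoding {
    /// Legacy representation: lowercase hex (or JSON) strings.
    Hex,
    /// New representation: SIP003 binary, prefixed with a sentinel byte.
    Sip003,
}

pub trait SqlEncoded: Sized {
    /// Encode `self` for storage under the given encoding.
    fn sql_encoded(&self, enc: Encoding) -> Value;

    /// Convenience wrapper that always emits the legacy hex encoding.
    fn sqlhex(&self) -> Value {
        self.sql_encoded(Encoding::Hex)
    }

    /// Decode a stored value, accepting either encoding.
    fn sql_decode(raw: &[u8]) -> Result<Self, String>;
}
```

A call site that previously passed `&foo as &dyn ToSql` would then pass `foo.sqlhex()` (or `foo.sql_encoded(enc)`) instead, since `rusqlite::types::Value` itself implements `ToSql`.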

Feel free to poke around. I'm still working on fixing the encodings for the non-byte-array types (mainly things we store as JSON) so we can avoid wasting space there. I executed #6546 concurrently with #6545 in order to convince myself that #6545 was adequately addressed.

…ommon types that would be candidates for a reencoding from hex or JSON to a binary representation
…termine how to read and write the DB. Also, switch away from get_unwrap() to get()
…ec, in order to make reencoding easier and unambiguous
…SON into binary. Add a macro to facilitate generating the code to perform the conversions, and add a macro to synthesize tests for individual type reencoding. Also, add unit test coverage for each affected table.
…f the macro-implemented `FromSql` and `ToSql` to preserve the existing behavior.
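As a hedged illustration of the conversion-generating macro mentioned in the commits above, such a macro might stamp out an SqlEncoded impl for each fixed-width byte-array newtype. The macro name, the 0x00 sentinel, and the trait shape are assumptions carried over from the sketch above, not the PR's actual code:

```rust
// Hypothetical: generates an `SqlEncoded` impl (as sketched above) for a
// newtype over a fixed-width byte array, e.g. `struct BlockHeaderHash([u8; 32])`.
macro_rules! impl_sql_encoded {
    ($T:ident, $LEN:expr) => {
        impl SqlEncoded for $T {
            fn sql_encoded(&self, enc: Encoding) -> rusqlite::types::Value {
                match enc {
                    // Legacy: lowercase hex string, as the DB stores today.
                    Encoding::Hex => rusqlite::types::Value::Text(
                        self.0.iter().map(|b| format!("{:02x}", b)).collect(),
                    ),
                    // New: sentinel byte + raw bytes, roughly half the size.
                    Encoding::Sip003 => {
                        let mut bytes = Vec::with_capacity(1 + $LEN);
                        bytes.push(0x00); // assumed non-printable sentinel
                        bytes.extend_from_slice(&self.0);
                        rusqlite::types::Value::Blob(bytes)
                    }
                }
            }

            fn sql_decode(raw: &[u8]) -> Result<Self, String> {
                // Defers to a shared helper that accepts either encoding
                // (see the `decode_either` sketch later in this thread).
                decode_either::<$LEN>(raw).map($T)
            }
        }
    };
}

// Usage (illustrative): impl_sql_encoded!(BlockHeaderHash, 32);
```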
@aaronb-stacks

It's hard to tell from the current implementation in this PR whether getting rid of ToSql is the right thing to do, because as of now the PR seemingly doesn't change the encoding at all (all of the types continue to output hex strings).

But based on this PR, I do have some early feedback and questions. My prior assumption was that the "migration path" would need to look more heterogeneous: basically, because SQLite always lets you store a blob, an upgraded node would store all new values as byte-string blobs whether or not a migration has occurred. These would live alongside values stored as hex strings. This would not be acceptable for fields used as lookup keys, because at lookup time you'd need to know whether the key is encoded as hex or not. But it would work fine for "values" (the FromSql impl in this PR would be happy to parse any encoding).

I'll admit that such a heterogeneous DB offends my sensibilities, but having to do a full migration first seems like too high a bar (even if we let people do it whenever they want). My guess is that a migration would take multiple hours, which would definitely put node operators in a tough spot (how would they do this without experiencing downtime? would they have to snapshot their chainstate, run the migration on a separate machine, and then restart with their hours-old snapshot?). If you think the migrations are faster than that, this may work, but if the migrations must occur to get the functionality, I'm not sure that doing them offline helps much. From the perspective of a node operator, this is something that would ideally be handled "in the background": basically, run with a heterogeneous DB and periodically migrate sections of the DB until the whole DB has been migrated.

@jcnelson (Member, Author)

> But based on this PR, I do have some early feedback and questions. My prior assumption was that the "migration path" would need to look more heterogeneous: basically, because SQLite always lets you store a blob, an upgraded node would store all new values as byte-string blobs whether or not a migration has occurred. These would live alongside values stored as hex strings. This would not be acceptable for fields used as lookup keys, because at lookup time you'd need to know whether the key is encoded as hex or not. But it would work fine for "values" (the FromSql impl in this PR would be happy to parse any encoding).

Fortunately, this is not how I intend for this to work. The migration rollout would be incremental -- possibly across multiple node releases! -- but incremental at the granularity of an entire database. This PR provides the migration logic for the sortition DB, and follow-on PRs would migrate the other DBs.

The migration itself would be time-consuming -- you not only have to convert essentially every column of every row to the new encoding, but you also have to vacuum the DB. The reason the code offers two ways to decode a column value is that, unlike before, we aren't going to force users to migrate the sortition DB right away. Instead, we'll let the user choose when to migrate this DB (and others) via an explicit command. This way, the user can schedule the migration themselves -- for example, they can take a snapshot of their node's chainstate, migrate that chainstate, and then stop their node and restart it on the migrated chainstate while retaining the old chainstate as a backup.
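To make the explicit-command flow concrete, here is a hedged sketch of what one per-column reencoding pass could look like with rusqlite. The table and column names (snapshots, block_header_hash) are illustrative, not the sortition DB's actual schema, and the sketch assumes well-formed, even-length hex input:

```rust
use rusqlite::{params, Connection};

/// Hypothetical one-table migration pass: rewrite every legacy hex value as a
/// sentinel-prefixed binary blob, then vacuum to reclaim the freed space.
fn migrate_column(conn: &mut Connection) -> rusqlite::Result<()> {
    let tx = conn.transaction()?;
    // Read all (rowid, legacy hex string) pairs up front.
    let rows: Vec<(i64, String)> = {
        let mut stmt = tx.prepare("SELECT rowid, block_header_hash FROM snapshots")?;
        let mapped = stmt.query_map([], |row| Ok((row.get(0)?, row.get(1)?)))?;
        mapped.collect::<Result<_, _>>()?
    };
    for (rowid, hex) in rows {
        // Reencode: assumed 0x00 sentinel followed by the raw bytes.
        let mut blob = vec![0x00u8];
        blob.extend(
            (0..hex.len())
                .step_by(2)
                .filter_map(|i| u8::from_str_radix(&hex[i..i + 2], 16).ok()),
        );
        tx.execute(
            "UPDATE snapshots SET block_header_hash = ?1 WHERE rowid = ?2",
            params![blob, rowid],
        )?;
    }
    tx.commit()?;
    // VACUUM cannot run inside a transaction, so it happens after the commit.
    conn.execute_batch("VACUUM")
}
```

In practice a pass like this would run once per table and column, driven by the explicit migration command, with the original chainstate kept around as the backup described above.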

This migration path is what required me to get rid of the FromSql and ToSql implementations in favor of SqlEncoded -- the node has to be able to work with both the legacy encoding and the new encoding. Right now, the code prefixes a SIP003-encoded column value with a non-printable byte to ensure that it will fail to decode as a hex string; the intent is not to try to store both hex and SIP003 blobs in the same column.
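A minimal sketch of that two-way decode, assuming a 0x00 sentinel (the comment above only promises "a non-printable byte") and a const-generic width; the helper name decode_either is hypothetical:

```rust
/// Hypothetical shared helper: classify a stored value by its first byte and
/// decode under whichever encoding it carries.
fn decode_either<const N: usize>(raw: &[u8]) -> Result<[u8; N], String> {
    match raw.first() {
        // New encoding: sentinel byte followed by exactly N raw bytes.
        Some(&0x00) => raw[1..]
            .try_into()
            .map_err(|_| format!("expected {} SIP003 bytes", N)),
        // Legacy encoding: a hex string, which never begins with a
        // non-printable byte, so the two cases cannot collide.
        Some(_) => {
            let s = std::str::from_utf8(raw).map_err(|e| e.to_string())?;
            if s.len() != 2 * N {
                return Err(format!("expected {} hex digits", 2 * N));
            }
            let mut out = [0u8; N];
            for (i, byte) in out.iter_mut().enumerate() {
                *byte = u8::from_str_radix(&s[2 * i..2 * i + 2], 16)
                    .map_err(|e| e.to_string())?;
            }
            Ok(out)
        }
        None => Err("empty column value".to_string()),
    }
}
```

A decode path built over a helper like this would let reads work unmodified against both pre- and post-migration databases.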

> I'll admit that such a heterogeneous DB offends my sensibilities, but having to do a full migration first seems like too high a bar (even if we let people do it whenever they want). My guess is that a migration would take multiple hours, which would definitely put node operators in a tough spot (how would they do this without experiencing downtime? would they have to snapshot their chainstate, run the migration on a separate machine, and then restart with their hours-old snapshot?). If you think the migrations are faster than that, this may work, but if the migrations must occur to get the functionality, I'm not sure that doing them offline helps much. From the perspective of a node operator, this is something that would ideally be handled "in the background": basically, run with a heterogeneous DB and periodically migrate sections of the DB until the whole DB has been migrated.

Oh, it offends my sensibilities as well :) Fortunately that's not what we're doing here. We're going for a less-offensive (but not totally inoffensive) migration path of reencoding each database separately, via different PRs and possibly different releases.

I can look into adding a "background migration" option, but that's pretty risky as well. We'd need to add some kind of chainstate-wide checkpointing system that would allow the background migration to produce a consistent snapshot while the chainstate continues to be appended to. There's value in having this at some point -- for taking automatic backups, for example -- but I'm not sure yet whether or not the migration times will require us to add it earlier rather than later.
