Remove serde json dependency from chain crate #1752
Great work! It will be cool if we can test backwards compatibility with an old db, but to do this we need some modifications. We need to be able to get the schema scripts externally. I think we can create a method for each schema version (instead of initializing the |
I should move the schemas to their own methods, but I should still keep the inline initialization from
pub fn init_sqlite_tables(db_tx: &rusqlite::Transaction) -> rusqlite::Result<()> {
    migrate_schema(
        db_tx,
        Self::SCHEMA_NAME,
        &[Self::schema_v0(), Self::schema_v1()],
    )
}
|
@nymius yes, that looks good to me! |
…gesets
We do not want to depend on serde_json. If we keep it around for serializing anchors, we won't be able to remove it in the future because it will always be needed to do migrations. Currently there is only one widely used anchor, ConfirmationBlockTime. The decision was to constrain support to a single anchor type, ConfirmationBlockTime. The anchor table will represent all fields of ConfirmationBlockTime, each one in its own column. The reasons:
- No one is using rusqlite with any other anchor type, and if they are, they can do something custom anyway.
- The anchor representation may change in the future; supporting multiple Anchor types here will cause more problems for migration later on.
AnchorImpl was a wrapper created to allow the implementation of foreign traits, like From/ToJson from serde_json for external unknown structs implementing the Anchor trait. As the Anchor generic in the rusqlite implementation for anchored ChangeSets was the only place where this AnchorImpl was used and it has been fixed to the anchor ConfirmationBlockTime, there is no more reason to keep this wrapper around.
…t<BlockId>
From c49ea85 on, the only struct with a rusqlite implementation is ChangeSet<ConfirmationBlockTime>.
We would like to test backward compatibility of new schemas. To do so, we should be able to apply schemas independently.
Why replace `rusqlite::execute` with `rusqlite::execute_batch`?
- We now need to return the schema statements as Strings, because they are returned from methods. It seemed redundant to keep passing references to something that already references data, i.e., to keep moving around &String instead of String (consider that `rusqlite::execute_batch` takes multiple statements as a single string).
- We were calling `rusqlite::execute` with empty params, so we weren't tied to that method's signature.
- `rusqlite::execute_batch` does the same thing we were doing by applying statements sequentially in a loop.
- The code doesn't lose error expressiveness: `rusqlite::execute_batch` fails with the same errors `rusqlite::execute` does.
BREAKING CHANGE: changes the signature of the public method `migrate_schema`.
Why only a v0 to v1 test and not a general backward compatibility test? It is harder to craft a general compatibility test without prior knowledge of what future schemas will look like. Also, creating a backward compatibility test for each new schema change allows incremental backward compatibility tests to run with better granularity.
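To illustrate the idea of per-version schema methods and independently applicable scripts described in the commit messages above, here is a minimal sketch that uses only the standard library. The type `AnchorSchema`, the table name `example_anchors`, the SQL text, and the helper `pending_scripts` are all hypothetical; only the version-skipping rule mirrors the `migrate_schema` logic quoted in this PR.

```rust
/// Hypothetical stand-in for a type that owns its schema scripts.
struct AnchorSchema;

impl AnchorSchema {
    /// Illustrative v0 schema: the anchor is stored in a single column.
    fn schema_v0() -> String {
        "CREATE TABLE example_anchors ( \
            txid TEXT NOT NULL, \
            block_height INTEGER NOT NULL, \
            block_hash TEXT NOT NULL, \
            anchor BLOB NOT NULL \
        ) STRICT;"
            .to_string()
    }

    /// Illustrative v1 schema: adds a dedicated confirmation_time column.
    fn schema_v1() -> String {
        "ALTER TABLE example_anchors \
            ADD COLUMN confirmation_time INTEGER DEFAULT -1 NOT NULL;"
            .to_string()
    }
}

/// Returns the (version, script) pairs that still have to run, given the
/// version currently recorded in the database (None for a fresh database).
fn pending_scripts<'a>(
    current_version: Option<u32>,
    versioned_scripts: &[&'a str],
) -> Vec<(u32, &'a str)> {
    let exec_from = current_version.map_or(0_usize, |v| v as usize + 1);
    versioned_scripts
        .iter()
        .enumerate()
        .skip(exec_from)
        .map(|(version, script)| (version as u32, *script))
        .collect()
}

fn main() {
    let (v0, v1) = (AnchorSchema::schema_v0(), AnchorSchema::schema_v1());
    let scripts = [v0.as_str(), v1.as_str()];

    // A fresh database runs every script.
    assert_eq!(pending_scripts(None, &scripts).len(), 2);
    // A database created with v0 only runs v1, which is the case a
    // v0 to v1 backward compatibility test wants to exercise.
    let pending = pending_scripts(Some(0), &scripts);
    assert_eq!(pending.len(), 1);
    assert_eq!(pending[0].0, 1);
    // An up-to-date database runs nothing.
    assert!(pending_scripts(Some(1), &scripts).is_empty());
}
```

Because each version is available from its own method, a test can apply only `schema_v0()`, insert old-format rows, and then run the migration, instead of always initializing with every script at once.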
8c70584 to
de28bcd
Compare
Questions for reviewers:
Calling
Code reference
pub fn migrate_schema(
    db_tx: &Transaction,
    schema_name: &str,
    versioned_scripts: &[String],
) -> rusqlite::Result<()> {
    init_schemas_table(db_tx)?;
    let current_version = schema_version(db_tx, schema_name)?;
    let exec_from = current_version.map_or(0_usize, |v| v as usize + 1);
    let scripts_to_exec = versioned_scripts.iter().enumerate().skip(exec_from);
    for (version, script) in scripts_to_exec {
        set_schema_version(db_tx, schema_name, version as u32)?;
        db_tx.execute_batch(script)?;
    }
    Ok(())
}
Changes rationale:
|
…ate_schema
`&str` documents clearly that `migrate_schema` should only read `versioned_scripts`. Also, as the `schema_vN` methods will return `String`, Rust will automatically deref `&String` to `&str`.
BREAKING CHANGE: changes the `versioned_scripts` parameter of the public function `migrate_schema` from type &[String] to &[&str].
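As a small illustration of the `&[&str]` parameter type discussed in this commit message, the sketch below shows how a caller might pass scripts that are produced as `String`. The function `total_len` and the two `schema_vN` functions are hypothetical stand-ins, not the crate's API. A single `&String` coerces to `&str`, but a `&[String]` slice is not itself a `&[&str]`, so this sketch borrows each element explicitly.

```rust
/// Hypothetical read-only consumer with the same parameter shape as the
/// new `versioned_scripts` argument.
fn total_len(versioned_scripts: &[&str]) -> usize {
    versioned_scripts.iter().map(|script| script.len()).sum()
}

// Illustrative schema methods returning owned strings.
fn schema_v0() -> String {
    "CREATE TABLE t (a INTEGER) STRICT;".to_string()
}

fn schema_v1() -> String {
    "ALTER TABLE t ADD COLUMN b INTEGER;".to_string()
}

fn main() {
    let (v0, v1) = (schema_v0(), schema_v1());

    // A single `&String` coerces to `&str` automatically.
    let first: &str = &v0;
    assert_eq!(first.len(), v0.len());

    // For a slice, each element is borrowed as `&str` explicitly.
    let scripts: Vec<&str> = vec![v0.as_str(), v1.as_str()];
    assert_eq!(total_len(&scripts), v0.len() + v1.len());

    // String literals can be passed directly, with no allocation.
    assert_eq!(total_len(&["SELECT 1;"]), 9);
}
```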
de28bcd to
1c81cd5
Compare
@nymius Could you elaborate more on this? Do you mean, instead of applying schemas sequentially, we want to be able to initialize in one go for empty databases? |
evanlinjin
left a comment
ACK 1c81cd5
Yes. In the added test I wanted to:
TLDR: if you don't add the |
If you're saying there is a bug in the schema migration logic please put that in a new PR, if it's a blocker for this PR then we'll need to review and merge it first. But should be easier to review as a stand-alone change. |
|
@notmandatory this isn't a blocker, but I'm not sure if this merits another PR, will continue discussion in another issue if that's OK. |
notmandatory
left a comment
ACK 1c81cd5
Description
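As expressed by @LLFourn: We want to not depend on serde_json. If we keep it around for serializing anchors we won't be able to remove it in the future because it will always be needed to do migrations.
Currently there is only one widely used anchor, ConfirmationBlockTime.
The decision was to constrain support to a single anchor type, ConfirmationBlockTime. The anchor table will represent all fields of ConfirmationBlockTime, each one in its own column.
The reasons:
- No one is using rusqlite with any other anchor type, and if they are, they can do something custom anyway.
- The anchor representation may change in the future; supporting multiple Anchor types here will cause more problems for migration later on.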
Resolves #1695
Notes to the reviewers
Why is the type of the confirmation_time column INTEGER?
By sqlite3 docs:
(Remember confirmation time is u64)
But the anchor table was defined using the STRICT keyword; again, by sqlite docs:
So, the TLDR, with some help of this blog post, is:
-2**63 to (2**63-1)
Why not set confirmation_time as BLOB or NUMERIC?
I don't have a strong opinion on this. INTEGER was the first numeric type I found; later I found NUMERIC, and since the two seemed to behave the same way, I didn't change the type.
I discarded BLOB and ANY at first because they didn't give a clear idea of what was being stored in the column, but in retrospect they seem to remove all the considerations I had to make above, so maybe they are a better fit for the type.
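The INTEGER range noted above (-2**63 to 2**63-1) is narrower than `u64` on the positive side, so a `u64` confirmation time has to be range-checked before it is stored as a signed 64-bit value. The following is a minimal sketch of that check using only the standard library; the function name `confirmation_time_to_sql` is hypothetical and is not taken from the crate.

```rust
use std::convert::TryFrom;
use std::num::TryFromIntError;

/// Converts a u64 confirmation time into the signed 64-bit value an
/// INTEGER column can hold, failing if it exceeds 2**63 - 1.
fn confirmation_time_to_sql(confirmation_time: u64) -> Result<i64, TryFromIntError> {
    i64::try_from(confirmation_time)
}

fn main() {
    // Realistic UNIX timestamps fit comfortably.
    assert_eq!(confirmation_time_to_sql(1_700_000_000), Ok(1_700_000_000_i64));
    // The largest representable value is 2**63 - 1.
    assert_eq!(confirmation_time_to_sql(i64::MAX as u64), Ok(i64::MAX));
    // Anything above that cannot be stored without losing information.
    assert!(confirmation_time_to_sql(i64::MAX as u64 + 1).is_err());
    assert!(confirmation_time_to_sql(u64::MAX).is_err());
}
```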
Why add a default value to the confirmation_time column if the anchor column is constrained to be NOT NULL, so all copied values will be filled?
confirmation_time should have the same constraint as anchor, NOT NULL, but as the new column may be added to tables that already contain rows, you must provide a default value for all the rows you are going to add the new column to. As the extraction of confirmation_time from the JSON blob in anchor cannot be performed in the same step, I had to add this default value.
This loosens the schema of the tables and extends the bug surface it may have, but I'm assuming the application layer will always enforce the addition of a valid confirmation_time.
Why is the default value of the confirmation_time column -1?
Considering the other alternatives were to use the max value, min value or zero, and that confirmation time should always be positive, I considered -1 to be computer and human readable enough to signal that something must be wrong if the ConfirmationBlockTime retrieved when loading the wallet has this value set as the confirmation time.
Why not be STRICT with each statement?
It is a constraint only applicable to tables on creation.
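Returning to the -1 default discussed above: because a valid confirmation time is never negative, a reader of the column can treat -1 as a signal that a row was never filled in. Here is a minimal sketch of such a check, assuming the value is read back as `i64`; the names `DEFAULT_CONFIRMATION_TIME`, `LoadError`, and `confirmation_time_from_sql` are hypothetical, and the PR does not state that the crate performs this check.

```rust
use std::convert::TryFrom;

/// Sentinel used as the column default in this sketch.
const DEFAULT_CONFIRMATION_TIME: i64 = -1;

#[derive(Debug, PartialEq)]
enum LoadError {
    /// The row still holds the column default, so it was never populated.
    Unpopulated,
    /// Any other negative value is not a valid confirmation time.
    Negative(i64),
}

/// Converts a stored INTEGER back into a u64 confirmation time,
/// rejecting the sentinel and any other negative value.
fn confirmation_time_from_sql(raw: i64) -> Result<u64, LoadError> {
    if raw == DEFAULT_CONFIRMATION_TIME {
        return Err(LoadError::Unpopulated);
    }
    u64::try_from(raw).map_err(|_| LoadError::Negative(raw))
}

fn main() {
    assert_eq!(confirmation_time_from_sql(1_700_000_000), Ok(1_700_000_000_u64));
    assert_eq!(confirmation_time_from_sql(-1), Err(LoadError::Unpopulated));
    assert_eq!(confirmation_time_from_sql(-5), Err(LoadError::Negative(-5)));
}
```

Zero would not work as a sentinel in the same way, since it converts to a valid `u64` and would pass silently.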
Why not create a whole new table without the anchor column and with the confirmation_time column, copy the content from one to the other and drop the former table?
Computation cost. I didn't benchmark it, and I don't know how efficient the SQLite engine is under the hood, but at first sight copying a single column seems better than copying four.
Changelog notice
rusqlite implementation only to ChangeSet<ConfirmationBlockTime>.
anchor column in sqlite by confirmation_time.
migrate_schema versioned_script parameter's type to &[&str].
Checklists
All Submissions:
cargo fmt and cargo clippy before committing
New Features:
Bugfixes: