v7 to v9 config migration script #669
base: master
node_management/migration/README.md
Outdated
### Usage Example

alias rpc-v8=~/Documents/share/repo/smr-moonshot-testnet/target/devopt/rpc_node
The node operators use Docker.
Hmm, I am thinking the operator guide will be on the web page, not here. We can update the example here, but we need both a native one and a Docker one.
…plate Sc/upgrade config template
support dump template to toml file in cli
path = "./configs/rpc_archive"
# Whether the database should be pruned. If `true`, data that is more than `epochs_to_retain`
# old will be deleted.
enable_pruning = false
As I mentioned in my comment on the readme, I now think we should set this to `true` for all databases on both nodes.
Our archive RPC should not set it to `true`, so we need to be careful when using `--assume-yes`.
We can set the default to `true`, and make devops aware that it should be kept as `false` when migrating the archive RPC.
Good point. How do you suggest that we ensure the operators activate it then? Tell them not to update the value manually in the release docs?
Operators should always run in prompt mode, because an operator may want to run the RPC node as an archive.
Also, I agree it is better to set the default to `true` if the prune function is now ready for RPC. The validator template is already set to `true`. The update to the RPC template is done in a496619.
It is easier to communicate with our devops than with operators, so it is OK to change the default to `true`, while I need to ask devops to always run `migrate-config` manually for specially denoted nodes (i.e. archives).
Similarly, since `enable_snapshot` defaults to `false`, I also need to tell devops not to use `--assume-yes` for snapshot uploader nodes.
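To make the risk discussed in this thread concrete, here is a minimal sketch of how a single boolean key could be resolved in the two modes. It is not the actual `migrate-config` code: the function name `resolve_value`, its parameters, and the prompt wording are all hypothetical, and it assumes that `--assume-yes` always applies the template default while prompt mode keeps the current value unless the update is confirmed.

```python
from typing import Callable, Optional


def resolve_value(
    key: str,
    current: Optional[bool],
    default: bool,
    assume_yes: bool,
    ask: Callable[[str], str] = input,
) -> bool:
    """Pick the value to write into the migrated config for one boolean key.

    Hypothetical helper: the real migration script may behave differently.
    """
    if assume_yes:
        # Assumed non-interactive behaviour: the template default always wins.
        return default
    if current is None or current == default:
        # Nothing to decide: the key is new, or it already matches the default.
        return default
    answer = ask(f"{key}: update {current} -> {default}? [y/N] ")
    # Only an explicit "y"/"yes" applies the update; anything else keeps the old value.
    return default if answer.strip().lower() in ("y", "yes") else current


# Archive RPC node: pruning is currently off and must stay off.
print(resolve_value("enable_pruning", False, True, assume_yes=True))
print(resolve_value("enable_pruning", False, True, assume_yes=False, ask=lambda _: "n"))
```

Under these assumptions the first call returns `True` (the archive setting is silently overwritten) and the second returns `False` (the operator declined the update), which is why specially denoted nodes should go through the prompt.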
node_management/config_migration/src/rpc_config/rpc_config_v9_1_x_mainnet_template.toml
Outdated
node_management/config_migration/src/smr_settings/smr_settings_v9_1_x_mainnet_template.toml
Outdated
@@ -615,6 +615,12 @@ EOF
if [ "$NETWORK" == "mainnet" ]; then
export AWS_ACCESS_KEY_ID="c64bed98a85ccd3197169bf7363ce94f"
export AWS_SECRET_ACCESS_KEY="0b7f15dbeef4ebe871ee8ce483e3fc8bab97be0da6a362b2c4d80f020cae9df7"

if is_validator; then
These buckets don't exist yet, so we can't make this update just yet. Both types of nodes need to default to the `mainnet-data` bucket for now.
I intended to put in a placeholder, and then ask devops to create the buckets.
Okay, but this script is currently regularly downloaded by the operators, so we won't be able to merge it until the new buckets are in place.
OK, that is fine, we can verify that it works before merging.
# Finished migration, and follow guide that start the node with the new image with new config and sync
# ------

# ./manage_supra_nodes.sh \
Is this code still required?
I intended to keep it in case we need to create a script for the operator use case, i.e. migrate the config only, then run sync.
echo "Migrate cli profile from v8 to v9"
supra-v9 profile migrate
echo "Migrate smr_settings from v7 to v9"
# TODO(SC) to be run in docker context, update path to `./configs/config.toml`
Is this still WIP?
Yes, because packaging the `migrate-config` script into the Docker image is not ready yet.
echo "Migrate db from v7 to v8"
rpc-v8 migrate-db configs/config.toml
echo "Migrate db from v8 to v9"
rpc-v9 migrate-db configs/config.toml
We don't want to run the v9 DB migration because it will take far too long. As with the testnet release, we'll run the migration ourselves, upload the snapshot and then ask the operators to download it.
This script is intended to be used by us, not by operators. In my mind, operators are guided by the user guide to run `migrate-config` and `sync` manually.
The script was intended for us for now. I thought we would provide a user guide for operators to upgrade manually; if needed, we can create a new script for the operator use case.
We don't want to run this migration even on our own nodes. It takes 24+ hours. Syncing the snapshot takes maybe an hour.
We also don't use Docker on our machines.
We need to run it on a pair of RPC + validator snapshot uploader nodes, right?
I remember we also had a few nodes running the Docker image, or is it completely native now?
The Docker image is only used for running the migration and is then removed; it is not used for starting the nodes.
If the Docker image is less performant for running the migration, we can change it to use a native build.
We can also rename the script to make clear that it is for Supra devops and for the first step only. We can then create a script for the operator use case later.
# Localnet only (Optional: local env path is different from docker env path, need to be modified to use docker env path)
sed -i "" "s#${HOST_SUPRA_HOME}#configs#g" ${HOST_SUPRA_HOME}/smr_settings.toml
echo "Migrate db from v7 to v9"
supra-v9 data migrate -p configs/smr_settings.toml
Let's not do this either. We'll migrate the database of one of our validators, then upload a snapshot for the operators to sync from. I'm concerned that anyone who has synced the snapshot in mainnet will end up having to migrate the full history of the chain store due to the RPC node's version of the database not having any entries in the prune index for historical values.
As explained above regarding the purpose of the script.
Same response. Migrating will take too long if the full database has to be migrated.
I guess it is the name of the script that makes this confusing, or the script should not be put in this repo.
The script was intended for upgrading the snapshot uploader nodes, not all nodes. We can add a script for the operator use case and test it on a pair of RPC + validator nodes on mainnet after the snapshot uploaders are done.
After that, we can provide the verified operator-specific script to operators.
…1_x_mainnet_template.toml Co-authored-by: Isaac Doidge <[email protected]>
Requirements
We need the devops team to package this `migrate-config` Python script into the rpc_node and supra version 9 Docker images for mainnet, so that operators do not need to install it locally. @darpan-supraoracles
We need to have snapshot buckets created for the mainnet RPC and validator.
Note that the AWS access key may need to be updated in the script.
The `--assume-yes` option applies the default migration values without prompting. It is the most common use case for regular nodes, and it makes it possible to batch process the config migration for our foundation nodes.
IMPORTANT note for DEVOPS
For specially denoted nodes (i.e. archives, snapshot uploaders), use the interactive prompt instead:
Do not use `--assume-yes` for these nodes, because `enable_snapshots` and `enable_pruning` should retain their original values instead of taking the defaults.
Run the script in interactive prompt mode, confirm each update, and check the final output of the migration.
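As a rough illustration of the devops note, the sketch below splits a node inventory into nodes that can be migrated in a batch with `--assume-yes` and nodes that must be migrated interactively. The role labels, node names, and function names are hypothetical; the PR only says that archives and snapshot uploaders are the special cases.

```python
from __future__ import annotations

# Hypothetical role labels for nodes that must not be migrated with --assume-yes.
INTERACTIVE_ONLY_ROLES = {"archive", "snapshot_uploader"}


def may_use_assume_yes(roles: set[str]) -> bool:
    """Return True when none of the node's roles requires the interactive prompt."""
    return not (roles & INTERACTIVE_ONLY_ROLES)


def split_nodes(nodes: dict[str, set[str]]) -> tuple[list[str], list[str]]:
    """Split nodes into (batch, interactive) lists, each sorted by node name."""
    batch: list[str] = []
    interactive: list[str] = []
    for name, roles in sorted(nodes.items()):
        (batch if may_use_assume_yes(roles) else interactive).append(name)
    return batch, interactive


# Made-up inventory, for illustration only.
inventory = {
    "rpc-1": set(),
    "rpc-archive": {"archive"},
    "validator-1": {"snapshot_uploader"},
    "validator-2": set(),
}
batch, interactive = split_nodes(inventory)
print("batch:", batch)
print("interactive:", interactive)
```

Keeping the special roles in one explicit set means a newly designated archive or snapshot uploader only needs a role label to be excluded from batch runs.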
v7-v9 config migration docs
Install the config migration script
From the root directory of the repo, install it as follows:
pip install node_management/config_migration
Usage
$ migrate-config --help
Usage: migrate-config [OPTIONS] COMMAND [ARGS]...
  Migration CLI for Supra configs.
Options:
  --help  Show this message and exit.
Commands:
  rpc  Migrate RPC config.
  smr  Migrate SMR config.
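For readers unfamiliar with this command layout, the following sketch reproduces the same two-subcommand structure with the standard library's `argparse`. It is only an illustration of the shape of the CLI: the actual script's implementation is not shown in this PR, it may well use a different library, and the `build_parser` name is made up. No options beyond those in the help text above are modelled.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the two subcommands listed in the help text."""
    parser = argparse.ArgumentParser(
        prog="migrate-config",
        description="Migration CLI for Supra configs.",
    )
    # A subcommand is mandatory, as in `migrate-config COMMAND`.
    subcommands = parser.add_subparsers(dest="command", required=True)
    subcommands.add_parser("rpc", help="Migrate RPC config.")
    subcommands.add_parser("smr", help="Migrate SMR config.")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["rpc"])
print(args.command)
```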
example output
E2E migration process from v7 to v9: example scripts
Note that the scripts below are examples; they should be adapted and not used directly.
`node_management/migrate_config_and_db_mainnet_v7_to_v9.sh` is an example for our foundation nodes that migrates both the config and the db.
`node_management/migrate_config_v7_to_v9_docker.sh` is an example that migrates the config only, for nodes using Docker.