- 
                Notifications
    You must be signed in to change notification settings 
- Fork 232
Modifying the database schema
This is a temporary section with instructions for developers to have them test the database migrations that will be released with v2.0. The instructions below hopefully make it as easy as possible to test this.
- Checkout the latest developbranch:git checkout develop && git pull
- Install latest dependencies: pip install -U -e .[tests,pre-commit]
- Run verdi status: this will update your configuration to the latest schema version
- 
Create a clone of the PostgreSQL database you want to migrate - Login as the postgresuser:sudo su - postgres
- Load the postgres program: psql
- If it is already loaded in postgres, you can clone it in psqldirectly:CREATE DATABASE aiida_clone WITH TEMPLATE aiida_original_db OWNER aiida;Make sure to change the names of the databases and the owner of course. IMPORTANT: if you are setting a new database owner, you'll need also to assign permissions for each table. Check here to know how to do it.
- If the database is on another machine and you want to test the migration on your workstation.
- Go to the remote machine and dump the database: pg_dump --no-privileges --no-owner -h localhost -d aiida_original_db -U aiida -W > aiida_original_db.psql
- Copy over the aiida_original_db.psqlfile to your workstation
- Create a new database in psql:CREATE DATABASE aiida_clone OWNER aiida;
- Load the database dump: psql -h localhost -d aiida_original_db -U aiida -W < aiida_original_db.psql
 
- Go to the remote machine and dump the database: 
 
- Login as the 
- 
Check statistics of the database (this information should be kept for reporting): - Note whether it is Django or SqlAlchemy. If you don't know, run SELECT * FROM alembic_version;inpsql. If it returns a value, it is SqlAlchemy, if it errors withERROR: relation "alembic_version" does not existit is Django
- Get database node count: SELECT count(*) FROM db_dbnode;
- Get database size: SELECT pg_size_pretty(pg_database_size('aiida_clone'));
- Get database revision:
- For SqlAlchemy: SELECT * FROM alembic_version;
- For Django: SELECT name FROM django_migrations WHERE app = 'db' ORDER BY id DESC LIMIT 1;
 
- For SqlAlchemy: 
 
- Note whether it is Django or SqlAlchemy. If you don't know, run 
- 
Create a clone of the repository (Note: this is only necessary if your database revision is below a certain revision; the migrations above it will not affect the repository, including the repository migration itself, as it will leave the original repo intact and simply write the new disk object store in parallel.) - Django: if you have revision 0027 or above, there is no need to clone the repo
- SqlAlchemy: if your revision is in the following list, there is no need to clone the repo: ['1de112340b16', '1de112340b17', '1de112340b18', '34a831f4286d', '535039300e4a', '1feaea71bd5a', '7536a82b2cc4', '0edcdd5a30f0', 'bf591f31dd12', '118349c10896', '91b573400be5', '7b38a9e783e7', 'e734dd5e50d7', 'e797afa09270', '26d561acd560', '07fac78e6209', 'de2eaf6978b4', '1830c8430131', '1b8ed3425af9', '3d6190594e19', '5a49629f0d45', '5ddd24e52864', 'd254fdfed416', '61fc0913fae9', 'ce56d84bcc35']
 
- 
Create a profile with the correct database and repository configured - Easiest is to open config.jsonand clone an entry and simply update the name of the database and the location of the repository
- For the storage.backendkey you should simply put"psql_dos"regardless whether the database has a Django or SqlAlchemy schema. The migrations will automatically detect this from the database itself.
- 
IMPORTANT: if the database has an old schema version (see the point above) you should have made a clone of the repository and you should make sure that the storage.config.repository_urikey points to the correct path
 
- Easiest is to open 
- 
Before running any migration, you should be able to run verdi -p PROFILE statusandverdi -p PROFILE storage version
- Make sure the daemon is not running
- Run time verdi -p aiida-profile storage migrate -f. IMPORTANT do not forget thetimein front. We would like to gather this information to get an idea of how long the migrations typically take.
- Copy the log messages from the migrations printed to stdout.
Note: You can use verdi devel run-sql "SQL_TEXT", to run any SQL against the profile's database
- Rerun the statistics database size and node count in psql:- SELECT count(*) FROM db_dbnode;
- SELECT pg_size_pretty(pg_database_size('aiida_clone'));
 
- Run verdi statusand check that storage connection is green
- Run verdi storage info --statisticsand check the outcome is as expected
- Open verdi shelland do some tests: queries, opening repository files of some nodes etc.
For each database for which you test the migration, please report the following:
- Database backend (Django or SqlAlchemy)
- Starting revision
- Node count before migration
- Node count after migration
- Database size before migration
- Database size after migration
- Time taken for the actual migration
- Messages printed to stdout by the migrations
- Any errors you encountered or problems you noticed afterwards when manually inspecting the data
- Output of the following command: verdi devel run-sql "SELECT pt.tablename AS TableName, t.indexname AS IndexName, pc.reltuples, pg_size_pretty(pg_relation_size(quote_ident(pt.tablename)::text)), pg_size_pretty(pg_relation_size(quote_ident(t.indexrelname)::text)), t.idx_scan FROM pg_tables AS pt LEFT OUTER JOIN pg_class AS pc ON pt.tablename=pc.relname LEFT OUTER JOIN (SELECT pc.relname AS TableName, pc2.relname AS IndexName, psai.idx_scan, psai.indexrelname FROM pg_index AS pi JOIN pg_class AS pc ON pc.oid = pi.indrelid JOIN pg_class AS pc2 ON pc2.oid = pi.indexrelid JOIN pg_stat_all_indexes AS psai ON pi.indexrelid = psai.indexrelid )AS T ON pt.tablename = T.TableName WHERE pt.schemaname='public';"
The Django database schema can be found in aiida.backends.djsite.db.models <https://github.com/aiidateam/aiida-core/tree/develop/aiida/backends/djsite/db/models.py>_.
If you need to change the database schema follow these steps:
- 
Make all the necessary changes to aiida.backends.djsite.db.models <https://github.com/aiidateam/aiida-core/tree/develop/aiida/backends/djsite/db/models.py>_.
- 
Create a new migration file by running:: $ python aiida/backends/djsite/manage.py makemigrationsThis will create the migration file in aiida/backends/djsite/db/migrationswhose name begins with a number followed by some description. If the description is not appropriate then change it to something better but retain the number.
- 
Open the generated file and make the following changes: .. code-block:: python from aiida.backends.djsite.db.migrations import upgrade_schema_version ... REVISION = # choose an appropriate version number (hint: higher than the last migration!) DOWN_REVISION = # the revision number of the previous migration ... class Migration(migrations.Migration): ... operations = [ ... upgrade_schema_version(REVISION, DOWN_REVISION) ]
- 
The migration file now contains some migrations steps that were generated automatically. Please make sure that they are correct. Also, if you want to add some changes that affect the content of the database -- you should do it "manually" by adding some sql commands that will run directly on your database: .. code-block:: python forward_sql = [ """UPDATE db_dbgroup SET type_string = 'auto.import' WHERE type_string = 'aiida.import';""", """UPDATE db_dbgroup SET type_string = 'auto.run' WHERE type_string = 'autogroup.run';""", ] reverse_sql = [ """UPDATE db_dbgroup SET type_string = 'aiida.import' WHERE type_string = 'auto.import';""", """UPDATE db_dbgroup SET type_string = 'autogroup.run' WHERE type_string = 'auto.run';""", ] ... operations = [ ... migrations.RunSQL( sql='\n'.join(forward_sql), reverse_sql='\n'.join(reverse_sql)), upgrade_schema_version(REVISION, DOWN_REVISION), ... ]As you can see here, you should not only provide the SQL commands to upgrade your database, but also the commands to revert these changes in case you want to perform a downgrade (see: sql=forward_sql,reverse_sql=reverse_sql)
- 
Change the LATEST_MIGRATIONvariable inaiida/backends/djsite/db/migrations/__init__.pyto the name of your migration file:.. code-block:: python LATEST_MIGRATION = '0003_my_db_update'This allows AiiDA to get the version number from your migration and make sure the database and the code are in sync. 
- 
Migrate your database to the new version using verdiand specifying the correct profile::$ verdi -p {profile} database migrateIn case you want to (and, most probably, you should) test the downgrade operation, please check the list of available versions of the database:: $ python aiida/backends/djsite/manage.py showmigrations dbThe output will look something like the following:: db [X] 0001_initial [X] 0002_db_state_change [X] 0003_add_link_type [X] 0004_add_daemon_and_uuid_indices [X] 0005_add_cmtime_indices [X] 0006_delete_dbpath [X] 0007_update_linktypes [X] 0008_code_hidden_to_extra [X] 0009_base_data_plugin_type_string [X] 0010_process_type [X] 0011_delete_kombu_tables [X] 0012_drop_dblock [X] 0013_django_1_8 [X] 0014_add_node_uuid_unique_constraint [X] 0015_invalidating_node_hash [X] 0016_code_sub_class_of_data [X] 0017_drop_dbcalcstate [X] 0018_django_1_11Choose the previous migration step and migrate to it:: $ python aiida/backends/djsite/manage.py migrate db 0017_drop_dbcalcstateCheck that both upgrade and downgrade changes are successful and if yes, go to the next step. 
- 
Add tests for your migrations to the aiida-core/aiida/backends/djsite/db/subtests/migrationsmodule.
| Note | Such a test can only be applied to the migration of the database content. | For example, you can not test modifications of the database column names.
The SQLAlchemy database schema can be found in aiida.backends.sqlalchemy.models <https://github.com/aiidateam/aiida-core/tree/develop/aiida/backends/sqlalchemy/models>_.
If you need to change the database schema follow these steps:
- 
Make all the necessary changes to the model than you would like to modify located in the aiida/backends/sqlalchemy/modelsdirectory.
- 
Create a new migration file by going to aiida/backends/sqlalchemyand executing::$ python aiida/backends/sqlalchemy/manage.py revision "This is the description for the next revision"This will create a new migration file in aiida/backends/sqlalchemy/migrations/versionswhose names begins with an automatically generated hash and the provided message for this new migration. Modify the migration message to accurately describe the purpose of the migration.
- 
Have a look at the generated migration file and ensure the migration is correct. The file should contain automatically generated hashes that point to the previous and to the current revision: .. code-block:: python revision = 'e72ad251bcdb' down_revision = 'b8b23ddefad4'Also upgrade()anddowngrade()function definitions should be present in the file:.. code-block:: python def upgrade(): # some upgrade operations def downgrade(): # some downgrade operationsIf you want to add some changes that affect the content of the database -- you should do it "manually" by adding some SQL commands that will run directly on your database. See the following example and adapt it for your needs: .. code-block:: python from sqlalchemy.sql import text forward_sql = [ """UPDATE db_dbgroup SET type_string = 'auto.import' WHERE type_string = 'aiida.import';""", """UPDATE db_dbgroup SET type_string = 'auto.run' WHERE type_string = 'autogroup.run';""", ] reverse_sql = [ """UPDATE db_dbgroup SET type_string = 'aiida.import' WHERE type_string = 'auto.import';""", """UPDATE db_dbgroup SET type_string = 'autogroup.run' WHERE type_string = 'auto.run';""", ] def upgrade(): conn = op.get_bind() statement = text('\n'.join(forward_sql)) conn.execute(statement) def downgrade(): conn = op.get_bind() statement = text('\n'.join(reverse_sql)) conn.execute(statement)If you want to learn more about the migration operations, you can have a look at the Alembic documentation <https://alembic.sqlalchemy.org/en/latest/>_.
- 
Migrate your database to the new version using verdiand specifying the correct profile::$ verdi -p {profile} database migrate
- 
Add tests for your migrations to aiida-core/aiida/backends/sqlalchemy/tests/test_migrations.py
The alembic_manage.py provides several options to control your SQLAlchemy migrations. By executing::
$ python aiida/backends/sqlalchemy/manage.py --help
you will get a full list of the available arguments that you can pass and commands. Briefly, the available commands are:
- upgrade: Upgrade to the later version.
- downgrade: Downgrade the version of your database.
- history: Lists the available migrations in chronological order.
- current: Displays the current version of the database.
- revision: Creates a new migration file based on the model changes.
Alembic migrations should work automatically and migrate your database to the latest version. However, if you were using SQLAlchemy before we introduced Alembic, you may get a message like to following during the first migration::
sqlalchemy.exc.ProgrammingError: (psycopg2.ProgrammingError) relation
"db_dbuser" already exists [SQL: '\nCREATE TABLE db_dbuser (\n\tid SERIAL
NOT NULL, \n\temail VARCHAR(254), \n\tpassword VARCHAR(128),
\n\tis_superuser BOOLEAN NOT NULL, \n\tfirst_name VARCHAR(254),
\n\tlast_name VARCHAR(254), \n\tinstitution VARCHAR(254), \n\tis_staff
BOOLEAN, \n\tis_active BOOLEAN, \n\tlast_login TIMESTAMP WITH TIME ZONE,
\n\tdate_joined TIMESTAMP WITH TIME ZONE, \n\tCONSTRAINT db_dbuser_pkey
PRIMARY KEY (id)\n)\n\n']
In this case, you should manually create the Alembic table in your database and add a line with the database version number. To do so, use psql to connect to the desired database::
$ psql aiidadb_sqla
where you should replace aiidadb_sqla with the name of the database that you would like to modify.
Then, execute the following commands:
.. code-block:: sql
CREATE TABLE alembic_version (version_num character varying(32) not null, PRIMARY KEY(version_num));
INSERT INTO alembic_version VALUES ('e15ef2630a1b');
GRANT ALL ON alembic_version TO aiida;