
Database Management

Basic operations

Initialize

To start a local Postgres database container in a detached state and run any pending migrations, run make init-db. During initial setup, init-db is called automatically when running make init.

Start

To only start the database container, run the following command:

make start-db

This command is not needed when starting the application with make start.

Destroy and reinitialize

To clean the database, use the following command:

make volume-recreate

This will remove all Docker project volumes, rebuild the database volume, and run all pending migrations. Once complete, only the database container will be running; run make start to bring up the other project containers.

Running migrations

When you're first setting up your environment, ensure that migrations are run against your db so it has all the required tables. make init does this for you, but if you need to work with migrations directly, these are the common commands:

make db-migrate          # Apply pending migrations to the db
make db-migrate-down     # Roll back the last migration
make db-migrate-down-all # Roll back all migrations
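Under the hood, these targets apply or roll back an ordered migration history. As a toy model of what "pending" means (hypothetical revision names, not Alembic's actual internals, which track the current revision in an alembic_version table):

```python
# Toy linear migration history with hypothetical revision names.
history = ["0001_init", "0002_add_users", "0003_add_index"]

def pending(current):
    """Return the migrations still to apply, given the current revision.

    current=None models a freshly created, empty database.
    """
    start = 0 if current is None else history.index(current) + 1
    return history[start:]

assert pending(None) == history                         # fresh DB: run everything
assert pending("0002_add_users") == ["0003_add_index"]  # partially migrated
assert pending("0003_add_index") == []                  # up to date
```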

Creating new migrations

If you've changed a python object model, auto-generate a migration file for the database and run it:

$ make db-migrate-create MIGRATE_MSG="<brief description of change>"
$ make db-migrate
Example: Adding a new column to an existing table:
  1. Manually update the database models with the changes (example_models.py in this example)
class ExampleTable(Base):
    ...
    my_new_timestamp: Mapped[datetime] = mapped_column(TIMESTAMP(timezone=True)) # Newly added line
  2. Automatically generate a migration file with make db-migrate-create MIGRATE_MSG="Add my_new_timestamp column to example_table"
...
def upgrade():
    # ### commands auto generated by Alembic - please adjust! ###
    op.add_column("example_table", sa.Column("my_new_timestamp", sa.TIMESTAMP(timezone=True), nullable=True))
    # ### end Alembic commands ###


def downgrade():
    # ### commands auto generated by Alembic - please adjust! ###
    op.drop_column("example_table", "my_new_timestamp")
    # ### end Alembic commands ###
  3. Manually adjust the migration file as needed. Some changes (such as foreign keys) will not fully auto-generate, so make sure all desired changes are included.

Multi-head situations

Alembic migrations form an ordered history, with each migration having at least one parent migration as specified by the down_revision variable. This can be visualized by:

make db-migrate-history

When multiple migrations are created that point to the same down_revision, a branch is created, with the tip of each branch being a "head". The above history command will show this, but a list of just the heads can be retrieved with:

make db-migrate-heads

CI/CD runs migrations to reach the "head". When there are multiple heads, Alembic can't resolve which migrations need to be run. If you run into this error, you'll need to fix the migration branches/heads before merging to main.

If the migrations don't depend on each other, which is likely if they've branched, then you can just run:

make db-migrate-merge-heads

This creates a new migration that points to all current heads, effectively pulling them back together.

Or, if you wish to avoid creating extra migrations, you can manually adjust the down_revision of one of the migrations to point to the other one. This is also the necessary approach if the migrations need to happen in a defined order.
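To make the branching concrete: a head is simply a revision that no other revision names as its down_revision, and a merge migration names every current head at once. A toy sketch with hypothetical revision IDs (not Alembic's actual implementation):

```python
# Toy model of a revision graph: each revision maps to its down_revision.
# Merge revisions use a tuple, as Alembic does. IDs are hypothetical.
revisions = {
    "a111": None,    # initial migration
    "b222": "a111",  # first branch
    "c333": "a111",  # second branch -> two heads exist
}

def find_heads(revs):
    """A head is any revision that no other revision points to."""
    parents = set()
    for down in revs.values():
        if down is None:
            continue
        parents.update(down if isinstance(down, tuple) else (down,))
    return set(revs) - parents

assert find_heads(revisions) == {"b222", "c333"}  # multi-head situation

# A merge migration points at every current head, collapsing them to one:
revisions["d444"] = ("b222", "c333")
assert find_heads(revisions) == {"d444"}
```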

Alembic Check

To make sure we haven't forgotten to generate migrations after modifying the database models, we rely on alembic check, which can be run with make db-check-migrations. This command compares the current local database to your migration files and determines whether anything would be generated if you attempted to create a new migration. If anything would be created, it errors.

This check also runs as part of our CI/CD, so even if you forget to run it yourself during development, it will be caught when you send a pull request.
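The idea behind the check can be sketched as a schema diff: if the models declare anything the migrations haven't produced (or vice versa), fail. A toy illustration with hypothetical column names (the real check diffs full schema metadata, not just names):

```python
# Columns the models declare vs. columns the migrations have produced.
model_columns = {"example_id", "created_at", "my_new_timestamp"}
migrated_columns = {"example_id", "created_at"}

def check_migrations(model, migrated):
    """Raise if the model and the migrated schema are out of sync."""
    drift = model ^ migrated  # symmetric difference: anything out of sync
    if drift:
        raise RuntimeError(f"New upgrade operations detected: {sorted(drift)}")

# The check errors until a migration covering my_new_timestamp is generated.
try:
    check_migrations(model_columns, migrated_columns)
except RuntimeError as exc:
    print(exc)
```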

Operations

Removing a column

First Deploy

In order to remove a column, we'll first need to remove all usage of the column in the API code base.

One common usage is in the transformation module (e.g. simpler-grants-gov/api/src/data_migration/transformation/transform_util.py).

def transform_example():
    example.first_column = example_source.first_column
-   example.my_column = example_source.my_column    # column to be removed
    return example

However, since the column will still be in the database model, SQLAlchemy will still include it in SELECTs and INSERTs for the table. We'll need to mark this column as deferred, to exclude it from queries, and as evaluates_none, so an unset value is inserted as an explicit NULL. This requires the column to be nullable; if it isn't, you'll first need to make it nullable in an earlier migration.

Note: evaluates_none is required if the column has a server_default: when a field isn't set, it inserts an explicit NULL rather than excluding the column from the INSERT, so the server_default isn't applied. If the column has no server_default, evaluates_none isn't strictly required, but we recommend it anyway, since it's simpler to have a single pattern to follow.

class Example(ApiSchemaTable, TimestampMixin):
    __tablename__ = "example"

    example_id: Mapped[uuid.UUID] = mapped_column(
        UUID, primary_key=True, default=uuid.uuid4
    )

    # The column we want to deprecate, needs both
    # the call to evaluates_none() and deferred=True
    my_column: Mapped[str | None] = mapped_column(Text().evaluates_none(), deferred=True)

Second Deploy

Once the column usage removal has been deployed (including deprecating the column in the database model), we can safely remove the column from the database model and the actual database. To do so, simply delete the deprecated column from the database model and create a corresponding database migration.

Removing a table

First Deploy

Removing a table is similar to, but simpler than, removing a column. You'll still start by removing all usage of the table in the API code base. No further steps are required at this stage, since SQLAlchemy won't be interacting with the table once that usage is gone.

Second Deploy

Once the table usage removal has been deployed, we can safely remove the table from the database model and the actual database. Again, simply delete the table from the database model and create a corresponding migration.