Skip to content

CAPI Migration Style Guide

Seth Boyles edited this page May 21, 2019 · 14 revisions

Overview

Cloud Controller supports both MySQL and PostgreSQL databases. We need to make sure our migrations are compatible with each and are using common features in these database distributions.

Minimal Schema Changes in Migrations

Avoid extensive schema changes in a single migration

  1. If any single change fails, the whole migration needs to be rolled back manually. Atomic migration files are easier to roll back.
  2. Each migration is run in a database synchronize call. On an issue filed with Sequel where the filer encountered a deadlock, Jeremy Evans wrote: "You shouldn't be calling Sequel.synchronize around an expensive operation like that, it should only be used to protect access to a data structure that can be accessed by multiple threads simultaneously."
  3. MySQL DDLs cannot rollback

Data Definition Language (DDL) statements in MySQL

If a structure altering migration fails during a migration in MySQL, the migrator cannot rollback the change. The biggest implication of this behavior is that the Cloud Foundry operator has to manually revert the changes. Here is an example of a migration that fails to rollback. This is MySQL's behavior for all DDL commands, including variants such as MariaDB.

An exception in the example migration below will result in no rollbacks. This is especially concerning for migrations that touch multiple tables and change data.

# This migration will not run in a transaction

Sequel.migration do
  up do
    alter_table :dogs do
     add_column :testing, String, size: 255
    end

    # Each update will be in its own transaction
    VCAP::CloudController::Dog.each_with_index do |dog, i|
      dog.update(guid: "c"*(i+1))
    end

    # Forces one transaction for all updates
    transaction do
      VCAP::CloudController::Dog.each_with_index do |dog, i|
        dog.update(guid: "d"*(i+1))
      end
    end
  end
end

In the example below, MySQL will not rollback any change before raise, including the data changes. The DDL action will cancel out any effects of a transaction.

In the example below, an explicit transaction is forced with a DDL statement within. While MySQL logs as though the migration is running in the transaction, it does not rollback any changes including the updates to the dogs table. The database column continues to exist too.

Sequel.migration do
  up do
    transaction do
      VCAP::CloudController::Dog.each_with_index do |dog, i|
        dog.update(guid: "a"*(i+1))
      end

      alter_table :dogs do
       add_column :testing, String, size: 255
      end

      raise "Forced error"
    end
  end
end

Strategies to deal with transactions in MySQL

  • Separate data changing migrations from schema changing DDL statements.
  • Explicitly wrap data changes in a transaction that does not include DDL statements. Avoid inclusion of DDL in that transaction.

Let’s rewrite the examples above keeping the listed strategies in mind:

Sequel.migration do
  up do
    alter_table :dogs do
     add_column :testing, String, size: 255
    end
  end
end

Sequel.migration do
	up do
    # Forces one transaction for all updates
    transaction do
      VCAP::CloudController::Dog.each_with_index do |dog, i|
        dog.update(guid: "d"*(i+1))
      end
    end
  end
end

Name all constraints

If you don't, MySQL and PostgreSQL will default to different constraint names, this makes it more difficult to edit the constraints in the future.

Uniqueness constraints with null

If you write a uniqueness constraint where any of the values can be null, remember that null != null. For example the values [1, 1, null] and [1, 1, null] are unique.

Different code for different databases

If you must do different things for mysql and postgres you can check the type like this:

Sequel::Model.db.adapter_scheme == :postgres

Always specify a size for a string (and don't use text)

Postgres and MySQL have different size limits on String and TEXT fields

  • MySQL: String is varchar(255), String, text: true has a max size of 16_000 for UTF8 encoded DBs

  • Postgres: String and String, text: true are TEXT and has a max size of ~1GB

To ensure customers can migrate data across the two databases (and others in the future) without running into problems, a maximum size should always be specified. As of 2017-07-30, a rubocop linter enforces this.

NOTE: In some cases you may have to use a text column in cases where the row limit will exceed the maximum by using String. This may be a smell that the table was not modeled correctly.

Migrations must work with rolling deploys

Cloud Foundry deployments with multiple API instances will run migrations when the first instance is updated. As a result, the previous version of the API code will be running against the new migrations until the deploy has finished. This means that all migrations must be backwards compatible in order to prevent API downtime during the deploy.

Known differences between MySQL and Postgres

Drop column

When dropping dropping columns there is a difference in resulting behaviour in MySQL and Postgres. Postgres auto removes indexes while MySql narrows them.

As an example image I have a new user table defined:

create table :users do
  :id String
  :name String
  :index [:id :name], :unique, name: my_user_idx
end

If later I create a subsequent migration to drop a field:

alter table do
   drop column :name
end

The table in Postgres will look like this:

create table :users do
  :id String
end

And the table in MySQL will look like this

create table :users do
  :id String
  :index [:id] unqiue: true, name: my_user_idx
end

So when we drop a column we not rely on auto-removal. We should ensure that the indexes (and potentially other things) are cleaned up explicitly so that MySQL and Postgres do not end up being inconsistent.

Clone this wiki locally