|
| 1 | +--- |
| 2 | +status: proposed |
| 3 | +creation-date: "2023-10-02" |
| 4 | +authors: [ "@pursultani" ] |
| 5 | +approvers: [ "@product-manager", "@engineering-manager" ] |
| 6 | +owning-stage: "~devops::systems" |
| 7 | +participating-stages: [] |
| 8 | +--- |
| 9 | + |
| 10 | +# Multiple databases support |
| 11 | + |
| 12 | +## Summary |
| 13 | + |
| 14 | +This document explains how to support a component with one or more databases. It |
| 15 | +describes different levels of support and offers an implementation model for |
| 16 | +each level to overcome several challenges of the [recommended deployment models](https://docs.gitlab.com/ee/administration/reference_architectures/). |
| 17 | + |
| 18 | +The [architecture page](../index.md#multiple-databases) provides some |
| 19 | +background on this subject. |
| 20 | + |
| 21 | +A [development document](../../development/database_support.md) accompanies this |
| 22 | +blueprint. It details the implementation model and provides a few examples. |
| 23 | + |
| 24 | +## Goals |
| 25 | + |
| 26 | +- Offer [higher levels of support](#levels-of-support) for current and new |
| 27 | + components with database requirements. |
| 28 | +- Preserve the current configuration options in `gitlab.rb` when refactoring |
| 29 | + the implementation. |
| 30 | +- Minimize breaking changes and refactors in database code with a consistent, |
| 31 | + testable, and extensible implementation model. |
| 32 | +- Migrate code to the newer implementation method. |
| 33 | + |
| 34 | +## Proposal |
| 35 | + |
| 36 | +### Terminology |
| 37 | + |
| 38 | +|Term|Definition| |
| 39 | +|-|-| |
| 40 | +|Database|A _logical_ database that a component, such as the Rails application, uses. For example, `gitlabhq_production`. A component can have more than one database.|
| 41 | +|Database server| A _standalone process_ or a _cluster_ that provides PostgreSQL database service. Not to be confused with database objects or data.| |
| 42 | +|Database objects| Anything that is created with Data Definition Language (DDL), such as `DATABASE`, `SCHEMA`, `ROLE`, or `FUNCTION`. It may include reference data or indices as well. These are partially created by Omnibus GitLab and the rest are created by application-specific _database migrations_.| |
| 43 | +|Standalone database server| A single PostgreSQL database server. It can be accessed through a PgBouncer instance.| |
| 44 | +|Database server cluster|Encompasses multiple PostgreSQL database servers, managed by Patroni services, backed by a Consul cluster, accessible by using one or more PgBouncer instances, and may include an HAProxy (in TCP mode) as a frontend.| |
| 45 | + |
| 46 | +### Levels of support |
| 47 | + |
| 48 | +There are different levels of database support for Omnibus GitLab components. |
| 49 | +Higher levels indicate more integration into Omnibus GitLab. |
| 50 | + |
| 51 | +#### Level 1 |
| 52 | + |
| 53 | +Configure the component with user-provided parameters from `gitlab.rb` to work |
| 54 | +with the database server. For example, `database.yml` is rendered with database |
| 55 | +server connection details of the Rails application or database parameters of |
| 56 | +Container Registry are passed to its `config.yml`. |
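
A minimal sketch of what Level 1 amounts to, assuming a hypothetical helper named `render_database_yml` and illustrative setting names. It is not the actual Omnibus GitLab template logic; it only shows user-provided parameters being mapped into a `database.yml`-style structure.

```ruby
require 'yaml'

# Hypothetical helper: maps user-provided settings (as they might appear in
# gitlab.rb) to a database.yml-style structure. Key names are illustrative.
def render_database_yml(settings, env: 'production')
  connection = {
    'adapter'  => 'postgresql',
    'host'     => settings.fetch('db_host'),
    'port'     => settings.fetch('db_port', 5432), # assumed default port
    'database' => settings.fetch('db_database'),
    'username' => settings.fetch('db_username')
  }
  YAML.dump(env => { 'main' => connection })
end

rails_settings = {
  'db_host'     => 'db.example.internal',
  'db_database' => 'gitlabhq_production',
  'db_username' => 'gitlab'
}
puts render_database_yml(rails_settings)
```

At this level the component only consumes connection details; nothing is created on the database server.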
| 57 | + |
| 58 | +#### Level 2 |
| 59 | + |
| 60 | +Create database objects and run migrations of the component. Full support at |
| 61 | +this level requires Omnibus GitLab to not only create the required database |
| 62 | +objects, such as `DATABASE` and `ROLE`, but also to run the application |
| 63 | +migrations for the component. |
| 64 | + |
| 65 | +#### Level 3 |
| 66 | + |
| 67 | +Static configuration of PgBouncer. At this level, Omnibus GitLab can create a |
| 68 | +_dedicated PgBouncer user_ for the component and configure it with user-provided |
| 69 | +(from `gitlab.rb`) or application-mandated connection settings. |
| 70 | + |
| 71 | +This is not specific to clustered database server setups, but it is a requirement |
| 72 | +for them. There are scenarios where PgBouncer is configured with a standalone |
| 73 | +database server. However, all clustered database server setups depend on |
| 74 | +PgBouncer configuration. |
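
As a rough illustration of static PgBouncer configuration, the sketch below renders entries for the `[databases]` section of `pgbouncer.ini`. The helper name `pgbouncer_database_entry`, the `pgbouncer_registry` user, and the host values are assumptions made for this example, not existing Omnibus GitLab code.

```ruby
# Hypothetical helper: renders one line for the [databases] section of
# pgbouncer.ini from a logical database name and its connection settings.
def pgbouncer_database_entry(name, host:, port: 5432, auth_user: 'pgbouncer')
  "#{name} = host=#{host} port=#{port} auth_user=#{auth_user}"
end

# Illustrative input: each component may bring its own dedicated PgBouncer user.
entries = {
  'gitlabhq_production' => { host: '127.0.0.1' },
  'registry'            => { host: '127.0.0.1', auth_user: 'pgbouncer_registry' }
}

lines = entries.map { |name, opts| pgbouncer_database_entry(name, **opts) }
puts '[databases]', lines
```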
| 75 | + |
| 76 | +#### Level 4 |
| 77 | + |
| 78 | +Configuration of database server cluster in high-availability (HA) mode. At this |
| 79 | +level, Omnibus GitLab supports various deployment models, ranging from _one |
| 80 | +cluster for all databases_ to _one cluster per database_. |
| 81 | + |
| 82 | +Therefore the HA configuration of logical databases must be independent of the |
| 83 | +deployment model. |
| 84 | + |
| 85 | +Consul [services](https://developer.hashicorp.com/consul/docs/services/configuration/services-configuration-reference) |
| 86 | +can have multiple health-checks and [watches](https://developer.hashicorp.com/consul/docs/dynamic-app-config/watches#service). |
| 87 | +At this level, Omnibus GitLab defines _a Consul service per database cluster_ |
| 88 | +and _a service watch per logical database_. |
| 89 | + |
| 90 | +Omnibus GitLab configures [Patroni to register a Consul service](https://patroni.readthedocs.io/en/latest/yaml_configuration.html#consul). |
| 91 | +The name of the service is the Patroni `scope` parameter, and its tag is the role of the |
| 92 | +node, which can be one of `master`, `primary`, `replica`, or `standby-leader`. Omnibus GitLab |
| 93 | +uses this service name, which is the same as the scope of the Patroni cluster, to |
| 94 | +address a database cluster and associate it with any logical database that the |
| 95 | +cluster serves. |
| 96 | + |
| 97 | +This is done with Consul watches that track Patroni services. They find cluster |
| 98 | +leaders and notify PgBouncer with the details of both the database cluster and |
| 99 | +the logical database. |
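
The relationship between one Consul service per cluster and one watch per logical database can be sketched as follows. The cluster scopes, the `gitlabhq_ci` database, and the handler path are hypothetical placeholders; only the general shape of a Consul `service` watch (`type`, `service`, `tag`, `args`) is taken from Consul's configuration format.

```ruby
require 'json'

# Hypothetical mapping: each logical database points at the Patroni scope
# (which is also the Consul service name) of the cluster that serves it.
DATABASE_CLUSTERS = {
  'gitlabhq_production' => 'postgresql-ha',
  'registry'            => 'postgresql-ha',
  'gitlabhq_ci'         => 'postgresql-ci-ha'
}.freeze

# Builds one Consul service watch per logical database. The handler path and
# its arguments are placeholders, not an existing script.
def consul_watches(database_clusters, leader_tag: 'master')
  database_clusters.map do |database, scope|
    {
      'type'    => 'service',
      'service' => scope,
      'tag'     => leader_tag,
      'args'    => ['/path/to/failover-handler', '--database', database, '--cluster', scope]
    }
  end
end

watches = consul_watches(DATABASE_CLUSTERS)
puts JSON.pretty_generate('watches' => watches)
```

Because the watch is keyed by logical database and only refers to the cluster by its scope, the same structure covers both one cluster for all databases and one cluster per database.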
| 100 | + |
| 101 | +#### Level 5 |
| 102 | + |
| 103 | +Automated or assisted transition from previous deployment models. Not all |
| 104 | +components require this level of support but, in some cases, where a previously |
| 105 | +recommended but now deprecated database configuration is in use, Omnibus GitLab may provide |
| 106 | +specialized tools or procedures to allow transitioning to the new database |
| 107 | +model. In most cases, this is not supported unless specified. |
| 108 | + |
| 109 | +### Design overview |
| 110 | + |
| 111 | +Each component manages every aspect of its own database requirements, _except |
| 112 | +its database users_. This means that the component-specific implementation of database |
| 113 | +operations is done in the specific cookbooks of each component. For example, |
| 114 | +Rails or Registry database requirements are exclusively addressed in the `gitlab` |
| 115 | +and `registry` cookbooks and not in the `postgresql`, `pgbouncer`, or `patroni` |
| 116 | +cookbooks. |
| 117 | + |
| 118 | +The database users are excluded because only `SUPERUSER` or users with the `CREATEROLE` |
| 119 | +privilege can create PostgreSQL users. Due to security considerations, we do not |
| 120 | +grant this privilege to users that connect over TCP. So |
| 121 | +components that may connect to a remote database do not have the permission to |
| 122 | +create their users. |
| 123 | + |
| 124 | +Hence each component creates its own database objects, _except its database user_. |
| 125 | +The `postgresql` and `patroni` cookbooks create the database users but each component |
| 126 | +creates the rest of its database objects. The database users must have the `CREATEDB` |
| 127 | +privilege to allow components to create their own `DATABASE` and trusted `EXTENSION`. |
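
The division of responsibility described above can be sketched as two groups of SQL statements. The function names are hypothetical and password handling is deliberately omitted; the sketch only illustrates which side would own which statement, not how the cookbooks actually issue them.

```ruby
# Statements that a database-server cookbook would own: only the role,
# created with CREATEDB so the component can create its own database.
def server_side_statements(user)
  [%(CREATE ROLE "#{user}" WITH LOGIN CREATEDB;)]
end

# Statements that the component itself would own, run as that role.
def component_side_statements(user, database, extensions: [])
  [%(CREATE DATABASE "#{database}" OWNER "#{user}";)] +
    extensions.map { |ext| %(CREATE EXTENSION IF NOT EXISTS "#{ext}";) }
end

puts server_side_statements('gitlab')
puts component_side_statements('gitlab', 'gitlabhq_production',
                               extensions: %w[pg_trgm btree_gist])
```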
| 128 | + |
| 129 | +To impose a structure and fix some of the shortcomings of this approach, such as |
| 130 | +locality and limited reusability, we use the [Chef resource model](https://docs.chef.io/resources/) |
| 131 | +and leverage [custom resources](https://docs.chef.io/custom_resources/) for |
| 132 | +database configuration and operations, including: |
| 133 | + |
| 134 | +- Manage lifecycle of component-specific database objects |
| 135 | +- Run application-specific database migrations |
| 136 | +- Set up PgBouncer to serve the application |
| 137 | +- Set up Consul watches to track Patroni clusters |
| 138 | + |
| 139 | +Cross-cutting concerns such as [central on/off switch for auto-migration](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/7716), |
| 140 | +logging control, and [pre-flight checks](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5428) |
| 141 | +are addressed with [helper classes](https://docs.chef.io/helpers/) that are |
| 142 | +available to all components. The `package` cookbook is a suitable place for |
| 143 | +these helpers. |
| 144 | + |
| 145 | +Helper classes also provide a place to translate the existing user configuration |
| 146 | +model (in `gitlab.rb`) to the new model needed for management of |
| 147 | +multiple databases. |
| 148 | + |
| 149 | +### Implementation details |
| 150 | + |
| 151 | +The [development document](../../development/database_support.md) provides |
| 152 | +implementation details and concrete examples for the proposed design. |