Commit 41caf3b

Robert Marshall, pursultani, and eread committed
Merge branch 'docs-1349-multi-db-blueprint' into 'master'
Blueprint to support multiple databases

Closes gitlab-org/distribution/team-tasks#1349

See merge request https://gitlab.com/gitlab-org/omnibus-gitlab/-/merge_requests/7172

Merged-by: Robert Marshall <[email protected]>
Approved-by: Balasankar 'Balu' C <[email protected]>
Approved-by: Robert Marshall <[email protected]>
Reviewed-by: Balasankar 'Balu' C <[email protected]>
Reviewed-by: Robert Marshall <[email protected]>
Reviewed-by: Evan Read <[email protected]>
Reviewed-by: Vladimir Shushlin <[email protected]>
Reviewed-by: João Pereira <[email protected]>
Co-authored-by: Hossein Pursultani <[email protected]>
Co-authored-by: Evan Read <[email protected]>
2 parents 26d486d + 6f138ce commit 41caf3b

3 files changed

+624
-0
lines changed

doc/architecture/index.md

Lines changed: 16 additions & 0 deletions
@@ -235,3 +235,19 @@ The cache mechanism can be summarized as follows:
### What happens during `gitlab-ctl reconfigure`

One of the commonly used commands while managing a GitLab instance is `gitlab-ctl reconfigure`. This command, in short, parses the config file and runs the recipes with the values supplied from it. The recipes to be run are defined in a file called `dna.json` present in the `embedded` folder inside the installation directory (This file is generated by a software dependency named `gitlab-cookbooks` that's defined in the software definitions). In the case of GitLab CE, the cookbook named `gitlab` will be selected as the master recipe, which in turn invokes all other necessary recipes, including runit. In short, reconfigure is a chef-client run that configures different files and services with the values provided in the configuration template.

## Multiple databases

Previously, the GitLab Rails application was the sole client connected to the
Omnibus GitLab database. Over time, this has changed:

- Praefect and Container Registry use their own databases.
- The Rails application now uses a [decomposed database](https://gitlab.com/groups/gitlab-org/-/epics/5883).

Because additional databases might be necessary:

- The [multi-database blueprint](multiple_database_support/index.md) explains
  how to add database support to Omnibus GitLab for new components and features.
- The [accompanying development document](../development/database_support.md)
  details the implementation model and provides examples of adding database
  support.
Lines changed: 152 additions & 0 deletions
@@ -0,0 +1,152 @@
---
status: proposed
creation-date: "2023-10-02"
authors: [ "@pursultani" ]
approvers: [ "@product-manager", "@engineering-manager" ]
owning-stage: "~devops::systems"
participating-stages: []
---

# Multiple databases support

## Summary

This document explains how to support a component with one or more databases. It
describes different levels of support and offers an implementation model for
each level to overcome the challenges of the [recommended deployment models](https://docs.gitlab.com/ee/administration/reference_architectures/).

The [architecture page](../index.md#multiple-databases) provides some
background on this subject.

A [development document](../../development/database_support.md) accompanies this
blueprint. It details the implementation model and provides a few examples.

## Goals

- Offer [higher levels of support](#levels-of-support) for current and new
  components with database requirements.
- Maintain the current configuration options already present in `gitlab.rb`
  when refactoring the implementation.
- Minimize breaking changes and refactors in database code with a consistent,
  testable, and extensible implementation model.
- Migrate code to the newer implementation method.

## Proposal

### Terminology

|Term|Definition|
|-|-|
|Database|A _logical_ database that a component, such as the Rails application, uses. For example, `gitlabhq_production`. A component can have more than one database.|
|Database server|A _standalone process_ or a _cluster_ that provides PostgreSQL database service. Not to be confused with database objects or data.|
|Database objects|Anything that is created with Data Definition Language (DDL), such as `DATABASE`, `SCHEMA`, `ROLE`, or `FUNCTION`. It may include reference data or indices as well. These are partially created by Omnibus GitLab and the rest are created by application-specific _database migrations_.|
|Standalone database server|A single PostgreSQL database server. It can be accessed through a PgBouncer instance.|
|Database server cluster|Multiple PostgreSQL database servers that are managed by Patroni services, backed by a Consul cluster, and accessible through one or more PgBouncer instances. The cluster may include an HAProxy (in TCP mode) as a frontend.|

### Levels of support

There are different levels of database support for Omnibus GitLab components.
Higher levels indicate more integration into Omnibus GitLab.

#### Level 1

Configure the component with user-provided parameters from `gitlab.rb` to work
with the database server. For example, `database.yml` is rendered with the
database server connection details of the Rails application, or the database
parameters of Container Registry are passed to its `config.yml`.

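As a rough illustration of Level 1, the following sketch renders a `database.yml`-style document from user-provided settings. It is plain Ruby rather than cookbook code, and the setting keys, the `main` entry, and the helper names are assumptions made for this example, not the actual Omnibus GitLab attributes or templates.

```ruby
require 'yaml'

# Hypothetical subset of user-provided settings, as they might be read from
# `gitlab.rb`. The key names are illustrative only.
USER_SETTINGS = {
  'db_host' => '10.0.0.5',
  'db_port' => 5432,
  'db_database' => 'gitlabhq_production',
  'db_username' => 'gitlab'
}.freeze

# Build the connection section that a component's config file would receive.
def connection_config(settings)
  {
    'adapter' => 'postgresql',
    'host' => settings.fetch('db_host'),
    'port' => settings.fetch('db_port'),
    'database' => settings.fetch('db_database'),
    'username' => settings.fetch('db_username')
  }
end

# Render a `database.yml`-style document for one environment.
# The `main` key is an assumption for this sketch.
def render_database_yml(settings, environment: 'production')
  { environment => { 'main' => connection_config(settings) } }.to_yaml
end

puts render_database_yml(USER_SETTINGS)
```

At this level nothing is created on the database server. The component is only told where the server is and how to connect to it.
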
#### Level 2

Create database objects and run migrations of the component. Full support at
this level requires Omnibus GitLab to not only create the required database
objects, such as `DATABASE` and `ROLE`, but also to run the application
migrations for the component.

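The ordering implied by Level 2 can be sketched as a list of steps: the role, then the database it owns, then the component's migrations. This is a minimal sketch in plain Ruby. The helper names and the migration command are hypothetical, and the actual cookbooks may create these objects differently.

```ruby
# Quote an SQL identifier by doubling embedded double quotes.
def quote_ident(name)
  '"' + name.gsub('"', '""') + '"'
end

# Return the ordered steps for bringing up a component database:
# the role first, then the database it owns, then the migrations.
def provisioning_steps(role:, database:, migrate_command:)
  [
    { type: :sql, run: "CREATE ROLE #{quote_ident(role)} LOGIN" },
    { type: :sql, run: "CREATE DATABASE #{quote_ident(database)} OWNER #{quote_ident(role)}" },
    { type: :command, run: migrate_command }
  ]
end

steps = provisioning_steps(
  role: 'registry',
  database: 'registry',
  migrate_command: 'example-migrate up' # hypothetical command, for illustration only
)
steps.each { |step| puts "#{step[:type]}: #{step[:run]}" }
```
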
#### Level 3

Static configuration of PgBouncer. At this level, Omnibus GitLab can create a
_dedicated PgBouncer user_ for the component and configure it with user-provided
(from `gitlab.rb`) or application-mandated connection settings.

This is not specific to clustered database server setups, but it is a
requirement for them. There are scenarios where PgBouncer is configured with a
standalone database server. However, all clustered database server setups depend
on PgBouncer configuration.

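A static PgBouncer configuration for several components could look like the output of the following sketch, which renders a `[databases]` section with one entry per logical database. The user names, pool size, and helper names are assumptions for illustration. Only the `host`, `port`, `auth_user`, and `pool_size` connection keys are taken from the PgBouncer configuration format.

```ruby
# Render one entry of the `[databases]` section of `pgbouncer.ini`.
# Options that are nil are left out of the rendered line.
def pgbouncer_database_entry(name, host:, port: 5432, auth_user: nil, pool_size: nil)
  params = { 'host' => host, 'port' => port, 'auth_user' => auth_user, 'pool_size' => pool_size }
  rendered = params.compact.map { |key, value| "#{key}=#{value}" }.join(' ')
  "#{name} = #{rendered}"
end

# Render the whole section for a set of logical databases.
def pgbouncer_databases_section(databases)
  lines = databases.map { |name, options| pgbouncer_database_entry(name, **options) }
  (['[databases]'] + lines).join("\n")
end

# Hypothetical per-component settings, each with its own PgBouncer user.
DATABASES = {
  'gitlabhq_production' => { host: '127.0.0.1', auth_user: 'pgbouncer' },
  'registry' => { host: '127.0.0.1', auth_user: 'pgbouncer_registry', pool_size: 20 }
}.freeze

puts pgbouncer_databases_section(DATABASES)
```
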
#### Level 4

Configuration of a database server cluster in high-availability (HA) mode. At
this level, Omnibus GitLab supports various deployment models, ranging from _one
cluster for all databases_ to _one cluster per database_.

Therefore, the HA configuration of logical databases must be independent of the
deployment model.

Consul [services](https://developer.hashicorp.com/consul/docs/services/configuration/services-configuration-reference)
can have multiple health checks and [watches](https://developer.hashicorp.com/consul/docs/dynamic-app-config/watches#service).
At this level, Omnibus GitLab defines _a Consul service per database cluster_
and _a service watch per logical database_.

Omnibus GitLab configures [Patroni to register a Consul service](https://patroni.readthedocs.io/en/latest/yaml_configuration.html#consul).
The name of the service is the `scope` parameter, and its tag is the role of the
node, which can be one of `master`, `primary`, `replica`, or `standby-leader`.
Omnibus GitLab uses this service name, which is the same as the scope of the
Patroni cluster, to address a database cluster and associate it with any logical
database that the cluster serves.

This is done with Consul watches that track Patroni services. They find cluster
leaders and notify PgBouncer with the details of both the database cluster and
the logical database.

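The relationship between clusters, services, and watches can be sketched as follows: one Consul service name per Patroni scope, and one service watch per logical database served by that cluster. This is only an illustration. The cluster names, the handler path, and its flags are hypothetical, and the generated structure only uses the `type`, `service`, `handler_type`, and `args` fields of a Consul watch definition.

```ruby
require 'json'

# Hypothetical mapping of Patroni scopes (which are also the Consul service
# names) to the logical databases that each cluster serves.
CLUSTERS = {
  'gitlab-main' => %w[gitlabhq_production gitlabhq_production_ci],
  'gitlab-registry' => %w[registry]
}.freeze

# Build one service watch per logical database. Each watch tracks the service
# of the cluster that serves the database and passes both names to a handler.
def service_watches(clusters, handler:)
  clusters.flat_map do |scope, databases|
    databases.map do |database|
      {
        'type' => 'service',
        'service' => scope,
        'handler_type' => 'script',
        'args' => [handler, '--cluster', scope, '--database', database]
      }
    end
  end
end

# The handler path and its flags are placeholders for this sketch.
watches = service_watches(CLUSTERS, handler: '/opt/example/notify-pgbouncer')
puts JSON.pretty_generate({ 'watches' => watches })
```

Because each watch carries both the cluster and the database name, the same structure covers _one cluster for all databases_ and _one cluster per database_ without changing the handler.
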
#### Level 5

Automated or assisted transition from previous deployment models. Not all
components require this level of support but, in some cases, where a recommended
yet deprecated database configuration is in use, Omnibus GitLab may provide
specialized tools or procedures to allow transitioning to the new database
model. In most cases, this is not supported unless specified.

### Design overview

Each component manages every aspect of its own database requirements, _except
its database users_. This means that the component-specific implementation of
database operations is done in the specific cookbook of each component. For
example, Rails or Registry database requirements are exclusively addressed in
the `gitlab` and `registry` cookbooks and not in the `postgresql`, `pgbouncer`,
or `patroni` cookbooks.

The database users are excluded because only `SUPERUSER` or users with the
`CREATEROLE` privilege can create PostgreSQL users. Due to security
considerations, we do not grant this privilege to users that connect over a TCP
connection. So components that may connect to a remote database do not have the
permission to create their users.

Hence each component creates its own database objects, _except its database user_.
The `postgresql` and `patroni` cookbooks create the database users, but each
component creates the rest of its database objects. The database users must have
the `CREATEDB` privilege to allow components to create their own `DATABASE` and
trusted `EXTENSION`.

To impose a structure and fix some of the shortcomings of this approach, such as
locality and limited reusability, we use the [Chef resource model](https://docs.chef.io/resources/)
and leverage [custom resources](https://docs.chef.io/custom_resources/) for
database configuration and operations, including:

- Managing the lifecycle of component-specific database objects
- Running application-specific database migrations
- Setting up PgBouncer to serve the application
- Setting up Consul watches to track Patroni clusters

Cross-cutting concerns such as a [central on/off switch for auto-migration](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/7716),
logging control, and [pre-flight checks](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5428)
are addressed with [helper classes](https://docs.chef.io/helpers/) that are
available to all components. The `package` cookbook is a suitable place for
these helpers.

Helper classes also provide a place to translate the existing user configuration
model (in `gitlab.rb`) to the new model needed for management of
multiple databases.

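The translation role of a helper class might look like the following sketch, which turns flat, legacy-style settings into a per-database structure. The class name, the setting keys (including `additional_databases`), and the default values are all hypothetical and only illustrate the idea. They are not the actual `gitlab.rb` options or the helper proposed in the development document.

```ruby
# Hypothetical helper that translates flat, legacy-style settings into a
# structure keyed by logical database name.
class DatabaseConfigTranslator
  def initialize(legacy_settings)
    @legacy = legacy_settings
  end

  # Returns { database_name => connection_details }. Connection details that
  # are not overridden are shared by all databases.
  def translate
    shared = {
      'host' => @legacy.fetch('db_host', '127.0.0.1'),
      'port' => @legacy.fetch('db_port', 5432),
      'username' => @legacy.fetch('db_username', 'gitlab')
    }
    database_names.to_h { |name| [name, shared.merge('database' => name)] }
  end

  private

  # The primary database first, followed by any additional ones.
  def database_names
    [@legacy.fetch('db_database', 'gitlabhq_production')] +
      @legacy.fetch('additional_databases', [])
  end
end

legacy = { 'db_host' => 'db.example.internal', 'additional_databases' => ['gitlabhq_production_ci'] }
DatabaseConfigTranslator.new(legacy).translate.each do |name, details|
  puts "#{name}: #{details}"
end
```

Keeping this translation in one helper means existing options keep working while components consume only the new per-database model.
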
### Implementation details

The [development document](../../development/database_support.md) provides
implementation details and concrete examples for the proposed design.
