ADR for Backup Encryption #167

Merged
mikeshootzz merged 8 commits into master from adr-backup-encryption
Oct 21, 2025
Conversation

Contributor

@mikeshootzz mikeshootzz commented Oct 6, 2025

Summary

Documents the decision to use rclone as an S3 encryption proxy for database backups. Rclone will deploy per-instance to transparently encrypt all backup data before it reaches object storage, working with both CNPG and MariaDB Operator.
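For orientation, the proxy setup described in the summary could look roughly like the following. This is a minimal sketch under stated assumptions, not the configuration from the ADR: the remote names `objectstore` and `encrypted`, the endpoint, and the bucket are hypothetical placeholders. The option keys (`type`, `provider`, `endpoint`, `env_auth`, `remote`, `password`) are options of rclone's S3 and crypt backends, and rclone expects the crypt password in the obscured form produced by `rclone obscure`.

```python
import configparser
import io


def build_rclone_config(endpoint: str, bucket: str, obscured_password: str) -> str:
    """Render an rclone config: an S3 remote wrapped by a crypt remote."""
    # Interpolation is disabled so that '%' in values is written verbatim.
    config = configparser.ConfigParser(interpolation=None)
    # Hypothetical remote name for the real object storage backend.
    config["objectstore"] = {
        "type": "s3",
        "provider": "Other",
        "endpoint": endpoint,
        "env_auth": "true",  # take the S3 access keys from the environment
    }
    # Hypothetical crypt remote that encrypts everything written through it.
    config["encrypted"] = {
        "type": "crypt",
        "remote": f"objectstore:{bucket}",
        "password": obscured_password,
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# Placeholder values only; a real deployment would read these from a Secret.
print(build_rclone_config("https://s3.example.com", "db-backups", "<output of rclone obscure>"))
```

With a config like this, the proxy itself would presumably be started with `rclone serve s3 encrypted:`, and the operators would be pointed at that endpoint instead of the real bucket.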

@mikeshootzz mikeshootzz added the decision A decision that changes the architecture label Oct 6, 2025
@mikeshootzz mikeshootzz marked this pull request as draft October 6, 2025 07:48
@mikeshootzz mikeshootzz force-pushed the adr-backup-encryption branch from b06a4e3 to 80c3680 Compare October 6, 2025 15:16
@mikeshootzz mikeshootzz requested review from a team, Kidswiss, TheBigLee, mdnix and zugao and removed request for a team October 14, 2025 13:41
@mikeshootzz mikeshootzz marked this pull request as ready for review October 14, 2025 13:41

== Decision

We will use rclone as an S3 encryption proxy to provide encrypted backups for MariaDB Galera and PostgreSQL services. Rclone will be deployed as a separate service in each instance namespace, configured with crypt remotes to transparently encrypt all backup data before it reaches the object storage backend.
Member

Have you created a small PoC to test backup and restore with CNPG or MariaDB using the rclone solution?

Contributor

I agree, a quick test of whether the operators are actually compatible with this would be nice.

It doesn't even have to be via AppCat. Simply test it directly with cnpg and mariadb-operator resources.

Contributor

@zugao zugao Oct 15, 2025

I agree here, we need a solid PoC, without AppCat. We have to use the CRDs from the operators to verify that:

  1. The backup runs, and we can confirm (via s3cmd, for instance) that the data is encrypted, safe, and can be decrypted.
  2. A restore pointed at that S3 endpoint completes without issues.
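The first check could be partly automated. A rough sketch, assuming the documented rclone crypt file format in which every encrypted file starts with the 8-byte magic `RCLONE\x00\x00`: read the first bytes of a backup object directly from the real bucket (bypassing the proxy) and look for that header. The function name is hypothetical, and the check only indicates that an object was written through a crypt remote. It does not prove the object can be decrypted, which is what the restore test in point 2 covers.

```python
RCLONE_CRYPT_MAGIC = b"RCLONE\x00\x00"  # 8-byte magic at the start of rclone crypt files


def looks_like_rclone_crypt(first_bytes: bytes) -> bool:
    """Return True if the object starts with the rclone crypt magic header."""
    return first_bytes.startswith(RCLONE_CRYPT_MAGIC)


# A plaintext gzip stream, as an unencrypted backup might produce, fails the check.
print(looks_like_rclone_crypt(b"\x1f\x8b\x08\x00"))
# A crypt file begins with the magic, followed by a 24-byte nonce.
print(looks_like_rclone_crypt(RCLONE_CRYPT_MAGIC + bytes(24)))
```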

Member

+1 to that from my side. The PoC should also test throughput with various resource settings (especially CPU limits).

@@ -0,0 +1,133 @@
= ADR 42 - Backup Encryption for MariaDB Galera and PostgreSQL
Member

Suggested change
= ADR 42 - Backup Encryption for MariaDB Galera and PostgreSQL
= ADR 0042 - Backup Encryption for MariaDB Galera and PostgreSQL

** xref:adr/0037-mariadb-bitnami-replacement.adoc[]
** xref:adr/0038-appcat-redis-alternative.adoc[] No newline at end of file
** xref:adr/0038-appcat-redis-alternative.adoc[]
** xref:adr/42-backup-encryption-for-mariadb-galera-and-postgresql.adoc[] No newline at end of file
Member

@mdnix mdnix Oct 15, 2025

You will also have to add your document to: docs/modules/ROOT/pages/adr/index.adoc
And for consistency change the filename to 0042-backup-encryption-for-mariadb-galera-and-postgresql.adoc
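The naming convention asked for here (four-digit, zero-padded ADR number followed by the slugified title) can be sketched as below. The helper `adr_filename` is hypothetical and not part of the repository; it only illustrates how the suggested filename follows from the ADR number and title.

```python
import re


def adr_filename(number: int, title: str) -> str:
    """Build an ADR filename: zero-padded number, hyphenated lowercase title."""
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{number:04d}-{slug}.adoc"


print(adr_filename(42, "Backup Encryption for MariaDB Galera and PostgreSQL"))
# prints: 0042-backup-encryption-for-mariadb-galera-and-postgresql.adoc
```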


With the decision to use rclone as an encryption proxy for MariaDB Galera and PostgreSQL backups, we need to:

* Implement rclone deployment manifests for each database instance namespace
Contributor

@zugao zugao Oct 15, 2025

We might want to evaluate whether we could use a sidecar instead of a deployment, or another solution such as a controller or an operator. Having a deployment per instance can and will increase resource consumption.

Member

Or as a sort of pre-backup pod that only gets started when an actual backup is run (no idea how feasible this actually is, though).

Contributor

@zugao zugao left a comment

I would further investigate whether we could use a controller or even an operator that sits between the service operator and Crossplane. It could manage, perhaps via a CR, the layer between XObjectBucket and the composite.


* Implement rclone deployment manifests for each database instance namespace
* Configure CNPG Barman and MariaDB Operator to use the rclone endpoint instead of direct S3 access
* Establish secret management for rclone encryption passwords
* Monitor rclone proxy performance and availability alongside database services
Member

The performance really needs to be analyzed thoroughly. We want to use as few resources as possible to keep the overhead low, but enough that the backup data transfer does not stall.
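As one input to that analysis, the storage-side overhead of rclone crypt can be estimated up front. The sketch below assumes the crypt file format as implemented in rclone: a 32-byte file header (8-byte magic plus 24-byte nonce) and a 16-byte Poly1305 authenticator for every chunk of up to 64 KiB of data. The function name is hypothetical, and filename encryption is not considered. The result suggests that the size overhead is small, so CPU cost and throughput are what the PoC still has to measure.

```python
HEADER_BYTES = 32             # 8-byte magic + 24-byte nonce per file
CHUNK_DATA_BYTES = 64 * 1024  # plaintext bytes carried by one chunk
CHUNK_OVERHEAD_BYTES = 16     # Poly1305 authenticator per chunk


def encrypted_size(plain_size: int) -> int:
    """Estimate the size of a file after rclone crypt encryption."""
    # Integer ceiling division: the last chunk may be partially filled.
    chunks = (plain_size + CHUNK_DATA_BYTES - 1) // CHUNK_DATA_BYTES
    return HEADER_BYTES + plain_size + chunks * CHUNK_OVERHEAD_BYTES


one_gib = 1024 ** 3
overhead = encrypted_size(one_gib) - one_gib
print(f"1 GiB backup grows by {overhead} bytes ({overhead / one_gib:.4%})")
```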


@mikeshootzz mikeshootzz force-pushed the adr-backup-encryption branch from 828d60f to 18e7b24 Compare October 16, 2025 08:01
Contributor

@Kidswiss Kidswiss left a comment

42!

Member

@tobru tobru left a comment

Great research and architecture document; thanks a lot for putting the effort into it!
My comments mostly relate to improving the structure of the document, so that it follows ADR 0001 more closely.

I like the idea of putting rclone in between the backup source and destination. In a future iteration, this could allow storing backups on targets other than object storage, depending on customer requests and needs!


=== Constellation s3proxy - not viable

The s3proxy is part of the Constellation Kubernetes engine and provides transparent client-side encryption for S3-compatible storage backends. This would enable seamless integration with PostgreSQL and Galera.
Member

Please add links to the mentioned tools for proper context, e.g.

https://s3proxy.com[s3proxy^]


We want to provide encrypted backups for all AppCat services. The backup solutions for MariaDB and PostgreSQL are tightly integrated with their respective operators, which prevents us from using k8up. Neither the MariaDB Operator nor the PostgreSQL Operator support encrypted backups natively, so we need an alternative solution.

== Evaluated Options
Member

This should be part of "Context" to keep the ADR structure intact, so:

Suggested change
== Evaluated Options
=== Evaluated Options

And of course the subsequent headers need to be aligned


== Evaluated Options

=== Constellation s3proxy - not viable
Member

Do not add any judgment in the title yet; this comes in the conclusion section. The "Context" section is neutral in that regard.

Suggested change
=== Constellation s3proxy - not viable
=== Constellation s3proxy


==== Storing encryption passwords

Since we don't plan to rotate the password used for the encryption key, the password can be saved inside of a Kubernetes secret. This is also how we do it with the repository password for K8up.
Member

Just to keep in mind: password rotation could become necessary if we somehow leak the password or get hacked.


=== Rclone - recommended

Rclone is a command-line program for managing files on cloud storage, supporting various backends including S3-compatible storage. Using `rclone serve` with crypt remotes, we can implement a transparent encryption proxy.
Member

Same here as above, link to external resources where it makes sense, e.g. to the rclone website. This is very important to get the correct context when reading.


This is the case for both CNPG and MariaDB.

== Using Rclone with CNPG and MariaDB
Member

Suggested change
== Using Rclone with CNPG and MariaDB
== Using Rclone with CloudNativePG and MariaDB

Abbreviations in titles are not always good.

@mikeshootzz mikeshootzz merged commit df6e997 into master Oct 21, 2025
1 check passed
@mikeshootzz mikeshootzz deleted the adr-backup-encryption branch October 21, 2025 05:55
Labels

decision A decision that changes the architecture

6 participants