Conversation

@pooknull pooknull commented Feb 18, 2025

K8SPSMDB-1237

https://perconadev.atlassian.net/browse/K8SPSMDB-1237

DESCRIPTION

This PR introduces support for incremental physical backups.

New backup types for the field .spec.type have been added to the PerconaServerMongoDBBackup resource:

  • incremental: makes an incremental backup. Important note: the percona.com/delete-backup finalizer and the .spec.backup.tasks.[].keep option are not available for this type.
  • incremental-base: makes the full base backup that starts an incremental backup chain
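
For illustration, a minimal manifest using the new type might look like this (a sketch; the cluster and storage names are placeholders):

```yaml
apiVersion: psmdb.percona.com/v1
kind: PerconaServerMongoDBBackup
metadata:
  name: backup1
spec:
  clusterName: my-cluster   # placeholder cluster name
  storageName: s3-us-west   # placeholder storage name
  type: incremental-base    # full base backup; use "incremental" for subsequent backups
```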

CHECKLIST

Jira

  • Is the Jira ticket created and referenced properly?
  • Does the Jira ticket have the proper statuses for documentation (Needs Doc) and QA (Needs QA)?
  • Does the Jira ticket link to the proper milestone (Fix Version field)?

Tests

  • Is an E2E test/test case added for the new feature/change?
  • Are unit tests added where appropriate?
  • Are OpenShift compare files changed for E2E tests (compare/*-oc.yml)?

Config/Logging/Testability

  • Are all needed new/changed options added to default YAML files?
  • Are all needed new/changed options added to the Helm Chart?
  • Did we add proper logging messages for operator actions?
  • Did we ensure compatibility with the previous version or cluster upgrade process?
  • Does the change support oldest and newest supported MongoDB version?
  • Does the change support oldest and newest supported Kubernetes version?

@pull-request-size pull-request-size bot added the size/XXL 1000+ lines label Feb 18, 2025
@hors hors added this to the v1.20.0 milestone Feb 25, 2025
@github-actions github-actions bot added the tests label Feb 25, 2025
@pooknull pooknull marked this pull request as ready for review February 25, 2025 13:45
@hors hors left a comment

@hors hors self-requested a review February 25, 2025 14:42
Comment on lines 52 to 55
```go
const (
	BackupTypeIncrementalBase defs.BackupType = defs.IncrementalBackup + "-base"
)
```

@gkech gkech Feb 26, 2025

We can also move this type to pkg/apis/psmdb/v1/perconaservermongodbbackup_types.go, since it is more closely related to:

```go
type PerconaServerMongoDBBackupSpec struct {
	...
	// +kubebuilder:validation:Enum={logical,physical,incremental}
	Type defs.BackupType `json:"type,omitempty"`
}
```
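
If the constant were moved as suggested, the spec might end up roughly like this (a sketch; it assumes the enum is also extended to accept the new value, which matches the validation error reported later in this thread):

```go
// pkg/apis/psmdb/v1/perconaservermongodbbackup_types.go (sketch)
// "defs" is PBM's defs package (github.com/percona/percona-backup-mongodb/pbm/defs).

// BackupTypeIncrementalBase marks the full base backup of an incremental chain.
const BackupTypeIncrementalBase defs.BackupType = defs.IncrementalBackup + "-base"

type PerconaServerMongoDBBackupSpec struct {
	// +kubebuilder:validation:Enum={logical,physical,incremental,incremental-base}
	Type defs.BackupType `json:"type,omitempty"`
}
```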


@pooknull pooknull requested a review from egegunes February 28, 2025 12:16
gkech
gkech previously approved these changes Mar 4, 2025
nmarukovich
nmarukovich previously approved these changes Mar 4, 2025
egegunes
egegunes previously approved these changes Mar 4, 2025
hors added a commit that referenced this pull request Mar 4, 2025
* K8SPSMDB-1219: PBM multi storage support

The operator has always supported multiple storages, but PBM didn't have
native support for multiple backup storages until v2.6.0. The operator
was reconfiguring PBM every time the user selected a storage for their
backups/restores different from the previous one. This caused long wait
periods, especially with storages containing lots of backups, due to the
resync operation.

Another limitation was forcing users to have only one backup storage if
they wanted to enable point-in-time recovery. With multiple storages,
PBM would upload oplog chunks to whichever storage was last used by a
backup/restore, making consistent recovery impossible.

PBM v2.6.0 added native support for multiple storages and these changes
introduce it to our operator:

* The user can have one main storage in the PBM configuration. Any other
  storages can be added as profiles.

The main storage configuration can be checked with:
```
kubectl exec cluster1-rs0-0 -c backup-agent -- pbm config
```

This commit introduces a new field `main` in the storage spec:
```
storages:
  s3-us-west:
    main: true
    type: s3
    s3:
      bucket: operator-testing
      credentialsSecret: cluster1-s3-secrets
      region: us-west-2
```

If the user has only one storage configured in `cr.yaml`, the operator
will automatically use it as the main storage. If more than one storage
is configured, exactly one of them must have `main: true`.
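
For example, a two-storage configuration might look like this (a sketch based on the example above; the second bucket name is a placeholder):

```yaml
storages:
  s3-us-west:
    main: true             # PBM main storage; oplog chunks go here
    type: s3
    s3:
      bucket: operator-testing
      credentialsSecret: cluster1-s3-secrets
      region: us-west-2
  s3-eu-west:              # added to PBM as a profile
    type: s3
    s3:
      bucket: operator-testing-eu
      credentialsSecret: cluster1-s3-secrets
      region: eu-west-1
```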

If the user changes the main storage in `cr.yaml`, the operator will
configure PBM with the new storage and start a resync.

Any other storage in `cr.yaml` will be added to PBM as a profile.

The user can list profiles using the CLI:
```
kubectl exec cluster1-rs0-0 -c backup-agent -- pbm profile list
```

When the user adds a new profile to `cr.yaml`, the operator will add it
to PBM but won't start a resync.

**`pbm config --force-resync` only starts a resync for the main storage.**

To manually resync a profile:
```
kubectl exec cluster1-rs0-0 -c backup-agent -- pbm profile sync <storage-name>
```

If the user starts a restore using a backup in a storage configured as a
profile, the operator will start a resync operation for the profile and
block the restore until the resync finishes.

Note: Profiles are also called external storages in PBM documentation.

If the user has multiple storages in `cr.yaml` and changes the main
storage between them, the operator:
1. configures PBM with the new main storage.
2. adds the old main storage as a profile.
3. deletes the profile for the new main storage.

If the user configures `backupSource` in backups/restores:
* if `cr.yaml` has no storages configured, the operator configures PBM
  with the storage data in the `backupSource` field. This storage will
  effectively be the main storage until the user adds a storage to
  `cr.yaml`. After a storage is configured, the PBM configuration will
  be overwritten and the `backupSource` storage will be gone.
* if `cr.yaml` has a storage configured, the operator adds the
  `backupSource` storage as a profile.
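
As a sketch, a restore that brings its own storage via `backupSource` could look like this (the destination, bucket, and secret names are placeholders):

```yaml
apiVersion: psmdb.percona.com/v1
kind: PerconaServerMongoDBRestore
metadata:
  name: restore1
spec:
  clusterName: my-cluster
  backupSource:
    type: physical
    destination: s3://operator-testing/2025-03-04T10:00:00Z
    s3:
      bucket: operator-testing
      credentialsSecret: cluster1-s3-secrets
      region: us-west-2
```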

* Oplog chunks will only be uploaded to the main storage.

The user can use any backup as the base backup for point-in-time recovery.

* All backups in an incremental backup chain need to be stored in the same storage.

TBD after #1836 is merged.

---

Other significant changes in operator behavior:

* The operator now automatically configures PBM on a fresh cluster.

  Before these changes, PBM was not configured until the user started a
  backup/restore after deploying a fresh cluster. Now PBM will be
  configured directly with the main storage in `cr.yaml` and a resync
  will be started in the background.

  There's a new field in the `PerconaServerMongoDB` status:
  `backupConfigHash`. The operator will maintain a hash of the current
  PBM configuration and reconfigure PBM if the hash changes. Fields in
  `spec.backup.pitr` are excluded from the hash calculation; they're
  handled separately.
* If `PerconaServerMongoDB` is annotated with `percona.com/resync-pbm=true`,
  the operator will start a resync operation both for the main storage
  and the profiles (see the example after this list). Resync for
  profiles is started with the equivalent of `pbm profile sync --all`.
  These resync operations will run in the background and will not block
  the reconciliation.
* If a backup that has the `percona.com/delete-backup` finalizer is
  deleted, the operator will only delete oplog chunks if the backup is
  in the main storage.
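
For instance, the annotation-triggered resync could be kicked off like this (the cluster name `my-cluster` is a placeholder; `psmdb` is the short name of the `PerconaServerMongoDB` resource):

```
kubectl annotate psmdb my-cluster percona.com/resync-pbm=true
```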

* fix users

* fix repeatedly sending backup commands

* fix users

* address review comments

* assert err in main storage unit test

* don't use break labels

---------

Co-authored-by: Viacheslav Sarzhan <[email protected]>
@hors hors dismissed stale reviews from egegunes, gkech, and nmarukovich via b5c517f March 4, 2025 23:21
```yaml
# - name: weekly-s3-us-west-incremental
#   enabled: false
#   schedule: "0 1 * * *"
#   keep: 5
```
Collaborator
@pooknull please check how the retention works for incremental backups. We can implement it in the next release, but for now we need to remove the "keep" option from incremental.
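
After removing `keep`, the scheduled task from the commented example above might look like this (a sketch; the storage name is assumed):

```yaml
tasks:
  - name: weekly-s3-us-west-incremental
    enabled: true
    schedule: "0 1 * * *"
    type: incremental
    storageName: s3-us-west   # assumed storage name
```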


```yaml
#   enabled: false
#   schedule: "0 1 * * *"
#   keep: 5
#   type: incremental
```
Collaborator
@pooknull I have an error when I try to use it:

```
# perconaservermongodbs.psmdb.percona.com "some-name" was not valid:
# * spec.backup.tasks[0].type: Unsupported value: "incremental": supported values: "logical", "physical"
```

Collaborator
We need to skip the percona.com/delete-backup finalizer for incremental backups.
If we delete the base backup, all incremental backups should be deleted as well.


@github-actions github-actions bot added the dependencies Pull requests that update a dependency file label Mar 6, 2025
@hors hors self-requested a review March 6, 2025 18:54
@JNKPercona

Test name Status
arbiter passed
balancer passed
cross-site-sharded passed
custom-replset-name passed
custom-tls passed
custom-users-roles passed
custom-users-roles-sharded passed
data-at-rest-encryption passed
data-sharded passed
demand-backup passed
demand-backup-eks-credentials-irsa passed
demand-backup-fs passed
demand-backup-incremental passed
demand-backup-incremental-sharded passed
demand-backup-physical passed
demand-backup-physical-sharded passed
demand-backup-sharded passed
expose-sharded passed
finalizer passed
ignore-labels-annotations passed
init-deploy passed
ldap passed
ldap-tls passed
limits passed
liveness passed
mongod-major-upgrade passed
mongod-major-upgrade-sharded passed
monitoring-2-0 passed
multi-cluster-service passed
multi-storage passed
non-voting passed
one-pod passed
operator-self-healing-chaos passed
pitr passed
pitr-physical passed
pitr-sharded passed
preinit-updates passed
pvc-resize passed
recover-no-primary passed
replset-overrides passed
rs-shard-migration passed
scaling passed
scheduled-backup passed
security-context passed
self-healing-chaos passed
service-per-pod passed
serviceless-external-nodes passed
smart-update passed
split-horizon passed
stable-resource-version passed
storage passed
tls-issue-cert-manager passed
upgrade passed
upgrade-consistency passed
upgrade-consistency-sharded-tls passed
upgrade-sharded passed
users passed
version-service passed
We ran 58 out of 58 tests.

commit: ed8e6bc
image: perconalab/percona-server-mongodb-operator:PR-1836-ed8e6bc3

@hors hors merged commit 796b6e3 into main Mar 7, 2025
18 of 19 checks passed
@hors hors deleted the K8SPSMDB-1237 branch March 7, 2025 09:49