Description
When recovering a database previously configured with WAL failover, the OPTIONS file encodes the directory that was previously used as the secondary (which may contain relevant WAL files that need to be replayed). Open will error out unless the directory encoded in the OPTIONS file is provided either as a recovery directory or as the current WAL failover secondary.
However, it's also possible for an operator to accidentally remount the wrong disks in the wrong places, so that the correct directory path is configured but its contents are incorrect. We should persist a stable identifier (a UUID?) to both the OPTIONS file and a file within the secondary directory. If recovery finds that the secondary directory does not contain a matching identifier, we can abort recovery with an error indicating that the secondary seems incorrect / corrupt.
This would have helped with a recent DRT test cluster issue, where the loss of a VM's host forced the VM to migrate and the node came back up with incorrect disk mountpoints.
Jira issue: PEBBLE-358
Epic PEBBLE-1158