Commit 931a06a

fix: disable management of end-of-wal file flag during backup restoration (#604)
When the end of the WAL stream is reached, the parallel WAL restore feature attempts to predict the names of subsequent WAL files to restore and records the first missing WAL file. On high-availability (HA) replicas, if PostgreSQL requests the first missing WAL file, the code returns an error status that prompts PostgreSQL to switch to streaming replication.

Currently, the code assumes a `wal_segment_size` of 16MB when predicting the next WAL file names. If the configured WAL segment size exceeds 16MB, it may request non-existent WAL files. For instance, with 16MB segments, the names range from `000000010000000100000000` to `0000000100000001000000FF` before the log number (the middle eight hex digits) increments. For 1GB segments, they range only from `000000010000000100000000` to `000000010000000100000003`. Under the 16MB assumption, the code looks for the WALs from `000000010000000100000004` to `0000000100000001000000FF`, which do not exist with 1GB segments.

While this assumption does not affect HA replicas, which can shift to streaming mode, it's problematic for a PostgreSQL instance seeking consistency after a restore, as the restore process will fail.

This patch disables end-of-wal file marker management during backup restoration, addressing restore issues for backups that were:

1. using a custom WAL file segment size
2. utilizing parallel WAL recovery
3. initiated on one WAL segment and concluded on a different one

Fixes: #603

Signed-off-by: Leonardo Cecchi <[email protected]>
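The file-name arithmetic described in the commit message can be sketched as follows. This is only an illustration, not the plugin's code: `walSegmentsPerLog` and `nextWALName` are hypothetical helpers, and the sketch assumes the standard PostgreSQL naming scheme in which a WAL file name is three 8-digit hex fields (timeline, log number, segment number) and each log number covers 4 GiB of WAL.

```go
package main

import "fmt"

// walSegmentsPerLog returns how many WAL segments share one log number
// (the middle eight hex digits of the file name). Each log number covers
// 4 GiB of WAL, so the count depends on wal_segment_size.
func walSegmentsPerLog(segmentSizeBytes uint64) uint64 {
	return 0x100000000 / segmentSizeBytes
}

// nextWALName is a hypothetical helper: given the timeline, log, and
// segment numbers of a WAL file, it returns the name of the file that
// follows it for the given segment size.
func nextWALName(timeline, log, seg uint32, segmentSizeBytes uint64) string {
	perLog := uint32(walSegmentsPerLog(segmentSizeBytes))
	seg++
	if seg >= perLog {
		// The segment counter wraps and the log number increments.
		seg = 0
		log++
	}
	return fmt.Sprintf("%08X%08X%08X", timeline, log, seg)
}
```

With 1GB segments, `nextWALName(1, 1, 3, 1<<30)` rolls over to the next log number, whereas the same call with a 16MB segment size predicts segment `04` of the same log, which is the mismatch the commit message describes.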
1 parent 8ec400a commit 931a06a

1 file changed, +8 -1 lines changed

internal/cnpgi/common/wal.go

Lines changed: 8 additions & 1 deletion
@@ -428,7 +428,14 @@ func isStreamingAvailable(cluster *cnpgv1.Cluster, podName string) bool {
 		return false
 	}
 
-	// Easy case: If this pod is a replica, the streaming is always available
+	// Easy case take 1: we are helping PostgreSQL to create the first
+	// instance of a Cluster. No streaming connection is possible.
+	if cluster.Status.CurrentPrimary == "" {
+		return false
+	}
+
+	// Easy case take 2: If this pod is a replica, the streaming is always
+	// available
 	if cluster.Status.CurrentPrimary != podName {
 		return true
 	}
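
The ordering of the two "easy cases" in the hunk above can be exercised in isolation with a minimal sketch. It is not the plugin's implementation: `clusterStatus`, `streamingDecision`, and `easyCases` are hypothetical stand-ins, and the `undecided` result marks the point where the real `isStreamingAvailable` continues with logic that this diff does not show.

```go
package main

// clusterStatus is a hypothetical stand-in for the part of the
// cnpgv1.Cluster status that the patched checks read.
type clusterStatus struct {
	CurrentPrimary string
}

// streamingDecision is a hypothetical three-valued result used only in
// this sketch; the real function returns a bool.
type streamingDecision int

const (
	streamingUnavailable streamingDecision = iota
	streamingAvailable
	undecided // the real function goes on to checks not shown in the diff
)

// easyCases mirrors the order of the two checks in the patched hunk.
func easyCases(status clusterStatus, podName string) streamingDecision {
	// Take 1: no primary has been elected yet, so the first instance is
	// still being created (for example, from a backup) and there is
	// nothing to stream from.
	if status.CurrentPrimary == "" {
		return streamingUnavailable
	}
	// Take 2: this pod is a replica, so it can stream from the primary.
	if status.CurrentPrimary != podName {
		return streamingAvailable
	}
	return undecided
}
```

The order matters: without the first check, an empty `CurrentPrimary` never equals the pod name, so a pod restoring the first instance would be treated as a replica with streaming available.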
