Fix how flush_pos is compared to commit_lsn. #50

VadimPushtaev · 2024-09-23T11:12:37Z

When we logically replicate the message with LSN equal to exactly X, we need to wait for `flush_pos` of other replicas to become strictly greater than X, not greater or equal.

Both `restart_lsn` and `confirmed_flush` represent the oldest LSN that must still be kept so it can still be replicated. That means that the record at that LSN might NOT yet be replicated.
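The difference between the two comparisons can be sketched in isolation. This is a minimal illustration, not the code touched by this PR: `lsn_t` is a stand-in for PostgreSQL's 64-bit WAL position type, and the function names `standby_caught_up_inclusive`, `standby_caught_up_strict`, and `boundary_case_example` are hypothetical. It assumes, as described above, that a `flush_pos` equal to X means the record at X may still be pending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for a 64-bit WAL position (hypothetical type name). */
typedef uint64_t lsn_t;

/*
 * Inclusive check: treats flush_pos == commit_lsn as "already replicated".
 * Under the reasoning in this PR, that is one record too early.
 */
static bool standby_caught_up_inclusive(lsn_t flush_pos, lsn_t commit_lsn)
{
    return flush_pos >= commit_lsn;
}

/*
 * Strict check: the standby counts as caught up only once its flush_pos
 * has moved past commit_lsn.
 */
static bool standby_caught_up_strict(lsn_t flush_pos, lsn_t commit_lsn)
{
    return flush_pos > commit_lsn;
}

/* Usage sketch: the two checks differ only when flush_pos == commit_lsn. */
static void boundary_case_example(void)
{
    lsn_t commit_lsn = 100;

    assert(standby_caught_up_inclusive(100, commit_lsn));  /* releases too early */
    assert(!standby_caught_up_strict(100, commit_lsn));    /* keeps waiting */
    assert(standby_caught_up_strict(101, commit_lsn));     /* releases once past X */
}
```

The two predicates agree everywhere except at the boundary, which matches the observation that exactly one extra record reached the logical replica.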

--

I ran a simple experiment to check how `pg_failover_slots.standby_slot_names` behaves and noticed that my logical replica always gets one extra record when the physical replica becomes unavailable. With this fix it works as expected: the logical replica is never delayed more than necessary, and no record that has not been replicated to the physical replica is replicated to the logical one.


petere requested a review from PJMODOS September 18, 2025 18:24
