@rdunklau
Contributor

Instead of blocking indefinitely while waiting for a replication slot to become syncable, introduce a new GUC, pg_failover_slots.sync_timeout, after which we move on to the next slot.

To avoid restarting the wait from scratch on each attempt, we create the replication slots as temporary rather than ephemeral, which lets them keep their state between runs. Once a slot is finally synced, we persist it to disk.
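To illustrate the intended control flow, here is a minimal Python model of one sync pass. It is only a sketch, not the extension's C code: `SlotState`, `sync_pass`, and `is_syncable` are hypothetical names, and the timeout is treated as seconds purely for simplicity, since the unit of the GUC is not stated here.

```python
import time
from dataclasses import dataclass


@dataclass
class SlotState:
    """Hypothetical stand-in for a slot on the standby."""
    name: str
    persisted: bool = False  # False models a temporary slot that is not synced yet
    attempts: int = 0        # state that survives from one pass to the next


def sync_pass(slots, is_syncable, sync_timeout, poll_interval=0.01):
    """Try each slot in turn, giving up on a slot once sync_timeout has elapsed."""
    for slot in slots:
        if slot.persisted:
            continue  # already synced and persisted in an earlier pass
        deadline = time.monotonic() + sync_timeout
        while True:
            slot.attempts += 1
            if is_syncable(slot.name):
                slot.persisted = True  # models persisting the slot to disk
                break
            if time.monotonic() >= deadline:
                break  # timed out: keep the temporary slot and move to the next one
            time.sleep(poll_interval)
    return [s.name for s in slots if s.persisted]
```

In this model, a slot that never becomes syncable costs at most `sync_timeout` per pass, and the slots after it are still processed.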

Since we no longer block in the waiting state, we need to clean up any inconsistent slots after promotion.
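The cleanup step can be pictured with the following Python sketch. It assumes that "inconsistent" means a slot that never reached the persisted state. `cleanup_after_promotion` and the dict layout are hypothetical and do not mirror the actual implementation.

```python
def cleanup_after_promotion(slots):
    """Split slots into those kept and those dropped after promotion.

    `slots` maps a slot name to True if the slot was synced and persisted,
    or False if it was still waiting (temporary) when promotion happened.
    """
    kept = sorted(name for name, persisted in slots.items() if persisted)
    dropped = sorted(name for name, persisted in slots.items() if not persisted)
    return kept, dropped
```

For example, `cleanup_after_promotion({"a": True, "b": False})` keeps `a` and drops `b`.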

This solves the possible issue of an inactive slot on the primary preventing all other slots from being synced to the standby.

@kfcss

kfcss commented Oct 5, 2023

We have noticed the same problems. This results in the secondary consuming more disk space than the primary.

Collaborator

@PJMODOS PJMODOS left a comment

This looks interesting. I am starting to wonder whether we need some kind of status function that reports the state of all the slots in a user-consumable way.

@rdunklau
Contributor Author

> This looks interesting. I am starting to wonder whether we need some kind of status function that reports the state of all the slots in a user-consumable way.

For monitoring purposes that's a pretty good idea. As of now, one can rely on the persistence of the slot to infer its status, but it's not ideal.
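As a rough illustration of that workaround, the sketch below classifies slots by the `temporary` column of the standard `pg_replication_slots` view. The helper name and the "waiting"/"synced" labels are hypothetical, and it assumes every slot on the standby is managed by the extension.

```python
# Query to run on the standby; `temporary` is a column of pg_replication_slots.
STATUS_QUERY = (
    "SELECT slot_name, temporary FROM pg_replication_slots ORDER BY slot_name"
)


def infer_sync_status(rows):
    """Map (slot_name, temporary) rows to a coarse status label.

    A temporary slot is taken to be still waiting to sync; a persistent
    slot is taken to be synced. This is an inference, not a reported state.
    """
    return {name: ("waiting" if temporary else "synced") for name, temporary in rows}
```

The rows would come from running `STATUS_QUERY` with any PostgreSQL client. A dedicated status function could expose more detail than this two-state guess.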

@rdunklau
Contributor Author

Do you have any opinion on that design?

@rdunklau
Contributor Author

I just rebased it on the current master branch.

@jmealo

jmealo commented May 1, 2025

@petere @PJMODOS Curious if this will get merged? 🤔
