fix: ensure Watchdog kills ReplicationConnection on failing healthcheck #1763

Merged
edgurgel merged 2 commits into main from fix/replication-watchdog
Mar 22, 2026

Conversation

@edgurgel
Member

What kind of change does this PR introduce?

  • Fix how the Watchdog works, ensuring that the child is terminated.
  • Use Postgrex with query timeouts so that we no longer need the Wrapper GenServer.

Copilot AI review requested due to automatic review settings March 20, 2026 03:57

Copilot AI left a comment

Pull request overview

Fixes ReplicationConnection watchdog behavior so unresponsive replication connections are actively terminated, while simplifying initialization by removing the Wrapper GenServer and relying on Postgrex/DBConnection timeouts.

Changes:

  • Update Watchdog to terminate the ReplicationConnection when health checks time out.
  • Remove the ReplicationConnection Wrapper GenServer and apply a query_timeout to the queries run during the initialization steps.
  • Update tests and bump Postgrex (and related) dependencies.
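
The first change above (terminate the connection when a health check times out) can be sketched as follows. This is a minimal illustration, not the Realtime Watchdog itself: the module name `WatchdogSketch`, the `:health_check` call message, and the `:target`, `:interval`, and `:timeout` options are all assumptions made for the example. It uses only standard `GenServer` and `Process` functions.

```elixir
defmodule WatchdogSketch do
  # Hypothetical module for illustration; not the code from this PR.
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  @impl true
  def init(opts) do
    state = %{
      # pid of the process being watched (assumed option name)
      target: Keyword.fetch!(opts, :target),
      # milliseconds between health checks (assumed option name)
      interval: Keyword.get(opts, :interval, 1_000),
      # milliseconds to wait for a health-check reply (assumed option name)
      timeout: Keyword.get(opts, :timeout, 500)
    }

    schedule(state)
    {:ok, state}
  end

  @impl true
  def handle_info(:check, state) do
    try do
      # Assumes the target replies :ok to a :health_check call.
      :ok = GenServer.call(state.target, :health_check, state.timeout)
      schedule(state)
      {:noreply, state}
    catch
      :exit, _reason ->
        # The call timed out or the target is gone. Kill the target
        # explicitly instead of only stopping the watchdog.
        Process.exit(state.target, :kill)
        {:stop, :normal, state}
    end
  end

  # Ignore anything else, such as a late reply on older OTP releases.
  def handle_info(_msg, state), do: {:noreply, state}

  defp schedule(state), do: Process.send_after(self(), :check, state.interval)
end
```

The point of the sketch is the `catch` branch: a failed health check results in an explicit `Process.exit/2` on the watched process, so an unresponsive connection cannot outlive its watchdog. Whether the real Watchdog uses `:kill` or a graceful stop (the summary mentions a new `stop/2`) is not stated in this conversation.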

Reviewed changes

Copilot reviewed 4 out of 5 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| lib/realtime/tenants/replication_connection/watchdog.ex | On health-check timeout, logs and triggers termination of the replication connection. |
| lib/realtime/tenants/replication_connection.ex | Removes wrapper-based init timeout, adds query_timeout, adds stop/2, and applies timeouts to init queries. |
| test/realtime/tenants/replication_connection_test.exs | Adds watchdog regression test, adjusts timeout test, removes Mimic usage, and improves assert_process_down/2. |
| mix.exs | Bumps Postgrex requirement to ~> 0.21.0. |
| mix.lock | Locks updated versions for postgrex, db_connection, and telemetry. |
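
For readers unfamiliar with Mix version requirements, the `mix.exs` change in the summary above would look roughly like the fragment below. Only the Postgrex requirement comes from the summary; the surrounding `deps/0` shape is the usual Mix convention and the project's other dependencies are omitted.

```elixir
# mix.exs (fragment, other dependencies omitted)
defp deps do
  [
    # "~> 0.21.0" allows any 0.21.x release (>= 0.21.0 and < 0.22.0)
    {:postgrex, "~> 0.21.0"}
  ]
end
```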

@edgurgel edgurgel force-pushed the fix/replication-watchdog branch from 16b5186 to 0290beb Compare March 22, 2026 20:33
@coveralls

Coverage Status

coverage: 92.231% (+0.06%) from 92.168%
when pulling 0290beb on fix/replication-watchdog
into 194632b on main.

@edgurgel edgurgel merged commit 1f85df0 into main Mar 22, 2026
8 checks passed
@edgurgel edgurgel deleted the fix/replication-watchdog branch March 22, 2026 21:28
@realtime-release-bot

🎉 This PR is included in version 2.78.16 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

4 participants