
Patroni Workshop

Patroni training workshop using Docker. This workshop is an updated version of Zalando's 2019 training repository, with Ivory added to the mix for testing purposes.


Set up the environment

Patroni Cluster

Clone, build and deploy

> git clone https://github.com/isaias-sanchez-utiq/patroni-workshop.git
> cd patroni-workshop
> docker compose up -d
> docker ps
CONTAINER ID   IMAGE                       COMMAND                  CREATED         STATUS         PORTS                                                                                                          NAMES
65180f97e573   patroni-workshop-patroni1   "docker-entrypoint.s…"   7 seconds ago   Up 7 seconds   0.0.0.0:5431->5432/tcp, [::]:5431->5432/tcp                                                                    demo-patroni1
3f13f0e38d20   patroni-workshop-patroni3   "docker-entrypoint.s…"   7 seconds ago   Up 7 seconds   0.0.0.0:5433->5432/tcp, [::]:5433->5432/tcp                                                                    demo-patroni3
cb6a6b0bc4d8   haproxy:3.2.6-alpine        "docker-entrypoint.s…"   7 seconds ago   Up 7 seconds   0.0.0.0:5050-5051->5050-5051/tcp, [::]:5050-5051->5050-5051/tcp, 0.0.0.0:9001->9001/tcp, [::]:9001->9001/tcp   demo-haproxy
cdb7ef9c8ed9   patroni-workshop-patroni2   "docker-entrypoint.s…"   7 seconds ago   Up 7 seconds   0.0.0.0:5432->5432/tcp, [::]:5432->5432/tcp                                                                    demo-patroni2
5489fd502e0d   hashicorp/consul:1.18.2     "docker-entrypoint.s…"   7 seconds ago   Up 7 seconds   0.0.0.0:8500->8500/tcp, [::]:8500->8500/tcp, 0.0.0.0:8600->8600/udp, [::]:8600->8600/udp                       demo-consul
1ffd76de0164   veegres/ivory               "entrypoint.sh"          7 seconds ago   Up 7 seconds   0.0.0.0:80->80/tcp, [::]:80->80/tcp                                                                            demo-ivory 

Quick checks

Checking patroni cluster

  • Log in to one of the Patroni containers.
  • Execute the patronictl list command to check the cluster.
> docker exec -it demo-patroni1 bash
postgres@patroni1:/$ patronictl -c /postgres.yml list
+ Cluster: demo (7599608371094798356) ------+----+-------------+-----+------------+-----+
| Member   | Host     | Role    | State     | TL | Receive LSN | Lag | Replay LSN | Lag |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
| patroni1 | patroni1 | Replica | streaming |  1 |   0/5000060 |   0 |  0/5000060 |   0 |
| patroni2 | patroni2 | Replica | streaming |  1 |   0/5000060 |   0 |  0/5000060 |   0 |
| patroni3 | patroni3 | Leader  | running   |  1 |             |     |            |     |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
postgres@patroni1:/$
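For scripting these quick checks, the leader's name can be extracted from the patronictl list table (patronictl also supports -f json for machine-readable output). A minimal sketch parsing a captured copy of the rows above; inside a container you would pipe the live command instead:

```shell
# Extract the leader's member name from a patronictl list table.
# Live usage would be the same awk program on the real command:
#   patronictl -c /postgres.yml list | awk -F'|' '/Leader/ {gsub(/ /,"",$2); print $2}'
table='| patroni1 | patroni1 | Replica | streaming |  1 |   0/5000060 |   0 |  0/5000060 |   0 |
| patroni3 | patroni3 | Leader  | running   |  1 |             |     |            |     |'
leader=$(printf '%s\n' "$table" | awk -F'|' '/Leader/ {gsub(/ /,"",$2); print $2}')
echo "$leader"   # patroni3
```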

Checking HAProxy

  • Connect to Postgres using port 5050 and verify you reach the primary.
  • Connect to Postgres using port 5051 and verify it connects to each node in round-robin.
❯ psql -h localhost -U postgres -p 5050
psql (18.1)
Type "help" for help.

postgres=# select pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 f
(1 row)

postgres=# select inet_server_addr();
 inet_server_addr
------------------
 172.21.0.5
(1 row)

postgres=# \q
❯ psql -h localhost -U postgres -p 5051
psql (18.1)
Type "help" for help.

postgres=# select inet_server_addr();
 inet_server_addr
------------------
 172.21.0.5
(1 row)

postgres=# \q
❯ psql -h localhost -U postgres -p 5051
psql (18.1)
Type "help" for help.

postgres=# select inet_server_addr();
 inet_server_addr
------------------
 172.21.0.7
(1 row)

postgres=# \q
❯ psql -h localhost -U postgres -p 5051
psql (18.1)
Type "help" for help.

postgres=# select inet_server_addr();
 inet_server_addr
------------------
 172.21.0.2
(1 row)

postgres=# \q
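The behavior on ports 5050 and 5051 comes from HAProxy health-checking Patroni's REST API: /primary returns 200 only on the leader, while /replica returns 200 only on streaming replicas (the /primary check also shows up in the Consul service registration logs later on). A sketch of what the relevant haproxy.cfg backends typically look like — the actual file in this repo may differ:

```
# Port 5050: only the node answering 200 on /primary stays up.
listen primary
    bind *:5050
    option httpchk GET /primary
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server patroni1 patroni1:5432 check port 8008
    server patroni2 patroni2:5432 check port 8008
    server patroni3 patroni3:5432 check port 8008

# Port 5051: all nodes answering 200 on /replica, rotated round-robin.
listen replicas
    bind *:5051
    balance roundrobin
    option httpchk GET /replica
    http-check expect status 200
    server patroni1 patroni1:5432 check port 8008
    server patroni2 patroni2:5432 check port 8008
    server patroni3 patroni3:5432 check port 8008
```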

Checking Consul

Consul ships with its own UI listening on port 8500, so by opening http://localhost:8500/ui and navigating to Key/Value → patroni → demo we'll see:

Consul interface

Failure scenarios with patroni

We're going to explore the different failure scenarios we can think of and the troubleshooting procedure for each of them.

Failover

A failover happens when the primary dies abruptly, so to test it let's kill the Docker container running the primary:

  • Let's tail the logs of the two replica containers in separate consoles to record their reactions:
docker logs -f demo-patroni1
docker logs -f demo-patroni2
  • Let's kill the primary now.
docker kill demo-patroni3

Replica promotion

After the kill we can watch both replica instances react until a new leader is promoted.

  • In demo-patroni2 (new leader).
2026-01-26 10:09:44.706 UTC [36] FATAL:  could not receive data from WAL stream: server closed the connection unexpectedly
		This probably means the server terminated abnormally
		before or while processing the request.
2026-01-26 10:09:44.707 UTC [31] LOG:  invalid record length at 0/504F7A0: expected at least 24, got 0
2026-01-26 10:09:44.713 UTC [446] FATAL:  streaming replication receiver "demo" could not connect to the primary server: connection to server at "patroni3" (172.21.0.5), port 5432 failed: Connection refused
		Is the server running on that host and accepting TCP/IP connections?
2026-01-26 10:09:44.713 UTC [31] LOG:  waiting for WAL to become available at 0/504F7B8
2026-01-26 10:09:49.727 UTC [451] FATAL:  streaming replication receiver "demo" could not connect to the primary server: could not translate host name "patroni3" to address: Name or service not known
2026-01-26 10:09:49.727 UTC [31] LOG:  waiting for WAL to become available at 0/504F7B8
2026-01-26 10:09:51,768 INFO: no action. I am (patroni2), a secondary, and following a leader (patroni3)
2026-01-26 10:09:54.731 UTC [456] FATAL:  streaming replication receiver "demo" could not connect to the primary server: could not translate host name "patroni3" to address: Name or service not known
2026-01-26 10:09:54.731 UTC [31] LOG:  waiting for WAL to become available at 0/504F7B8
2026-01-26 10:09:59.739 UTC [462] FATAL:  streaming replication receiver "demo" could not connect to the primary server: could not translate host name "patroni3" to address: Name or service not known
2026-01-26 10:09:59.740 UTC [31] LOG:  waiting for WAL to become available at 0/504F7B8
2026-01-26 10:10:01,311 INFO: no action. I am (patroni2), a secondary, and following a leader (patroni3)
2026-01-26 10:10:04.743 UTC [467] FATAL:  streaming replication receiver "demo" could not connect to the primary server: could not translate host name "patroni3" to address: Name or service not known
2026-01-26 10:10:04.744 UTC [31] LOG:  waiting for WAL to become available at 0/504F7B8
2026-01-26 10:10:09,003 INFO: Got response from patroni1 http://patroni1:8008/patroni: {"state": "running", "postmaster_start_time": "2026-01-26 10:02:00.136783+00:00", "role": "replica", "server_version": 180001, "xlog": {"received_location": 84211616, "replayed_location": 84211616, "replayed_timestamp": "2026-01-26 10:03:49.030674+00:00", "paused": false}, "timeline": 1, "cluster_unlocked": true, "dcs_last_seen": 1769422208, "database_system_identifier": "7599608371094798356", "patroni": {"version": "4.1.0", "scope": "demo", "name": "patroni1"}}
2026-01-26 10:10:09,004 WARNING: Request failed to patroni3: GET http://patroni3:8008/patroni (HTTPConnectionPool(host='patroni3', port=8008): Max retries exceeded with url: /patroni (Caused by NameResolutionError("<urllib3.connection.HTTPConnection object at 0xffffa50889d0>: Failed to resolve 'patroni3' ([Errno -2] Name or service not known)")))
2026-01-26 10:10:09,010 INFO: promoted self to leader by acquiring session lock
server promoting
2026-01-26 10:10:09.012 UTC [31] LOG:  received promote request
2026-01-26 10:10:09.012 UTC [31] LOG:  redo done at 0/504F768 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 488.83 s
2026-01-26 10:10:09.012 UTC [31] LOG:  last completed transaction was at log time 2026-01-26 10:03:49.030674+00
2026-01-26 10:10:09.014 UTC [31] LOG:  selected new timeline ID: 2
2026-01-26 10:10:09.040 UTC [31] LOG:  archive recovery complete
2026-01-26 10:10:09.041 UTC [29] LOG:  checkpoint starting: force
2026-01-26 10:10:09.042 UTC [25] LOG:  database system is ready to accept connections
2026-01-26 10:10:09.046 UTC [29] LOG:  checkpoint complete: wrote 0 buffers (0.0%), wrote 2 SLRU buffers; 0 WAL file(s) added, 0 removed, 0 recycled; write=0.001 s, sync=0.001 s, total=0.006 s; sync files=2, longest=0.001 s, average=0.001 s; distance=0 kB, estimate=29776 kB; lsn=0/504F830, redo lsn=0/504F7D8
2026-01-26 10:10:10,023 INFO: Lock owner: patroni2; I am patroni2
2026-01-26 10:10:10,034 INFO: Register service demo, params {'service_id': 'demo/patroni2', 'address': 'patroni2', 'port': 5432, 'check': {'http': 'http://patroni2:8008/primary', 'interval': '5s', 'DeregisterCriticalServiceAfter': '150.0s'}, 'tags': ['primary', 'master'], 'enable_tag_override': True}
2026-01-26 10:10:10,043 INFO: no action. I am (patroni2), the leader with the lock
2026-01-26 10:10:20,039 INFO: no action. I am (patroni2), the leader with the lock
2026-01-26 10:10:30,030 INFO: no action. I am (patroni2), the leader with the lock
  • In demo-patroni1 (replica following a new leader)
2026-01-26 10:09:44.706 UTC [33] FATAL:  could not receive data from WAL stream: server closed the connection unexpectedly
		This probably means the server terminated abnormally
		before or while processing the request.
2026-01-26 10:09:44.707 UTC [31] LOG:  invalid record length at 0/504F7A0: expected at least 24, got 0
2026-01-26 10:09:44.713 UTC [452] FATAL:  streaming replication receiver "demo" could not connect to the primary server: connection to server at "patroni3" (172.21.0.5), port 5432 failed: Connection refused
		Is the server running on that host and accepting TCP/IP connections?
2026-01-26 10:09:44.713 UTC [31] LOG:  waiting for WAL to become available at 0/504F7B8
2026-01-26 10:09:49.727 UTC [457] FATAL:  streaming replication receiver "demo" could not connect to the primary server: could not translate host name "patroni3" to address: Name or service not known
2026-01-26 10:09:49.727 UTC [31] LOG:  waiting for WAL to become available at 0/504F7B8
2026-01-26 10:09:50,285 INFO: no action. I am (patroni1), a secondary, and following a leader (patroni3)
2026-01-26 10:09:54.731 UTC [462] FATAL:  streaming replication receiver "demo" could not connect to the primary server: could not translate host name "patroni3" to address: Name or service not known
2026-01-26 10:09:54.731 UTC [31] LOG:  waiting for WAL to become available at 0/504F7B8
2026-01-26 10:09:59.739 UTC [468] FATAL:  streaming replication receiver "demo" could not connect to the primary server: could not translate host name "patroni3" to address: Name or service not known
2026-01-26 10:09:59.740 UTC [31] LOG:  waiting for WAL to become available at 0/504F7B8
2026-01-26 10:10:00,288 INFO: no action. I am (patroni1), a secondary, and following a leader (patroni3)
2026-01-26 10:10:04.743 UTC [473] FATAL:  streaming replication receiver "demo" could not connect to the primary server: could not translate host name "patroni3" to address: Name or service not known
2026-01-26 10:10:04.743 UTC [31] LOG:  waiting for WAL to become available at 0/504F7B8
2026-01-26 10:10:09,002 INFO: Got response from patroni2 http://patroni2:8008/patroni: {"state": "running", "postmaster_start_time": "2026-01-26 10:02:00.144814+00:00", "role": "replica", "server_version": 180001, "xlog": {"received_location": 84211616, "replayed_location": 84211616, "replayed_timestamp": "2026-01-26 10:03:49.030674+00:00", "paused": false}, "timeline": 1, "cluster_unlocked": true, "dcs_last_seen": 1769422208, "database_system_identifier": "7599608371094798356", "patroni": {"version": "4.1.0", "scope": "demo", "name": "patroni2"}}
2026-01-26 10:10:09,003 WARNING: Request failed to patroni3: GET http://patroni3:8008/patroni (HTTPConnectionPool(host='patroni3', port=8008): Max retries exceeded with url: /patroni (Caused by NameResolutionError("<urllib3.connection.HTTPConnection object at 0xffff971489d0>: Failed to resolve 'patroni3' ([Errno -2] Name or service not known)")))
2026-01-26 10:10:09,007 INFO: Could not take out TTL lock
2026-01-26 10:10:09.011 UTC [25] LOG:  received SIGHUP, reloading configuration files
server signaled
2026-01-26 10:10:09.012 UTC [25] LOG:  parameter "primary_conninfo" changed to "dbname=postgres user=replicator passfile=/tmp/pgpass0 host=patroni2 port=5432 sslmode=prefer application_name=patroni1 gssencmode=prefer channel_binding=prefer sslnegotiation=postgres"
2026-01-26 10:10:09.017 UTC [487] LOG:  started streaming WAL from primary at 0/5000000 on timeline 1
2026-01-26 10:10:09,018 INFO: following new leader after trying and failing to obtain lock
2026-01-26 10:10:09.041 UTC [487] LOG:  replication terminated by primary server
2026-01-26 10:10:09.041 UTC [487] DETAIL:  End of WAL reached on timeline 1 at 0/504F7A0.
2026-01-26 10:10:09.041 UTC [487] LOG:  fetching timeline history file for timeline 2 from primary server
2026-01-26 10:10:09.042 UTC [31] LOG:  new target timeline is 2
2026-01-26 10:10:09.043 UTC [487] LOG:  restarted WAL streaming at 0/5000000 on timeline 2
2026-01-26 10:10:19,302 INFO: Lock owner: patroni2; I am patroni1
2026-01-26 10:10:19,310 INFO: Local timeline=2 lsn=0/504F8E0
2026-01-26 10:10:19,319 INFO: primary_timeline=2
2026-01-26 10:10:19,327 INFO: no action. I am (patroni1), a secondary, and following a leader (patroni2)
2026-01-26 10:10:29,367 INFO: no action. I am (patroni1), a secondary, and following a leader (patroni2)
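The LSNs in these logs, such as 0/504F7A0, are 64-bit WAL positions written as two hex halves; the REST payloads above report the same position as a plain byte count ("received_location": 84211616). A small helper to convert between the two, handy when comparing positions by eye:

```shell
# Convert a PostgreSQL LSN "HI/LO" (both halves hex) to an absolute byte offset.
lsn_to_bytes() {
  local hi=${1%%/*} lo=${1##*/}
  echo $(( 0x$hi * 4294967296 + 0x$lo ))   # hi * 2^32 + lo
}
lsn_to_bytes 0/504F7A0   # prints 84211616, matching received_location above
```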

Patronictl output

❯ docker exec -it demo-patroni2 bash
postgres@patroni2:/$ patronictl -c /postgres.yml list
+ Cluster: demo (7599608371094798356) ------+----+-------------+-----+------------+-----+
| Member   | Host     | Role    | State     | TL | Receive LSN | Lag | Replay LSN | Lag |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
| patroni1 | patroni1 | Replica | streaming |  2 |   0/504F8E0 |   0 |  0/504F8E0 |   0 |
| patroni2 | patroni2 | Leader  | running   |  2 |             |     |            |     |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
postgres@patroni2:/$

Starting the former primary

❯ docker start demo-patroni3
demo-patroni3

Fail to rewind

2026-01-26 10:14:04,823 ERROR: Failed to rewind from healthy primary: patroni2
2026-01-26 10:14:04,824 WARNING: Postgresql is not running.
2026-01-26 10:14:04,824 INFO: Lock owner: patroni2; I am patroni3
...
2026-01-26 10:14:04.944 UTC [46] FATAL:  could not start WAL streaming: ERROR:  requested starting point 0/6000000 on timeline 1 is not in this server's history
	DETAIL:  This server's history forked from timeline 1 at 0/504F7A0.
2026-01-26 10:14:04.944 UTC [44] LOG:  new timeline 2 forked off current database system timeline 1 before current recovery point 0/60000A0
2026-01-26 10:14:04.946 UTC [47] FATAL:  could not start WAL streaming: ERROR:  requested starting point 0/6000000 on timeline 1 is not in this server's history
	DETAIL:  This server's history forked from timeline 1 at 0/504F7A0.
2026-01-26 10:14:04.947 UTC [44] LOG:  new timeline 2 forked off current database system timeline 1 before current recovery point 0/60000A0
2026-01-26 10:14:04.947 UTC [44] LOG:  waiting for WAL to become available at 0/60000B8

We need to reinit this node:

postgres@patroni2:/$ patronictl -c /postgres.yml list
+ Cluster: demo (7599608371094798356) ------+----+-------------+-----+------------+-----+
| Member   | Host     | Role    | State     | TL | Receive LSN | Lag | Replay LSN | Lag |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
| patroni1 | patroni1 | Replica | streaming |  2 |   0/504F8E0 |   0 |  0/504F8E0 |   0 |
| patroni2 | patroni2 | Leader  | running   |  2 |             |     |            |     |
| patroni3 | patroni3 | Replica | running   |  1 |   0/6000000 |   0 |  0/60000A0 |   0 |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
postgres@patroni2:/$ patronictl -c /postgres.yml reinit demo patroni3
+ Cluster: demo (7599608371094798356) ------+----+-------------+-----+------------+-----+
| Member   | Host     | Role    | State     | TL | Receive LSN | Lag | Replay LSN | Lag |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
| patroni1 | patroni1 | Replica | streaming |  2 |   0/504F8E0 |   0 |  0/504F8E0 |   0 |
| patroni2 | patroni2 | Leader  | running   |  2 |             |     |            |     |
| patroni3 | patroni3 | Replica | running   |  1 |   0/6000000 |   0 |  0/60000A0 |   0 |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
Are you sure you want to reinitialize members patroni3? [y/N]: y
Success: reinitialize for member patroni3
postgres@patroni2:/$ patronictl -c /postgres.yml list
+ Cluster: demo (7599608371094798356) ------+----+-------------+-----+------------+-----+
| Member   | Host     | Role    | State     | TL | Receive LSN | Lag | Replay LSN | Lag |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
| patroni1 | patroni1 | Replica | streaming |  2 |   0/7000000 |   0 |  0/7000000 |   0 |
| patroni2 | patroni2 | Leader  | running   |  2 |             |     |            |     |
| patroni3 | patroni3 | Replica | running   |  1 |   0/6000000 |  16 |  0/60000A0 |  16 |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
postgres@patroni2:/$ patronictl -c /postgres.yml list
+ Cluster: demo (7599608371094798356) ------+----+-------------+-----+------------+-----+
| Member   | Host     | Role    | State     | TL | Receive LSN | Lag | Replay LSN | Lag |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
| patroni1 | patroni1 | Replica | streaming |  2 |   0/7000060 |   0 |  0/7000060 |   0 |
| patroni2 | patroni2 | Leader  | running   |  2 |             |     |            |     |
| patroni3 | patroni3 | Replica | streaming |  2 |   0/7000060 |   0 |  0/7000060 |   0 |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
postgres@patroni2:/$

Disabling pg_rewind

One way to avoid a manual reinit is to disable pg_rewind, so Patroni will perform an automatic reinit instead:

postgres@patroni2:/$ patronictl -c /postgres.yml edit-config
---
+++
@@ -4,6 +4,6 @@
   parameters:
     max_connections: 100
     max_replication_slots: 5
     ...
-  use_pg_rewind: true
+  use_pg_rewind: false
 retry_timeout: 10
 ttl: 30

Apply these changes? [y/N]: y
Configuration changed
postgres@patroni2:/$
postgres@patroni2:/$
postgres@patroni2:/$ patronictl -c /postgres.yml list
+ Cluster: demo (7599616026651713562) ------+----+-------------+-----+------------+-----+
| Member   | Host     | Role    | State     | TL | Receive LSN | Lag | Replay LSN | Lag |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
| patroni1 | patroni1 | Replica | streaming |  1 |   0/5000060 |   0 |  0/5000060 |   0 |
| patroni2 | patroni2 | Leader  | running   |  1 |             |     |            |     |
| patroni3 | patroni3 | Replica | streaming |  1 |   0/5000060 |   0 |  0/5000060 |   0 |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
postgres@patroni2:/$ exit
❯ docker kill demo-patroni2
demo-patroni2
❯ docker exec -it demo-patroni1 bash
postgres@patroni1:/$ patronictl -c /postgres.yml list
+ Cluster: demo (7599616026651713562) ------+----+-------------+-----+------------+-----+
| Member   | Host     | Role    | State     | TL | Receive LSN | Lag | Replay LSN | Lag |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
| patroni1 | patroni1 | Replica | streaming |  1 |   0/5000060 |   0 |  0/5000060 |   0 |
| patroni2 | patroni2 | Leader  | running   |  1 |             |     |            |     |
| patroni3 | patroni3 | Replica | streaming |  1 |   0/5000060 |   0 |  0/5000060 |   0 |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
postgres@patroni1:/$ patronictl -c /postgres.yml list
+ Cluster: demo (7599616026651713562) ------+----+-------------+-----+------------+-----+
| Member   | Host     | Role    | State     | TL | Receive LSN | Lag | Replay LSN | Lag |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
| patroni1 | patroni1 | Replica | streaming |  2 |   0/50001A0 |   0 |  0/50001A0 |   0 |
| patroni3 | patroni3 | Leader  | running   |  2 |             |     |            |     |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
postgres@patroni1:/$ patronictl -c /postgres.yml list
+ Cluster: demo (7599616026651713562) ------+----+-------------+-----+------------+-----+
| Member   | Host     | Role    | State     | TL | Receive LSN | Lag | Replay LSN | Lag |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
| patroni1 | patroni1 | Replica | streaming |  2 |   0/50001A0 |   0 |  0/50001A0 |   0 |
| patroni2 | patroni2 | Replica | streaming |  2 |   0/50001A0 |   0 |  0/50001A0 |   0 |
| patroni3 | patroni3 | Leader  | running   |  2 |             |     |            |     |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
postgres@patroni1:/$

After starting demo-patroni2 again:

❯ docker start demo-patroni2
demo-patroni2
❯ docker exec -it demo-patroni1 bash
postgres@patroni1:/$ patronictl -c /postgres.yml list
+ Cluster: demo (7599616026651713562) ------+----+-------------+-----+------------+-----+
| Member   | Host     | Role    | State     | TL | Receive LSN | Lag | Replay LSN | Lag |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
| patroni1 | patroni1 | Replica | streaming |  2 |   0/50001A0 |   0 |  0/50001A0 |   0 |
| patroni2 | patroni2 | Replica | streaming |  2 |   0/50001A0 |   0 |  0/50001A0 |   0 |
| patroni3 | patroni3 | Leader  | running   |  2 |             |     |            |     |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
postgres@patroni1:/$

Fixing pg_rewind

Now we'll try to fix the pg_rewind issue before attempting to reinit the node.

Let's re-enable pg_rewind:

postgres@patroni1:/$ patronictl -c /postgres.yml edit-config
---
+++
@@ -4,6 +4,6 @@
   parameters:
     max_connections: 100
     max_replication_slots: 5
     ...
-  use_pg_rewind: false
+  use_pg_rewind: true
 retry_timeout: 10
 ttl: 30

Apply these changes? [y/N]: y
Configuration changed
postgres@patroni1:/$

Then we must look into the Patroni or PostgreSQL logs for information about the pg_rewind failure; the most common cause is a password mismatch or missing configuration in PostgreSQL.

postgres@patroni1:/$ cat /var/log/postgresql/patroni.log
...
2026-01-26 13:11:40,572 ERROR: Exception when working with leader
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/patroni/postgresql/rewind.py", line 82, in check_leader_is_not_in_recovery
    with get_connection_cursor(connect_timeout=3, options='-c statement_timeout=2000', **conn_kwargs) as cur:
         ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.13/contextlib.py", line 141, in __enter__
    return next(self.gen)
  File "/usr/lib/python3/dist-packages/patroni/postgresql/connection.py", line 158, in get_connection_cursor
    conn = psycopg.connect(**kwargs)
  File "/usr/lib/python3/dist-packages/patroni/psycopg.py", line 136, in connect
    ret = _connect(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/psycopg2/__init__.py", line 122, in connect
    conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
psycopg2.OperationalError: connection to server at "patroni3" (172.21.0.5), port 5432 failed: FATAL:  pg_hba.conf rejects connection for host "172.21.0.6", user "rewind_user", database "postgres", no encryption

2026-01-26 13:11:40,575 INFO: no action. I am (patroni1), a secondary, and following a leader (patroni3)

It seems the rewind user's connection is rejected, and the error indicates that pg_hba.conf is the file to fix.
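Since Patroni generates pg_hba.conf from its own configuration, the fix belongs in the postgresql.pg_hba list rather than in the file directly. A sketch of the kind of entry needed — the subnet and auth method here are assumptions based on the error above, not taken from this repo:

```yaml
# Fragment of Patroni's postgresql configuration (local /postgres.yml or dynamic config).
# The first rule lets rewind_user connect without encryption, matching the rejected attempt.
postgresql:
  pg_hba:
    - host postgres rewind_user 172.21.0.0/16 md5
    - host replication replicator 172.21.0.0/16 md5
    - host all all 0.0.0.0/0 md5
```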

Switchover

The main difference between a switchover and a failover is that during a switchover the leader node is alive and healthy, whereas a failover happens precisely because it isn't.

Immediate switchover

❯ docker exec -it demo-patroni2 bash
postgres@patroni2:/$ patronictl -c /postgres.yml list
+ Cluster: demo (7599660909656199194) ----+----+-------------+-----+------------+-----+
| Member   | Host     | Role    | State   | TL | Receive LSN | Lag | Replay LSN | Lag |
+----------+----------+---------+---------+----+-------------+-----+------------+-----+
| patroni1 | patroni1 | Replica | running |  1 |   0/5000000 |   0 |  0/5000000 |   0 |
| patroni2 | patroni2 | Leader  | running |  1 |             |     |            |     |
| patroni3 | patroni3 | Replica | running |  1 |   0/4000000 |   0 |  0/4000000 |   0 |
+----------+----------+---------+---------+----+-------------+-----+------------+-----+
postgres@patroni2:/$ patronictl -c /postgres.yml swithover demo
Usage: patronictl [OPTIONS] COMMAND [ARGS]...
Try 'patronictl --help' for help.

Error: No such command 'swithover'.
postgres@patroni2:/$ patronictl -c /postgres.yml switchover demo
Current cluster topology
+ Cluster: demo (7599660909656199194) ------+----+-------------+-----+------------+-----+
| Member   | Host     | Role    | State     | TL | Receive LSN | Lag | Replay LSN | Lag |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
| patroni1 | patroni1 | Replica | streaming |  1 |   0/5000060 |   0 |  0/5000060 |   0 |
| patroni2 | patroni2 | Leader  | running   |  1 |             |     |            |     |
| patroni3 | patroni3 | Replica | streaming |  1 |   0/5000060 |   0 |  0/5000060 |   0 |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
Primary [patroni2]:
Candidate ['patroni1', 'patroni3'] []: patroni1
When should the switchover take place (e.g. 2026-01-26T14:27 )  [now]:
Are you sure you want to switchover cluster demo, demoting current leader patroni2? [y/N]: y
2026-01-26 13:27:40.32003 Successfully switched over to "patroni1"
+ Cluster: demo (7599660909656199194) ------+----+-------------+-----+------------+-----+
| Member   | Host     | Role    | State     | TL | Receive LSN | Lag | Replay LSN | Lag |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
| patroni1 | patroni1 | Leader  | running   |  1 |             |     |            |     |
| patroni2 | patroni2 | Replica | stopped   |    |     unknown |     |    unknown |     |
| patroni3 | patroni3 | Replica | streaming |  1 |   0/5000060 |   0 |  0/5000060 |   0 |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
postgres@patroni2:/$ patronictl -c /postgres.yml list
+ Cluster: demo (7599660909656199194) ------+----+-------------+-----+------------+-----+
| Member   | Host     | Role    | State     | TL | Receive LSN | Lag | Replay LSN | Lag |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
| patroni1 | patroni1 | Leader  | running   |  2 |             |     |            |     |
| patroni2 | patroni2 | Replica | streaming |  2 |   0/50002E8 |   0 |  0/50002E8 |   0 |
| patroni3 | patroni3 | Replica | streaming |  2 |   0/50002E8 |   0 |  0/50002E8 |   0 |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
postgres@patroni2:/$

Scheduled switchover

❯ docker exec -it demo-patroni2 bash
postgres@patroni2:/$ date
Mon Jan 26 01:28:59 PM UTC 2026
postgres@patroni2:/$ patronictl -c /postgres.yml switchover demo
Current cluster topology
+ Cluster: demo (7599660909656199194) ------+----+-------------+-----+------------+-----+
| Member   | Host     | Role    | State     | TL | Receive LSN | Lag | Replay LSN | Lag |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
| patroni1 | patroni1 | Leader  | running   |  2 |             |     |            |     |
| patroni2 | patroni2 | Replica | streaming |  2 |   0/50002E8 |   0 |  0/50002E8 |   0 |
| patroni3 | patroni3 | Replica | streaming |  2 |   0/50002E8 |   0 |  0/50002E8 |   0 |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
Primary [patroni1]:
Candidate ['patroni2', 'patroni3'] []: patroni3
When should the switchover take place (e.g. 2026-01-26T14:29 )  [now]: 2026-01-26T13:32
Are you sure you want to schedule switchover of cluster demo at 2026-01-26T13:32:00+00:00, demoting current leader patroni1? [y/N]: y
2026-01-26 13:29:37.32834 Switchover scheduled
+ Cluster: demo (7599660909656199194) ------+----+-------------+-----+------------+-----+
| Member   | Host     | Role    | State     | TL | Receive LSN | Lag | Replay LSN | Lag |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
| patroni1 | patroni1 | Leader  | running   |  2 |             |     |            |     |
| patroni2 | patroni2 | Replica | streaming |  2 |   0/50002E8 |   0 |  0/50002E8 |   0 |
| patroni3 | patroni3 | Replica | streaming |  2 |   0/50002E8 |   0 |  0/50002E8 |   0 |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
 Switchover scheduled at: 2026-01-26T13:32:00+00:00
                    from: patroni1
                      to: patroni3
postgres@patroni2:/$ date
Mon Jan 26 01:29:53 PM UTC 2026
postgres@patroni2:/$ date
Mon Jan 26 01:32:19 PM UTC 2026
postgres@patroni2:/$ patronictl -c /postgres.yml list
+ Cluster: demo (7599660909656199194) ------+----+-------------+-----+------------+-----+
| Member   | Host     | Role    | State     | TL | Receive LSN | Lag | Replay LSN | Lag |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
| patroni1 | patroni1 | Replica | streaming |  3 |   0/5000570 |   0 |  0/5000570 |   0 |
| patroni2 | patroni2 | Replica | streaming |  3 |   0/5000570 |   0 |  0/5000570 |   0 |
| patroni3 | patroni3 | Leader  | running   |  3 |             |     |            |     |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
postgres@patroni2:/$

DCS failure (Consul)

Patroni relies heavily on a Distributed Configuration Store (DCS) for leader election, detecting network issues, and storing dynamic Postgres configuration, all built on the DCS's consensus algorithm.

Patroni continuously communicates with this system to read and update the cluster status and the leader lock.

Breaking DCS Consensus

By killing Consul we can break DCS consensus: the primary can no longer renew its leader lock, so it demotes itself and the whole cluster enters read-only mode.

❯ docker kill demo-consul
demo-consul
❯ psql -h localhost -p 5051 -U postgres
psql (18.1)
Type "help" for help.

postgres=# select pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 t
(1 row)

postgres=# \q
❯ psql -h localhost -p 5050 -U postgres
psql: error: connection to server at "localhost" (::1), port 5050 failed: server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.
❯ docker exec -it demo-patroni2 bash
postgres@patroni2:/$ psql -h patroni1 -U postgres
psql (18.1 (Debian 18.1-1.pgdg13+2))
Type "help" for help.

postgres=# select pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 t
(1 row)

postgres=# \q
postgres@patroni2:/$ psql -h patroni2 -U postgres
psql (18.1 (Debian 18.1-1.pgdg13+2))
Type "help" for help.

postgres=# select pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 t
(1 row)

postgres=# \q
postgres@patroni2:/$ psql -h patroni3 -U postgres
psql (18.1 (Debian 18.1-1.pgdg13+2))
Type "help" for help.

postgres=# select pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 t
(1 row)

postgres=# \q
postgres@patroni2:/$

Enable failsafe_mode

We'll enable failsafe_mode and test the DCS failure again. This time, even though the Patroni instances cannot read the coordinator status, they keep their last known state.
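Per the Patroni failsafe_mode documentation, with the DCS down the leader keeps accepting writes only while every other cluster member still answers its Patroni REST API poll. A toy sketch of that check (an illustration of the documented rule, not Patroni's code):

```python
def leader_stays_primary(other_members, reachable):
    # failsafe_mode: while the DCS is unreachable, the leader keeps its
    # role only if ALL other members respond to its REST API poll.
    return all(node in reachable for node in other_members)

# Seen from patroni3 (the leader), the other members are the two replicas.
members = ["patroni1", "patroni2"]
assert leader_stays_primary(members, ["patroni1", "patroni2"])   # stays primary
assert not leader_stays_primary(members, ["patroni1"])           # demotes itself
```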

❯ docker start demo-consul
demo-consul
❯ docker exec -it demo-patroni2 bash
postgres@patroni2:/$ patronictl -c /postgres.yml edit-config
---
+++
@@ -1,5 +1,5 @@
 check_timeline: true
-failsafe_mode: false
+failsafe_mode: true
 loop_wait: 10
 maximum_lag_on_failover: 1048576
 postgresql:

Apply these changes? [y/N]: y
Configuration changed
postgres@patroni2:/$

Let's run all the tests again:

❯ docker kill demo-consul
demo-consul
❯ psql -h localhost -p 5050 -U postgres
psql (18.1)
Type "help" for help.

postgres=# select pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 f
(1 row)

postgres=# \q
❯ psql -h localhost -p 5051 -U postgres
psql (18.1)
Type "help" for help.

postgres=# select pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 t
(1 row)

postgres=# \q 
❯ docker exec -it demo-patroni2 bash
postgres@patroni2:/$ psql -h patroni1 -U postgres
psql (18.1 (Debian 18.1-1.pgdg13+2))
Type "help" for help.

postgres=# select pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 t
(1 row)

postgres=# \q
postgres@patroni2:/$ psql -h patroni2 -U postgres
psql (18.1 (Debian 18.1-1.pgdg13+2))
Type "help" for help.

postgres=# select pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 t
(1 row)

postgres=# \q
postgres@patroni2:/$ psql -h patroni3 -U postgres
psql (18.1 (Debian 18.1-1.pgdg13+2))
Type "help" for help.

postgres=# select pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 f
(1 row)

postgres=# \q
postgres@patroni2:/$

HAProxy failure

HAProxy looks like a single point of failure, but it is very resilient.

HAProxy failure without restart policy

Without a restart policy we have to recover HAProxy manually to get connections through it again; meanwhile the nodes are still reachable directly on their published ports:

❯ docker kill demo-haproxy
demo-haproxy
❯ psql -h localhost -p 5050 -U postgres
psql: error: connection to server at "localhost" (::1), port 5050 failed: Connection refused
	Is the server running on that host and accepting TCP/IP connections?
connection to server at "localhost" (127.0.0.1), port 5050 failed: Connection refused
	Is the server running on that host and accepting TCP/IP connections?
❯ psql -h localhost -p 5051 -U postgres
psql: error: connection to server at "localhost" (::1), port 5051 failed: Connection refused
	Is the server running on that host and accepting TCP/IP connections?
connection to server at "localhost" (127.0.0.1), port 5051 failed: Connection refused
	Is the server running on that host and accepting TCP/IP connections?
❯ psql -h localhost -p 5433 -U postgres
psql (18.1)
Type "help" for help.

postgres=# select pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 f
(1 row)

postgres=# \q

HAProxy failure with restart policy

We set restart: always in the docker-compose.yml file, so the service definition looks like this:

...
  haproxy:
    container_name: demo-haproxy
    hostname: haproxy
    image: haproxy:3.2.6-alpine
    restart: always
    networks: [ demo ]
    ports:
      - "5050:5050"
      - "5051:5051"
      - "9001:9001"
    volumes:
      - ./haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg
...      

Then we reload the compose file with docker compose up -d. To simulate a crash rather than a clean stop, we kill the HAProxy worker process inside the container:

❯ docker ps
CONTAINER ID   IMAGE                       COMMAND                  CREATED          STATUS          PORTS                                                                                                          NAMES
8bf1871b01a9   haproxy:3.2.6-alpine        "docker-entrypoint.s…"   4 minutes ago    Up 59 seconds   0.0.0.0:5050-5051->5050-5051/tcp, [::]:5050-5051->5050-5051/tcp, 0.0.0.0:9001->9001/tcp, [::]:9001->9001/tcp   demo-haproxy
c6f7708e5236   patroni-workshop-patroni2   "docker-entrypoint.s…"   39 minutes ago   Up 39 minutes   0.0.0.0:5432->5432/tcp, [::]:5432->5432/tcp                                                                    demo-patroni2
297ef3b0df56   patroni-workshop-patroni1   "docker-entrypoint.s…"   39 minutes ago   Up 39 minutes   0.0.0.0:5431->5432/tcp, [::]:5431->5432/tcp                                                                    demo-patroni1
e4ec9dd61492   patroni-workshop-patroni3   "docker-entrypoint.s…"   39 minutes ago   Up 39 minutes   0.0.0.0:5433->5432/tcp, [::]:5433->5432/tcp                                                                    demo-patroni3
4ea7b6e75fc9   hashicorp/consul:1.18.2     "docker-entrypoint.s…"   39 minutes ago   Up 8 minutes    0.0.0.0:8500->8500/tcp, [::]:8500->8500/tcp, 0.0.0.0:8600->8600/udp, [::]:8600->8600/udp                       demo-consul
4fbf9019c118   veegres/ivory               "entrypoint.sh"          39 minutes ago   Up 39 minutes   0.0.0.0:80->80/tcp, [::]:80->80/tcp                                                                            demo-ivory
❯ docker exec -it demo-haproxy sh
~ $ ps -ef
PID   USER     TIME  COMMAND
    1 haproxy   0:00 haproxy -W -db -f /usr/local/etc/haproxy/haproxy.cfg
    8 haproxy   0:00 haproxy -W -db -f /usr/local/etc/haproxy/haproxy.cfg
   23 haproxy   0:00 sh
   29 haproxy   0:00 ps -ef
~ $ kill -9 8
❯ docker ps
CONTAINER ID   IMAGE                       COMMAND                  CREATED          STATUS          PORTS                                                                                                          NAMES
8bf1871b01a9   haproxy:3.2.6-alpine        "docker-entrypoint.s…"   4 minutes ago    Up 2 seconds    0.0.0.0:5050-5051->5050-5051/tcp, [::]:5050-5051->5050-5051/tcp, 0.0.0.0:9001->9001/tcp, [::]:9001->9001/tcp   demo-haproxy
c6f7708e5236   patroni-workshop-patroni2   "docker-entrypoint.s…"   39 minutes ago   Up 39 minutes   0.0.0.0:5432->5432/tcp, [::]:5432->5432/tcp                                                                    demo-patroni2
297ef3b0df56   patroni-workshop-patroni1   "docker-entrypoint.s…"   39 minutes ago   Up 39 minutes   0.0.0.0:5431->5432/tcp, [::]:5431->5432/tcp                                                                    demo-patroni1
e4ec9dd61492   patroni-workshop-patroni3   "docker-entrypoint.s…"   39 minutes ago   Up 39 minutes   0.0.0.0:5433->5432/tcp, [::]:5433->5432/tcp                                                                    demo-patroni3
4ea7b6e75fc9   hashicorp/consul:1.18.2     "docker-entrypoint.s…"   39 minutes ago   Up 8 minutes    0.0.0.0:8500->8500/tcp, [::]:8500->8500/tcp, 0.0.0.0:8600->8600/udp, [::]:8600->8600/udp                       demo-consul
4fbf9019c118   veegres/ivory               "entrypoint.sh"          39 minutes ago   Up 39 minutes   0.0.0.0:80->80/tcp, [::]:80->80/tcp                                                                            demo-ivory
❯ psql -h localhost -p 5050 -U postgres
psql (18.1)
Type "help" for help.

postgres=# \q

Pause mode

In pause mode Patroni stops controlling PostgreSQL. This is useful for performing maintenance on the PostgreSQL cluster or on the DCS.

  • The mode is cluster-wide (all nodes or no nodes)
  • It takes up to loop_wait seconds for a node to be paused
  • Nodes might not be paused simultaneously
  • Automatic failover is disabled
  • There is no automatic read-only mode when the DCS is not accessible (no failsafe_mode)
  • PostgreSQL is not shut down when Patroni is stopped
  • PostgreSQL is not started automatically when shut down
  • The PostgreSQL primary still updates the leader key (or acquires it if it is not taken)

However:

  • New replicas can still be created
  • Manual switchover/failover still works

To enter pause mode:

❯ docker exec -it demo-patroni2 bash
postgres@patroni2:/$ patronictl -c /postgres.yml list
+ Cluster: demo (7599660909656199194) ------+----+-------------+-----+------------+-----+
| Member   | Host     | Role    | State     | TL | Receive LSN | Lag | Replay LSN | Lag |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
| patroni1 | patroni1 | Replica | streaming |  4 |   0/5000728 |   0 |  0/5000728 |   0 |
| patroni2 | patroni2 | Replica | streaming |  4 |   0/5000728 |   0 |  0/5000728 |   0 |
| patroni3 | patroni3 | Leader  | running   |  4 |             |     |            |     |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
postgres@patroni2:/$ patronictl -c /postgres.yml pause --wait
'pause' request sent, waiting until it is recognized by all nodes
Success: cluster management is paused
postgres@patroni2:/$ patronictl -c /postgres.yml list
+ Cluster: demo (7599660909656199194) ------+----+-------------+-----+------------+-----+
| Member   | Host     | Role    | State     | TL | Receive LSN | Lag | Replay LSN | Lag |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
| patroni1 | patroni1 | Replica | streaming |  4 |   0/5000728 |   0 |  0/5000728 |   0 |
| patroni2 | patroni2 | Replica | streaming |  4 |   0/5000728 |   0 |  0/5000728 |   0 |
| patroni3 | patroni3 | Leader  | running   |  4 |             |     |            |     |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
 Maintenance mode: on
postgres@patroni2:/$

Promoting a replica in pause mode

In pause mode Patroni is unaware of changes made directly to PostgreSQL, so manually promoting a replica can cause a split-brain scenario:

❯ docker exec -it demo-patroni1 bash
postgres@patroni1:/$ psql -h patroni1 -U postgres
psql (18.1 (Debian 18.1-1.pgdg13+2))
Type "help" for help.

postgres=# select pg_promote(); create table test (id integer, data text);
 pg_promote
------------
 t
(1 row)

postgres=# \q
postgres@patroni1:/$ patronictl -c /postgres.yml list
+ Cluster: demo (7599660909656199194) ------+----+-------------+-----+------------+-----+
| Member   | Host     | Role    | State     | TL | Receive LSN | Lag | Replay LSN | Lag |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
| patroni1 | patroni1 | Replica | running   |  9 |     unknown |     |  0/501A1C0 |   0 |
| patroni2 | patroni2 | Replica | streaming |  4 |   0/5000728 |   0 |  0/5000728 |   0 |
| patroni3 | patroni3 | Leader  | running   |  4 |             |     |            |     |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
 Maintenance mode: on
postgres@patroni1:/$ exit
❯ psql -h localhost -p 5050 -U postgres
psql (18.1)
Type "help" for help.

postgres=# select pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 f
(1 row)

postgres=# select inet_server_addr();
 inet_server_addr
------------------
 172.21.0.4
(1 row)

postgres=#

As you can see, both patroni3 and patroni1 are now accepting write transactions, while HAProxy keeps routing to the old primary because that is the information retained by Patroni and the DCS.

If we end pause mode, the promoted node is demoted back to a replica and rejoins the leader's timeline, so any records written on it are lost.

❯ docker exec -it demo-patroni1 bash
postgres@patroni1:/$ patronictl -c /postgres.yml list
+ Cluster: demo (7599660909656199194) ------+----+-------------+-----+------------+-----+
| Member   | Host     | Role    | State     | TL | Receive LSN | Lag | Replay LSN | Lag |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
| patroni1 | patroni1 | Replica | running   |  9 |     unknown |     |  0/501A1C0 |   0 |
| patroni2 | patroni2 | Replica | streaming |  4 |   0/5000728 |   0 |  0/5000728 |   0 |
| patroni3 | patroni3 | Leader  | running   |  4 |             |     |            |     |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
 Maintenance mode: on
postgres@patroni1:/$ patronictl -c /postgres.yml resume
Success: cluster management is resumed
postgres@patroni1:/$ patronictl -c /postgres.yml list
+ Cluster: demo (7599660909656199194) ------+----+-------------+-----+------------+-----+
| Member   | Host     | Role    | State     | TL | Receive LSN | Lag | Replay LSN | Lag |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
| patroni1 | patroni1 | Replica | running   |  4 |   0/5000000 |   0 |  0/501A1C0 |   0 |
| patroni2 | patroni2 | Replica | streaming |  4 |   0/5000728 |   0 |  0/5000728 |   0 |
| patroni3 | patroni3 | Leader  | running   |  4 |             |     |            |     |
+----------+----------+---------+-----------+----+-------------+-----+------------+-----+
postgres@patroni1:/$ psql -h patroni1 -U postgres
psql (18.1 (Debian 18.1-1.pgdg13+2))
Type "help" for help.

postgres=# select pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 t
(1 row)

postgres=#

UI Management Tool

"Ivory is an open-source project designed to simplify and visualize work with Postgres clusters."

Ivory project on GitHub.

Configuring Ivory

The first time you enter the application you'll set up some configuration parameters for it to work.

The following images display the configuration process:

  1. Secret word
  2. Authentication configuration
  3. Login
  4. Adding a Patroni cluster
  5. Adding the cluster nodes from the previous step
  6. Warning about the postgres user password
  7. Storing the postgres password
  8. Choosing the previously stored password for the cluster
  9. Final application view

Advanced HAProxy Configuration

We can use the server-template configuration directive to enable dynamic backend server management: a pool of placeholder servers is pre-defined and then populated at runtime using DNS-based service discovery.
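As a rough sketch of one backend, based on the HAProxy server-template and resolvers directives (the nameserver IP, check port, and health-check endpoint here are assumptions; see haproxy-dns.cfg in the repo for the real settings):

```
resolvers demo-dns-server
    # dnsmasq container answering for patroni-hosts.my.dns (IP assumed)
    nameserver dns-server 172.21.0.2:53
    hold valid 10s

backend readonly
    balance roundrobin
    # Health-check the Patroni REST API (default port 8008), not Postgres itself
    option httpchk GET /replica
    # Pre-declare 3 placeholder servers named patroni-1..patroni-3 and fill
    # them in at runtime from the A records of patroni-hosts.my.dns
    server-template patroni- 3 patroni-hosts.my.dns:5432 check port 8008 resolvers demo-dns-server init-addr none
```

With init-addr none the placeholders start without an address and only come up once DNS answers, which matches the "IP changed from '(none)'" lines seen later in the HAProxy log.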

To test this option, we've included a secondary Docker Compose file, docker-compose-dns.yml, and a secondary HAProxy configuration file, haproxy-dns.cfg.

Starting the new docker compose

docker compose down
docker compose -f docker-compose-dns.yml up -d

Getting patroni instances IPs

After the instances have started, we need to get all the patroni IPs:

❯ docker exec -it demo-patroni1 bash
postgres@patroni1:/$ psql -h patroni1 -U postgres
psql (18.1 (Debian 18.1-1.pgdg13+2))
Type "help" for help.

postgres=# select inet_server_addr();
 inet_server_addr
------------------
 172.21.0.4
(1 row)

postgres=# \q
postgres@patroni1:/$ psql -h patroni2 -U postgres
psql (18.1 (Debian 18.1-1.pgdg13+2))
Type "help" for help.

postgres=# select inet_server_addr();
 inet_server_addr
------------------
 172.21.0.8
(1 row)

postgres=# \q
postgres@patroni1:/$ psql -h patroni3 -U postgres
psql (18.1 (Debian 18.1-1.pgdg13+2))
Type "help" for help.

postgres=# select inet_server_addr();
 inet_server_addr
------------------
 172.21.0.5
(1 row)

postgres=# \q
postgres@patroni1:/$ exit

Configuring DNS A record

These A records are stored in the dnsmasq configuration file under the domain patroni-hosts.my.dns, as follows:

address=/patroni-hosts.my.dns/172.21.0.4
address=/patroni-hosts.my.dns/172.21.0.8
address=/patroni-hosts.my.dns/172.21.0.5

We then need to restart the demo-dnsmasq container:

docker restart demo-dnsmasq

Then check the HAProxy logs to verify that service discovery picked up the records:

❯ docker logs demo-haproxy
...
[NOTICE]   (1) : Loading success.
[WARNING]  (8) : patroni_readiness_healthcheck/patroni-1: IP changed from '(none)' to '172.21.0.5' by 'demo-dns-server/dns-server'.
[WARNING]  (8) : patroni_readiness_healthcheck/patroni-2: IP changed from '(none)' to '172.21.0.8' by 'DNS cache'.
[WARNING]  (8) : patroni_readiness_healthcheck/patroni-3: IP changed from '(none)' to '172.21.0.4' by 'DNS cache'.
[WARNING]  (8) : readwrite/patroni-1: IP changed from '(none)' to '172.21.0.4' by 'DNS cache'.
[WARNING]  (8) : readwrite/patroni-2: IP changed from '(none)' to '172.21.0.5' by 'DNS cache'.
[WARNING]  (8) : readwrite/patroni-3: IP changed from '(none)' to '172.21.0.8' by 'DNS cache'.
[WARNING]  (8) : readonly/patroni-1: IP changed from '(none)' to '172.21.0.8' by 'DNS cache'.
[WARNING]  (8) : readonly/patroni-2: IP changed from '(none)' to '172.21.0.4' by 'DNS cache'.
[WARNING]  (8) : readonly/patroni-3: IP changed from '(none)' to '172.21.0.5' by 'DNS cache'.
[WARNING]  (8) : Server patroni_readiness_healthcheck/patroni-1 ('patroni-hosts.my.dns') is UP/READY (resolves again).
[WARNING]  (8) : Server patroni_readiness_healthcheck/patroni-1 administratively READY thanks to valid DNS answer.
[WARNING]  (8) : Server patroni_readiness_healthcheck/patroni-2 ('patroni-hosts.my.dns') is UP/READY (resolves again).
[WARNING]  (8) : Server patroni_readiness_healthcheck/patroni-2 administratively READY thanks to valid DNS answer.
[WARNING]  (8) : Server patroni_readiness_healthcheck/patroni-3 ('patroni-hosts.my.dns') is UP/READY (resolves again).
[WARNING]  (8) : Server patroni_readiness_healthcheck/patroni-3 administratively READY thanks to valid DNS answer.
[WARNING]  (8) : Server readwrite/patroni-1 ('patroni-hosts.my.dns') is UP/READY (resolves again).
[WARNING]  (8) : Server readwrite/patroni-1 administratively READY thanks to valid DNS answer.
[WARNING]  (8) : Server readwrite/patroni-2 ('patroni-hosts.my.dns') is UP/READY (resolves again).
[WARNING]  (8) : Server readwrite/patroni-2 administratively READY thanks to valid DNS answer.
[WARNING]  (8) : Server readwrite/patroni-3 ('patroni-hosts.my.dns') is UP/READY (resolves again).
[WARNING]  (8) : Server readwrite/patroni-3 administratively READY thanks to valid DNS answer.
[WARNING]  (8) : Server readonly/patroni-1 ('patroni-hosts.my.dns') is UP/READY (resolves again).
[WARNING]  (8) : Server readonly/patroni-1 administratively READY thanks to valid DNS answer.
[WARNING]  (8) : Server readonly/patroni-2 ('patroni-hosts.my.dns') is UP/READY (resolves again).
[WARNING]  (8) : Server readonly/patroni-2 administratively READY thanks to valid DNS answer.
[WARNING]  (8) : Server readonly/patroni-3 ('patroni-hosts.my.dns') is UP/READY (resolves again).
[WARNING]  (8) : Server readonly/patroni-3 administratively READY thanks to valid DNS answer.
[WARNING]  (8) : Server readwrite/patroni-1 is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 3ms. 2 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
[WARNING]  (8) : Server readwrite/patroni-3 is DOWN, reason: Layer7 wrong status, code: 503, info: "Service Unavailable", check duration: 4ms. 1 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
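The two DOWN lines at the end are expected rather than errors: HAProxy health-checks every pre-declared server over HTTP, and on a Patroni node the role endpoints answer 200 only when the role matches (an assumption based on the usual Patroni/HAProxy setup), so in the readwrite backend the two replicas return 503 and are marked DOWN. The decision HAProxy makes reduces to:

```python
def backend_up(status_code: int) -> bool:
    # HAProxy marks a server UP when the HTTP health check returns a
    # 2xx/3xx status; Patroni's role endpoints answer 200 on a role
    # match and 503 otherwise, which drives the log lines above.
    return 200 <= status_code < 400

assert backend_up(200)       # primary answering /primary -> UP
assert not backend_up(503)   # replica answering /primary -> DOWN
```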

Testing HAProxy

Testing readwrite connection

This connection always returns the primary node:

❯ psql -h localhost -p 5050 -U postgres
psql (18.1)
Type "help" for help.

postgres=# select inet_server_addr();
 inet_server_addr
------------------
 172.21.0.5
(1 row)

postgres=# \q
❯ psql -h localhost -p 5050 -U postgres
psql (18.1)
Type "help" for help.

postgres=# select inet_server_addr();
 inet_server_addr
------------------
 172.21.0.5
(1 row)

postgres=# select pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 f
(1 row)

postgres=# \q

Testing readonly connection

This connection returns all nodes, using the round-robin balancing method:
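Round-robin simply hands each new connection to the next server in a fixed rotation, which is why the transcript below cycles through the three IPs and then repeats. A toy illustration (real HAProxy also accounts for weights and health-check state):

```python
from itertools import cycle

# Toy illustration of HAProxy's roundrobin balance algorithm: each new
# connection goes to the next healthy server in a fixed rotation.
servers = ["172.21.0.8", "172.21.0.4", "172.21.0.5"]
picker = cycle(servers)
connections = [next(picker) for _ in range(4)]
assert connections == ["172.21.0.8", "172.21.0.4", "172.21.0.5", "172.21.0.8"]
```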

❯ psql -h localhost -p 5051 -U postgres
psql (18.1)
Type "help" for help.

postgres=# select inet_server_addr();
 inet_server_addr
------------------
 172.21.0.8
(1 row)

postgres=# \q
❯ psql -h localhost -p 5051 -U postgres
psql (18.1)
Type "help" for help.

postgres=# select inet_server_addr();
 inet_server_addr
------------------
 172.21.0.4
(1 row)

postgres=# \q
❯ psql -h localhost -p 5051 -U postgres
psql (18.1)
Type "help" for help.

postgres=# select inet_server_addr();
 inet_server_addr
------------------
 172.21.0.5
(1 row)

postgres=# \q
❯ psql -h localhost -p 5051 -U postgres
psql (18.1)
Type "help" for help.

postgres=# select inet_server_addr();
 inet_server_addr
------------------
 172.21.0.8
(1 row)

postgres=# \q
