Commit 7f22501

Merge tag 'v4.1.0' into multisite
2 parents 9c221af + 2a3f060 commit 7f22501

25 files changed

+595
-95
lines changed

.github/workflows/tests.yaml

Lines changed: 3 additions & 3 deletions
@@ -13,7 +13,7 @@ env:
1313

1414
jobs:
1515
unit:
16-
runs-on: ${{ fromJson('{"ubuntu":"ubuntu-22.04","windows":"windows-latest","macos":"macos-latest"}')[matrix.os] }}
16+
runs-on: ${{ fromJson('{"ubuntu":"ubuntu-22.04","windows":"windows-latest","macos":"macos-15"}')[matrix.os] }}
1717
strategy:
1818
fail-fast: false
1919
matrix:
@@ -105,7 +105,7 @@ jobs:
105105
run: python -m coveralls --service=github
106106

107107
behave:
108-
runs-on: ${{ fromJson('{"ubuntu":"ubuntu-22.04","windows":"windows-latest","macos":"macos-latest"}')[matrix.os] }}
108+
runs-on: ${{ fromJson('{"ubuntu":"ubuntu-22.04","windows":"windows-latest","macos":"macos-14"}')[matrix.os] }}
109109
env:
110110
DCS: ${{ matrix.dcs }}
111111
ETCDVERSION: 3.4.23
@@ -198,7 +198,7 @@ jobs:
198198

199199
- uses: jakebailey/pyright-action@v2
200200
with:
201-
version: 1.1.401
201+
version: 1.1.405
202202

203203
ydiff:
204204
name: Test compatibility with the latest version of ydiff

README.rst

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ Patroni is a template for high availability (HA) PostgreSQL solutions using Pyth
1212

1313
We call Patroni a "template" because it is far from being a one-size-fits-all or plug-and-play replication system. It will have its own caveats. Use wisely.
1414

15-
Currently supported PostgreSQL versions: 9.3 to 17.
15+
Currently supported PostgreSQL versions: 9.3 to 18.
1616

1717
**Note to Citus users**: Starting from 3.0 Patroni nicely integrates with the `Citus <https://github.com/citusdata/citus>`__ database extension to Postgres. Please check the `Citus support page <https://github.com/patroni/patroni/blob/master/docs/citus.rst>`__ in the Patroni documentation for more info about how to use Patroni high availability together with a Citus distributed cluster.
1818

docs/index.rst

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@ Patroni is a template for high availability (HA) PostgreSQL solutions using Pyth
1010

1111
We call Patroni a "template" because it is far from being a one-size-fits-all or plug-and-play replication system. It will have its own caveats. Use wisely. There are many ways to run high availability with PostgreSQL; for a list, see the `PostgreSQL Documentation <https://wiki.postgresql.org/wiki/Replication,_Clustering,_and_Connection_Pooling>`__.
1212

13-
Currently supported PostgreSQL versions: 9.3 to 17.
13+
Currently supported PostgreSQL versions: 9.3 to 18.
1414

1515
**Note to Citus users**: Starting from 3.0 Patroni nicely integrates with the `Citus <https://github.com/citusdata/citus>`__ database extension to Postgres. Please check the :ref:`Citus support page <citus>` in the Patroni documentation for more info about how to use Patroni high availability together with a Citus distributed cluster.
1616

docs/patronictl.rst

Lines changed: 18 additions & 0 deletions
@@ -1136,6 +1136,7 @@ Synopsis
11361136
[ --group CITUS_GROUP ]
11371137
[ --wait ]
11381138
[ --force ]
1139+
[ --from-leader ]
11391140
11401141
.. _patronictl_reinit_description:
11411142
@@ -1168,6 +1169,9 @@ Parameters
11681169
``--force``
11691170
Flag to skip confirmation prompts when rebuilding Postgres standby instances.
11701171
1172+
``--from-leader``
1173+
Flag to get basebackup from leader directly.
1174+
11711175
Useful for scripts.
11721176
11731177
.. _patronictl_reinit_examples:
@@ -1206,6 +1210,20 @@ Request a rebuild of ``postgresql2`` and wait for it to complete:
12061210
Waiting for reinitialize to complete on: postgresql2
12071211
Reinitialize is completed on: postgresql2
12081212
1213+
Request a rebuild of ``postgresql2`` and get basebackup from leader directly:
1214+
1215+
.. code:: bash
1216+
1217+
$ patronictl -c postgres0.yml reinit batman postgresql2 --from-leader
1218+
+ Cluster: batman (7277694203142172922) -+-----------+----+-------------+-----+------------+-----+
1219+
| Member | Host | Role | State | TL | Receive LSN | Lag | Replay LSN | Lag |
1220+
+-------------+----------------+---------+-----------+----+-------------+-----+------------+-----+
1221+
| postgresql0 | 127.0.0.1:5432 | Leader | running | 5 | | | | |
1222+
| postgresql1 | 127.0.0.1:5433 | Replica | streaming | 5 | 0/40004E8 | 0 | 0/40004E8 | 0 |
1223+
| postgresql2 | 127.0.0.1:5434 | Replica | streaming | 5 | 0/40004E8 | 0 | 0/40004E8 | 0 |
1224+
+-------------+----------------+---------+-----------+----+-------------+-----+------------+-----+
1225+
Success: reinitialize for member postgresql2
1226+
12091227
.. _patronictl_reload:
12101228
12111229
patronictl reload

docs/releases.rst

Lines changed: 121 additions & 0 deletions
@@ -3,6 +3,127 @@
33
Release notes
44
=============
55

6+
Version 4.1.0
7+
-------------
8+
9+
Released 2025-09-23
10+
11+
**New features**
12+
13+
- Add support for systemd "notify" unit type (Ronan Dunklau)
14+
15+
Without a notify unit type, it is possible to start Patroni and immediately send it a SIGHUP signal using systemd, effectively killing it before it had time to set up its signal handlers.
16+
17+
- Provide receive and replay LSN/lag information in API and ctl (Polina Bungina)
18+
19+
Patroni REST API ``/cluster`` endpoint and ``patronictl list`` command now provide receive LSN, replay LSN, receive lag, and replay lag information for each replica member.
20+
21+
- Ensure clean demotion to standby cluster (Polina Bungina)
22+
23+
Make sure the introduction of the ``standby_cluster`` section in the dynamic configuration leads to a clean cluster demotion.
24+
25+
- Implement ``patronictl demote-cluster`` and ``promote-cluster`` commands (Polina Bungina)
26+
27+
New commands for cluster demotion and promotion handle both the dynamic configuration editing and checking the result status.
28+
29+
- Implement ``sync_priority`` tag (Polina Bungina)
30+
31+
This parameter controls the priority a member should have during synchronous replica selection when ``synchronous_mode`` is set to ``on``.
32+
33+
- Implement ``--print`` option for ``--validate-config`` (Polina Bungina)
34+
35+
Print out local configuration (including environment configuration overrides) after it has been successfully validated.
36+
37+
- Implement ``kubernetes.bootstrap_labels`` (Polina Bungina)
38+
39+
This feature allows you to define labels that will be assigned to a member pod when in ``initializing new cluster``, ``running custom bootstrap script``, ``starting after custom bootstrap``, or ``creating replica`` state.
40+
41+
- Add configuration option to suppress duplicate heartbeat logs (Michael Morris)
42+
43+
If set to ``true``, successive heartbeat logs that are identical shall not be output.
44+
45+
- Add optional ``cluster_type`` attribute to permanent replication slots (Michael Banck)
46+
47+
This allows you to set whether a particular permanent replication slot should always be created, or just on a primary or standby cluster.
48+
49+
- Make HTTP Server header configurable (David Grierson)
50+
51+
Introduce the ``restapi.server_tokens`` configuration parameter that allows you to restrict information disclosed in the HTTP Server header.
52+
53+
- Implement readiness API checks for replication on replica members (Ants Aasma)
54+
55+
The previous implementation considered replicas ready as soon as PostgreSQL was started. With this change, a replica pod is only considered ready when PostgreSQL is replicating and is not too far behind the leader.
56+
57+
58+
**Improvements**
59+
60+
- Reduce log level of watchdog configuration failure (Ants Aasma)
61+
62+
Show the ``Could not activate Linux watchdog device`` log line on debug logging level, when the watchdog is configured with ``required`` mode. It was previously shown on info level.
63+
64+
- Take advantage of ``written_lsn`` and ``latest_end_lsn`` from ``pg_stat_wal_receiver`` (Alexander Kukushkin)
65+
66+
``written_lsn``, the actual write LSN, is now preferred over the one returned by ``pg_last_wal_receive_lsn()``, which is in fact the flush LSN. ``latest_end_lsn`` points to WAL flush on the source host. In case of a primary, it allows better calculation of the replay lag, because values stored in DCS are updated only every ``loop_wait`` seconds.
67+
68+
- Avoid interactions with slots created with the ``failover=true`` option (Alexander Kukushkin)
69+
70+
This change is required to make the logical failover slots feature fully functional.
71+
72+
- Add PostgreSQL state to ``/metrics`` REST API endpoint (Ivan Filianin)
73+
74+
PostgreSQL instance state information is now available in the Prometheus format output of the ``/metrics`` REST API endpoint.
75+
76+
77+
Version 4.0.7
78+
-------------
79+
80+
Released 2025-09-22
81+
82+
**New features**
83+
84+
- Add support for PostgreSQL 18 RC1 (Alexander Kukushkin)
85+
86+
GUC's validator rules were extended. Patroni now properly handles the new background I/O worker.
87+
88+
89+
**Bugfixes**
90+
91+
- Fix potential issue around resolving localhost to IPv6 on Windows (András Váczi)
92+
93+
When configuring ``listen_addresses`` in PostgreSQL, using ``0.0.0.0`` or ``127.0.0.1`` will restrict listening to IPv4 only, excluding IPv6. On typical Windows systems, however, ``localhost`` often resolves to the IPv6 address ``::1`` by default. To ensure compatibility, Patroni now configures PostgreSQL to listen on ``127.0.0.1``, instead of ``localhost``, on Windows systems.
94+
95+
- Return global config only when ``/config`` key exists in DCS (Alexander Kukushkin)
96+
97+
Patroni REST API was returning an empty configuration instead of raising an error if the ``/config`` key was missing in DCS.
98+
99+
- Fix the issue of failsafe mode not being triggered in case of Etcd unavailability (Alexander Kukushkin)
100+
101+
Patroni was not always properly handling ``etcd3`` exceptions, which resulted in failsafe mode not being triggered.
102+
103+
- Fix signal handler reentrancy deadlock (Waynerv)
104+
105+
Patroni running in a Docker container with ``PID=1`` in some special cases was experiencing deadlock after receiving ``SIGCHLD``.
106+
107+
- Recreate (permanent) physical slot when it doesn't reserve WAL (Israel Barth Rubio)
108+
109+
Permanent physical replication slots created outside of Patroni scope without reserving WALs were causing a ``replication slot cannot be advanced`` error. To avoid this, Patroni now recreates such slots.
110+
111+
- Handle watch cancelation messages in ``etcd3`` properly (Alexander Kukushkin)
112+
113+
When ``etcd3`` sends a cancelation message to the watch channel, it doesn't close the connection. This results in Patroni using stale data. Patroni now solves it by breaking a loop of reading chunked response and closing the connection on the Patroni side.
114+
115+
- Handle case when ``HTTPConnection`` socket is wrapped with ``pyopenssl`` (Alexander Kukushkin)
116+
117+
Patroni was not correctly using ``pyopenssl`` interfaces, enforced in ``python-etcd``.
118+
119+
120+
**Documentation improvements**
121+
122+
- Improve 2-node cluster guidance (Nikolay Samokhvalov)
123+
124+
Clarify behaviour during failover and DCS requirements.
125+
126+
6127
Version 4.0.6
7128
-------------
8129

docs/rest_api.rst

Lines changed: 69 additions & 2 deletions
@@ -286,7 +286,7 @@ Retrieve the Patroni metrics in Prometheus format through the ``GET /metrics`` e
286286
.. code-block:: bash
287287
288288
$ curl http://localhost:8008/metrics
289-
289+
290290
# HELP patroni_version Patroni semver without periods. \
291291
# TYPE patroni_version gauge
292292
patroni_version{scope="batman",name="patroni1"} 040000
@@ -353,6 +353,71 @@ Retrieve the Patroni metrics in Prometheus format through the ``GET /metrics`` e
353353
# HELP patroni_is_paused Value is 1 if auto failover is disabled, 0 otherwise.
354354
# TYPE patroni_is_paused gauge
355355
patroni_is_paused{scope="batman",name="patroni1"} 1
356+
# HELP patroni_postgres_state Numeric representation of Postgres state.
357+
# Values: 0=initdb, 1=initdb_failed, 2=custom_bootstrap, 3=custom_bootstrap_failed, 4=creating_replica, 5=running, 6=starting, 7=bootstrap_starting, 8=start_failed, 9=restarting, 10=restart_failed, 11=stopping, 12=stopped, 13=stop_failed, 14=crashed
358+
# TYPE patroni_postgres_state gauge
359+
patroni_postgres_state{scope="batman",name="patroni1"} 5
360+
361+
PostgreSQL State Values
362+
^^^^^^^^^^^^^^^^^^^^^^^
363+
364+
The ``patroni_postgres_state`` metric provides a numeric representation of the current PostgreSQL instance state. This is useful for monitoring and alerting systems that need to track state changes over time. The numeric values are generated using the ``PostgresqlState.get_metrics_description()`` static method.
365+
366+
.. list-table:: PostgreSQL State Values
367+
:widths: 10 20 50
368+
:header-rows: 1
369+
370+
* - Value
371+
- State Name
372+
- Description
373+
* - 0
374+
- initdb
375+
- Initializing new cluster
376+
* - 1
377+
- initdb_failed
378+
- Initialization of new cluster failed
379+
* - 2
380+
- custom_bootstrap
381+
- Running custom bootstrap script
382+
* - 3
383+
- custom_bootstrap_failed
384+
- Custom bootstrap script failed
385+
* - 4
386+
- creating_replica
387+
- Creating replica from primary
388+
* - 5
389+
- running
390+
- PostgreSQL is running normally
391+
* - 6
392+
- starting
393+
- PostgreSQL is starting up
394+
* - 7
395+
- bootstrap_starting
396+
- Starting after custom bootstrap
397+
* - 8
398+
- start_failed
399+
- PostgreSQL start failed
400+
* - 9
401+
- restarting
402+
- PostgreSQL is restarting
403+
* - 10
404+
- restart_failed
405+
- PostgreSQL restart failed
406+
* - 11
407+
- stopping
408+
- PostgreSQL is stopping
409+
* - 12
410+
- stopped
411+
- PostgreSQL is stopped
412+
* - 13
413+
- stop_failed
414+
- PostgreSQL stop failed
415+
* - 14
416+
- crashed
417+
- PostgreSQL has crashed
418+
419+
.. note::
420+
These numeric values are fixed and will never change to maintain backward compatibility with existing monitoring systems. If new states are added in the future, they will be assigned new numeric values without changing existing ones.
356421

357422

358423
Cluster status endpoints
@@ -637,7 +702,7 @@ In the JSON body of the ``POST`` request you must specify the ``candidate`` fiel
637702
Successfully failed over to "postgresql1"
638703
639704
.. warning::
640-
:ref:`Be very careful <failover_healthcheck>` when using this endpoint, as this can cause data loss in certain situations. In most cases, :ref:`the switchover endpoint <switchover_api>` satisfies the administrator's needs.
705+
:ref:`Be very careful <failover_healthcheck>` when using this endpoint, as this can cause data loss in certain situations. In most cases, :ref:`the switchover endpoint <switchover_api>` satisfies the administrator's needs.
641706

642707

643708
``POST /switchover`` and ``POST /failover`` endpoints are used by :ref:`patronictl_switchover` and :ref:`patronictl_failover`, respectively.
@@ -721,4 +786,6 @@ Reinitialize endpoint
721786

722787
The call might fail if Patroni is in a loop trying to recover (restart) a failed Postgres. In order to overcome this problem one can specify ``{"force":true}`` in the request body.
723788

789+
You can specify ``{"from_leader":true}`` in the request body to take the basebackup directly from the leader node. This is useful when reinitializing a replica while all other replica nodes have failed.
790+
724791
The reinitialize endpoint is used by :ref:`patronictl_reinit`.
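
The ``patroni_postgres_state`` gauge documented in this file can be turned back into a state name on the monitoring side. The following is a minimal sketch, not part of the commit: the function name ``decode_postgres_state`` and the ``"unknown"`` fallback are assumptions made for illustration, and the mapping is copied from the "PostgreSQL State Values" table. The ``-1`` case reflects the fallback value that ``do_GET_metrics`` emits when the state is not a ``PostgresqlState``.

```python
import re
from typing import Optional

# Mapping copied from the "PostgreSQL State Values" table in docs/rest_api.rst.
STATE_NAMES = {
    0: "initdb",
    1: "initdb_failed",
    2: "custom_bootstrap",
    3: "custom_bootstrap_failed",
    4: "creating_replica",
    5: "running",
    6: "starting",
    7: "bootstrap_starting",
    8: "start_failed",
    9: "restarting",
    10: "restart_failed",
    11: "stopping",
    12: "stopped",
    13: "stop_failed",
    14: "crashed",
}

# Matches the sample line, e.g. patroni_postgres_state{scope="batman",name="patroni1"} 5
_STATE_LINE = re.compile(
    r'^patroni_postgres_state\{[^}]*\}\s+(-?\d+)\s*$', re.MULTILINE
)


def decode_postgres_state(metrics_text: str) -> Optional[str]:
    """Return the state name found in a /metrics payload, or None if the metric is absent.

    Hypothetical helper: values outside the documented table (including -1)
    are reported as "unknown".
    """
    match = _STATE_LINE.search(metrics_text)
    if match is None:
        return None
    return STATE_NAMES.get(int(match.group(1)), "unknown")


sample = """# HELP patroni_postgres_state Numeric representation of Postgres state.
# TYPE patroni_postgres_state gauge
patroni_postgres_state{scope="batman",name="patroni1"} 5
"""
print(decode_postgres_state(sample))
```

Because the documentation states that existing numeric values will not change, a static mapping like this should stay valid; only newly added states would need new entries.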

features/patroni_api.feature

Lines changed: 1 addition & 1 deletion
@@ -77,7 +77,7 @@ Scenario: check API requests for the primary-replica pair in the pause mode
7777
Then I receive a response code 200
7878
And I receive a response state running
7979
And I receive a response role replica
80-
When I run patronictl.py reinit batman postgres-1 --force --wait
80+
When I run patronictl.py reinit batman postgres-1 --force --from-leader --wait
8181
Then I receive a response returncode 0
8282
And I receive a response output "Success: reinitialize for member postgres-1"
8383
And postgres-1 role is the secondary after 30 seconds

patroni/api.py

Lines changed: 12 additions & 1 deletion
@@ -733,6 +733,15 @@ def do_GET_metrics(self) -> None:
733733
metrics.append("# TYPE patroni_is_paused gauge")
734734
metrics.append("patroni_is_paused{0} {1}".format(labels, int(postgres.get('pause', 0))))
735735

736+
metrics.append("# HELP patroni_postgres_state Numeric representation of Postgres state.")
737+
# Generate description of all state values for metrics documentation
738+
state_descriptions = [f"{state.index}={state.name.lower()}" for state in PostgresqlState]
739+
metrics.append(f"# Values: {', '.join(state_descriptions)}")
740+
metrics.append("# TYPE patroni_postgres_state gauge")
741+
current_state = postgres['state']
742+
state_value = current_state.index if isinstance(current_state, PostgresqlState) else -1
743+
metrics.append(f"patroni_postgres_state{labels} {state_value}")
744+
736745
if patroni.multisite.is_active:
737746
metrics.append("# HELP patroni_multisite_switches Number of times multisite leader has been switched")
738747
metrics.append("# TYPE patroni_multisite_switches counter")
@@ -1070,6 +1079,7 @@ def do_POST_reinitialize(self) -> None:
10701079
The request body may contain a JSON dictionary with the following key:
10711080
10721081
* ``force``: ``True`` if we want to cancel an already running task in order to reinit a replica.
1082+
* ``from_leader``: ``True`` if we want to reinit a replica and get basebackup from the leader node.
10731083
10741084
Response HTTP status codes:
10751085
@@ -1082,8 +1092,9 @@ def do_POST_reinitialize(self) -> None:
10821092
logger.debug('received reinitialize request: %s', request)
10831093

10841094
force = isinstance(request, dict) and parse_bool(request.get('force')) or False
1095+
from_leader = isinstance(request, dict) and parse_bool(request.get('from_leader')) or False
10851096

1086-
data = self.server.patroni.ha.reinitialize(force)
1097+
data = self.server.patroni.ha.reinitialize(force, from_leader)
10871098
if data is None:
10881099
status_code = 200
10891100
data = 'reinitialize started'
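
The handler above reads the ``force`` and ``from_leader`` keys from the JSON request body. A client-side call could be sketched as follows, using only the Python standard library. This is an illustration rather than code from the commit: the helper names ``build_reinitialize_request`` and ``send`` are hypothetical, the address ``http://localhost:8008`` is taken from the documentation's ``curl`` example, and REST API authentication (if configured) is not handled.

```python
import json
import urllib.request
from typing import Tuple


def build_reinitialize_request(
    base_url: str, force: bool = False, from_leader: bool = False
) -> urllib.request.Request:
    """Build a POST /reinitialize request (hypothetical helper, not part of Patroni)."""
    # Key names match what do_POST_reinitialize reads: 'force' and 'from_leader'.
    body = json.dumps({"force": force, "from_leader": from_leader}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/reinitialize",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> Tuple[int, str]:
    """Send the request and return (status code, response text).

    urllib raises urllib.error.HTTPError for non-2xx responses, so callers
    that want to inspect error statuses need to catch it.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")


# Building the request does not touch the network; only send() does.
request = build_reinitialize_request("http://localhost:8008", from_leader=True)
print(request.get_method(), request.full_url, request.data.decode("utf-8"))
```

Separating request construction from sending keeps the body format easy to check without a running Patroni instance; ``send(request)`` would be called against the replica that should be rebuilt.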
