**`content/docs/concept/_index.md`** (+2 −2)

```diff
@@ -26,11 +26,11 @@ Pigsty provides:
 
 -**Out-of-the-Box PostgreSQL Distribution**
 
-Pigsty deeply integrates [**440+ extensions**](https://pgext.cloud/zh/list) from the PostgreSQL ecosystem, providing out-of-the-box distributed, time-series, geographic, spatial, graph, vector, search, and other multi-modal database capabilities. From kernel to RDS distribution, providing production-grade database services for versions 13-18 on EL/Debian/Ubuntu.
+Pigsty deeply integrates [**440+ extensions**](https://pgext.cloud/list) from the PostgreSQL ecosystem, providing out-of-the-box distributed, time-series, geographic, spatial, graph, vector, search, and other multi-modal database capabilities. From kernel to RDS distribution, providing production-grade database services for versions 13-18 on EL/Debian/Ubuntu.
 
 -**Self-Healing High Availability Architecture**
 
-A [**high availability architecture**](/docs/concept/ha) built on Patroni, Etcd, and HAProxy enables automatic failover for hardware failures with seamless traffic handoff. Primary failure recovery time RTO < 30s, data recovery point RPO ≈ 0. You can perform rolling maintenance and upgrades on the entire cluster without application coordination.
+A [**high availability architecture**](/docs/concept/ha) built on Patroni, Etcd, and HAProxy enables automatic failover for hardware failures with seamless traffic handoff. Primary failure recovery time RTO < 45s, data recovery point RPO ≈ 0. You can perform rolling maintenance and upgrades on the entire cluster without application coordination.
@@ -94,3 +94,3 @@
-Hardware failures are covered by the self-healing HA architecture provided by patroni, etcd, and haproxy—in case of primary failure, automatic failover executes within 30 seconds by default.
+Hardware failures are covered by the self-healing HA architecture provided by patroni, etcd, and haproxy—in case of primary failure, automatic failover executes within 45 seconds by default.
 Clients don't need to modify config or restart applications: Haproxy uses patroni health checks for traffic distribution, and read-write requests are automatically routed to the new cluster primary, avoiding split-brain issues.
 This process is seamless—for example, in case of replica failure or planned switchover, clients experience only a momentary flash of the current query.
```
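The health-check routing described in this hunk can be pictured with a small HAProxy service definition of the kind Pigsty templates out. This is a hedged sketch, not the generated file: the cluster name, IPs, and ports are hypothetical, and it relies only on Patroni's documented REST behavior (the leader answers `GET /primary` with HTTP 200, replicas do not):

```
# Hypothetical HAProxy service routing writes to the current primary.
# Patroni's REST API (port 8008 by default) returns 200 for GET /primary
# only on the leader, so after failover traffic follows the new primary
# without any client-side config change.
listen pg-test-primary
    bind *:5433
    mode tcp
    option httpchk GET /primary
    http-check expect status 200
    default-server inter 3s fastinter 1s downinter 5s rise 3 fall 3 on-marked-down shutdown-sessions
    server pg-test-1 10.10.10.11:6432 check port 8008
    server pg-test-2 10.10.10.12:6432 check port 8008
    server pg-test-3 10.10.10.13:6432 check port 8008
```

Because every node's HAProxy performs the same checks, clients can connect to any node and still reach the primary, which is what makes the failover "seamless" from the application's point of view.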
**`content/docs/concept/arch/pgsql.md`** (+1 −1)

```diff
@@ -99,7 +99,7 @@ finally, you can swap different [**kernel CPUs**](/docs/pgsql/kernel) and [**ext
 The [**HA**](/docs/concept/ha) subsystem consists of [**Patroni**](#patroni) and [**etcd**](#etcd), responsible for PostgreSQL cluster failure detection, automatic failover, and configuration management.
 
 **How it works**: [Patroni](#patroni) runs on each node, managing the local [PostgreSQL](#postgresql) process and writing cluster state (leader, members, config) to [etcd](#etcd).
-When the primary fails, [Patroni](#patroni) coordinates election via [etcd](#etcd), promoting the healthiest replica to new primary. The entire process is automatic, with RTO typically under 30 seconds.
+When the primary fails, [Patroni](#patroni) coordinates election via [etcd](#etcd), promoting the healthiest replica to new primary. The entire process is automatic, with RTO typically under 45 seconds.
 
 **Key Interactions**:
 -**[PostgreSQL](#postgresql)**: Starts, stops, reloads PG as parent process, controls its lifecycle
```
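The election flow described in this hunk rests on a TTL lease: the leader keeps refreshing a key in etcd, and if the key expires the surviving replicas compete to acquire it. A minimal sketch of the Patroni timing settings involved (values illustrative; per the HA docs Pigsty derives the TTL from `pg_rto` rather than hard-coding it):

```yaml
# Illustrative Patroni DCS timing, not Pigsty's exact derivation.
ttl: 45            # etcd lease on the leader key; a silent primary is
                   # considered dead once this expires, triggering election
loop_wait: 10      # seconds between HA-loop iterations on each node
retry_timeout: 10  # retry window for etcd / PostgreSQL operations
# Patroni's documented guideline: loop_wait + 2 * retry_timeout <= ttl,
# so a node can miss one loop and retry twice before losing the lease.
```

This is why the RTO is an upper bound rather than an exact figure: detection takes up to `ttl` seconds, and promotion of the healthiest replica adds a few more.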
**`content/docs/concept/ha/_index.md`** (+4 −4)

```diff
@@ -18,7 +18,7 @@ Pigsty's PostgreSQL clusters come with out-of-the-box high availability, powered
 
 When your PostgreSQL cluster has two or more instances, you automatically have self-healing database high availability without any additional configuration — as long as any instance in the cluster survives, the cluster can provide complete service. Clients only need to connect to any node in the cluster to get full service without worrying about primary-replica topology changes.
 
-With default configuration, the primary failure Recovery Time Objective (RTO) ≈ 30s, and Recovery Point Objective (RPO) < 1MB; for replica failures, RPO = 0 and RTO ≈ 0 (brief interruption). In consistency-first mode, failover can guarantee zero data loss: RPO = 0. All these metrics can be [**configured as needed**](#tradeoffs) based on your actual hardware conditions and reliability requirements.
+With default configuration, the primary failure Recovery Time Objective (RTO) ≈ 45s, and Recovery Point Objective (RPO) < 1MB; for replica failures, RPO = 0 and RTO ≈ 0 (brief interruption). In consistency-first mode, failover can guarantee zero data loss: RPO = 0. All these metrics can be [**configured as needed**](#tradeoffs) based on your actual hardware conditions and reliability requirements.
 
 Pigsty includes built-in HAProxy load balancers for automatic traffic switching, providing DNS/VIP/LVS and other access methods for clients. Failover and switchover are almost transparent to the business side except for brief interruptions - applications don't need to modify connection strings or restart.
 The minimal maintenance window requirements bring great flexibility and convenience: you can perform rolling maintenance and upgrades on the entire cluster without application coordination. The feature that hardware failures can wait until the next day to handle lets developers, operations, and DBAs sleep well during incidents.
@@ -31,7 +31,7 @@ Many large organizations and core institutions have been using Pigsty in product
 
 **What problems does High Availability solve?**
 
-* Elevates data security C/IA availability to a new level: RPO ≈ 0, RTO < 30s.
+* Elevates data security C/IA availability to a new level: RPO ≈ 0, RTO < 45s.
 * Gains seamless rolling maintenance capability, minimizing maintenance window requirements and bringing great convenience.
 * Hardware failures can self-heal immediately without human intervention, allowing operations and DBAs to sleep well.
 * Replicas can handle read-only requests, offloading primary load and fully utilizing resources.
@@ -99,7 +99,7 @@ The default **RTO** and **RPO** values used by Pigsty meet reliability requireme
 Too small an RTO increases false positive rates; too small an RPO reduces the probability of successful automatic failover.
 {{% /alert %}}
 
-The upper limit of unavailability during failover is controlled by the [**`pg_rto`**](/docs/pgsql/param#pg_rto) parameter. **RTO** defaults to `30s`. Increasing it will result in longer primary failure write unavailability, while decreasing it will increase the rate of false positive failovers (e.g., repeated switching due to brief network jitter).
+The upper limit of unavailability during failover is controlled by the [**`pg_rto`**](/docs/pgsql/param#pg_rto) parameter. **RTO** defaults to `45s`. Increasing it will result in longer primary failure write unavailability, while decreasing it will increase the rate of false positive failovers (e.g., repeated switching due to brief network jitter).
 
 The upper limit of potential data loss is controlled by the [**`pg_rpo`**](/docs/pgsql/param#pg_rpo) parameter, defaulting to `1MB`. Reducing this value can lower the data loss ceiling during failover but also increases the probability of refusing automatic failover when replicas are not healthy enough (lagging too far behind).
@@ -118,7 +118,7 @@ If you need to ensure zero data loss during failover, you can use the [**`crit.y
 
 Parameter name: `pg_rto`, Type: `int`, Level: `C`
 
-Recovery Time Objective (RTO) in seconds. This is used to calculate Patroni's TTL value, defaulting to `30` seconds.
+Recovery Time Objective (RTO) in seconds. This is used to calculate Patroni's TTL value, defaulting to `45` seconds.
 
 If the primary instance is missing for this long, a new leader election will be triggered. This value is not always better when lower; it involves tradeoffs:
 Reducing this value can decrease unavailability during cluster failover (inability to write), but makes the cluster more sensitive to short-term network jitter, increasing the probability of false positive failover triggers.
```
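Since `pg_rto` is documented here as a cluster-level (`C`) parameter, the RTO/RPO tradeoff can be tuned per cluster in the config inventory. A hypothetical fragment, assuming the usual Pigsty inventory layout; the cluster name, IPs, and the byte encoding of the `1MB` `pg_rpo` default are illustrative assumptions, not values from this diff:

```yaml
# Hypothetical pigsty.yml cluster definition overriding failover tradeoffs
pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
    10.10.10.12: { pg_seq: 2, pg_role: replica }
    10.10.10.13: { pg_seq: 3, pg_role: replica }
  vars:
    pg_cluster: pg-test
    pg_rto: 45          # seconds; used to calculate Patroni's TTL
    pg_rpo: 1048576     # bytes of replication lag tolerated on failover (~1MB)
```

Raising `pg_rto` trades longer write unavailability for fewer false-positive failovers; lowering `pg_rpo` trades a smaller data-loss ceiling for a higher chance that automatic failover is refused when replicas lag.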