Commit 53608ba

Sync docs from Discourse (#850)
Co-authored-by: GitHub Actions <41898282+github-actions[bot]@users.noreply.github.com>
1 parent 573517b commit 53608ba

14 files changed

+1288
-19
lines changed

docs/explanation.md

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -9,9 +9,10 @@ This section contains pages with more detailed explanations that provide additio
99
* [Legacy charm]
1010

1111
## Operational concepts
12-
* [Connection pooling]
12+
* [Units]
1313
* [Users]
1414
* [Logs]
15+
* [Connection pooling]
1516

1617
## Security and hardening
1718
* [Security hardening guide][Security]
@@ -22,6 +23,7 @@ This section contains pages with more detailed explanations that provide additio
2223

2324
[Architecture]: /t/11857
2425
[Interfaces and endpoints]: /t/10251
26+
[Units]: /t/17525
2527
[Users]: /t/10798
2628
[Logs]: /t/12099
2729
[Juju]: /t/11985

docs/explanation/e-units.md

Lines changed: 93 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,93 @@
1+
# PostgreSQL units
2+
3+
Each [HA](https://en.wikipedia.org/wiki/High_availability)/[DR](https://en.wikipedia.org/wiki/IT_disaster_recovery) implementation has a primary site and one or more secondary (standby) sites.
4+
A Charmed PostgreSQL cluster can be [easily scaled](/t/11863) from 0 to 10 units ([contact us](/t/11863) for clusters of 10+ units). At least 3 units are recommended in production (due to [Raft consensus](https://en.wikipedia.org/wiki/Raft_(algorithm)) requirements). Each unit has one of the following types:
5+
* **Primary**: the unit that accepts all writes and guarantees [no split brain](https://en.wikipedia.org/wiki/Split-brain_(computing)).
6+
* **Sync Standby** (synchronous copy): designed for fast automatic failover. Used for read-only queries and guarantees availability of the latest transaction.
7+
* **Replica** (asynchronous copy): designed for long-running and resource-consuming queries without affecting Primary performance. Used for read-only queries, without a guarantee that the latest transaction is available.
8+
9+
> **Warning**: every SQL transaction has to be confirmed by all Sync Standby units before the Primary unit confirms the commit to the client. Therefore, the number of "Sync Standby" versus "Replica" units in the cluster is a trade-off between high performance and high availability.
10+
11+
> **Note**: starting from revision 561, all Charmed PostgreSQL units are configured as Sync Standby members. This provides better guarantees of data survival when two of three units are lost simultaneously. Users can reconfigure the number of synchronous units using the Juju config option '[synchronous_node_count](https://charmhub.io/postgresql/configurations?channel=14/edge#synchronous_node_count)'.
12+
13+
![PostgreSQL Units types|690x253, 100%](upload://pY5kzxO9ELJGEqEe1F1RQjOG6SS.png)
14+
15+
## Primary
16+
17+
The simplest way to find the Primary unit is to run `juju status`. Be aware that this information can be outdated, as it is only updated on the [Juju event 'update-status'](https://documentation.ubuntu.com/juju/3.6/reference/hook/#update-status):
18+
```shell
19+
ubuntu@juju360:~$ juju status postgresql
20+
Model Controller Cloud/Region Version SLA Timestamp
21+
postgresql lxd localhost/localhost 3.6.5 unsupported 13:04:15+02:00
22+
23+
App Version Status Scale Charm Channel Rev Exposed Message
24+
postgresql 14.15 active 3 postgresql 14/stable 553 no
25+
26+
Unit Workload Agent Machine Public address Ports Message
27+
postgresql/0* active idle 0 10.189.210.53 5432/tcp Primary <<<<<<<<<<<<<<
28+
postgresql/1 active idle 1 10.189.210.166 5432/tcp
29+
postgresql/2 active idle 2 10.189.210.188 5432/tcp
30+
31+
Machine State Address Inst id Base AZ Message
32+
0 started 10.189.210.53 juju-422c1a-0 [email protected] Running
33+
1 started 10.189.210.166 juju-422c1a-1 [email protected] Running
34+
2 started 10.189.210.188 juju-422c1a-2 [email protected] Running
35+
```
36+
37+
The up-to-date Primary unit can be retrieved using the Juju action `get-primary`:
38+
```shell
39+
> juju run postgresql/leader get-primary
40+
...
41+
primary: postgresql/0
42+
```
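For scripting, the unit name can be extracted from the action output. The following is a minimal sketch, assuming the output contains a `primary: <unit>` line as shown above; the `primary_unit` helper and the `action_output` sample variable are hypothetical names used for illustration only.

```shell
# Sample text modelled on the `get-primary` output above.
# In practice, pipe the real `juju run ... get-primary` output instead.
action_output='...
primary: postgresql/0'

# primary_unit: read action output on stdin and print the unit name
# that follows the "primary:" key (hypothetical helper).
primary_unit() {
  awk '$1 == "primary:" { print $2 }'
}

printf '%s\n' "$action_output" | primary_unit
```

With the sample above, this prints `postgresql/0`.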
43+
44+
It is also possible to retrieve this information using [patronictl](/t/17406#p-37204-patronictl-3) and the [Patroni REST API](/t/17406#p-37204-patroni-rest-api-8).
45+
46+
## Standby / Replica
47+
48+
At the moment, this information can only be retrieved using [patronictl](/t/17406#p-37204-patronictl-3) and the [Patroni REST API](/t/17406#p-37204-patroni-rest-api-8) (check the linked documentation for the access details). Example:
49+
```shell
50+
> ... patronictl ... list
51+
+ Cluster: postgresql (7499430436963402504) ---+-----------+----+-----------+
52+
| Member | Host | Role | State | TL | Lag in MB |
53+
+--------------+----------------+--------------+-----------+----+-----------+
54+
| postgresql-0 | 10.189.210.53 | Leader | running | 1 | |
55+
| postgresql-1 | 10.189.210.166 | Sync Standby | streaming | 1 | 0 |
56+
| postgresql-2 | 10.189.210.188 | Replica | streaming | 1 | 0 |
57+
+--------------+----------------+--------------+-----------+----+-----------+
58+
```
59+
In the example above:
60+
* `postgresql-0` is the PostgreSQL Primary unit (Patroni Leader), which accepts all writes.
61+
* `postgresql-1` is a PostgreSQL/Patroni Sync Standby unit, which can be promoted to the new Primary using a manual switchover (safe).
62+
* `postgresql-2` is a PostgreSQL/Patroni Replica unit, which can NOT be directly promoted to the new Primary using a manual switchover. The automatic promotion Replica=>Sync Standby is necessary to guarantee that the latest SQL transactions are available on this unit before it can be promoted to the new Primary. Otherwise, a manual failover to the Replica unit can be performed, accepting the risk of losing the last transaction(s) that lagged behind the Primary.
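The role of each member can also be picked out of the table programmatically. This is a rough sketch that filters the `patronictl list` table by its Role column, assuming the column layout shown above; the `members_with_role` helper and the `sample` variable are hypothetical, and the sample rows are copied from the example output.

```shell
# Sample copied from the `patronictl list` example above.
# In practice, pipe the real command output instead.
sample='+ Cluster: postgresql (7499430436963402504) ---+-----------+----+-----------+
| Member | Host | Role | State | TL | Lag in MB |
+--------------+----------------+--------------+-----------+----+-----------+
| postgresql-0 | 10.189.210.53 | Leader | running | 1 | |
| postgresql-1 | 10.189.210.166 | Sync Standby | streaming | 1 | 0 |
| postgresql-2 | 10.189.210.188 | Replica | streaming | 1 | 0 |
+--------------+----------------+--------------+-----------+----+-----------+'

# members_with_role ROLE: print the names of members whose Role column
# equals ROLE exactly (hypothetical helper). Border and title lines
# contain no "|" separators and are skipped by the field-count check.
members_with_role() {
  awk -F'|' -v role="$1" '
    NF >= 7 {
      name = $2; r = $4
      gsub(/^ +| +$/, "", name)   # trim padding around the member name
      gsub(/^ +| +$/, "", r)      # trim padding around the role
      if (r == role) print name
    }'
}

printf '%s\n' "$sample" | members_with_role "Sync Standby"
```

With the sample above, this prints `postgresql-1`, the only member that is safe to target with a manual switchover.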
63+
64+
## Replica lag distance
65+
66+
At the moment, this information can only be retrieved using [patronictl](/t/17406#p-37204-patronictl-3) and the [Patroni REST API](/t/17406#p-37204-patroni-rest-api-8) (check the linked documentation for the access details). Example:
67+
```shell
68+
> ... patronictl ... list
69+
+ Cluster: postgresql (7499430436963402504) ---+-----------+----+-----------+
70+
| Member | Host | Role | State | TL | Lag in MB |
71+
+--------------+----------------+--------------+-----------+----+-----------+
72+
| postgresql-0 | 10.189.210.53 | Leader | running | 1 | |
73+
| ...
74+
| postgresql-2 | 10.189.210.188 | Replica | streaming | 1 | 42 | <<<<<
75+
+--------------+----------------+--------------+-----------+----+-----------+
76+
77+
> curl ... x.x.x.x:8008/cluster | jq
78+
"members": [
79+
{
80+
"name": "postgresql-0",
81+
"role": "leader",
82+
"state": "running",
83+
...
84+
},
85+
...
86+
{
87+
"name": "postgresql-2",
88+
"role": "replica",
89+
"state": "streaming",
90+
...
91+
"lag": 42 <<<<<<<<<<<< Lag in MB
92+
}
93+
```
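A lag check can be scripted on top of the table output. The sketch below flags members whose "Lag in MB" value exceeds a threshold, assuming the column layout shown above; the `lagging_members` helper, the `lag_sample` variable, and the threshold values are hypothetical and chosen only for illustration.

```shell
# Sample copied from the `patronictl list` example above, including the
# elided row. In practice, pipe the real command output instead.
lag_sample='+ Cluster: postgresql (7499430436963402504) ---+-----------+----+-----------+
| Member | Host | Role | State | TL | Lag in MB |
+--------------+----------------+--------------+-----------+----+-----------+
| postgresql-0 | 10.189.210.53 | Leader | running | 1 | |
| ...
| postgresql-2 | 10.189.210.188 | Replica | streaming | 1 | 42 |
+--------------+----------------+--------------+-----------+----+-----------+'

# lagging_members MAX_MB: print "<member> <lag>" for every member whose
# lag column is numeric and greater than MAX_MB (hypothetical helper).
# The header row and the Leader row have no numeric lag and are skipped.
lagging_members() {
  awk -F'|' -v max="$1" '
    NF >= 8 {
      name = $2; lag = $7
      gsub(/ /, "", name)
      gsub(/ /, "", lag)
      if (lag ~ /^[0-9]+$/ && lag + 0 > max + 0) print name, lag
    }'
}

printf '%s\n' "$lag_sample" | lagging_members 10
```

With the sample above and a threshold of 10 MB, this prints `postgresql-2 42`. A threshold of 100 prints nothing.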

docs/explanation/e-users.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
# Charm Users explanations
1+
# Users
22

33
There are three types of users in PostgreSQL:
44
* Internal users (used by charm operator)

docs/how-to.md

Lines changed: 17 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -13,10 +13,12 @@ Installation of different cloud services with Juju:
1313
* [Azure]
1414
* [Multi-availability zones (AZ)][Multi-AZ]
1515

16-
Specific deployment scenarios and architectures:
17-
* [Terraform]
18-
* [Air-gapped]
16+
Other deployment scenarios and configurations:
1917
* [TLS VIP access]
18+
* [Juju spaces]
19+
* [Air-gapped]
20+
* [Terraform]
21+
* [Juju storage]
2022

2123
## Usage and maintenance
2224

@@ -25,6 +27,7 @@ Specific deployment scenarios and architectures:
2527
* [Scale replicas]
2628
* [Enable TLS]
2729
* [Enable plugins/extensions]
30+
* [Switchover/failover]
2831

2932
## Backup and restore
3033
* [Configure S3 AWS]
@@ -36,9 +39,10 @@ Specific deployment scenarios and architectures:
3639

3740
## Monitoring (COS)
3841

39-
* [Enable monitoring]
40-
* [Enable alert rules]
41-
* [Enable tracing]
42+
* [Enable monitoring] with Grafana
43+
* [Enable alert rules] with Prometheus
44+
* [Enable tracing] with Tempo
45+
* [Enable profiling] with Parca
4246

4347
## Minor upgrades
4448
* [Perform a minor upgrade]
@@ -69,13 +73,17 @@ This section is for charm developers looking to support PostgreSQL integrations
6973
[GCE]: /t/15722
7074
[Azure]: /t/15733
7175
[Multi-AZ]: /t/15749
76+
[TLS VIP access]: /t/16576
77+
[Juju spaces]: /t/17416
7278
[Terraform]: /t/14916
7379
[Air-gapped]: /t/15746
74-
[TLS VIP access]: /t/16576
80+
[Juju storage]: /t/17529
81+
7582
[Integrate with another application]: /t/9687
7683
[External access]: /t/15802
7784
[Scale replicas]: /t/9689
7885
[Enable TLS]: /t/9685
86+
[Switchover/failover]: /t/17523
7987

8088
[Configure S3 AWS]: /t/9681
8189
[Configure S3 RadosGW]: /t/10313
@@ -87,7 +95,8 @@ This section is for charm developers looking to support PostgreSQL integrations
8795
[Enable monitoring]: /t/10600
8896
[Enable alert rules]: /t/13084
8997
[Enable tracing]: /t/14521
90-
98+
[Enable profiling]: /t/17172
99+
91100
[Perform a minor upgrade]: /t/12089
92101
[Perform a minor rollback]: /t/12090
93102

docs/how-to/h-async-set-up.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -62,7 +62,7 @@ juju run -m rome db1/leader create-replication
6262
To switchover and use `lisbon` as the primary instead, run
6363

6464
```shell
65-
juju run -m lisbon db2/leader promote-to-primary
65+
juju run -m lisbon db2/leader promote-to-primary scope=cluster
6666
```
6767

6868
## Scale a cluster

docs/how-to/h-deploy-juju-spaces.md

Lines changed: 65 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,65 @@
1+
# Deploy on Juju spaces
2+
3+
The Charmed PostgreSQL operator supports [Juju spaces](https://documentation.ubuntu.com/juju/latest/reference/space/index.html) to separate network traffic for:
4+
- **Client** - PostgreSQL instance to client data
5+
- **Instance-replication** - cluster instances replication data
6+
- **Cluster-replication** - cluster to cluster replication data
7+
- **Backup** - backup and restore data
8+
9+
## Prerequisites
10+
11+
* **Charmed PostgreSQL 16**
12+
* Configured network spaces
13+
* See [Juju | How to manage network spaces](https://documentation.ubuntu.com/juju/latest/reference/juju-cli/list-of-juju-cli-commands/add-space/)
14+
15+
## Deploy
16+
17+
On application deployment, constraints are required to ensure the unit(s) have address(es) on the specified network space(s), and endpoint binding(s) for the space(s).
18+
19+
For example, with spaces configured for instance replication and client traffic:
20+
```shell
21+
❯ juju spaces
22+
Name Space ID Subnets
23+
alpha 0 10.163.154.0/24
24+
client 1 10.0.0.0/24
25+
peers 2 10.10.10.0/24
26+
```
27+
28+
The space `alpha` is the default and cannot be removed. To deploy the Charmed PostgreSQL operator using the spaces:
29+
```shell
30+
juju deploy postgresql --channel 16/edge \
31+
--constraints spaces=client,peers \
32+
--bind "database-peers=peers database=client"
33+
```
34+
35+
[note type=caution]
36+
Currently, the `juju bind` command is not supported. Network space bindings must be defined at deploy time.
37+
[/note]
38+
39+
Consequently, a client application must use the `client` space on the model, or a space for the same subnet in another model, for example:
40+
```shell
41+
juju deploy client-app \
42+
--constraints spaces=client \
43+
--bind database=client
44+
```
45+
46+
The two applications can then be related using:
47+
```shell
48+
juju integrate postgresql:database client-app:database
49+
```
50+
51+
The client application will receive network endpoints on the `10.0.0.0/24` subnet.
52+
53+
The Charmed PostgreSQL operator endpoints are:
54+
55+
| Endpoint | Traffic |
56+
| ------------------------------ | -------------------- |
57+
| database | Client |
58+
| database-peers | Instance-replication |
59+
| replication-offer, replication | Cluster-replication |
60+
| s3-parameters | Backup |
61+
62+
63+
[note]
64+
If using a network space for the backup traffic, the user is responsible for ensuring that the target object storage URL traffic is routed via the specified network space.
65+
[/note]
