Commit 6665503

[*] rewrite "Preparing databases" tutorial, fixes #606
1 parent 98e5671 commit 6665503

1 file changed

+120
-149
lines changed

docs/tutorial/preparing_databases.md

Lines changed: 120 additions & 149 deletions
@@ -4,39 +4,37 @@ title: Preparing databases for monitoring
44

55
## Effects of monitoring
66

7-
- Although the "Observer effect" applies also for pgwatch, no
7+
- Although the "Observer effect" applies also for pgwatch, no
88
noticeable impact for the monitored DB is expected when using
99
*Preset configs* settings, and given that there is some normal load
1010
on the server anyway and the DB doesn't have thousands of tables.
1111
For some metrics, though, it can happen that the metric reading query
12-
(notably "stat_statements" and "table_stats") takes some tens of
12+
(notably `stat_statements` and `table_stats`) takes some tens of
1313
milliseconds, which might be more than an average application query.
14-
- At any time maximally 2 metric fetching queries can run in parallel
15-
on any monitored DBs. This can be changed by recompiling
16-
(MAX_PG_CONNECTIONS_PER_MONITORED_DB variable) the gatherer.
17-
- Default Postgres [statement
14+
- Default Postgres [statement
1815
timeout](https://www.postgresql.org/docs/current/runtime-config-client.html#GUC-STATEMENT-TIMEOUT)
1916
is *5s* for entries inserted via the Web UI / database directly.
2017

2118
## Basic preparations
2219

23-
As a base requirement you'll need a **login user** (non-superuser
20+
As a base requirement you'll need a **login user** (`pg_monitor`
2421
suggested) for connecting to your server and fetching metrics.
2522

2623
Theoretically you can use any username you like, but if not using
2724
"pgwatch" you need to adjust the "helper" creation SQL scripts (see
2825
below for explanation) accordingly, as in those by default the
2926
"pgwatch" will be granted execute privileges.
3027

31-
``` sql
28+
```sql
3229
CREATE ROLE pgwatch WITH LOGIN PASSWORD 'secret';
3330
-- For critical databases it might make sense to ensure that the user account
3431
-- used for monitoring can only open a limited number of connections
3532
-- (there are according checks in code, but multiple instances might be launched)
36-
ALTER ROLE pgwatch CONNECTION LIMIT 3;
33+
ALTER ROLE pgwatch CONNECTION LIMIT 5;
3734
GRANT pg_monitor TO pgwatch;
3835
GRANT CONNECT ON DATABASE mydb TO pgwatch;
3936
GRANT EXECUTE ON FUNCTION pg_stat_file(text) to pgwatch; -- for wal_size metric
37+
GRANT EXECUTE ON FUNCTION pg_stat_file(text, boolean) TO pgwatch;
4038
```
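
To confirm that the role ended up with the intended settings, a quick check along these lines may help. This is a minimal sketch that assumes the `pgwatch` role and `mydb` database names from the example above and is run as a superuser; it only reads catalog data.

```sql
-- Role attributes and memberships for the monitoring user
SELECT r.rolname,
       r.rolcanlogin,
       r.rolconnlimit,                                       -- the CONNECTION LIMIT set above
       pg_has_role('pgwatch', 'pg_monitor', 'member')        AS in_pg_monitor,
       has_database_privilege('pgwatch', 'mydb', 'CONNECT')  AS can_connect
FROM pg_roles r
WHERE r.rolname = 'pgwatch';

-- Sessions currently held by the monitoring role, to compare against the limit
SELECT count(*) AS open_connections
FROM pg_stat_activity
WHERE usename = 'pgwatch';
```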
4139

4240
For most monitored databases it's extremely beneficial (for
@@ -50,108 +48,108 @@ troubleshooting benefits also the
5048
[track_io_timing](https://www.postgresql.org/docs/current/static/runtime-config-statistics.html#GUC-TRACK-IO-TIMING)
5149
setting should be enabled.
5250

53-
1. Make sure the Postgres *contrib* package is installed (should be
51+
1. Make sure the Postgres *contrib* package is installed (should be
5452
installed automatically together with the Postgres server package on
5553
Debian based systems).
5654

57-
- On RedHat / Centos: `yum install -y postgresqlXY-contrib`
58-
- On Debian / Ubuntu: `apt install postgresql-contrib`
55+
- On RedHat / Centos: `yum install -y postgresqlXY-contrib`
56+
- On Debian / Ubuntu: `apt install postgresql-contrib`
5957

60-
1. Add `pg_stat_statements` to your server config (postgresql.conf) and
58+
1. Add `pg_stat_statements` to your server config (postgresql.conf) and
6159
restart the server.
6260

63-
shared_preload_libraries = 'pg_stat_statements'
64-
track_io_timing = on
61+
```ini
62+
shared_preload_libraries = 'pg_stat_statements'
63+
track_io_timing = on
64+
```
6565

66-
1. After restarting activate the extension in the monitored DB. Assumes
66+
1. After restarting activate the extension in the monitored DB. Assumes
6767
Postgres superuser.
6868

69-
psql -c "CREATE EXTENSION IF NOT EXISTS pg_stat_statements"
69+
```terminal
70+
psql -c "CREATE EXTENSION IF NOT EXISTS pg_stat_statements"
71+
```
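
After the restart and `CREATE EXTENSION`, the setup can be sanity-checked from the same database. The following is a sketch using only standard Postgres commands and catalogs; run it in the monitored DB.

```sql
-- Both settings should reflect the postgresql.conf changes
SHOW shared_preload_libraries;
SHOW track_io_timing;

-- The extension should be registered in this database
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'pg_stat_statements';

-- Fails with an error if the library was not preloaded at server start
SELECT count(*) AS tracked_statements
FROM pg_stat_statements;
```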
7072

71-
## Rolling out helper functions
73+
## Metrics initialization
74+
75+
Some **rare** metrics are not runnable out-of-the-box on Postgres and
76+
need some installed helper functions, extensions or database objects
77+
before they can be used.
78+
For example, it is impossible to obtain the CPU usage statistics
79+
with a regular SQL query. But it is possible to get this system
80+
information with some untrusted procedural language like PL/Python.
81+
82+
That's why some metrics have a special init section in their definitions.
83+
Some init sections might contain `CREATE FUNCTION` statements that
84+
create helper functions in the monitored database. Some might contain
85+
`CREATE EXTENSION` or other preparation steps.
86+
87+
To examine the init section of a metric, you can use the following
88+
command:
89+
90+
```terminal
91+
pgwatch metric print-init <metric or preset name> ...
92+
```
93+
94+
You may put multiple metric or preset names in the command line. The
95+
output will contain the concatenated init sections of the specified
96+
metrics or presets.
97+
98+
For example, to check the init section of the `cpu_load` metric:
99+
100+
```terminal
101+
$ pgwatch metric print-init cpu_load
102+
-- cpu_load
103+
BEGIN;
104+
CREATE EXTENSION IF NOT EXISTS plpython3u;
105+
CREATE OR REPLACE FUNCTION get_load_average(OUT load_1min float, OUT load_5min float, OUT load_15min float) AS
106+
$$
107+
from os import getloadavg
108+
la = getloadavg()
109+
return [la[0], la[1], la[2]]
110+
$$ LANGUAGE plpython3u VOLATILE;
111+
GRANT EXECUTE ON FUNCTION get_load_average() TO pgwatch;
112+
COMMENT ON FUNCTION get_load_average() is 'created for pgwatch';
113+
COMMIT;
114+
```
72115
73116
Helper functions in pgwatch context are standard Postgres stored
74117
procedures, running under `SECURITY DEFINER` privileges. Via such
75-
wrapper functions one can do **controlled privilege escalation** - i.e.
76-
to give access to protected Postgres metrics (like active session
77-
details, "per query" statistics) or even OS-level metrics, to normal
78-
unprivileged users, like the pgwatch monitoring role.
79-
80-
If using a superuser login (recommended only for local "push" setups)
81-
you have full access to all Postgres metrics and would need *helpers*
82-
only for OS remote statistics. For local (push) setups as of pgwatch
83-
version 1.8.4 the most typical OS metrics are covered by the
84-
`--direct-os-stats` flag, explained below.
85-
86-
For unprivileged monitoring users it is highly recommended to take these
87-
additional steps on the "to be monitored" database to get maximum
88-
value out of pgwatch in the long run. Without these additional steps,
89-
you lose though about 10-15% of built-in metrics, which might not be too
90-
tragical nevertheless. For that use case there's also a *preset config*
91-
named "unprivileged".
92-
93-
When monitoring v10+ servers then the built-in **pg_monitor** system
94-
role is recommended for the monitoring user, which almost substitutes
95-
superuser privileges for monitoring purposes in a safe way.
96-
97-
### Rolling out common helpers
98-
For completely unprivileged monitoring users the following *helpers* are
99-
recommended to make good use of the default "exhaustive" *Preset
100-
Config*:
101-
102-
export PGUSER=superuser
103-
psql -f /etc/pgwatch/metrics/00_helpers/get_stat_activity/$pgver/metric.sql mydb
104-
psql -f /etc/pgwatch/metrics/00_helpers/get_stat_replication/$pgver/metric.sql mydb
105-
psql -f /etc/pgwatch/metrics/00_helpers/get_wal_size/$pgver/metric.sql mydb
106-
psql -f /etc/pgwatch/metrics/00_helpers/get_stat_statements/$pgver/metric.sql mydb
107-
psql -f /etc/pgwatch/metrics/00_helpers/get_sequences/$pgver/metric.sql mydb
108-
109-
Note that there might not be an exact Postgres version match for helper
110-
definitions - then replace *\$pgver* with the previous available version
111-
number below your server's Postgres version number.
112-
113-
Also note that as of v1.8.1 some helpers definition SQLs scripts (like
114-
for "get_stat_statements") will inspect also the "search_path" and
115-
by default **will not install into schemas that have PUBLIC CREATE
116-
privileges**, like the "public" schema by default has!
117-
118-
Also when rolling out helpers make sure the `search_path` is
118+
wrapper functions one can do **controlled privilege escalation**, i.e.
119+
to give access to OS-level metrics.
120+
121+
Since pgwatch operates with a "least privilege" principle, it shouldn't
122+
automatically create needed helper functions on the monitored database.
123+
124+
So to create the helper functions, you need to execute init commands under
125+
the appropriate account, usually a superuser account. The easiest way to do it
126+
is to pipe the output of the `pgwatch metric print-init` command to the
127+
`psql` command:
128+
129+
```terminal
130+
export PGUSER=superuser
131+
pgwatch metric print-init cpu_load psutil_mem psutil_disk | psql -d mydb
132+
```
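
Once the init SQL has been applied, it may be worth checking that the helper exists and that the monitoring role can actually call it. This sketch assumes the `get_load_average()` function from the `cpu_load` init section shown earlier and a monitoring role named `pgwatch`; run it as a superuser in the monitored DB.

```sql
-- Does the helper exist, and may pgwatch execute it?
SELECT p.oid::regprocedure                                   AS helper,
       p.prosecdef                                           AS security_definer,
       has_function_privilege('pgwatch', p.oid, 'EXECUTE')   AS pgwatch_can_execute
FROM pg_proc p
WHERE p.proname = 'get_load_average';

-- Call it the way the gatherer would, under the monitoring role
SET ROLE pgwatch;
SELECT * FROM get_load_average();
RESET ROLE;
```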
133+
134+
!!! Info
135+
Here in all examples we assume that we are using the built-in metrics.
136+
But you can also use your own custom metrics. In this case, you need to
137+
provide the appropriate command-line options, e.g.
138+
```terminal
139+
pgwatch --metrics=/path/to/your/metrics.yaml metric print-init ...
140+
```
141+
142+
Also, when initializing metrics, make sure the `search_path` is
119143
at defaults or set so that it's also accessible for the monitoring role
120144
as currently neither helpers nor metric definition SQLs assume
121145
any particular schema and depend on the `search_path`
122146
including everything needed.
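
One way to see what the monitoring role's `search_path` resolves to is sketched below. It assumes the `pgwatch` role, the `mydb` database, and the `get_load_average()` helper from the earlier examples; the `ALTER ROLE` statement is optional and the schema list in it is only an illustrative choice.

```sql
-- Inspect name resolution as the monitoring role
SET ROLE pgwatch;
SHOW search_path;
SELECT to_regprocedure('get_load_average()') AS resolved_helper;  -- NULL if not found on the path
RESET ROLE;

-- Optional: pin the search_path for this role in this database
-- (hypothetical schema list; adjust to where the helpers were created)
ALTER ROLE pgwatch IN DATABASE mydb SET search_path = "$user", public;
```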
123147

124-
For more detailed statistics (OS monitoring, table bloat, WAL size, etc.)
125-
it is recommended to install also all other helpers found from the
126-
`/etc/pgwatch/metrics/00_helpers` folder or do it
127-
automatically by using the *rollout_helper.py* script found in the
128-
*00_helpers* folder.
129-
130-
As of v1.6.0 though helpers are not needed for Postgres-native metrics
131-
(e.g. WAL size) if a privileged user (superuser or *pg_monitor* GRANT)
132-
is used, as pgwatch now supports having 2 SQL definitions for each
133-
metric - "normal / unprivileged" and "privileged" / "superuser".
134-
In the file system */etc/pgwatch/metrics* such "privileged" access
135-
definitions will have a "_su" added to the file name.
136-
137-
## Automatic rollout of helpers
138-
139-
pgwatch can roll out *helpers* also automatically on the monitored DB.
140-
This requires superuser privileges and a configuration attribute for the
141-
monitored DB. In YAML config mode it's called *is_superuser*, in Config
142-
DB *md_is_superuser*, in the Web UI one can tick the "Auto-create
143-
helpers" checkbox.
144-
145-
After the automatic rollout it's still generally recommended to remove
146-
the superuser privileges from the monitoring role, which now should have
147-
GRANTs to all automatically created helper functions. Note though that
148-
all created helpers will not be immediately usable as some are for
149-
special purposes and need additional dependencies.
150-
151-
A hint: if it can be foreseen that a lot of databases will be created on
152-
some instance (generally not a good idea though) it might be a good idea
153-
to roll out the helpers directly in the *template1* database - so that
154-
all newly created databases will get them automatically.
148+
!!! hint
149+
If it can be foreseen that a lot of databases will be created on
150+
some instance it might be a good idea
151+
to roll out the helpers directly in the *template1* database, so that
152+
all newly created databases will get them automatically.
155153

156154
## PL/Python helpers
157155

@@ -166,73 +164,52 @@ this data is stored together with Postgres-native metrics for easier
166164
graphing / correlation / alerting. This also makes it possible to be totally
167165
independent of any System Monitoring tools like Zabbix, etc., with the
168166
downside that everything is gathered over Postgres connections so that
169-
when Postgres is down no OS metrics can be gathered also. Since v1.8.4
170-
though the latter problem can be reduced for local "push" based setups
171-
via the `--direct-os-stats` option plus according metrics
172-
configuration (e.g. the "full" preset).
167+
when Postgres is down no OS metrics can be gathered either.
173168

174169
Note though that PL/Python is usually disabled by DB-as-a-service
175170
providers like AWS RDS for security reasons.
176171

177-
# first install the Python bindings for Postgres
178-
apt install postgresql-plpython3-XY
179-
# yum install postgresqlXY-plpython3
180-
181-
psql -c "CREATE EXTENSION plpython3u"
182-
psql -f /etc/pgwatch/metrics/00_helpers/get_load_average/9.1/metric.sql mydb
172+
```terminal
173+
# first install the Python bindings for Postgres
174+
apt install postgresql-plpython3-XY
175+
# yum install postgresqlXY-plpython3
183176
184-
# psutil helpers are only needed when full set of common OS metrics is wanted
185-
apt install python3-psutil
186-
psql -f /etc/pgwatch/metrics/00_helpers/get_psutil_cpu/9.1/metric.sql mydb
187-
psql -f /etc/pgwatch/metrics/00_helpers/get_psutil_mem/9.1/metric.sql mydb
188-
psql -f /etc/pgwatch/metrics/00_helpers/get_psutil_disk/9.1/metric.sql mydb
189-
psql -f /etc/pgwatch/metrics/00_helpers/get_psutil_disk_io_total/9.1/metric.sql mydb
177+
pgwatch metric print-init cpu_load | psql -d mydb
190178
191-
Note that we're assuming here that we're on a modern Linux system with
192-
Python 3 as default. For older systems Python 3 might not be an option
193-
though, so you need to change *plpython3u* to *plpythonu* and also do
194-
the same replace inside the code of the actual helper functions! Here
195-
the *rollout_helper.py* script with it's `--python2` flag can be
196-
helpful again.
179+
# psutil helpers are only needed when full set of common OS metrics is wanted
180+
apt install python3-psutil
181+
pgwatch metric print-init psutil_cpu psutil_mem psutil_disk psutil_disk_io_total | psql -d mydb
182+
```
197183

198184
## Notice on using metric fetching helpers
199185

200-
- Starting from Postgres v10 helpers are mostly not needed (only for
201-
PL/Python ones getting OS statistics) - there are available some
202-
special monitoring roles like `pg_monitor`, that are exactly meant
203-
to be used for such cases where we want to give access to all
204-
Statistics Collector views without any other "superuser
205-
behaviour". See
206-
[here](https://www.postgresql.org/docs/current/default-roles.html)
207-
for documentation on such special system roles. Note that currently
208-
most out-of-the-box metrics first rely on the helpers as v10 is
209-
relatively new still, and only when fetching fails, direct access
210-
with the "Privileged SQL" is tried.
211-
- For gathering OS statistics (CPU, IO, disk) there are helpers and
186+
- Helpers are mostly needed only for PL/Python metrics getting OS statistics.
187+
- For gathering OS statistics (CPU, IO, disk) there are helpers and
212188
metrics provided, based on the "psutil" Python package... but from
213189
user reports it seems the package behaviour differs slightly
214190
based on the Linux distro / Kernel version used, so small
215191
adjustments might be needed there (e.g. to remove a non-existent
216192
column). Minimum usable Kernel version required is 3.3.
217-
- When running the gatherer locally, i.e. having a "push" based
218-
configuration, the metric fetching helpers are mostly not needed
219-
as superuser can be used in a safe way and starting from v1.8.4 one
220-
can also enable the `--direct-os-stats` parameter to signal that
221-
we can fetch the data for the default `psutil*` metrics
193+
- When running the gatherer locally one can enable the `--direct-os-stats`
194+
parameter to signal that we can fetch the data for the default `psutil*` metrics
222195
directly from OS counters. If direct OS fetching fails though, the
223196
fallback is still to try via PL/Python wrappers.
224-
- In rare cases when some "helpers" have been installed, and when
197+
- In rare cases when some "helpers" have been installed, and when
225198
doing a binary PostgreSQL upgrade at some later point in time via
226199
`pg_upgrade`, this could result in error messages
227200
thrown. Then just drop those failing helpers on the "to be
228201
upgraded" cluster and re-create them after the upgrade process.
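
For the `pg_upgrade` case, the installed helpers can be located before dropping them. The sketch below relies on the `'created for pgwatch'` comment seen in the `cpu_load` init section; whether every helper carries the same comment is an assumption, so review the list before dropping anything.

```sql
-- List functions tagged with the pgwatch comment in the current database
SELECT p.oid::regprocedure AS helper
FROM pg_proc p
WHERE obj_description(p.oid, 'pg_proc') = 'created for pgwatch';

-- Then drop the ones that fail during the upgrade check, for example:
-- DROP FUNCTION IF EXISTS get_load_average();
```

After the upgrade, the helpers can be re-created by piping `pgwatch metric print-init` output to `psql` again.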
229202

203+
!!! Info
204+
    If, despite all the warnings, you still want to run pgwatch
205+
    with a sufficiently privileged user account (e.g. a superuser), you can also
206+
use the `--create-helpers` parameter to automatically create all
207+
needed helper functions in the monitored databases.
208+
230209
## Running with developer credentials
231210

232-
As mentioned above, helper / wrapper functions are not strictly needed,
233-
they just provide a bit more information for unprivileged users - thus
234-
for developers with no means to install any wrappers as superuser, it's
235-
also possible to benefit from pgwatch - for such use cases e.g. the
211+
For developers with no means to install any wrappers as superuser, it's
212+
also possible to benefit from pgwatch. For such use cases the
236213
"unprivileged" preset metrics profile and the according "DB overview
237214
Unprivileged / Developer"
238215
![dashboard](../gallery/overview_developer.png)
@@ -247,32 +224,26 @@ selected. Following types are available:
247224

248225
### *postgres*
249226

250-
Monitor a single database on a single Postgres instance. When using
251-
the Web UI and the "DB name" field is left empty, there's as a
252-
one time operation where all non-template DB names are fetched,
253-
prefixed with "Unique name" field value and added to monitoring
254-
(if not already monitored). Internally monitoring always happens
255-
"per DB" not "per cluster" though.
227+
Monitor a single database on a single Postgres instance. Internally monitoring
228+
always happens "per DB" not "per cluster" though.
256229

257-
### *postgres-continuous-discovery*
230+
### *postgres-continuous-discovery*
258231

259232
Monitor a whole Postgres cluster / instance (or a subset of its DBs).
260-
Host information without a DB name needs to be specified and then
233+
A connection string needs to be specified and then
261234
the pgwatch daemon will periodically scan the cluster and add any
262235
found and not yet monitored DBs to monitoring. In this mode it's
263236
also possible to specify regular expressions to include/exclude some
264237
database names.
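
To preview which databases an include/exclude pattern pair would pick up, a query like the following can be run on the cluster. It is only an approximation: the two patterns are hypothetical, and pgwatch's own discovery logic and regex dialect may differ from Postgres's `~` operator.

```sql
-- Candidate databases for continuous discovery
SELECT datname
FROM pg_database
WHERE NOT datistemplate
  AND datallowconn
  AND datname ~  '^app_'      -- hypothetical include pattern
  AND datname !~ '_test$'     -- hypothetical exclude pattern
ORDER BY datname;
```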
265238

266239
### *pgbouncer*
267240

268-
Use to track metrics from PgBouncer's `SHOW STATS` command. In
269-
place of the Postgres "DB name" the name of the PgBouncer "pool"
270-
to be monitored must be inserted.
241+
Use to track metrics from PgBouncer's `SHOW STATS` command.
271242

272243
### *pgpool*
273244

274245
Use to track joint metrics from Pgpool2's `SHOW POOL_NODES` and
275-
`SHOW POOL_PROCESSES` commands. Pgpool2 from version 3.0 is supported.
246+
`SHOW POOL_PROCESSES` commands.
276247

277248
### *patroni*
278249
