hugo/content/prometheus/_index.md (8 additions & 5 deletions)
```diff
@@ -39,7 +39,10 @@ Or you can also download [Prometheus](https://prometheus.io/) and [Alertmanager]
 ##### Minimum Versions
-pgMonitor assumes to be using at least Prometheus 2.9.x. We recommend to always use the latest minor version of Prometheus.
+pgMonitor has been tested with the following versions at a minimum. Later versions should generally work. If they do not, please open an issue on our GitHub.
+
+* Prometheus 2.49.1
+* Alertmanager 0.26.0

 ##### User and Configuration Directory Installation
```
```diff
@@ -118,10 +121,10 @@ The below files dictate how Prometheus and Alertmanager will behave at runtime f
-| /etc/prometheus/crunchy-prometheus.yml |Modify to set scrape interval if different from the default of 30s. Activate alert rules and Alertmanager by uncommenting lines when set as needed. Activate blackbox_exporter monitoring if desired. Service file provided by pgMonitor expects config file to be named "crunchy-prometheus.yml" |
-| /etc/prometheus/crunchy-alertmanager.yml | Setup alert target (e.g., SMTP, SMS, etc.), receiver and route information. Service file provided by pgMonitor expects config file to be named "crunchy-alertmanager.yml" |
-| /etc/prometheus/alert-ruled.d/crunchy-alert-rules-\*.yml.example | Update rules as needed and remove ".example" suffix. Prometheus config provided by pgmonitor expects ".yml" files to be located in "/etc/prometheus/alert-rules.d/" |
-| /etc/prometheus/auto.d/*.yml | You will need at least one file with a final ".yml" extension. Copy the example files to create as many additional targets as needed. Ensure the configuration files you want to use do not end in ".yml.example" but only with ".yml". Note that in order to use the provided Grafana dashboards, the extra "exp_type" label must be applied to all targets and be set appropriately (pg or node). Also, PostgreSQL targets make use of the "cluster_name" variable and should be given a relevant value so all systems (primary & replicas) can be related to each other when needed (Grafana dashboards, etc). See the example target files provided for how to set the labels for postgres or node exporter targets. |
+| /etc/prometheus/crunchy-prometheus.yml | Main configuration file for Prometheus, used to set things like scrape intervals and alerting. blackbox_exporter monitoring can also be enabled if desired. The service file provided by pgMonitor expects the config file to be named "crunchy-prometheus.yml". For full configuration options please see the [Prometheus upstream documentation](https://prometheus.io/docs/prometheus/latest/configuration/configuration/) |
+| /etc/prometheus/crunchy-alertmanager.yml | Set up the alert target (e.g., SMTP, SMS, etc.), receiver, and route information. The service file provided by pgMonitor expects the config file to be named "crunchy-alertmanager.yml". For full configuration options please see the [Alertmanager upstream documentation](https://prometheus.io/docs/alerting/latest/configuration/) |
+| /etc/prometheus/alert-rules.d/crunchy-alert-rules-\*.yml.example | Update rules as needed and remove the ".example" suffix. The Prometheus config provided by pgMonitor expects ".yml" files to be located in "/etc/prometheus/alert-rules.d/". Additional information on configuring alert rules can be found in the [alert rules upstream documentation](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/). |
+| /etc/prometheus/auto.d/*.yml | You will need at least one file with a final ".yml" extension. Copy the example files to create as many additional targets as needed. Ensure the configuration files you want to use end only in ".yml", not ".yml.example". Note that in order to use the provided Grafana dashboards, the extra "exp_type" label must be applied to all targets and set appropriately (pg, node, etcd, pgbouncer, etc.). Also, PostgreSQL targets make use of the "cluster_name" variable and should be given a relevant value so all systems (primary & replicas) can be related to each other when needed (Grafana dashboards, etc.). See the example target files provided for how to set the labels for postgres or node_exporter targets. |
```
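The label requirements in the last row can be illustrated with a minimal target file. This is a sketch only: the host names, ports, and cluster name below are placeholders, not values shipped by pgMonitor.

```yaml
# Hypothetical example file: /etc/prometheus/auto.d/db01.yml
# Host names, ports, and cluster_name are placeholder values.
- targets: ['db01.example.com:9187']
  labels:
    exp_type: 'pg'            # required by the Grafana dashboards
    cluster_name: 'prod-db'   # same value on the primary and its replicas
- targets: ['db01.example.com:9100']
  labels:
    exp_type: 'node'          # node_exporter target for the same host
```

Giving the primary and all of its replicas the same `cluster_name` is what lets the Grafana dashboards group the systems together.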
sql_exporter/common/crunchy_global_collector.yml (52 additions & 81 deletions)
```diff
@@ -362,32 +362,24 @@ queries:

   - query_name: ccp_archive_command_status
     query: |
-      SELECT CASE
-      WHEN EXTRACT(epoch from (last_failed_time - last_archived_time)) IS NULL THEN 0
-      WHEN EXTRACT(epoch from (last_failed_time - last_archived_time)) < 0 THEN 0
-      ELSE EXTRACT(epoch from (last_failed_time - last_archived_time))
-      END AS seconds_since_last_fail
-      , EXTRACT(epoch from (CURRENT_TIMESTAMP - last_archived_time)) AS seconds_since_last_archive
+      SELECT seconds_since_last_fail
+      , seconds_since_last_archive
       , archived_count
       , failed_count
-      FROM pg_catalog.pg_stat_archiver
+      FROM pgmonitor_ext.ccp_archive_command_status

   - query_name: ccp_connection_stats
     query: |
-      SELECT ((total - idle) - idle_in_txn) AS active
+      SELECT active
       , total
       , idle
       , idle_in_txn
-      , (select coalesce(extract(epoch from (max(clock_timestamp() - state_change))),0) from pg_catalog.pg_stat_activity where state = 'idle in transaction') AS max_idle_in_txn_time
-      , (select coalesce(extract(epoch from (max(clock_timestamp() - query_start))),0) from pg_catalog.pg_stat_activity where backend_type = 'client backend' AND state NOT LIKE 'idle%' ) AS max_query_time
-      , (select coalesce(extract(epoch from (max(clock_timestamp() - query_start))),0) from pg_catalog.pg_stat_activity where backend_type = 'client backend' and wait_event_type = 'Lock' ) AS max_blocked_query_time
+      , max_idle_in_txn_time
+      , max_query_time
+      , max_blocked_query_time
       , max_connections
-      FROM (
-      SELECT count(*) AS total
-      , COALESCE(SUM(CASE WHEN state = 'idle' THEN 1 ELSE 0 END),0) AS idle
-      , COALESCE(SUM(CASE WHEN state = 'idle in transaction' THEN 1 ELSE 0 END),0) AS idle_in_txn FROM pg_catalog.pg_stat_activity) x
-      JOIN (SELECT setting::float AS max_connections FROM pg_settings WHERE name = 'max_connections') xx ON (true)
+      FROM pgmonitor_ext.ccp_connection_stats

   - query_name: ccp_database_size
```
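The rewritten queries all select from views in a `pgmonitor_ext` schema instead of inlining SQL in the collector. As a sketch only, assuming the extension's views simply wrap the SQL removed above (the actual extension definition may differ), the connection-stats view would look roughly like:

```sql
-- Sketch: a pgmonitor_ext view wrapping SQL formerly inlined in the
-- collector (shown abbreviated; max_query_time and max_blocked_query_time
-- would be built the same way as max_idle_in_txn_time).
CREATE VIEW pgmonitor_ext.ccp_connection_stats AS
    SELECT ((total - idle) - idle_in_txn) AS active
         , total
         , idle
         , idle_in_txn
         , (SELECT coalesce(extract(epoch FROM max(clock_timestamp() - state_change)), 0)
              FROM pg_catalog.pg_stat_activity
             WHERE state = 'idle in transaction') AS max_idle_in_txn_time
         , max_connections
      FROM (SELECT count(*) AS total
                 , coalesce(sum(CASE WHEN state = 'idle' THEN 1 ELSE 0 END), 0) AS idle
                 , coalesce(sum(CASE WHEN state = 'idle in transaction' THEN 1 ELSE 0 END), 0) AS idle_in_txn
              FROM pg_catalog.pg_stat_activity) x
      JOIN (SELECT setting::float AS max_connections
              FROM pg_catalog.pg_settings
             WHERE name = 'max_connections') xx ON (true);
```

Moving the SQL behind extension-owned views keeps the collector file stable across PostgreSQL versions and lets catalog permissions be managed in one place.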
```diff
@@ -399,8 +391,8 @@ queries:

   - query_name: ccp_is_in_recovery
     query: |
-      SELECT CASE WHEN pg_is_in_recovery = true THEN 1 ELSE 2 END AS status
-      FROM pg_is_in_recovery()
+      SELECT status
+      FROM pgmonitor_ext.ccp_pg_is_in_recovery

   - query_name: ccp_locks
```
```diff
@@ -419,53 +411,48 @@ queries:
   - query_name: ccp_pg_settings_checksum
     query: |
       SELECT pgmonitor_ext.pg_settings_checksum() AS status

   - query_name: ccp_postgresql_version
     query: |
-      SELECT current_setting('server_version_num')::int AS current
+      SELECT current
+      FROM pgmonitor_ext.ccp_postgresql_version

   - query_name: ccp_postmaster_runtime
     query: |
-      SELECT extract('epoch' from pg_postmaster_start_time) as start_time_seconds from pg_catalog.pg_postmaster_start_time()
+      SELECT start_time_seconds
+      FROM pgmonitor_ext.ccp_postmaster_runtime

   - query_name: ccp_postmaster_uptime
     query: |
-      SELECT extract(epoch from (clock_timestamp() - pg_postmaster_start_time() )) AS seconds
+      SELECT seconds
+      FROM pgmonitor_ext.ccp_postmaster_uptime

   - query_name: ccp_replication_lag
     query: |
-      SELECT
-      CASE
-      WHEN (pg_last_wal_receive_lsn() = pg_last_wal_replay_lsn()) OR (pg_is_in_recovery() = false) THEN 0
-      ELSE EXTRACT (EPOCH FROM clock_timestamp() - pg_last_xact_replay_timestamp())::INTEGER
-      END
-      AS replay_time
-      , CASE
-      WHEN pg_is_in_recovery() = false THEN 0
-      ELSE EXTRACT (EPOCH FROM clock_timestamp() - pg_last_xact_replay_timestamp())::INTEGER
-      END
-      AS received_time
+      SELECT replay_time
+      , received_time
+      FROM pgmonitor_ext.ccp_replication_lag

   - query_name: ccp_replication_lag_size
     query: |
-      SELECT client_addr AS replica
-      , client_hostname AS replica_hostname
-      , client_port AS replica_port
-      , pg_wal_lsn_diff(sent_lsn, replay_lsn) AS bytes
-      FROM pg_catalog.pg_stat_replication
+      SELECT replica
+      , replica_hostname
+      , replica_port
+      , bytes
+      FROM pgmonitor_ext.ccp_replication_lag_size

   - query_name: ccp_replication_slots
     query: |
       SELECT slot_name
-      , active::int
-      , pg_wal_lsn_diff(CASE WHEN pg_is_in_recovery() THEN pg_last_wal_replay_lsn() ELSE pg_current_wal_insert_lsn() END, restart_lsn) AS retained_bytes
-      FROM pg_catalog.pg_replication_slots
+      , active
+      , retained_bytes
+      FROM pgmonitor_ext.ccp_replication_slots

   - query_name: ccp_sequence_exhaustion
```
```diff
@@ -475,7 +462,8 @@ queries:
   - query_name: ccp_settings_pending_restart
     query: |
-      SELECT count(*) AS count FROM pg_catalog.pg_settings WHERE pending_restart = true
+      SELECT count
+      FROM pgmonitor_ext.ccp_settings_pending_restart

   - query_name: ccp_stat_bgwriter
```
```diff
@@ -495,50 +483,33 @@ queries:
   - query_name: ccp_stat_database
     query: |
-      SELECT d.datname AS dbname
-      , s.xact_commit
-      , s.xact_rollback
-      , s.blks_read
-      , s.blks_hit
-      , s.tup_returned
-      , s.tup_fetched
-      , s.tup_inserted
-      , s.tup_updated
-      , s.tup_deleted
-      , s.conflicts
-      , s.temp_files
-      , s.temp_bytes
-      , s.deadlocks
-      FROM pg_catalog.pg_stat_database s
-      JOIN pg_catalog.pg_database d ON d.datname = s.datname
-      WHERE d.datistemplate = false
+      SELECT dbname
+      , xact_commit
+      , xact_rollback
+      , blks_read
+      , blks_hit
+      , tup_returned
+      , tup_fetched
+      , tup_inserted
+      , tup_updated
+      , tup_deleted
+      , conflicts
+      , temp_files
+      , temp_bytes
+      , deadlocks
+      FROM pgmonitor_ext.ccp_stat_database

   - query_name: ccp_transaction_wraparound
     query: |
-      WITH max_age AS (
-      SELECT 2000000000 as max_old_xid, setting AS autovacuum_freeze_max_age FROM pg_catalog.pg_settings WHERE name = 'autovacuum_freeze_max_age'
-      )
-      , per_database_stats AS (
-      SELECT datname
-      , m.max_old_xid::int
-      , m.autovacuum_freeze_max_age::int
-      , age(d.datfrozenxid) AS oldest_current_xid
-      FROM pg_catalog.pg_database d
-      JOIN max_age m ON (true) WHERE d.datallowconn
-      )
-      SELECT max(oldest_current_xid) AS oldest_current_xid
-      , max(ROUND(100*(oldest_current_xid/max_old_xid::float))) AS percent_towards_wraparound
-      , max(ROUND(100*(oldest_current_xid/autovacuum_freeze_max_age::float))) AS percent_towards_emergency_autovac
-      FROM per_database_stats
+      SELECT oldest_current_xid
+      , percent_towards_wraparound
+      , percent_towards_emergency_autovac
+      FROM pgmonitor_ext.ccp_transaction_wraparound

   - query_name: ccp_wal_activity
     query: |
       SELECT last_5_min_size_bytes
-      , (SELECT COALESCE(sum(size),0) FROM pg_catalog.pg_ls_waldir()) AS total_size_bytes
-      FROM (SELECT COALESCE(sum(size),0) AS last_5_min_size_bytes
-      FROM pg_catalog.pg_ls_waldir()
-      WHERE modification > CURRENT_TIMESTAMP - '5 minutes'::interval) x
     help: "Number of rows updated where the successor version goes onto a new heap page, leaving behind an original version with a t_ctid field that points to a different heap page. These are always non-HOT updates."
```