articles/postgresql/flexible-server/concepts-monitoring.md
Lines changed: 42 additions & 26 deletions
@@ -6,7 +6,7 @@ ms.author: varundhawan
 ms.service: postgresql
 ms.subservice: flexible-server
 ms.topic: conceptual
-ms.date: 9/5/2023
+ms.date: 9/15/2023
 ---

 # Monitor metrics on Azure Database for PostgreSQL - Flexible Server
@@ -22,7 +22,7 @@ Azure Database for PostgreSQL provides various metrics that give insight into th
 > [!NOTE]
 > While metrics are stored for 93 days, you can only query (in the Metrics tile) for a maximum of 30 days' worth of data on any single chart. If you see a blank chart or your chart displays only part of metric data, verify that the difference between start and end dates in the time picker doesn't exceed the 30-day interval. After you've selected a 30-day interval, you can pan the chart to view the full retention window.

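The charting limit in the preceding note can be illustrated with a short Python sketch. It isn't part of the Azure tooling. The function names and constants are hypothetical, and the only facts taken from the note are the 93-day retention and the 30-day maximum span per chart.

```python
from datetime import datetime, timedelta

RETENTION_DAYS = 93   # retention period stated in the note
MAX_CHART_DAYS = 30   # longest span a single chart can cover


def fits_single_chart(start: datetime, end: datetime) -> bool:
    """Return True if the start-to-end range can be shown on one chart."""
    return timedelta(0) <= end - start <= timedelta(days=MAX_CHART_DAYS)


def chart_windows(end: datetime) -> list[tuple[datetime, datetime]]:
    """Split the retention period ending at `end` into chart-sized windows, oldest first."""
    cursor = end - timedelta(days=RETENTION_DAYS)
    windows = []
    while cursor < end:
        stop = min(cursor + timedelta(days=MAX_CHART_DAYS), end)
        windows.append((cursor, stop))
        cursor = stop
    return windows
```

Under these assumptions, covering the whole retention period takes four consecutive windows (three of 30 days and one of 3 days), which corresponds to panning the chart after you select a 30-day interval.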
-### List of metrics
+### Default Metrics

 The following metrics are available for a flexible server instance of Azure Database for PostgreSQL:

@@ -51,18 +51,16 @@ The following metrics are available for a flexible server instance of Azure Data
 |**Write IOPS**|`write_iops`|Count |Number of data disk I/O write operations per second. |Yes |


-## Enhanced metrics
+### Enhanced metrics

-You can use enhanced metrics for Azure Database for PostgreSQL - Flexible Server to get fine-grained monitoring and alerting on databases. You can configure alerts on the metrics.
+You can use enhanced metrics for Azure Database for PostgreSQL - Flexible Server to get fine-grained monitoring and alerting on databases. You can configure alerts on the metrics. Some enhanced metrics include a `Dimension` parameter that you can use to split and filter metrics data by using a dimension like database name or state.

-Some enhanced metrics include a `Dimension` parameter that you can use to split and filter metrics data by using a dimension like database name or state.
-
-### Enable enhanced metrics
+#### Enabling enhanced metrics

 - Most of these new metrics are *disabled* by default. A few exceptions are described in the next table.
 - To enable these metrics, set the server parameter `metrics.collector_database_activity` to `ON`. This parameter is dynamic and doesn't require an instance restart.

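One way to set the server parameter named in the preceding list is the Azure CLI command `az postgres flexible-server parameter set`. The following Python sketch only builds and prints that command line and doesn't run it. The helper name, `example-rg`, and `example-server` are placeholders introduced for illustration, and the sketch assumes the Azure CLI is installed and signed in.

```python
import shlex


def parameter_set_command(resource_group: str, server_name: str,
                          parameter: str, value: str) -> list[str]:
    """Build the argument list for `az postgres flexible-server parameter set`."""
    return [
        "az", "postgres", "flexible-server", "parameter", "set",
        "--resource-group", resource_group,
        "--server-name", server_name,
        "--name", parameter,
        "--value", value,
    ]


# Placeholder names: substitute your own resource group and server.
command = parameter_set_command(
    "example-rg", "example-server",
    "metrics.collector_database_activity", "ON",
)
print(shlex.join(command))
```

Building the arguments as a list keeps quoting out of your hands if you later pass the list to `subprocess.run`. The same helper could be reused for other server parameters.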
-### List of enhanced metrics
+##### List of enhanced metrics

 You can choose from the following categories of enhanced metrics:

@@ -73,7 +71,7 @@ You can choose from the following categories of enhanced metrics:
@@ -85,7 +83,7 @@ You can choose from the following categories of enhanced metrics:
 |**Oldest xmin**|`oldest_backend_xmin`|Count|The actual value of the oldest `xmin`. If `xmin` isn't increasing, it indicates that there are some long-running transactions that can potentially hold dead tuples from being removed. |Doesn't apply|No|
 |**Oldest xmin Age**|`oldest_backend_xmin_age`|Count|Age in units of the oldest `xmin`. Indicates how many transactions passed since the oldest `xmin`. |Doesn't apply|No|

-#### Database
+##### Database

 |Display name |Metric ID |Unit |Description |Dimension |Default enabled|
 |**Max Physical Replication Lag**|`physical_replication_delay_in_bytes`|Bytes|Maximum lag across all asynchronous physical replication slots.|Doesn't apply|Yes |
 |**Read Replica Lag**|`physical_replication_delay_in_seconds`|Seconds|Read replica lag in seconds. |Doesn't apply|Yes |
 |**Disk Bandwidth Consumed Percentage**|`disk_bandwidth_consumed_percentage`|Percent|Percentage of data disk bandwidth consumed per minute.|Doesn't apply|Yes |
 |**Disk IOPS Consumed Percentage**|`disk_iops_consumed_percentage`|Percent|Percentage of data disk I/Os consumed per minute. |Doesn't apply|Yes |
 |**Max Connections** ^|`max_connections`|Count|Number of maximum connections. |Doesn't apply|Yes |

 ^ **Max Connections** represents the configured value for the `max_connections` server parameter. This metric is polled every 30 minutes.

-#### Considerations for using enhanced metrics
+##### Considerations for using enhanced metrics

 - Enhanced metrics that use the DatabaseName dimension have a *50-database* limit.
 - On the *Burstable* SKU, the limit is 10 databases for metrics that use the DatabaseName dimension.
 - The DatabaseName dimension limit is applied on the object identifier (OID) column, which reflects the order of creation for the database.
 - The DatabaseName in the metrics dimension is *case insensitive*. The metrics for database names that are the same except for case (for example, *contoso_database* and *Contoso_database*) will be merged and might not show accurate data.

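The considerations in the preceding list can be sketched in Python. This is an illustration, not the service's implementation. It assumes that applying the limit on the OID column means the databases with the lowest OIDs (the earliest created) are the ones covered. The function names and sample OIDs are hypothetical.

```python
from collections import defaultdict


def databases_with_metrics(databases, limit=50):
    """Return the names covered by the DatabaseName limit.

    `databases` is a list of (oid, name) pairs. Assumption: the limit keeps
    the databases with the lowest OIDs, that is, the earliest created ones.
    """
    ordered = sorted(databases, key=lambda db: db[0])
    return [name for _, name in ordered[:limit]]


def merged_names(names):
    """Group database names that differ only by case and would be merged."""
    groups = defaultdict(list)
    for name in names:
        groups[name.lower()].append(name)
    return {key: group for key, group in groups.items() if len(group) > 1}
```

For example, with a limit of 10 on the Burstable SKU, only the first 10 databases by OID would be returned, and `merged_names(["contoso_database", "Contoso_database"])` reports both names under one key.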
-## Autovacuum metrics
+### Autovacuum metrics

 Autovacuum metrics can be used to monitor and tune autovacuum performance for Azure Database for PostgreSQL - Flexible Server. Each metric is emitted at a *30-minute* interval and has up to *93 days* of retention. You can create alerts for specific metrics, and you can split and filter metrics data by using the DatabaseName dimension.

-### Enable autovacuum metrics
+#### How to enable autovacuum metrics

 - Autovacuum metrics are disabled by default.
 - To enable these metrics, set the server parameter `metrics.autovacuum_diagnostics` to `ON`.
 - This parameter is dynamic, so an instance restart isn't required.

-### List of autovacuum metrics
+#### List of autovacuum metrics

 |Display name |Metric ID |Unit |Description |Dimension |Default enabled|
@@ -168,23 +166,23 @@ Autovaccum metrics can be used to monitor and tune autovaccum performance for Az
 |**User Tables Vacuumed**|`tables_vacuumed_user_tables`|Count |Number of user-only tables that have been vacuumed in this database. |DatabaseName|No |
 |**Vacuum Counter User Tables**|`vacuum_count_user_tables`|Count |Number of times user-only tables have been manually vacuumed in this database (not counting `VACUUM FULL`).|DatabaseName|No |

-### Considerations for using autovacuum metrics
+#### Considerations for using autovacuum metrics

 - Autovacuum metrics that use the DatabaseName dimension have a *30-database* limit.
 - On the *Burstable* SKU, the limit is 10 databases for metrics that use the DatabaseName dimension.
 - The DatabaseName dimension limit is applied on the OID column, which reflects the order of creation for the database.

-## PgBouncer metrics
+### PgBouncer metrics

 You can use PgBouncer metrics to monitor the performance of the PgBouncer process, including details for active connections, idle connections, total pooled connections, and the number of connection pools. Each metric is emitted at a *30-minute* interval and has up to *93 days* of history. Customers can configure alerts on the metrics and also access the new metrics dimensions to split and filter metrics data by database name.

-### Enable PgBouncer metrics
+#### How to enable PgBouncer metrics

 - PgBouncer metrics are disabled by default.
 - For PgBouncer metrics to work, both the server parameters `pgbouncer.enabled` and `metrics.pgbouncer_diagnostics` must be enabled.
 - These parameters are dynamic and don't require an instance restart.
@@ -195,13 +193,13 @@ You can use PgBouncer metrics to monitor the performance of the PgBouncer proces
 |**Total pooled connections**|`total_pooled_connections`|Count|Current number of pooled connections. |DatabaseName|No |
 |**Number of connection pools**|`num_pools`|Count|Total number of connection pools. |DatabaseName|No |

-### Considerations for using the PgBouncer metrics
+#### Considerations for using the PgBouncer metrics

 - PgBouncer metrics that use the DatabaseName dimension have a *30-database* limit.
 - On the *Burstable* SKU, the limit is 10 databases for metrics that use the DatabaseName dimension.
 - The DatabaseName dimension limit is applied to the OID column, which reflects the order of creation for the database.

-## Database availability metric
+### Database availability metric

 Is-db-alive is a database server availability metric for Azure Postgres Flexible Server that returns `[1 for available]` and `[0 for not-available]`. The metric is emitted at a *1-minute* frequency and has up to *93 days* of retention. Customers can configure alerts on the metric.

@@ -215,7 +213,7 @@ Is-db-alive is an database server availability metric for Azure Postgres Flexibl
 - Customers have the option to further aggregate these metrics with any desired frequency (5m, 10m, 30m, and so on) to suit their alerting requirements and avoid false positives.
 - Other possible aggregations are `AVG()` and `MIN()`.

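The effect of re-aggregating the 1-minute availability samples can be sketched as follows. This is a minimal local illustration, not an Azure Monitor query. The function name and the sample data are hypothetical, and only the `AVG()` and `MIN()` aggregations named in the preceding list are modeled.

```python
from statistics import mean


def aggregate(samples, window, how):
    """Aggregate 1-minute samples (1 = available, 0 = not available)
    into consecutive windows of `window` minutes."""
    funcs = {"MIN": min, "AVG": mean}
    reduce_fn = funcs[how]
    return [reduce_fn(samples[i:i + window])
            for i in range(0, len(samples), window)]


# Ten minutes of samples with a single one-minute outage.
samples = [1, 1, 0, 1, 1, 1, 1, 1, 1, 1]
print(aggregate(samples, 5, "MIN"))
print(aggregate(samples, 5, "AVG"))
```

With a 5-minute window, `MIN` flags any window that contains an outage, whereas `AVG` reports the fraction of the window in which the server was available. An alert on `AVG` below a chosen threshold is therefore less sensitive to a single missed sample.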
-## Filter and split on dimension metrics
+### Filter and split on dimension metrics

 In the preceding tables, some metrics have dimensions like DatabaseName or State. You can use [filtering](../../azure-monitor/essentials/metrics-charts.md#filters) and [splitting](../../azure-monitor/essentials/metrics-charts.md#apply-splitting) for the metrics that have dimensions. These features show how various metric segments (or *dimension values*) affect the overall value of the metric. You can use them to identify possible outliers.

@@ -228,10 +226,28 @@ The following example demonstrates splitting by the State dimension and filterin

 For more information about setting up charts for dimensional metrics, see [Metric chart examples](../../azure-monitor/essentials/metric-chart-samples.md).

-## Server logs
+### Metrics visualization
+
+There are several options to visualize Azure Monitor metrics:
+
+|Component |Description | Required training and/or configuration|
+|---------|---------|--------|
+|Overview page|Most Azure services have an **Overview** page in the Azure portal that includes a **Monitor** section with charts that show recent critical metrics. This information is intended for owners of individual services to quickly assess the performance of the resource. |This page is based on platform metrics that are collected automatically. No configuration is required. |
+|[Metrics Explorer](../../azure-monitor/essentials/metrics-getting-started.md)|You can use Metrics Explorer to interactively work with metric data and create metric alerts. You need minimal training to use Metrics Explorer, but you must be familiar with the metrics you want to analyze. |- Once data collection is configured, no other configuration is required.<br>- Platform metrics for Azure resources are automatically available.<br>- Guest metrics for virtual machines are available after an Azure Monitor agent is deployed to the virtual machine.<br>- Application metrics are available after Application Insights is configured. |
+|[Grafana](https://grafana.com/grafana/dashboards/19556-azure-azure-postgresql-flexible-server-monitoring/)| You can use Grafana for visualizing and alerting on metrics. All versions of Grafana include the [Azure Monitor datasource plug-in](../../azure-monitor/visualize/grafana-plugin.md) to visualize your Azure Monitor metrics and logs. | Some training is required for you to become familiar with Grafana dashboards, although you can download the prebuilt [Azure PostgreSQL Grafana monitoring dashboard](https://grafana.com/grafana/dashboards/19556-azure-azure-postgresql-flexible-server-monitoring/) to easily monitor all Azure PostgreSQL servers in your organization. |
+
+
+## Logs

 In addition to the metrics, you can use Azure Database for PostgreSQL to configure and access Azure Database for PostgreSQL standard logs. For more information, see [Logging concepts](concepts-logging.md).

+### Logs visualization
+
+|Component |Description | Required training and/or configuration|
+|---------|---------|--------|
+|[Log Analytics](../../azure-monitor/logs/log-analytics-overview.md)|With Log Analytics, you can create log queries to interactively work with log data and create log query alerts.| Some training is required for you to become familiar with the query language, although you can use prebuilt queries for common requirements. |
+
+
 ## Next steps

 - Learn more about how to [configure and access logs](howto-configure-and-access-logs.md).
articles/postgresql/flexible-server/concepts-read-replicas.md
Lines changed: 2 additions & 2 deletions
@@ -56,7 +56,7 @@ When you start the create replica workflow, a blank Azure Database for PostgreSQ

 In Azure Database for PostgreSQL - Flexible Server, the create operation of replicas is considered successful only when the entire backup of the primary instance has been copied to the replica destination and the transaction logs have been synchronized up to a maximum lag of 1 GB.

-To ensure the success of the create operation, it's recommended to avoid creating replicas during periods of high transactional load. For example, it's best to avoid creating replicas during migrations from other sources to Azure Database for PostgreSQL - Flexible Server, or during excessive bulk load operations. If you are currently in the process of performing a migration or bulk load operation, it's recommended that you wait until the operation has completed before proceeding with the creation of replicas. Once the migration or bulk load operation has finished, check whether the transaction log size has returned to its normal size. Typically, the transaction log size should be close to the value defined in the max_wal_size server parameter for your instance. You can track the transaction log storage footprint using the [Transaction Log Storage Used](concepts-monitoring.md#list-of-metrics) metric, which provides insights into the amount of storage used by the transaction log. By monitoring this metric, you can ensure that the transaction log size is within the expected range and that the replica creation process might be started.
+To ensure the success of the create operation, it's recommended to avoid creating replicas during periods of high transactional load. For example, it's best to avoid creating replicas during migrations from other sources to Azure Database for PostgreSQL - Flexible Server, or during excessive bulk load operations. If you are currently in the process of performing a migration or bulk load operation, it's recommended that you wait until the operation has completed before proceeding with the creation of replicas. Once the migration or bulk load operation has finished, check whether the transaction log size has returned to its normal size. Typically, the transaction log size should be close to the value defined in the max_wal_size server parameter for your instance. You can track the transaction log storage footprint using the [Transaction Log Storage Used](concepts-monitoring.md#default-metrics) metric, which provides insights into the amount of storage used by the transaction log. By monitoring this metric, you can ensure that the transaction log size is within the expected range and that the replica creation process might be started.

 > [!IMPORTANT]
 > Read Replicas are currently supported for the General Purpose and Memory Optimized server compute tiers. The Burstable server compute tier isn't supported.
@@ -86,7 +86,7 @@ At the prompt, enter the password for the user account.
 The read replica feature in Azure Database for PostgreSQL - Flexible Server relies on the replication slots mechanism. The main advantage of replication slots is the ability to automatically adjust the number of transaction logs (WAL segments) needed by all replica servers and therefore avoid situations in which one or more replicas go out of sync because WAL segments that weren't yet sent to the replicas are removed on the primary. The disadvantage of this approach is the risk of running out of space on the primary if a replication slot remains inactive for a long period of time. In such situations, the primary accumulates WAL files, causing incremental growth of storage usage. When the storage usage reaches 95% or if the available capacity is less than 5 GiB, the server is automatically switched to read-only mode to avoid errors associated with disk-full situations.
 Therefore, monitoring the replication lag and replication slots status is crucial for read replicas.

-We recommend setting alert rules for storage used or storage percentage, as well as for replication lags, when they exceed certain thresholds so that you can proactively act, increase the storage size and delete lagging read replicas. For example, you can set an alert if the storage percentage exceeds 80% usage, as well on the replica lag being higher than 1h. The [Transaction Log Storage Used](concepts-monitoring.md#list-of-metrics) metric will show you if the WAL files accumulation is the main reason of the excessive storage usage.
+We recommend setting alert rules for storage used or storage percentage, as well as for replication lags, when they exceed certain thresholds so that you can proactively act, increase the storage size and delete lagging read replicas. For example, you can set an alert if the storage percentage exceeds 80% usage, as well on the replica lag being higher than 1h. The [Transaction Log Storage Used](concepts-monitoring.md#default-metrics) metric will show you if the WAL files accumulation is the main reason of the excessive storage usage.

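The thresholds mentioned in the preceding paragraphs can be combined into a small decision sketch. This is an illustration of the stated numbers only (read-only at 95% usage or less than 5 GiB free, example alerts at 80% storage and 1 hour of lag). The function names are hypothetical, and the sketch isn't how Azure Monitor alert rules are defined.

```python
GIB = 1024 ** 3


def storage_state(used_bytes, total_bytes, alert_percent=80.0):
    """Classify primary storage usage using the thresholds from the text."""
    used_percent = 100.0 * used_bytes / total_bytes
    free_bytes = total_bytes - used_bytes
    # The server switches to read-only at 95% usage or below 5 GiB free.
    if used_percent >= 95.0 or free_bytes < 5 * GIB:
        return "read-only"
    # Example alert threshold from the text: storage percentage above 80%.
    if used_percent > alert_percent:
        return "alert"
    return "ok"


def lag_alert(lag_seconds, threshold_seconds=3600):
    """Return True when replica lag exceeds the example 1-hour threshold."""
    return lag_seconds > threshold_seconds
```

On a small disk, the 5 GiB rule can apply before the 95% rule. For example, a 32 GiB disk with 29 GiB used is at about 91% usage but has only 3 GiB free, so this sketch reports `read-only`.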
 Azure Database for PostgreSQL - Flexible Server provides [two metrics](concepts-monitoring.md#replication) for monitoring replication. The two metrics are **Max Physical Replication Lag** and **Read Replica Lag**. To learn how to view these metrics, see the **Monitor a replica** section of the [read replica how-to article](how-to-read-replicas-portal.md#monitor-a-replica).