articles/postgresql/flexible-server/how-to-autovacuum-tuning.md (+31/-33 lines)
@@ -6,21 +6,23 @@ author: sarat0681
 ms.service: postgresql
 ms.subservice: flexible-server
 ms.topic: conceptual
-ms.date: 7/28/2022
+ms.date: 08/03/2022
 ---

-# Autovacuum Tuning
+# Autovacuum Tuning in Azure Database for PostgreSQL - Flexible Server

-## What is Autovacuum
+This article provides an overview of the autovacuum feature for [Azure Database for PostgreSQL - Flexible Server](overview.md).
+
+## What is autovacuum

 Internal data consistency in PostgreSQL is based on the Multi-Version Concurrency Control (MVCC) mechanism, which allows the database engine to maintain multiple versions of a row and provides greater concurrency with minimal blocking between the different processes.

-PostgreSQL databases need appropriate maintenance. For example, when a row is deleted, it is not removed physically. Instead, the row is marked as "dead". Similarly for updates, the row is marked as "dead" and a new version of the row is inserted. These operations leave behind dead records, called dead tuples, even after all the transactions that might see those versions finish. Unless cleaned up, dead tuples remain, consuming disk space and bloating tables and indexes which result in slow query performance.
+PostgreSQL databases need appropriate maintenance. For example, when a row is deleted, it is not removed physically. Instead, the row is marked as "dead". Similarly for updates, the row is marked as "dead" and a new version of the row is inserted. These operations leave behind dead records, called dead tuples, even after all the transactions that might see those versions finish. Unless cleaned up, dead tuples remain, consuming disk space and bloating tables and indexes, which results in slow query performance.

 PostgreSQL uses a process called autovacuum to automatically clean up dead tuples.

-## Autovacuum Internals
+## Autovacuum internals

 Autovacuum reads pages looking for dead tuples, and if none are found, autovacuum discards the page. When autovacuum finds dead tuples, it removes them. The cost is based on:
@@ -47,7 +49,7 @@ That means in one-second autovacuum can do:
-## Monitoring Autovacuum
+## Monitoring autovacuum

 Use the following queries to monitor autovacuum:
@@ -64,7 +66,7 @@ The following columns help determine if autovacuum is catching up to table activ
 - **Last_autoanalyze**: The date of the last time the table was automatically analyzed.

-## When Does PostgreSQL Trigger Autovacuum
+## When does PostgreSQL trigger autovacuum

 An autovacuum action (either *ANALYZE* or *VACUUM*) triggers when the number of dead tuples exceeds a particular number that depends on two factors: the total count of rows in a table, plus a fixed threshold. By default, *ANALYZE* triggers when 10% of the table plus 50 rows changes, while *VACUUM* triggers when 20% of the table plus 50 rows changes. Since the *VACUUM* threshold is twice the *ANALYZE* threshold, *ANALYZE* gets triggered much earlier than *VACUUM*.
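As an aside, the trigger arithmetic described above can be sketched as follows. This is a hypothetical Python illustration (the function names are invented, not part of PostgreSQL or this article), using the shipped defaults `autovacuum_analyze_scale_factor = 0.1`, `autovacuum_vacuum_scale_factor = 0.2`, and a fixed threshold of 50 rows for both:

```python
# Sketch of the default autovacuum trigger points: a fixed threshold plus a
# scale factor applied to the table's row count.

def analyze_trigger(row_count, scale_factor=0.1, threshold=50):
    """Changed tuples needed before autoanalyze runs (default settings)."""
    return threshold + scale_factor * row_count

def vacuum_trigger(row_count, scale_factor=0.2, threshold=50):
    """Dead tuples needed before autovacuum's VACUUM runs (default settings)."""
    return threshold + scale_factor * row_count

# A 100-row table is vacuumed after 70 changed rows and analyzed after 60 ...
print(vacuum_trigger(100))              # 70.0
print(analyze_trigger(100))             # 60.0
# ... while a billion-row table needs roughly 200 million changes to vacuum.
print(vacuum_trigger(1_000_000_000))    # 200000050.0
```

The same arithmetic explains the per-table tuning advice later in this article: the defaults fire constantly on tiny hot tables and almost never on very large ones.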
@@ -122,11 +124,11 @@ Use the following query to list the tables in a database and identify the tables
 > The query does not take into consideration that autovacuum can be configured on a per-table basis using the "alter table" DDL command.

-## Common Autovacuum Problems
+## Common autovacuum problems

 Review the possible common problems with the autovacuum process.

-### Not Keeping Up With Busy Server
+### Not keeping up with busy server

 The autovacuum process estimates the cost of every I/O operation, accumulates a total for each operation it performs, and pauses once the upper limit of the cost is reached. `autovacuum_vacuum_cost_delay` and `autovacuum_vacuum_cost_limit` are the two server parameters used in the process.
@@ -137,8 +139,6 @@ If `autovacuum_vacuum_cost_limit` is set to `-1` then autovacuum uses the `v
 In case autovacuum is not keeping up, the following parameters may be changed:

 |Parameter |Description |
 |---------|---------|
 |`autovacuum_vacuum_scale_factor`| Default: `0.2`, recommended range: `0.05 - 0.1`. The scale factor is workload-specific and should be set depending on the amount of data in the tables. Before changing the value, investigate the workload and individual table volumes. |
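The throughput these cost parameters allow can be sketched numerically. This is a hypothetical Python illustration (invented function names) assuming the long-standing defaults: `vacuum_cost_limit = 200`, `autovacuum_vacuum_cost_delay = 20 ms` (newer PostgreSQL versions default to 2 ms), page costs of 1 (hit), 10 (miss; 2 since PostgreSQL 14), and 20 (dirty), and 8 KB pages:

```python
# Sketch of the autovacuum cost model: the worker spends up to cost_limit
# cost units, sleeps for cost_delay, and repeats.

PAGE_SIZE_KB = 8  # default PostgreSQL block size

def pages_per_second(cost_limit, cost_delay_ms, page_cost):
    # cost budget available per second, divided by the cost of one page
    budget_per_sec = cost_limit * (1000 / cost_delay_ms)
    return budget_per_sec / page_cost

def mb_per_second(cost_limit, cost_delay_ms, page_cost):
    return pages_per_second(cost_limit, cost_delay_ms, page_cost) * PAGE_SIZE_KB / 1024

# With the assumed defaults: ~78 MB/s of cached pages, ~7.8 MB/s of pages
# read from disk, and ~3.9 MB/s of pages dirtied (written).
print(round(mb_per_second(200, 20, 1), 1))
print(round(mb_per_second(200, 20, 10), 1))
print(round(mb_per_second(200, 20, 20), 1))
```

Raising `autovacuum_vacuum_cost_limit` or lowering `autovacuum_vacuum_cost_delay` scales these rates linearly, which is why they are the first knobs to adjust on a busy server.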
@@ -148,37 +148,36 @@ In case the autovacuum is not keeping up, the following parameters may be change
 > [!NOTE]
 > The `autovacuum_vacuum_cost_limit` value is distributed proportionally among the running autovacuum workers, so that if there is more than one, the sum of the limits for each worker does not exceed the value of the `autovacuum_vacuum_cost_limit` parameter.

-### Autovacuum Constantly Running
+### Autovacuum constantly running

 Continuously running autovacuum may affect CPU and IO utilization on the server. The following might be possible reasons:
-#####`maintenance_work_mem`
+#### `maintenance_work_mem`

 The autovacuum daemon uses `autovacuum_work_mem`, which is set to `-1` by default, meaning `autovacuum_work_mem` has the same value as the parameter `maintenance_work_mem`. This document assumes `autovacuum_work_mem` is set to `-1` and `maintenance_work_mem` is used by the autovacuum daemon.

 If `maintenance_work_mem` is low, it may be increased to up to 2 GB on Flexible Server. A general rule of thumb is to allocate 50 MB to `maintenance_work_mem` for every 1 GB of RAM.
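The rule of thumb above can be sketched as a one-liner. This is a hypothetical Python illustration (invented function name) assuming 50 MB per 1 GB of RAM and the 2 GB Flexible Server ceiling mentioned above:

```python
# Sketch: suggested maintenance_work_mem, in MB, for a given amount of RAM.

FLEX_SERVER_CAP_MB = 2048  # 2 GB upper bound noted above

def suggested_maintenance_work_mem_mb(ram_gb):
    # 50 MB per GB of RAM, capped at the Flexible Server maximum.
    return min(50 * ram_gb, FLEX_SERVER_CAP_MB)

print(suggested_maintenance_work_mem_mb(8))   # 400
print(suggested_maintenance_work_mem_mb(64))  # 2048 (capped)
```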
-#####Large Number Of Databases
+#### Large number of databases

 Autovacuum tries to start a worker on each database every `autovacuum_naptime` seconds.

 For example, if a server has 60 databases and `autovacuum_naptime` is set to 60 seconds, then an autovacuum worker starts every second [autovacuum_naptime/Number of DBs].

 It is a good idea to increase `autovacuum_naptime` if there are more databases in a cluster. At the same time, the autovacuum process can be made more aggressive by increasing `autovacuum_cost_limit`, decreasing `autovacuum_cost_delay`, and increasing `autovacuum_max_workers` from the default of 3 to 4 or 5.
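The scheduling arithmetic above can be sketched as follows (a hypothetical Python illustration with an invented function name, not part of PostgreSQL):

```python
# Sketch: autovacuum tries to start one worker per database every
# autovacuum_naptime seconds, so launches are naptime/database_count apart.

def worker_start_interval_seconds(naptime_seconds, database_count):
    return naptime_seconds / database_count

# 60 databases with the default naptime of 60 s: a worker launch every second.
print(worker_start_interval_seconds(60, 60))   # 1.0
# Raising naptime to 300 s spaces launches out to every 5 s.
print(worker_start_interval_seconds(300, 60))  # 5.0
```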
-### Out Of Memory Errors
+### Out of memory errors

 Overly aggressive `maintenance_work_mem` values could periodically cause out-of-memory errors in the system. It is important to understand the available RAM on the server before any change to the `maintenance_work_mem` parameter is made.

-### Autovacuum Is Too Disruptive
+### Autovacuum is too disruptive

 If autovacuum is consuming a lot of resources, the following can be done:

-#####Autovacuum Parameters
+#### Autovacuum parameters

 Evaluate the parameters `autovacuum_vacuum_cost_delay`, `autovacuum_vacuum_cost_limit`, and `autovacuum_max_workers`. Improperly setting autovacuum parameters may lead to scenarios where autovacuum becomes too disruptive.
@@ -187,7 +186,7 @@ If autovacuum is too disruptive, consider the following:
 - Increase `autovacuum_vacuum_cost_delay` and reduce `autovacuum_vacuum_cost_limit` if set higher than the default of 200.
 - Reduce the number of `autovacuum_max_workers` if it is set higher than the default of 3.

-#### Too Many Autovacuum Workers
+#### Too many autovacuum workers

 Increasing the number of autovacuum workers will not necessarily increase the speed of vacuum. Having a high number of autovacuum workers is not recommended.
@@ -199,7 +198,7 @@ If the number of workers is increased, `autovacuum_vacuum_cost_limit` should als
 However, if table-level `autovacuum_vacuum_cost_delay` or `autovacuum_vacuum_cost_limit` parameters have been changed, then the workers running on those tables are exempted from the balancing algorithm [autovacuum_cost_limit/autovacuum_max_workers].
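The balancing described above can be sketched as follows. This is a hypothetical Python illustration (invented function name) that simplifies the proportional distribution to an even split among the workers that share the budget; workers on tables with table-level cost settings are treated as exempt:

```python
# Sketch: the shared autovacuum_vacuum_cost_limit budget is split across
# active workers, except workers with table-level cost settings.

def per_worker_cost_limit(cost_limit, active_workers, exempt_workers=0):
    sharing = active_workers - exempt_workers
    if sharing <= 0:
        return 0  # nobody shares the global budget
    return cost_limit // sharing  # even split, simplified

# Three workers share the default limit of 200 ...
print(per_worker_cost_limit(200, 3))                     # 66
# ... but if one worker uses table-level settings, only two share it.
print(per_worker_cost_limit(200, 3, exempt_workers=1))   # 100
```

This is why adding workers without raising `autovacuum_vacuum_cost_limit` does not make vacuuming faster: each worker simply gets a smaller slice.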
-### Autovacuum Transaction ID (TXID) Wraparound Protection
+### Autovacuum transaction ID (TXID) wraparound protection

 When a database runs into transaction ID wraparound protection, an error message like the following can be observed:
@@ -214,18 +213,17 @@ Stop the postmaster and vacuum that database in single-user mode.
 The wraparound problem occurs when the database is either not vacuumed or there are too many dead tuples that could not be removed by autovacuum. The reasons for this might be:

-#### Heavy Workload
+#### Heavy workload

 The workload could cause too many dead tuples in a brief period, making it difficult for autovacuum to catch up. The dead tuples in the system add up over time, leading to degradation of query performance and to a wraparound situation. One reason for this situation to arise might be that autovacuum parameters aren't adequately set and it is not keeping up with a busy server.

-#### Long Running Transactions
+#### Long-running transactions

 Any long-running transactions in the system will not allow dead tuples to be removed while autovacuum is running. They're a blocker to the vacuum process. Removing the long-running transactions frees up dead tuples for deletion when autovacuum runs.

 Long-running transactions can be detected using the following query:
 ```postgresql
 SELECT pid, age(backend_xid) AS age_in_xids,
 now() - xact_start AS xact_age,
@@ -237,8 +235,8 @@ Long-running transactions can be detected using the following query:
 ORDER BY 2 DESC
 LIMIT 10;
 ```

-#### Prepared Statements
+#### Prepared statements

 If there are prepared statements that are not committed, they would prevent dead tuples from being removed.
 The following query helps find non-committed prepared statements:
@@ -251,7 +249,7 @@ The following query helps find non-committed prepared statements:
 Use COMMIT PREPARED or ROLLBACK PREPARED to commit or roll back these statements.

-#### Unused Replication Slots
+#### Unused replication slots

 Unused replication slots prevent autovacuum from claiming dead tuples. The following query helps identify unused replication slots:
@@ -266,11 +264,10 @@ Use `pg_drop_replication_slot()` to delete unused replication slots.
 When the database runs into transaction ID wraparound protection, check for any blockers as mentioned previously, and remove those manually for autovacuum to continue and complete. You can also increase the speed of autovacuum by setting `autovacuum_cost_delay` to 0 and increasing the `autovacuum_cost_limit` to a value much greater than 200. However, changes to these parameters will not be applied to existing autovacuum workers. Either restart the database or kill existing workers manually to apply parameter changes.

-### Table-specific Requirements
+### Table-specific requirements

 Autovacuum parameters may be set for individual tables. This is especially important for small and big tables. For example, for a small table that contains only 100 rows, autovacuum triggers a VACUUM operation when 70 rows change (as calculated previously). If this table is frequently updated, you might see hundreds of autovacuum operations a day. This prevents autovacuum from maintaining other tables on which the percentage of changes isn't as big. Alternatively, a table containing a billion rows needs about 200 million rows to change to trigger autovacuum operations. Setting autovacuum parameters appropriately prevents such scenarios.

 To set autovacuum settings per table, change the server parameters as in the following examples:
```postgresql
@@ -281,7 +278,8 @@ To set autovacuum setting per table, change the server parameters as the follo
 ALTER TABLE <table name> SET (autovacuum_vacuum_cost_delay = xx);
 ALTER TABLE <table name> SET (autovacuum_vacuum_cost_limit = xx);
 ```
-### Insert-only Workloads
+### Insert-only workloads

 In versions of PostgreSQL prior to 13, autovacuum will not run on tables with an insert-only workload, because if there are no updates or deletes, there are no dead tuples and no free space that needs to be reclaimed. However, autoanalyze will run for insert-only workloads since there is new data. The disadvantages of this are:
@@ -291,19 +289,19 @@ In versions of PostgreSQL prior to 13, autovacuum will not run on tables wit
 #### Solutions

-##### Postgres Versions prior to 13
+##### Postgres versions prior to 13

 Using the **pg_cron** extension, a cron job can be set up to schedule a periodic vacuum analyze on the table. The frequency of the cron job depends on the workload.

 For step-by-step guidance on using pg_cron, review [Extensions](./concepts-extensions.md).

-##### Postgres 13 and Higher Versions
+##### Postgres 13 and higher versions

 Autovacuum will run on tables with an insert-only workload. Two new server parameters, `autovacuum_vacuum_insert_threshold` and `autovacuum_vacuum_insert_scale_factor`, help control when autovacuum can be triggered on insert-only tables.
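The insert-driven trigger added in PostgreSQL 13 follows the same threshold-plus-scale-factor shape as the update/delete trigger. A hypothetical Python illustration (invented function name), assuming the shipped defaults `autovacuum_vacuum_insert_threshold = 1000` and `autovacuum_vacuum_insert_scale_factor = 0.2`:

```python
# Sketch: inserted tuples needed before autovacuum runs on an insert-only
# table in PostgreSQL 13+, with the assumed default settings.

def insert_vacuum_trigger(row_count, scale_factor=0.2, threshold=1000):
    return threshold + scale_factor * row_count

print(insert_vacuum_trigger(10_000))     # 3000.0
print(insert_vacuum_trigger(1_000_000))  # 201000.0
```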
 ## Next steps

-- Troubleshoot High CPU Utilization: [High CPU Utilization](./how-to-high-cpu-utilization.md).
-- Troubleshoot High Memory Utilization: [High Memory Utilization](./how-to-high-memory-utilization.md).
+- Troubleshoot high CPU utilization: [High CPU Utilization](./how-to-high-cpu-utilization.md).
+- Troubleshoot high memory utilization: [High Memory Utilization](./how-to-high-memory-utilization.md).
 - Configure server parameters: [Server Parameters](./howto-configure-server-parameters-using-portal.md).
articles/postgresql/flexible-server/how-to-high-cpu-utilization.md (+17/-19 lines)
@@ -6,23 +6,23 @@ author: sarat0681
 ms.service: postgresql
 ms.subservice: flexible-server
 ms.topic: conceptual
-ms.date: 7/28/2022
+ms.date: 08/03/2022
 ---

 # Troubleshoot high CPU utilization in Azure Database for PostgreSQL - Flexible Server

 This article shows you how to quickly identify the root cause of high CPU utilization, and possible remedial actions to control CPU utilization when using [Azure Database for PostgreSQL - Flexible Server](overview.md).

 In this article, you will learn:

 - About tools to identify high CPU utilization, such as Azure Metrics, Query Store, and pg_stat_statements.
 - How to identify root causes, such as long-running queries and total connections.
 - How to resolve high CPU utilization by using EXPLAIN ANALYZE, connection pooling, and vacuuming tables.

-## Tools to Identify high CPU Utilization
+## Tools to identify high CPU utilization

 Consider these tools to identify high CPU utilization.

 ### Azure Metrics
@@ -39,7 +39,7 @@ The pg_stat_statements extension helps identify queries that consume time on the
 #### Mean or average execution time

-# [Postgres v13 & above](#tab/postgres-13)
+##### [Postgres v13 & above](#tab/postgres-13)

 For Postgres versions 13 and above, use the following statement to view the top five SQL statements by mean or average execution time:
 For Postgres versions 9.6, 10, 11, and 12, use the following statement to view the top five SQL statements by total execution time:

 ```postgresql
 SELECT userid::regrole, dbid, query
 FROM pg_stat_statements
@@ -96,14 +95,14 @@ DESC LIMIT 5;
 ---

-## Identify Root Causes
+## Identify root causes

 If CPU consumption levels are high in general, the following could be possible root causes:
-### Long Running Transactions
+### Long-running transactions

-Longrunning transactions can consume CPU resources that can lead to high CPU utilization.
+Long-running transactions can consume CPU resources, which can lead to high CPU utilization.

 The following query helps identify connections running for the longest time:
@@ -114,7 +113,7 @@ WHERE pid <> pg_backend_pid() and state IN ('idle in transaction', 'active')
 ORDER BY duration DESC;
 ```

-### Total Number of Connections and Number Connections by State
+### Total number of connections and number of connections by state

 A large number of connections to the database is another issue that might lead to increased CPU as well as memory utilization.
@@ -128,8 +127,7 @@ WHERE pid <> pg_backend_pid()
 GROUP BY 1 ORDER BY 1;
 ```

-## Resolve High CPU Utilization
+## Resolve high CPU utilization

 Use EXPLAIN ANALYZE, PgBouncer connection pooling, and terminating long-running transactions to resolve high CPU utilization.
@@ -139,7 +137,7 @@ Once you know the query that's running for a long time, use **EXPLAIN** to furth
 For more information about the **EXPLAIN** command, review [Explain Plan](https://www.postgresql.org/docs/current/sql-explain.html).

-### PGBouncer And Connection Pooling
+### PgBouncer and connection pooling

 In situations where there are many idle connections, or many connections that are consuming CPU, consider use of a connection pooler like PgBouncer.
@@ -152,7 +150,7 @@ For more details about PgBouncer, review:
 Azure Database for PostgreSQL - Flexible Server offers PgBouncer as a built-in connection pooling solution. For more information, see [PgBouncer](./concepts-pgbouncer.md).

-### Terminating Long Running Transactions
+### Terminating long-running transactions

 You could consider killing a long-running transaction as an option.
@@ -165,15 +163,15 @@ WHERE pid <> pg_backend_pid() and state IN ('idle in transaction', 'active')
 ORDER BY duration DESC;
 ```

-You can also filter by other properties like usename (username), datname (database name) etc.
+You can also filter by other properties like `usename` (username), `datname` (database name), etc.

 Once you have the session's PID, you can terminate it using the following query:

 ```postgresql
 SELECT pg_terminate_backend(pid);
 ```

-### Monitoring Vacuum And Table Stats
+### Monitoring vacuum and table stats

 Keeping table statistics up to date helps improve query performance. Monitor whether regular autovacuuming is being carried out.