Copy file name to clipboardExpand all lines: articles/postgresql/flexible-server/how-to-autovacuum-tuning.md
+43-58Lines changed: 43 additions & 58 deletions
@@ -15,19 +15,15 @@ ms.date: 7/28/2022
15
15
16
16
Internal data consistency in PostgreSQL is based on the Multi-Version Concurrency Control (MVCC) mechanism, which allows the database engine to maintain multiple versions of a row and provides greater concurrency with minimal blocking between the different processes.
17
17
18
-
19
18
PostgreSQL databases need appropriate maintenance. For example, when a row is deleted, it is not removed physically. Instead, the row is marked as "dead". Similarly for updates, the old version of the row is marked as "dead" and a new version of the row is inserted. These operations leave behind dead records, called dead tuples, even after all the transactions that might see those versions finish. Unless cleaned up, dead tuples remain, consuming disk space and bloating tables and indexes, which results in slow query performance.
20
19
21
-
22
-
23
-
Postgres uses a process called autovacuum to automatically clean up dead tuples.
20
+
PostgreSQL uses a process called autovacuum to automatically clean up dead tuples.
24
21
25
22
26
23
## Autovacuum Internals
27
24
28
25
Autovacuum reads pages looking for dead tuples, and if none are found, autovacuum discards the page. When autovacuum finds dead tuples, it removes them. The cost is based on:
29
26
30
-
31
27
`vacuum_cost_page_hit`
32
28
Cost of reading a page that is already in shared buffers and does not need a disk read. The default value is set to 1.
33
29
@@ -46,13 +42,11 @@ The amount of work autovacuum does depends on two parameters:
46
42
47
43
48
44
In Postgres versions 9.6, 10 and 11 the default for `autovacuum_vacuum_cost_limit` is 200 and `autovacuum_vacuum_cost_delay` is 20 milliseconds.
49
-
50
45
In Postgres versions 12 and above the default `autovacuum_vacuum_cost_limit` is 200 and `autovacuum_vacuum_cost_delay` is 2 milliseconds.
51
46
52
-
53
47
Autovacuum wakes up 50 times (50*20 ms=1000 ms) every second. Every time it wakes up, autovacuum reads 200 pages.
54
48
55
-
That means in one-second autovacuum can do
49
+
That means that in one second, autovacuum can process:
56
50
57
51
- ~80 MB/sec [ (200 pages/`vacuum_cost_page_hit`) * 50 * 8 KB per page] if all pages with dead tuples are found in shared buffers.
58
52
- ~8 MB/sec [ (200 pages/`vacuum_cost_page_miss`) * 50 * 8 KB per page] if all pages with dead tuples are read from disk.
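
The arithmetic behind these figures can be reproduced directly in SQL. The following is a minimal sketch that hard-codes the values quoted above (a cost limit of 200, a 20 ms delay, 8 KB pages) together with a page-hit cost of 1 and a page-miss cost of 10, the latter being an assumption implied by the ~8 MB/sec figure. The column aliases and the `VALUES` list are illustrative only, and the settings on your server may differ.

```postgresql
-- Assumed inputs, taken from the text above rather than read from the server.
SELECT scenario,
       cost_limit / page_cost                                    AS pages_per_wakeup,
       1000 / delay_ms                                           AS wakeups_per_second,
       (cost_limit / page_cost) * (1000 / delay_ms) * 8 / 1000.0 AS approx_mb_per_second
FROM (VALUES ('page found in shared buffers', 200, 1, 20),
             ('page read from disk',          200, 10, 20)
     ) AS t (scenario, cost_limit, page_cost, delay_ms);
```

The two rows reproduce the ~80 MB/sec and ~8 MB/sec estimates; changing `delay_ms` to 2 shows the effect of the lower default delay in versions 12 and above.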
@@ -61,22 +55,20 @@ That means in one-second autovacuum can do
61
55
62
56
63
57
## Monitoring Autovacuum
64
-
Use the following queries to monitor autovacuum:
65
58
59
+
Use the following queries to monitor autovacuum:
66
60
67
-
```
61
+
```postgresql
68
62
select schemaname,relname,n_dead_tup,n_live_tup,round(n_dead_tup::float/n_live_tup::float*100) dead_pct,autovacuum_count,last_vacuum,last_autovacuum,last_autoanalyze,last_analyze from pg_stat_all_tables where n_live_tup >0;
69
-
```
63
+
```
70
64
71
65
72
66
The following columns help determine if autovacuum is catching up to table activity:
73
67
74
68
75
-
**Dead_pct**: percentage of dead tuples when compared to live tuples.
76
-
77
-
**Last_autovacuum**: The date of the last time the table was autovacuumed.
78
-
79
-
**Last_autoanalyze**: The date of the last time the table was automatically analyzed.
69
+
- **Dead_pct**: The percentage of dead tuples when compared to live tuples.
70
+
- **Last_autovacuum**: The date of the last time the table was autovacuumed.
71
+
- **Last_autoanalyze**: The date of the last time the table was automatically analyzed.
80
72
81
73
82
74
## When Does PostgreSQL Trigger Autovacuum
@@ -85,26 +77,20 @@ An autovacuum action (either *ANALYZE* or *VACUUM*) triggers when the number of
For example, analyze triggers after 60 rows change on a table that contains 100 rows, and vacuum triggers when 70 rows change on the table, using the following equations:
96
85
97
-
98
-
`Autoanalyze = 0.1 * 100 + 50 = 60`
99
-
100
-
86
+
`Autoanalyze = 0.1 * 100 + 50 = 60`
101
87
`Autovacuum = 0.2 * 100 + 50 = 70`
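
To see how these thresholds scale with table size, the same formula can be evaluated for several row counts. This sketch assumes the threshold (50) and scale factors (0.1 and 0.2) used in the equations above; the row counts are arbitrary examples.

```postgresql
-- threshold + scale_factor * number of rows, using the values from the equations above
SELECT n_rows,
       50 + 0.1 * n_rows AS autoanalyze_after_changes,
       50 + 0.2 * n_rows AS autovacuum_after_changes
FROM (VALUES (100), (1000000), (1000000000)) AS t (n_rows);
```

The first row matches the 60 and 70 computed above, while the last row shows that a very large table needs hundreds of millions of changed rows before autovacuum triggers with these settings.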
102
88
103
89
104
90
Use the following query to list the tables in a database and identify the tables that qualify for the autovacuum process:
105
91
106
92
107
-
```
93
+
```postgresql
108
94
SELECT *
109
95
,n_dead_tup > av_threshold AS av_needed
110
96
,CASE
@@ -139,12 +125,14 @@ Use the following query to list the tables in a database and identify the tables
139
125
ORDER BY av_needed DESC ,n_dead_tup DESC;
140
126
```
141
127
142
-
>[!NOTE]
128
+
>[!NOTE]
143
129
> The query does not take into consideration that autovacuum can be configured on a per-table basis using the `ALTER TABLE` DDL command.
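
Tables that carry their own autovacuum settings can be listed from `pg_class.reloptions`. This is a sketch rather than part of the query above; it assumes the overrides were set as storage parameters with `ALTER TABLE ... SET (...)`.

```postgresql
-- List ordinary tables that have per-table autovacuum storage parameters
SELECT c.oid::regclass AS table_name,
       c.reloptions
FROM pg_class AS c
WHERE c.relkind = 'r'
  AND array_to_string(c.reloptions, ',') LIKE '%autovacuum%';
```

For any table returned here, the thresholds computed from the server-wide settings may not reflect what autovacuum actually uses.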
144
130
145
131
146
132
## Common Autovacuum Problems
147
133
134
+
Review the possible common problems with the autovacuum process.
135
+
148
136
### Not Keeping Up With Busy Server
149
137
150
138
The autovacuum process estimates the cost of every I/O operation, accumulates a total for each operation it performs and pauses once the upper limit of the cost is reached. `autovacuum_vacuum_cost_delay` and `autovacuum_vacuum_cost_limit` are the two server parameters that are used in the process.
@@ -155,38 +143,37 @@ By default, `autovacuum_vacuum_cost_limit` is set to –1 meaning autovacuum
155
143
`vacuum_cost_limit` is the cost limit for manual vacuum. If `autovacuum_vacuum_cost_limit` is set to -1, autovacuum uses the `vacuum_cost_limit` parameter; if `autovacuum_vacuum_cost_limit` is set to a value greater than -1, autovacuum uses `autovacuum_vacuum_cost_limit` instead.
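
The values currently in effect can be read from the `pg_settings` view. The query below is a small sketch; a value of -1 for an `autovacuum_*` parameter indicates that the corresponding `vacuum_*` parameter is used instead.

```postgresql
-- Show the cost-based vacuum settings and their units
SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('vacuum_cost_limit',
               'vacuum_cost_delay',
               'autovacuum_vacuum_cost_limit',
               'autovacuum_vacuum_cost_delay')
ORDER BY name;
```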
156
144
157
145
In case the autovacuum is not keeping up, the following parameters may be changed:
158
-
146
+
159
147
##### `autovacuum_vacuum_scale_factor`
148
+
160
149
Default: 0.2; suggested tuning range: 0.05 - 0.1. The scale factor is workload-specific and should be set depending on the amount of data in the tables. Before changing the value, the workload and individual table volumes need to be investigated.
161
150
162
151
##### `autovacuum_vacuum_cost_limit`
152
+
163
153
Default: 200. Cost limit may be increased. CPU and I/O utilization on the database should be monitored before and after making changes.
164
154
165
155
##### `autovacuum_vacuum_cost_delay`
166
156
167
157
###### Postgres Versions 9.6, 10, 11
158
+
168
159
Default: 20 ms. The parameter may be decreased to 2-10 ms.
169
160
170
161
###### Postgres Versions 12 and above
162
+
171
163
Default: 2 ms.
172
164
173
-
```
174
165
> [!NOTE]
175
-
> The `autovacuum_vacuum_cost_limit` value is distributed proportionally among the running autovacuum workers, so that if there is more than one, the sum of the limits for each worker does not exceed the value of the `autovacuum_vacuum_cost_limit` parameter.
166
+
> The `autovacuum_vacuum_cost_limit` value is distributed proportionally among the running autovacuum workers, so that if there is more than one, the sum of the limits for each worker does not exceed the value of the `autovacuum_vacuum_cost_limit` parameter.
176
167
177
-
```
178
168
179
169
### Autovacuum Constantly Running
180
170
181
171
Continuously running autovacuum may affect CPU and I/O utilization on the server. The following might be possible reasons:
182
172
183
-
184
-
185
173
##### `maintenance_work_mem`
186
174
187
175
The autovacuum daemon uses `autovacuum_work_mem`, which is set to -1 by default, meaning that `autovacuum_work_mem` takes the same value as the `maintenance_work_mem` parameter. This document assumes `autovacuum_work_mem` is set to -1 and `maintenance_work_mem` is used by the autovacuum daemon.
188
176
189
-
190
177
If `maintenance_work_mem` is low, it may be increased up to 2 GB on Flexible Server. A general rule of thumb is to allocate 50 MB to `maintenance_work_mem` for every 1 GB of RAM.
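
As a rough illustration of this rule of thumb, the following sketch computes a candidate value for a few example RAM sizes and then shows the current settings. The RAM sizes are arbitrary, and the 2048 MB cap simply encodes the 2 GB limit mentioned above.

```postgresql
-- 50 MB per 1 GB of RAM, capped at the 2 GB (2048 MB) limit mentioned above
SELECT ram_gb,
       LEAST(ram_gb * 50, 2048) AS suggested_maintenance_work_mem_mb
FROM (VALUES (4), (16), (64)) AS t (ram_gb);

-- Compare with the values currently set on the server
SHOW maintenance_work_mem;
SHOW autovacuum_work_mem;
```

Because each autovacuum worker can allocate up to this amount, the result should be weighed against `autovacuum_max_workers` and the memory needed by the rest of the workload.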
191
178
192
179
@@ -197,21 +184,23 @@ Autovacuum tries to start a worker on each database every `autovacuum_naptime`
197
184
For example, if a server has 60 databases and `autovacuum_naptime` is set to 60 seconds, then the autovacuum worker starts every second [autovacuum_naptime/Number of DBs].
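
The same estimate can be derived for a running server. This is a sketch of the [autovacuum_naptime/Number of DBs] calculation; counting only databases that allow connections is an assumption of this example.

```postgresql
-- Approximate interval between autovacuum worker starts
SELECT s.setting::int AS naptime_seconds,
       d.connectable_databases,
       round(s.setting::numeric / d.connectable_databases, 2) AS seconds_between_worker_starts
FROM pg_settings AS s
CROSS JOIN (SELECT count(*) AS connectable_databases
            FROM pg_database
            WHERE datallowconn) AS d
WHERE s.name = 'autovacuum_naptime';
```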
198
185
199
186
200
-
It is a good idea to increase `autovacuum_naptime` if there are more databases in a cluster.
201
-
At the same time, the autovacuum process can be made more aggressive by increasing the
202
-
`autovacuum_cost_limit` and decreasing the `autovacuum_cost_delay` parameters and increasing the `autovacuum_max_workers` from the default of 3 to 4 or 5.
187
+
It is a good idea to increase `autovacuum_naptime` if there are more databases in a cluster. At the same time, the autovacuum process can be made more aggressive by increasing `autovacuum_vacuum_cost_limit`, decreasing `autovacuum_vacuum_cost_delay`, and increasing `autovacuum_max_workers` from the default of 3 to 4 or 5.
203
188
204
189
205
190
### Out Of Memory Errors
191
+
206
192
Overly aggressive `maintenance_work_mem` values could periodically cause out-of-memory errors in the system. It is important to understand available RAM on the server before any change to the `maintenance_work_mem` parameter is made.
207
193
208
194
209
195
### Autovacuum Is Too Disruptive
196
+
210
197
If autovacuum is consuming a lot of resources, the following can be done:
211
198
212
199
##### Autovacuum Parameters
200
+
213
201
Evaluate the parameters `autovacuum_vacuum_cost_delay`, `autovacuum_vacuum_cost_limit`, `autovacuum_max_workers`. Improperly setting autovacuum parameters may lead to scenarios where autovacuum becomes too disruptive.
214
202
203
+
If autovacuum is too disruptive, consider the following:
215
204
216
205
- Increase `autovacuum_vacuum_cost_delay` and reduce `autovacuum_vacuum_cost_limit` if set higher than the default of 200.
217
206
- Reduce the number of `autovacuum_max_workers` if it is set higher than the default of 3.
@@ -220,18 +209,13 @@ Evaluate the parameters `autovacuum_vacuum_cost_delay`, `autovacuum_vacuum_cost_
220
209
221
210
Increasing the number of autovacuum workers will not necessarily increase the speed of vacuum. Having a high number of autovacuum workers is not recommended.
222
211
223
-
a high number of autovacuum workers.
224
-
225
212
Increasing the number of autovacuum workers will result in more memory consumption and, depending on the value of `maintenance_work_mem`, could cause performance degradation.
226
213
227
214
Each autovacuum worker process only gets (1/autovacuum_max_workers) of the total `autovacuum_vacuum_cost_limit`, so having a high number of workers causes each one to go slower.
228
215
229
-
If the number of workers is increased, `autovacuum_vacuum_cost_limit` should also be increased and/or
230
-
`autovacuum_vacuum_cost_delay` should be decreased to make the vacuum process faster.
216
+
If the number of workers is increased, `autovacuum_vacuum_cost_limit` should also be increased and/or `autovacuum_vacuum_cost_delay` should be decreased to make the vacuum process faster.
231
217
232
-
However, if we have changed table level `autovacuum_vacuum_cost_delay` or `autovacuum_vacuum_cost_limit` parameters
233
-
then the workers running on those tables are exempted from being considered in the balancing algorithm
234
-
[autovacuum_cost_limit/autovacuum_max_workers].
218
+
However, if the table-level `autovacuum_vacuum_cost_delay` or `autovacuum_vacuum_cost_limit` parameters have been changed, the workers running on those tables are exempt from the balancing algorithm [autovacuum_vacuum_cost_limit/autovacuum_max_workers].
235
219
236
220
### Autovacuum Transaction ID (TXID) Wraparound Protection
237
221
@@ -240,28 +224,27 @@ When a database runs into transaction ID wraparound protection, an error message
240
224
```
241
225
database is not accepting commands to avoid wraparound data loss in database "xx"
242
226
Stop the postmaster and vacuum that database in single-user mode.
243
-
227
+
```
228
+
244
229
> [!NOTE]
245
230
> This error message is a long-standing oversight. Usually, you do not need to switch to single-user mode. Instead, you can run the required VACUUM commands and perform tuning for VACUUM to run fast. While you cannot run any data manipulation language (DML), you can still run VACUUM.
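
How close a database or table is to wraparound can be checked with `age()` on the frozen transaction IDs. The following is a minimal sketch; the table name in the `VACUUM` command is a placeholder, and the `LIMIT` is arbitrary.

```postgresql
-- Databases closest to wraparound first
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;

-- Within the current database: relations with the oldest relfrozenxid
SELECT c.oid::regclass AS table_name, age(c.relfrozenxid) AS xid_age
FROM pg_class AS c
WHERE c.relkind IN ('r', 'm', 't')
ORDER BY xid_age DESC
LIMIT 20;

-- Manual vacuum of one table (hypothetical name)
VACUUM (FREEZE, VERBOSE) my_schema.my_table;
```

Vacuuming the relations with the highest `xid_age` first usually lowers the database-level age most quickly.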
246
231
247
-
```
232
+
248
233
The wraparound problem occurs when the database is either not vacuumed or there are too many dead tuples that could not be removed by autovacuum. The reasons for this might be:
249
234
250
235
#### Heavy Workload
251
236
252
237
The workload could produce too many dead tuples in a brief period, making it difficult for autovacuum to catch up. The dead tuples in the system add up over time, leading to degraded query performance and eventually to a wraparound situation. One reason for this situation might be that autovacuum parameters aren't adequately set, so autovacuum is not keeping up with a busy server.
253
238
254
239
255
-
256
240
#### Long Running Transactions
257
241
258
-
Any long-running transactions in the system will not allow dead tuples to be removed while autovacuum is running.
259
-
They're a blocker to the vacuum process. Removing the long running transactions frees up dead tuples for deletion when autovacuum runs.
242
+
Any long-running transactions in the system will not allow dead tuples to be removed while autovacuum is running. They're a blocker to the vacuum process. Removing the long-running transactions frees up dead tuples for deletion when autovacuum runs.
260
243
261
244
Long-running transactions can be detected using the following query:
262
245
263
246
264
-
```
247
+
```postgresql
265
248
SELECT pid, age(backend_xid) AS age_in_xids,
266
249
now () - xact_start AS xact_age,
267
250
now () - query_start AS query_age,
@@ -278,24 +261,25 @@ Long-running transactions can be detected using the following query:
278
261
If there are prepared transactions that are not committed, they would prevent dead tuples from being removed.
279
262
The following query helps find non-committed prepared transactions:
Use `COMMIT PREPARED` or `ROLLBACK PREPARED` to commit or roll back these transactions.
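
Both commands take the global transaction identifier (`gid`) of the prepared transaction, as reported by the `pg_prepared_xacts` view. In this sketch, `'my_gid'` is a placeholder. The commands must be run outside a transaction block, by the user who prepared the transaction or by a superuser.

```postgresql
-- 'my_gid' is a placeholder; use a gid value returned by pg_prepared_xacts
COMMIT PREPARED 'my_gid';

-- or, to discard the prepared transaction instead:
ROLLBACK PREPARED 'my_gid';
```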
287
271
288
272
#### Unused Replication Slots
289
273
290
274
Unused replication slots prevent autovacuum from reclaiming dead tuples. The following query helps identify unused replication slots:
291
275
292
-
```
276
+
```postgresql
293
277
SELECT slot_name, slot_type, database, xmin
294
278
FROM pg_replication_slots
295
279
ORDER BY age(xmin) DESC;
296
280
```
297
-
Use pg_drop_replication_slot() to delete unused replication slots.
298
281
282
+
Use `pg_drop_replication_slot()` to delete unused replication slots.
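
A possible sequence is to list the slots that no connection is currently using and then drop them by name. This is a sketch; `'unused_slot'` is a placeholder, and a slot should only be dropped once it is confirmed that no consumer still needs it.

```postgresql
-- Slots that are currently not in use by any connection
SELECT slot_name, slot_type, active, xmin, catalog_xmin
FROM pg_replication_slots
WHERE NOT active;

-- Drop one slot by name ('unused_slot' is a placeholder)
SELECT pg_drop_replication_slot('unused_slot');
```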
299
283
300
284
When the database runs into transaction ID wraparound protection, check for any blockers as mentioned previously, and remove those manually for autovacuum to continue and complete. You can also increase the speed of autovacuum by setting `autovacuum_vacuum_cost_delay` to 0 and increasing `autovacuum_vacuum_cost_limit` to a value much greater than 200. However, changes to these parameters will not be applied to existing autovacuum workers. Either restart the database or kill existing workers manually to apply parameter changes.
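
Running autovacuum workers can be found in `pg_stat_activity`, where their query text starts with `autovacuum:`. The following sketch lists them and terminates one; the pid `12345` is a placeholder, and terminating backends assumes that your role has sufficient privileges on the server.

```postgresql
-- Autovacuum workers appear with a query text starting with 'autovacuum:'
SELECT pid, datname, xact_start, query
FROM pg_stat_activity
WHERE query LIKE 'autovacuum:%';

-- Terminate one worker so that a new one starts with the changed settings
-- (12345 is a placeholder pid taken from the query above)
SELECT pg_terminate_backend(12345);
```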
301
285
@@ -305,8 +289,9 @@ When the database runs into transaction ID wraparound protection, check for any
305
289
Autovacuum parameters may be set for individual tables. This is especially important for very small and very large tables. For example, for a small table that contains only 100 rows, autovacuum triggers a VACUUM operation when 70 rows change (as calculated previously). If this table is frequently updated, you might see hundreds of autovacuum operations a day, which keeps autovacuum from maintaining other tables on which the percentage of changes isn't as large. Conversely, a table containing a billion rows needs to change 200 million rows to trigger autovacuum operations. Setting autovacuum parameters appropriately prevents such scenarios.
306
290
307
291
308
-
To set autovacuum setting per table, change the server parameters as follows (values below are examples):
309
-
```
292
+
To set autovacuum settings per table, change the parameters as shown in the following examples:
293
+
294
+
```postgresql
310
295
ALTER TABLE <table name> SET (autovacuum_analyze_scale_factor = xx);
311
296
ALTER TABLE <table name> SET (autovacuum_analyze_threshold = xx);
312
297
ALTER TABLE <table name> SET (autovacuum_vacuum_scale_factor = xx);
@@ -316,18 +301,17 @@ To set autovacuum setting per table, change the server parameters as follows (
316
301
```
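
As a concrete illustration of the two cases described above, the following sketch applies per-table settings to a small, frequently updated table and to a very large one. The table names and all numeric values are hypothetical and would need to be chosen for the actual workload.

```postgresql
-- Small, frequently updated table: use a higher fixed threshold so that
-- vacuum is not triggered after every few dozen changes
ALTER TABLE small_hot_table SET (autovacuum_vacuum_threshold = 1000,
                                 autovacuum_vacuum_scale_factor = 0);

-- Very large table: lower the scale factor so that vacuum starts well
-- before 200 million rows have changed
ALTER TABLE big_table SET (autovacuum_vacuum_scale_factor = 0.01,
                           autovacuum_vacuum_threshold = 10000);

-- Return a table to the server-wide defaults
ALTER TABLE big_table RESET (autovacuum_vacuum_scale_factor,
                             autovacuum_vacuum_threshold);
```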
317
302
### Insert-only Workloads
318
303
319
-
In versions of PostgreSQL prior to 13, autovacuum will not run on tables with an insert-only workload, because if there are no updates or deletes, there are no dead tuples and no free space that needs to be reclaimed. However, autoanalyze will run for insert-only workloads since there is new data. The disadvantages of this are:
304
+
In versions of PostgreSQL prior to 13, autovacuum will not run on tables with an insert-only workload, because if there are no updates or deletes, there are no dead tuples and no free space that needs to be reclaimed. However, autoanalyze will run for insert-only workloads since there is new data. The disadvantages of this are:
320
305
321
306
- The visibility map of the tables is not updated, and thus query performance, especially where there are Index Only Scans, starts to suffer over time.
322
-
323
307
- The database can run into transaction ID wraparound protection.
324
308
- Hint bits will not be set.
325
309
326
310
#### Solutions
327
311
328
312
##### Postgres Versions prior to 13
329
313
330
-
Using the pg_cron extension, a cron job can be set up to schedule a periodic vacuum analyze on the table. The frequency of the cron job depends on the workload.
314
+
Using the **pg_cron** extension, a cron job can be set up to schedule a periodic vacuum analyze on the table. The frequency of the cron job depends on the workload.
331
315
332
316
For step-by-step guidance using pg_cron, review [Extensions](./concepts-extensions.md).
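
A scheduled job might look like the following sketch. It assumes that the pg_cron extension is already enabled and that the job runs in the database where pg_cron is configured; the table name, the schedule, and the job id are placeholders.

```postgresql
-- Run VACUUM ANALYZE on one table every day at 03:00 (standard cron syntax)
SELECT cron.schedule('0 3 * * *', 'VACUUM ANALYZE public.insert_only_table');

-- Review the scheduled jobs
SELECT jobid, schedule, command FROM cron.job;

-- Remove a job by its id (42 is a placeholder)
SELECT cron.unschedule(42);
```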
333
317
@@ -337,6 +321,7 @@ For step-by-step guidance using pg_cron, review [Extensions](./concepts-extensio
337
321
Autovacuum will run on tables with an insert-only workload. Two new server parameters `autovacuum_vacuum_insert_threshold` and `autovacuum_vacuum_insert_scale_factor` help control when autovacuum can be triggered on insert-only tables.
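
The two parameters can be inspected server-wide and overridden per table like other autovacuum settings. This is a minimal sketch for PostgreSQL 13 and above; the table name and the values are hypothetical.

```postgresql
-- Server-wide values (PostgreSQL 13 and above)
SELECT name, setting
FROM pg_settings
WHERE name IN ('autovacuum_vacuum_insert_threshold',
               'autovacuum_vacuum_insert_scale_factor');

-- Per-table override for a hypothetical append-only table
ALTER TABLE event_log SET (autovacuum_vacuum_insert_threshold = 100000,
                           autovacuum_vacuum_insert_scale_factor = 0.05);
```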
338
322
339
323
## Next steps
324
+
340
325
- Troubleshoot High CPU Utilization [High CPU Utilization](./how-to-high-cpu-utilization.md).
341
326
- Troubleshoot High Memory Utilization [High Memory Utilization](./how-to-high-memory-utilization.md).
342
327
- Configure server parameters [Server Parameters](./howto-configure-server-parameters-using-portal.md).