
Commit af39201

Updated Autovacuum with Analyze information
1 parent 8d39075 commit af39201

File tree

1 file changed

+42
-19
lines changed

articles/postgresql/flexible-server/how-to-autovacuum-tuning.md

Lines changed: 42 additions & 19 deletions
Original file line numberDiff line numberDiff line change
@@ -18,24 +18,25 @@ This article provides an overview of the autovacuum feature for [Azure Database
1818

1919
## What is autovacuum
2020

21-
Internal data consistency in PostgreSQL is based on the Multi-Version Concurrency Control (MVCC) mechanism, which allows the database engine to maintain multiple versions of a row and provides greater concurrency with minimal blocking between the different processes.
21+
Autovacuum is a PostgreSQL daemon process that automatically cleans up dead tuples and updates statistics. It combines the functionality of *VACUUM* and *ANALYZE*. The *VACUUM* component reclaims disk space by removing dead tuples, while the *ANALYZE* component updates statistics, enabling the PostgreSQL optimizer to choose the most efficient execution plans for queries. Autovacuum runs both operations on tables without manual intervention, which keeps database performance efficient.
2222

23-
PostgreSQL databases need appropriate maintenance. For example, when a row is deleted, it isn't removed physically. Instead, the row is marked as "dead". Similarly for updates, the row is marked as "dead" and a new version of the row is inserted. These operations leave behind dead records, called dead tuples, even after all the transactions that might see those versions finish. Unless cleaned up, dead tuples remain, consuming disk space and bloating tables and indexes which result in slow query performance.
24-
25-
PostgreSQL uses a process called autovacuum to automatically clean-up dead tuples.
23+
Autovacuum should always be set to ON for the autovacuum daemon to effectively perform its operations on the server. PostgreSQL automatically determines whether VACUUM or ANALYZE needs to be executed on a table, but this only occurs if autovacuum is enabled on the server or database.
2624

2725
## Autovacuum internals
2826

2927
Autovacuum reads pages looking for dead tuples, and if none are found, autovacuum discards the page. When autovacuum finds dead tuples, it removes them. The cost is based on:
3028

31-
- `vacuum_cost_page_hit`: Cost of reading a page that is already in shared buffers and doesn't need a disk read. The default value is set to 1.
32-
- `vacuum_cost_page_miss`: Cost of fetching a page that isn't in shared buffers. The default value is set to 10.
33-
- `vacuum_cost_page_dirty`: Cost of writing to a page when dead tuples are found in it. The default value is set to 20.
29+
| Parameter | Description |
30+
| -------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
31+
| `vacuum_cost_page_hit` | Cost of reading a page that is already in shared buffers and doesn't need a disk read. The default value is set to 1. |
32+
| `vacuum_cost_page_miss` | Cost of fetching a page that isn't in shared buffers. The default value is set to 10. |
33+
| `vacuum_cost_page_dirty` | Cost of writing to a page when dead tuples are found in it. The default value is set to 20. |
3434

3535
The amount of work autovacuum does depends on two parameters:
36-
37-
- `autovacuum_vacuum_cost_limit` is the amount of work autovacuum does in one go.
38-
- `autovacuum_vacuum_cost_delay` number of milliseconds that autovacuum is asleep after it has reached the cost limit specified by the `autovacuum_vacuum_cost_limit` parameter.
36+
| Parameter | Description |
37+
| -------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
38+
| `autovacuum_vacuum_cost_limit` | The amount of work autovacuum does in one go. |
39+
| `autovacuum_vacuum_cost_delay` | Number of milliseconds that autovacuum sleeps after it reaches the cost limit specified by the `autovacuum_vacuum_cost_limit` parameter. |
3940

4041
In all currently supported versions of Postgres, the default for `autovacuum_vacuum_cost_limit` is 200. (Strictly, it is set to -1, which makes it equal to the value of the regular `vacuum_cost_limit`, which is 200 by default.)
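As a rough illustration of the cost accounting described above, the following Python sketch estimates how many pages autovacuum could process in one go before reaching the cost limit. The function name `pages_per_round` and the `PAGE_COSTS` table are hypothetical, the page costs are the defaults quoted in this article, and the model assumes every page in a round incurs the same single cost, which real workloads won't.

```python
# Illustrative sketch only: a simplified model of cost-based vacuum throttling.
# Page costs are the defaults quoted in this article; real rounds mix page kinds.
PAGE_COSTS = {"hit": 1, "miss": 10, "dirty": 20}


def pages_per_round(cost_limit: int, page_kind: str) -> int:
    """Pages processed before the accumulated cost reaches cost_limit,
    assuming every page in the round is of the same kind."""
    return cost_limit // PAGE_COSTS[page_kind]


# With the default limit of 200, compare the three page kinds.
for kind in PAGE_COSTS:
    print(kind, pages_per_round(200, kind))
```

Under these assumptions, a round dominated by pages that must be fetched or dirtied covers far fewer pages than one served entirely from shared buffers, which is why the cost limit and delay together bound autovacuum throughput.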
4142

@@ -59,24 +60,46 @@ select schemaname,relname,n_dead_tup,n_live_tup,round(n_dead_tup::float/n_live_t
5960

6061
The following columns help determine if autovacuum is catching up to table activity:
6162

62-
- **dead_pct**: percentage of dead tuples when compared to live tuples.
63-
- **last_autovacuum**: The date of the last time the table was autovacuumed.
64-
- **last_autoanalyze**: The date of the last time the table was automatically analyzed.
63+
| Parameter | Description
64+
| -------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
65+
`dead_pct` | Percentage of dead tuples when compared to live tuples.
66+
`last_autovacuum` | The date of the last time the table was autovacuumed.
67+
`last_autoanalyze` | The date of the last time the table was automatically analyzed.
6568

6669
## When does PostgreSQL trigger autovacuum
6770

68-
An autovacuum action (either *ANALYZE* or *VACUUM*) triggers when the number of dead tuples exceeds a particular number that is dependent on two factors: the total count of rows in a table, plus a fixed threshold. *ANALYZE*, by default, triggers when 10% of the table plus 50 rows changes, while *VACUUM* triggers when 20% of the table plus 50 rows changes. Since the *VACUUM* threshold is twice as high as the *ANALYZE* threshold, *ANALYZE* gets triggered earlier than *VACUUM*.
71+
An autovacuum action (either *ANALYZE* or *VACUUM*) triggers when the number of dead tuples exceeds a particular number that is dependent on two factors: the total count of rows in a table, plus a fixed threshold. *ANALYZE*, by default, triggers when 10% of the table plus 50 row changes, while *VACUUM* triggers when 20% of the table plus 50 row changes. Since the *VACUUM* threshold is twice as high as the *ANALYZE* threshold, *ANALYZE* gets triggered earlier than *VACUUM*.
72+
For PG versions >= 13, inserts alone can also trigger *VACUUM*: by default, this happens when the number of inserted rows exceeds 20% of the table plus 1000 rows.
6973

7074
The exact equations for each action are:
7175

72-
- **Autoanalyze** = autovacuum_analyze_scale_factor * tuples + autovacuum_analyze_threshold
76+
- **Autoanalyze** = autovacuum_analyze_scale_factor * tuples + autovacuum_analyze_threshold
77+
- **Autovacuum (insert-triggered, PG versions >= 13)** = autovacuum_vacuum_insert_scale_factor * tuples + autovacuum_vacuum_insert_threshold
7378
- **Autovacuum** = autovacuum_vacuum_scale_factor * tuples + autovacuum_vacuum_threshold
7479

75-
For example, analyze triggers after 60 rows change on a table that contains 100 rows, and vacuum triggers when 70 rows change on the table, using the following equations:
80+
For example, for a table with 100 rows, the equations give the following thresholds:
7681

82+
For Updates/deletes:
7783
`Autoanalyze = 0.1 * 100 + 50 = 60`
7884
`Autovacuum = 0.2 * 100 + 50 = 70`
7985

86+
Analyze triggers after 60 rows are changed on a table, and Vacuum triggers when 70 rows are changed on a table.
87+
88+
For inserts (PG versions >= 13):
89+
`Autovacuum = 0.2 * 100 + 1000 = 1020`
90+
91+
Vacuum triggers after 1020 rows are inserted into the table.
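The arithmetic in this example can be reproduced with a short Python sketch. The helper name `trigger_threshold` is hypothetical, and the scale factors and base thresholds are the default values quoted in this article; substitute the values configured on your own server.

```python
# Illustrative sketch: the threshold formula used in this section.
# Defaults are the values quoted in the article; the helper name is hypothetical.

def trigger_threshold(scale_factor: float, base_threshold: int, tuples: int) -> float:
    """Return scale_factor * tuples + base_threshold."""
    return scale_factor * tuples + base_threshold


tuples = 100  # rows in the example table

autoanalyze = trigger_threshold(0.1, 50, tuples)     # updates/deletes, analyze
autovacuum = trigger_threshold(0.2, 50, tuples)      # updates/deletes, vacuum
insert_based = trigger_threshold(0.2, 1000, tuples)  # inserts, PG versions >= 13

print(autoanalyze, autovacuum, insert_based)
```

Calling `trigger_threshold` with a larger `tuples` value shows how the scale factor comes to dominate the fixed threshold as a table grows.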
92+
93+
The following table describes the parameters used in the equations above:
94+
95+
| Parameter | Description |
96+
| -------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
97+
| `autovacuum_analyze_scale_factor` | Fraction of the table's rows that, when inserted, updated, or deleted, triggers ANALYZE on the table. |
98+
| `autovacuum_analyze_threshold` | Minimum number of tuples inserted, updated, or deleted before ANALYZE is triggered on a table. |
99+
| `autovacuum_vacuum_insert_scale_factor` | Fraction of the table's rows that, when inserted, triggers VACUUM on the table (PG versions >= 13). |
100+
| `autovacuum_vacuum_insert_threshold` | Minimum number of tuples inserted before VACUUM is triggered on a table (PG versions >= 13). |
101+
| `autovacuum_vacuum_scale_factor` | Fraction of the table's rows that, when updated or deleted, triggers VACUUM on the table. |
102+
80103
Use the following query to list the tables in a database and identify the tables that qualify for the autovacuum process:
81104

82105
```sql
@@ -131,14 +154,14 @@ If `autovacuum_vacuum_cost_limit` is set to `-1` then autovacuum uses the `v
131154

132155
In case the autovacuum isn't keeping up, the following parameters might be changed:
133156

134-
| Parameter | Description |
157+
| Parameter | Description |
135158
| -------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
136-
| `autovacuum_vacuum_scale_factor` | Default: `0.2`, range: `0.05 - 0.1`. The scale factor is workload-specific and should be set depending on the amount of data in the tables. Before changing the value, investigate the workload and individual table volumes. |
137159
| `autovacuum_vacuum_cost_limit` | Default: `200`. Cost limit might be increased. CPU and I/O utilization on the database should be monitored before and after making changes. |
138160
| `autovacuum_vacuum_cost_delay` | **Postgres Version 11** - Default: `20 ms`. The parameter might be decreased to `2-10 ms`.<br />**Postgres Versions 12 and above** - Default: `2 ms`. |
139161

140162
> [!NOTE]
141-
> The `autovacuum_vacuum_cost_limit` value is distributed proportionally among the running autovacuum workers, so that if there is more than one, the sum of the limits for each worker doesn't exceed the value of the `autovacuum_vacuum_cost_limit` parameter
163+
> - The `autovacuum_vacuum_cost_limit` value is distributed proportionally among the running autovacuum workers, so that if there is more than one, the sum of the limits for each worker doesn't exceed the value of the `autovacuum_vacuum_cost_limit` parameter.
164+
> - `autovacuum_vacuum_scale_factor` is another parameter which could trigger vacuum on a table based on dead tuple accumulation. Default: `0.2`, Allowed range: `0.05 - 0.1`. The scale factor is workload-specific and should be set depending on the amount of data in the tables. Before changing the value, investigate the workload and individual table volumes.
142165
143166
### Autovacuum constantly running
144167
