Commit 40e9924

fixing blocking and non-blocking
1 parent 2b76d60 commit 40e9924

4 files changed

+60
-64
lines changed

articles/postgresql/TOC.yml

Lines changed: 5 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -530,19 +530,19 @@
530530
href: flexible-server/concepts-azure-advisor-recommendations.md
531531
- name: Troubleshooting
532532
items:
533-
- name: Functional Troubleshooting
533+
- name: Functional troubleshooting
534534
items:
535535
- name: Troubleshoot CLI errors
536536
href: flexible-server/how-to-troubleshoot-cli-errors.md
537-
- name: Performance Troubleshooting
537+
- name: Performance troubleshooting
538538
items:
539-
- name: Troubleshoot High CPU Utilization
539+
- name: Troubleshoot high CPU utilization
540540
href: flexible-server/how-to-high-cpu-utilization.md
541541
displayName: High CPU Utilization
542-
- name: Troubleshoot High Memory Utilization
542+
- name: Troubleshoot high memory utilization
543543
href: flexible-server/how-to-high-memory-utilization.md
544544
displayName: High Memory Utilization
545-
- name: Troubleshoot Autovacuum
545+
- name: Troubleshoot autovacuum
546546
href: flexible-server/how-to-autovacuum-tuning.md
547547
displayName: Autovacuum troubleshooting, tuning
548548
- name: How-to guides

articles/postgresql/flexible-server/how-to-autovacuum-tuning.md

Lines changed: 31 additions & 33 deletions
@@ -6,21 +6,23 @@ author: sarat0681
66
ms.service: postgresql
77
ms.subservice: flexible-server
88
ms.topic: conceptual
9-
ms.date: 7/28/2022
9+
ms.date: 08/03/2022
1010
---
1111

12-
# Autovacuum Tuning
12+
# Autovacuum Tuning in Azure Database for PostgreSQL - Flexible Server
1313

14-
## What is Autovacuum
14+
This article provides an overview of the autovacuum feature for [Azure Database for PostgreSQL - Flexible Server](overview.md).
15+
16+
## What is autovacuum
1517

1618
Internal data consistency in PostgreSQL is based on the Multi-Version Concurrency Control (MVCC) mechanism, which allows the database engine to maintain multiple versions of a row and provides greater concurrency with minimal blocking between the different processes.
1719

18-
PostgreSQL databases need appropriate maintenance. For example, when a row is deleted, it is not removed physically. Instead, the row is marked as “dead”. Similarly for updates, the row is marked as "dead" and a new version of the row is inserted. These operations leave behind dead records, called dead tuples, even after all the transactions that might see those versions finish. Unless cleaned up, dead tuples remain, consuming disk space and bloating tables and indexes which result in slow query performance.
20+
PostgreSQL databases need appropriate maintenance. For example, when a row is deleted, it is not removed physically. Instead, the row is marked as “dead”. Similarly for updates, the row is marked as "dead" and a new version of the row is inserted. These operations leave behind dead records, called dead tuples, even after all the transactions that might see those versions finish. Unless cleaned up, dead tuples remain, consuming disk space and bloating tables and indexes which result in slow query performance.
1921

2022
PostgreSQL uses a process called autovacuum to automatically clean up dead tuples.
2123

2224

23-
## Autovacuum Internals
25+
## Autovacuum internals
2426

2527
Autovacuum reads pages looking for dead tuples, and if none are found, autovacuum discards the page. When autovacuum finds dead tuples, it removes them. The cost is based on:
2628

@@ -47,7 +49,7 @@ That means in one-second autovacuum can do:
4749

4850

4951

50-
## Monitoring Autovacuum
52+
## Monitoring autovacuum
5153

5254
Use the following queries to monitor autovacuum:
5355

@@ -64,7 +66,7 @@ The following columns help determine if autovacuum is catching up to table activ
6466
- **Last_autoanalyze**: The date of the last time the table was automatically analyzed.
6567

6668

67-
## When Does PostgreSQL Trigger Autovacuum
69+
## When does PostgreSQL trigger autovacuum
6870

6971
An autovacuum action (either *ANALYZE* or *VACUUM*) triggers when the number of dead tuples exceeds a particular number that is dependent on two factors: the total count of rows in a table, plus a fixed threshold. *ANALYZE*, by default, triggers when 10% of the table plus 50 rows changes, while *VACUUM* triggers when 20% of the table plus 50 rows changes. Since the *VACUUM* threshold is twice as high as the *ANALYZE* threshold, *ANALYZE* gets triggered much earlier than *VACUUM*.
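As a worked example of the thresholds described above, the following sketch computes how many changed rows would trigger *ANALYZE* and *VACUUM* for a few table sizes. It assumes the default values quoted in this article (a fixed threshold of 50 rows and scale factors of 0.1 and 0.2); the row counts are purely illustrative.

```postgresql
-- Illustrative only: assumes the default fixed threshold (50 rows) and the
-- default scale factors (0.1 for ANALYZE, 0.2 for VACUUM).
SELECT n            AS row_count,
       50 + 0.1 * n AS analyze_trigger_rows,
       50 + 0.2 * n AS vacuum_trigger_rows
FROM (VALUES (100), (1000000), (1000000000)) AS t(n);
```

For the 100-row table, this works out to 60 changed rows for *ANALYZE* and 70 for *VACUUM*.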
7072

@@ -122,11 +124,11 @@ Use the following query to list the tables in a database and identify the tables
122124
> The query does not take into consideration that autovacuum can be configured on a per-table basis using the "alter table" DDL command. 
123125
124126

125-
## Common Autovacuum Problems
127+
## Common autovacuum problems
126128

127129
Review the possible common problems with the autovacuum process.
128130

129-
### Not Keeping Up With Busy Server
131+
### Not keeping up with busy server
130132

131133
The autovacuum process estimates the cost of every I/O operation, accumulates a total for each operation it performs and pauses once the upper limit of the cost is reached. `autovacuum_vacuum_cost_delay` and `autovacuum_vacuum_cost_limit` are the two server parameters that are used in the process.
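To see what these two parameters are currently set to, a query along the following lines can be run against the standard `pg_settings` view. This is a minimal sketch for inspection only; it does not change any values.

```postgresql
-- Show the current cost-based throttling settings used by autovacuum.
SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('autovacuum_vacuum_cost_delay',
               'autovacuum_vacuum_cost_limit');
```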
132134

@@ -137,8 +139,6 @@ If `autovacuum_vacuum_cost_limit` is set to `-1` then autovacuum uses the `v
137139

138140
In case the autovacuum is not keeping up, the following parameters may be changed:
139141

140-
141-
142142
|Parameter |Description |
143143
|---------|---------|
144144
|`autovacuum_vacuum_scale_factor`| Default: `0.2`, range: `0.05 - 0.1`. The scale factor is workload-specific and should be set depending on the amount of data in the tables. Before changing the value, investigate the workload and individual table volumes. |
@@ -148,37 +148,36 @@ In case the autovacuum is not keeping up, the following parameters may be change
148148
> [!NOTE]
149149
> The `autovacuum_vacuum_cost_limit` value is distributed proportionally among the running autovacuum workers, so that if there is more than one, the sum of the limits for each worker does not exceed the value of the `autovacuum_vacuum_cost_limit` parameter.
150150
151-
### Autovacuum Constantly Running
151+
### Autovacuum constantly running
152152

153153
Continuously running autovacuum may affect CPU and IO utilization on the server. The following might be possible reasons:
154154

155-
##### `maintenance_work_mem`
155+
#### `maintenance_work_mem`
156156

157157
The autovacuum daemon uses `autovacuum_work_mem`, which is set to `-1` by default, meaning that `autovacuum_work_mem` takes the same value as the `maintenance_work_mem` parameter. This document assumes `autovacuum_work_mem` is set to `-1` and `maintenance_work_mem` is used by the autovacuum daemon.
158158

159159
If `maintenance_work_mem` is low, it may be increased to up to 2 GB on Flexible Server. A general rule of thumb is to allocate 50 MB to `maintenance_work_mem` for every 1 GB of RAM. 
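The rule of thumb above can be sketched as simple arithmetic. The RAM sizes below are hypothetical, and the 2 GB ceiling is expressed as 2048 MB; treat the result as a starting point rather than a recommendation for any particular workload.

```postgresql
-- Hypothetical RAM sizes; 50 MB per 1 GB of RAM, capped at 2 GB (2048 MB).
SELECT ram_gb,
       LEAST(ram_gb * 50, 2048) AS suggested_maintenance_work_mem_mb
FROM (VALUES (4), (16), (64)) AS t(ram_gb);
```

With these inputs, the 4 GB and 16 GB servers come out at 200 MB and 800 MB, while the 64 GB server hits the 2048 MB cap.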
160160

161161

162-
##### Large Number Of Databases
162+
#### Large number of databases
163163

164164
Autovacuum tries to start a worker on each database every `autovacuum_naptime` seconds.
165165

166166
For example, if a server has 60 databases and `autovacuum_naptime` is set to 60 seconds, then the autovacuum worker starts every second [autovacuum_naptime/Number of DBs].
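The same calculation can be approximated on a live server. The sketch below assumes that counting databases with `datallowconn` set is a reasonable stand-in for the databases autovacuum visits; it reads `autovacuum_naptime` from `pg_settings`, where the value is reported in seconds.

```postgresql
-- Approximate interval between autovacuum worker launches:
-- autovacuum_naptime (seconds) divided by the number of connectable databases.
SELECT s.setting::numeric                      AS naptime_seconds,
       count(*)                                AS database_count,
       round(s.setting::numeric / count(*), 2) AS seconds_between_worker_starts
FROM pg_settings AS s
CROSS JOIN pg_database AS d
WHERE s.name = 'autovacuum_naptime'
  AND d.datallowconn
GROUP BY s.setting;
```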
167167

168-
169168
It is a good idea to increase `autovacuum_naptime` if a cluster has many databases. At the same time, the autovacuum process can be made more aggressive by increasing `autovacuum_vacuum_cost_limit`, decreasing `autovacuum_vacuum_cost_delay`, and increasing `autovacuum_max_workers` from the default of 3 to 4 or 5.
170169

171170

172-
### Out Of Memory Errors
171+
### Out of memory errors
173172

174173
Overly aggressive `maintenance_work_mem` values could periodically cause out-of-memory errors in the system. It is important to understand available RAM on the server before any change to the `maintenance_work_mem` parameter is made.
175174

176175

177-
### Autovacuum Is Too Disruptive
176+
### Autovacuum is too disruptive
178177

179178
If autovacuum is consuming a lot of resources, the following can be done:
180179

181-
##### Autovacuum Parameters
180+
#### Autovacuum parameters
182181

183182
Evaluate the parameters `autovacuum_vacuum_cost_delay`, `autovacuum_vacuum_cost_limit`, `autovacuum_max_workers`. Improperly setting autovacuum parameters may lead to scenarios where autovacuum becomes too disruptive.
184183

@@ -187,7 +186,7 @@ If autovacuum is too disruptive, consider the following:
187186
- Increase `autovacuum_vacuum_cost_delay` and reduce `autovacuum_vacuum_cost_limit` if set higher than the default of 200.
188187
- Reduce the number of `autovacuum_max_workers` if it is set higher than the default of 3. 
189188

190-
#### Too Many Autovacuum Workers
189+
#### Too many autovacuum workers
191190

192191
Increasing the number of autovacuum workers will not necessarily increase the speed of vacuum. Having a high number of autovacuum workers is not recommended.
193192

@@ -199,7 +198,7 @@ If the number of workers is increased, `autovacuum_vacuum_cost_limit` should als
199198

200199
However, if the table-level `autovacuum_vacuum_cost_delay` or `autovacuum_vacuum_cost_limit` parameters have been changed, the workers running on those tables are excluded from the balancing algorithm [autovacuum_vacuum_cost_limit/autovacuum_max_workers].
201200

202-
### Autovacuum Transaction ID (TXID) Wraparound Protection
201+
### Autovacuum transaction ID (TXID) wraparound protection
203202

204203
When a database runs into transaction ID wraparound protection, an error message like the following can be observed:
205204

@@ -214,18 +213,17 @@ Stop the postmaster and vacuum that database in single-user mode.
214213

215214
The wraparound problem occurs when the database is either not vacuumed or there are too many dead tuples that could not be removed by autovacuum. The reasons for this might be:
216215

217-
#### Heavy Workload
216+
#### Heavy workload
218217

219218
The workload could generate too many dead tuples in a brief period, making it difficult for autovacuum to catch up. Dead tuples accumulate over time, degrading query performance and eventually leading to a wraparound situation. One reason for this might be that autovacuum parameters aren't adequately set, so autovacuum is not keeping up with a busy server.
220219

221220

222-
#### Long Running Transactions
221+
#### Long-running transactions
223222

224223
Any long-running transactions in the system will not allow dead tuples to be removed while autovacuum is running. They're a blocker to the vacuum process. Removing the long-running transactions frees up dead tuples for deletion when autovacuum runs.
225224

226225
Long-running transactions can be detected using the following query:
227226

228-
229227
```postgresql
230228
SELECT pid, age(backend_xid) AS age_in_xids,
231229
now () - xact_start AS xact_age,
@@ -237,8 +235,8 @@ Long-running transactions can be detected using the following query:
237235
ORDER BY 2 DESC
238236
LIMIT 10;
239237
```
240-
241-
#### Prepared Statements
238+
239+
#### Prepared statements
242240

243241
If there are prepared statements that are not committed, they would prevent dead tuples from being removed.
244242
The following query helps find non-committed prepared statements:
@@ -251,7 +249,7 @@ The following query helps find non-committed prepared statements:
251249

252250
Use COMMIT PREPARED or ROLLBACK PREPARED to commit or roll back these statements.
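A minimal sketch of that cleanup follows. The identifier `'example_gid'` is a placeholder, not a real value; substitute a `gid` returned by the `pg_prepared_xacts` view. These commands can't run inside a transaction block and need to be issued while connected to the database where the transaction was prepared.

```postgresql
-- List outstanding prepared transactions, oldest first.
SELECT gid, prepared, owner, database
FROM pg_prepared_xacts
ORDER BY prepared;

-- 'example_gid' is a placeholder for a gid from the query above.
ROLLBACK PREPARED 'example_gid';
-- Or, to keep the changes instead:
-- COMMIT PREPARED 'example_gid';
```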
253251

254-
#### Unused Replication Slots
252+
#### Unused replication slots
255253

256254
Unused replication slots prevent autovacuum from claiming dead tuples. The following query helps identify unused replication slots:
257255

@@ -266,11 +264,10 @@ Use `pg_drop_replication_slot()` to delete unused replication slots.
266264
When the database runs into transaction ID wraparound protection, check for any blockers as mentioned previously, and remove them manually so that autovacuum can continue and complete. You can also increase the speed of autovacuum by setting `autovacuum_vacuum_cost_delay` to 0 and increasing `autovacuum_vacuum_cost_limit` to a value much greater than 200. However, changes to these parameters will not be applied to existing autovacuum workers. Either restart the database or kill existing workers manually to apply parameter changes.
267265

268266

269-
### Table-specific Requirements
267+
### Table-specific requirements
270268

271269
Autovacuum parameters may be set for individual tables. This is especially important for very small and very large tables. For example, for a small table that contains only 100 rows, autovacuum triggers a VACUUM operation when 70 rows change (as calculated previously). If this table is frequently updated, you might see hundreds of autovacuum operations a day. This will prevent autovacuum from maintaining other tables on which the percentage of changes isn't as big. Conversely, a table containing a billion rows needs to change 200 million rows to trigger autovacuum operations. Setting autovacuum parameters appropriately prevents such scenarios.
272270

273-
274271
To configure autovacuum settings per table, set the corresponding parameters on the table, as in the following examples:
275272

276273
```postgresql
@@ -281,7 +278,8 @@ To set autovacuum setting per table, change the server parameters as the follo
281278
ALTER TABLE <table name> SET (autovacuum_vacuum_cost_delay = xx); 
282279
ALTER TABLE <table name> SET (autovacuum_vacuum_cost_limit = xx); 
283280
```
284-
### Insert-only Workloads 
281+
282+
### Insert-only workloads 
285283

286284
In versions of PostgreSQL prior to 13, autovacuum will not run on tables with an insert-only workload, because if there are no updates or deletes, there are no dead tuples and no free space that needs to be reclaimed. However, autoanalyze will run for insert-only workloads since there is new data. The disadvantages of this are:
287285

@@ -291,19 +289,19 @@ In versions of PostgreSQL prior to 13, autovacuum will not run on tables wit
291289

292290
#### Solutions 
293291

294-
##### Postgres Versions prior to 13 
292+
##### Postgres versions prior to 13 
295293

296294
Using the **pg_cron** extension, a cron job can be set up to schedule a periodic vacuum analyze on the table. The frequency of the cron job depends on the workload.  
297295

298296
For step-by-step guidance using pg_cron, review [Extensions](./concepts-extensions.md).
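As a rough illustration of the pg_cron approach, the sketch below schedules a nightly `VACUUM ANALYZE`. It assumes the pg_cron extension is already enabled on the server; the table name `public.insert_only_table` and the 03:00 schedule are hypothetical.

```postgresql
-- Assumes pg_cron is enabled; table name and schedule are placeholders.
SELECT cron.schedule('0 3 * * *',
                     'VACUUM ANALYZE public.insert_only_table');

-- Review the jobs that are currently scheduled.
SELECT jobid, schedule, command
FROM cron.job;
```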
299297

300298

301-
##### Postgres 13 and Higher Versions
299+
##### Postgres 13 and higher versions
302300

303301
Autovacuum will run on tables with an insert-only workload. Two new server parameters `autovacuum_vacuum_insert_threshold` and  `autovacuum_vacuum_insert_scale_factor` help control when autovacuum can be triggered on insert-only tables. 
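The following sketch shows one way to inspect these two parameters and override them for a single table on PostgreSQL 13 or later. The table name and the values are hypothetical and should be chosen based on the actual insert rate.

```postgresql
-- Inspect the server-wide values (PostgreSQL 13 and later).
SELECT name, setting
FROM pg_settings
WHERE name IN ('autovacuum_vacuum_insert_threshold',
               'autovacuum_vacuum_insert_scale_factor');

-- Hypothetical per-table override for a large insert-only table.
ALTER TABLE public.insert_only_table
  SET (autovacuum_vacuum_insert_scale_factor = 0.05,
       autovacuum_vacuum_insert_threshold = 10000);
```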
304302

305303
## Next steps
306304

307-
- Troubleshoot High CPU Utilization [High CPU Utilization](./how-to-high-cpu-utilization.md).
308-
- Troubleshoot High Memory Utilization [High Memory Utilization](./how-to-high-memory-utilization.md).
305+
- Troubleshoot high CPU utilization [High CPU Utilization](./how-to-high-cpu-utilization.md).
306+
- Troubleshoot high memory utilization [High Memory Utilization](./how-to-high-memory-utilization.md).
309307
- Configure server parameters [Server Parameters](./howto-configure-server-parameters-using-portal.md).

articles/postgresql/flexible-server/how-to-high-cpu-utilization.md

Lines changed: 17 additions & 19 deletions
@@ -6,23 +6,23 @@ author: sarat0681
66
ms.service: postgresql
77
ms.subservice: flexible-server
88
ms.topic: conceptual
9-
ms.date: 7/28/2022
9+
ms.date: 08/03/2022
1010
---
1111

1212
# Troubleshoot high CPU utilization in Azure Database for PostgreSQL - Flexible Server
1313

14-
1514
This article shows you how to quickly identify the root cause of high CPU utilization, and possible remedial actions to control CPU utilization when using [Azure Database for PostgreSQL - Flexible Server](overview.md).
1615

17-
1816
In this article, you will learn:
1917

2018
- About tools to identify high CPU utilization such as Azure Metrics, Query Store, and pg_stat_statements.
2119
- How to identify root causes, such as long running queries and total connections.
2220
- How to resolve high CPU utilization by using Explain Analyze, Connection Pooling, and Vacuuming tables.
2321

2422

25-
## Tools to Identify high CPU Utilization
23+
## Tools to identify high CPU utilization
24+
25+
Consider these tools to identify high CPU utilization.
2626

2727
### Azure Metrics
2828

@@ -39,7 +39,7 @@ The pg_stat_statements extension helps identify queries that consume time on the
3939

4040
#### Mean or average execution time
4141

42-
# [Postgres v13 & above](#tab/postgres-13)
42+
##### [Postgres v13 & above](#tab/postgres-13)
4343

4444

4545
For Postgres versions 13 and above, use the following statement to view the top five SQL statements by mean or average execution time:
@@ -52,7 +52,7 @@ DESC LIMIT 5;
5252
```
5353

5454

55-
# [Postgres v9.6-12 & above](#tab/postgres9-12)
55+
##### [Postgres v9.6-12 & above](#tab/postgres9-12)
5656

5757
For Postgres versions 9.6, 10, 11, and 12, use the following statement to view the top five SQL statements by mean or average execution time:
5858

@@ -70,7 +70,7 @@ DESC LIMIT 5;
7070

7171
Execute the following statements to view the top five SQL statements by total execution time.
7272

73-
# [Postgres v13 & above](#tab/postgres-13)
73+
##### [Postgres v13 & above](#tab/postgres-13)
7474

7575
For Postgres versions 13 and above, use the following statement to view the top five SQL statements by total execution time:
7676

@@ -81,11 +81,10 @@ ORDER BY total_exec_time
8181
DESC LIMIT 5;
8282
```
8383

84-
# [Postgres v9.6-12 & above](#tab/postgres9-12)
84+
##### [Postgres v9.6-12 & above](#tab/postgres9-12)
8585

8686
For Postgres versions 9.6, 10, 11, and 12, use the following statement to view the top five SQL statements by total execution time:
8787

88-
8988
```postgresql
9089
SELECT userid::regrole, dbid, query
9190
FROM pg_stat_statements
@@ -96,14 +95,14 @@ DESC LIMIT 5;
9695
---
9796

9897

99-
## Identify Root Causes
98+
## Identify root causes
10099

101100
If CPU consumption levels are high in general, the following could be possible root causes:
102101

103102

104-
### Long Running Transactions
103+
### Long-running transactions
105104

106-
Long running transactions can consume CPU resources that can lead to high CPU utilization.
105+
Long-running transactions can consume CPU resources that can lead to high CPU utilization.
107106

108107
The following query helps identify connections running for the longest time:
109108

@@ -114,7 +113,7 @@ WHERE pid <> pg_backend_pid() and state IN ('idle in transaction', 'active')
114113
ORDER BY duration DESC;
115114
```
116115

117-
### Total Number of Connections and Number Connections by State
116+
### Total number of connections and number of connections by state
118117

119118
A large number of connections to the database is another issue that might lead to increased CPU and memory utilization.
120119

@@ -128,8 +127,7 @@ WHERE pid <> pg_backend_pid()
128127
GROUP BY 1 ORDER BY 1;
129128
```
130129

131-
132-
## Resolve High CPU Utilization
130+
## Resolve high CPU utilization
133131

134132
Use Explain Analyze, connection pooling with PgBouncer, and termination of long-running transactions to resolve high CPU utilization.
135133

@@ -139,7 +137,7 @@ Once you know the query that's running for a long time, use **EXPLAIN** to furth
139137
For more information about the **EXPLAIN** command, review [Explain Plan](https://www.postgresql.org/docs/current/sql-explain.html).
140138

141139

142-
### PGBouncer And Connection Pooling
140+
### PGBouncer and connection pooling
143141

144142
In situations where there are many idle connections, or many connections that are consuming CPU, consider using a connection pooler like PgBouncer.
145143

@@ -152,7 +150,7 @@ For more details about PgBouncer, review:
152150
Azure Database for PostgreSQL - Flexible Server offers PgBouncer as a built-in connection pooling solution. For more information, see [PgBouncer](./concepts-pgbouncer.md).
153151

154152

155-
### Terminating Long Running Transactions
153+
### Terminating long running transactions
156154

157155
You could consider killing a long running transaction as an option.
158156

@@ -165,15 +163,15 @@ WHERE pid <> pg_backend_pid() and state IN ('idle in transaction', 'active')
165163
ORDER BY duration DESC;
166164
```
167165

168-
You can also filter by other properties like usename (username), datname (database name) etc.
166+
You can also filter by other properties like `usename` (username), `datname` (database name) etc.
169167

170168
Once you have the session's PID, you can terminate it using the following query:
171169

172170
```postgresql
173171
SELECT pg_terminate_backend(pid);
174172
```
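The lookup and the termination can also be combined into one statement. The sketch below is an assumption-laden example rather than a recommended policy: it targets only sessions that are idle in a transaction, and the one-hour cutoff is arbitrary. Run the `SELECT` without `pg_terminate_backend` first to confirm which sessions would be affected.

```postgresql
-- Terminate sessions that have been idle in a transaction for over an hour.
-- The one-hour cutoff is an arbitrary example value.
SELECT pid, usename, datname,
       pg_terminate_backend(pid) AS terminated
FROM pg_stat_activity
WHERE pid <> pg_backend_pid()
  AND state = 'idle in transaction'
  AND now() - xact_start > interval '1 hour';
```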
175173

176-
### Monitoring Vacuum And Table Stats
174+
### Monitoring vacuum and table stats
177175

178176
Keeping table statistics up to date helps improve query performance. Monitor whether regular autovacuuming is being carried out.
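One way to check this is to query the standard `pg_stat_user_tables` view, as in the sketch below. Listing the ten tables with the most dead tuples is an arbitrary choice for illustration.

```postgresql
-- Tables with the most dead tuples, with their last vacuum and analyze times.
SELECT schemaname, relname,
       n_live_tup, n_dead_tup,
       last_autovacuum, last_autoanalyze
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
```

Tables with a high `n_dead_tup` and an old or empty `last_autovacuum` are candidates for closer inspection.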
179177
