Commit 3462bde

addressed reviewer changes
1 parent 6ea3676 commit 3462bde

4 files changed

+53
-53
lines changed

articles/postgresql/TOC.yml

Lines changed: 5 additions & 5 deletions
Original file line number | Diff line number | Diff line change
@@ -542,18 +542,18 @@
542542
- name: Troubleshoot high memory utilization
543543
href: flexible-server/how-to-high-memory-utilization.md
544544
displayName: High Memory Utilization
545-
- name: Troubleshoot high io utilization
545+
- name: Troubleshoot High IO utilization
546546
href: flexible-server/how-to-high-io-utilization.md
547547
displayName: High IOPS Utilization
548548
- name: Troubleshoot autovacuum
549549
href: flexible-server/how-to-autovacuum-tuning.md
550550
displayName: Autovacuum troubleshooting, tuning
551-
- name: Bulk Data Load Best Practices
551+
- name: Best practices for bulk data upload
552552
href: flexible-server/how-to-bulkload_data.md
553-
displayName: Bulk Data Load Best Practices
554-
- name: Best Practices For Faster Dump And Restore
553+
displayName: Best practices for bulk data upload
554+
- name: Best practices for pg_dump and restore
555555
href: flexible-server/how-to-pgdump-restore.md
556-
displayName: Best Practices For Faster Dump And Restore
556+
displayName: Best practices for pg_dump and restore
557557
- name: How-to guides
558558
items:
559559
- name: Manage a server

articles/postgresql/flexible-server/how-to-bulkload_data.md renamed to articles/postgresql/flexible-server/how-to-bulk-load-data.md

Lines changed: 16 additions & 16 deletions
@@ -1,5 +1,5 @@
11
---
2-
title: Bulk Data Uploads
2+
title: Bulk Data Uploads For Azure Database for PostgreSQL - Flexible Server
33
description: Best practices to bulk load data in Azure Database for PostgreSQL - Flexible Server
44
author: sarat0681
55
ms.author: sbalijepalli
@@ -11,7 +11,7 @@ ms.custom: template-how-to #Required; leave this attribute/value as-is.
1111
---
1212

1313

14-
# Bulk data load best practices
14+
# Best practices for bulk data upload for Azure Database for PostgreSQL - Flexible Server
1515

1616
There are two types of bulk loads:
1717
- Initial data load of an empty database
@@ -23,7 +23,7 @@ This article discusses various loading techniques along with best practices when
2323

2424
Performance-wise, the data loading methods, arranged from most time consuming to least time consuming, are as follows:
2525
- Single record Insert
26-
- Batch into 100-1000 rows per commit. One can use transaction block to wrap multiple records per commit [Batch Inserts]
26+
- Batch into 100-1000 rows per commit. One can use transaction block to wrap multiple records per commit
2727
- INSERT with multi row values
2828
- COPY command
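
As a rough illustration of the "100-1000 rows per commit" item in the list above, the following Python sketch splits a stream of rows into fixed-size batches. The function name `batches` and the default size of 500 are assumptions made for this example; they don't come from the article. Sending each batch to the server inside a single transaction is left to whatever database driver you use and isn't shown here.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple


def batches(rows: Iterable[Tuple], batch_size: int = 500) -> Iterator[List[Tuple]]:
    """Yield successive lists of at most batch_size rows (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    it = iter(rows)
    while True:
        # Pull the next chunk from the iterator; an empty chunk means we're done.
        chunk = list(islice(it, batch_size))
        if not chunk:
            return
        yield chunk


if __name__ == "__main__":
    sample_rows = [(i, f"name-{i}") for i in range(1200)]
    for n, batch in enumerate(batches(sample_rows, batch_size=500), start=1):
        # In a real load, each batch would be inserted and committed as one transaction.
        print(f"batch {n}: {len(batch)} rows")
```

Batching only reduces the per-commit overhead of row-by-row inserts; the ordering in the list above still places the COPY command as the least time-consuming option.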
2929

@@ -33,27 +33,27 @@ The preferred method to load the data into the database is by copy command. If t
3333

3434
#### Drop indexes
3535

36-
Before an initial data load, it is advised to drop all the indexes in the tables. It is always more efficient to create the indexes after the data load.
36+
Before an initial data load, it's advised to drop all the indexes in the tables. It's always more efficient to create the indexes after the data load.
3737

3838
#### Drop constraints
3939

4040
##### Unique key constraints
4141

42-
To achieve strong performance, it's advised to drop unique key constraints before a initial data load and recreate it once the data load is completed. However, be aware that dropping unique key constraints cancels the safeguards against duplicated data.
42+
To achieve strong performance, it's advised to drop unique key constraints before an initial data load, and recreate it once the data load is completed. However, dropping unique key constraints cancels the safeguards against duplicated data.
4343

4444
##### Foreign key constraints
4545

4646
It's advised to drop foreign key constraints before initial data load and recreate once data load is completed.
4747

48-
Changing the `session_replication_role` parameter to replica also disables all foreign key checks.However, be aware making the change can leave data in an inconsistent state if not properly used.
48+
Changing the `session_replication_role` parameter to replica also disables all foreign key checks. However, be aware making the change can leave data in an inconsistent state if not properly used.
4949

5050
#### Unlogged tables
5151

52-
Use of unlogged tables will make data load faster. Data written to unlogged tables is not written to the write-ahead log.
52+
Use of unlogged tables will make data load faster. Data written to unlogged tables isn't written to the write-ahead log.
5353

5454
The disadvantages of using unlogged tables are
55-
- They are not crash-safe. An unlogged table is automatically truncated after a crash or unclean shutdown.
56-
- Data from unlogged tables cannot be replicated to standby servers.
55+
- They aren't crash-safe. An unlogged table is automatically truncated after a crash or unclean shutdown.
56+
- Data from unlogged tables can't be replicated to standby servers.
5757

5858
The pros and cons of using unlogged tables should be considered before using in initial data loads.
5959

@@ -84,7 +84,7 @@ The maintenance_work_mem can be set to a maximum of 2 GB on a flexible server. `
8484

8585
`checkpoint_timeout`
8686

87-
On the flexible server, the checkpoint_timeout can be increased to maximum 24h from default 5 minutes. It is advised to increase the value to 1 hour before initial data loads on Flexible server.
87+
On the flexible server, the checkpoint_timeout can be increased to maximum 24 h from default 5 minutes. It's advised to increase the value to 1 hour before initial data loads on Flexible server.
8888

8989
`checkpoint_completion_target`
9090

@@ -96,12 +96,12 @@ The max_wal_size can be set to the maximum allowed value on the Flexible server,
9696

9797
`wal_compression`
9898

99-
wal_compression can be turned on. Enabling the parameter can reduce the WAL volume without increasing the risk of unrecoverable data corruption, but at the cost of some extra CPU spent on the compression during WAL logging and on the decompression during WAL replay.
99+
wal_compression can be turned on. Enabling the parameter can have some extra CPU cost spent on the compression during WAL logging and on the decompression during WAL replay.
100100

101101

102102
#### Flexible server recommendations
103103

104-
Before the start of initial data load on a Flexible server, it is recommended to
104+
Before the start of initial data load on a Flexible server, it's recommended to
105105

106106
- Disable high availability [HA] on the server. You can enable HA once initial load is completed on master/primary.
107107
- Create read replicas after initial data load is completed.
@@ -118,7 +118,7 @@ Sets the maximum number of workers that the system can support for parallel quer
118118

119119
`max_parallel_maintenance_workers`
120120

121-
Controls the maximum number of worker process, which can be used to CREATE INDEX.
121+
Controls the maximum number of worker processes, which can be used to CREATE INDEX.
122122

123123
One could also create the indexes by making recommended settings at the session level. An example of how it can be done at the session level is shown below:
124124

@@ -133,7 +133,7 @@ CREATE INDEX test_index ON test_table (test_column);
133133

134134
#### Table partitioning
135135

136-
It is always recommended to partition large tables. Some advantages of partitioning, especially during incremental loads:
136+
It's always recommended to partition large tables. Some advantages of partitioning, especially during incremental loads:
137137
- Creation of new partitions based on the new deltas makes it efficient to add new data to the table.
138138
- Maintenance of tables becomes easier. One can drop a partition during incremental data loads avoiding time-consuming deletes on large tables.
139139
- Autovacuum would be triggered only on partitions that were changed or added during incremental loads, which make maintaining statistics on the table easier.
@@ -145,7 +145,7 @@ Monitoring and maintaining table statistics is important for query performance o
145145
#### Index creation on foreign key constraints
146146

147147
Creating indexes on foreign keys in the child tables would be beneficial in the following scenarios:
148-
- Data updates or deletions in the parent table. When data is updated or deleted in the parent table lookups would be performed on the child table.To make lookups faster you could index foreign keys on the child table.
148+
- Data updates or deletions in the parent table. When data is updated or deleted in the parent table lookups would be performed on the child table. To make lookups faster, you could index foreign keys on the child table.
149149
- Queries, where we see join between parent and child tables on key columns.
150150

151151
#### Unused indexes
@@ -235,7 +235,7 @@ SELECT round (pg_wal_lsn_diff('LSN value when run second time','LSN value when r
235235

236236
`wal_compression`
237237

238-
wal_compression can be turned on. Enabling the parameter can reduce the WAL volume without increasing the risk of unrecoverable data corruption, but at the cost of some extra CPU spent on the compression during WAL logging and on the decompression during WAL replay.
238+
wal_compression can be turned on. Enabling the parameter can have some extra CPU cost spent on the compression during WAL logging and on the decompression during WAL replay.
239239

240240

241241
## Next steps

articles/postgresql/flexible-server/how-to-high-io-utilization.md

Lines changed: 24 additions & 24 deletions
@@ -1,5 +1,5 @@
11
---
2-
title: High IOPS Utilization
2+
title: High IOPS Utilization for Azure Database for PostgreSQL - Flexible Server
33
description: Troubleshooting guide for high IOPS utilization in Azure Database for PostgreSQL - Flexible Server
44
author: sarat0681
55
ms.author: sbalijepalli
@@ -9,7 +9,7 @@ ms.date: 08/16/2022
99
ms.custom: template-how-to #Required; leave this attribute/value as-is.
1010
---
1111

12-
# Troubleshoot high IOPS utilization in Azure Database for PostgreSQL - Flexible Server
12+
# Troubleshoot high IOPS utilization for Azure Database for PostgreSQL - Flexible Server
1313

1414
This article shows you how to quickly identify the root cause of high IOPS utilization and possible remedial actions to control IOPS utilization when using [Azure Database for PostgreSQL - Flexible Server](overview.md).
1515

@@ -19,15 +19,15 @@ In this article, you learn:
1919
- How to identify root causes, such as long-running queries, checkpoint timings, disruptive autovacuum daemon process, and high storage utilization.
2020
- How to resolve high IO utilization using Explain Analyze, tune checkpoint-related server parameters, and tune autovacuum daemon.
2121

22-
### Tools to identify high IO utilization
22+
## Tools to identify high IO utilization
2323

2424
Consider these tools to identify high IO utilization.
2525

26-
#### Azure metrics
26+
### Azure metrics
2727

2828
Azure Metrics is a good starting point to check the IO utilization for a specific date and time period. Metrics show how long the IO utilization stayed high. Compare the graphs of Write IOPs, Read IOPs, Read Throughput, and Write Throughput to find out times when the workload caused high IO utilization. For proactive monitoring, you can configure alerts on the metrics. For step-by-step guidance, see [Azure Metrics](./howto-alert-on-metrics.md).
2929

30-
#### Query store
30+
### Query store
3131

3232
Query Store automatically captures the history of queries and runtime statistics and retains them for your review. It slices the data by time to see temporal usage patterns. Data for all users, databases, and queries is stored in a database named azure_sys in the Azure Database for PostgreSQL instance. For step-by-step guidance, see [Query Store](./concepts-query-store.md).
3333

@@ -38,7 +38,7 @@ select * from query_store.qs_view qv where is_system_query is FALSE
3838
order by blk_read_time + blk_write_time desc limit 5;
3939
```
4040

41-
#### pg_stat_statements
41+
### pg_stat_statements
4242

4343
The pg_stat_statements extension helps identify queries that consume IO on the server.
4444

@@ -54,11 +54,11 @@ LIMIT 5;
5454
> [!NOTE]
5555
> When using query store or pg_stat_statements, enable the server parameter `track_io_timing` so that the columns blk_read_time and blk_write_time are populated. For more information about the **track_io_timing** parameter, review [Server Parameters](https://www.postgresql.org/docs/current/runtime-config-statistics.html).
5656
57-
### Identify root causes
57+
## Identify root causes
5858

5959
If IO consumption levels are high in general, the following could be possible root causes:
6060

61-
#### Long-running transactions
61+
### Long-running transactions
6262

6363
Long-running transactions can consume IO, which can lead to high IO utilization.
6464

@@ -71,13 +71,13 @@ WHERE pid <> pg_backend_pid() and state IN ('idle in transaction', 'active')
7171
ORDER BY duration DESC;
7272
```
7373

74-
#### Checkpoint timings
74+
### Checkpoint timings
7575

7676
High IO can also be seen in scenarios where a checkpoint is happening too frequently. One way to identify this is by checking the Postgres log file for the following log text "LOG: checkpoints are occurring too frequently."
7777

7878
You could also investigate by saving periodic snapshots of `pg_stat_bgwriter` together with a timestamp. From the saved snapshots, you can calculate the average checkpoint interval, the number of checkpoints requested, and the number of timed checkpoints.
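
The snapshot calculation can be sketched in Python as follows. This is a minimal example, assuming two snapshots have already been read from `pg_stat_bgwriter` and that the view exposes the `checkpoints_timed` and `checkpoints_req` counters (newer PostgreSQL major versions may expose them in a different view). The class `BgwriterSnapshot`, the function `checkpoint_stats`, and the sample numbers are hypothetical, and the sketch assumes the statistics weren't reset between the two snapshots.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class BgwriterSnapshot:
    """One saved row of checkpoint counters plus the time it was taken (hypothetical)."""
    taken_at: datetime
    checkpoints_timed: int
    checkpoints_req: int


def checkpoint_stats(older: BgwriterSnapshot, newer: BgwriterSnapshot) -> dict:
    """Compare two snapshots and summarize checkpoint activity between them."""
    timed = newer.checkpoints_timed - older.checkpoints_timed
    requested = newer.checkpoints_req - older.checkpoints_req
    total = timed + requested
    elapsed_s = (newer.taken_at - older.taken_at).total_seconds()
    # With no checkpoints in the window, the averages are undefined.
    avg_interval_s: Optional[float] = elapsed_s / total if total else None
    timed_share: Optional[float] = timed / total if total else None
    return {
        "timed": timed,
        "requested": requested,
        "avg_interval_s": avg_interval_s,
        "timed_share": timed_share,
    }


if __name__ == "__main__":
    # Sample values are made up for illustration.
    first = BgwriterSnapshot(datetime(2022, 8, 16, 9, 0), 100, 10)
    second = BgwriterSnapshot(datetime(2022, 8, 16, 10, 0), 109, 13)
    print(checkpoint_stats(first, second))
```

A low `timed_share` together with an average interval well below `checkpoint_timeout` would suggest that most checkpoints are being requested rather than time driven.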
7979

80-
#### Disruptive autovacuum daemon process
80+
### Disruptive autovacuum daemon process
8181

8282
Execute the below query to monitor autovacuum:
8383

@@ -90,15 +90,15 @@ The query is used to check how frequently the tables in the database are being v
9090
**autovacuum_count** : provides number of times the table was vacuumed.
9191
**autoanalyze_count**: provides number of times the table was analyzed.
9292

93-
### Resolve high IO utilization
93+
## Resolve high IO utilization
9494

9595
To resolve high IO utilization, there are three methods you could employ - using Explain Analyze, terminating long-running transactions, or tuning server parameters.
9696

97-
#### Explain Analyze
97+
### Explain Analyze
9898

9999
Once you identify the query that's consuming high IO, use **EXPLAIN ANALYZE** to further investigate the query and tune it. For more information about the **EXPLAIN ANALYZE** command, review [Explain Plan](https://www.postgresql.org/docs/current/sql-explain.html).
100100

101-
#### Terminating long running transactions
101+
### Terminating long running transactions
102102

103103
You could consider killing a long running transaction as an option.
104104

@@ -113,19 +113,19 @@ ORDER BY duration DESC;
113113

114114
You can also filter by other properties like `usename` (username), `datname` (database name) etc.
115115

116-
Once you have the session's PID you can terminate using the following query:
116+
Once you have the session's PID, you can terminate using the following query:
117117

118118
```sql
119119
SELECT pg_terminate_backend(pid);
120120
```
121121

122-
#### Server parameter tuning
122+
### Server parameter tuning
123123

124124
If it's observed that the checkpoint is happening too frequently, increase `max_wal_size` server parameter until most checkpoints are time driven, instead of requested. Eventually, 90% or more should be time based, and the interval between two checkpoints is close to the `checkpoint_timeout` set on the server.
125125

126-
##### `max_wal_size`
126+
`max_wal_size`
127127

128-
Peak business hours is a good time to arrive at `max_wal_size` value. Follow the below listed steps to arrive at a value.
128+
Peak business hours are a good time to arrive at `max_wal_size` value. Follow the below listed steps to arrive at a value.
129129

130130
Execute the below query to get current WAL LSN, note down the result:
131131

@@ -145,23 +145,23 @@ Execute below query that uses the two results to check the difference in GB:
145145
select round (pg_wal_lsn_diff ('LSN value when run second time', 'LSN value when run first time')/1024/1024/1024,2) WAL_CHANGE_GB;
146146
```
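
To make the arithmetic behind the query above concrete, here is a small Python sketch that reproduces the same calculation offline. It assumes the usual textual LSN form of two hexadecimal numbers separated by a slash (the high and low 32 bits of a 64-bit WAL position). The function names and the sample LSN values are hypothetical; on the server, `pg_wal_lsn_diff` remains the source of truth.

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert an LSN string such as '16/B374D848' to an absolute byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def wal_change_gb(first_lsn: str, second_lsn: str) -> float:
    """WAL generated between two LSN readings, in GB (1024^3 bytes), rounded to 2 places."""
    diff_bytes = lsn_to_bytes(second_lsn) - lsn_to_bytes(first_lsn)
    return round(diff_bytes / 1024 / 1024 / 1024, 2)


if __name__ == "__main__":
    # Sample LSNs are made up for illustration: they are exactly 2^32 bytes apart.
    print(wal_change_gb("16/B374D848", "17/B374D848"))
```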
147147

148-
##### `checkpoint_completion_target`
148+
`checkpoint_completion_target`
149149

150150
A good practice would be to set it to 0.9. As an example, a value of 0.9 for a `checkpoint_timeout` of 5 minutes indicates the target to complete a checkpoint is 270 sec [0.9*300 sec]. A value of 0.9 provides fairly consistent I/O load. An aggressive value of `checkpoint_completion_target` may result in increased IO load on the server.
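
The 270-second figure follows from multiplying the two settings. A minimal Python sketch of that arithmetic is shown here; the function name `checkpoint_target_seconds` is hypothetical and exists only for this illustration.

```python
def checkpoint_target_seconds(checkpoint_timeout_s: float, completion_target: float) -> float:
    """Time budget within which a checkpoint aims to finish its writes."""
    if not 0.0 <= completion_target <= 1.0:
        raise ValueError("completion_target must be between 0 and 1")
    return checkpoint_timeout_s * completion_target


if __name__ == "__main__":
    # checkpoint_timeout of 5 minutes (300 s) with a target of 0.9, as in the text above.
    print(checkpoint_target_seconds(300, 0.9))
    # The same target with a 1 hour timeout spreads the writes over a longer window.
    print(checkpoint_target_seconds(3600, 0.9))
```

A lower target concentrates the same checkpoint writes into a shorter window, which is why it tends to produce sharper IO spikes.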
151151

152-
##### `checkpoint_timeout`
152+
`checkpoint_timeout`
153153

154-
The `checkpoint_timeout` value can be increased from default value set on the server. Please note while increasing the `checkpoint_timeout` take into consideration that increasing the value would also increase the time for crash recovery.
154+
The `checkpoint_timeout` value can be increased from default value set on the server. Note while increasing the `checkpoint_timeout` take into consideration that increasing the value would also increase the time for crash recovery.
155155

156-
#### Autovacuum tuning to decrease disruptions
156+
### Autovacuum tuning to decrease disruptions
157157

158-
For more details on monitoring and tuning in scenarios where autovacuum is too disruptive please review [Autovacuum Tuning](./how-to-autovacuum-tuning.md).
158+
For more details on monitoring and tuning in scenarios where autovacuum is too disruptive review [Autovacuum Tuning](./how-to-autovacuum-tuning.md).
159159

160-
#### Increase storage
160+
### Increase storage
161161

162162
Increasing storage also adds more IOPS to the server. For more details on storage and associated IOPS, review [Compute and Storage Options](./concepts-compute-storage.md).
163163

164-
### Next steps
164+
## Next steps
165165

166166
- Troubleshoot and tune Autovacuum [Autovacuum Tuning](./how-to-autovacuum-tuning.md)
167167
- Compute and Storage Options [Compute and Storage Options](./concepts-compute-storage.md)
