Commit 9a43efc

Merge pull request #268969 from GayathriPaderla/main
[PostgreSQL] New PG Partman article
2 parents a7a56b2 + e788cc0 commit 9a43efc

10 files changed

+254
-0
lines changed

articles/postgresql/TOC.yml

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -425,6 +425,12 @@
425425
- name: Optimize performance when using pgvector
426426
href: flexible-server/how-to-optimize-performance-pgvector.md
427427
displayName: vector databases
428+
- name: Partitioning
429+
items:
430+
- name: Partitioning using pg_partman
431+
href: flexible-server/how-to-use-pg-partman.md
432+
displayName: pg_partman
433+
- name: Migration
428434
- name: Migration Service
429435
items:
430436
- name: Overview
Lines changed: 248 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,248 @@
1+
---
2+
title: How to enable and use pg_partman - Azure Database for PostgreSQL - Flexible Server
3+
description: How to enable and use pg_partman on Azure Database for PostgreSQL - Flexible Server
4+
ms.author: gapaderla
5+
author: GayathriPaderla
6+
ms.reviewer: sbalijepalli
7+
ms.service: postgresql
8+
ms.subservice: flexible-server
9+
ms.topic: how-to
10+
ms.date: 03/14/2024
11+
---
12+
13+
# How to enable and use `pg_partman` on Azure Database for PostgreSQL - Flexible Server
14+
15+
**Optimize Azure Database for PostgreSQL Flexible Server by using pg_partman**  
16+
17+
When tables in a database grow large, it becomes hard to manage how often they're vacuumed, how much space they take up, and how to keep their indexes efficient. This can slow down queries and degrade performance. Partitioning large tables is a solution for these situations. In this article, you learn how to use the pg_partman extension to create range-based partitions of tables in your Azure Database for PostgreSQL Flexible Server.  
18+
19+
## Prerequisites
20+
21+
To enable the pg_partman extension, follow these steps.

- Add pg_partman to the `azure.extensions` server parameter in the Azure portal, as shown in the following screenshot. Then connect to your database and create the extension.
24+
25+
:::image type="content" source="media/how-to-use-pg-partman/pg-partman-prerequisites.png" alt-text="Screenshot of prerequisites.":::
26+
27+
```sql
CREATE EXTENSION pg_partman;
```
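
As a quick check after the step above, a query along these lines can confirm that the extension exists and show which schema holds its objects. It's a minimal sketch that uses only the standard `pg_extension` and `pg_namespace` catalogs. The schema it reports is the one to use when qualifying pg_partman functions and tables later in this article.

```sql
-- Confirm that pg_partman is installed and note the schema that holds its objects.
SELECT e.extname,
       e.extversion,
       n.nspname AS extension_schema
FROM pg_extension AS e
JOIN pg_namespace AS n ON n.oid = e.extnamespace
WHERE e.extname = 'pg_partman';
```

If the query returns no rows, the extension hasn't been created in the current database.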
30+
31+
## Overview
32+
33+
When an identity column uses sequences, rows inserted through the parent table get a new sequence value. Rows inserted directly into a child table don't generate new sequence values. 
34+
35+
pg_partman uses a template table to control whether a table is UNLOGGED. This means that the `ALTER TABLE` command can't change this status for a partition set. By changing the status on the template, you apply it to all future partitions. For existing child tables, you must run the `ALTER TABLE` command manually. [This bug report](https://www.postgresql.org/message-id/flat/15954-b61523bed4b110c4%40postgresql.org) shows why.    
36+
37+
There's another extension related to pg_partman called pg_partman_bgw, which must be included in `shared_preload_libraries`. It runs the run_maintenance() function on a schedule, and it takes care of the partition sets that have `automatic_maintenance` turned ON in `part_config`.
38+
39+
:::image type="content" source="media/how-to-use-pg-partman/pg-partman-prerequisites-outlined.png" alt-text="Screenshot of prerequisites highlighted.":::
40+
41+
You can use server parameters in the Azure portal to change the following configuration options that affect the BGW process: 
42+
43+
`pg_partman_bgw.dbname` - Required. This parameter should contain one or more databases that run_maintenance() needs to run on. If there's more than one, use a comma-separated list. If nothing is set, BGW doesn't run the procedure. 
44+
45+
`pg_partman_bgw.interval` - Number of seconds between calls to the run_maintenance() procedure. Default is 3600 (1 hour). Adjust it to suit the requirements of your project. 
46+
47+
`pg_partman_bgw.role` - The role that run_maintenance() procedure runs as. Default is postgres. Only a single role name is allowed. 
48+
49+
`pg_partman_bgw.analyze` - By default, it's set to OFF. Same purpose as the p_analyze argument to run_maintenance(). 
50+
51+
`pg_partman_bgw.jobmon` - Same purpose as the p_jobmon argument to run_maintenance(). By default, it's set to ON. 
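
To see the values currently in effect for the parameters above, a query such as the following can help. It's a minimal sketch that reads the standard `pg_settings` view. It assumes that pg_partman_bgw is loaded through `shared_preload_libraries`, because otherwise these parameters might not appear at all.

```sql
-- List the background worker settings described above.
-- No rows usually means pg_partman_bgw isn't loaded on this server.
SELECT name, setting, unit
FROM pg_settings
WHERE name LIKE 'pg_partman_bgw.%'
ORDER BY name;
```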
52+
53+
## Permissions 
54+
55+
pg_partman doesn't require a superuser role to run. The only requirement is that the role that runs pg_partman functions owns all the partition sets and the schemas where new objects are created. We recommend that you create a separate role for pg_partman and give it ownership of the schema and all the objects that pg_partman operates on. 
56+
57+
```sql
-- Dedicated role and schema for pg_partman.
CREATE ROLE partman_role WITH LOGIN;
CREATE SCHEMA partman;
GRANT ALL ON SCHEMA partman TO partman_role;
GRANT ALL ON ALL TABLES IN SCHEMA partman TO partman_role;
GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA partman TO partman_role;
GRANT EXECUTE ON ALL PROCEDURES IN SCHEMA partman TO partman_role;
-- Replace <partition_schema> and <databasename> with your own names.
GRANT ALL ON SCHEMA <partition_schema> TO partman_role;
GRANT TEMPORARY ON DATABASE <databasename> TO partman_role; -- Allows creation of temporary tables to move data.
```
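
One way to confirm that the grants above took effect is to ask PostgreSQL directly. The following is a minimal sketch that uses the built-in `has_schema_privilege` and `has_database_privilege` functions. It assumes the role and schema names from the preceding block and checks the database you're currently connected to.

```sql
-- Check the privileges that partman_role holds on the partman schema
-- and on the current database.
SELECT has_schema_privilege('partman_role', 'partman', 'USAGE')  AS can_use_schema,
       has_schema_privilege('partman_role', 'partman', 'CREATE') AS can_create_in_schema,
       has_database_privilege('partman_role', current_database(), 'TEMPORARY') AS can_create_temp;
```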
67+
## Creating partitions
68+
69+
pg_partman relies on range-type partitions and not on trigger-based partitions. The following example shows how pg_partman assists with the partitioning of a table. 
70+
71+
```sql
-- IF NOT EXISTS avoids an error if the schema was already created in the Permissions section.
CREATE SCHEMA IF NOT EXISTS partman;
CREATE TABLE partman.partition_test
(a_int INT, b_text TEXT, c_text TEXT, d_date TIMESTAMP DEFAULT now())
PARTITION BY RANGE(d_date);
CREATE INDEX idx_partition_date ON partman.partition_test(d_date);
```
78+
79+
:::image type="content" source="media/how-to-use-pg-partman/pg-partman-table-output.png" alt-text="Screenshot of table output.":::
80+
81+
By using the create_parent function, you can set up the number of partitions you want on the partitioned table. 
82+
83+
```sql
-- Qualify create_parent and part_config with the schema where pg_partman is installed.
SELECT public.create_parent(
    p_parent_table := 'partman.partition_test',
    p_control := 'd_date',
    p_type := 'native',
    p_interval := 'daily',
    p_premake := 20,
    p_start_partition := (now() - interval '10 days')::date::text
);

UPDATE public.part_config
SET infinite_time_partitions = true,
    retention = '1 hour',
    retention_keep_table = true
WHERE parent_table = 'partman.partition_test';
```
99+
100+
This command divides the p_parent_table into smaller parts based on the p_control column, using native partitioning (the other option is trigger-based partitioning, but pg_partman doesn't support it yet). The partitions are created at a daily interval. The example creates 20 future partitions in advance, instead of the default value of 4. It also specifies p_start_partition, which is the past date from which the partitions should start. 
101+
102+
The `create_parent()` function populates two tables, `part_config` and `part_config_sub`. There's also a maintenance function, `run_maintenance()`. You can schedule a cron job to run this procedure periodically. The function checks all parent tables in the `part_config` table and either creates new partitions for them or applies each table's retention policy. To learn more about the functions and tables in pg_partman, see the [pg_partman documentation](https://github.com/pgpartman/pg_partman/blob/master/doc/pg_partman.md). 
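
To see what `create_parent()` recorded for a partition set, you can query `part_config` directly. The following is a minimal sketch. It assumes that the extension's tables live in the `public` schema, as in the `create_parent` call in this article, so adjust the qualifier if your extension is installed elsewhere.

```sql
-- Inspect the configuration that pg_partman stores for the example table.
SELECT parent_table,
       control,
       partition_interval,
       premake,
       retention,
       retention_keep_table,
       infinite_time_partitions,
       automatic_maintenance
FROM public.part_config
WHERE parent_table = 'partman.partition_test';
```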
103+
104+
To create new partitions every time `run_maintenance()` runs in the background through the `bgw` extension, run the following update statement. 
105+
106+
```sql
-- Use the schema where pg_partman's part_config table is installed.
UPDATE partman.part_config SET premake = premake + 1 WHERE parent_table = 'partman.partition_test';
```
109+
110+
If premake stays the same when the run_maintenance() procedure runs, no new partitions are created that day. Because premake is counted from the current day, the next day's run of run_maintenance() creates one new daily partition. 
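
To observe the effect of premake, it can help to list the child partitions before and after a maintenance run. The following is a minimal sketch that relies only on the standard `pg_inherits` and `pg_class` catalogs, and it assumes the example table name used in this article.

```sql
-- List every child partition of the example table together with its bounds.
SELECT c.oid::regclass AS partition_name,
       pg_get_expr(c.relpartbound, c.oid) AS partition_bounds
FROM pg_inherits AS i
JOIN pg_class AS c ON c.oid = i.inhrelid
WHERE i.inhparent = 'partman.partition_test'::regclass
ORDER BY c.relname;
```

Comparing the number of rows returned on consecutive days shows whether new partitions are being added.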
111+
112+
Use the following insert commands to insert 100,000 rows for each month. 
113+
114+
```sql
-- Each statement inserts 100,000 rows spread across 30 days of one month.
-- generate_series() is placed in FROM so that every row gets a non-NULL d_date.
INSERT INTO partman.partition_test
SELECT g, g || 'abcdefghijklmnopqrstuvwxyz', g || 'zyxwvutsrqponmlkjihgfedcba',
       timestamp '2024-03-01' + (g % 30) * interval '1 day'
FROM generate_series(1, 100000) AS g;

INSERT INTO partman.partition_test
SELECT g, g || 'abcdefghijklmnopqrstuvwxyz', g || 'zyxwvutsrqponmlkjihgfedcba',
       timestamp '2024-04-01' + (g % 30) * interval '1 day'
FROM generate_series(100001, 200000) AS g;

INSERT INTO partman.partition_test
SELECT g, g || 'abcdefghijklmnopqrstuvwxyz', g || 'zyxwvutsrqponmlkjihgfedcba',
       timestamp '2024-05-01' + (g % 30) * interval '1 day'
FROM generate_series(200001, 300000) AS g;

INSERT INTO partman.partition_test
SELECT g, g || 'abcdefghijklmnopqrstuvwxyz', g || 'zyxwvutsrqponmlkjihgfedcba',
       timestamp '2024-06-01' + (g % 30) * interval '1 day'
FROM generate_series(300001, 400000) AS g;

INSERT INTO partman.partition_test
SELECT g, g || 'abcdefghijklmnopqrstuvwxyz', g || 'zyxwvutsrqponmlkjihgfedcba',
       timestamp '2024-07-01' + (g % 30) * interval '1 day'
FROM generate_series(400001, 500000) AS g;
```
135+
136+
Run the command below to see the partitions created. 
137+
138+
```sql
postgres=> \d+ partman.partition_test;
```
141+
142+
:::image type="content" source="media/how-to-use-pg-partman/pg-partman-table-output-partitions.png" alt-text="Screenshot of table out with partitions." lightbox="media/how-to-use-pg-partman/pg-partman-table-output-partitions.png":::
143+
144+
Here's the explain plan output of a select statement run against the partitioned table.
145+
146+
:::image type="content" source="media/how-to-use-pg-partman/pg-partman-explain-plan-output.png" alt-text="Screenshot of explain plan output." lightbox="media/how-to-use-pg-partman/pg-partman-explain-plan-output.png":::
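
The statement behind the screenshot isn't given in this article. As an illustration only, a query like the following sketch produces a plan of this kind. It filters on the partition key `d_date`, so the planner can skip partitions that fall outside the requested range. The date range is an arbitrary assumption chosen to match the sample data.

```sql
-- Show the plan for a query that filters on the partition key.
-- Only partitions that overlap the range should appear in the plan.
EXPLAIN
SELECT count(*)
FROM partman.partition_test
WHERE d_date >= timestamp '2024-03-10'
  AND d_date <  timestamp '2024-03-11';
```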
147+
148+
## How to manually run the run_maintenance procedure
149+
150+
```sql
select partman.run_maintenance(p_parent_table:='partman.partition_test');
```
153+
154+
> [!WARNING]
155+
> If you insert data before creating partitions, the data goes to the default partition. If the default partition has data that belongs to a new partition that you want to be created later, then you get a default partition violation error and the procedure won't work. Therefore, change the premake value as recommended above and then run the procedure.
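
Before you run the procedure, it can be useful to see where existing rows actually landed. The following is a minimal sketch that uses the `tableoid` system column, a standard PostgreSQL feature, to count rows per partition. Rows reported under the default partition are the ones that can trigger the violation described in the warning.

```sql
-- Count rows per partition, including the default partition.
SELECT tableoid::regclass AS partition_name,
       count(*) AS row_count
FROM partman.partition_test
GROUP BY tableoid
ORDER BY tableoid::regclass::text;
```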
156+
157+
## How to schedule maintenance procedure using pg_cron
158+
159+
Run the maintenance procedure on a schedule by using pg_cron. To enable `pg_cron` on your server, follow these steps.

1. Add pg_cron to the `azure.extensions` and `shared_preload_libraries` server parameters, and set the `cron.database_name` server parameter, in the Azure portal.
161+
162+
:::image type="content" source="media/how-to-use-pg-partman/pg-partman-pgcron-prerequisites.png" alt-text="Screenshot of pgcron prerequisites.":::
163+
164+
:::image type="content" source="media/how-to-use-pg-partman/pg-partman-pgcron-prerequisites-2.png" alt-text="Screenshot of pgcron prerequisites2.":::
165+
166+
:::image type="content" source="media/how-to-use-pg-partman/pg-partman-pgcron-database-name.png" alt-text="Screenshot of pgcron databasename.":::
167+
168+
2. Select **Save** and let the deployment complete. 

3. After the deployment completes, the pg_cron extension is created automatically. If you try to create it again, you get the following message. 
171+
172+
```sql
postgres=> CREATE EXTENSION pg_cron;
ERROR:  extension "pg_cron" already exists

postgres=>
```
178+
179+
4. To schedule the cron job, use the following command. 

```sql
postgres=> SELECT cron.schedule_in_database('sample_job','@hourly', $$SELECT partman.run_maintenance(p_parent_table:= 'partman.partition_test')$$,'postgres');
```
184+
185+
5. To view all the cron jobs, use the following command. 

```sql
postgres=> select * from cron.job;

-[ RECORD 1 ]-----------------------------------------------------------------------

jobid    | 1
schedule | @hourly
command  | SELECT partman.run_maintenance(p_parent_table:= 'partman.partition_test')
nodename | /tmp
nodeport | 5432
database | postgres
username | postgres
active   | t
jobname  | sample_job
```
202+
203+
6. To check the run history of the job, use the following command. 

```sql
postgres=> select * from cron.job_run_details;

(0 rows)
```

The result currently shows 0 rows because the job hasn't run yet. 
212+
213+
7. To unschedule the cron job, use the following command. The argument is the `jobid` value shown in `cron.job`. 

```sql
postgres=> select cron.unschedule(1);
```
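
After the job has run at least once, the history and cleanup steps above can be narrowed down as in the following sketch. It assumes the `sample_job` name used in this article and a pg_cron version that supports unscheduling by job name. If yours doesn't, pass the `jobid` instead.

```sql
-- Show the five most recent runs, with their status and any error message.
SELECT jobid, status, return_message, start_time, end_time
FROM cron.job_run_details
ORDER BY start_time DESC
LIMIT 5;

-- Unschedule the job by name instead of by jobid.
SELECT cron.unschedule('sample_job');
```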
218+
219+
## Limitations and considerations
220+
221+
- Why isn't my `bgw` running the maintenance procedure at the configured interval? 

Check the server parameter `pg_partman_bgw.dbname` and update it with the correct database name. Also check the server parameter `pg_partman_bgw.role` and set it to the appropriate role. Make sure that you connect to the server and create the extension as that same user instead of postgres. 

- I'm encountering an error when my `bgw` runs the maintenance procedure. What could be the reasons? 

The causes are the same as in the previous answer: check `pg_partman_bgw.dbname`, `pg_partman_bgw.role`, and the user that created the extension. 

- How do I set the partitions to start from a previous day? 

Use `p_start_partition` to specify the past date from which the partitions should be created. You can do this by running the following command. 
234+
235+
```sql
SELECT public.create_parent(
    p_parent_table := 'partman.partition_test',
    p_control := 'd_date',
    p_type := 'native',
    p_interval := 'daily',
    p_premake := 20,
    p_start_partition := (now() - interval '10 days')::date::text
);
```
245+
246+
## Related content
247+
248+
- [pg vector](how-to-use-pgvector.md)