|
| 1 | +--- |
| 2 | +title: How to enable and use pg_partman - Azure Database for PostgreSQL - Flexible Server |
| 3 | +description: How to enable and use pg_partman on Azure Database for PostgreSQL - Flexible Server |
| 4 | +ms.author: gapaderla |
| 5 | +author: GayathriPaderla |
| 6 | +ms.reviewer: sbalijepalli |
| 7 | +ms.service: postgresql |
| 8 | +ms.subservice: flexible-server |
| 9 | +ms.topic: how-to |
| 10 | +ms.date: 03/14/2024 |
| 11 | +--- |
| 12 | + |
| 13 | +# How to enable and use `pg_partman` on Azure Database for PostgreSQL - Flexible Server |
| 14 | + |
| 15 | +**Optimize Azure Database for PostgreSQL Flexible Server by using pg_partman** |
| 16 | + |
| 17 | +When tables in a database grow large, it becomes hard to manage how often they're vacuumed, how much space they take up, and how to keep their indexes efficient. This can slow down queries and degrade performance. Partitioning large tables is one way to address these problems. In this article, you learn how to use the pg_partman extension to create range-based partitions of tables in your Azure Database for PostgreSQL Flexible Server. |
| 18 | + |
| 19 | +## Prerequisites |
| 20 | + |
| 21 | +To enable the pg_partman extension, follow these steps. |
| 22 | + |
| 23 | +- Add pg_partman to the `azure.extensions` server parameter in the Azure portal, as shown in the following screenshot. Then create the extension in your database. |
| 24 | + |
| 25 | +:::image type="content" source="media/how-to-use-pg-partman/pg-partman-prerequisites.png" alt-text="Screenshot of prerequisites."::: |
| 26 | + |
| 27 | +```sql |
| 28 | +CREATE EXTENSION PG_PARTMAN; |
| 29 | +``` |
| 30 | + |
| 31 | +## Overview |
| 32 | + |
| 33 | +When an identity feature uses sequences, rows inserted through the parent table get new sequence values. Rows inserted directly into a child table don't generate new sequence values. |
| 34 | + |
| 35 | +pg_partman uses a template to control whether the table is UNLOGGED or not. This means that the `ALTER TABLE` command can't change this status for a partition set. By changing the status on the template, you can apply it to all future partitions. For existing child tables, you must run the `ALTER TABLE` command manually. [This bug report](https://www.postgresql.org/message-id/flat/15954-b61523bed4b110c4%40postgresql.org) shows why. |
| 36 | + |
| 37 | +A related extension, pg_partman_bgw, must be included in `shared_preload_libraries`. It runs the `run_maintenance()` function on a schedule and takes care of the partition sets that have `automatic_maintenance` turned ON in `part_config`. |
| 38 | + |
| 39 | +:::image type="content" source="media/how-to-use-pg-partman/pg-partman-prerequisites-outlined.png" alt-text="Screenshot of prerequisites highlighted."::: |
| 40 | + |
| 41 | +You can use server parameters in the Azure portal to change the following configuration options that affect the BGW process: |
| 42 | + |
| 43 | +`pg_partman_bgw.dbname` - Required. This parameter should contain one or more databases that run_maintenance() needs to be run on. If more than one, use a comma separated list. If nothing is set, BGW doesn't run the procedure. |
| 44 | + |
| 45 | +`pg_partman_bgw.interval` - Number of seconds between calls to run_maintenance() procedure. Default is 3600 (1 hour). This can be updated based on the requirement of the project. |
| 46 | + |
| 47 | +`pg_partman_bgw.role` - The role that run_maintenance() procedure runs as. Default is postgres. Only a single role name is allowed. |
| 48 | + |
| 49 | +`pg_partman_bgw.analyze` - By default, it's set to OFF. Same purpose as the p_analyze argument to run_maintenance(). |
| 50 | + |
| 51 | +`pg_partman_bgw.jobmon` - Same purpose as the p_jobmon argument to run_maintenance(). By default, it's set to ON. |
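
To confirm the values that the background worker picks up, you can query `pg_settings` from any session. This is a minimal sketch that assumes `pg_partman_bgw` is already listed in `shared_preload_libraries`; if the library isn't loaded, the query may return no rows.

```sql
-- List the pg_partman_bgw settings currently in effect.
-- Assumption: pg_partman_bgw is preloaded, so its parameters are registered.
SELECT name, setting
FROM pg_settings
WHERE name LIKE 'pg_partman_bgw.%'
ORDER BY name;
```
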
| 52 | + |
| 53 | +## Permissions |
| 54 | + |
| 55 | +pg_partman doesn't require a superuser role to run. The only requirement is that the role that runs pg_partman functions has ownership over all the partition sets and schemas where new objects are created. We recommend that you create a separate role for pg_partman and give it ownership over the schema and all the objects that pg_partman operates on. |
| 56 | + |
| 57 | +```sql |
| 58 | +CREATE ROLE partman_role WITH LOGIN; |
| 59 | +CREATE SCHEMA partman; |
| 60 | +GRANT ALL ON SCHEMA partman TO partman_role; |
| 61 | +GRANT ALL ON ALL TABLES IN SCHEMA partman TO partman_role; |
| 62 | +GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA partman TO partman_role; |
| 63 | +GRANT EXECUTE ON ALL PROCEDURES IN SCHEMA partman TO partman_role; |
| 64 | +GRANT ALL ON SCHEMA <partition_schema> TO partman_role; |
| 65 | +GRANT TEMPORARY ON DATABASE <databasename> to partman_role; -- this allows creation of temporary table to move data. |
| 66 | +``` |
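
After running the grants, a quick check along the following lines can show whether they took effect. This is only a sketch: it reuses the `partman_role` role and `partman` schema from the example above and relies on the standard PostgreSQL privilege inquiry functions.

```sql
-- Check the privileges granted to partman_role.
-- Role and schema names follow the example above; adjust them to your setup.
SELECT has_schema_privilege('partman_role', 'partman', 'USAGE')  AS can_use_schema,
       has_schema_privilege('partman_role', 'partman', 'CREATE') AS can_create_in_schema,
       has_database_privilege('partman_role', current_database(), 'TEMPORARY') AS can_create_temp;
```
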
| 67 | +## Creating partitions |
| 68 | + |
| 69 | +pg_partman relies on range partitioning and not on trigger-based partitioning. The following example shows how pg_partman assists with the partitioning of a table. |
| 70 | + |
| 71 | +```sql |
| 72 | +CREATE SCHEMA IF NOT EXISTS partman; -- the schema may already exist from the Permissions section |
| 73 | +CREATE TABLE partman.partition_test |
| 74 | +(a_int INT, b_text TEXT,c_text TEXT,d_date TIMESTAMP DEFAULT now()) |
| 75 | +PARTITION BY RANGE(d_date); |
| 76 | +CREATE INDEX idx_partition_date ON partman.partition_test(d_date); |
| 77 | +``` |
| 78 | + |
| 79 | +:::image type="content" source="media/how-to-use-pg-partman/pg-partman-table-output.png" alt-text="Screenshot of table output."::: |
| 80 | + |
| 81 | +Use the `create_parent` function to set up the number of partitions you want on the partitioned table. |
| 82 | + |
| 83 | +```sql |
| 84 | +SELECT public.create_parent( |
| 85 | +p_parent_table := 'partman.partition_test', |
| 86 | +p_control := 'd_date', |
| 87 | +p_type := 'native', |
| 88 | +p_interval := 'daily', |
| 89 | +p_premake :=20, |
| 90 | +p_start_partition := (now() - interval '10 days')::date::text |
| 91 | +); |
| 92 | + |
| 93 | +UPDATE public.part_config |
| 94 | +SET infinite_time_partitions = true, |
| 95 | + retention = '1 hour', |
| 96 | + retention_keep_table=true |
| 97 | + WHERE parent_table = 'partman.partition_test'; |
| 98 | +``` |
| 99 | + |
| 100 | +This command divides the `p_parent_table` into smaller parts based on the `p_control` column, using native partitioning (the other option is trigger-based partitioning, but pg_partman doesn't support it yet). The partitions are created at a daily interval. The example creates 20 future partitions in advance, instead of the default value of 4. It also specifies `p_start_partition`, the past date from which the partitions should start. |
| 101 | + |
| 102 | +The `create_parent()` function populates two tables, `part_config` and `part_config_sub`. There's also a maintenance function, `run_maintenance()`. You can schedule a cron job to run this procedure periodically. It checks all parent tables in the `part_config` table and either creates new partitions for them or applies each table's configured retention policy. To learn more about the functions and tables in pg_partman, see the [pg_partman documentation](https://github.com/pgpartman/pg_partman/blob/master/doc/pg_partman.md). |
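
To see what `create_parent()` stored for the example table, you can read the row back from `part_config`. This sketch assumes the pg_partman objects live in the `public` schema, as in the `create_parent` call above; adjust the schema name if your extension is installed elsewhere.

```sql
-- Inspect the configuration stored for the example table.
-- Assumption: pg_partman is installed in the public schema.
SELECT parent_table, control, partition_interval, premake,
       retention, retention_keep_table, infinite_time_partitions
FROM public.part_config
WHERE parent_table = 'partman.partition_test';
```
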
| 103 | + |
| 104 | +To create new partitions every time `run_maintenance()` runs in the background through the `bgw` extension, run the following update statement. |
| 105 | + |
| 106 | +```sql |
| 107 | +UPDATE public.part_config SET premake = premake + 1 WHERE parent_table = 'partman.partition_test'; |
| 108 | +``` |
| 109 | + |
| 110 | +If the premake value stays the same when the `run_maintenance()` procedure runs, no new partitions are created for that day. On the next day, because premake counts from the current day, running `run_maintenance()` creates one new daily partition. |
| 111 | + |
| 112 | +Use the following insert commands to insert 100k rows for each month. |
| 113 | + |
| 114 | +```sql |
| 115 | +insert into partman.partition_test select generate_series(1,100000),generate_series(1, 100000) || 'abcdefghijklmnopqrstuvwxyz', |
| 116 | + |
| 117 | +generate_series(1, 100000) || 'zyxwvutsrqponmlkjihgfedcba', generate_series (timestamp '2024-03-01',timestamp '2024-03-30', interval '1 day') ; |
| 118 | + |
| 119 | +insert into partman.partition_test select generate_series(100000,200000),generate_series(100000,200000) || 'abcdefghijklmnopqrstuvwxyz', |
| 120 | + |
| 121 | +generate_series(100000,200000) || 'zyxwvutsrqponmlkjihgfedcba', generate_series (timestamp '2024-04-01',timestamp '2024-04-30', interval '1 day') ; |
| 122 | + |
| 123 | +insert into partman.partition_test select generate_series(200000,300000),generate_series(200000,300000) || 'abcdefghijklmnopqrstuvwxyz', |
| 124 | + |
| 125 | +generate_series(200000,300000) || 'zyxwvutsrqponmlkjihgfedcba', generate_series (timestamp '2024-05-01',timestamp '2024-05-30', interval '1 day') ; |
| 126 | + |
| 127 | +insert into partman.partition_test select generate_series(300000,400000),generate_series(300000,400000) || 'abcdefghijklmnopqrstuvwxyz', |
| 128 | + |
| 129 | +generate_series(300000,400000) || 'zyxwvutsrqponmlkjihgfedcba', generate_series (timestamp '2024-06-01',timestamp '2024-06-30', interval '1 day') ; |
| 130 | + |
| 131 | +insert into partman.partition_test select generate_series(400000,500000),generate_series(400000,500000) || 'abcdefghijklmnopqrstuvwxyz', |
| 132 | + |
| 133 | +generate_series(400000,500000) || 'zyxwvutsrqponmlkjihgfedcba', generate_series (timestamp '2024-07-01',timestamp '2024-07-30', interval '1 day') ; |
| 134 | +``` |
| 135 | + |
| 136 | +Run the command below to see the partitions created. |
| 137 | + |
| 138 | +```sql |
| 139 | +postgres=> \d+ partman.partition_test; |
| 140 | +``` |
| 141 | + |
| 142 | +:::image type="content" source="media/how-to-use-pg-partman/pg-partman-table-output-partitions.png" alt-text="Screenshot of table output with partitions." lightbox="media/how-to-use-pg-partman/pg-partman-table-output-partitions.png"::: |
| 143 | + |
| 144 | +The following screenshot shows the explain plan output of a select statement run against the partitioned table. |
| 145 | + |
| 146 | +:::image type="content" source="media/how-to-use-pg-partman/pg-partman-explain-plan-output.png" alt-text="Screenshot of explain plan output." lightbox="media/how-to-use-pg-partman/pg-partman-explain-plan-output.png"::: |
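
As an alternative to `\d+`, the partitions can also be listed with a catalog query. The following is a minimal sketch that uses only standard PostgreSQL catalogs and the example table name from this article.

```sql
-- List the child partitions of the example table with their bounds.
-- The default partition shows DEFAULT in the bounds column.
SELECT c.oid::regclass AS partition_name,
       pg_get_expr(c.relpartbound, c.oid) AS bounds
FROM pg_inherits AS i
JOIN pg_class AS c ON c.oid = i.inhrelid
WHERE i.inhparent = 'partman.partition_test'::regclass
ORDER BY c.relname;
```
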
| 147 | + |
| 148 | +## How to manually run the run_maintenance procedure |
| 149 | + |
| 150 | +```sql |
| 151 | +select partman.run_maintenance(p_parent_table:='partman.partition_test'); |
| 152 | +``` |
| 153 | + |
| 154 | +> [!WARNING] |
| 155 | +> If you insert data before creating partitions, the data goes to the default partition. If the default partition has data that belongs to a new partition that you want to be created later, then you get a default partition violation error and the procedure won't work. Therefore, change the premake value as recommended above and then run the procedure. |
| 156 | +
|
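
Before you run maintenance, it can help to check whether any rows have landed in the default partition. The following sketch counts rows per child table by using the `tableoid` system column; it assumes the example table from this article and doesn't rely on any pg_partman function.

```sql
-- Count rows per partition; tableoid identifies the child table that stores each row.
-- Rows reported for the default partition are the ones the warning above refers to.
SELECT tableoid::regclass AS partition_name, count(*) AS row_count
FROM partman.partition_test
GROUP BY tableoid
ORDER BY tableoid::regclass::text;
```
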
| 157 | +## How to schedule maintenance procedure using pg_cron |
| 158 | + |
| 159 | +Run the maintenance procedure using pg_cron. To enable `pg_cron` on your server, follow these steps. |
| 160 | +1. In the Azure portal, add `pg_cron` to the `azure.extensions` and `shared_preload_libraries` server parameters, and set the `cron.database_name` server parameter. |
| 161 | + |
| 162 | + :::image type="content" source="media/how-to-use-pg-partman/pg-partman-pgcron-prerequisites.png" alt-text="Screenshot of pgcron prerequisites."::: |
| 163 | + |
| 164 | + :::image type="content" source="media/how-to-use-pg-partman/pg-partman-pgcron-prerequisites-2.png" alt-text="Screenshot of pgcron prerequisites2."::: |
| 165 | + |
| 166 | + :::image type="content" source="media/how-to-use-pg-partman/pg-partman-pgcron-database-name.png" alt-text="Screenshot of pgcron databasename."::: |
| 167 | + |
| 168 | +2. Select **Save** and let the deployment complete. |
| 169 | + |
| 170 | +3. After the deployment completes, pg_cron is created automatically. If you still try to create the extension, you get the following message. |
| 171 | + |
| 172 | + ```sql |
| 173 | + postgres=> CREATE EXTENSION pg_cron; |
| 174 | + ERROR: extension "pg_cron" already exists |
| 175 | + |
| 176 | + postgres=> |
| 177 | + ``` |
| 178 | + |
| 179 | +4. To schedule the cron job, use the below command. |
| 180 | + |
| 181 | + ```sql |
| 182 | + postgres=> SELECT cron.schedule_in_database('sample_job','@hourly', $$SELECT partman.run_maintenance(p_parent_table:= 'partman.partition_test')$$,'postgres'); |
| 183 | + ``` |
| 184 | + |
| 185 | +5. You can view all the cron jobs by using the following command. |
| 186 | + |
| 187 | + ```sql |
| 188 | + postgres=> select * from cron.job; |
| 189 | +
|
| 190 | + -[ RECORD 1 ]----------------------------------------------------------------------- |
| 191 | +
|
| 192 | + jobid | 1 |
| 193 | + schedule | @hourly |
| 194 | + command | SELECT partman.run_maintenance(p_parent_table:= 'partman.partition_test') |
| 195 | + nodename | /tmp |
| 196 | + nodeport | 5432 |
| 197 | + database | postgres |
| 198 | + username | postgres |
| 199 | + active | t |
| 200 | + jobname | sample_job |
| 201 | + ``` |
| 202 | + |
| 203 | +6. Run history of the job can be checked using the command below. |
| 204 | + |
| 205 | + ```sql |
| 206 | + postgres=> select * from cron.job_run_details; |
| 207 | +
|
| 208 | + (0 rows) |
| 209 | + ``` |
| 210 | + |
| 211 | + The result currently shows 0 rows because the job hasn't run yet. |
| 212 | + |
| 213 | +7. To unschedule the cron job, use the command below. |
| 214 | + |
| 215 | + ```sql |
| 216 | + postgres=> select cron.unschedule(1); |
| 217 | + ``` |
| 218 | + |
| 219 | +## Limitations and considerations |
| 220 | + |
| 221 | +- Why isn't my `bgw` running the maintenance procedure based on the interval provided? |
| 222 | + |
| 223 | + Check the server parameter `pg_partman_bgw.dbname` and update it with the proper database name. Also check the server parameter `pg_partman_bgw.role` and provide the appropriate role. Make sure that you connect to the server and create the extension with that same user, instead of postgres. |
| 224 | + |
| 225 | +- I'm encountering an error when my bgw is running the maintenance proc. What could be the reasons? |
| 226 | +
|
| 227 | + Same as above. |
| 228 | +
|
| 229 | +- How do I set the partitions to start from a previous day? |
| 230 | + |
| 231 | + Use the `p_start_partition` argument of `create_parent` to specify the past date from which the partitions are created. |
| 232 | + |
| 233 | + You can do this by running the following command. |
| 234 | +
|
| 235 | + ```sql |
| 236 | + SELECT public.create_parent( |
| 237 | + p_parent_table := 'partman.partition_test', |
| 238 | + p_control := 'd_date', |
| 239 | + p_type := 'native', |
| 240 | + p_interval := 'daily', |
| 241 | + p_premake :=20, |
| 242 | + p_start_partition := (now() - interval '10 days')::date::text |
| 243 | + ); |
| 244 | + ``` |
| 245 | +
|
| 246 | +## Related content |
| 247 | +
|
| 248 | +- [pgvector](how-to-use-pgvector.md) |