Replies: 8 comments
-
Moved this to discussion
-
Just to clarify a bit up front: pg_partman only uses the built-in partitioning mechanisms of PostgreSQL itself, so it does not by itself add any memory consumption beyond that. How much more memory a partitioned set uses vs. a standard table really depends on the data being pulled out and what you are doing with it. There can be additional overhead from the metadata of multiple tables being involved and joined.

But one of the purposes of partitioning is to reduce the number of pages that have to be pulled into memory for evaluation, via partition pruning. So I would hope that when you run your SELECTs, you set up the conditions for that query so that partition pruning reduces the overall number of tables and pages that need to be evaluated. That should balance out the additional table overhead, or even improve on it. If you're not seeing partition pruning, then either the query isn't written right or partitioning itself may not be a suitable solution for the types of queries you run.

PostgreSQL can tell you whether partition pruning is happening, and give you a much better idea of how much memory a given query uses, via EXPLAIN ANALYZE on that query. For any operations that use work_mem, it reports the exact amounts used for those steps. For shared_buffers, it can show you cache hits vs. misses, which helps you see whether the system has enough memory to hold your average working data set. If you'd like a better idea of how much of a given table is being kept in shared buffers, the pg_buffercache extension can give you some insight: https://www.postgresql.org/docs/current/pgbuffercache.html

Hopefully that helps. If you still have any questions, let me know.
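A minimal sketch of those two checks (the table and column names below are made up for illustration, and pg_buffercache has to be installed with CREATE EXTENSION first):

```sql
-- Does partition pruning kick in, and how much work_mem / how many buffers does the query use?
-- Fewer child tables in the plan (or "Subplans Removed: N") means pruning is working.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM measurements                      -- hypothetical partitioned table
WHERE created_at >= '2024-01-01'
  AND created_at <  '2024-02-01';

-- Requires: CREATE EXTENSION pg_buffercache;
-- Rough picture of which relations currently occupy shared_buffers.
SELECT c.relname,
       pg_size_pretty(count(*) * current_setting('block_size')::bigint) AS buffered
FROM pg_buffercache b
JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
GROUP BY c.relname
ORDER BY count(*) DESC
LIMIT 20;
```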
-
Thanks for the thorough response Keith, much appreciated! To provide some more context: no partition pruning is happening here, because the table is used to run Approximate Nearest Neighbour searches with pgvector, which means all 730 partitions must be examined on each SELECT. The index is kept fully in memory though (otherwise performance would be atrocious), so my first guess is that the only time PG needs to go to disk is to read the selected attributes from the returned rows (these SELECTs typically return 10-20 rows, so pretty much negligible). I'm not very familiar with the internals of Postgres partitioning, so I don't know how much extra memory it can add to a regular SELECT that has to touch all the partitions. Is there a good way to measure that?
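(One thing I might try, as a sketch and only if the server is on PostgreSQL 14 or newer where these exist, is looking at the backend's memory contexts directly:)

```sql
-- PostgreSQL 14+: memory contexts of the current backend.
-- Run in the same session right after one of the ANN SELECTs.
SELECT name, parent,
       pg_size_pretty(total_bytes) AS total,
       pg_size_pretty(used_bytes)  AS used
FROM pg_backend_memory_contexts
ORDER BY total_bytes DESC
LIMIT 20;

-- PostgreSQL 14+: dump another backend's memory contexts (by pid) into the server log.
SELECT pg_log_backend_memory_contexts(12345);  -- 12345 is a placeholder pid
```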
-
Assuming you're partitioning in order to use the retention system to expire old data, then? If not, there's really no need to partition at all here if you're reading the entire data set on every SELECT anyway.
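(If retention is the goal, this is roughly what that setup looks like; the parent table name below is made up, and the exact part_config columns and function signatures vary a bit between pg_partman versions, so treat it as a sketch:)

```sql
-- Keep ~2 years of data; older child tables are dropped during maintenance runs.
UPDATE partman.part_config
SET retention = '2 years',
    retention_keep_table = false          -- actually drop expired child tables
WHERE parent_table = 'public.embeddings'; -- hypothetical parent table

-- Usually invoked by the pg_partman background worker or a cron job.
SELECT partman.run_maintenance();
```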
-
Unless you're having issues with maintenance and vacuuming. That could possibly be another reason for partitioning, but I'd only suggest it if you're actually seeing that be a problem.
-
There are a couple of reasons why partitioning is very convenient here. The first is recall: for my dataset, recall drops as partition size grows. For instance, recall is ~95% with 100K-row partitions but drops to ~75% with 1M-row partitions (go figure what it would be with a single non-partitioned table...). So overall pg_partman is a really convenient and robust solution for this problem. I just want to understand why the PostgreSQL backend processes have such a large memory footprint, and whether partitioning could be one of the reasons.
-
That's a very good use case I hadn't considered, thank you for sharing! I'd still recommend looking at the full EXPLAIN ANALYZE output of the queries and also investigating your shared_buffers usage. Beyond that, it's a bit outside my skill set to investigate further. I can try asking some of my work colleagues to see if they might know, but you may also have some luck asking on either the PostgreSQL general mailing list or even the hackers list. I'd recommend trying general first.
-
I just emailed the general mailing list; hopefully they can give me some more insight. Thanks a lot for all the assistance, Keith!
-
I would like to understand how much pg_partman can contribute to memory usage in Postgres backend processes.
My application mainly runs SELECT statements against a partitioned table. The table is partitioned by size (100K rows per partition, 730 partitions in total). However, each Postgres backend process running these SELECTs takes ~300 MB of memory, which seems like a lot compared to what I've seen elsewhere.
work_mem is set to 16MB, so nothing too crazy. Also, I am using Pss to measure the memory used by each backend process so that shared_buffers doesn't skew the numbers. Here's the output for a couple of them:
Is this expected? Or should the overhead of pg_partman partitions be smaller in terms of memory consumption?
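(For reference, the backend PIDs I point the Pss measurement at can be listed with the standard pg_stat_activity view; nothing here is specific to my setup:)

```sql
-- PIDs of client backends; these are the per-connection processes being measured.
SELECT pid, application_name, state, query_start
FROM pg_stat_activity
WHERE backend_type = 'client backend';
```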