Compressed columns/tables #491

tollefslaathaug · 2025-10-10T07:13:11Z

It would be nice to support compressed columns, though that may be a tall ask. If some columns contain a lot of identical values, compressing them would reduce the memory footprint of very large tables.

NeilMacMullen · 2025-10-16T22:21:14Z

Maintainer

Apologies, somehow I missed the notification for this discussion.

Yes, I agree. The string columns use a StringPool to try to reduce the footprint, on the basis that many string columns tend to be highly repetitive.
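To illustrate the pooling idea, here is a minimal Python sketch. It is not the project's actual StringPool; the names `SketchStringPool` and `PooledStringColumn` are hypothetical and exist only for this example. The assumption is that each distinct string is stored once and that rows hold small integer ids instead of string references.

```python
from array import array
from typing import Dict, Iterable, List


class SketchStringPool:
    """Hypothetical pool: stores each distinct string once and hands out integer ids."""

    def __init__(self) -> None:
        self._ids: Dict[str, int] = {}
        self._values: List[str] = []

    def intern(self, value: str) -> int:
        """Return the id for value, adding it to the pool on first sight."""
        existing = self._ids.get(value)
        if existing is not None:
            return existing
        new_id = len(self._values)
        self._ids[value] = new_id
        self._values.append(value)
        return new_id

    def lookup(self, string_id: int) -> str:
        return self._values[string_id]

    def __len__(self) -> int:
        # Number of distinct strings held by the pool.
        return len(self._values)


class PooledStringColumn:
    """Hypothetical column: one unsigned integer id per row instead of one string per row."""

    def __init__(self, pool: SketchStringPool, values: Iterable[str]) -> None:
        self._pool = pool
        self._ids = array("I", (pool.intern(v) for v in values))

    def __len__(self) -> int:
        return len(self._ids)

    def __getitem__(self, row: int) -> str:
        # Random access stays O(1): one array read plus one list read.
        return self._pool.lookup(self._ids[row])


pool = SketchStringPool()
methods = PooledStringColumn(pool, ["GET", "POST", "GET", "GET", "POST"])
print(len(methods), "rows,", len(pool), "distinct strings")
```

With this layout the cost per row is one small integer regardless of string length, so the saving grows with how repetitive the column is. Random access is unaffected, which is why pooling sits comfortably alongside index tables.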

Columns start off being accessed quite linearly, which would seem to be a good fit for some kinds of decompression. (Parallelisation means that there are some access discontinuities, though.) Things tend to get discontinuous when the first where/join/lookup operator is hit. The current implementation builds index tables into the original data to avoid having to duplicate it, but of course this means that subsequent accesses may be highly non-linear, so decompression would need to be able to cope with this efficiently.
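One way to picture a compressed column that still tolerates non-linear access is run-length encoding with a binary search over run start offsets. The Python sketch below is an illustration under stated assumptions, not the current implementation: `RunLengthColumn`, `where` and `gather` are hypothetical names, and the index table is modelled as a plain list of row numbers into the original data.

```python
from bisect import bisect_right
from typing import Callable, List, Sequence


class RunLengthColumn:
    """Hypothetical column storing (start row, value) once per run of equal values."""

    def __init__(self, values: Sequence[object]) -> None:
        self._starts: List[int] = []      # first row of each run, ascending
        self._run_values: List[object] = []
        self._length = len(values)
        for row, value in enumerate(values):
            if not self._run_values or self._run_values[-1] != value:
                self._starts.append(row)
                self._run_values.append(value)

    def __len__(self) -> int:
        return self._length

    def run_count(self) -> int:
        return len(self._run_values)

    def __getitem__(self, row: int) -> object:
        if not 0 <= row < self._length:
            raise IndexError(row)
        # Find the last run whose start is <= row: O(log runs), in any access order.
        run = bisect_right(self._starts, row) - 1
        return self._run_values[run]


def where(column: RunLengthColumn, predicate: Callable[[object], bool]) -> List[int]:
    """Linear scan that returns an index table of matching rows instead of copying data."""
    return [row for row in range(len(column)) if predicate(column[row])]


def gather(column: RunLengthColumn, index_table: Sequence[int]) -> List[object]:
    """Read a column through an index table; the rows may arrive in any order."""
    return [column[row] for row in index_table]


status = RunLengthColumn([200] * 4 + [404] * 2 + [200] * 3)
latency = RunLengthColumn([10, 10, 10, 50, 50, 50, 50, 10, 10])

matches = where(status, lambda s: s == 404)   # index table into the original rows
print(matches, gather(latency, matches))
print(gather(latency, [8, 0, 3]))             # deliberately non-linear access
```

Here nine rows are stored as three runs per column, and a lookup through the index table costs a binary search rather than a decompression pass. A real design would likely add a sequential iterator for the initial linear phase, since walking runs in order avoids the per-row search.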

Still, if you have ideas in this area I would be happy to discuss more. In principle, table access is quite well contained, so it should be relatively easy to experiment.
