Possible optimization for very large tables #9
Description
I'm going to implement this on my custom subclass of MVTManager but I thought it was worth writing up the plan (and my eventual findings) here in case it's worth implementing upstream.
Problem:
I have a table with a few million rows, and it's both slow and fairly useless to return the full result set when zoomed out. A random subset of the results is probably enough at low zoom levels, since most items will end up occupying less than a single pixel anyway.
The slowness comes from 2 sources:
- the cost of actually querying that many rows
- the amount of data you have to send down the wire
Solving (2) would be easy if it weren't for (1): any normal approach to filtering the results (e.g. `WHERE random() < 0.01`) still evaluates the condition against every row, so you pay the full query cost.
However, Postgres supports a very fast way to return a subset of rows, and the sampling happens before any WHERE clause is evaluated:

```sql
SELECT * FROM tablename TABLESAMPLE SYSTEM (0.1) WHERE ...;
```
I'm going to experiment with dynamically calculating the sample percentage based on zoom level. I have a hunch there's a simple linear formula, based on the total rows in the result set and the zoom level, that will return a visually similar set of tiles much more quickly. It's just a case of tweaking the slope of that formula.
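As a starting point, here's one plausible shape for that formula (the function name, target budget, and max-zoom constants are my own illustrative choices, not anything from the project): a linear ramp from a base percentage that yields roughly a fixed feature budget at zoom 0 up to 100% at some maximum zoom.

```python
def sample_percent(total_rows, zoom, target_rows=10_000, max_zoom=14):
    """Hypothetical linear formula for a TABLESAMPLE percentage.

    At zoom 0 we sample just enough to render ~target_rows features;
    by max_zoom we sample everything. Small tables are never sampled.
    """
    if total_rows <= target_rows:
        return 100.0
    # Base percentage that yields ~target_rows features at zoom 0.
    base = 100.0 * target_rows / total_rows
    # Linear interpolation between base (zoom 0) and 100% (max_zoom).
    frac = min(zoom, max_zoom) / max_zoom
    return base + (100.0 - base) * frac
```

Tweaking the "slope" then just means changing `target_rows` (or swapping the linear ramp for, say, an exponential one keyed to how tile area shrinks with zoom).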
One more point worth noting: because COUNT(*) itself is slow on large tables, there's a trick that gets a fast approximate count from the planner's statistics instead of scanning the table:

```sql
SELECT reltuples AS ct FROM pg_class WHERE oid = 'tablename'::regclass;
```

(The estimate is only as fresh as the last VACUUM/ANALYZE, but that's fine for choosing a sample rate.)
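Putting the two together, a minimal sketch of building the sampled query, assuming the hypothetical `sample_percent()` above feeds it and that the table name is trusted (string interpolation here is illustrative, not injection-safe for user input):

```python
def sampled_tile_query(table, percent, where_sql="TRUE"):
    """Wrap a tile query with TABLESAMPLE.

    The clause must appear immediately after the table name, before
    WHERE, per Postgres grammar. Skip sampling entirely at 100% so
    high zoom levels see every row.
    """
    if percent >= 100.0:
        return f"SELECT * FROM {table} WHERE {where_sql}"
    return (
        f"SELECT * FROM {table} "
        f"TABLESAMPLE SYSTEM ({percent:.4f}) WHERE {where_sql}"
    )
```

In a custom MVTManager subclass this would slot in wherever the base SQL is assembled, with the `reltuples` estimate supplying `total_rows` once per table rather than per tile.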