Possible optimization for very large tables #9

@andybak

Description

I'm going to implement this in my custom subclass of MVTManager, but I thought it was worth writing up the plan (and my eventual findings) here in case it's worth implementing upstream.

Problem:

I have a table with a few million rows, and returning a full result set when zoomed out is both slow and fairly useless. At low zoom levels you probably only need a random subset of the results, since most items will end up taking up less than a single pixel anyway.

The slowness comes from 2 sources:

  1. the cost of actually querying that many rows
  2. the amount of data you have to send down the wire

Solving (2) would be easy if it weren't for (1): any normal approach to filtering the results (e.g. ORDER BY random() with a LIMIT) still has to visit every row, so you pay the query cost regardless.

However, Postgres supports a very fast way to return a subset of rows, and the sampling happens before any WHERE clause is evaluated:

SELECT * FROM tablename TABLESAMPLE SYSTEM(0.1) WHERE ...;

I'm going to experiment with dynamically calculating the sample percentage based on zoom level. I have a hunch there's a simple linear formula, based on the total rows in the result set and the zoom level, that will return a visually similar set of tiles much more quickly. It's just a case of tweaking the slope of that formula.
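
To make that concrete, here's a rough sketch of what I have in mind for the subclass (the function names, constants, and the way the clause gets spliced into the SQL are all placeholders of my own, not anything in MVTManager):

TARGET_ROWS = 50_000      # rough budget of rows worth drawing per request (placeholder)
FULL_SAMPLE_ZOOM = 12     # at this zoom level and above, don't sample at all (placeholder)
SLOPE_PER_ZOOM = 12.0     # percentage points dropped per zoom level below FULL_SAMPLE_ZOOM
MIN_PERCENT = 0.05        # floor so huge tables still return something

def sample_percent(zoom, approx_total_rows):
    """Return a percentage for TABLESAMPLE SYSTEM, or None to skip sampling."""
    if zoom >= FULL_SAMPLE_ZOOM or approx_total_rows <= TARGET_ROWS:
        return None
    # The linear formula mentioned above: tweak SLOPE_PER_ZOOM until the
    # sampled tiles look visually similar to the unsampled ones.
    pct = 100.0 - SLOPE_PER_ZOOM * (FULL_SAMPLE_ZOOM - zoom)
    return max(min(pct, 100.0), MIN_PERCENT)

def from_clause(table, zoom, approx_total_rows):
    """TABLESAMPLE has to sit directly after the table name in FROM."""
    pct = sample_percent(zoom, approx_total_rows)
    if pct is None:
        return table
    return "{} TABLESAMPLE SYSTEM ({:.3f})".format(table, pct)

A query built as "SELECT ... FROM " + from_clause("mytable", zoom, approx_rows) + " WHERE ..." then degrades gracefully: at high zoom it's the normal full query, and the further you zoom out the smaller the sampled fraction gets.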

One more point worth noting: because COUNT(*) itself is slow on large tables, there's a trick that gets a fast approximate count:

SELECT reltuples AS ct FROM pg_class WHERE oid = 'tablename'::regclass;
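
For completeness, this is roughly how I'd expect to run that estimate from Django inside the manager (again just a sketch; approx_row_count is my own helper name):

from django.db import connection

def approx_row_count(table_name):
    """Fast row-count estimate from pg_class.reltuples.

    The number is only refreshed by VACUUM / ANALYZE / autovacuum, so on a
    table that has never been analyzed it can come back as 0 (or -1 on
    newer Postgres); in that case you'd want to fall back to a real COUNT(*).
    """
    with connection.cursor() as cursor:
        cursor.execute(
            "SELECT reltuples::bigint FROM pg_class WHERE oid = %s::regclass",
            [table_name],
        )
        return cursor.fetchone()[0]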
