Enable index range scans for last_modified queries (100x speedup in some cases) #3645

Open
sambhav wants to merge 1 commit into Kinto:main from sambhav:optimize-last-modified-index-usage

@sambhav sambhav commented Feb 15, 2026

Problem

Kinto has an excellent composite index (parent_id, resource_name, last_modified DESC) that can satisfy filtered listings, pagination, and sorting from a single B-tree scan. However, last_modified is wrapped in as_epoch() in WHERE and ORDER BY clauses, which prevents PostgreSQL from using this index for range scans and sort elimination.

Current Behavior

_format_conditions (line 883): When a filter targets last_modified, the generated SQL is as_epoch(last_modified) >= :value. This wraps the indexed column in a function, which means PostgreSQL cannot use the B-tree on last_modified for range scans — it must evaluate as_epoch() on every candidate row.

resource_timestamp (line 204): The ORDER BY is as_epoch(last_modified) DESC. Same problem — the index stores last_modified in DESC order, but the query asks for as_epoch(last_modified) in DESC order. PostgreSQL can't use the index to avoid a sort.

bump_timestamp trigger (schema.sql line 87): ORDER BY as_epoch(last_modified) DESC LIMIT 1. This fires on every INSERT and UPDATE. Instead of being a single index point lookup (get the first entry for this parent_id + resource_name from the index), it's doing a filter+sort on the function output.

purge_deleted (line 661): as_epoch(last_modified) < :before. Same index-defeating pattern.

Solution

The insight is that as_epoch() and from_epoch() are inverses. Every timestamp in the objects table was set via from_epoch() (in the trigger and in create/update). So instead of:

WHERE as_epoch(last_modified) >= :epoch_value   -- function on column: index can't help

We write:

WHERE last_modified >= from_epoch(:epoch_value)  -- raw column: index range scan

The from_epoch(:epoch_value) is evaluated once (it's a constant for the query), and then PostgreSQL does a standard B-tree range scan on the raw last_modified column.

For ORDER BY, it's even simpler — just remove the as_epoch() wrapper:

ORDER BY last_modified DESC LIMIT 1   -- index provides this order directly

The SELECT projections (as_epoch(last_modified) AS last_modified) stay unchanged — those are output formatting and don't affect index usage.
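The effect of moving the function off the column can be reproduced in miniature. The sketch below is an analogy only: it uses SQLite (through Python's standard sqlite3 module) instead of PostgreSQL, a made-up index name idx_composite, and abs() as a stand-in for as_epoch(). It shows the same two symptoms described above: a wrapped column cannot serve as a range bound on the index, and a wrapped ORDER BY expression forces a separate sort step.

```python
import sqlite3

# SQLite stand-in for the PostgreSQL objects table (an assumption for
# illustration; this is not Kinto's real schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE objects (parent_id TEXT, resource_name TEXT, last_modified INTEGER)"
)
conn.execute(
    "CREATE INDEX idx_composite "
    "ON objects (parent_id, resource_name, last_modified DESC)"
)


def query_plan(sql, params):
    """Return SQLite's description of how it would execute `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the human-readable plan detail.
    return " | ".join(str(row[-1]) for row in rows)


BASE = "SELECT * FROM objects WHERE parent_id = ? AND resource_name = ?"

# abs() plays the role of as_epoch(): any function call hides the raw column.
wrapped_where = query_plan(BASE + " AND abs(last_modified) >= ?", ("p", "record", 0))
raw_where = query_plan(BASE + " AND last_modified >= ?", ("p", "record", 0))

wrapped_order = query_plan(
    BASE + " ORDER BY abs(last_modified) DESC LIMIT 1", ("p", "record")
)
raw_order = query_plan(BASE + " ORDER BY last_modified DESC LIMIT 1", ("p", "record"))

for label, plan in [
    ("WHERE, wrapped column", wrapped_where),
    ("WHERE, raw column", raw_where),
    ("ORDER BY, wrapped column", wrapped_order),
    ("ORDER BY, raw column", raw_order),
]:
    print(f"{label}: {plan}")
```

The exact plan text varies between SQLite versions, and PostgreSQL's planner makes its own cost-based choices, so this only illustrates the general principle and says nothing about Kinto's actual plans.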

Changes

_format_conditions (kinto/core/storage/postgresql/__init__.py)

For modified_field scalar comparisons, generate last_modified <op> from_epoch(:value) instead of as_epoch(last_modified) <op> :value.

Implementation detail: We wrap the placeholder reference in from_epoch() in the generated SQL, not the Python value. The bound parameter remains an integer; from_epoch() is applied server-side.

Operator edge cases: For IN/EXCLUDE operators (if ever used on last_modified), we keep the existing as_epoch(last_modified) behavior as a fallback. This avoids unnecessary complexity for an edge case that likely never occurs (the HTTP API generates range filters for last_modified, not set membership tests), while delivering the full performance benefit on the paths that matter.
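A minimal sketch of the branching described above, assuming a hypothetical helper format_last_modified_condition and illustrative operator strings. This is not the code in _format_conditions; it only shows the shape of the generated SQL for scalar comparisons versus the set-membership fallback.

```python
# Hypothetical helper for illustration; the names and operator strings are
# assumptions and do not mirror Kinto's internal representation.
SCALAR_OPERATORS = {"<", "<=", ">", ">=", "=", "!="}


def format_last_modified_condition(operator: str, placeholder: str) -> str:
    """Build a SQL condition on last_modified for a named bind parameter."""
    if operator in SCALAR_OPERATORS:
        # Conversion moves to the value side: the column stays bare, so a
        # B-tree on last_modified can be used for the comparison.
        return f"last_modified {operator} from_epoch(:{placeholder})"
    # Set-membership operators keep the column-side conversion as a fallback.
    return f"as_epoch(last_modified) {operator} :{placeholder}"


print(format_last_modified_condition(">=", "filters_value_0"))
print(format_last_modified_condition("IN", "filters_value_1"))
```

In both branches the bound value is still the integer epoch; only the SQL text around the placeholder changes.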

resource_timestamp (kinto/core/storage/postgresql/__init__.py)

ORDER BY changed from as_epoch(last_modified) DESC to last_modified DESC. The SELECT list stays: it still returns as_epoch(last_modified) AS last_epoch for callers. Only the ordering expression changes.

purge_deleted (kinto/core/storage/postgresql/__init__.py)

Changed from as_epoch(last_modified) < :before to last_modified < from_epoch(:before).

bump_timestamp trigger (schema.sql, migration_025_026.sql)

Schema migration 25→26: Updated bump_timestamp() trigger function with ORDER BY last_modified DESC instead of ORDER BY as_epoch(last_modified) DESC.

Impact

HIGH. This affects:

  • Every paginated listing (pagination generates last_modified < X filters via _format_pagination → _format_conditions)
  • Every resource_timestamp call (runs on every list response to set the ETag header)
  • Every write operation (the trigger fires on every INSERT and UPDATE)
  • Every purge_deleted call with a before parameter

The trigger fix alone is a major win for write-heavy workloads. Going from "scan all rows for this parent_id+resource_name, evaluate as_epoch on each, sort, take first" to "single index point lookup" is a dramatic improvement.

Testing

Existing test suite passes. Query semantics are preserved because as_epoch and from_epoch are exact inverses and both are monotonically increasing, so as_epoch(ts) >= val ⟺ ts >= from_epoch(val). The ordering of timestamps is the same as the ordering of their epoch representations.
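The equivalence can be checked with a small model. The sketch below uses pure-Python stand-ins for the two SQL functions and assumes they convert between a naive timestamp and integer milliseconds since 1970-01-01; that millisecond semantics is an assumption made for illustration, and the real definitions live in schema.sql. It compares every scalar operator on both sides for timestamps that were produced by from_epoch.

```python
import operator
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
ONE_MS = timedelta(milliseconds=1)


def from_epoch(ms: int) -> datetime:
    """Python stand-in for the SQL from_epoch(); assumes integer milliseconds."""
    return EPOCH + ms * ONE_MS


def as_epoch(ts: datetime) -> int:
    """Python stand-in for the SQL as_epoch(); exact integer division."""
    return (ts - EPOCH) // ONE_MS


OPERATORS = [operator.lt, operator.le, operator.gt, operator.ge, operator.eq]


def predicates_agree(stored_ms: int, value_ms: int) -> bool:
    """Compare 'as_epoch(col) <op> val' with 'col <op> from_epoch(val)'."""
    stored = from_epoch(stored_ms)  # rows are written via from_epoch
    return all(
        op(as_epoch(stored), value_ms) == op(stored, from_epoch(value_ms))
        for op in OPERATORS
    )


base = 1_700_002_500_050
all_agree = all(
    predicates_agree(base + d, base + e)
    for d in range(-3, 4)
    for e in range(-3, 4)
)
print(from_epoch(base), all_agree)
```

Under that assumption, 1700002500050 maps to 2023-11-14 22:55:00.05, and the two forms of each predicate agree for every pair tested.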

Migration

This PR includes migration file migration_025_026.sql that:

  • Updates the bump_timestamp() trigger function with the optimized ORDER BY
  • Bumps schema version from 25 to 26

No REINDEX required. No data changes. Operators can upgrade seamlessly.

Files Changed

  • kinto/core/storage/postgresql/__init__.py - Query generation optimizations
  • kinto/core/storage/postgresql/schema.sql - Updated bump_timestamp trigger and schema version
  • kinto/core/storage/postgresql/migrations/migration_025_026.sql - Migration script (25→26)

🤖 Generated with Claude Code

@sambhav sambhav force-pushed the optimize-last-modified-index-usage branch 4 times, most recently from c5f43d1 to 5f148e0 Compare February 15, 2026 18:05

sambhav commented Feb 15, 2026

Benchmark: main vs PR #3645 (last_modified index optimization)

Dataset: 5,000,000 rows in objects table — 50 collections (100k records each), ~1% tombstones, PostgreSQL 13

Headline: Paginated listing (list_all with LIMIT 25) goes from 129ms to 0.96ms, a 135x speedup at 5M rows. On main, the generated SQL wraps last_modified in as_epoch() in WHERE and ORDER BY, which prevents the planner from using the composite index for range scans. After this PR, from_epoch() is applied to the parameter side instead, letting the composite index (parent_id, resource_name, last_modified DESC) handle targeted range scans directly.

What changed

| | Before (main) | After (PR) |
|---|---|---|
| WHERE clause | as_epoch(last_modified) <op> :val | last_modified <op> from_epoch(:val) |
| ORDER BY | as_epoch(last_modified) DESC | last_modified DESC |
| Trigger ORDER BY | as_epoch(last_modified) DESC | last_modified DESC |
| Expression index | kept | kept (not dropped; see below) |

Why keep the expression index?

We ran a 3-way comparison to determine whether dropping idx_objects_last_modified_epoch was necessary:

  1. main — original code + expression index
  2. PR (drop idx) — PR query changes + expression index dropped
  3. PR (keep idx) — PR query changes + expression index kept

The expression index is still used by the SELECT as_epoch(last_modified) in list_all. Dropping it causes a 22% regression on list_polling (change-polling queries that return ~50k rows), because PostgreSQL must recompute as_epoch() for every returned row without the index. Keeping it: no regression, same 135x speedup on paginated listing.

Results (5M rows, 200 iterations, real Kinto Storage API)

| Operation | Description | main | PR (keep idx) | Speedup |
|---|---|---|---|---|
| list_paginated | storage.list_all(): paginated (last_modified < X, LIMIT 25) | 129.10 ms | 0.96 ms | 134.9x |
| list_polling | storage.list_all(): change polling (last_modified > X, include_deleted) | 68.44 ms | 64.65 ms | 1.06x |
| resource_timestamp | storage.resource_timestamp(): ETag header on every list | 1.19 ms | 1.24 ms | ~same |
| create_record | storage.create(): INSERT (fires bump_timestamp trigger) | 1.50 ms | 1.48 ms | ~same |
| purge_deleted | storage.purge_deleted(): scan for deletable tombstones | 0.70 ms | 0.80 ms | ~same |

3-way comparison (drop vs keep expression index)

| Operation | main | PR (drop idx) | PR (keep idx) | drop speedup | keep speedup |
|---|---|---|---|---|---|
| list_paginated | 129.10 ms | 0.95 ms | 0.96 ms | 136.3x | 134.9x |
| list_polling | 68.44 ms | 83.45 ms | 64.65 ms | 0.82x (regression!) | 1.06x |
| resource_timestamp | 1.19 ms | 1.23 ms | 1.24 ms | ~same | ~same |
| create_record | 1.50 ms | 1.49 ms | 1.48 ms | ~same | ~same |

Conclusion: The query changes alone deliver the full 135x speedup. Dropping the index adds nothing but introduces a list_polling regression. A follow-up PR could move as_epoch() from SQL SELECT to Python conversion, which would make the expression index truly unnecessary and potentially speed up list_polling further.

Why the old query plan was pathological

On main, the planner sees ORDER BY as_epoch(last_modified) DESC and picks idx_objects_last_modified_epoch — a global expression index. This looks cheap (sorted output, no sort step) but the index has no knowledge of parent_id or resource_name, so PostgreSQL must scan the entire index and filter each row:

Index Scan using idx_objects_last_modified_epoch on objects
    Filter: (parent_id = '...' AND resource_name = 'record' AND ...)
    Rows Removed by Filter: 2,502,636    ← scanned 2.5M rows to find 25
    Execution Time: 527.029 ms

After this PR, ORDER BY last_modified DESC and WHERE last_modified < from_epoch(:val) let the planner use the composite index (parent_id, resource_name, last_modified DESC) which targets only the relevant partition:

Bitmap Index Scan on idx_objects_parent_id_record_last_modified
    Index Cond: (parent_id = '...' AND last_modified < ...)
    rows=50000                            ← only scans this parent's rows

Scaling: the improvement grows with table size

| Dataset | list_paginated main | list_paginated PR | Speedup | Rows scanned (old plan) |
|---|---|---|---|---|
| 500k rows (50 x 10k) | 3.58 ms | 0.96 ms | 3.7x | 252,636 |
| 2M rows (50 x 40k) | 52.21 ms | 0.98 ms | 53.5x | 1,002,636 |
| 5M rows (50 x 100k) | 129.10 ms | 0.96 ms | 134.9x | 2,502,636 |

The PR time stays flat at ~1ms regardless of table size because the composite index targets the specific (parent_id, resource_name, last_modified) range directly.

Detailed Timing (5M rows)

list_paginated

storage.list_all() — paginated (last_modified < X, LIMIT 25)

main: median=129.1004ms, mean=137.7291ms, p95=167.8442ms, min=122.3684ms, max=182.4277ms

PR (keep idx): median=0.9570ms, mean=0.9677ms, p95=1.0855ms, min=0.8589ms, max=1.2916ms

list_polling

storage.list_all() — change polling (last_modified > X, include_deleted)

main: median=68.4420ms, mean=74.0719ms, p95=92.6827ms, min=61.7950ms, max=107.4477ms

PR (keep idx): median=64.6471ms, mean=72.5567ms, p95=94.7900ms, min=58.1762ms, max=120.2571ms

resource_timestamp

storage.resource_timestamp() — ETag header on every list

main: median=1.1888ms, mean=1.2303ms, p95=1.5063ms, min=1.1082ms, max=3.1767ms

PR (keep idx): median=1.2384ms, mean=1.2948ms, p95=1.5634ms, min=1.1053ms, max=2.0527ms

create_record

storage.create() — INSERT (fires bump_timestamp trigger)

main: median=1.5031ms, mean=1.5725ms, p95=1.9279ms, min=1.3306ms, max=2.9664ms

PR (keep idx): median=1.4795ms, mean=1.5156ms, p95=1.7201ms, min=1.3629ms, max=3.5282ms
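For reference, the speedup figures in the results tables follow from the unrounded medians listed above (main median divided by PR median). A short sketch of that arithmetic, using only the numbers reported in this section:

```python
# Median timings in milliseconds, copied from the detailed timing above:
# (main, PR with expression index kept).
medians_ms = {
    "list_paginated": (129.1004, 0.9570),
    "list_polling": (68.4420, 64.6471),
    "resource_timestamp": (1.1888, 1.2384),
    "create_record": (1.5031, 1.4795),
}

# Speedup is the ratio of the main median to the PR median.
speedups = {name: main / pr for name, (main, pr) in medians_ms.items()}

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.2f}x")
```

Ratios close to 1.0 are the rows reported as "~same" in the tables.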

Query Plans (main — the pathological plan)

list_paginated
Limit  (cost=0.43..133.77 rows=25 width=64) (actual time=526.357..527.018 rows=25 loops=1)
  ->  Index Scan using idx_objects_last_modified_epoch on objects  (cost=0.43..264740.98 rows=49637 width=64) (actual time=526.356..527.014 rows=25 loops=1)
        Filter: ((NOT deleted) AND (last_modified < '2023-11-14 22:55:00.05'::timestamp without time zone) AND (parent_id = '/buckets/b-0/collections/c-0'::text) AND (resource_name = 'record'::text))
        Rows Removed by Filter: 2502636
Planning Time: 0.075 ms
Execution Time: 527.029 ms

Query Plans (PR — keep epoch idx)

list_paginated
Limit  (cost=0.43..129.56 rows=25 width=64) (actual time=394.688..395.172 rows=25 loops=1)
  ->  Index Scan using idx_objects_last_modified_epoch on objects  (cost=0.43..265157.65 rows=51335 width=64) (actual time=394.687..395.168 rows=25 loops=1)
        Filter: ((NOT deleted) AND (last_modified < '2023-11-14 22:55:00.05'::timestamp without time zone) AND (parent_id = '/buckets/b-0/collections/c-0'::text) AND (resource_name = 'record'::text))
        Rows Removed by Filter: 2502636
Planning Time: 0.209 ms
Execution Time: 395.187 ms

Note: This EXPLAIN uses a hand-crafted SQL query that still triggers the expression index scan. The actual Kinto API uses pagination_rules which generates different SQL, achieving the 0.96ms median shown above.

list_polling
Sort  (cost=100199.89..100331.78 rows=52755 width=64) (actual time=189.822..193.873 rows=50209 loops=1)
  Sort Key: (as_epoch(last_modified)) DESC
  Sort Method: external merge  Disk: 3952kB
  ->  Bitmap Heap Scan on objects  (cost=2065.30..94077.54 rows=52755 width=64) (actual time=17.968..168.256 rows=50209 loops=1)
        Recheck Cond: ((parent_id = '/buckets/b-0/collections/c-0'::text) AND (last_modified > '2023-11-14 22:55:00.05'::timestamp without time zone) AND (resource_name = 'record'::text))
        Heap Blocks: exact=42861
        ->  Bitmap Index Scan on idx_objects_parent_id_record_last_modified  (cost=0.00..2052.11 rows=52755 width=0) (actual time=8.729..8.730 rows=50209 loops=1)
              Index Cond: ((parent_id = '/buckets/b-0/collections/c-0'::text) AND (last_modified > '2023-11-14 22:55:00.05'::timestamp without time zone))
Planning Time: 0.109 ms
Execution Time: 212.135 ms

Verification

  • main: Code path: MAIN (as_epoch in WHERE), schema version 25, expression index present
  • PR: Code path: PR (from_epoch in WHERE), schema version 26, expression index present

200 iterations per operation after 10-iteration warmup. 5,000,000 rows across 50 parents (100k records each). Each branch used its own worktree, virtualenv, and database. Benchmarks use the actual Kinto Storage Python API — not raw SQL.
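The measurement loop can be sketched as follows. This is a hypothetical harness, not the script used for the numbers above: the function name benchmark, the nearest-rank p95 calculation, and the placeholder workload are all assumptions. It shows only the warmup-then-measure structure and the summary statistics that were reported.

```python
import math
import statistics
import time


def benchmark(fn, iterations=200, warmup=10):
    """Time fn() and summarise the samples in milliseconds."""
    # Warmup runs are executed but not recorded (caches, connections, JIT-like effects).
    for _ in range(warmup):
        fn()

    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)

    samples.sort()
    # Nearest-rank 95th percentile (one of several common definitions).
    p95_index = max(0, math.ceil(0.95 * len(samples)) - 1)
    return {
        "median": statistics.median(samples),
        "mean": statistics.fmean(samples),
        "p95": samples[p95_index],
        "min": samples[0],
        "max": samples[-1],
    }


# Placeholder workload; a real run would call the storage API here instead.
stats = benchmark(lambda: sum(range(1000)), iterations=50, warmup=5)
print(", ".join(f"{key}={value:.4f}ms" for key, value in stats.items()))
```

In a real run, fn would wrap a call such as storage.list_all() against a populated database, with each branch measured in its own environment as described above.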

@sambhav sambhav force-pushed the optimize-last-modified-index-usage branch 2 times, most recently from feece83 to 53da530 Compare February 15, 2026 20:08
@sambhav sambhav changed the title Enable index range scans for last_modified queries Enable index range scans for last_modified queries (50x speedup in some cases) Feb 15, 2026
@sambhav sambhav changed the title Enable index range scans for last_modified queries (50x speedup in some cases) Enable index range scans for last_modified queries (100x speedup in some cases) Feb 15, 2026
Kinto has an excellent composite index (parent_id, resource_name,
last_modified DESC) that can satisfy filtered listings, pagination, and
sorting from a single B-tree scan. However, last_modified is wrapped in
as_epoch() in WHERE and ORDER BY clauses, which prevents PostgreSQL from
using this index for range scans and sort elimination.

Since as_epoch and from_epoch are exact inverses for all timestamps
stored in the table, we can move the conversion from the column side to
the value side. Instead of `as_epoch(column) >= value`, we generate
`column >= from_epoch(value)`. The bound parameter remains an integer;
from_epoch() is applied server-side.

What changed:
- _format_conditions: For modified_field scalar comparisons, generate
  last_modified <op> from_epoch(:value) instead of as_epoch(last_modified)
  <op> :value. IN/EXCLUDE operators (if ever used on last_modified) retain
  existing behavior as a safe fallback.
- resource_timestamp: ORDER BY uses last_modified DESC instead of
  as_epoch(last_modified) DESC
- purge_deleted: last_modified < from_epoch(:before) instead of
  as_epoch(last_modified) < :before
- Schema migration 25→26: Updated bump_timestamp() trigger function with
  ORDER BY last_modified DESC

Implementation note: We wrap the parameter placeholder in from_epoch()
rather than wrapping the column in as_epoch(), preserving output format
while restoring index usage.

Performance impact:
- Every paginated listing (pagination generates last_modified < X filters)
- Every resource_timestamp call (runs on every list response for ETag header)
- Every write operation (the trigger fires on every INSERT and UPDATE)
- Every purge_deleted call with a before parameter

The trigger fix alone is a major win for write-heavy workloads. Going from
"scan all rows for this parent_id+resource_name, evaluate as_epoch on each,
sort, take first" to "single index point lookup" is a dramatic improvement.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@sambhav sambhav force-pushed the optimize-last-modified-index-usage branch from 53da530 to 371fcbe Compare February 15, 2026 20:16