-
Notifications
You must be signed in to change notification settings - Fork 36
Open
Description
Environment
- PostgreSQL 16.5 (data_checksums = on)
- pg_squeeze 1.9 (latest)
- Table: comm.lis_dict_lab_vs_item (small dictionary table, low row count, frequent lookups)
- Indexes:
- PRIMARY KEY (report_item_id) unique btree (remains valid)
- Secondary composite btree (lab_id, is_deleted) → consistently becomes invalid
Reproduction (repeatable)
- CREATE EXTENSION amcheck;
- Verify index clean:
SELECT bt_index_check('comm.idx_comm_lis_dict_lab_vs_item_lab_id_deleted_union'::regclass, true); - Run squeeze:
SELECT squeeze.squeeze_table('comm','lis_dict_lab_vs_item', NULL, NULL, NULL);
(Also reproduced with recreate_index := true, and inside/outside explicit transaction; also with ACCESS EXCLUSIVE pre-lock.) - Re-check:
SELECT bt_index_check('comm.idx_comm_lis_dict_lab_vs_item_lab_id_deleted_union'::regclass, true); - Error:
ERROR: heap tuple (0,1) from table "lis_dict_lab_vs_item" lacks matching index tuple within index "idx_comm_lis_dict_lab_vs_item_lab_id_deleted_union"
Behavior Characteristics
- Always affects the composite non-unique index; primary key stays valid.
- REINDEX INDEX CONCURRENTLY fixes immediately.
- Large tables processed by pg_squeeze do NOT show the problem.
- Occurs even during quiescent period (writes paused) and with lock_timeout=0 (unlimited wait).
- bt_index_parent_check reproduces; bt_metap shows normal metadata.
Why small table + composite index?
Hypothesis:
- Fast rewrite of small table narrows time between slot creation and index build; tuples committed just before slot start are missed (shadow table lacks them; index built without them).
- apply_changes() may insert replayed tuples via heap_insert/simple_heap_insert without invoking table_insert or index_insert, so secondary non-unique index misses entries while unique PK path gets handled differently.
- Composite (lab_id, is_deleted) has frequent updates of is_deleted causing UPDATE replay path; if UPDATE handling skips index maintenance for non-unique indexes, gap appears.
- Large tables: longer rewrite duration lets logical decoding capture more WAL; window bug not triggered.
Diagnostics Used
SELECT bt_index_parent_check('comm.idx_comm_lis_dict_lab_vs_item_lab_id_deleted_union'::regclass);
SELECT * FROM bt_metap('comm.idx_comm_lis_dict_lab_vs_item_lab_id_deleted_union');
SELECT count(*) FROM comm.lis_dict_lab_vs_item;
REINDEX INDEX CONCURRENTLY comm.idx_comm_lis_dict_lab_vs_item_lab_id_deleted_union; → restores integrity.
Requested Upstream Actions
- Confirm insertion/update replay uses table_insert() (PG16 table AM) or explicit index_insert for all indexes.
- Review start LSN acquisition relative to lock acquisition for small tables.
- Provide guidance or patch to ensure composite non-unique indexes are updated during WAL change replay.
Potential Patch Direction (pseudo)
/* In change replay: replace raw heap_* calls */
- heap_insert(rel, tuple);
+ table_insert(rel, tuple, GetCurrentCommandId(true), 0, NULL); /* ensures index maintenance */Metadata
Metadata
Assignees
Labels
No labels