-
-
Notifications
You must be signed in to change notification settings - Fork 62
Description
Bug Report
Describe the bug
Suppose we have a table in Postgres that is partitioned. I was trying to kick off an initial sync of a partitioned table, and ran into PK errors.
I asked ChatGPT 5 Thinking to write me the following SQL command to figure out if the leaf nodes of my partitioned table are correct:
-- Validate that each leaf partition has a UNIQUE, valid index
-- matching the parent table's PRIMARY KEY columns.
WITH RECURSIVE t AS (
SELECT 'redacted_table'::regclass AS relid
UNION ALL
SELECT i.inhparent
FROM t
JOIN pg_inherits i ON i.inhrelid = t.relid
),
parent AS (
SELECT relid
FROM t
WHERE NOT EXISTS (SELECT 1 FROM pg_inherits i WHERE i.inhrelid = t.relid)
LIMIT 1
),
-- Parent PK columns (0 rows if no PK)
pk AS (
SELECT con.conname AS pk_name,
array_agg(a.attname ORDER BY x.n) AS pk_cols
FROM pg_constraint con
JOIN unnest(con.conkey) WITH ORDINALITY AS x(attnum, n) ON true
JOIN pg_attribute a ON a.attrelid = con.conrelid AND a.attnum = x.attnum
WHERE con.contype = 'p' AND con.conrelid = (SELECT relid FROM parent)
GROUP BY con.conname
),
-- Leaf partitions
leaves AS (
SELECT c.oid AS relid, c.oid::regclass::text AS relname
FROM pg_partition_tree((SELECT relid FROM parent)) pt
JOIN pg_class c ON c.oid = pt.relid
WHERE pt.isleaf
)
SELECT * FROM (
-- 1) Parent PK summary
SELECT
'parent_pk'::text AS section,
(SELECT (SELECT relid FROM parent)::regclass::text) AS relname,
pk.pk_name AS pk_name,
pk.pk_cols AS details -- text[]: PK column list
FROM pk
UNION ALL
-- 2) Per-leaf status (does a UNIQUE, valid index exist on the PK cols?)
SELECT
'leaf_check'::text AS section,
l.relname AS relname,
NULL::text AS pk_name,
(CASE WHEN EXISTS (
SELECT 1
FROM pg_index ix
WHERE ix.indrelid = l.relid
AND ix.indisunique
AND ix.indisvalid
AND ARRAY(
SELECT a.attname
FROM unnest(ix.indkey) WITH ORDINALITY k(attnum, ord)
JOIN pg_attribute a
ON a.attrelid = ix.indrelid AND a.attnum = k.attnum
WHERE ord <= ix.indnkeyatts -- exclude INCLUDE columns
ORDER BY ord
) = (SELECT pk_cols FROM pk)
)
THEN ARRAY['OK']::text[]
ELSE ARRAY['MISSING']::text[]
END) AS details -- text[]: OK/MISSING
FROM leaves l
) AS u(section, relname, pk_name, details)
ORDER BY 1, 2; -- section, relname
And returned the following result:
section,relname,pk_name,details
leaf_check,redacted_table_2025_aug_28_06,NULL,{OK}
leaf_check,redacted_table_2025_aug_28_07,NULL,{OK}
leaf_check,redacted_table_2025_aug_28_08,NULL,{OK}
leaf_check,redacted_table_2025_aug_28_09,NULL,{OK}
leaf_check,redacted_table_2025_aug_28_10,NULL,{OK}
leaf_check,redacted_table_2025_aug_28_11,NULL,{OK}
leaf_check,redacted_table_2025_aug_28_12,NULL,{OK}
leaf_check,redacted_table_2025_aug_28_13,NULL,{OK}
leaf_check,redacted_table_2025_aug_28_14,NULL,{OK}
leaf_check,redacted_table_2025_aug_28_15,NULL,{OK}
leaf_check,redacted_table_2025_aug_28_16,NULL,{OK}
leaf_check,redacted_table_2025_aug_28_17,NULL,{OK}
leaf_check,redacted_table_2025_aug_28_18,NULL,{OK}
leaf_check,redacted_table_2025_aug_28_19,NULL,{OK}
leaf_check,redacted_table_2025_aug_28_20,NULL,{OK}
leaf_check,redacted_table_2025_aug_28_21,NULL,{OK}
leaf_check,redacted_table_2025_aug_28_22,NULL,{OK}
leaf_check,redacted_table_2025_aug_28_23,NULL,{OK}
leaf_check,redacted_table_2025_aug_29_00,NULL,{OK}
leaf_check,redacted_table_2025_aug_29_01,NULL,{OK}
leaf_check,redacted_table_2025_aug_29_02,NULL,{OK}
leaf_check,redacted_table_2025_aug_29_03,NULL,{OK}
leaf_check,redacted_table_2025_aug_29_04,NULL,{OK}
leaf_check,redacted_table_2025_aug_29_05,NULL,{OK}
parent_pk,redacted_table,redacted_table_pkey,"{id,track_timestamp}"
It appears that although partitioned table leafs do have unique key enforcement, they don't technically have PKs. This causes the failure in this code:
https://github.com/supabase/etl/blob/main/etl/src/replication/table_sync.rs#L187-L211
Ideally, PKs should be "inherited" from their parent table in the case of partitioned tables. This might need to be done in the code path I touched in #295. Maybe even in the eyes of etl
, it should be synced to a single table and not be treated as unique?
To Reproduce
Working on creating a minimum reproduction, but the general idea should be that table syncing should not fail with partitioned tables whose parent table has a PK.
Expected behavior
Not failure.
Screenshots
N/A
System information
- MacOS Sequoia
- Aurora Postgres RDS
Additional context
Add any other context about the problem here.