Skip to content

Partitioned tables do not work directly, due to lack of PK enforcement #296

@abhi-airspace-intelligence

Description

Bug Report

Describe the bug

Suppose we have a table in Postgres that is partitioned. I was trying to kick off an initial sync of a partitioned table, and ran into PK errors.

I asked ChatGPT 5 Thinking to write me the following SQL command to figure out if the leaf nodes of my partitioned table are correct:

-- Validate that each leaf partition has a UNIQUE, valid index
-- matching the parent table's PRIMARY KEY columns.
WITH RECURSIVE t AS (
  SELECT 'redacted_table'::regclass AS relid
  UNION ALL
  SELECT i.inhparent
  FROM t
  JOIN pg_inherits i ON i.inhrelid = t.relid
),
parent AS (
  SELECT relid
  FROM t
  WHERE NOT EXISTS (SELECT 1 FROM pg_inherits i WHERE i.inhrelid = t.relid)
  LIMIT 1
),
-- Parent PK columns (0 rows if no PK)
pk AS (
  SELECT con.conname AS pk_name,
         array_agg(a.attname ORDER BY x.n) AS pk_cols
  FROM pg_constraint con
  JOIN unnest(con.conkey) WITH ORDINALITY AS x(attnum, n) ON true
  JOIN pg_attribute a ON a.attrelid = con.conrelid AND a.attnum = x.attnum
  WHERE con.contype = 'p' AND con.conrelid = (SELECT relid FROM parent)
  GROUP BY con.conname
),
-- Leaf partitions
leaves AS (
  SELECT c.oid AS relid, c.oid::regclass::text AS relname
  FROM pg_partition_tree((SELECT relid FROM parent)) pt
  JOIN pg_class c ON c.oid = pt.relid
  WHERE pt.isleaf
)

SELECT * FROM (
  -- 1) Parent PK summary
  SELECT
    'parent_pk'::text         AS section,
    (SELECT (SELECT relid FROM parent)::regclass::text) AS relname,
    pk.pk_name                AS pk_name,
    pk.pk_cols                AS details            -- text[]: PK column list
  FROM pk

  UNION ALL

  -- 2) Per-leaf status (does a UNIQUE, valid index exist on the PK cols?)
  SELECT
    'leaf_check'::text        AS section,
    l.relname                 AS relname,
    NULL::text                AS pk_name,
    (CASE WHEN EXISTS (
        SELECT 1
        FROM pg_index ix
        WHERE ix.indrelid = l.relid
          AND ix.indisunique
          AND ix.indisvalid
          AND ARRAY(
                SELECT a.attname
                FROM unnest(ix.indkey) WITH ORDINALITY k(attnum, ord)
                JOIN pg_attribute a
                  ON a.attrelid = ix.indrelid AND a.attnum = k.attnum
                WHERE ord <= ix.indnkeyatts        -- exclude INCLUDE columns
                ORDER BY ord
              ) = (SELECT pk_cols FROM pk)
      )
      THEN ARRAY['OK']::text[]
      ELSE ARRAY['MISSING']::text[]
     END)                      AS details           -- text[]: OK/MISSING
  FROM leaves l
) AS u(section, relname, pk_name, details)
ORDER BY 1, 2;   -- section, relname

And returned the following result:

section,relname,pk_name,details
leaf_check,redacted_table_2025_aug_28_06,NULL,{OK}
leaf_check,redacted_table_2025_aug_28_07,NULL,{OK}
leaf_check,redacted_table_2025_aug_28_08,NULL,{OK}
leaf_check,redacted_table_2025_aug_28_09,NULL,{OK}
leaf_check,redacted_table_2025_aug_28_10,NULL,{OK}
leaf_check,redacted_table_2025_aug_28_11,NULL,{OK}
leaf_check,redacted_table_2025_aug_28_12,NULL,{OK}
leaf_check,redacted_table_2025_aug_28_13,NULL,{OK}
leaf_check,redacted_table_2025_aug_28_14,NULL,{OK}
leaf_check,redacted_table_2025_aug_28_15,NULL,{OK}
leaf_check,redacted_table_2025_aug_28_16,NULL,{OK}
leaf_check,redacted_table_2025_aug_28_17,NULL,{OK}
leaf_check,redacted_table_2025_aug_28_18,NULL,{OK}
leaf_check,redacted_table_2025_aug_28_19,NULL,{OK}
leaf_check,redacted_table_2025_aug_28_20,NULL,{OK}
leaf_check,redacted_table_2025_aug_28_21,NULL,{OK}
leaf_check,redacted_table_2025_aug_28_22,NULL,{OK}
leaf_check,redacted_table_2025_aug_28_23,NULL,{OK}
leaf_check,redacted_table_2025_aug_29_00,NULL,{OK}
leaf_check,redacted_table_2025_aug_29_01,NULL,{OK}
leaf_check,redacted_table_2025_aug_29_02,NULL,{OK}
leaf_check,redacted_table_2025_aug_29_03,NULL,{OK}
leaf_check,redacted_table_2025_aug_29_04,NULL,{OK}
leaf_check,redacted_table_2025_aug_29_05,NULL,{OK}
parent_pk,redacted_table,redacted_table_pkey,"{id,track_timestamp}"

It appears that although partitioned table leafs do have unique key enforcement, they don't technically have PKs. This causes the failure in this code:

https://github.com/supabase/etl/blob/main/etl/src/replication/table_sync.rs#L187-L211

Ideally, PKs should be "inherited" from their parent table in the case of partitioned tables. This might need to be done in the code path I touched in #295. Maybe even in the eyes of etl, it should be synced to a single table and not be treated as unique?

To Reproduce

Working on creating a minimum reproduction, but the general idea should be that table syncing should not fail with partitioned tables whose parent table has a PK.

Expected behavior

Not failure.

Screenshots

N/A

System information

  • MacOS Sequoia
  • Aurora Postgres RDS

Additional context

Add any other context about the problem here.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions