Fix role propagation error due to missing grantor dependencies #8466

Closed

Copilot wants to merge 64 commits into main from copilot/fix-role-propagation-error
Conversation

Contributor

Copilot AI commented Feb 5, 2026

DESCRIPTION: Add grantor role dependencies to fix node addition with interdependent roles

Problem

When adding nodes to a cluster, role propagation fails if role A is used as a grantor when granting role B to role C:

```sql
-- Role setup on coordinator
GRANT read_only_role TO role1 WITH ADMIN OPTION;
GRANT read_only_role TO role2 GRANTED BY role1;

-- Adding node fails with:
-- ERROR: permission denied to grant privileges as role "role1"
-- DETAIL: The grantor must have the ADMIN option on the role
```

The issue: role2 is propagated before role1 has its ADMIN option established, causing the `GRANTED BY role1` statement to fail.

Changes

Modified ExpandRolesToGroups in dependency.c to track grantor roles as dependencies alongside role memberships. This ensures grantors are propagated before roles that reference them in grant statements.

Before: Only the roles that roleid is a member of were tracked
After: Both member-of roles and grantor roles are tracked as dependencies

Added regression test covering the interdependent role scenario from the issue.
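To make the ordering idea concrete, here is a small sketch (plain Python, not the actual C code in `dependency.c`): every `pg_auth_members` tuple makes the member role depend on the granted role, and, with this fix, on the grantor as well, so a topological order over those dependencies creates grantors before the grants that reference them.

```python
# Hypothetical sketch of the dependency idea, not the real ExpandRolesToGroups:
# each (roleid, member, grantor) tuple makes `member` depend on `roleid`
# (the granted role) and, after this fix, on `grantor` as well.

from graphlib import TopologicalSorter

def role_creation_order(auth_members):
    """auth_members: list of (roleid, member, grantor) name tuples."""
    deps = {}
    for roleid, member, grantor in auth_members:
        node = deps.setdefault(member, set())
        node.add(roleid)            # the role being granted
        if grantor is not None:     # the fix: the grantor is a dependency too
            node.add(grantor)
    return list(TopologicalSorter(deps).static_order())

order = role_creation_order([
    ("read_only_role", "role1", None),     # GRANT ... WITH ADMIN OPTION
    ("read_only_role", "role2", "role1"),  # GRANT ... GRANTED BY role1
])
# role1 (and its ADMIN grant) now precedes role2
assert order.index("role1") < order.index("role2")
```

Without the grantor edge, nothing forces role1 ahead of role2, which is exactly the failure mode in the issue.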


Original prompt

This section details the original issue you should resolve

<issue_title>Role propagation error due to interdependent roles while adding new nodes</issue_title>
<issue_description>On the cluster, we had a role set like the following:

select * from pg_roles;

rolname: read_only_role
rolsuper = f       
rolinherit = t          
rolcreaterole = f
rolcreatedb = f
rolcanlogin = t
rolreplication = f
rolbypassrls = f
rolconnlimit = -1 
rolvaliduntil = ''
SELECT
  parent.rolname  AS role,
  member.rolname  AS member,
  am.admin_option
FROM pg_auth_members am
JOIN pg_roles parent ON parent.oid = am.roleid
JOIN pg_roles member ON member.oid = am.member
ORDER BY role, member;

            role             |     member     | admin_option
-----------------------------+----------------+--------------
...
...
read_only_role              | role1     | t
read_only_role              | role2     | f
role2                       | role1     | t

On the Pg logs for the newly created node:

... ERROR: permission denied to grant privileges as role "role1"
... DETAIL:  The grantor must have the ADMIN option on the role "read_only_role"
STATEMENT: SELECT worker_create_or_alter_role('role2', 'CREATE ROLE role2 NOSUPERUSER NOCREATEDB NOCREATEROLE INHERIT LOGIN NOREPLICATION NOBYPASSRLS CONNECTION LIMIT -1 PASSWORD ''SCRAM SHA *** '); GRANT read_only_role to role2 WITH INHERIT TRUE GRANTED BY role1;

In the newly created node, citus creates the objects in the following order:

- CREATE ROLE read_only_role 
- CREATE ROLE role2
- GRANT read_only_role to role2 WITH INHERIT TRUE GRANTED BY role1;
      - ... but at this time, since role1 does not have the ADMIN option, the above GRANT cannot succeed.
      - The fix is to make sure the "read_only_role | role1 | t" entry is created before the statement above.

To sum up, citus creates interdependent roles in an incorrect order, and this failure causes the add-node operation to fail. The new node cannot be registered until we log in to the new node and execute:

CREATE ROLE read_only_role WITH NOSUPERUSER INHERIT NOCREATEROLE NOCREATEDB LOGIN NOREPLICATION NOBYPASSRLS CONNECTION LIMIT -1
GRANT read_only_role TO  espdevadmin  WITH ADMIN OPTION;

so that node registration can succeed.
</issue_description>

Comments on the Issue (you are @copilot in this section)



@emelsimsek emelsimsek removed their assignment Feb 5, 2026
Copilot AI changed the title from "[WIP] Fix role propagation error for interdependent roles" to "Fix role propagation error due to missing grantor dependencies" Feb 5, 2026
Copilot AI requested a review from emelsimsek February 5, 2026 09:15
@codecov

codecov Bot commented Feb 26, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 88.75%. Comparing base (9036b48) to head (d43c49f).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8466      +/-   ##
==========================================
- Coverage   88.75%   88.75%   -0.01%     
==========================================
  Files         288      288              
  Lines       64254    64268      +14     
  Branches     8085     8086       +1     
==========================================
+ Hits        57027    57039      +12     
- Misses       4892     4893       +1     
- Partials     2335     2336       +1     

Copilot AI and others added 4 commits April 24, 2026 10:26
Co-authored-by: emelsimsek <13130350+emelsimsek@users.noreply.github.com>
…O(n) complexity

Co-authored-by: emelsimsek <13130350+emelsimsek@users.noreply.github.com>
Co-authored-by: emelsimsek <13130350+emelsimsek@users.noreply.github.com>
@ihalatci ihalatci force-pushed the copilot/fix-role-propagation-error branch from 8dade25 to 1735140 Compare April 24, 2026 10:40
- Replace separate seenGrantors list with a single seenRoleIds set so the
  parent role and grantor candidates share dedup, eliminating the case
  where the same OID could appear twice in the dependency list.
- Iterate over the (roleid, grantor) pair using a small array to keep
  the per-tuple logic compact.
- Filter out InvalidOid grantors defensively.
- Update expected output of create_role_propagation: the existing
  dist_role_1 -> non_dist_role_2 (via non_dist_role_1) scenario now
  correctly propagates non_dist_role_1 to worker_2 as well, matching
  the new interdep_role* test.
@ihalatci ihalatci force-pushed the copilot/fix-role-propagation-error branch from 1735140 to 992d063 Compare April 24, 2026 10:40
Including grantors of pg_auth_members tuples in the generic role
dependency expansion (ExpandCitusSupportedTypes) caused false-positive
circular-dependency errors for legitimate mutual GRANTs. This surfaced
in PG16/17/18 vanilla regression (privileges, create_role, psql tests)
where statements like

    GRANT regress_priv_user9 TO regress_priv_user8;
    SET SESSION AUTHORIZATION regress_priv_user8;
    GRANT pg_read_all_settings TO regress_priv_user9 WITH ADMIN OPTION;

started failing with:

    ERROR: Citus can not handle circular dependencies between distributed
    objects
    DETAIL: "role regress_priv_user9" circularly depends itself, resolve
    circular dependency first

Grantor ordering is only actually required when creating the complete
role graph on a freshly-activated node. Move that behavior to a new
ExpandCitusSupportedTypesForNodeActivation callback used exclusively by
OrderObjectAddressListInDependencyOrder, and restore the
ExpandCitusSupportedTypes AuthId expansion to parent-roles-only.
@ihalatci ihalatci marked this pull request as ready for review April 24, 2026 17:29
ihalatci and others added 11 commits April 24, 2026 20:31
Phase 1 of the sorted-merge feature. This commit adds the data structures and GUC needed by later phases, with zero behavioral changes:
- SortedMergeKey typedef in multi_physical_planner.h describing one
  sort key for the coordinator k-way merge
- useSortedMerge, sortedMergeKeys[], sortedMergeKeyCount fields on
  DistributedPlan (plan-time decision, never checked at runtime via GUC)
- sortedMergeEligible field on MultiExtendedOp (logical optimizer tag
  read by the physical planner)
- Hidden GUC citus.enable_sorted_merge (PGC_SUSET, default off,
  GUC_NO_SHOW_ALL) consulted only during planning
- Serialization in citus_outfuncs.c and deep-copy in citus_copyfuncs.c
  for all new fields

All new fields default to false/0/NULL. Existing regression tests are
unaffected.

Co-authored-by: Copilot
Phase 2 of the sorted-merge feature. Workers now sort their results
when citus.enable_sorted_merge is enabled at planning time, even for
queries without LIMIT. The plan metadata is populated so later phases
can execute the merge and set pathkeys.

Logical optimizer changes (multi_logical_optimizer.c):
- WorkerSortClauseList() gains an early-return path that pushes the
  sort clause to workers when the GUC is on and the sort is safe
  (no aggregates in ORDER BY, no non-pushable window functions,
  and either no GROUP BY or GROUP BY on partition column).
- WorkerExtendedOpNode() sets sortedMergeEligible = true when the
  worker sort clause semantically matches the original sort clause,
  using the new SortClauseListsMatch() helper.
- SortClauseListsMatch() compares tleSortGroupRef, sortop,
  nulls_first, and eqop for each pair.

Physical planner changes (multi_physical_planner.c):
- CreatePhysicalDistributedPlan() finds the worker MultiExtendedOp
  with sortedMergeEligible = true, builds SortedMergeKey metadata
  from the worker job query, and sets useSortedMerge on the plan.
- BuildSortedMergeKeys() constructs the key array from the worker
  query's SortGroupClause list and target list.

The coordinator Sort node is still present above the CustomScan
(pathkeys not set yet — that is Phase 4). Results are correct
because the redundant Sort re-sorts already-sorted data.

Co-authored-by: Copilot
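The `SortClauseListsMatch()` comparison described above can be sketched like this. The field names mirror PostgreSQL's `SortGroupClause` (`tleSortGroupRef`, `sortop`, `nulls_first`, `eqop`), but this is an illustrative Python restatement, not the actual C helper:

```python
# Illustrative sketch of SortClauseListsMatch(): two sort clause lists
# match only if they have the same length and every pairwise clause agrees
# on sort reference, sort operator, NULLS ordering, and equality operator.

def sort_clause_lists_match(worker_clauses, original_clauses):
    """Each clause is a dict with tleSortGroupRef, sortop, nulls_first, eqop."""
    if len(worker_clauses) != len(original_clauses):
        return False
    keys = ("tleSortGroupRef", "sortop", "nulls_first", "eqop")
    return all(
        all(w[k] == o[k] for k in keys)
        for w, o in zip(worker_clauses, original_clauses)
    )

a = [{"tleSortGroupRef": 1, "sortop": 97, "nulls_first": False, "eqop": 96}]
b = [{"tleSortGroupRef": 1, "sortop": 521, "nulls_first": False, "eqop": 96}]
assert sort_clause_lists_match(a, a)
assert not sort_clause_lists_match(a, b)   # different sortop: no match
```

If the worker sort clause does not match semantically, `sortedMergeEligible` stays false and the plan falls back to the regular path.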
Phase 3 of the sorted-merge feature. When distributedPlan->useSortedMerge
is true (set at planning time by Phase 2), the adaptive executor now:
1. Routes worker results into per-task tuple stores via a new
   PerTaskDispatchTupleDest that dispatches by task->taskId hash lookup.
   No Task fields are mutated — all state lives on DistributedExecution.
2. After all tasks complete, performs a k-way merge of the per-task stores
   into the final scanState->tuplestorestate using PostgreSQL's public
   binaryheap and SortSupport APIs.
3. Frees per-task stores after the merge.

The existing CitusExecScan/ReturnTupleFromTuplestore/CitusEndScan/
CitusReScan code paths are completely unchanged — they read from
the final tuplestore exactly as before.

New files:
- sorted_merge.h: CreatePerTaskDispatchDest, MergePerTaskStoresIntoFinalStore
- sorted_merge.c: PerTaskDispatchTupleDest with taskId->index hash routing,
  MergePerTaskStoresIntoFinalStore with binaryheap merge, MergeHeapComparator
  modeled after PG's heap_compare_slots in nodeMergeAppend.c
Modified:
- adaptive_executor.c: DistributedExecution gains useSortedMerge/perTaskStores/
  perTaskStoreCount fields. AdaptiveExecutor() branches on useSortedMerge to
  create per-task stores, then merges post-execution. EXPLAIN ANALYZE falls
  back to existing single-tuplestore path.
Safety:
- Shared TupleDestinationStats preserves citus.max_intermediate_result_size
- Per-task stores allocated in AdaptiveExecutor local memory context
  (auto-cleanup on error via PG memory context teardown)
- task->totalReceivedTupleData tracking preserved

The coordinator Sort node is still present above the CustomScan (pathkeys
not set until Phase 4). Results are correct because the redundant Sort
re-sorts already-sorted data.
  Co-authored-by: Copilot
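The per-task dispatch idea above can be sketched as a taskId-keyed lookup that routes each incoming worker tuple into that task's buffer, keeping all state outside the Task objects. The class and method names here are illustrative, not Citus's actual API:

```python
# Hypothetical sketch of PerTaskDispatchTupleDest: route each tuple into
# a per-task store via a taskId -> store hash lookup, so no Task fields
# are mutated and all state lives on the execution object.

class PerTaskDispatch:
    def __init__(self, task_ids):
        self.stores = {tid: [] for tid in task_ids}  # one "tuplestore" per task

    def put_tuple(self, task_id, row):
        self.stores[task_id].append(row)             # hash-lookup routing

d = PerTaskDispatch([10, 11])
d.put_tuple(10, ("a", 1))
d.put_tuple(11, ("b", 2))
d.put_tuple(10, ("c", 3))
assert d.stores[10] == [("a", 1), ("c", 3)]          # per-task order preserved
```

Because each store receives only its own task's already-sorted output, the post-execution k-way merge can treat every store as a sorted run.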
neildsh and others added 2 commits May 13, 2026 23:14
…Y queries (#8529)

# Streaming sorted merge for distributed `ORDER BY`

## Summary

This PR adds a streaming sorted-merge execution path for multi-shard
`SELECT ... ORDER BY` queries. When `citus.enable_sorted_merge` is
enabled at planning time, Citus pushes safe `ORDER BY` clauses to
workers, advertises sorted output from the coordinator `CustomScan`, and
streams globally sorted tuples through a binary-heap merge over per-task
tuplestores.

The main effect is that PostgreSQL no longer needs to add a
coordinator-side `Sort` node above the Citus scan. Worker shards do
shard-local sorting in parallel, and the coordinator performs an `O(N
log K)` k-way merge, where `K` is the task count, instead of an `O(N log
N)` full-result sort.
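The coordinator merge can be sketched with a standard binary-heap k-way merge: each task contributes a locally sorted run, and the heap yields the next globally smallest row in O(log K) per tuple. This is a minimal Python sketch; the real implementation uses PostgreSQL's `binaryheap` and SortSupport APIs in C:

```python
# Minimal k-way merge over per-task sorted runs: O(N log K) total,
# where K is the number of runs (tasks) and N the total row count.

import heapq

def k_way_merge(per_task_rows, key=lambda row: row):
    # seed the heap with the first row of each non-empty run
    heap = [
        (key(rows[0]), task, 0)
        for task, rows in enumerate(per_task_rows) if rows
    ]
    heapq.heapify(heap)
    while heap:
        _, task, idx = heapq.heappop(heap)
        yield per_task_rows[task][idx]
        idx += 1
        if idx < len(per_task_rows[task]):  # refill from the same run
            heapq.heappush(heap, (key(per_task_rows[task][idx]), task, idx))

merged = list(k_way_merge([[1, 4, 7], [2, 5], [3, 6, 8]]))
assert merged == [1, 2, 3, 4, 5, 6, 7, 8]
```

The heap never holds more than K entries, which is where the O(N log K) bound over a full O(N log N) coordinator sort comes from.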

## What changed

### Planner

- Adds hidden experimental GUC `citus.enable_sorted_merge` (currently
default `on`, `PGC_SUSET`).
- Extends worker sort pushdown beyond the existing LIMIT/DISTINCT cases
when the query is safe for sorted merge:
  - query has an `ORDER BY`,
  - window functions, if any, are pushable,
  - `ORDER BY` does not contain aggregate expressions,
  - `GROUP BY` is empty or grouped by the partitioning side.
- Records `DistributedPlan.useSortedMerge` during physical planning.
- Sets Citus CustomScan `pathkeys` to the combine query's required sort
order so PostgreSQL elides the parent Sort.
- Uses a dedicated `CustomScanMethods` entry, shown in EXPLAIN as
`Custom Scan (Citus Sorted Merge Adaptive)`.
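The eligibility conditions above can be restated as a single predicate. The dictionary keys below are illustrative flags for this sketch only, not Citus's actual planner data structures:

```python
# Toy restatement of the sorted-merge eligibility checks listed above.

def sorted_merge_eligible(query):
    if not query["has_order_by"]:
        return False
    if query["has_window_functions"] and not query["window_functions_pushable"]:
        return False
    if query["order_by_contains_aggregates"]:
        return False
    # GROUP BY must be empty or grouped by the partitioning side
    return not query["has_group_by"] or query["grouped_by_partition_column"]

q = {"has_order_by": True, "has_window_functions": False,
     "window_functions_pushable": False, "order_by_contains_aggregates": False,
     "has_group_by": True, "grouped_by_partition_column": True}
assert sorted_merge_eligible(q)
assert not sorted_merge_eligible({**q, "order_by_contains_aggregates": True})
```

These exclusions are exactly why the `ORDER BY total` and `GROUP BY user_id` counter-examples later in this description stay on the existing planning path.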

### Executor

- Adds `sorted_merge.c` / `sorted_merge.h` with a `SortedMergeAdapter`
modeled after PostgreSQL `MergeAppend`.
- Creates one tuplestore per task and routes worker tuples directly into
the matching task store through `task->tupleDest`.
- Shares one `TupleDestinationStats` object across all per-task
destinations so `citus.max_intermediate_result_size` is enforced across
the whole result, not per task.
- Builds `SortSupportData` from the worker query sort clauses and uses a
binary heap to return the next globally sorted tuple.
- Adds a sorted-merge-specific `ExecCustomScan` callback that reads
directly from the adapter, avoiding per-row branching between adapter
and normal tuplestore paths.
- Clears cached task `tupleDest` pointers after execution so
prepared-plan reuse cannot see stale execution-local state.

### Cursor and backward-scan behavior

The streaming adapter is forward-only.

- Sorted-merge plans do not advertise
`CUSTOMPATH_SUPPORT_BACKWARD_SCAN`.
- For `SCROLL` cursors, Citus reinserts the same Material wrapper
PostgreSQL would normally add for non-backward-scannable plans. This is
necessary because Citus replaces the plan tree after
`standard_planner()`.
- Non-scroll backward fetches remain unsupported, matching cursor
semantics.
- `SortedMergeAdapterRescan()` exists as a defensive/rescan path and
rebuilds the heap from the per-task stores.

### Observability and stats

- EXPLAIN plan shape changes from `Sort -> Custom Scan (Citus Adaptive)`
to `Custom Scan (Citus Sorted Merge Adaptive)` for eligible queries.
- The executor type is tracked as `MULTI_EXECUTOR_SORTED_MERGE` so query
stats can distinguish sorted-merge executions from regular adaptive
executions.

## Query behavior

Eligible examples:

```sql
SELECT id, val
FROM distributed_table
ORDER BY id;

SELECT id, val, created_at
FROM distributed_table
WHERE created_at >= now() - interval '1 day'
ORDER BY created_at, id;

SELECT tenant_id, count(*)
FROM distributed_table
GROUP BY tenant_id
ORDER BY tenant_id;
```

Intentionally not eligible:

```sql
-- ORDER BY aggregate expression
SELECT tenant_id, sum(amount) AS total
FROM distributed_table
GROUP BY tenant_id
ORDER BY total;

-- GROUP BY non-distribution column
SELECT user_id, count(*)
FROM distributed_table
GROUP BY user_id
ORDER BY user_id;
```

## Performance summary

Numbers below are from `ssm_presentation_0505.md`, using the latest
documented run:

- PostgreSQL 18.3 release build under `$HOME/pg18-release`
- Citus built from this branch
- 1 coordinator on port 9700, 4 workers on ports 9701-9704
- 8 shards, `work_mem=64MB`, `shared_buffers=4GB`, `jit=off`
- `pgbench -n -T 60 -c $c -j $c -P 30 -p 9700 -d citus`
- 2 tables x 2 configs x 8 query shapes x 3 client counts, 3 runs per
cell
- `BASE`: `citus.enable_sorted_merge=off`
- `STREAM`: `citus.enable_sorted_merge=on`

### Query-shape rollup

| Query | Shape | STREAM faster cells | Median speedup |
|---|---|---:|---:|
| q1 | small index only scan | 1/6 | 0.99x |
| q2 | wider projection sorted by `created_at` | 6/6 | 1.03x |
| q3 | multi-key sort | 6/6 | 1.08x |
| q4 | unindexed sort key | 6/6 | 1.12x |
| q5 | full row, no LIMIT | 6/6 | 1.57x |
| q6 | full row with LIMIT | 6/6 | 1.65x |
| q7/q8 | safety counter-tests | flat | 1.00x |

### Most important performance takeaways

- The strongest wins are full-row sorted reads:
  - q5 (`SELECT * ... ORDER BY event_id`) improved about **1.50x-1.63x**.
  - q6 (`SELECT * ... ORDER BY event_id LIMIT 20000`) improved about
  **1.64x-1.94x**.
- q4, which orders by an unindexed key, improved in every cell by about
**1.10x-1.14x**, showing the benefit of parallel worker sorts plus
coordinator merge.
- q2/q3 show smaller but consistent wins when the coordinator Sort is
not the dominant cost.
- q1 is neutral to slightly negative on the indexed table; the result is
small enough that sorted-merge overhead can outweigh the coordinator
Sort it removes.
- q7/q8 counter-tests stayed flat, validating that the planner
exclusions for aggregate `ORDER BY` and non-partition-column grouping
avoid changing those query shapes.

## Tests

- Adds `src/test/regress/sql/multi_orderby_pushdown.sql` and expected
output.
- Adds coverage for:
  - ASC/DESC, NULLS FIRST/LAST, mixed direction, multi-column ORDER BY,
  - ORDER BY non-distribution columns,
  - GROUP BY distribution column,
  - ORDER BY expressions,
  - LIMIT/OFFSET, DISTINCT, DISTINCT ON, UNION ALL, CTEs, subqueries, joins,
  - prepared statements where the GUC is toggled after plan caching,
  - non-scroll cursor backward-fetch behavior,
  - SCROLL cursor behavior via Material insertion,
  - small `work_mem`,
  - `citus.max_intermediate_result_size`,
  - MX mode.
- Updates EXPLAIN expected output where eligible queries now show
`Custom Scan (Citus Sorted Merge Adaptive)` instead of a coordinator
Sort.

## Notes / limitations

- The adapter streams from per-task tuplestores after
`RunDistributedExecution()` has drained worker output. It does not yet
interleave network reads with merge output.
- The feature is intentionally limited to safe `ORDER BY` shapes.
Aggregate `ORDER BY` and non-partition-column grouping remain on
existing planning paths.
ihalatci added a commit that referenced this pull request May 14, 2026
DESCRIPTION: Add grantor role dependencies to fix node addition with interdependent roles

Modified ExpandRolesToGroups in dependency.c to track grantor roles as
dependencies alongside role memberships during node activation. This
ensures grantors are propagated before roles that reference them in
GRANT ... GRANTED BY statements.

Added regression test covering the interdependent role scenario.

Fixes #8425
ihalatci added a commit that referenced this pull request May 15, 2026
Release prep for **citus v14.1.0** from `release-14.0`.

DESCRIPTION: add changelog for 14.1.0

All version-bearing files (`configure.ac`, `configure`, `citus.control`,
`citus_columnar.control`, `multi_extension.out` `SHOW citus.version`)
were already bumped to 14.1.0 / 14.1-1 in a5ae9f4 (Mar 17, 2026), so
this PR is a single-file CHANGELOG addition.

### Changelog entries (user-visible only)
- #8541 — schema-based-sharding DDLs from any node
- #8543, #8571 — `citus_cluster_changes_{block,unblock,block_status}`
UDFs for snapshot-backup coordination
- #8466 — fix role propagation when `GRANT ... GRANTED BY` introduces
grantor deps
- #8540, #8560 — fix self-deadlock on named constraint with long
partition shard name

CI/test-infra and version-bump commits intentionally excluded.
ihalatci added a commit that referenced this pull request May 15, 2026
…sing grantor dependencies (#8579)

Backport of #8466 to release-13.2.

DESCRIPTION: Add grantor role dependencies to fix node addition with
interdependent roles

Modified `ExpandRolesToGroups` in `dependency.c` to track grantor roles
as dependencies alongside role memberships during node activation. This
ensures grantors are propagated before roles that reference them in
`GRANT ... GRANTED BY` statements.

Added regression test covering the interdependent role scenario.

Fixes #8425

### Verification (local)
- Build: PG 16.10 ✅
- Regression (`check-multi`): 188/188 ✅
- Style check (vs. stylechecker image, uncrustify 0.68.1): no new issues
introduced
- Rebased onto latest origin/release-13.2 (12 newer upstream commits,
none touch the same files)
@ihalatci
Contributor

Superseded by #8589, which applies the same change cleanly onto current main (this PR had accumulated unrelated merge commits during its lifetime). The change has already been backported to release-14.0 (#8568), release-13.2 (#8579), and release-12.1 (#8569).

@ihalatci ihalatci closed this May 15, 2026
Development

Successfully merging this pull request may close these issues.

Role propagation error due to interdependent roles while adding new nodes

4 participants