Skip to content

Enable PgAdmin to cope with Citus-sharding query plans #7885

@RoystonS

Description

@RoystonS

Describe the solution you'd like

Currently, PgAdmin doesn't cope well with query plans coming out of running EXPLAIN queries in a Citus-enabled shared environment. Typically PgAdmin merely shows a 'Custom Scan' node at the top level and no other information:

For example:
image
and
image

It would be much nicer if PgAdmin could poke into the query plan and apply all its usual lovely query formatting logic where possible.

Looking at the existing code, it wouldn't be terribly difficult to modify the existing parsing logic to cope with a Citus query plan.

Such a plan looks like this:

[
    {
        "Plan": {
            "Node Type": "Custom Scan",
            "Custom Plan Provider": "Citus Adaptive",
            "Parallel Aware": false,
            "Async Capable": false,
            "Startup Cost": 0,
            "Total Cost": 0,
            "Plan Rows": 0,
            "Plan Width": 0,
            "Output": [
                "remote_scan.campaign_id",
                "remote_scan.rank",
                "remote_scan.n_impressions",
                "remote_scan.id"
            ],
            "Distributed Query": {
                "Job": {
                    "Task Count": 1,
                    "Tasks Shown": "All",
                    "Tasks": [
                        {
                            "Query": "SELECT a.campaign_id, rank() OVER (PARTITION BY a.campaign_id ORDER BY a.campaign_id, (count(*)) DESC) AS rank, count(*) AS n_impressions, a.id FROM (public.ads_102110 a JOIN public.impressions_102174 i ON (((i.company_id OPERATOR(pg_catalog.=) a.company_id) AND (i.ad_id OPERATOR(pg_catalog.=) a.id)))) WHERE (a.company_id OPERATOR(pg_catalog.=) 5) GROUP BY a.campaign_id, a.id ORDER BY a.campaign_id, (count(*)) DESC",
                            "Node": "host=citus-worker-7 port=5432 dbname=postgres",
                            "Remote Plan": [
                                [
                                    {
                                        "Plan": {
                                            "Node Type": "WindowAgg",
                                            "Parallel Aware": false,
                                            "Async Capable": false,
                                            "Startup Cost": 230.91,
                                            "Total Cost": 232.55,
                                            "Plan Rows": 73,
                                            "Plan Width": 32,
                                            "Output": [
                                                "a.campaign_id",
                                                "rank() OVER (?)",
                                                "(count(*))",
                                                "a.id"
                                            ],
                                            "Plans": [
                                                {
                                                    "Node Type": "Sort",
                                                    "Parent Relationship": "Outer",
                                                    "Parallel Aware": false,
                                                    "Async Capable": false,
                                                    "Startup Cost": 230.91,
                                                    "Total Cost": 231.09,
                                                    "Plan Rows": 73,
                                                    "Plan Width": 24,
                                                    "Output": [
                                                        "a.campaign_id",
                                                        "(count(*))",
                                                        "a.id"
                                                    ],
                                                    "Sort Key": [
                                                        "a.campaign_id",
                                                        "(count(*)) DESC"
                                                    ],
                                                    "Plans": [
                                                        {
                                                            "Node Type": "Aggregate",
                                                            "Strategy": "Hashed",
                                                            "Partial Mode": "Simple",
                                                            "Parent Relationship": "Outer",
                                                            "Parallel Aware": false,
                                                            "Async Capable": false,
                                                            "Startup Cost": 227.92,
                                                            "Total Cost": 228.65,
                                                            "Plan Rows": 73,
                                                            "Plan Width": 24,
                                                            "Output": [
                                                                "a.campaign_id",
                                                                "count(*)",
                                                                "a.id"
                                                            ],
                                                            "Group Key": [
                                                                "a.campaign_id",
                                                                "a.id"
                                                            ],
                                                            "Planned Partitions": 0,
                                                            "Plans": [
                                                                {
                                                                    "Node Type": "Hash Join",
                                                                    "Parent Relationship": "Outer",
                                                                    "Parallel Aware": false,
                                                                    "Async Capable": false,
                                                                    "Join Type": "Inner",
                                                                    "Startup Cost": 5.66,
                                                                    "Total Cost": 204.4,
                                                                    "Plan Rows": 3136,
                                                                    "Plan Width": 16,
                                                                    "Output": [
                                                                        "a.campaign_id",
                                                                        "a.id"
                                                                    ],
                                                                    "Inner Unique": true,
                                                                    "Hash Cond": "(i.ad_id = a.id)",
                                                                    "Plans": [
                                                                        {
                                                                            "Node Type": "Seq Scan",
                                                                            "Parent Relationship": "Outer",
                                                                            "Parallel Aware": false,
                                                                            "Async Capable": false,
                                                                            "Relation Name": "impressions_102174",
                                                                            "Schema": "public",
                                                                            "Alias": "i",
                                                                            "Startup Cost": 0,
                                                                            "Total Cost": 190.26,
                                                                            "Plan Rows": 3136,
                                                                            "Plan Width": 16,
                                                                            "Output": [
                                                                                "i.id",
                                                                                "i.company_id",
                                                                                "i.ad_id",
                                                                                "i.seen_at",
                                                                                "i.site_url",
                                                                                "i.cost_per_impression_usd",
                                                                                "i.user_ip",
                                                                                "i.user_data"
                                                                            ],
                                                                            "Filter": "(i.company_id = 5)"
                                                                        },
                                                                        {
                                                                            "Node Type": "Hash",
                                                                            "Parent Relationship": "Inner",
                                                                            "Parallel Aware": false,
                                                                            "Async Capable": false,
                                                                            "Startup Cost": 4.75,
                                                                            "Total Cost": 4.75,
                                                                            "Plan Rows": 73,
                                                                            "Plan Width": 24,
                                                                            "Output": [
                                                                                "a.campaign_id",
                                                                                "a.id",
                                                                                "a.company_id"
                                                                            ],
                                                                            "Plans": [
                                                                                {
                                                                                    "Node Type": "Seq Scan",
                                                                                    "Parent Relationship": "Outer",
                                                                                    "Parallel Aware": false,
                                                                                    "Async Capable": false,
                                                                                    "Relation Name": "ads_102110",
                                                                                    "Schema": "public",
                                                                                    "Alias": "a",
                                                                                    "Startup Cost": 0,
                                                                                    "Total Cost": 4.75,
                                                                                    "Plan Rows": 73,
                                                                                    "Plan Width": 24,
                                                                                    "Output": [
                                                                                        "a.campaign_id",
                                                                                        "a.id",
                                                                                        "a.company_id"
                                                                                    ],
                                                                                    "Filter": "(a.company_id = 5)"
                                                                                }
                                                                            ]
                                                                        }
                                                                    ]
                                                                }
                                                            ]
                                                        }
                                                    ]
                                                }
                                            ]
                                        },
                                        "Planning Time": 0.125
                                    }
                                ]
                            ]
                        }
                    ]
                }
            }
        },
        "Settings": {},
        "Planning Time": 0.266
    }
]

Note that once we get down to the WindowAgg Plan, we're into territory that PgAdmin can already deal with, so implementing this feature should basically be a case of adding support for the top-level Citus JSON nodes, and then the existing code takes over.

Describe alternatives you've considered

There are at least a couple of ways to do this:

  1. Add specific handling for the various 'Distributed Query', 'Job', 'Tasks' portions of the Citus query in Explain/index.jsx#parsePlan. This would somewhat duplicate some of the nesting logic already present where PgAdmin already knows how to cope with Plan nodes which contain Plans array properties. So there's a simple alternative:
  2. It would be easy to transform the Citus query nodes into something that looks more like the nodes that PgAdmin already knows how to handle. That is, transform the upper Citus query part of the query plan tree into nodes that contain Node Type and Plans properties. PgAdmin would then 'just work' as-is.

Additional context

I have regular access to Citus deployments, and would be happy to work on this feature (with a little architecture/design assistance from an expert).

In fact, I've already made a proof-of-concept for option 2 above, producing these displays for the above JSON:
image
and
image

Metadata

Metadata

Type

No type

Projects

No projects

Milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions