This document isolates the Athena-specific and Lero-specific changes in the current PG16.1-based fork and maps them to PostgreSQL 18.1 touchpoints.
Reference baseline:
- base commit: `3edc6580c0` (Stamp 16.1.)
- current fork head: `a06410cb3b`
Diff summary:
17 files changed, 669 insertions(+), 31 deletions(-)
- src/backend/optimizer/plan/planner.c
- src/backend/optimizer/path/allpaths.c
- src/backend/optimizer/path/joinpath.c
- src/backend/optimizer/util/pathnode.c
- src/backend/optimizer/jop/jop_extension.c
- src/include/optimizer/jop_extension.h
- src/backend/utils/misc/guc_tables.c
- src/backend/optimizer/Makefile
- src/backend/optimizer/jop/Makefile
- src/include/optimizer/cost.h
- src/backend/optimizer/path/costsize.c

- src/backend/lero/lero_extension.c
- src/include/lero/lero_extension.h
- src/backend/lero/Makefile
- src/backend/Makefile
- src/backend/optimizer/plan/planner.c
- src/backend/optimizer/path/costsize.c
- src/backend/utils/misc/guc_tables.c

- README -> deleted
- README.md -> added
File: src/backend/optimizer/plan/planner.c
Changes:
- adds `#include "lero/lero_extension.h"`
- adds `#include "optimizer/jop_extension.h"`
- changes `planner()` so `enable_lero` diverts planning through `lero_pgsysml_hook_planner(...)`
- in `standard_planner()`, calls `save_join_order_plans(root, final_rel->pathlist)` before selecting `best_path`
Intent:
- Lero hijacks the planning entry point
- Athena dumps the final root candidate set before pruning to one chosen path
PG18.1 mapping:
- src/backend/optimizer/plan/planner.c
- `planner()` at lines 290-299
- `standard_planner()` at lines 303-441
- `subquery_planner(...)` call at line 435
- `final_rel` / `best_path` selection at lines 438-441
Port status:
- carry Athena hook point forward
- do not keep file output
- do not carry Lero in the first PG18 port
File: src/backend/optimizer/path/allpaths.c
Changes:
- in `standard_join_search()`, sets `save_join_order_plan_finished = false` before join DP starts
- sets `save_join_order_plan_finished = true` after the final joinrel is produced
Intent:
- signal whether Athena candidate retention is still inside join exploration or has moved into the final post-search stage
PG18.1 mapping:
- `standard_join_search(...)` still exists in src/backend/optimizer/path/allpaths.c
- same conceptual insertion point is available
Port status:
- carry forward in spirit
- likely replace the flag with planner-private catalog state once the in-memory candidate catalog exists
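
One way to picture the move from a process-wide flag to planner-private state is the sketch below. It is only an illustration under stated assumptions: `ToyRootState`, `ToyJoinPhase`, and the three helper functions are hypothetical names that exist in neither the fork nor PostgreSQL, and the real state would hang off whatever Athena-private structure the candidate catalog ends up using.

```c
#include <stdbool.h>

/* Hypothetical per-planner-root phase marker; not a PostgreSQL type. */
typedef enum ToyJoinPhase
{
    TOY_BEFORE_SEARCH,          /* join search has not started */
    TOY_IN_SEARCH,              /* inside join exploration */
    TOY_AFTER_SEARCH            /* final joinrel has been produced */
} ToyJoinPhase;

/* Hypothetical Athena-private state, one instance per planner root. */
typedef struct ToyRootState
{
    ToyJoinPhase phase;
} ToyRootState;

/* Would be called where the fork sets the flag to false. */
static void
toy_enter_join_search(ToyRootState *state)
{
    state->phase = TOY_IN_SEARCH;
}

/* Would be called where the fork sets the flag to true. */
static void
toy_leave_join_search(ToyRootState *state)
{
    state->phase = TOY_AFTER_SEARCH;
}

/* Candidate retention applies only after this root's join search ended. */
static bool
toy_retention_active(const ToyRootState *state)
{
    return state->phase == TOY_AFTER_SEARCH;
}
```

Because each root carries its own `ToyRootState`, finishing the join search for one query level does not change what another level observes, which a single global boolean cannot guarantee.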
File: src/backend/optimizer/path/joinpath.c
Changes:
add_paths_to_joinrel(...)detects whether the currentjoinrelis the root relation- if it thinks the joinrel is root, it temporarily empties
joinrel->pathlistbefore more paths are added, then appends the newly generated paths back into the original list afterwards
Current implementation issue:
- root detection compares `root->all_baserels->words[0]` and `joinrelids->words[0]`
- this is not a safe relid-set equality check
Intent:
- weaken root-level pruning so more top-level alternatives survive
PG18.1 mapping:
- src/backend/optimizer/path/joinpath.c
- `add_paths_to_joinrel(...)` at lines 124-130
Port status:
- carry forward concept
- replace root detection with `bms_equal(joinrelids, root->all_baserels)`
- redesign around explicit root-candidate capture, not pathlist swapping
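
The difference between a first-word comparison and a full set comparison can be shown with a small stand-in. This is a sketch, not PostgreSQL code: `ToyRelids` and its helpers are hypothetical, the set is fixed at two 64-bit words, and the real `Bitmapset` is variable-length and may be NULL when empty, which is part of why `bms_equal()` is the right tool there.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-size relid set; NOT PostgreSQL's Bitmapset. */
#define TOY_WORDS 2

typedef struct ToyRelids
{
    uint64_t    words[TOY_WORDS];
} ToyRelids;

/* Add a relid in the range 0..127 to the set. */
static void
toy_add_member(ToyRelids *set, int relid)
{
    set->words[relid / 64] |= (uint64_t) 1 << (relid % 64);
}

/* Mirrors the shape of the fork's check: only word 0 is compared. */
static bool
toy_first_word_equal(const ToyRelids *a, const ToyRelids *b)
{
    return a->words[0] == b->words[0];
}

/* Full comparison over every word, analogous in spirit to bms_equal(). */
static bool
toy_relids_equal(const ToyRelids *a, const ToyRelids *b)
{
    for (size_t i = 0; i < TOY_WORDS; i++)
    {
        if (a->words[i] != b->words[i])
            return false;
    }
    return true;
}
```

With relids {1, 2, 70} on one side and {1, 2} on the other, `toy_first_word_equal()` reports a match because relid 70 lives in the second word, while `toy_relids_equal()` correctly reports a difference.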
File: src/backend/optimizer/util/pathnode.c
Changes:
- in `add_path(...)`, if `enable_join_order_plans && save_join_order_plan_finished` then force `costcmp = COSTS_DIFFERENT`
Intent:
- keep more candidate paths by disabling normal fuzzy cost dominance
Risk:
- this is a coarse global knob
- it is not targeted to root-level candidates only
PG18.1 mapping:
- src/backend/optimizer/util/pathnode.c
- `add_path(...)` at lines 464-497
Port status:
- do not port literally
- replace with explicit candidate catalog logic and blueprint retention
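
A minimal sketch of what explicit candidate capture could look like, as opposed to weakening cost comparison globally. Everything here is hypothetical: `ToyPath`, `ToyCatalog`, `toy_catalog_capture()`, and `toy_add_path()` are illustrative names, the pruning rule is reduced to "keep the cheapest", and the fixed capacity is arbitrary.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_CANDIDATES 16

/* Hypothetical stand-in for a planner path; not PostgreSQL's Path. */
typedef struct ToyPath
{
    const char *label;
    double      total_cost;
} ToyPath;

/* Hypothetical in-memory catalog of root-level candidates. */
typedef struct ToyCatalog
{
    const ToyPath *items[TOY_MAX_CANDIDATES];
    size_t      count;
} ToyCatalog;

/* Record one candidate; returns false if the catalog is full. */
static bool
toy_catalog_capture(ToyCatalog *catalog, const ToyPath *path)
{
    if (catalog->count >= TOY_MAX_CANDIDATES)
        return false;
    catalog->items[catalog->count++] = path;
    return true;
}

/*
 * Toy pruning step: *best keeps only the cheapest path seen so far.
 * When root_catalog is non-NULL (root-level rel only), the candidate is
 * captured first, so pruning itself stays unchanged for every other rel.
 */
static void
toy_add_path(const ToyPath **best, const ToyPath *new_path,
             ToyCatalog *root_catalog)
{
    if (root_catalog != NULL)
        (void) toy_catalog_capture(root_catalog, new_path);

    if (*best == NULL || new_path->total_cost < (*best)->total_cost)
        *best = new_path;
}
```

The design point is that retention is opt-in per call site rather than a global switch. In PostgreSQL itself, `add_path()` can free a path it rejects, so a real catalog would need to own or copy whatever it retains; the sketch sidesteps that by never freeing anything.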
Files:
- src/backend/optimizer/jop/jop_extension.c
- src/include/optimizer/jop_extension.h
Changes:
- adds `catch_join_order(...)` that recursively converts a `Path *` tree into a parenthesized join order string
- adds `save_join_order_plans(...)` that writes candidate strings to `/tmp/Athena_join_order_plans.txt`
Current implementation characteristics:
- supports only a subset of path node types
- several node types call `elog(WARNING)` or `elog(ERROR)`
- special-cases `ProjectionPath` under `T_Result`
- serializes alias names from `root->parse->rtable`
Intent:
- externalize root candidate join orders for Athena-side consumption
PG18.1 mapping:
- no direct port as-is
- the replacement belongs in a new Athena module, not under a file dump path
Port status:
- replace with in-memory candidate catalog
- later replace stringification with completed-plan serialization
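
The recursive stringification idea can be sketched on a toy tree. This is not the fork's `catch_join_order(...)`: `ToyNode`, `ToyBuf`, and both functions are hypothetical, only scans and two-input joins are modeled, and the exact output format (`(outer inner)` separated by a single space) is an assumption rather than the format the fork writes.

```c
#include <stddef.h>
#include <string.h>

typedef enum ToyNodeKind
{
    TOY_SCAN,                   /* leaf: a base relation with an alias */
    TOY_JOIN                    /* inner node: join of outer and inner */
} ToyNodeKind;

/* Hypothetical stand-in for a path tree node; not PostgreSQL's Path. */
typedef struct ToyNode
{
    ToyNodeKind kind;
    const char *alias;          /* used only for TOY_SCAN */
    const struct ToyNode *outer;    /* used only for TOY_JOIN */
    const struct ToyNode *inner;    /* used only for TOY_JOIN */
} ToyNode;

/* Fixed-size output buffer that is always NUL-terminated. */
typedef struct ToyBuf
{
    char        data[256];
    size_t      len;
} ToyBuf;

/* Append s to the buffer, silently truncating if space runs out. */
static void
toy_buf_append(ToyBuf *buf, const char *s)
{
    size_t      room = sizeof(buf->data) - 1 - buf->len;
    size_t      n = strlen(s);

    if (n > room)
        n = room;
    memcpy(buf->data + buf->len, s, n);
    buf->len += n;
    buf->data[buf->len] = '\0';
}

/* Emit the alias for a scan, or "(outer inner)" for a join. */
static void
toy_join_order(const ToyNode *node, ToyBuf *out)
{
    if (node->kind == TOY_SCAN)
    {
        toy_buf_append(out, node->alias);
        return;
    }
    toy_buf_append(out, "(");
    toy_join_order(node->outer, out);
    toy_buf_append(out, " ");
    toy_join_order(node->inner, out);
    toy_buf_append(out, ")");
}
```

For a tree that joins `a` with `b` and then joins the result with `c`, the buffer ends up holding `((a b) c)`. Writing into a caller-supplied buffer rather than a file keeps the sketch in line with the port status above, where the file dump is replaced by in-memory state.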
Files:
- src/backend/utils/misc/guc_tables.c
- src/backend/optimizer/path/costsize.c
- src/include/optimizer/cost.h
Added state:
- `enable_join_order_plans`
- `save_join_order_plan_finished`
- `enable_lero`
- `lero_swing_factor`
- `lero_subquery_table_num`
Intent:
- expose Athena / Lero toggles through PostgreSQL configuration
PG18.1 mapping:
- src/backend/utils/misc/guc_tables.c
- src/backend/optimizer/path/costsize.c
- src/include/optimizer/cost.h
Port status:
- `enable_join_order_plans` should be kept
- `save_join_order_plan_finished` should probably move to Athena-private state
- Lero GUCs are optional and out of scope for the first port
File: src/backend/optimizer/path/allpaths.c
Observation:
- Athena_PG does not yet modify `set_subquery_pathlist(...)`
- current behavior is still the PostgreSQL default: every `sub_final_rel->pathlist` member becomes a `SubqueryScanPath`
Why this matters:
- this is the correct insertion point for the new PG18 late-bound subquery frontier extraction
PG18.1 mapping:
- src/backend/optimizer/path/allpaths.c
- `set_subquery_pathlist(...)` at lines 2529-2783
Port status:
- new work starts here for V1
- this is not a direct port of existing Athena code
File: src/backend/optimizer/plan/createplan.c
Observation:
- `create_subqueryscan_plan(...)` recursively calls `create_plan(rel->subroot, best_path->subpath)`
Why this matters:
- direct path selection is feasible because a stored completed `Path *` naturally recurses into the subquery `subpath`
PG18.1 mapping:
- src/backend/optimizer/plan/createplan.c
- `create_subqueryscan_plan(...)` at lines 3695-3743
Port status:
- keep this mechanism
- no need for `pg_hint_plan`-style reconstruction
Files:
- src/backend/lero/lero_extension.c
- src/include/lero/lero_extension.h
- src/backend/optimizer/path/costsize.c
- src/backend/optimizer/plan/planner.c
Behavior:
- first run `standard_planner(copyObject(parse), ...)`
- record join cardinalities during `set_joinrel_size_estimates(...)`
- rescale selected cardinalities based on join input table count
- rerun `standard_planner(...)`
Port status:
- explicitly exclude from the first PG18 Athena port
- revisit only if a Lero baseline is required later
Verified against local PostgreSQL 18.1 source:
- `standard_planner(Query *parse, const char *query_string, int cursorOptions, ParamListInfo boundParams)` still has the 4-argument signature
- `subquery_planner(...)` is `subquery_planner(PlannerGlobal *glob, Query *parse, PlannerInfo *parent_root, bool hasRecursion, double tuple_fraction, SetOperationStmt *setops)`
- `set_subquery_pathlist(...)` still loops over `sub_final_rel->pathlist` and wraps each subpath in `create_subqueryscan_path(...)`
- `create_subqueryscan_plan(...)` still recursively plans `best_path->subpath`
Direct port:
- `enable_join_order_plans` GUC
- top-level post-`final_rel` candidate interception in `standard_planner()`
- `standard_join_search()` state entry / exit hook points
- recursive execution path through `create_subqueryscan_plan()`

Redesign:
- root candidate retention in `add_paths_to_joinrel()`
- pruning weakening in `add_path()`
- join order serialization
- candidate persistence format
- all subquery handling for late-bound assembly

Exclude:
- Lero hook planner path
- `/tmp/Athena_join_order_plans.txt`
- literal `Path *` mutation or post-hoc cost patching
Step 2 can be considered complete when:
- the Athena-specific patch areas are isolated
- each area is mapped to a PG18.1 file/function
- every area is classified as direct port, redesign, or exclude
- known factual mismatches in the main implementation plan are corrected