Commit d0496a9

GH-5691 CLI for running and storing query explanations (#5692)
2 parents a710fe7 + 1d73643 commit d0496a9

278 files changed

+19707
-169
lines changed

Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
+---
+name: query-plan-snapshot-cli
+description: Use QueryPlanSnapshotCli to capture and compare RDF4J query plans, then assess likely performance improvements/regressions from execution verification and semantic plan diffs. Trigger when users ask about optimizer impact, query-plan drift, join algorithm changes, or query performance regressions in testsuites/benchmark.
+---
+
+# query-plan-snapshot-cli
+
+Use this skill to run reproducible query-plan captures and classify likely regression/improvement signals.
+
+## Fast workflow
+
+1. Capture baseline run (main/reference commit).
+2. Capture candidate run (changed commit) with same query selector + `--query-id`.
+3. Produce semantic diff (`--compare-existing`).
+4. Interpret runtime + diff together.
+
+## Commands
+
+Use wrapper (enforces pre-install and optional logging):
+
+- Baseline:
+  - `./.codex/skills/query-plan-snapshot-cli/scripts/run_query_plan_snapshot.sh --log /tmp/qps-baseline.log -- --store memory --theme MEDICAL_RECORDS --query-index 0 --query-id med-q0`
+- Candidate:
+  - `./.codex/skills/query-plan-snapshot-cli/scripts/run_query_plan_snapshot.sh --log /tmp/qps-candidate.log -- --store memory --theme MEDICAL_RECORDS --query-index 0 --query-id med-q0 --compare-latest --diff-mode structure+estimates`
+- Compare existing snapshots explicitly:
+  - `mvn -o -Dmaven.repo.local=.m2_repo -pl testsuites/benchmark -DskipTests exec:java@query-plan-snapshot -Dexec.args="--compare-existing --query-id med-q0 --compare-indices 1,0 --no-interactive --diff-mode structure+estimates" | tee /tmp/qps-compare.log`
+- Summarize improvement/regression signal:
+  - `python3 ./.codex/skills/query-plan-snapshot-cli/scripts/interpret_query_plan_regression.py --baseline-log /tmp/qps-baseline.log --candidate-log /tmp/qps-candidate.log --comparison-log /tmp/qps-compare.log`
+
+## Interpretation rule-of-thumb
+
+- `averageMillis` down with stable `resultCount`: improvement signal.
+- `averageMillis` up with stable `resultCount`: regression signal.
+- `actualResultSizes=diff`: semantic/data-shape risk; perf conclusion low confidence.
+- `joinAlgorithms=diff` or `structure=diff`: optimizer behavior changed; correlate with runtime delta.
+- `estimates=diff` only: model/statistics shift; validate with repeated runs.
+
+For more detailed reading patterns and triage prompts, use `references/workflow.md`.
Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
+# QueryPlanSnapshotCli workflow
+
+## Goal
+
+Read optimizer/query-plan changes as performance signals without mixing in unrelated variables.
+
+## Guardrails
+
+- Same store, theme, and query selector between baseline/candidate.
+- Same `--query-id` to simplify lookup.
+- Keep JVM/system-property flags identical unless intentionally testing a flag.
+- Always refresh build artifacts first:
+  - `mvn -T 1C -o -Dmaven.repo.local=.m2_repo -Pquick clean install | tail -200`
+
+## Minimal run pair
+
+1. Baseline:
+
+   `./.codex/skills/query-plan-snapshot-cli/scripts/run_query_plan_snapshot.sh --log /tmp/qps-baseline.log -- --store memory --theme MEDICAL_RECORDS --query-index 0 --query-id med-q0`
+
+2. Candidate:
+
+   `./.codex/skills/query-plan-snapshot-cli/scripts/run_query_plan_snapshot.sh --log /tmp/qps-candidate.log -- --store memory --theme MEDICAL_RECORDS --query-index 0 --query-id med-q0 --compare-latest --diff-mode structure+estimates`
+
+3. Explicit compare-existing (stable reproducible diff text):
+
+   `mvn -o -Dmaven.repo.local=.m2_repo -pl testsuites/benchmark -DskipTests exec:java@query-plan-snapshot -Dexec.args="--compare-existing --query-id med-q0 --compare-indices 1,0 --no-interactive --diff-mode structure+estimates" | tee /tmp/qps-compare.log`
+
+4. Regression/improvement summary:
+
+   `python3 ./.codex/skills/query-plan-snapshot-cli/scripts/interpret_query_plan_regression.py --baseline-log /tmp/qps-baseline.log --candidate-log /tmp/qps-candidate.log --comparison-log /tmp/qps-compare.log`
+
+## Reading semantic diff fields
+
+- `structure=diff`: operator tree changed.
+- `joinAlgorithms=diff`: join strategy changed; usually high-impact for runtime.
+- `actualResultSizes=diff`: result-size flow changed; can indicate data-shape or semantic shifts.
+- `estimates=diff`: cost model changed. In isolation, not enough to claim runtime regression.
+
+## Confidence ladder
+
+- High confidence regression:
+  - `averageMillis` up >= 10% and `structure`/`joinAlgorithms` diff.
+- Medium confidence regression:
+  - `averageMillis` up >= 10% and no semantic diff file available.
+- Low confidence / inconclusive:
+  - Runtime neutral but semantic diff exists, or result counts changed.
+
+## Common mistakes
+
+- Comparing different query IDs or different query text.
+- Forgetting pre-install (`-Pquick clean install`) before CLI run.
+- Treating estimate-only diffs as hard regressions.
+- Ignoring `resultCount` mismatch in execution verification.
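
A minimal Python sketch of the confidence ladder from the workflow notes, for illustration only. The function name `regression_confidence` and its parameters are hypothetical and are not part of the committed scripts. Treating "slower, diff file available, but no structure/join change" as low confidence is an assumption, since the ladder does not cover that case.

```python
from typing import Optional


def regression_confidence(
    delta_percent: float,
    structure_or_joins_diff: Optional[bool],
    result_count_changed: bool,
) -> str:
    """Map a runtime delta and semantic-diff flags to a confidence label.

    structure_or_joins_diff is None when no semantic diff file is available.
    """
    # Changed result counts make the runtime comparison inconclusive.
    if result_count_changed:
        return "low"
    slower = delta_percent >= 10.0
    if slower and structure_or_joins_diff:
        return "high"
    if slower and structure_or_joins_diff is None:
        return "medium"
    # Assumption: every remaining combination is treated as inconclusive.
    return "low"


print(regression_confidence(15.0, True, False))
print(regression_confidence(15.0, None, False))
print(regression_confidence(2.0, True, False))
```

Passing `None` rather than `False` for a missing diff file keeps "no evidence" distinct from "evidence of no change", which is the distinction the medium rung relies on.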
Lines changed: 150 additions & 0 deletions
@@ -0,0 +1,150 @@
+#!/usr/bin/env python3
+"""Summarize likely query-plan performance regression/improvement signals."""
+
+from __future__ import annotations
+
+import argparse
+import re
+from pathlib import Path
+from typing import Dict, List, Optional
+
+EXECUTION_LINE = re.compile(
+    r"runs=(?P<runs>\d+),\s*"
+    r"totalMillis=(?P<total>\d+),\s*"
+    r"averageMillis=(?P<avg>\d+),\s*"
+    r"resultCount=(?P<results>\d+),\s*"
+    r"softLimitMillis=(?P<soft_limit>\d+),\s*"
+    r"softLimitReached=(?P<soft_reached>true|false),\s*"
+    r"maxRunsReached=(?P<max_reached>true|false)"
+)
+
+DIFF_LINE = re.compile(
+    r"^\s*(?P<level>unoptimized|optimized|executed):\s+"
+    r".*structure=(?P<structure>[^,]+),\s*"
+    r"joinAlgorithms=(?P<joins>[^,]+),\s*"
+    r"actualResultSizes=(?P<actual>[^,]+),\s*"
+    r"estimates=(?P<estimates>[^,\s]+)"
+)
+
+
+def parse_execution_metrics(path: Path) -> Dict[str, int]:
+    text = path.read_text(encoding="utf-8", errors="replace")
+    matches = list(EXECUTION_LINE.finditer(text))
+    if not matches:
+        raise ValueError(f"No execution verification line found in {path}")
+    last = matches[-1]
+    return {
+        "runs": int(last.group("runs")),
+        "total": int(last.group("total")),
+        "avg": int(last.group("avg")),
+        "results": int(last.group("results")),
+    }
+
+
+def parse_semantic_diff(path: Optional[Path]) -> List[Dict[str, str]]:
+    if path is None:
+        return []
+    text = path.read_text(encoding="utf-8", errors="replace")
+    rows: List[Dict[str, str]] = []
+    for line in text.splitlines():
+        match = DIFF_LINE.search(line)
+        if not match:
+            continue
+        rows.append(
+            {
+                "level": match.group("level"),
+                "structure": match.group("structure").strip(),
+                "joins": match.group("joins").strip(),
+                "actual": match.group("actual").strip(),
+                "estimates": match.group("estimates").strip(),
+            }
+        )
+    return rows
+
+
+def runtime_classification(delta_percent: Optional[float]) -> str:
+    if delta_percent is None:
+        return "unknown"
+    if delta_percent <= -10.0:
+        return "improvement"
+    if delta_percent >= 10.0:
+        return "regression"
+    return "neutral"
+
+
+def find_diff(rows: List[Dict[str, str]], key: str) -> bool:
+    return any(row[key] == "diff" for row in rows)
+
+
+def main() -> int:
+    parser = argparse.ArgumentParser(description=__doc__)
+    parser.add_argument("--baseline-log", required=True, type=Path)
+    parser.add_argument("--candidate-log", required=True, type=Path)
+    parser.add_argument("--comparison-log", type=Path)
+    args = parser.parse_args()
+
+    baseline = parse_execution_metrics(args.baseline_log)
+    candidate = parse_execution_metrics(args.candidate_log)
+    semantic_rows = parse_semantic_diff(args.comparison_log)
+
+    avg_base = baseline["avg"]
+    avg_candidate = candidate["avg"]
+    delta_percent: Optional[float]
+    if avg_base == 0:
+        delta_percent = None
+    else:
+        delta_percent = ((avg_candidate - avg_base) / avg_base) * 100.0
+
+    runtime_signal = runtime_classification(delta_percent)
+    result_count_changed = baseline["results"] != candidate["results"]
+
+    structure_changed = find_diff(semantic_rows, "structure")
+    joins_changed = find_diff(semantic_rows, "joins")
+    actual_changed = find_diff(semantic_rows, "actual")
+    estimates_changed = find_diff(semantic_rows, "estimates")
+
+    if result_count_changed:
+        verdict = "semantic regression risk: result count changed; runtime delta not comparable"
+    elif runtime_signal == "regression" and (structure_changed or joins_changed or actual_changed):
+        verdict = "likely performance regression with plan-shape change"
+    elif runtime_signal == "improvement" and (structure_changed or joins_changed):
+        verdict = "likely performance improvement with optimizer-plan change"
+    elif runtime_signal == "regression":
+        verdict = "possible performance regression (no semantic diff evidence provided)"
+    elif runtime_signal == "improvement":
+        verdict = "possible performance improvement"
+    elif structure_changed or joins_changed or actual_changed or estimates_changed:
+        verdict = "plan changed but runtime signal neutral"
+    else:
+        verdict = "no clear regression/improvement signal"
+
+    print("QueryPlanSnapshotCli regression summary")
+    print(f"- baseline avgMillis: {avg_base}")
+    print(f"- candidate avgMillis: {avg_candidate}")
+    if delta_percent is None:
+        print("- delta: n/a (baseline averageMillis=0)")
+    else:
+        print(f"- delta: {delta_percent:+.2f}%")
+    print(f"- baseline resultCount: {baseline['results']}")
+    print(f"- candidate resultCount: {candidate['results']}")
+    print(f"- runtime signal: {runtime_signal}")
+
+    if semantic_rows:
+        print("- semantic diff:")
+        for row in semantic_rows:
+            print(
+                " "
+                f"{row['level']}: structure={row['structure']}, "
+                f"joinAlgorithms={row['joins']}, "
+                f"actualResultSizes={row['actual']}, "
+                f"estimates={row['estimates']}"
+            )
+    else:
+        print("- semantic diff: not provided")
+
+    print(f"- verdict: {verdict}")
+    return 0
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
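
To illustrate what `EXECUTION_LINE` expects, the sketch below applies the same pattern to two made-up log lines and computes the delta the way `main()` does. The sample lines are assumptions reconstructed from the regex's field order, not captured QueryPlanSnapshotCli output, and `average_millis` is a hypothetical helper.

```python
import re

# Same pattern as EXECUTION_LINE in interpret_query_plan_regression.py.
EXECUTION_LINE = re.compile(
    r"runs=(?P<runs>\d+),\s*"
    r"totalMillis=(?P<total>\d+),\s*"
    r"averageMillis=(?P<avg>\d+),\s*"
    r"resultCount=(?P<results>\d+),\s*"
    r"softLimitMillis=(?P<soft_limit>\d+),\s*"
    r"softLimitReached=(?P<soft_reached>true|false),\s*"
    r"maxRunsReached=(?P<max_reached>true|false)"
)

# Hypothetical log lines (assumed shape, derived only from the regex).
baseline_line = (
    "runs=10, totalMillis=1200, averageMillis=120, resultCount=42, "
    "softLimitMillis=5000, softLimitReached=false, maxRunsReached=true"
)
candidate_line = (
    "runs=10, totalMillis=1500, averageMillis=150, resultCount=42, "
    "softLimitMillis=5000, softLimitReached=false, maxRunsReached=true"
)


def average_millis(line: str) -> int:
    """Extract averageMillis from one execution verification line."""
    match = EXECUTION_LINE.search(line)
    if match is None:
        raise ValueError("no execution verification fields found")
    return int(match.group("avg"))


base = average_millis(baseline_line)
cand = average_millis(candidate_line)
delta_percent = (cand - base) / base * 100.0
print(f"{delta_percent:+.2f}%")  # prints +25.00%
```

With these sample values the delta is above the script's 10% threshold, so `runtime_classification` would report `regression`. Because `averageMillis` is matched with `\d+`, the script only sees whole milliseconds.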
Lines changed: 87 additions & 0 deletions
@@ -0,0 +1,87 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+usage() {
+  cat <<'USAGE'
+Usage:
+  run_query_plan_snapshot.sh [--log <path>] [--online] -- <QueryPlanSnapshotCli args>
+
+Examples:
+  run_query_plan_snapshot.sh --log /tmp/qps.log -- \
+    --store memory --theme MEDICAL_RECORDS --query-index 0 --query-id med-q0
+
+Notes:
+  - Always runs root install first: mvn -T 1C [-o] -Dmaven.repo.local=.m2_repo -Pquick clean install
+  - Pass QueryPlanSnapshotCli args after '--'
+USAGE
+}
+
+log_file=""
+offline_flag="-o"
+
+while [[ $# -gt 0 ]]; do
+  case "$1" in
+    --log)
+      [[ $# -ge 2 ]] || {
+        echo "Missing value for --log" >&2
+        exit 2
+      }
+      log_file="$2"
+      shift 2
+      ;;
+    --online)
+      offline_flag=""
+      shift
+      ;;
+    --help|-h)
+      usage
+      exit 0
+      ;;
+    --)
+      shift
+      break
+      ;;
+    *)
+      echo "Unknown wrapper option: $1" >&2
+      usage
+      exit 2
+      ;;
+  esac
+done
+
+if [[ $# -eq 0 ]]; then
+  echo "No QueryPlanSnapshotCli args provided. Pass args after '--'." >&2
+  usage
+  exit 2
+fi
+
+raw_cli_args=("$@")
+printf -v cli_args '%q ' "${raw_cli_args[@]}"
+cli_args="${cli_args% }"
+
+install_cmd=(mvn -T 1C)
+if [[ -n "$offline_flag" ]]; then
+  install_cmd+=("$offline_flag")
+fi
+install_cmd+=(-Dmaven.repo.local=.m2_repo -Pquick install)
+
+cli_cmd=(mvn)
+if [[ -n "$offline_flag" ]]; then
+  cli_cmd+=("$offline_flag")
+fi
+cli_cmd+=(-Dmaven.repo.local=.m2_repo -pl testsuites/benchmark -DskipTests exec:java@query-plan-snapshot)
+cli_cmd+=(-Dexec.args="$cli_args")
+
+echo ">>> Refreshing artifacts"
+"${install_cmd[@]}" | tail -200
+
+echo ">>> Running QueryPlanSnapshotCli"
+echo ">>> args: $cli_args"
+
+if [[ -n "$log_file" ]]; then
+  mkdir -p "$(dirname "$log_file")"
+  "${cli_cmd[@]}" | tee "$log_file"
+  echo ">>> log: $log_file"
+else
+  "${cli_cmd[@]}"
+fi
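
The wrapper joins the CLI arguments into one shell-quoted string with `printf '%q '` before passing it as `-Dexec.args`. A rough Python analogue, using only the standard-library `shlex` module, shows why the quoting step matters when a value contains whitespace. This is a sketch under assumptions: `shlex.join` uses single-quote style rather than bash's `%q` escapes, the query id `med q0` is a made-up value, and how the exec plugin tokenizes the resulting string is not shown in this commit.

```python
import shlex

# Hypothetical argument list; the last value deliberately contains a space.
cli_args = [
    "--store", "memory",
    "--theme", "MEDICAL_RECORDS",
    "--query-index", "0",
    "--query-id", "med q0",
]

# Quote each argument and join with single spaces (Python 3.8+).
joined = shlex.join(cli_args)
print(joined)

# A shell-style tokenizer recovers the original list, space included.
round_tripped = shlex.split(joined)
print(round_tripped == cli_args)

# Naive joining loses the argument boundary inside "med q0".
naive = " ".join(cli_args)
print(len(naive.split()), "tokens instead of", len(cli_args))
```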

compliance/elasticsearch/pom.xml

Lines changed: 7 additions & 1 deletion
@@ -109,10 +109,16 @@
 		</dependency>
 		<dependency>
 			<groupId>org.testcontainers</groupId>
-			<artifactId>junit-jupiter</artifactId>
+			<artifactId>testcontainers-junit-jupiter</artifactId>
 			<version>${testcontainers.version}</version>
 			<scope>test</scope>
 		</dependency>
+		<dependency>
+			<groupId>com.fasterxml.jackson.core</groupId>
+			<artifactId>jackson-annotations</artifactId>
+			<version>${jackson.annotations.version}</version>
+			<scope>test</scope>
+		</dependency>
 		<dependency>
 			<groupId>org.apache.logging.log4j</groupId>
 			<artifactId>log4j-core</artifactId>

core/common/iterator/src/main/java/org/eclipse/rdf4j/common/iteration/LookAheadIteration.java

Lines changed: 6 additions & 0 deletions
@@ -54,6 +54,12 @@ public final boolean hasNext() {
 			return false;
 		}
 
+		if (Thread.currentThread().isInterrupted()) {
+			log.debug("Thread {} is interrupted, closing iteration", Thread.currentThread().getName());
+			close();
+			return false;
+		}
+
 		try {
 			return lookAhead() != null;
 		} catch (NoSuchElementException logged) {
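
The added check makes `hasNext()` close the iteration and report exhaustion once the current thread is interrupted. `Thread.isInterrupted()` reads the flag without clearing it, so callers further up can still observe the interrupt. The Python sketch below is only an analogue of that pattern: Python has no thread interrupt flag, so a `threading.Event` stands in for it, and the class `LookAheadIterator` is a hypothetical toy, not a port of `LookAheadIteration`.

```python
import threading
from typing import Iterator, Optional


class LookAheadIterator:
    """Toy look-ahead iterator that stops once a cancellation flag is set."""

    def __init__(self, source: Iterator[int], cancelled: threading.Event) -> None:
        self._source = source
        self._cancelled = cancelled
        self._closed = False
        self._next: Optional[int] = None

    def close(self) -> None:
        self._closed = True

    def has_next(self) -> bool:
        if self._closed:
            return False
        # Analogue of the added interrupt check: read the flag, do not clear it.
        if self._cancelled.is_set():
            self.close()
            return False
        if self._next is None:
            self._next = next(self._source, None)
            if self._next is None:
                self.close()
        return self._next is not None

    def next(self) -> int:
        if not self.has_next():
            raise StopIteration
        value, self._next = self._next, None
        return value


cancel = threading.Event()
it = LookAheadIterator(iter([1, 2, 3]), cancel)
first = it.next()
cancel.set()                   # stands in for Thread.interrupt()
after_cancel = it.has_next()   # remaining elements are no longer served
print(first, after_cancel, cancel.is_set())
```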

core/queryrender/src/main/java/org/eclipse/rdf4j/queryrender/sparql/TupleExprToIrConverter.java

Lines changed: 18 additions & 1 deletion
@@ -923,7 +923,8 @@ private static Normalized normalize(final TupleExpr root, final boolean peelScop
 			// Keep BIND chains inside WHERE: stop peeling when we hit the first nested Extension, otherwise peel and
 			// remember bindings for reinsertion later.
 			if (cur instanceof Extension) {
-				if (((Extension) cur).getArg() instanceof Extension) {
+				if (((Extension) cur).getArg() instanceof Extension
+						&& !extensionChainLeadsToHavingFilter((Extension) cur)) {
 					break;
 				}
 				final Extension ext = (Extension) cur;
@@ -1460,6 +1461,22 @@ private static boolean isAnonHavingName(String name) {
 		return name != null && name.startsWith("_anon_having_");
 	}
 
+	private static boolean extensionChainLeadsToHavingFilter(Extension ext) {
+		TupleExpr cur = ext;
+		while (cur instanceof Extension) {
+			cur = ((Extension) cur).getArg();
+		}
+		if (!(cur instanceof Filter)) {
+			return false;
+		}
+		for (String name : freeVars(((Filter) cur).getCondition())) {
+			if (isAnonHavingName(name)) {
+				return true;
+			}
+		}
+		return false;
+	}
+
 	// Render expressions for HAVING with substitution of _anon_having_* variables
 	private String renderExprForHaving(final ValueExpr e, final Normalized n) {
 		return renderExprWithSubstitution(e, n == null ? null : n.selectAssignments);
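
The new helper walks down a chain of nested `Extension` nodes and returns true only if the chain ends in a `Filter` whose condition references an `_anon_having_*` variable. A toy Python model of that walk follows. The `Leaf`, `Filter`, and `Extension` dataclasses are hypothetical stand-ins for the RDF4J algebra classes, and `condition_vars` replaces the `freeVars(...)` call, whose implementation is not shown in this diff.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Leaf:
    """Stand-in for any plan node that is neither Extension nor Filter."""


@dataclass
class Filter:
    # Assumed: names of the free variables in the filter condition.
    condition_vars: List[str]
    arg: "Node" = field(default_factory=Leaf)


@dataclass
class Extension:
    arg: "Node"


Node = Union[Leaf, Filter, Extension]

ANON_HAVING_PREFIX = "_anon_having_"


def extension_chain_leads_to_having_filter(ext: Extension) -> bool:
    """Follow nested Extensions; check the node below them for HAVING variables."""
    cur: Node = ext
    while isinstance(cur, Extension):
        cur = cur.arg
    if not isinstance(cur, Filter):
        return False
    return any(name.startswith(ANON_HAVING_PREFIX) for name in cur.condition_vars)


having_chain = Extension(Extension(Filter(["_anon_having_1"])))
plain_chain = Extension(Extension(Filter(["x"])))
no_filter = Extension(Extension(Leaf()))

print(extension_chain_leads_to_having_filter(having_chain))
print(extension_chain_leads_to_having_filter(plain_chain))
print(extension_chain_leads_to_having_filter(no_filter))
```

In the changed `normalize` condition, a true result means peeling continues through the nested `Extension` instead of stopping at it.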
