Commit 7eed071

fix: PR compress benchmarks (#2419)
PR compression benchmarks fail because the PR results do not have a "storage" key: they do not use any hard drive or object storage. Moreover, those benchmarks don't use our `QueryMeasurement` / `GenericMeasurement` structs, so they don't get a `null` storage key either. This change ensures that if _either_ the PR results or the base results are missing a storage key, we treat it as NA. If we compare results from before the "storage" key was introduced with results from after it, this will not error, but the right join will only include the PR's results. The table will look very similar to the one I'm removing here.
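The failure mode described above can be reproduced in a minimal sketch. The benchmark names and values below are made up for illustration and are not taken from the real results files; the only assumption about pandas is that merging on a key column that one frame lacks raises a `KeyError`.

```python
import pandas as pd

# Hypothetical base results: produced by code that always emits a "storage"
# field, here null because no storage backend is involved.
base = pd.DataFrame(
    [{"name": "compress", "storage": None, "value": 10.0, "unit": "ns"}]
)

# Hypothetical PR results: none of the JSON objects has a "storage" key,
# so the resulting DataFrame has no such column at all.
pr = pd.DataFrame([{"name": "compress", "value": 12.0, "unit": "ns"}])

# `in` on a DataFrame tests column labels.
pr_has_storage = "storage" in pr

try:
    pd.merge(base, pr, on=["name", "storage"], how="right", suffixes=("_base", "_pr"))
    merge_failed = False
except KeyError:
    # The join key is missing from the right-hand frame.
    merge_failed = True

print(pr_has_storage, merge_failed)
```

Here the merge fails only because the column is absent, not because its values are null, which is why adding an all-NA "storage" column is enough to let the join proceed.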
1 parent f2efc7d commit 7eed071

1 file changed: +10 -15 lines changed

scripts/compare-benchmark-jsons.py

Lines changed: 10 additions & 15 deletions
@@ -21,22 +21,17 @@
 pr_commit_id = next(iter(pr_commit_id))

 if "storage" not in base:
-    # This means the base commit was generated in the pre-object-store days. We cannot give a true
-    # diff because we're comparing different storage systems.
-    pr
-    print(
-        pd.DataFrame(
-            {
-                "name": pr["name"],
-                f"PR {pr_commit_id[:8]}": pr["value"],
-                f"base {base_commit_id[:8]} (no S3 results found)": pd.NA,
-                "ratio (PR/base)": pd.NA,
-                "unit": pr["unit"],
-            }
-        ).to_markdown(index=False)
-    )
-    sys.exit(0)
+    # For whatever reason, the base lacks storage. Might be an old database of results. Might be a
+    # database of results without any storage fields.
+    base["storage"] = pd.NA

+if "storage" not in pr:
+    # Not all benchmarks have a "storage" key. If none of the JSON objects in the PR results file
+    # had a "storage" key, then the PR DataFrame will lack that key and the join will fail.
+    pr["storage"] = pd.NA
+
+# NB: `pd.merge` considers two null key values to be equal, so benchmarks without storage keys will
+# match.
 df3 = pd.merge(base, pr, on=["name", "storage"], how="right", suffixes=("_base", "_pr"))

 assert df3["unit_base"].equals(df3["unit_pr"]), (df3["unit_base"], df3["unit_pr"])
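
The null-key behaviour that the NB comment relies on can be checked with a small standalone sketch. The frames below are hypothetical stand-ins for the base and PR results (names and values are invented), and the extra "new-bench" row is added only to show what the right join does with a benchmark that exists in the PR alone.

```python
import pandas as pd

# Hypothetical results; neither frame has a "storage" column to begin with.
base = pd.DataFrame(
    {"name": ["compress", "decompress"], "value": [10.0, 20.0], "unit": ["ns", "ns"]}
)
pr = pd.DataFrame(
    {
        "name": ["compress", "decompress", "new-bench"],
        "value": [12.0, 18.0, 5.0],
        "unit": ["ns", "ns", "ns"],
    }
)

# Same guard as in the diff: add an all-NA "storage" column where it is missing.
for frame in (base, pr):
    if "storage" not in frame:
        frame["storage"] = pd.NA

df3 = pd.merge(base, pr, on=["name", "storage"], how="right", suffixes=("_base", "_pr"))

# Look up the base value for each PR benchmark by name.
base_values = df3.set_index("name")["value_base"]
print(df3[["name", "value_base", "value_pr"]])
```

Rows whose "storage" is null on both sides still pair up by name, while the PR-only benchmark is kept with a missing base value, which matches the "right join will only include the PR's results" description in the commit message.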