| 1 | +# CI Investigation: `Run sqllogictests with the sqlite test suite` |
| 2 | + |
| 3 | +## Question |
| 4 | +Why did the CI task |
| 5 | +`Datafusion extended tests / Run sqllogictests with the sqlite test suite (pull_request)` |
| 6 | +increase from about 1 hour to about 2 hours after merge commit `76be0b64c`? |
| 7 | + |
| 8 | +## Scope |
| 9 | +Compared: |
| 10 | +- pre-merge parent: `8f959bba6` |
| 11 | +- merge result: `76be0b64c` |
| 12 | + |
| 13 | +## Findings |
| 14 | + |
| 15 | +### 1) CI workflow/job definition did not change in the merge |
| 16 | +I diffed `.github/workflows/extended.yml` between `8f959bba6` and `76be0b64c` and found no changes. |
| 17 | + |
| 18 | +Implication: the slowdown is not explained by a direct change to the job steps, image, or command in this merge. |
| 19 | + |
| 20 | +### 2) The sqllogictest workload increased materially |
| 21 | +`datafusion/sqllogictest/test_files` changes in this merge range: |
| 22 | +- `34 files changed` |
| 23 | +- `+2359 / -296` lines (net `+2063`) |
| 24 | +- new files include: |
| 25 | + - `join_limit_pushdown.slt` |
| 26 | + - `spark/bitmap/bitmap_bit_position.slt` |
| 27 | + - `spark/bitmap/bitmap_bucket_number.slt` |
| 28 | + - `spark/json/json_tuple.slt` |
| 29 | + |
| 30 | +Files with the largest growth: |
| 31 | +- `sort_pushdown.slt`: `+748` |
| 32 | +- `projection_pushdown.slt`: `+348/-191` |
| 33 | +- `dynamic_filter_pushdown_config.slt`: `+301` |
| 34 | +- `join_limit_pushdown.slt`: `+269` |
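
The totals above can be reproduced from `git diff --numstat 8f959bba6 76be0b64c -- datafusion/sqllogictest/test_files`, which prints one `added<TAB>deleted<TAB>path` line per file. The following is a minimal sketch of how that output could be summed. The function name `sum_numstat` and the sample input are illustrative assumptions: the deletion count of `0` for `sort_pushdown.slt` and the binary-file row are made up for the example.

```python
def sum_numstat(numstat_text: str) -> tuple[int, int, int]:
    """Return (files, added, deleted) for `git diff --numstat` output."""
    files = added = deleted = 0
    for line in numstat_text.splitlines():
        parts = line.split("\t", 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        a, d, _path = parts
        files += 1
        if a == "-" or d == "-":
            continue  # git prints "-" for binary files; no line counts
        added += int(a)
        deleted += int(d)
    return files, added, deleted


# Illustrative input only (not the real 34-file diff).
sample = (
    "748\t0\tdatafusion/sqllogictest/test_files/sort_pushdown.slt\n"
    "348\t191\tdatafusion/sqllogictest/test_files/projection_pushdown.slt\n"
    "-\t-\texample/hypothetical_binary.parquet\n"
)
print(sum_numstat(sample))
```

Running the same function over the full `--numstat` output should give the file count and the added/deleted totals reported in this section, if the numbers were collected the same way.
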
| 35 | + |
| 36 | +Aggregate test corpus size in `datafusion/sqllogictest/test_files`: |
| 37 | +- files: `459 -> 463` |
| 38 | +- lines: `134122 -> 136189` |
| 39 | +- runnable records (query/statement/skipif/onlyif markers): `14845 -> 15110` (`+265`) |
| 40 | + |
| 41 | +Implication: the same CI command now executes more sqllogictest content than before. |
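
For reference, a record count like the one above can be approximated by counting lines that begin with one of the four markers. This is a sketch under that assumption, not necessarily the exact method used for the numbers in this report; `count_records` and the sample text are hypothetical.

```python
MARKERS = ("query", "statement", "skipif", "onlyif")


def count_records(slt_text: str) -> int:
    """Count lines whose first space-delimited word is a sqllogictest marker."""
    count = 0
    for line in slt_text.splitlines():
        # Indented lines yield an empty first word and are not counted.
        first_word = line.split(" ", 1)[0]
        if first_word in MARKERS:
            count += 1
    return count


# Illustrative .slt fragment (hypothetical content).
sample_slt = """\
statement ok
CREATE TABLE t(a INT);

query I
SELECT a FROM t;
----

skipif postgres
query I
SELECT 1;
----
1
"""
print(count_records(sample_slt))
```

Applying this to every `.slt` file at both commits and summing the results gives a before/after comparison in the same spirit as the `14845 -> 15110` figure.
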
| 42 | + |
| 43 | +### 2.1) PRs in this merge range that expanded sqllogictest corpus |
| 44 | +The following PRs (from `8f959bba6..76be0b64c`) had positive net line growth under |
| 45 | +`datafusion/sqllogictest/test_files`: |
| 46 | + |
| 47 | +- #20329 `fix: validate inter-file ordering in eq_properties()` (`+538`) |
| 48 | +- #20192 `Support parent dynamic filters for more join types` (`+282`) |
| 49 | +- #20228 `feat: Push limit into hash join` (`+265`) |
| 50 | +- #20247 `Fix incorrect SortExec removal before AggregateExec` (`+210`) |
| 52 | +- #20117 `feat: add ExtractLeafExpressions optimizer rule for get_field pushdown` (`+166`) |
| 53 | +- #20412 `feat: support Spark-compatible json_tuple function` (`+154`) |
| 54 | +- #20288 `feat: Implement Spark bitmap_bucket_number function` (`+122`) |
| 55 | +- #20275 `feat: Implement Spark bitmap_bit_position function` (`+112`) |
| 56 | +- #20420 `test: Extend Spark Array functions: array_repeat, shuffle and slice test coverage` (`+55`) |
| 57 | +- #20189 `Adds support for ANSI mode in negative function` (`+52`) |
| 58 | +- #20224 `fix: Fix scalar broadcast for to_timestamp()` (`+26`) |
| 59 | +- #20279 `fix: disable dynamic filter pushdown for non min/max aggregates` (`+19`) |
| 60 | +- #20361 `fix: Handle Utf8View and LargeUtf8 separators in concat_ws` (`+19`) |
| 61 | +- #20191 `Support pushing down empty projections into joins` (`+19`) |
| 62 | +- #20328 `perf: Optimize trim UDFs for single-character trims` (`+9`) |
| 63 | +- #20241 `fix: Add integer check for bitwise coercion` (`+8`) |
| 64 | +- #20305 `perf: Optimize translate() UDF for scalar inputs` (`+5`) |
| 65 | +- #20341 `Reduce ExtractLeafExpressions optimizer overhead with fast pre-scan` (`+2`) |
| 66 | + |
| 67 | +Notes: |
| 68 | +- Net growth values above are line-based deltas in `datafusion/sqllogictest/test_files`. |
| 69 | +- Some PRs touched sqllogictests with net `0` (balanced add/remove) and are excluded here. |
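
As a consistency check, the per-PR net deltas listed above can be summed and compared with the overall net change reported in Finding 2 (`+2359 / -296`). The values below are copied from this report; the variable names are illustrative.

```python
# Net line growth per PR under datafusion/sqllogictest/test_files,
# as listed in this report.
net_growth_by_pr = {
    20329: 538, 20192: 282, 20228: 265, 20247: 210, 20117: 166,
    20412: 154, 20288: 122, 20275: 112, 20420: 55, 20189: 52,
    20224: 26, 20279: 19, 20361: 19, 20191: 19, 20328: 9,
    20241: 8, 20305: 5, 20341: 2,
}

total_from_prs = sum(net_growth_by_pr.values())
total_from_diff = 2359 - 296  # added minus deleted, from Finding 2

print(total_from_prs, total_from_diff)
```

The two totals agree, so the PRs listed here account for the full net line growth in the merge range.
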
| 70 | + |
| 71 | +### 3) Sqllogictest crate/dependency changes also landed from `main` |
| 72 | +In `datafusion/sqllogictest/Cargo.toml`: |
| 73 | +- `sqllogictest 0.29.0 -> 0.29.1` |
| 74 | +- `clap 4.5.57 -> 4.5.60` |
| 75 | + |
| 76 | +`Cargo.lock` in the merge range changed significantly (`+350/-185`), including new packages. |
| 77 | + |
| 78 | +Implication: compile/setup time for the job can increase even if workflow YAML is unchanged. |
| 79 | + |
| 80 | +### 4) DataFusion engine/query-planning code changed heavily in the merge range |
| 81 | +This merge pulled in many optimizer and execution changes from `main`, along with extensive sqllogictest updates. Even though some of these commits are labeled `perf`, the net runtime of this specific test corpus can still shift in either direction. |
| 82 | + |
| 83 | +Implication: execution time of thousands of sqllogictest queries can change due to planner/executor behavior changes, not only due to test-count growth. |
| 84 | + |
| 85 | +## Most likely explanation |
| 86 | +The duration increase most likely comes from a combination of **workload growth and dependency/build churn introduced from `main`**, not from a workflow definition change in commit `76be0b64c` itself. |
| 87 | + |
| 88 | +In other words, `76be0b64c` is the integration point where many upstream changes became active on this branch. |
| 89 | + |
| 90 | +## Confidence |
| 91 | +- High confidence: no job YAML change in this merge, and sqllogictest corpus/deps grew. |
| 92 | +- Medium confidence in the exact split between build-time and test-runtime increases, because I could not fetch GitHub step timing logs in this environment. |
| 93 | + |
| 94 | +## Limitation encountered |
| 95 | +`gh auth status` shows the local GitHub token is invalid, so I could not inspect historical GitHub Actions step durations for direct Build-vs-Run timing attribution. |
| 96 | + |
| 97 | +## Recommended next check (to confirm exact driver) |
| 98 | +Compare step durations for two runs (before/after `76be0b64c`) for: |
| 99 | +1. `Build sqllogictest binary` |
| 100 | +2. `Run sqllogictest` |
| 101 | + |
| 102 | +If the Build step grew the most, dependency/compile churn is the primary driver. |
| 103 | +If the Run step grew the most, test corpus growth or query execution behavior is the primary driver. |
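
The decision rule above can be sketched as a small comparison of per-step durations. The durations below are placeholders, not measured values: step timings could not be fetched for this report, and the function name `primary_driver` is hypothetical.

```python
def primary_driver(before: dict[str, float], after: dict[str, float]) -> str:
    """Return the step whose duration grew the most between two runs."""
    growth = {step: after[step] - before[step] for step in before}
    return max(growth, key=growth.get)


# Placeholder durations in minutes (NOT real measurements).
before_run = {"Build sqllogictest binary": 10.0, "Run sqllogictest": 50.0}
after_run = {"Build sqllogictest binary": 15.0, "Run sqllogictest": 105.0}

print(primary_driver(before_run, after_run))
```

With real step timings substituted for the placeholders, the returned step name indicates which of the two explanations to pursue first.
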