Optimize fuzzer artifact upload/download in scheduled.yml (#16767)#16767
Optimize fuzzer artifact upload/download in scheduled.yml (#16767)#16767pratikpugalia wants to merge 1 commit intofacebookincubator:mainfrom
Conversation
|
@pratikpugalia has exported this pull request. If you are a Meta employee, you can view the originating Diff in D96555567. |
✅ Deploy Preview for meta-velox canceled.
|
12a5eca to
625662e
Compare
…cubator#16767) Summary: The compile job in `scheduled.yml` builds 14 fuzzer binaries and uploads each as a separate artifact (~500-700 MB unstripped debug C++ binaries). This results in 14 sequential `upload-artifact` calls totaling ~15-30 min of upload time, plus redundant download overhead across 19 downstream fuzzer jobs. This diff optimizes artifact handling in two ways: 1. **Strip debug symbols**: After the build, `strip -o` copies each binary to a staging directory without debug info, reducing individual binary sizes from ~500-700 MB to ~50-70 MB each (~1 GB total). 2. **Bundle into a single artifact**: Replaces 14 individual `upload-artifact` steps with one upload of the staging directory, and updates all 19 downstream fuzzer jobs to download this single `fuzzer-binaries` artifact. Estimated impact: - Upload time: ~15-30 min → ~2-3 min (1 upload vs 14 sequential) - Download size per fuzzer job: ~500-700 MB → ~1 GB shared bundle (stripped) - Net wall-clock savings: ~15-25 min per workflow run The `signatures` artifact is unaffected. All binary filenames used by `chmod +x` and execution steps remain unchanged. Differential Revision: D96555567
|
@claude /pr-review Review from the context that this change should bring substantial improvement in either build time or build stability. Also make sure these changes are compatible in Mars if we deploy it there too |
|
Claude finished @pratikpugalia's task in 3m 8s —— View job SummaryThis is a well-motivated optimization that consolidates 14 individual fuzzer artifact uploads into a single stripped bundle. The change is mechanically correct — all binary names match, the working directory context is handled properly, and no other workflows reference the old artifact names. The estimated wall-clock savings of ~15-25 min per workflow run is credible given the reduction from 14 sequential uploads of ~500-700 MB binaries to 1 upload of ~1 GB stripped total. Two points warrant discussion before merge. Issues Found🟡 Suggestion: Stripping debug symbols may degrade fuzzer crash investigation Stripping binaries ( The fuzzers exist specifically to find bugs, and when they do, fast root-cause analysis is the priority. Consider one of:
🟡 Suggestion: No error handling if The strip step runs 14 Adding 🟢 Nitpick: Each downstream job downloads all 14 binaries but uses only 1-2 Mars CompatibilityRegarding deployment to Mars: this PR only modifies
If Mars uses its own CI system (not GitHub Actions), this change is entirely irrelevant to it — Mars wouldn't consume this workflow at all. If Mars mirrors or forks the GitHub Actions workflows, the only requirement is that the runner environment has There are no compatibility concerns for Mars deployment. Positive Observations
|
| - name: Strip fuzzer binaries | ||
| run: | | ||
| mkdir -p /tmp/fuzzer-binaries | ||
| strip -o /tmp/fuzzer-binaries/velox_expression_fuzzer_test _build/debug/velox/expression/fuzzer/velox_expression_fuzzer_test | ||
| strip -o /tmp/fuzzer-binaries/spark_expression_fuzzer_test _build/debug/velox/expression/fuzzer/spark_expression_fuzzer_test | ||
| strip -o /tmp/fuzzer-binaries/spark_aggregation_fuzzer_test _build/debug/velox/functions/sparksql/fuzzer/spark_aggregation_fuzzer_test | ||
| strip -o /tmp/fuzzer-binaries/velox_aggregation_fuzzer_test _build/debug/velox/functions/prestosql/fuzzer/velox_aggregation_fuzzer_test | ||
| strip -o /tmp/fuzzer-binaries/velox_join_fuzzer _build/debug/velox/exec/fuzzer/velox_join_fuzzer | ||
| strip -o /tmp/fuzzer-binaries/velox_exchange_fuzzer _build/debug/velox/exec/fuzzer/velox_exchange_fuzzer | ||
| strip -o /tmp/fuzzer-binaries/velox_window_fuzzer_test _build/debug/velox/functions/prestosql/fuzzer/velox_window_fuzzer_test | ||
| strip -o /tmp/fuzzer-binaries/velox_cache_fuzzer _build/debug/velox/exec/fuzzer/velox_cache_fuzzer | ||
| strip -o /tmp/fuzzer-binaries/velox_table_evolution_fuzzer_test _build/debug/velox/exec/tests/velox_table_evolution_fuzzer_test | ||
| strip -o /tmp/fuzzer-binaries/velox_memory_arbitration_fuzzer _build/debug/velox/exec/fuzzer/velox_memory_arbitration_fuzzer | ||
| strip -o /tmp/fuzzer-binaries/velox_row_number_fuzzer _build/debug/velox/exec/fuzzer/velox_row_number_fuzzer | ||
| strip -o /tmp/fuzzer-binaries/velox_topn_row_number_fuzzer _build/debug/velox/exec/fuzzer/velox_topn_row_number_fuzzer | ||
| strip -o /tmp/fuzzer-binaries/velox_writer_fuzzer_test _build/debug/velox/functions/prestosql/fuzzer/velox_writer_fuzzer_test | ||
| strip -o /tmp/fuzzer-binaries/velox_spatial_join_fuzzer _build/debug/velox/exec/fuzzer/velox_spatial_join_fuzzer |
There was a problem hiding this comment.
Consider using strip --strip-debug instead of strip (full strip). Full strip removes the symbol table, so fuzzer crash stack traces will only show raw addresses — making it harder to investigate failures without rebuilding.
--strip-debug removes DWARF debug info (the bulk of the size) while preserving function names in the symbol table. You'd still get ~80-90% size reduction, and crash stack traces would show meaningful function names.
Also, add set -euo pipefail at the top of this block so that a failed strip command fails the step immediately rather than silently producing a partial set of binaries:
| - name: Strip fuzzer binaries | |
| run: | | |
| mkdir -p /tmp/fuzzer-binaries | |
| strip -o /tmp/fuzzer-binaries/velox_expression_fuzzer_test _build/debug/velox/expression/fuzzer/velox_expression_fuzzer_test | |
| strip -o /tmp/fuzzer-binaries/spark_expression_fuzzer_test _build/debug/velox/expression/fuzzer/spark_expression_fuzzer_test | |
| strip -o /tmp/fuzzer-binaries/spark_aggregation_fuzzer_test _build/debug/velox/functions/sparksql/fuzzer/spark_aggregation_fuzzer_test | |
| strip -o /tmp/fuzzer-binaries/velox_aggregation_fuzzer_test _build/debug/velox/functions/prestosql/fuzzer/velox_aggregation_fuzzer_test | |
| strip -o /tmp/fuzzer-binaries/velox_join_fuzzer _build/debug/velox/exec/fuzzer/velox_join_fuzzer | |
| strip -o /tmp/fuzzer-binaries/velox_exchange_fuzzer _build/debug/velox/exec/fuzzer/velox_exchange_fuzzer | |
| strip -o /tmp/fuzzer-binaries/velox_window_fuzzer_test _build/debug/velox/functions/prestosql/fuzzer/velox_window_fuzzer_test | |
| strip -o /tmp/fuzzer-binaries/velox_cache_fuzzer _build/debug/velox/exec/fuzzer/velox_cache_fuzzer | |
| strip -o /tmp/fuzzer-binaries/velox_table_evolution_fuzzer_test _build/debug/velox/exec/tests/velox_table_evolution_fuzzer_test | |
| strip -o /tmp/fuzzer-binaries/velox_memory_arbitration_fuzzer _build/debug/velox/exec/fuzzer/velox_memory_arbitration_fuzzer | |
| strip -o /tmp/fuzzer-binaries/velox_row_number_fuzzer _build/debug/velox/exec/fuzzer/velox_row_number_fuzzer | |
| strip -o /tmp/fuzzer-binaries/velox_topn_row_number_fuzzer _build/debug/velox/exec/fuzzer/velox_topn_row_number_fuzzer | |
| strip -o /tmp/fuzzer-binaries/velox_writer_fuzzer_test _build/debug/velox/functions/prestosql/fuzzer/velox_writer_fuzzer_test | |
| strip -o /tmp/fuzzer-binaries/velox_spatial_join_fuzzer _build/debug/velox/exec/fuzzer/velox_spatial_join_fuzzer | |
| - name: Strip fuzzer binaries | |
| run: | | |
| set -euo pipefail | |
| mkdir -p /tmp/fuzzer-binaries | |
| strip --strip-debug -o /tmp/fuzzer-binaries/velox_expression_fuzzer_test _build/debug/velox/expression/fuzzer/velox_expression_fuzzer_test | |
| strip --strip-debug -o /tmp/fuzzer-binaries/spark_expression_fuzzer_test _build/debug/velox/expression/fuzzer/spark_expression_fuzzer_test | |
| strip --strip-debug -o /tmp/fuzzer-binaries/spark_aggregation_fuzzer_test _build/debug/velox/functions/sparksql/fuzzer/spark_aggregation_fuzzer_test | |
| strip --strip-debug -o /tmp/fuzzer-binaries/velox_aggregation_fuzzer_test _build/debug/velox/functions/prestosql/fuzzer/velox_aggregation_fuzzer_test | |
| strip --strip-debug -o /tmp/fuzzer-binaries/velox_join_fuzzer _build/debug/velox/exec/fuzzer/velox_join_fuzzer | |
| strip --strip-debug -o /tmp/fuzzer-binaries/velox_exchange_fuzzer _build/debug/velox/exec/fuzzer/velox_exchange_fuzzer | |
| strip --strip-debug -o /tmp/fuzzer-binaries/velox_window_fuzzer_test _build/debug/velox/functions/prestosql/fuzzer/velox_window_fuzzer_test | |
| strip --strip-debug -o /tmp/fuzzer-binaries/velox_cache_fuzzer _build/debug/velox/exec/fuzzer/velox_cache_fuzzer | |
| strip --strip-debug -o /tmp/fuzzer-binaries/velox_table_evolution_fuzzer_test _build/debug/velox/exec/tests/velox_table_evolution_fuzzer_test | |
| strip --strip-debug -o /tmp/fuzzer-binaries/velox_memory_arbitration_fuzzer _build/debug/velox/exec/fuzzer/velox_memory_arbitration_fuzzer | |
| strip --strip-debug -o /tmp/fuzzer-binaries/velox_row_number_fuzzer _build/debug/velox/exec/fuzzer/velox_row_number_fuzzer | |
| strip --strip-debug -o /tmp/fuzzer-binaries/velox_topn_row_number_fuzzer _build/debug/velox/exec/fuzzer/velox_topn_row_number_fuzzer | |
| strip --strip-debug -o /tmp/fuzzer-binaries/velox_writer_fuzzer_test _build/debug/velox/functions/prestosql/fuzzer/velox_writer_fuzzer_test | |
| strip --strip-debug -o /tmp/fuzzer-binaries/velox_spatial_join_fuzzer _build/debug/velox/exec/fuzzer/velox_spatial_join_fuzzer |
…cubator#16767) Summary: The compile job in `scheduled.yml` builds 14 fuzzer binaries and uploads each as a separate artifact (~500-700 MB unstripped debug C++ binaries). This results in 14 sequential `upload-artifact` calls totaling ~15-30 min of upload time, plus redundant download overhead across 19 downstream fuzzer jobs. This diff optimizes artifact handling in two ways: 1. **Strip debug symbols**: After the build, `strip -o` copies each binary to a staging directory without debug info, reducing individual binary sizes from ~500-700 MB to ~50-70 MB each (~1 GB total). 2. **Bundle into a single artifact**: Replaces 14 individual `upload-artifact` steps with one upload of the staging directory, and updates all 19 downstream fuzzer jobs to download this single `fuzzer-binaries` artifact. Estimated impact: - Upload time: ~15-30 min → ~2-3 min (1 upload vs 14 sequential) - Download size per fuzzer job: ~500-700 MB → ~1 GB shared bundle (stripped) - Net wall-clock savings: ~15-25 min per workflow run The `signatures` artifact is unaffected. All binary filenames used by `chmod +x` and execution steps remain unchanged. Differential Revision: D96555567
625662e to
0d488e7
Compare
…cubator#16767) Summary: Pull Request resolved: facebookincubator#16767 The compile job in `scheduled.yml` builds 14 fuzzer binaries and uploads each as a separate artifact (~500-700 MB unstripped debug C++ binaries). This results in 14 sequential `upload-artifact` calls totaling ~15-30 min of upload time, plus redundant download overhead across 19 downstream fuzzer jobs. This diff optimizes artifact handling in two ways: 1. **Strip debug symbols**: After the build, `strip -o` copies each binary to a staging directory without debug info, reducing individual binary sizes from ~500-700 MB to ~50-70 MB each (~1 GB total). 2. **Bundle into a single artifact**: Replaces 14 individual `upload-artifact` steps with one upload of the staging directory, and updates all 19 downstream fuzzer jobs to download this single `fuzzer-binaries` artifact. Estimated impact: - Upload time: ~15-30 min → ~2-3 min (1 upload vs 14 sequential) - Download size per fuzzer job: ~500-700 MB → ~1 GB shared bundle (stripped) - Net wall-clock savings: ~15-25 min per workflow run The `signatures` artifact is unaffected. All binary filenames used by `chmod +x` and execution steps remain unchanged. Differential Revision: D96555567
0d488e7 to
67013db
Compare
Summary:
The compile job in
scheduled.ymlbuilds 14 fuzzer binaries and uploads each asa separate artifact (~500-700 MB unstripped debug C++ binaries). This results in
14 sequential
upload-artifactcalls totaling ~15-30 min of upload time, plusredundant download overhead across 19 downstream fuzzer jobs.
This diff optimizes artifact handling in two ways:
Strip debug symbols: After the build,
strip -ocopies each binary to astaging directory without debug info, reducing individual binary sizes from
~500-700 MB to ~50-70 MB each (~1 GB total).
Bundle into a single artifact: Replaces 14 individual
upload-artifactsteps with one upload of the staging directory, and updates all 19 downstream
fuzzer jobs to download this single
fuzzer-binariesartifact.Estimated impact:
The
signaturesartifact is unaffected. All binary filenames used bychmod +xand execution steps remain unchanged.
Differential Revision: D96555567