Optimize fuzzer artifact upload/download in scheduled.yml (#16767) #16767

Open
pratikpugalia wants to merge 1 commit into facebookincubator:main from pratikpugalia:export-D96555567

@pratikpugalia (Contributor) commented Mar 14, 2026

Summary:

The compile job in scheduled.yml builds 14 fuzzer binaries and uploads each as
a separate artifact (~500-700 MB unstripped debug C++ binaries). This results in
14 sequential upload-artifact calls totaling ~15-30 min of upload time, plus
redundant download overhead across 19 downstream fuzzer jobs.

This diff optimizes artifact handling in two ways:

  1. Strip debug symbols: After the build, strip -o copies each binary to a
    staging directory without debug info, reducing individual binary sizes from
    ~500-700 MB to ~50-70 MB each (~1 GB total).

  2. Bundle into a single artifact: Replaces 14 individual upload-artifact
    steps with one upload of the staging directory, and updates all 19 downstream
    fuzzer jobs to download this single fuzzer-binaries artifact.

Estimated impact:

  • Upload time: ~15-30 min → ~2-3 min (1 upload vs 14 sequential)
  • Download size per fuzzer job: ~500-700 MB → ~1 GB shared bundle (stripped)
  • Net wall-clock savings: ~15-25 min per workflow run
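
As a rough check on the size estimates above, the per-binary ranges can be multiplied out over the 14 binaries. This is only back-of-the-envelope arithmetic on the figures quoted in this summary, not a measurement; the function name `bundle_range_mb` is made up for illustration.

```shell
# Multiply a per-binary size range (in MB) by the number of binaries.
# Usage: bundle_range_mb COUNT LOW_MB HIGH_MB
bundle_range_mb() {
  local count="$1" low="$2" high="$3"
  echo "$(( count * low ))-$(( count * high ))"
}

# 14 unstripped binaries at ~500-700 MB each.
echo "unstripped total (MB): $(bundle_range_mb 14 500 700)"
# 14 stripped binaries at ~50-70 MB each.
echo "stripped total (MB):   $(bundle_range_mb 14 50 70)"
```

Under these figures the stripped bundle comes to roughly 0.7-1 GB, which is consistent with the "~1 GB total" estimate, against roughly 7-10 GB for the unstripped set.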

The signatures artifact is unaffected. All binary filenames used by chmod +x
and execution steps remain unchanged.

Differential Revision: D96555567

meta-cla bot added the CLA Signed label Mar 14, 2026

meta-codesync bot commented Mar 14, 2026

@pratikpugalia has exported this pull request. If you are a Meta employee, you can view the originating Diff in D96555567.

netlify bot commented Mar 14, 2026

Deploy Preview for meta-velox canceled.

🔨 Latest commit: 67013db
🔍 Latest deploy log: https://app.netlify.com/projects/meta-velox/deploys/69b883e5d25e8b0008ecceb9

pratikpugalia changed the title from "Optimize fuzzer artifact upload/download in scheduled.yml" to "build: Optimize fuzzer artifact upload/download in scheduled.yml" Mar 14, 2026
meta-codesync bot changed the title from "build: Optimize fuzzer artifact upload/download in scheduled.yml" to "Optimize fuzzer artifact upload/download in scheduled.yml (#16767)" Mar 14, 2026
pratikpugalia added a commit to pratikpugalia/velox that referenced this pull request Mar 14, 2026
…cubator#16767)

pratikpugalia changed the title from "Optimize fuzzer artifact upload/download in scheduled.yml (#16767)" to "build: Optimize fuzzer artifact upload/download in scheduled.yml (#16767)" Mar 14, 2026
@pratikpugalia (Contributor, Author) commented:

@claude /pr-review Review from the context that this change should bring substantial improvement in either build time or build stability. Also make sure these changes are compatible in Mars if we deploy it there too

claude bot commented Mar 16, 2026

Claude finished @pratikpugalia's task in 3m 8s.


Summary

This is a well-motivated optimization that consolidates 14 individual fuzzer artifact uploads into a single stripped bundle. The change is mechanically correct — all binary names match, the working directory context is handled properly, and no other workflows reference the old artifact names. The estimated wall-clock savings of ~15-25 min per workflow run is credible given the reduction from 14 sequential uploads of ~500-700 MB binaries to 1 upload of ~1 GB stripped total.

Two points warrant discussion before merge.

Issues Found

🟡 Suggestion: Stripping debug symbols may degrade fuzzer crash investigation
.github/workflows/scheduled.yml:288-304

Running strip with no stripping option (the -o flag only names the output file) removes the symbol table as well as the DWARF debug info and line-number tables. When a fuzzer finds a crash, the resulting stack trace will largely show raw addresses instead of function names and source locations. Investigating a crash would then require either a rebuild with debug symbols or access to the unstripped binary, which adds friction to the debugging workflow.

The fuzzers exist specifically to find bugs, and when they do, fast root-cause analysis is the priority. Consider one of:

  • Using strip --strip-debug (or objcopy --strip-debug) instead of strip — this removes DWARF debug info but preserves the symbol table, giving you function names in stack traces while still achieving significant size reduction (~80-90% reduction vs ~90-95% for full strip).
  • Uploading the full unstripped binaries as a second artifact with a short retention (e.g., 1 day) that's only downloaded on-demand for crash investigation.
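
The difference between the two stripping modes can be checked on a throwaway binary. The following is a minimal sketch, assuming a Linux environment with `cc`, GNU `strip`, and `nm` on the PATH; the file names and the `crash_here` function are hypothetical and unrelated to the Velox build.

```shell
# Build a tiny debug binary, strip it two ways, and report whether the
# function symbol survives in each copy.
compare_strip_modes() {
  local work f
  work="$(mktemp -d)" || return 1
  cat > "$work/demo.c" <<'EOF'
int crash_here(int x) { return x + 1; }
int main(void) { return crash_here(-1); }
EOF
  cc -g -O0 -o "$work/demo" "$work/demo.c" || return 1
  # Full strip: drops the symbol table along with the debug sections.
  strip -o "$work/demo.full" "$work/demo" || return 1
  # Debug-only strip: drops debug sections, keeps the symbol table.
  strip --strip-debug -o "$work/demo.nodebug" "$work/demo" || return 1

  for f in demo demo.full demo.nodebug; do
    if nm "$work/$f" 2>/dev/null | grep -q ' crash_here$'; then
      echo "$f: crash_here present"
    else
      echo "$f: crash_here missing"
    fi
  done
  rm -rf "$work"
}
```

With GNU binutils, `compare_strip_modes` should report the symbol as present in `demo` and `demo.nodebug` and missing in `demo.full`. The actual size reduction from each mode depends on the binary and is not measured here.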

🟡 Suggestion: Make error handling explicit in the strip step
.github/workflows/scheduled.yml:289

The strip step runs 14 strip commands in sequence. On Linux runners, GitHub Actions invokes run blocks with bash -e when no shell is specified, so a failing strip (e.g., disk space, corrupted binary) should already abort the step. If that default is ever overridden by a custom shell setting, the remaining commands would still execute and the upload step would proceed with a partial set of binaries. Downstream jobs that need a missing binary would then fail with a confusing "file not found" error at chmod +x time rather than a clear build failure.

Adding set -euo pipefail at the top of the run block makes the fail-fast behavior explicit and also catches unset variables and pipeline failures. This is already standard practice in the other run blocks in this workflow.
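
To illustrate what fail-fast means for a multi-command block, here is a small sketch that runs the same two commands with and without `set -euo pipefail`. The helper name `run_block` is hypothetical, and `false` merely stands in for a failing strip call.

```shell
# Run a two-command script body in a fresh bash process.
# $1: "strict" prepends `set -euo pipefail`; any other value runs plain bash.
run_block() {
  local prelude=""
  if [ "$1" = "strict" ]; then
    prelude="set -euo pipefail"
  fi
  bash -c "$prelude
false
echo reached-second-command"
}
```

`run_block plain` prints `reached-second-command` and exits 0, because the failure of the first command is ignored. `run_block strict` prints nothing and exits non-zero, because errexit stops the script at the first failing command.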

🟢 Nitpick: Each downstream job downloads all 14 binaries but uses only 1-2
Previously, each fuzzer job downloaded only its specific binary (~500-700 MB unstripped). Now every job downloads the entire ~1 GB stripped bundle, so the per-job download grows rather than shrinks, and total download across the 19 downstream jobs rises from roughly 10-13 GB (assuming one binary per job) to ~19 GB. This is a reasonable trade-off given the upload time savings, but worth noting.

Mars Compatibility

Regarding deployment to Mars: this PR only modifies .github/workflows/scheduled.yml, which is a GitHub Actions workflow file. It has no impact on the Velox library code, build system, or binary interfaces. The changes are purely CI-operational:

  • No source code changes
  • No CMake / build configuration changes
  • No changes to binary names, APIs, or library behavior
  • The strip command is a standard binutils tool, present on most Linux build environments that ship a C++ toolchain

If Mars uses its own CI system (not GitHub Actions), this change is entirely irrelevant to it — Mars wouldn't consume this workflow at all. If Mars mirrors or forks the GitHub Actions workflows, the only requirement is that the runner environment has strip available, which is standard on any Linux build environment with a C++ toolchain.
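
If the workflow is ever run on a different runner image, a preflight check could make a missing tool fail early with a clear message. This is a minimal sketch; the function name `require_tools` is an assumption for illustration and is not part of the PR.

```shell
# Return non-zero (and list what is missing) unless every named tool
# can be found on the PATH.
require_tools() {
  local missing=0 tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing required tool: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example: call this before the strip step.
# require_tools strip || exit 1
```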

There are no compatibility concerns for Mars deployment.

Positive Observations

  • Net reduction of 73 lines — the diff removes substantial YAML duplication (14 near-identical upload steps → 1 strip step + 1 upload step).
  • The binary filenames in the strip commands exactly match the basenames used by all 19 downstream jobs (verified exhaustively).
  • The artifact name fuzzer-binaries is clear and descriptive.
  • The signatures artifact is correctly left unchanged.
  • The old upload paths had inconsistencies (e.g., double-slash in velox/_build/debug//velox/exec/fuzzer/) that are cleaned up by using explicit output filenames in strip -o.

Comment on lines +288 to +304
- name: Strip fuzzer binaries
  run: |
    mkdir -p /tmp/fuzzer-binaries
    strip -o /tmp/fuzzer-binaries/velox_expression_fuzzer_test _build/debug/velox/expression/fuzzer/velox_expression_fuzzer_test
    strip -o /tmp/fuzzer-binaries/spark_expression_fuzzer_test _build/debug/velox/expression/fuzzer/spark_expression_fuzzer_test
    strip -o /tmp/fuzzer-binaries/spark_aggregation_fuzzer_test _build/debug/velox/functions/sparksql/fuzzer/spark_aggregation_fuzzer_test
    strip -o /tmp/fuzzer-binaries/velox_aggregation_fuzzer_test _build/debug/velox/functions/prestosql/fuzzer/velox_aggregation_fuzzer_test
    strip -o /tmp/fuzzer-binaries/velox_join_fuzzer _build/debug/velox/exec/fuzzer/velox_join_fuzzer
    strip -o /tmp/fuzzer-binaries/velox_exchange_fuzzer _build/debug/velox/exec/fuzzer/velox_exchange_fuzzer
    strip -o /tmp/fuzzer-binaries/velox_window_fuzzer_test _build/debug/velox/functions/prestosql/fuzzer/velox_window_fuzzer_test
    strip -o /tmp/fuzzer-binaries/velox_cache_fuzzer _build/debug/velox/exec/fuzzer/velox_cache_fuzzer
    strip -o /tmp/fuzzer-binaries/velox_table_evolution_fuzzer_test _build/debug/velox/exec/tests/velox_table_evolution_fuzzer_test
    strip -o /tmp/fuzzer-binaries/velox_memory_arbitration_fuzzer _build/debug/velox/exec/fuzzer/velox_memory_arbitration_fuzzer
    strip -o /tmp/fuzzer-binaries/velox_row_number_fuzzer _build/debug/velox/exec/fuzzer/velox_row_number_fuzzer
    strip -o /tmp/fuzzer-binaries/velox_topn_row_number_fuzzer _build/debug/velox/exec/fuzzer/velox_topn_row_number_fuzzer
    strip -o /tmp/fuzzer-binaries/velox_writer_fuzzer_test _build/debug/velox/functions/prestosql/fuzzer/velox_writer_fuzzer_test
    strip -o /tmp/fuzzer-binaries/velox_spatial_join_fuzzer _build/debug/velox/exec/fuzzer/velox_spatial_join_fuzzer
Consider using strip --strip-debug instead of strip (full strip). Full strip removes the symbol table, so fuzzer crash stack traces will only show raw addresses — making it harder to investigate failures without rebuilding.

--strip-debug removes DWARF debug info (the bulk of the size) while preserving function names in the symbol table. You'd still get ~80-90% size reduction, and crash stack traces would show meaningful function names.

Also, add set -euo pipefail at the top of this block so that the fail-fast behavior is explicit and does not depend on the runner's default shell options:

Suggested change (replaces the step quoted above):
- name: Strip fuzzer binaries
  run: |
    set -euo pipefail
    mkdir -p /tmp/fuzzer-binaries
    strip --strip-debug -o /tmp/fuzzer-binaries/velox_expression_fuzzer_test _build/debug/velox/expression/fuzzer/velox_expression_fuzzer_test
    strip --strip-debug -o /tmp/fuzzer-binaries/spark_expression_fuzzer_test _build/debug/velox/expression/fuzzer/spark_expression_fuzzer_test
    strip --strip-debug -o /tmp/fuzzer-binaries/spark_aggregation_fuzzer_test _build/debug/velox/functions/sparksql/fuzzer/spark_aggregation_fuzzer_test
    strip --strip-debug -o /tmp/fuzzer-binaries/velox_aggregation_fuzzer_test _build/debug/velox/functions/prestosql/fuzzer/velox_aggregation_fuzzer_test
    strip --strip-debug -o /tmp/fuzzer-binaries/velox_join_fuzzer _build/debug/velox/exec/fuzzer/velox_join_fuzzer
    strip --strip-debug -o /tmp/fuzzer-binaries/velox_exchange_fuzzer _build/debug/velox/exec/fuzzer/velox_exchange_fuzzer
    strip --strip-debug -o /tmp/fuzzer-binaries/velox_window_fuzzer_test _build/debug/velox/functions/prestosql/fuzzer/velox_window_fuzzer_test
    strip --strip-debug -o /tmp/fuzzer-binaries/velox_cache_fuzzer _build/debug/velox/exec/fuzzer/velox_cache_fuzzer
    strip --strip-debug -o /tmp/fuzzer-binaries/velox_table_evolution_fuzzer_test _build/debug/velox/exec/tests/velox_table_evolution_fuzzer_test
    strip --strip-debug -o /tmp/fuzzer-binaries/velox_memory_arbitration_fuzzer _build/debug/velox/exec/fuzzer/velox_memory_arbitration_fuzzer
    strip --strip-debug -o /tmp/fuzzer-binaries/velox_row_number_fuzzer _build/debug/velox/exec/fuzzer/velox_row_number_fuzzer
    strip --strip-debug -o /tmp/fuzzer-binaries/velox_topn_row_number_fuzzer _build/debug/velox/exec/fuzzer/velox_topn_row_number_fuzzer
    strip --strip-debug -o /tmp/fuzzer-binaries/velox_writer_fuzzer_test _build/debug/velox/functions/prestosql/fuzzer/velox_writer_fuzzer_test
    strip --strip-debug -o /tmp/fuzzer-binaries/velox_spatial_join_fuzzer _build/debug/velox/exec/fuzzer/velox_spatial_join_fuzzer
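
The per-binary commands above all follow one pattern, so the step could also be written as a loop over relative paths. The sketch below is one possible shape, not what the PR does; the function name `stage_fuzzers` and its argument order are assumptions made for illustration.

```shell
# Strip debug info from each listed binary into a flat staging directory.
# $1: build root, $2: staging directory, remaining args: paths relative
# to the build root. Stops at the first failure.
stage_fuzzers() {
  local build_root="$1" out_dir="$2" rel
  shift 2
  mkdir -p "$out_dir" || return 1
  for rel in "$@"; do
    strip --strip-debug -o "$out_dir/$(basename "$rel")" "$build_root/$rel" || return 1
  done
}

# Example call, using two of the paths from the step above:
# stage_fuzzers _build/debug /tmp/fuzzer-binaries \
#   velox/expression/fuzzer/velox_expression_fuzzer_test \
#   velox/exec/fuzzer/velox_join_fuzzer
```

The explicit list in the workflow is easier to grep for a given binary name, while the loop keeps the strip options in a single place; either form produces the same staged filenames.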

meta-codesync bot changed the title from "build: Optimize fuzzer artifact upload/download in scheduled.yml (#16767)" to "Optimize fuzzer artifact upload/download in scheduled.yml (#16767)" Mar 16, 2026
pratikpugalia added a commit to pratikpugalia/velox that referenced this pull request Mar 16, 2026
…cubator#16767)

…cubator#16767)

Summary:
Pull Request resolved: facebookincubator#16767


Labels

CLA Signed, fb-exported, meta-exported
