Skip to content

fix(build): Harden the merge to main builds#16424

Closed
kgpai wants to merge 12 commits intofacebookincubator:mainfrom
kgpai:export-D93500381
Closed

fix(build): Harden the merge to main builds#16424
kgpai wants to merge 12 commits intofacebookincubator:mainfrom
kgpai:export-D93500381

Conversation

@kgpai
Copy link
Contributor

@kgpai kgpai commented Feb 17, 2026

Summary: The merge to main job with resolve_dependencies fails occasionaly since it tries to get all dependencies from the internet . As we are trying to make the builds more robust and green - this change sets the job to use SYSTEM dependency. We will still test the resolve dependency module out but run it via our scheduled jobs.

Differential Revision: D93500381

@netlify
Copy link

netlify bot commented Feb 17, 2026

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit 6354c52
🔍 Latest deploy log https://app.netlify.com/projects/meta-velox/deploys/69b2d4c00e7c67000873c537

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Feb 17, 2026
@meta-codesync
Copy link

meta-codesync bot commented Feb 17, 2026

@kgpai has exported this pull request. If you are a Meta employee, you can view the originating Diff in D93500381.

@kgpai
Copy link
Contributor Author

kgpai commented Feb 26, 2026

/claude-review

kgpai pushed a commit to kgpai/velox-1 that referenced this pull request Feb 26, 2026
Summary:

The merge to main job with resolve_dependencies fails occasionaly since it tries to get all dependencies from the internet . As we are trying to make the builds more robust and green - this change sets the job to use SYSTEM dependency. We will still test the resolve dependency module out but run it via our scheduled jobs.

Differential Revision: D93500381
kgpai pushed a commit to kgpai/velox-1 that referenced this pull request Feb 26, 2026
Summary:
Pull Request resolved: facebookincubator#16424

The merge to main job with resolve_dependencies fails occasionaly since it tries to get all dependencies from the internet . As we are trying to make the builds more robust and green - this change sets the job to use SYSTEM dependency. We will still test the resolve dependency module out but run it via our scheduled jobs.

Differential Revision: D93500381
kgpai pushed a commit to kgpai/velox-1 that referenced this pull request Feb 26, 2026
Summary:

The merge to main job with resolve_dependencies fails occasionaly since it tries to get all dependencies from the internet . As we are trying to make the builds more robust and green - this change sets the job to use SYSTEM dependency. We will still test the resolve dependency module out but run it via our scheduled jobs.

Differential Revision: D93500381
@kgpai
Copy link
Contributor Author

kgpai commented Feb 26, 2026

/claude-review

1 similar comment
@kgpai
Copy link
Contributor Author

kgpai commented Feb 26, 2026

/claude-review

@github-actions
Copy link

✅ Claude Code Review

Requested by @kgpai


Now let me understand what changed in the diff better. Based on the code analysis, I can see that:

  1. The PR changes the main build job to use VELOX_DEPENDENCY_SOURCE: SYSTEM instead of BUNDLED
  2. It adds a new scheduled job that tests with VELOX_DEPENDENCY_SOURCE: BUNDLED to continue testing bundled dependencies

Summary

This is a well-reasoned build system improvement that switches the merge-to-main builds from bundled dependencies (building from source) to system dependencies (using pre-installed packages) to improve build stability. The change maintains test coverage by adding bundled dependency testing to scheduled jobs.

Issues Found

None. The changes are appropriate and well-structured.

Positive Observations

Good Build Strategy: The change addresses the stated problem of occasional build failures when fetching dependencies from the internet by switching to more reliable system dependencies for critical merge builds.

Maintains Test Coverage: The PR thoughtfully preserves bundled dependency testing by moving it to scheduled jobs rather than eliminating it entirely.

Consistent Structure: The new scheduled job follows the same pattern as existing jobs, maintaining consistency in the CI pipeline structure.

Appropriate Environment Variables: Both VELOX_DEPENDENCY_SOURCE and ICU_SOURCE are correctly set in both jobs, showing attention to detail.

Proper Conditional Guard: The new scheduled job includes the proper repository check if: ${{ github.repository == 'facebookincubator/velox' }} to prevent issues in forks.

Complete Job Definition: The new job includes all necessary steps including ccache management, dependency installation, build process, and testing.

The changes align well with the goal of making builds more robust and reducing CI failures while maintaining comprehensive dependency testing coverage.


ℹ️ About this review

This review was generated by Claude Code.
It analyzed the PR diff and codebase to provide feedback.

Limitations:

  • Claude may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

To request another review, comment /claude-review on this PR.

kgpai pushed a commit to kgpai/velox-1 that referenced this pull request Mar 10, 2026
Summary:

The merge to main job with resolve_dependencies fails occasionaly since it tries to get all dependencies from the internet . As we are trying to make the builds more robust and green - this change sets the job to use SYSTEM dependency. We will still test the resolve dependency module out but run it via our scheduled jobs.

Differential Revision: D93500381
@kgpai kgpai force-pushed the export-D93500381 branch 3 times, most recently from 4f9f688 to b27ab3e Compare March 10, 2026 05:24
meta-codesync bot pushed a commit that referenced this pull request Mar 10, 2026
Summary:
- Extract standalone `install_grpc()` function from `install_gcs_sdk_cpp()` in `setup-common.sh` and refactor `install_gcs_sdk_cpp` to reuse it
- Add `install_grpc` to `setup-ubuntu.sh`'s `install_velox_deps` so gRPC is built from source and available as a SYSTEM dependency in the container
- Install `gh` and `jq` in the ubuntu-22.04 Dockerfile, required by `apache/infrastructure-actions/stash` CI action

This enables using `VELOX_DEPENDENCY_SOURCE=SYSTEM` with `VELOX_BUILD_TESTING=ON` in the ubuntu-22.04 container without needing to override `gRPC_SOURCE=BUNDLED` or install CI tools at runtime.

Pull Request resolved: #16701

Test Plan:
- [ ] Verify the Docker image builds successfully (triggered by the `scripts/` path filter)
- [ ] Verify the `install_gcs_sdk_cpp` refactor doesn't break GCS adapter builds
- [ ] Once image is published, update PR #16424 to remove `gRPC_SOURCE: BUNDLED` override and runtime `gh`/`jq` installation

Reviewed By: pratikpugalia

Differential Revision: D95972509

Pulled By: kgpai

fbshipit-source-id: c6b9f99ed221c97b8977351f3b859682be5a8198
infvg pushed a commit to IBM/velox that referenced this pull request Mar 11, 2026
…incubator#16701)

Summary:
- Extract standalone `install_grpc()` function from `install_gcs_sdk_cpp()` in `setup-common.sh` and refactor `install_gcs_sdk_cpp` to reuse it
- Add `install_grpc` to `setup-ubuntu.sh`'s `install_velox_deps` so gRPC is built from source and available as a SYSTEM dependency in the container
- Install `gh` and `jq` in the ubuntu-22.04 Dockerfile, required by `apache/infrastructure-actions/stash` CI action

This enables using `VELOX_DEPENDENCY_SOURCE=SYSTEM` with `VELOX_BUILD_TESTING=ON` in the ubuntu-22.04 container without needing to override `gRPC_SOURCE=BUNDLED` or install CI tools at runtime.

Pull Request resolved: facebookincubator#16701

Test Plan:
- [ ] Verify the Docker image builds successfully (triggered by the `scripts/` path filter)
- [ ] Verify the `install_gcs_sdk_cpp` refactor doesn't break GCS adapter builds
- [ ] Once image is published, update PR facebookincubator#16424 to remove `gRPC_SOURCE: BUNDLED` override and runtime `gh`/`jq` installation

Reviewed By: pratikpugalia

Differential Revision: D95972509

Pulled By: kgpai

fbshipit-source-id: c6b9f99ed221c97b8977351f3b859682be5a8198
meta-codesync bot pushed a commit that referenced this pull request Mar 11, 2026
Summary:
- Pre-downloads the gflags v2.2.2 source tarball into the Ubuntu Docker image at `/velox/deps-sources/gflags-v2.2.2.tar.gz`
- This allows CI jobs that use `gflags_SOURCE=BUNDLED` to set `VELOX_GFLAGS_URL` to the local copy, avoiding a GitHub download on every run
- Needed by PR #16424 which switches ubuntu-debug to SYSTEM dependencies with BUNDLED gflags

Pull Request resolved: #16713

Test Plan:
- [ ] Docker image builds successfully (triggered automatically by this PR)
- [ ] Verify the tarball exists at `/velox/deps-sources/gflags-v2.2.2.tar.gz` in the built image

Reviewed By: pratikpugalia

Differential Revision: D96080928

Pulled By: kgpai

fbshipit-source-id: d25d0435ee6f1923408bdf83c45f8ee5fc451356
@kgpai kgpai force-pushed the export-D93500381 branch from 5c67a3f to fffcf0c Compare March 11, 2026 22:51
kgpai added 2 commits March 12, 2026 07:58
Switch ubuntu-debug from BUNDLED to SYSTEM dependency resolution to
avoid flaky internet downloads during merge-to-main builds.

Move bundled-deps testing to a new scheduled job to maintain coverage
of the resolve_dependency module.
…andalone workflow

- Switch ubuntu-debug job to use ghcr.io/facebookincubator/velox-dev:ubuntu-22.04
  container with SYSTEM dependency resolution instead of BUNDLED on plain runner
- Add conditional dependency rebuild when setup scripts change (matching fedora pattern)
- Remove -DVELOX_BUILD_SHARED=ON (container has static deps)
- Increase MAX_LINK_JOBS from 2 to 6
- Move bundled-deps job from scheduled.yml to standalone ubuntu-bundled-deps.yml
  workflow, triggered only on dependency/CMake changes and daily schedule
kgpai added 10 commits March 12, 2026 07:58
The apache/infrastructure-actions/stash action requires gh and jq
which are not present in the ubuntu-22.04 container image (unlike
the centos/fedora images which install them explicitly).
…buntu

The apt package libc-ares-dev installs the library as libcares.so
(name: cares), but the Find module only searched for c-ares. Add
cares as an alternative library name.
gRPC is not installed as a system package in the ubuntu-22.04 container
and has no Find module, only a FetchContent-based build. Override just
gRPC to build from source while all other deps use SYSTEM resolution.
The build was OOM-killed (exit code 137) at 2697/2785 during the
linking phase when 6 large static test executables were being linked
simultaneously on the 16-core runner (64GB RAM). Reducing to 3
concurrent link jobs matches the release build configuration on the
same hardware.
The velox_function_dynamic_link_test fails with SYSTEM dependencies
because Ubuntu's libgflags-dev provides shared libgflags.so by default.
When folly is statically linked into both the test binary and the
dlopen'd .so plugin, both copies try to register gflags flags into the
same shared gflags registry, causing a duplicate flag error.

The --exclude-libs,ALL linker flag only hides symbols from static
archives, not from shared library dependencies like libgflags.so.

Setting VELOX_GFLAGS_TYPE=static tells find_package(gflags) to use the
static libgflags.a (also shipped by libgflags-dev), which keeps gflags
consistent with the static folly build and allows --exclude-libs,ALL to
properly isolate the plugin's symbols.
Three changes for the ubuntu-debug SYSTEM deps build:

1. Remove runtime gh/jq installation - now baked into the container
   image (commit 8a4cf83).

2. Remove gRPC_SOURCE=BUNDLED override - gRPC is now installed in the
   container image.

3. Skip velox_function_dynamic_link_test in SYSTEM builds. The test
   fails because the container's folly is built as static (libfolly.a)
   but linked against system shared gflags (libgflags.so). Folly's
   exported CMake config hardcodes gflags_shared as a transitive
   dependency. When both the test binary and dlopen'd .so plugin
   statically link folly, both copies register the same gflags flags
   into the shared gflags registry, causing a duplicate flag error.
   The --exclude-libs,ALL linker flag cannot prevent this because it
   only hides symbols from static archives, not shared libraries.
   The test is still exercised in BUNDLED builds (Fedora, macOS).
When folly is static and gflags is shared, dlopen'ing .so plugins that
also link static folly causes duplicate gflags flag registration
(both copies of folly register the same flags in the shared gflags
registry).

Fix by:
1. Building folly with -DGFLAGS_SHARED=FALSE so its exported CMake
   config references gflags_static instead of gflags_shared.
2. Using gflags_SOURCE=BUNDLED in ubuntu-debug CI to provide a
   PIC-enabled static gflags (apt's libgflags.a lacks -fPIC and
   cannot be linked into shared libraries).

This ensures --exclude-libs,ALL fully isolates gflags symbols in .so
plugins, preventing the dual-registration crash.

Verified on test machine: all 6 dynamic link tests pass.
Fedora's gflags-devel package only provides shared libraries, so
passing -DGFLAGS_SHARED=FALSE unconditionally breaks the Fedora
Docker image build. Gate the flag on VELOX_BUILD_SHARED != ON since
the dual gflags registration issue only affects static folly builds.
- CMakeLists.txt: When gflags_SOURCE=BUNDLED, check ENV{glog_SOURCE}
  before forcing glog to BUNDLED. This allows CI to keep system glog
  (which is built against the same gflags) and avoids dual glog flag
  registration at runtime.
- linux-build-base.yml: Set glog_SOURCE=SYSTEM for ubuntu-debug so
  system glog is used alongside BUNDLED gflags.
- centos-multi.dockerfile: Add LD_LIBRARY_PATH for INSTALL_PREFIX so
  thrift1 can find libgflags.so.2.2 at runtime when folly switches
  from shared to static gflags (removing the RPATH entry).
…xistence

The previous commit checked DEFINED ENV{glog_SOURCE} but did not read
its value, leaving the glog_SOURCE CMake variable empty. This caused
velox_resolve_dependency(glog) to fail with "Invalid source for glog:".

Now reads the env var value with $ENV{glog_SOURCE} so that
glog_SOURCE=SYSTEM propagates correctly to the build.
@kgpai kgpai force-pushed the export-D93500381 branch from fffcf0c to 6354c52 Compare March 12, 2026 14:59
@meta-codesync
Copy link

meta-codesync bot commented Mar 12, 2026

@kgpai has imported this pull request. If you are a Meta employee, you can view this in D93500381.

@meta-codesync
Copy link

meta-codesync bot commented Mar 12, 2026

@kgpai merged this pull request in 1f4d5ff.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. fb-exported Merged meta-exported

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants