Commit a5b601d

fix(infra): Resolve Race Condition in Parallel Base Image Builds (#14189)
## Summary

This PR fixes a critical race condition in the base image build process that caused the `gcr.io/oss-fuzz-base/base-builder:ubuntu-24-04` image to be incorrectly built with an Ubuntu 20.04 base. The fix ensures build steps are executed in the correct order by explicitly defining a dependency graph, guaranteeing that versioned images are always built on top of their corresponding, freshly-built base layers.

## The Problem

A report indicated that the `base-builder:ubuntu-24-04` image contained Ubuntu 20.04. An initial investigation confirmed this behavior.

### Investigation Steps

1. **Dockerfile Verification:** The entire dependency chain of Dockerfiles was inspected:
   * `base-builder:ubuntu-24-04` correctly used `FROM base-clang:ubuntu-24-04`.
   * `base-clang:ubuntu-24-04` correctly used `FROM base-image:ubuntu-24-04`.
   * `base-image:ubuntu-24-04` correctly used `FROM ubuntu:24.04`.

   This ruled out any static configuration errors in the Dockerfiles themselves.
2. **Build Process Analysis:** A `dry-run` of the `infra/build/functions/base_images.py` script revealed that all build steps for the different base images were being generated to run in parallel in Google Cloud Build.

### Root Cause: Race Condition

The parallel execution was the source of the problem. Because the builds for `base-image`, `base-clang`, and `base-builder` were triggered simultaneously, a race condition occurred:

* The `base-builder:ubuntu-24-04` build would start.
* It would immediately try to pull its base image, `gcr.io/oss-fuzz-base/base-clang:ubuntu-24-04`.
* However, the build for the *new* `base-clang:ubuntu-24-04` had not yet finished.
* The build process would then fall back to using the existing image with that tag in the container registry, which was an older, incorrectly built version based on Ubuntu 20.04.

The same issue was happening between `base-clang` and `base-image`.
## The Solution

To resolve this, we now enforce a sequential build order that respects the image dependency hierarchy.

1. **Dependency Map:** An `IMAGE_DEPENDENCIES` dictionary was introduced in `infra/build/functions/base_images.py` to define the explicit build order (e.g., `base-clang` depends on `base-image`).
2. **Sequential Build Steps:** The `get_base_image_steps` function was updated to read this map and inject a `waitFor` clause into each Google Cloud Build step. This forces GCB to wait for a dependency to finish building before starting the next step in the chain.

### Verification

A `dry-run` was executed after the fix, and the generated build steps now correctly reflect the sequential dependency order. A full build was also triggered, confirming that the fix works in a real environment and produces the correct image.

This change ensures the integrity and correctness of our base images without sacrificing the parallelism between different Ubuntu version builds (e.g., the `ubuntu-20-04` and `ubuntu-24-04` builds still run in parallel with each other).
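The `waitFor` injection described above can be sketched in isolation. This is a minimal illustration, not the actual module: `make_step` is a hypothetical stand-in for `build_lib.get_docker_build_step`, the `build-<name>` step ids are invented for the example, and only a subset of `IMAGE_DEPENDENCIES` is reproduced.

```python
# Subset of the dependency map introduced by this commit.
IMAGE_DEPENDENCIES = {
    'base-clang': ['base-image'],
    'base-builder': ['base-clang'],
    'base-runner': ['base-image', 'base-builder'],
}


def make_step(image_name):
  """Hypothetical stand-in for build_lib.get_docker_build_step."""
  # The id format is an assumption made for this sketch only.
  return {'id': f'build-{image_name}', 'args': ['build', image_name]}


def add_wait_for(image_names):
  """Builds one step per image and links each step to its dependencies."""
  steps = []
  build_ids = {}
  for name in image_names:
    step = make_step(name)
    # Only dependencies whose steps were already generated can be referenced.
    wait_for = [
        build_ids[dep]
        for dep in IMAGE_DEPENDENCIES.get(name, [])
        if dep in build_ids
    ]
    if wait_for:
      step['waitFor'] = wait_for
    build_ids[name] = step['id']
    steps.append(step)
  return steps


steps = add_wait_for(
    ['base-image', 'base-clang', 'base-builder', 'base-runner'])
for step in steps:
  print(step['id'], '<-', step.get('waitFor', []))
```

In this sketch `base-image` gets no `waitFor` entry, while `base-runner` waits on both the `base-image` and `base-builder` steps, mirroring the chain the PR describes.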
1 parent 211c3eb commit a5b601d

1 file changed: +35 -5 lines changed

infra/build/functions/base_images.py

Lines changed: 35 additions & 5 deletions
@@ -47,6 +47,23 @@
 # This version will receive the ':v1' tag.
 DEFAULT_VERSION = 'legacy'

+# Defines the dependency graph for base images.
+IMAGE_DEPENDENCIES = {
+    'base-clang': ['base-image'],
+    'base-clang-full': ['base-clang'],
+    'base-builder': ['base-clang'],
+    'base-builder-go': ['base-builder'],
+    'base-builder-javascript': ['base-builder'],
+    'base-builder-jvm': ['base-builder'],
+    'base-builder-python': ['base-builder'],
+    'base-builder-ruby': ['base-builder'],
+    'base-builder-rust': ['base-builder'],
+    'base-builder-swift': ['base-builder'],
+    'base-runner': ['base-image', 'base-builder'],
+    'base-runner-debug': ['base-runner'],
+    'indexer': ['base-clang-full'],
+}
+

 class ImageConfig:
   """Configuration for a specific base image version."""
@@ -85,6 +102,8 @@ def _resolve_dockerfile(self) -> str:
       if os.path.exists(versioned_dockerfile):
         logging.info('Using versioned Dockerfile: %s', versioned_dockerfile)
         return versioned_dockerfile
+      raise FileNotFoundError(
+          f'Versioned Dockerfile not found for {self.name}:{self.version}')

     legacy_dockerfile = os.path.join(self.path, 'Dockerfile')
     logging.info('Using legacy Dockerfile: %s', legacy_dockerfile)
@@ -156,6 +175,8 @@ def full_image_name_with_tag(self) -> str:
 def get_base_image_steps(images: Sequence[ImageConfig]) -> list[dict]:
   """Returns build steps for a given list of image configurations."""
   steps = [build_lib.get_git_clone_step()]
+  build_ids = {}
+
   for image_config in images:
     # The final tag is ':v1' for the default version, or the version name
     # (e.g., ':ubuntu-24-04') for others.
@@ -167,11 +188,20 @@ def get_base_image_steps(images: Sequence[ImageConfig]) -> list[dict]:
       tags.append(f'{IMAGE_NAME_PREFIX}{image_config.name}:latest')

     dockerfile_path = os.path.join('oss-fuzz', image_config.dockerfile_path)
-    steps.append(
-        build_lib.get_docker_build_step(tags,
-                                        image_config.path,
-                                        dockerfile_path=dockerfile_path,
-                                        build_args=image_config.build_args))
+    step = build_lib.get_docker_build_step(tags,
+                                           image_config.path,
+                                           dockerfile_path=dockerfile_path,
+                                           build_args=image_config.build_args)
+
+    # Check for dependencies and add 'waitFor' if necessary.
+    dependencies = IMAGE_DEPENDENCIES.get(image_config.name, [])
+    wait_for = [build_ids[dep] for dep in dependencies if dep in build_ids]
+    if wait_for:
+      step['waitFor'] = wait_for
+
+    build_ids[image_config.name] = step['id']
+    steps.append(step)
+
   return steps


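One property of the loop in `get_base_image_steps` is worth noting: the `if dep in build_ids` filter only sees steps generated earlier in the iteration, so a dependency that appears later in `images` is silently skipped. The commit does not show how `images` is ordered, so whether this matters in practice is unknown. If ordering ever needed to be enforced, one possible approach (not part of this commit) is a topological sort with the standard-library `graphlib` module, available from Python 3.9. The helper name `order_images` is hypothetical, and only a subset of the dependency map is reproduced.

```python
from graphlib import TopologicalSorter

# Subset of the dependency map introduced by this commit.
IMAGE_DEPENDENCIES = {
    'base-clang': ['base-image'],
    'base-builder': ['base-clang'],
    'base-runner': ['base-image', 'base-builder'],
}


def order_images(image_names):
  """Returns image_names sorted so that dependencies come first.

  Hypothetical helper: dependencies that are not part of image_names are
  ignored, matching the 'if dep in build_ids' filter in the diff.
  """
  wanted = set(image_names)
  # TopologicalSorter expects a mapping of node -> predecessors.
  graph = {
      name: [dep for dep in IMAGE_DEPENDENCIES.get(name, []) if dep in wanted]
      for name in image_names
  }
  return list(TopologicalSorter(graph).static_order())


ordered = order_images(['base-runner', 'base-builder', 'base-clang',
                        'base-image'])
print(ordered)
```

`static_order()` raises `graphlib.CycleError` if the map ever contains a cycle, which would surface a misconfigured dependency graph early instead of producing an incomplete `waitFor` list.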