@adityamaru commented Nov 15, 2025

Problem

The step-checker feature was causing warnings and preventing sticky disk commits in container jobs:

Warning: Unable to check for previous step failures: _diag directory not found at /__w/web/web/_diag
Warning: Skipping sticky disk commit due to ambiguity in failure detection

Root Cause:

  • Container jobs mount the workspace at /__w/ inside the container
  • The _diag directory exists on the host at /home/runner/_diag but is not mounted into containers
  • The step-checker tried to access _diag and failed, returning an error
  • This prevented sticky disk commits from happening

Impact:

  • Container jobs have been affected ever since the step-checker was deployed
  • Only jobs with container: in their workflow config are affected

Solution

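Add container detection at the start of the step-checker: check for /.dockerenv, look for docker/containerd in /proc/1/cgroup, and check whether the working directory starts with /__w/. When running inside a container, skip the step-checker gracefully since _diag is not accessible.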

No changes to existing path detection logic - only adds the container check at the beginning.
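
To illustrate the "container check at the beginning" shape, here is a minimal sketch. It is not the code from src/step-checker.ts: the names `StepCheckResult`, `checkPreviousStepFailures`, `inContainer`, and `findFailuresInDiag` are hypothetical, and the two callbacks are injected only so the control flow can be exercised without a real runner.

```typescript
// Hypothetical result shape; the real step-checker's return type is not shown in this PR.
interface StepCheckResult {
  hasFailures: boolean;
  skipped: boolean;
  error?: string;
}

// Sketch of the control flow only. `inContainer` stands in for the new
// container detection and `findFailuresInDiag` for the existing _diag lookup.
function checkPreviousStepFailures(
  inContainer: () => boolean,
  findFailuresInDiag: () => boolean,
): StepCheckResult {
  // New: run the container check first. _diag is not mounted into the
  // container, so skip the lookup and report no failures.
  if (inContainer()) {
    return { hasFailures: false, skipped: true };
  }

  // Existing path, left unchanged: a failed lookup is still surfaced as an
  // error so the caller can treat the outcome as ambiguous.
  try {
    return { hasFailures: findFailuresInDiag(), skipped: false };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { hasFailures: false, skipped: false, error: message };
  }
}
```

With this shape, a container job never reaches the _diag lookup, while a non-container job behaves as it did before the change.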

Testing

  • ✅ Build successful
  • ✅ All tests pass
  • ✅ Container jobs will now skip step-checker and allow sticky disk commits
  • ✅ Non-container jobs continue to work exactly as before

Related

This is the same fix applied to setup-docker-builder in useblacksmith/setup-docker-builder#53


Note

Skip step-checker when running in containers by detecting container environments; update CI to pass github_token to Buf setup.

  • Step checker (src/step-checker.ts, dist/post/index.js):
    • Add container environment detection (/.dockerenv, /proc/1/cgroup, cwd prefix /__w/).
    • If in container, skip _diag lookup and return no failures (allows commits to proceed).
  • CI (.github/workflows/build.yaml):
    • Provide github_token to bufbuild/buf-setup-action@v1.
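
The three detection signals listed above could be combined roughly as follows. This is a sketch under assumptions, not the merged implementation: `ContainerSignals` and `detectContainer` are hypothetical names, and the signals are passed in as plain values so the logic stays independent of the filesystem.

```typescript
// Hypothetical container for the raw signals; the real step-checker may
// gather and combine them differently.
interface ContainerSignals {
  dockerEnvExists: boolean;      // whether /.dockerenv exists
  cgroupContent: string | null;  // contents of /proc/1/cgroup, or null if unreadable
  cwd: string;                   // current working directory
}

// Returns a short reason for the first signal that matches, or null when
// nothing suggests a container environment.
function detectContainer(signals: ContainerSignals): string | null {
  if (signals.dockerEnvExists) {
    return "/.dockerenv present";
  }
  // On cgroup v1, PID 1's cgroup paths usually name the runtime. On cgroup v2
  // the file is often just "0::/", so this check alone is not sufficient.
  if (signals.cgroupContent !== null && /docker|containerd/.test(signals.cgroupContent)) {
    return "container runtime named in /proc/1/cgroup";
  }
  // GitHub Actions mounts the workspace under /__w/ in container jobs.
  if (signals.cwd.startsWith("/__w/")) {
    return "working directory under /__w/";
  }
  return null;
}
```

In Node.js, the inputs could be gathered with `fs.existsSync("/.dockerenv")`, `fs.readFileSync("/proc/1/cgroup", "utf8")` wrapped in a try/catch, and `process.cwd()`. Returning a reason string rather than a bare boolean makes it easy to log why the step-checker was skipped.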

Written by Cursor Bugbot for commit 1afddd3.

@adityamaru force-pushed the fix/step-checker-container-support branch from b91890f to 480d460 on November 15, 2025 22:16

The step-checker was causing warnings in container jobs because the _diag
directory exists on the host but is not mounted into containers.

Changes:
- Check for /.dockerenv file (docker-specific indicator)
- Check cgroup for container indicators (works with cgroup v1)
- Check if working directory starts with /__w/ (GitHub Actions container mount)
- Skip step-checker gracefully when any of these conditions are met

Taken together, these checks cover both cgroup v1 and v2 hosts and allow
sticky disk commits to proceed normally for container jobs.

Affects container jobs only - regular jobs continue to work as before.

@adityamaru force-pushed the fix/step-checker-container-support branch from 480d460 to 015d69b on November 15, 2025 22:19
@adityamaru force-pushed the fix/step-checker-container-support branch from 45db7a3 to f3fb2df on November 15, 2025 22:27

The buf-setup-action was hitting GitHub API rate limits for unauthenticated
requests. Adding the GITHUB_TOKEN allows authenticated requests which have
a much higher rate limit.

@adityamaru force-pushed the fix/step-checker-container-support branch from f3fb2df to 1afddd3 on November 15, 2025 22:29