build-template: add codeserver build optimizations and monitoring by ysok · Pull Request #3039 · opendatahub-io/notebooks

ysok · 2026-02-27T20:37:24Z

Summary

Codeserver hermetic builds on GHA runners are failing silently (the runner loses communication) due to OOM / disk exhaustion. This PR adds:

- Free disk space for codeserver targets: the same treatment as rocm, cuda, pytorch, tensorflow
- `--layers=false` to reduce peak disk usage: codeserver compiles from source on every run, so there is no layer cache to benefit from
- `--build-arg GHA_BUILD=true` to reduce VS Code build parallelism: GHA runners only have 16GB RAM
- Build monitoring with timestamps and `free -h`, so that future OOM/disk-full failures leave a clear breadcrumb trail in CI logs instead of the uninformative "runner lost communication" message
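The conditional flag handling described above might look roughly like the following sketch. It is not the workflow's actual code: the function name `extra_build_args`, the substring match on the target name, and the example target names are assumptions made for illustration. Only the two flags themselves come from this PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the extra container build flags for a make target.
# The function name and the "*codeserver*" match are illustrative assumptions.
extra_build_args() {
  local target="$1"
  case "$target" in
    *codeserver*)
      # codeserver compiles from source every run, so there is no layer cache
      # to reuse, and 16GB runners need reduced build parallelism.
      printf '%s\n' "--layers=false --build-arg GHA_BUILD=true"
      ;;
    *)
      # Other targets keep the default flags.
      printf '\n'
      ;;
  esac
}

# Example target names (assumed, for demonstration only).
extra_build_args "codeserver-ubi9-python-3.12"
extra_build_args "jupyter-minimal-ubi9-python-3.12"
```

Keeping the decision in one small function makes it easy to check which targets receive the extra flags without running a full build.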

These changes are needed on main so that pull_request_target workflows pick them up.

Context

Prerequisite for RHAIENG-2846 [3/3]: Hermetic Dockerfile + build patches + Tekton for codeserver #2985 (hermetic codeserver build)
Failed run: https://github.com/opendatahub-io/notebooks/actions/runs/22500172157/job/65185013844?pr=2985

Test plan

Verify CI passes on this PR (template-only change, no functional builds affected)
After merge, re-run codeserver build in PR RHAIENG-2846 [3/3]: Hermetic Dockerfile + build patches + Tekton for codeserver #2985 to confirm runner no longer OOMs

Made with Cursor

Summary by CodeRabbit

Release Notes

Chores
- Optimized codeserver target builds with adjusted caching configuration
- Enhanced build process monitoring with timestamped progress tracking and memory usage reporting for improved visibility into build operations

- Free disk space for codeserver targets (same as rocm/cuda/pytorch)
- Use --layers=false to halve peak disk use (no layer cache to reuse)
- Pass GHA_BUILD=true for reduced VS Code build parallelism on 16GB runners
- Add timestamps + free -h to build monitoring loop for OOM/disk debugging

Made-with: Cursor

coderabbitai · 2026-02-27T20:37:44Z

Walkthrough

Modified the GitHub Actions notebook build workflow to conditionally append layer caching disablement and a GHA build argument when codeserver targets are included, and enhanced the disk usage loop to include timestamped progress reporting and memory usage metrics.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Workflow Configuration `.github/workflows/build-notebooks-TEMPLATE.yaml` | Added conditional codeserver build logic that appends `--layers=false` and `GHA_BUILD=true` build-arg. Enhanced disk usage monitoring loop with timestamped progress output and memory usage reporting. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately summarizes the main changes: adding codeserver build optimizations and enhanced monitoring to the build template workflow. |
| Description check | ✅ Passed | The description comprehensively covers the rationale, specific changes, context, and test plan, meeting all essential requirements of the template despite some merge criteria checkboxes being unchecked. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

coderabbitai

🧹 Nitpick comments (1)

.github/workflows/build-notebooks-TEMPLATE.yaml (1)
313-320: Consider trapping and killing the background monitoring process.

The background monitoring loop is useful for debugging OOM/disk issues, but it continues running after make completes. While GitHub Actions typically cleans up orphaned processes at step boundaries, explicitly capturing the PID and killing it ensures deterministic cleanup and prevents potential interference with subsequent steps.
♻️ Optional: Add explicit cleanup
          # Print disk and memory stats every 30s so OOM/disk-full failures
          # leave a breadcrumb trail in the logs.
          (while true; do
            echo "=== $(date -u '+%H:%M:%S') ==="
            df -h | grep "${HOME}/.local/share/containers"
            free -h
            sleep 30
-         done) &
+         done) &
+         MONITOR_PID=$!
+         trap "kill $MONITOR_PID 2>/dev/null || true" EXIT

          make ${{ inputs.target }}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/build-notebooks-TEMPLATE.yaml around lines 313 - 320, The
background monitoring subshell started at the end of the step should capture its
PID and ensure it is killed when the step ends; modify the subshell invocation
that starts the loop (the "(while true; do ... done) &" block) to save the PID
(e.g. pid=$!) and add a trap/cleanup that kills $pid on EXIT (or explicitly kill
$pid after make completes) so the monitoring process is deterministically
cleaned up.
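As a self-contained illustration of the cleanup pattern suggested in this comment, the sketch below wraps a command in a background monitor and stops the monitor when the command finishes. The function name `run_with_monitor` and its interval argument are hypothetical, and `df -h /` stands in for the container storage path used by the real workflow.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run a command while a background loop prints
# disk and memory stats, then stop the loop and return the command's status.
run_with_monitor() {
  local interval="$1"; shift
  (
    while true; do
      echo "=== $(date -u '+%H:%M:%S') ==="
      df -h / || true
      # free is Linux-specific; skip it quietly where it is unavailable.
      command -v free >/dev/null 2>&1 && free -h
      sleep "$interval"
    done
  ) &
  local monitor_pid=$!

  # Capture the exit status without tripping `set -e`, so cleanup always runs.
  local status=0
  "$@" || status=$?

  # Deterministic cleanup of the background monitor.
  kill "$monitor_pid" 2>/dev/null || true
  wait "$monitor_pid" 2>/dev/null || true
  return "$status"
}

# Example: monitor every second while a stand-in "build" runs for two seconds.
run_with_monitor 1 sleep 2
```

Compared with a `trap ... EXIT`, an explicit kill after the command keeps the cleanup local to the wrapper, at the cost of not covering an abrupt termination of the whole step. Note that a `sleep` already started by the loop may outlive the kill by up to one interval.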

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 137aa9b and fc15cbc.

📒 Files selected for processing (1)

.github/workflows/build-notebooks-TEMPLATE.yaml

daniellutz · 2026-02-27T20:44:05Z

/lgtm

openshift-ci · 2026-02-27T20:54:08Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ide-developer
Once this PR has been reviewed and has the lgtm label, please ask for approval from daniellutz. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci bot requested review from atheo89 and jiridanek February 27, 2026 20:37

github-actions bot added the review-requested GitHub Bot creates notification on #pr-review-ai-ide-team slack channel label Feb 27, 2026

openshift-ci bot added the size/s label Feb 27, 2026

openshift-ci bot added size/s and removed size/s labels Feb 27, 2026

coderabbitai bot reviewed Feb 27, 2026

daniellutz self-requested a review February 27, 2026 20:43

openshift-ci bot assigned daniellutz Feb 27, 2026

openshift-ci bot added the lgtm label Feb 27, 2026

ide-developer approved these changes Feb 27, 2026

openshift-ci bot assigned ide-developer Feb 27, 2026

ysok merged commit ae2bb4c into opendatahub-io:main Feb 27, 2026
14 of 16 checks passed

ysok merged 1 commit into opendatahub-io:main from ysok-red-hat-data-services:odh-template-codeserver-debug
