fix(orion): retain Antares mounts and improve dicfuse startup diagnostics#2026

Merged
benjamin-747 merged 2 commits into web3infra-foundation:main from Ivanbeethoven:main
Mar 19, 2026

@Ivanbeethoven (Collaborator)

  1. Keep mount points available for debugging after builds and after target-discovery failures.
  2. Add dicfuse root tree logging and an optional startup test mount.
  3. Relax systemd tmp/mount isolation settings for FUSE runtime compatibility.

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d1c064f117

Comment on lines +749 to +753
tracing::info!(
"[Task {}] Build completed — mount directories retained for debugging: \
new_repo mountpoint={}; \
old_repo mountpoint={}",
id,

P1: Restore mount cleanup for normal build completion

In build(), the post-build path now only logs that mount directories are retained and never calls unmount_antares_fs for either the new or old repo mount. In a long-lived runner this leaks at least two Antares/FUSE mounts per successful build (and additional mounts from failed target-discovery attempts), which will eventually exhaust mount/disk resources and cause later builds to fail. The previous behavior cleaned up by default and only retained mounts behind an explicit flag.
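
One way to reconcile the two goals in this comment (clean up by default, retain only on request) is a guard that unmounts on drop unless retention was explicitly enabled. The following is a minimal sketch, not the PR's actual MountGuard: RetainableMountGuard and Unmounter are hypothetical names, and the unmount call is a synchronous stand-in that only counts invocations. The real unmount_antares_fs is presumably async, which a synchronous Drop cannot await directly, so a real guard would need some other way to schedule the unmount.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for the real unmount call; it only counts invocations.
#[derive(Clone, Default)]
pub struct Unmounter {
    calls: Arc<AtomicUsize>,
}

impl Unmounter {
    pub fn unmount(&self, _mount_id: &str) {
        self.calls.fetch_add(1, Ordering::SeqCst);
    }

    pub fn calls(&self) -> usize {
        self.calls.load(Ordering::SeqCst)
    }
}

/// Unmounts on drop unless retention was explicitly requested.
pub struct RetainableMountGuard {
    mount_id: String,
    retain: bool,
    unmounter: Unmounter,
}

impl RetainableMountGuard {
    pub fn new(mount_id: impl Into<String>, retain: bool, unmounter: Unmounter) -> Self {
        Self {
            mount_id: mount_id.into(),
            retain,
            unmounter,
        }
    }
}

impl Drop for RetainableMountGuard {
    fn drop(&mut self) {
        if self.retain {
            // Retention is the exception, so say loudly which mount was left behind.
            eprintln!("mount {} retained for debugging", self.mount_id);
        } else {
            self.unmounter.unmount(&self.mount_id);
        }
    }
}
```

With retain set to false the guard behaves like a plain RAII cleanup; with retain set to true the mount outlives the guard and only a log line is emitted.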

Comment on lines +187 to +188
Err(_) => true,
}

P2: Make startup test mount opt-in instead of default-on

is_test_mount_enabled() returns true when ORION_ENABLE_ANTARES_TEST_MOUNT is unset, so every default startup creates the synthetic antares_test mount. That makes the diagnostic probe mandatory and adds persistent mount state even when operators did not request it, increasing resource usage and conflicting with the “optional test mount” intent.
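
An opt-in version of the check could look like the sketch below. It assumes the same environment variable name quoted in the comment; parse_opt_in_flag is a hypothetical helper introduced here so the parsing can be exercised without touching the process environment, and the set of accepted "true" spellings is an assumption that mirrors the "false" spellings in the PR.

```rust
/// Interprets an optional flag value. Unset or unrecognized values mean "disabled".
pub fn parse_opt_in_flag(value: Option<&str>) -> bool {
    match value {
        Some(v) => matches!(
            v.trim().to_ascii_lowercase().as_str(),
            "1" | "true" | "yes" | "on"
        ),
        None => false,
    }
}

/// Opt-in variant: the test mount is created only when the variable is set to a truthy value.
pub fn is_test_mount_enabled() -> bool {
    parse_opt_in_flag(
        std::env::var("ORION_ENABLE_ANTARES_TEST_MOUNT")
            .ok()
            .as_deref(),
    )
}
```

Compared with the default-on logic, an unset variable and a read error both resolve to false, so a default startup creates no extra mount.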

Copilot AI (Contributor) left a comment

Pull request overview

This PR adjusts Orion’s Antares/Dicfuse integration and runtime environment to make FUSE mounts easier to debug by retaining mounts and adding startup diagnostics, alongside systemd sandboxing relaxations for compatibility.

Changes:

  • Retain Antares mount directories across build flow (including target-discovery failures) and log retained mountpoints.
  • Add Dicfuse startup diagnostics: workspace root tree logging and an optional “test mount” during warmup.
  • Relax systemd isolation settings (PrivateTmp, PrivateMounts) to better support FUSE runtime behavior.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File: Description

  • orion/systemd/orion-runner.service: Disables PrivateTmp and PrivateMounts isolation to reduce FUSE mount incompatibilities.
  • orion/src/buck_controller.rs: Removes unmount RAII guard and keeps Antares mounts around, updating logs accordingly.
  • orion/src/antares.rs: Adds dicfuse root tree logging and a configurable startup test mount after dicfuse warmup.

Comments suppressed due to low confidence (1)

orion/src/buck_controller.rs:608

  • On target-discovery failure, the code logs and proceeds to the next attempt without unmounting the two mounts created for this attempt. This can leak up to 2 mounts per attempt and will compound under repeated failures. If mounts are meant to be retained only for debugging, consider unmounting between attempts unless retention is explicitly enabled, and include the mount IDs/paths in a structured way for postmortems.
                tracing::warn!(
                    "[Task {}] Failed to get build targets (attempt {}/{}): {}. Mounts retained for debugging (old={}, new={}).",
                    id,
                    attempt,
                    MAX_TARGETS_ATTEMPTS,
                    e,
                    old_repo_mount_point,
                    repo_mount_point,
                );
                last_targets_error = Some(e);
                if attempt == MAX_TARGETS_ATTEMPTS {
                    break;
                }
            }
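
The suggestion in this suppressed comment (unmount between attempts unless retention is explicitly enabled) can be sketched as a small retry loop. This is an illustration under simplifying assumptions, not the code in buck_controller.rs: discover_with_cleanup is a hypothetical name, the mount, discover, and unmount steps are passed in as synchronous closures, and each attempt creates one mount rather than the two (old and new repo) used in the PR.

```rust
/// Retries target discovery, creating a fresh mount per attempt.
/// A failed attempt's mount is unmounted before the next attempt unless
/// `retain_on_failure` is set. On success the mount id is returned together
/// with the result so the caller can decide when to unmount it.
/// The error is `None` only when `max_attempts` is 0.
pub fn discover_with_cleanup<T, E>(
    max_attempts: u32,
    retain_on_failure: bool,
    mut mount: impl FnMut(u32) -> String,
    mut discover: impl FnMut(&str) -> Result<T, E>,
    mut unmount: impl FnMut(&str),
) -> Result<(T, String), Option<E>> {
    let mut last_error = None;
    for attempt in 1..=max_attempts {
        let mount_id = mount(attempt);
        match discover(&mount_id) {
            Ok(targets) => return Ok((targets, mount_id)),
            Err(e) => {
                if !retain_on_failure {
                    // Release the failed attempt's mount so retries do not accumulate mounts.
                    unmount(&mount_id);
                }
                last_error = Some(e);
            }
        }
    }
    Err(last_error)
}
```

With retention disabled, at most one mount is live at any time during retries; with it enabled, every failed attempt leaves its mount behind, which matches the leak described above.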

Comment on lines 563 to +568
  let id_for_old_repo = format!("{id}-old-{attempt}");
- let (old_repo_mount_point, mount_id_old_repo) =
+ let (old_repo_mount_point, _mount_id_old_repo) =
      mount_antares_fs(&id_for_old_repo, None).await?;
- let guard_old_repo = MountGuard::new(mount_id_old_repo.clone(), id_for_old_repo);

  let id_for_repo = format!("{id}-{attempt}");
- let (repo_mount_point, mount_id) = mount_antares_fs(&id_for_repo, cl_arg).await?;
- let guard = MountGuard::new(mount_id.clone(), id_for_repo);
+ let (repo_mount_point, _mount_id) = mount_antares_fs(&id_for_repo, cl_arg).await?;

Comment on lines +749 to +756
tracing::info!(
"[Task {}] Build completed — mount directories retained for debugging: \
new_repo mountpoint={}; \
old_repo mountpoint={}",
id,
mount_point,
old_repo_mount_point_saved.as_deref().unwrap_or("<unknown>"),
);

let v = v.trim().to_ascii_lowercase();
!(v == "0" || v == "false" || v == "no" || v == "off")
}
Err(_) => true,
Comment on lines +210 to +236
fn log_dicfuse_root_tree() {
let root = PathBuf::from(scorpiofs::util::config::workspace());
let max_depth = std::env::var("ORION_DICFUSE_ROOT_TREE_DEPTH")
.ok()
.and_then(|v| v.parse::<usize>().ok())
.unwrap_or(2);
let max_entries = std::env::var("ORION_DICFUSE_ROOT_TREE_MAX_ENTRIES")
.ok()
.and_then(|v| v.parse::<usize>().ok())
.unwrap_or(200);

tracing::info!(
root = %root.display(),
max_depth,
max_entries,
"Dicfuse init: printing workspace root tree"
);

if !root.exists() {
tracing::warn!("Dicfuse workspace path does not exist: {}", root.display());
return;
}

let mut printed = 0usize;
tracing::info!("[dicfuse-root] /");
log_tree_recursive(&root, &root, 0, max_depth, max_entries, &mut printed);
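
The snippet above is cut off before log_tree_recursive is shown. As an illustration of how a walk bounded by depth and entry count might work, here is a sketch with a hypothetical walk_tree_bounded function. It mirrors the argument order of the call above but collects lines into a Vec instead of logging them, so it should not be read as the PR's implementation.

```rust
use std::fs;
use std::path::Path;

/// Walks `dir` depth-first, appending one indented line per entry to `out`.
/// Stops descending at `max_depth` and stops entirely once `max_entries`
/// lines have been produced. Unreadable directories are skipped silently.
pub fn walk_tree_bounded(
    root: &Path,
    dir: &Path,
    depth: usize,
    max_depth: usize,
    max_entries: usize,
    printed: &mut usize,
    out: &mut Vec<String>,
) {
    if depth >= max_depth || *printed >= max_entries {
        return;
    }
    let Ok(read_dir) = fs::read_dir(dir) else {
        return;
    };
    // Sort by name so the output is stable across runs.
    let mut entries: Vec<_> = read_dir.filter_map(Result::ok).collect();
    entries.sort_by_key(|e| e.file_name());

    for entry in entries {
        if *printed >= max_entries {
            return;
        }
        let path = entry.path();
        let rel = path.strip_prefix(root).unwrap_or(&path);
        // file_type() does not follow symlinks, so symlinked directories are not entered.
        let is_dir = entry.file_type().map(|t| t.is_dir()).unwrap_or(false);
        let suffix = if is_dir { "/" } else { "" };
        out.push(format!("{}{}{}", "  ".repeat(depth + 1), rel.display(), suffix));
        *printed += 1;
        if is_dir {
            walk_tree_bounded(root, &path, depth + 1, max_depth, max_entries, printed, out);
        }
    }
}
```

Both limits matter on a FUSE-backed workspace, where listing a directory may trigger remote loads: the depth bound caps how far the walk descends, and the shared printed counter caps the total work regardless of how wide each directory is.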

@benjamin-747 benjamin-747 added this pull request to the merge queue Mar 19, 2026
Merged via the queue into web3infra-foundation:main with commit f6d8cf5 Mar 19, 2026
11 of 12 checks passed