fix(scorpio): refresh Antares mounts and add CL overlay support #1819
Conversation
Pull request overview
This PR fixes stale cache issues in Antares FUSE mounts and adds support for CL (changelist) overlays to enable viewing uncommitted changes in mounted filesystems. The implementation introduces TTL-based directory cache refresh, remounting logic for CL operations, and Buck2 daemon retry mechanisms to handle connection failures.
Changes:
- Adds TTL-based directory cache refresh in dicfuse to prevent stale reads after unmount/remount cycles
- Implements CL overlay support with HTTP-based file fetching, whiteout creation for deletions, and atomic remounting on build_cl/clear_cl operations
- Enhances Buck2 integration with daemon connection retry logic and explicit mount lifecycle management
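The TTL-based refresh in dicfuse boils down to a staleness check on each cached directory listing. A minimal sketch of the idea (the `DirCacheEntry` struct, its field name, and the TTL values here are illustrative, not the actual dicfuse types):

```rust
use std::time::{Duration, Instant};

// Hypothetical cache entry; the real dicfuse store keeps richer state.
struct DirCacheEntry {
    loaded_at: Instant,
}

// Returns true when the directory's cached listing is older than `ttl`
// and should be reloaded before serving a lookup.
fn dir_needs_refresh(entry: &DirCacheEntry, ttl: Duration) -> bool {
    entry.loaded_at.elapsed() >= ttl
}

fn main() {
    let fresh = DirCacheEntry { loaded_at: Instant::now() };
    assert!(!dir_needs_refresh(&fresh, Duration::from_secs(2)));

    let stale = DirCacheEntry {
        loaded_at: Instant::now() - Duration::from_secs(5),
    };
    assert!(dir_needs_refresh(&stale, Duration::from_secs(2)));
    println!("ok");
}
```

Checking staleness at lookup time (rather than evicting eagerly) is what lets a remounted directory serve fresh entries without a global cache flush.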
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| scorpio/src/dicfuse/store.rs | Adds dir_needs_refresh function and TTL checks to ensure directories are reloaded when stale |
| scorpio/src/dicfuse/async_io.rs | Modifies lookup to pre-refresh parent directories before path resolution |
| scorpio/src/daemon/antares.rs | Implements CL layer building (fetch/download/whiteout), path normalization, and remount logic for build_cl/clear_cl operations |
| scorpio/src/antares/fuse.rs | Reorders overlay layers to place CL above dicfuse for proper file override semantics |
| orion/src/buck_controller.rs | Adds explicit unmount method to MountGuard with atomic tracking to prevent duplicate unmount attempts |
| orion/buck/run.rs | Implements Buck2 daemon retry logic with automatic daemon kill and restart on connection failures |
```rust
fn normalize_abs_path(path: &str) -> String {
    let trimmed = path.trim();
    if trimmed.is_empty() {
        return "/".to_string();
    }
    if trimmed == "/" {
        return "/".to_string();
    }
    let mut normalized = if trimmed.starts_with('/') {
        trimmed.to_string()
    } else {
        format!("/{trimmed}")
    };
    normalized = normalized.trim_end_matches('/').to_string();
    if normalized.is_empty() {
        "/".to_string()
    } else {
        normalized
    }
}
```
The normalize_abs_path function is identical to normalize_mount_path (lines 549-568). This is code duplication. Consider consolidating these into a single function or having one call the other to maintain DRY principles.
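One way to consolidate: keep a single normalization function and let the second name delegate until call sites are migrated. This is a sketch (the function body is reproduced from the diff; where the shared helper should live is left open):

```rust
/// Shared path normalization: trims whitespace, ensures a leading '/',
/// and strips trailing '/' (except for the root itself). Consolidates the
/// duplicated `normalize_abs_path` / `normalize_mount_path` bodies.
fn normalize_abs_path(path: &str) -> String {
    let trimmed = path.trim();
    if trimmed.is_empty() || trimmed == "/" {
        return "/".to_string();
    }
    let normalized = if trimmed.starts_with('/') {
        trimmed.to_string()
    } else {
        format!("/{trimmed}")
    };
    let normalized = normalized.trim_end_matches('/');
    if normalized.is_empty() {
        "/".to_string()
    } else {
        normalized.to_string()
    }
}

// The second name can simply delegate until call sites are migrated.
fn normalize_mount_path(path: &str) -> String {
    normalize_abs_path(path)
}

fn main() {
    assert_eq!(normalize_abs_path(""), "/");
    assert_eq!(normalize_abs_path("  /  "), "/");
    assert_eq!(normalize_abs_path("repo/sub/"), "/repo/sub");
    assert_eq!(normalize_mount_path("/repo//"), "/repo");
    println!("ok");
}
```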
```rust
})?;

let mode = libc::S_IFCHR | 0o000;
let dev = 0;
```
The unsafe block calling libc::mknod requires a SAFETY comment explaining why this is safe. According to the repository's coding guidelines, unsafe code must include a comment with rationale. The CString conversion above ensures the path is nul-terminated and valid, but this should be documented explicitly.
Suggested change:

```rust
let dev = 0;
// SAFETY: `c_path` is a live `CString`, so `as_ptr()` yields a valid, NUL-terminated
// C string pointer with no interior NUL bytes for the duration of this call. The
// `mode` and `dev` arguments are plain values, and we check the return code for
// errors, propagating any OS error via `last_os_error`.
```
```rust
let repo_path = {
    let trimmed = repo.trim();
    if trimmed.is_empty() {
        "/".to_string()
    } else if trimmed == "/" {
        "/".to_string()
    } else {
        let mut normalized = if trimmed.starts_with('/') {
            trimmed.to_string()
        } else {
            format!("/{trimmed}")
        };
        normalized = normalized.trim_end_matches('/').to_string();
        if normalized.is_empty() {
            "/".to_string()
        } else {
            normalized
        }
    }
};
```
The path normalization code is duplicated across both the buck_controller and antares modules. Consider extracting this logic to a shared utility function to avoid duplication and ensure consistent behavior. The same normalization logic appears in scorpio/src/daemon/antares.rs (lines 549-589) and here.
```rust
AntaresFuse::new(
    mountpoint.clone(),
    self.dicfuse.clone(),
    upper_dir.clone(),
    existing_cl_dir.clone(),
)
.await
.map_err(|e| {
    ServiceError::Internal(format!("failed to create placeholder fuse: {}", e))
})?
});
```
The placeholder fuse is created with self.dicfuse.clone() instead of the dicfuse instance extracted from the original entry. This could cause issues during rollback if unmount or build_cl_layer fails, as the placeholder would reference the wrong dicfuse instance (potentially the global root instance instead of a subdirectory-specific instance for this mount). Use 'dicfuse' instead of 'self.dicfuse.clone()' for consistency with the final remount.
```rust
AntaresFuse::new(
    mountpoint.clone(),
    self.dicfuse.clone(),
    upper_dir.clone(),
    existing_cl_dir.clone(),
)
.await
.map_err(|e| {
    ServiceError::Internal(format!("failed to create placeholder fuse: {}", e))
})?
});
```
The placeholder fuse is created with self.dicfuse.clone() instead of the dicfuse instance extracted from the original entry. This could cause issues during rollback if unmount or clear operations fail, as the placeholder would reference the wrong dicfuse instance (potentially the global root instance instead of a subdirectory-specific instance for this mount). Use 'dicfuse' instead of 'self.dicfuse.clone()' for consistency with the final remount.
```rust
async fn fetch_cl_files(&self, cl_link: &str) -> Result<Vec<ClFileEntry>, ServiceError> {
    let base_url = crate::util::config::base_url();
    let url = format!("{base_url}/api/v1/cl/{cl_link}/files-list");
    let client = Self::http_client()?;
    let resp = client
        .get(url)
        .send()
        .await
        .map_err(|e| ServiceError::Internal(format!("failed to fetch CL files: {}", e)))?;
    if !resp.status().is_success() {
        return Err(ServiceError::Internal(format!(
            "failed to fetch CL files: HTTP {}",
            resp.status()
        )));
    }
    let body: CommonResult<Vec<ClFileEntry>> = resp.json().await.map_err(|e| {
        ServiceError::Internal(format!("failed to parse CL files response: {}", e))
    })?;
    if !body.req_result {
        return Err(ServiceError::Internal(format!(
            "CL files response error: {}",
            body.err_message
        )));
    }
    Ok(body.data.unwrap_or_default())
}

async fn download_blob_to_path(
    &self,
    client: &Client,
    oid: &str,
    dest: &Path,
) -> Result<(), ServiceError> {
    let base_url = crate::util::config::base_url();
    let clean_oid = oid.trim_start_matches("sha1:");
    let url = format!("{base_url}/api/v1/file/blob/{clean_oid}");
    let resp = client
        .get(url)
        .send()
        .await
        .map_err(|e| ServiceError::Internal(format!("failed to download blob: {}", e)))?;
    if !resp.status().is_success() {
        return Err(ServiceError::Internal(format!(
            "failed to download blob {}: HTTP {}",
            clean_oid,
            resp.status()
        )));
    }

    if let Some(parent) = dest.parent() {
        tokio::fs::create_dir_all(parent).await.map_err(|e| {
            ServiceError::Internal(format!(
                "failed to create CL parent dir {:?}: {}",
                parent, e
            ))
        })?;
    }

    let mut file = tokio::fs::File::create(dest).await.map_err(|e| {
        ServiceError::Internal(format!("failed to create file {:?}: {}", dest, e))
    })?;
    let bytes = resp
        .bytes()
        .await
        .map_err(|e| ServiceError::Internal(format!("failed to read blob data: {}", e)))?;
    file.write_all(&bytes).await.map_err(|e| {
        ServiceError::Internal(format!("failed to write file {:?}: {}", dest, e))
    })?;
    Ok(())
}

fn create_whiteout(path: &Path) -> Result<(), ServiceError> {
    if let Some(parent) = path.parent() {
        if let Err(e) = std::fs::create_dir_all(parent) {
            return Err(ServiceError::Internal(format!(
                "failed to create whiteout parent {:?}: {}",
                parent, e
            )));
        }
    }

    if path.exists() {
        let _ = std::fs::remove_file(path);
        let _ = std::fs::remove_dir_all(path);
    }

    let c_path = CString::new(path.as_os_str().as_bytes()).map_err(|e| {
        ServiceError::Internal(format!("invalid whiteout path {:?}: {}", path, e))
    })?;

    let mode = libc::S_IFCHR | 0o000;
    let dev = 0;
    let res = unsafe { libc::mknod(c_path.as_ptr(), mode, dev) };
    if res != 0 {
        return Err(ServiceError::Internal(format!(
            "failed to create whiteout {:?}: {}",
            path,
            std::io::Error::last_os_error()
        )));
    }
    Ok(())
}

async fn build_cl_layer(
    &self,
    mount_path: &str,
    cl_link: &str,
    cl_dir: &Path,
) -> Result<(), ServiceError> {
    if cl_link.trim().is_empty() {
        return Err(ServiceError::InvalidRequest(
            "cl link cannot be empty".to_string(),
        ));
    }

    if cl_dir.exists() {
        tokio::fs::remove_dir_all(cl_dir).await.map_err(|e| {
            ServiceError::Internal(format!("failed to clear CL dir {:?}: {}", cl_dir, e))
        })?;
    }
    tokio::fs::create_dir_all(cl_dir).await.map_err(|e| {
        ServiceError::Internal(format!("failed to create CL dir {:?}: {}", cl_dir, e))
    })?;

    let files = self.fetch_cl_files(cl_link).await?;
    if files.is_empty() {
        return Ok(());
    }

    let client = Self::http_client()?;
    for file in files {
        let rel_path = match Self::relative_path_for_mount(&file.path, mount_path) {
            Some(p) => p,
            None => continue,
        };
        let dest = cl_dir.join(rel_path);
        match file.action.as_str() {
            "new" | "modified" => {
                self.download_blob_to_path(&client, &file.sha, &dest)
                    .await?;
            }
            "deleted" => {
                Self::create_whiteout(&dest)?;
            }
            other => {
                tracing::warn!(
                    "Unknown CL action '{}' for path {}, skipping",
                    other,
                    file.path
                );
            }
        }
    }

    Ok(())
}
```
The new CL layer building functionality (build_cl_layer, fetch_cl_files, download_blob_to_path, create_whiteout) lacks test coverage. While there are tests for the build_cl API endpoint, they use a mock service that doesn't exercise the actual HTTP fetching, blob downloading, and whiteout creation logic. Consider adding integration tests that verify the CL layer building, especially the path normalization, whiteout creation, and error handling.
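A unit test along these lines would cover the path-mapping piece without any HTTP. This is a sketch: `relative_path_for_mount` is re-implemented here from its apparent contract (map a repo-absolute CL path to a path relative to the mount, returning `None` for files outside the mount), so the real signature and edge-case behavior may differ.

```rust
use std::path::{Path, PathBuf};

// Hypothetical re-implementation of the mapping `build_cl_layer` relies on:
// a CL entry path like "/repo/src/a.rs" under mount_path "/repo" becomes
// "src/a.rs"; paths outside the mount are skipped (None).
fn relative_path_for_mount(file_path: &str, mount_path: &str) -> Option<PathBuf> {
    let file = Path::new(file_path);
    let mount = Path::new(mount_path);
    let rel = file.strip_prefix(mount).ok()?;
    if rel.as_os_str().is_empty() {
        return None;
    }
    Some(rel.to_path_buf())
}

fn main() {
    // In-mount file maps to a mount-relative path.
    assert_eq!(
        relative_path_for_mount("/repo/src/a.rs", "/repo"),
        Some(PathBuf::from("src/a.rs"))
    );
    // Files outside the mount are skipped.
    assert_eq!(relative_path_for_mount("/other/b.rs", "/repo"), None);
    // The mount root itself is not a file entry.
    assert_eq!(relative_path_for_mount("/repo", "/repo"), None);
    println!("ok");
}
```

Whiteout creation is harder to test unprivileged, since `mknod` with `S_IFCHR` needs CAP_MKNOD; an integration test running in a privileged container could create a whiteout and assert the resulting node is a character device with device number 0.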
Force-pushed from 21e9ecd to 2457469
This commit fixes stale Antares mount state by invalidating cache on unmount and adds CL overlay support for Antares mounts. CL build/clear now remounts to ensure layer changes take effect and whiteouts hide deleted files.

Changes:
- Invalidate Antares mount cache on unmount/cleanup paths
- Add CL overlay mount support with upper/cl layers
- Remount on build_cl/clear_cl to refresh view
- Refresh buck controller mount state before builds
- Fix clippy warnings and apply rustfmt in Antares CL path validation

Signed-off-by: jerry609 <1772030600@qq.com>
Force-pushed from 2457469 to 39daba0
eddd7f7
This commit fixes stale Antares mount state by invalidating cache on unmount and adds CL overlay support for Antares mounts. CL build/clear now remounts to ensure layer changes take effect and whiteouts hide deleted files.
Changes: