Commit 87d3ff7

[GSoC 2024] Student report for CernVM-FS - Yuriy Belikov (#1592)
* Create blog_CVMS_YuriyBelikov.md
* Update blog_CVMS_YuriyBelikov.md
* Fix a "libary" typo
1 parent 113f42e commit 87d3ff7

1 file changed

+94
-0
lines changed

@@ -0,0 +1,94 @@
1+
---
2+
project: HSF CernVM-FS
3+
title: CernVM-FS - Integration of FUSE-T library for macOS laptop clients. Leveraging modern overlay FS features in cvmfs_server
4+
author: Yuriy Belikov
5+
date: 25.09.2024
6+
year: 2024
7+
layout: blog_post
8+
logo:
9+
intro: |
10+
Worked on integrating additional overlay FS features into the CVMFS server part and on replacing the macFUSE kernel extension with the FUSE-T user-space library for macOS laptop clients.
11+
At this point, a backbone for metadata-only copying and zero-copy directory renames is implemented for the Linux server part and is still being refined and adjusted
12+
due to various differences in overlay FS behaviour discovered during the course of the project. FUSE-T support for macOS clients is mostly done, apart from several issues that we
13+
classified as critical for CVMFS client stability. I helped create a GitHub Actions CI pipeline for macOS clients and prepared a table of the FUSE-T issues that were encountered
14+
during this project. Currently, I am continuing my work on letting users switch back to the macFUSE kext.
15+
---
16+
17+
# Introduction
18+
When the accepted projects were announced, I had already started participating in the CVMFS project under the Ukrainian Remote Student Internship and was already working
19+
on integrating the capabilities provided by metadata-only copying and zero-copy directory renames in overlay FS into the core of the cvmfs_server utility.
20+
However, we set replacing the macFUSE kernel extension with the FUSE-T user-space library as the main goal for GSoC. The motivation behind this is a potential
21+
simplification of the macOS client installation process:
22+
- Currently, installing third-party kernel extensions requires a double reboot
23+
- It also requires manually weakening macOS security protection mechanisms
24+
25+
On top of that, FUSE-T potentially lays the foundation for a brew package for the CVMFS client. Additionally, it makes a GitHub Actions CI pipeline for the macOS client possible.
26+
I also proposed continuing my work on the overlay FS updates during my participation in GSoC, which was welcomed as well.
27+
Since I had already been in touch with Valentin, my internship supervisor, and other members of the CVMFS team, there was no need for a Community Bonding Period.
28+
Thus, I started working on these tasks.
29+
30+
# Goals and objectives for FUSE-T project
31+
- Determine key components of FUSE-T installation: headers, libraries, any helper binaries
32+
- Update CMake build files accordingly (later we decided to keep backward compatibility with macFUSE kext)
33+
- Update bash scripts that perform various commands on behalf of the macOS client accordingly
34+
- Update parts that test for the presence of macOS FUSE-related libraries
35+
- Change integration tests if necessary to comply with FUSE-T
36+
37+
# Motivation behind cvmfs_server modification
38+
On the **cvmfs_server publish** command, the utility traverses the scratch area and stores information about the contents of the repository in file catalogs, and stores the content itself in compressed form.
39+
Since compression, hashing and file catalog updates are performed on full copies of the modified files, modern overlay FS features have the potential to improve the performance of cvmfs_server transactions by avoiding unnecessary data copying to the scratch area.
40+
41+
However, integrating zero-copy directory renames turned out not to be as trivial as first expected:
42+
- As the initial logic of cvmfs transactions relies on the scratch area containing full copies of file system objects, a zero-copy rename leads to wiping out the old directory with all its contents.
43+
- Removing subdirectories creates a different footprint in the upper-layer directory: a whiteout appears in place of the removed subdirectory, which is not the case in the usual setup.
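
The upper-layer footprint described above can be inspected with a small classifier. This is a minimal sketch, not the cvmfs_server traversal code: `classify_upper_entry` and its labels are hypothetical names. It relies on how overlay FS marks entries in the upper directory: a whiteout is a character device with device number 0/0, and the `trusted.overlay.opaque`, `trusted.overlay.redirect` and `trusted.overlay.metacopy` xattrs mark opaque directories, redirected (renamed) directories and metadata-only copies. The default `trusted.` xattr namespace is assumed.

```python
import stat

# xattr names overlay FS sets in the upper layer (default "trusted." namespace assumed).
OPAQUE_XATTR = "trusted.overlay.opaque"
REDIRECT_XATTR = "trusted.overlay.redirect"
METACOPY_XATTR = "trusted.overlay.metacopy"


def classify_upper_entry(mode, rdev, xattrs):
    """Classify one upper-layer entry from its stat mode, device number and xattr names.

    Hypothetical helper for illustration; the returned labels are made up for this sketch.
    """
    if stat.S_ISCHR(mode) and rdev == 0:
        return "whiteout"            # a lower-layer entry was removed or renamed away
    if stat.S_ISDIR(mode):
        if REDIRECT_XATTR in xattrs:
            return "redirected-dir"  # directory renamed without copying its contents
        if OPAQUE_XATTR in xattrs:
            return "opaque-dir"      # hides the same-named directory in lower layers
        return "dir"
    if stat.S_ISREG(mode) and METACOPY_XATTR in xattrs:
        return "metacopy-file"       # only metadata was copied up, data stays below
    return "regular-entry"
```

On a real scratch area the inputs would come from `os.lstat` (`st_mode`, `st_rdev`) and `os.listxattr`; reading `trusted.*` attributes typically requires root.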
44+
45+
# Goals and objectives for overlay FS project
46+
- Enable the metacopy and redirect_dir features when mounting CVMFS repositories
47+
- Update scratch area traversal routine accordingly
48+
- Implement catalog entries renaming (avoid remove + add sequence)
49+
- Implement metadata-only update for catalog entries
50+
- Expand integration tests with new cases that cover various renaming scenarios
51+
- Put the new functionality behind a separate option so that the feature can be enabled or disabled
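
To illustrate the difference between renaming catalog entries in place and a remove + add sequence, here is a minimal sketch on a toy SQLite table. The `entries` schema and the function names are hypothetical and much simpler than the real CVMFS file catalogs; the point is only that a rename or a metadata change can be a single `UPDATE` that leaves content hashes untouched.

```python
import sqlite3


def make_toy_catalog():
    """Create an in-memory table standing in for a file catalog (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE entries (path TEXT PRIMARY KEY, content_hash TEXT, mode INTEGER)"
    )
    return db


def rename_subtree(db, old, new):
    """Rewrite the path prefix of a directory and everything below it, in place."""
    prefix = old + "/"
    with db:  # one transaction: either all rows are renamed or none
        db.execute(
            "UPDATE entries SET path = ? || substr(path, ?) "
            "WHERE path = ? OR substr(path, 1, ?) = ?",
            (new, len(old) + 1, old, len(prefix), prefix),
        )


def update_metadata_only(db, path, mode):
    """Change an entry's metadata while leaving its content hash untouched."""
    with db:
        db.execute("UPDATE entries SET mode = ? WHERE path = ?", (mode, path))
```

The prefix comparison uses `substr` rather than `LIKE` so that `%` or `_` characters in path names are not treated as wildcards.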
52+
53+
# What was done for overlay FS project
54+
- Upgraded scratch area traversal: it currently takes three steps, to properly separate whiteouts left after a removal from those left after a rename.
55+
- Upgraded the catalog manager: added objects that manage the new SQL queries.
56+
- Changed the FS synchronization mechanism to properly handle nested whiteouts.
57+
- New integration tests that deal with directory renames and various related cases: partial modification of nested content, removal of nested files and directories.
58+
- Tracking of the files created with the metadata-only xattr.
59+
- Overlay FS documentation got [a tiny patch](https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/Documentation/filesystems/overlayfs.rst?id=930b7c32ea2b514fb2c37aa3d4b946d954ee7fa2) clarifying the metacopy section.
60+
61+
# What was done for FUSE-T project
62+
- CMake update to overcome issues with dyld failing to find dynamic libraries.
63+
- Updated CMake build files for FUSE-T usage.
64+
- Updated FUSE-T installation check.
65+
- GitHub Actions pipeline with updated macOS CI support (introduced small fixes to the GitHub Actions setup created by Valentin).
66+
- Got the integration tests passing on a reduced test set.
67+
68+
# Questions left for overlay FS (and yet to be resolved)
69+
- Benchmarking: needed since the changes replaced the one-step scratch area traversal with a three-step one (could it be reduced to two steps?).
70+
- How to separate new files from updated files in renamed directories (in principle such entries are absent in the read-only layer).
71+
- Is it possible to update only file content hash instead of the whole entry in a catalog DB table?
72+
- Nested catalogs support.
73+
74+
# Encountered issues with FUSE-T (yet to be resolved)
75+
- FUSE-T invokes listxattr before calling getxattr:
76+
  - If your filesystem doesn’t expect this, you are in trouble: in our case, magic attributes don’t work properly.
77+
- Hidden extended attributes are not supported; not a big issue since they are used by cvmfs_server (which is not supported on macOS)
78+
- Mounting takes long enough (usually a few seconds) to make immediately following commands fail (such as calling **ls \<repo\>** right after mount)
79+
- Sometimes we get directory “loops” inside directories that usually store regular files: a nested directory refers to its parent directory
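
For the mount delay listed above, one possible workaround in test scripts is to poll the mount point before running further commands. This is a minimal sketch and not part of the CVMFS code base: `wait_until_listable`, its parameters and its defaults are hypothetical, and it assumes that a mounted repository root lists at least one entry while the bare mount point is empty or not listable.

```python
import os
import time


def wait_until_listable(path, timeout=10.0, interval=0.2, probe=os.listdir):
    """Poll `path` until listing it returns at least one entry, or give up.

    Returns True once the probe succeeds with a non-empty result and
    False if the timeout expires first. `probe` is injectable for testing.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            if probe(path):
                return True
        except OSError:
            pass  # mount point not ready yet, keep polling
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A script would call `wait_until_listable(mount_point)` right after mounting and only then run commands such as `ls` against the repository.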
80+
81+
# Conclusions
82+
That was quite an interesting journey that led to two talks: one at a [CERN-SFT group meeting](https://indico.cern.ch/event/1402909/), where I presented my preliminary benchmarks
83+
of the overlay FS redirect_dir and metacopy options on a toy setup; the other at the [CERN-VM FS Workshop 2024](https://indico.cern.ch/event/1347727/timetable/), where I presented the
84+
outcomes of my GSoC project in more detail.
85+
I am grateful for this opportunity to collaborate on this project and contribute to such a huge organisation as CERN. I want to thank my supervisor
86+
Valentin Volkl for his guidance and support all the way through this period.
87+
There is still work to do: FUSE-T integration turned out not to be as easy as we expected, as it brings some peculiar issues
88+
with repository reloads and IPC management that are yet to be resolved.
89+
The overlay FS part also contained various not-so-trivial pieces, such as nested whiteouts and partial content updates for directories.
90+
I hope to help resolve the remaining issues, and that my work will be a good foundation for a future release of CVMFS.
91+
92+
**Related repositories:** [CVMFS fork](https://github.com/YBelikov/cvmfs), [CVMFS origin](https://github.com/cvmfs/cvmfs)
93+
94+
**Check out my work here:** [FUSE-T](https://github.com/cvmfs/cvmfs/pull/3587), [Overlay FS](https://github.com/cvmfs/cvmfs/pull/3547)
