Opportunistic CoW for platforms that support it #669

Open
jaybosamiya-ms wants to merge 1 commit into main from jayb/opportunistic-cow

jaybosamiya-ms (Member) commented Feb 19, 2026

This PR improves LiteBox performance, shaving ~65% off our execution time, by introducing opportunistic CoW:

 _____
< MOO >
 -----
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

Concretely, this PR adds a new method, try_allocate_cow_pages, to the PageManagementProvider platform interface. By default it returns UnsupportedByPlatform, but any platform that can support copy-on-write page allocation can override it to automatically obtain improved performance.
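As a rough illustration of the "default to unsupported, override to opt in" pattern described above, here is a minimal sketch. Only the names try_allocate_cow_pages, PageManagementProvider, and UnsupportedByPlatform come from this PR; the simplified signature, the error enum, and the two example platforms are assumptions made for the sketch and do not reflect the actual LiteBox trait.

```rust
/// Stand-in error type; only the variant name comes from the PR description.
#[derive(Debug, PartialEq, Eq)]
pub enum CowAllocationError {
    UnsupportedByPlatform,
}

/// Simplified stand-in for the platform trait (the real one is generic over
/// the page size and takes more arguments).
pub trait PageManagementProvider {
    /// Try to map `source` copy-on-write, using `suggested_addr` as a hint.
    /// The default body opts out, so existing platforms keep compiling and
    /// callers fall back to their regular (copying) path.
    fn try_allocate_cow_pages(
        &self,
        _suggested_addr: usize,
        _source: &[u8],
    ) -> Result<usize, CowAllocationError> {
        Err(CowAllocationError::UnsupportedByPlatform)
    }
}

/// Hypothetical platform that does not override the method.
pub struct NoCowPlatform;
impl PageManagementProvider for NoCowPlatform {}

/// Hypothetical platform that opts in.
pub struct CowCapablePlatform;
impl PageManagementProvider for CowCapablePlatform {
    fn try_allocate_cow_pages(
        &self,
        suggested_addr: usize,
        source: &[u8],
    ) -> Result<usize, CowAllocationError> {
        // A real platform would create a private (CoW) mapping of `source`
        // here. This toy version only returns an address: the hint if one
        // was given, otherwise the address of the source bytes.
        Ok(if suggested_addr != 0 {
            suggested_addr
        } else {
            source.as_ptr() as usize
        })
    }
}
```

With a default body like this, adding the method does not force any existing platform implementation to change, which matches the "opportunistic" framing.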

The improved performance itself comes from mmap in the Linux shim, where file-based mapping can be sped up.

This is not a trivial "pass through", since the files inside LiteBox need not correspond to files on the host (so even for Linux-on-Linux, there is work that must be done). This motivates the design of try_allocate_cow_pages that I've chosen, which is closer to a memcpy-like interface than to anything file-oriented. Platforms that have file-oriented support for CoW are welcome to keep those platform-specific details to themselves. What this means is that mmap in the shim first looks up the file and then checks whether it has backing-store memory; if it does, a CoW mapping is attempted, at which point the platform itself (on the Linux platform) looks up the file and decides whether to make a CoW mapping of the pages.
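The lookup, backing-store check, CoW attempt, and fallback order described above could be sketched roughly as follows. Everything here except the names try_allocate_cow_pages and UnsupportedByPlatform is hypothetical (ShimFile, backing_store, map_file, MappingKind, and the two toy platforms), and the "mapping" is reduced to reporting which path was taken; it is a sketch of the control flow under those assumptions, not the shim's actual code.

```rust
/// Stand-in error type; only the variant name comes from the PR description.
#[derive(Debug, PartialEq, Eq)]
pub enum CowError {
    UnsupportedByPlatform,
}

/// Which path served the request (hypothetical, for illustration only).
#[derive(Debug, PartialEq, Eq)]
pub enum MappingKind {
    CopyOnWrite,
    Copied,
}

/// Hypothetical in-sandbox file; only some files expose backing memory.
pub struct ShimFile {
    pub contents: Vec<u8>,
    pub has_backing_store: bool,
}

impl ShimFile {
    /// Returns the file's bytes as one slice, if the file supports that.
    pub fn backing_store(&self) -> Option<&[u8]> {
        self.has_backing_store.then_some(self.contents.as_slice())
    }
}

/// Memcpy-like platform hook: it sees bytes, not file names.
pub trait Platform {
    fn try_allocate_cow_pages(&self, source: &[u8]) -> Result<(), CowError>;
}

/// Map `len` bytes of `file` starting at `offset`, preferring CoW.
/// Returns `None` if the requested range is out of bounds.
pub fn map_file(
    platform: &dyn Platform,
    file: &ShimFile,
    offset: usize,
    len: usize,
) -> Option<MappingKind> {
    let end = offset.checked_add(len)?;
    file.contents.get(offset..end)?; // bounds check only
    if let Some(store) = file.backing_store() {
        if platform.try_allocate_cow_pages(&store[offset..end]).is_ok() {
            return Some(MappingKind::CopyOnWrite);
        }
    }
    // No backing store, or the platform declined: take the copying path.
    Some(MappingKind::Copied)
}

/// Toy platform without CoW support.
pub struct NoCow;
impl Platform for NoCow {
    fn try_allocate_cow_pages(&self, _source: &[u8]) -> Result<(), CowError> {
        Err(CowError::UnsupportedByPlatform)
    }
}

/// Toy platform that always accepts the CoW request.
pub struct AlwaysCow;
impl Platform for AlwaysCow {
    fn try_allocate_cow_pages(&self, _source: &[u8]) -> Result<(), CowError> {
        Ok(())
    }
}
```

The point of the byte-slice interface in this sketch is that the shim never hands the platform a path or descriptor, so a platform is free to resolve the slice back to a host file internally or to decline.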

Currently, the biggest gains for us come from the actual ELF load time, since that is all that I have set up as supporting CoW for now (it was our biggest bottleneck; thus the 65% savings in execution time). However, now that the core framework has been set up by this PR, future PRs should more easily be able to unlock performance wherever CoW might help.

There are some small bits of the PR that I am not yet fully happy with, which I've marked with TODO(jb) etc. For example, it'd be good to store pre-loaded FDs rather than constantly opening/closing them. However, that can/will be handled in a future PR.

jaybosamiya-ms force-pushed the jayb/opportunistic-cow branch 2 times, most recently from 18b9116 to fe79cd5 (February 19, 2026 01:31)
@github-actions

🤖 SemverChecks 🤖 No breaking API changes detected

Note: this does not mean API is unchanged, or even that there are no breaking changes; simply, none of the detections triggered.

jaybosamiya-ms marked this pull request as ready for review February 19, 2026 02:02
wdcui requested review from CvvT and Copilot and removed request for Copilot February 19, 2026 21:20
wdcui review requested due to automatic review settings February 19, 2026 21:20
wdcui (Member) commented Feb 19, 2026

I think @CvvT should review this PR.

CvvT (Contributor) left a comment

Thanks for the change! I have a few questions.

      anyhow::bail!("Expected a .tar file, found {}", tar_file.display());
  }
- mmapped_file_data(tar_file)?
+ mmapped_file(tar_file)?.data
CvvT (Contributor):

Do we need to consider files in the tar file?

jaybosamiya-ms (Member, Author) commented Feb 21, 2026

The tar files don't actually easily support the type of mapping we want, so more work needs to be done there; I have explicitly left this to a future PR, since it is almost an orthogonal change to the key changes in this PR (and also needs further design effort).

Comment on lines +134 to +139
    // TODO(jb): Do we ever need to do NoReplace?
    let fixed_behavior = if flags.contains(MapFlags::MAP_FIXED) {
        FixedAddressBehavior::Replace
    } else {
        FixedAddressBehavior::Hint
    };
CvvT (Contributor):

Why not check for NoReplace? Isn't the flag controlled by the user?

jaybosamiya-ms (Member, Author):

I'd left this TODO in for some reason, but I don't fully recall what the particular constraint here was. I'll look at it again shortly. Thanks for pointing it out!
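For reference, one way the flag-to-behavior selection might look if NoReplace were honored is sketched below. This is an assumption-laden illustration, not the PR's code: the NoReplace variant is inferred from the TODO, the flags are modeled as plain bits rather than the real MapFlags type, and whether LiteBox exposes a MAP_FIXED_NOREPLACE flag is not confirmed by this thread. The two constant values are the Linux x86-64 ones.

```rust
/// Stand-in for the enum in the snippet above; `NoReplace` is inferred from
/// the TODO comment rather than confirmed by the diff.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum FixedAddressBehavior {
    Hint,
    Replace,
    NoReplace,
}

// Plain bit constants with the Linux x86-64 values (assumption: the real
// code would use its own `MapFlags` type instead).
pub const MAP_FIXED: u32 = 0x10;
pub const MAP_FIXED_NOREPLACE: u32 = 0x10_0000;

/// Pick the fixed-address behavior for an mmap request.
/// `MAP_FIXED_NOREPLACE` is checked first so that it wins when both bits
/// are set, which is the precedence this sketch assumes.
pub fn fixed_behavior(flags: u32) -> FixedAddressBehavior {
    if flags & MAP_FIXED_NOREPLACE != 0 {
        FixedAddressBehavior::NoReplace
    } else if flags & MAP_FIXED != 0 {
        FixedAddressBehavior::Replace
    } else {
        FixedAddressBehavior::Hint
    }
}
```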

Comment on lines +158 to +180
    match <_ as PageManagementProvider<{ PAGE_SIZE }>>::try_allocate_cow_pages(
        litebox_platform_multiplex::platform(),
        suggested_addr.unwrap_or(0),
        &static_data[offset..offset + len],
        permissions,
        fixed_behavior,
    ) {
        Ok(ptr) => {
            let range =
                PageRange::new(ptr.as_usize(), ptr.as_usize().checked_add(len).unwrap())
                    .unwrap();
            // SAFETY: ptr is the freshly CoW-mapped region of exactly `len` bytes with
            // `permissions`.
            unsafe {
                self.global.pm.register_existing_mapping(
                    range,
                    permissions,
                    true,
                    fixed_behavior == FixedAddressBehavior::Replace,
                )
            }
            .unwrap();
            Some(Ok(ptr))
CvvT (Contributor):

Potential race issue? Should we call try_allocate_cow_pages and register_existing_mapping in a single function that takes the lock on PageManager::vmem, so that other page allocation calls won't contend for the same address?

jaybosamiya-ms (Member, Author):

Hmm, I think you are correct that there is technically the possibility of a race, but I am struggling to think of a scenario where it is relevant. As I understand it, it only really matters if two different threads race to set the fixed address with replacement, because otherwise the underlying host will either pick a different address for us (if non-fixed) or reject the request (if non-replacing). And if a program is racing with replacement, then the program will register the same mapping anyway, right?

I will need to give a bit more thought on how to expose the relevant lock here, possibly some type of begin-attempt-commit type thing.

Thanks for noticing this!
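To make the reviewer's suggestion concrete, here is a minimal sketch of holding one lock across both the platform allocation and the bookkeeping update, so no other allocation can interleave between the two steps. It is purely illustrative: this PageManager, its vmem map, and allocate_and_register are hypothetical stand-ins built on std::sync::Mutex, not LiteBox's actual types or locking scheme.

```rust
use std::collections::BTreeMap;
use std::sync::Mutex;

/// Toy page manager: `vmem` maps a region's start address to its length.
pub struct PageManager {
    vmem: Mutex<BTreeMap<usize, usize>>,
}

impl PageManager {
    pub fn new() -> Self {
        PageManager {
            vmem: Mutex::new(BTreeMap::new()),
        }
    }

    /// Run `allocate` (for example, a CoW attempt on the platform) and record
    /// the resulting region while holding the `vmem` lock for both steps.
    /// If `allocate` fails, nothing is recorded and the error is returned.
    pub fn allocate_and_register<E>(
        &self,
        len: usize,
        allocate: impl FnOnce() -> Result<usize, E>,
    ) -> Result<usize, E> {
        let mut vmem = self.vmem.lock().unwrap();
        let addr = allocate()?;
        vmem.insert(addr, len);
        Ok(addr)
    }

    /// Whether a region starting at `addr` has been recorded.
    pub fn is_registered(&self, addr: usize) -> bool {
        self.vmem.lock().unwrap().contains_key(&addr)
    }
}
```

One trade-off of this shape is that the platform call runs while the lock is held, so a slow host mapping would delay every other allocation; a begin-attempt-commit design like the one mentioned above could narrow that window.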
