Opportunistic CoW for platforms that support it#669
Opportunistic CoW for platforms that support it#669jaybosamiya-ms wants to merge 1 commit intomainfrom
Conversation
18b9116 to
fe79cd5
Compare
fe79cd5 to
d03cf6a
Compare
|
🤖 SemverChecks 🤖 No breaking API changes detected Note: this does not mean API is unchanged, or even that there are no breaking changes; simply, none of the detections triggered. |
|
I think @CvvT should review this PR. |
CvvT
left a comment
There was a problem hiding this comment.
Thanks for the change! I have a few questions.
| anyhow::bail!("Expected a .tar file, found {}", tar_file.display()); | ||
| } | ||
| mmapped_file_data(tar_file)? | ||
| mmapped_file(tar_file)?.data |
There was a problem hiding this comment.
Do we need to consider files in the tar file?
There was a problem hiding this comment.
The tar files don't actually easily support the type of mapping we want, so more work needs to be done there; I have explicitly left this to a future PR, since it is almost an orthogonal change to the key changes in this PR (and also needs further design effort).
| // TODO(jb): Do we ever need to do NoReplace? | ||
| let fixed_behavior = if flags.contains(MapFlags::MAP_FIXED) { | ||
| FixedAddressBehavior::Replace | ||
| } else { | ||
| FixedAddressBehavior::Hint | ||
| }; |
There was a problem hiding this comment.
Why not check NoReplace? The flag is controlled by user?
There was a problem hiding this comment.
I'd left in this TODO for some reason, I don't fully recall what the particular constraint here was, I'll look at it again shortly, thanks for pointing it out!
| match <_ as PageManagementProvider<{ PAGE_SIZE }>>::try_allocate_cow_pages( | ||
| litebox_platform_multiplex::platform(), | ||
| suggested_addr.unwrap_or(0), | ||
| &static_data[offset..offset + len], | ||
| permissions, | ||
| fixed_behavior, | ||
| ) { | ||
| Ok(ptr) => { | ||
| let range = | ||
| PageRange::new(ptr.as_usize(), ptr.as_usize().checked_add(len).unwrap()) | ||
| .unwrap(); | ||
| // SAFETY: ptr is the freshly CoW-mapped region of exactly `len` bytes with | ||
| // `permissions`. | ||
| unsafe { | ||
| self.global.pm.register_existing_mapping( | ||
| range, | ||
| permissions, | ||
| true, | ||
| fixed_behavior == FixedAddressBehavior::Replace, | ||
| ) | ||
| } | ||
| .unwrap(); | ||
| Some(Ok(ptr)) |
There was a problem hiding this comment.
Potential race issue? Should we call try_allocate_cow_pages and register_existing_mapping in a single function that takes lock on PageManager::vmem so that other page allocation calls won't content on the same address?
There was a problem hiding this comment.
Hmm, I think you are correct in terms of technically there is the possibility of a race but I am struggling to think of a scenario where it can be relevant. As I understand it, it is only really relevant if two different threads try to race to set the fixed address with replacement, because otherwise the underlying host access will just either pick a diff address for us (if non-fixed) or will reject it (if non-replacing). And if a program is trying to race with replacement, then the program anyways will register the same mapping, right?
I will need to give a bit more thought on how to expose the relevant lock here, possibly some type of begin-attempt-commit type thing.
Thanks for noticing this!
This PR improves LiteBox performance by shaving off ~65% of our execution time by introducing opportunistic CoW:
Concretely, this PR introduces a new platform interface to
PageManagementProvidercalledtry_allocate_cow_pages. This defaults toUnsupportedByPlatform, but any platform that has the ability to support copy-on-write page allocation can override this method in order to automatically obtain improved performance.The improved performance itself comes from
mmapin the Linux shim, where file-based mapping can be sped up.This is not a trivial "pass through" since the files inside LiteBox need not correspond to files on the host (so even for Linux-on-Linux, there is work that must be done). This motivates the actual design of
try_allocate_cow_pagesthat I've chosen, which is closer to a memcpy-like interface, rather than anything file-oriented. Platforms that have file-oriented support for it are welcome to keep those platform-specific details to themselves. What this means is thatmmapon the shim actually first looks up the file, then it checks to get a backing store memory of it; if one exists, then a CoW is attempted, at which point the platform itself does the lookup of the file (on Linux platform) and decides to make a CoW map of the page.Currently, the biggest gains for us come from the actual ELF load time, since that is all that I have set up as supporting CoW for now (it was our biggest bottleneck; thus the 65% savings in execution time). However, now that the core framework has been set up by this PR, future PRs should more easily be able to unlock performance wherever CoW might help.
There are some small bits I am not yet fully happy with the PR that I've marked with
TODO(jb)etc., for example, it'd be good to store pre-loaded FDs rather than constantly opening/closing them. However, that can/will be handled in a future PR.