Skip to content

Mapped Arc and UniqueArc #700

@LeoPloutno

Description

@LeoPloutno

Proposal

Problem statement

It would be very convenient to be able to share individual subfields of an object allocated by an Arc
without needing to resort to scoped threads.

Motivating examples or use cases

Suppose you have multiple threads, each requiring access to a heap allocated slice.
In current Rust you can:

  • Allocate one large slice and send mutable references to subslices to scoped threads.
  • Allocate the desired slice inside each thread, hoping cache locality wonn't be an issue.

For instance, you do some computation which requires a buffer, and you do many such computations
in parallel. You can do

let mut huge_buffer: Box<[_]> = ...;

thread::scope(|s| {
    let mut handles = Vec::with_capacity(n_threads);
    
    for buffer in huge_buffer.chunks_exact_mut(chunk_size) {
        handles.push(s.spawn(move || {
            some_computation_with_buffer(buffer, ...);
        }));
    }
});

or

let mut handles = Vec::with_capacity(n_threads);

for _ in 0..n_threads {
    handles.push(thread::spawn(|| {
        let buffer: Box<[_]> = ...;
        
        some_computation_with_buffer(buffer, ...);
    }));
}

The first solution is not applicable in contexts where scoped threads cannot be used (for example - async).
The second one fragments the data and allocates many times in different places
in memory, which is chache-unfriendly. What if tasks that require the bufferrs are spawned constantly?
Constant allocation and deallocation is time-consuming, for every call to alloc and free
invoke syscalls.

Solution sketch

Shared subfields

An Arc allows the user to share an allocation between execution boudaries without
worrying that it might get deallocated the instant the object that allocated it dropped.
Similarly, one could have a type that points to a subfield of that allocation and keeps
it alive until it is the last pointer, at which point it deallocates the object.
Such an object may be cloned and passed between threads freely, for it only grants immutable
access to the subfield it points to.

Exclusive subfields

Unlike shared mapped pointers, exclusive ones (will also be refered to as 'unique (mapped) pointers') can grant
mutable access to the subfield it points to. For this reason, they cannot be cloned.

Poisoning

If multiple subfields of the same object may be modified concurrently with the proposed API,
when all unique pointers go out of scope (get dropped), immutable access to the whole object
may be re-acquired (similarly to UniqueArc::into_arc). Hence, inconsistant state of a subfield
which has been written to by a panicking thread may be read by another thread.
To prevent this from happening without the user knowing of it, the pointee shall be poisoned
if a unique pointer is destroyed during a panic.
This means a PoisonTag will need to be added to InnerArc, though a non-poisoning
version would not defer from the current layout.

Weak pointers

Weak pointers should not be upgradeable as long as any unique pointers exist.
In current Rust, this is achieved by setting the strong count to zero when creating a
UniqueArc - since no other strong pointer, which would be responsible for deallocation, exists,
the counter can have a sentinel value.

If we introduce mapped unique pointers, which can be split into multiple ones indefinately,
this algorithm does not work anymore. The strong count should count the number of unique and
shared pointers independantly (dedicating another atomic counter would complicate the synchronization).
I propose the first half of the counter (32 bits on 64-bit platforms) counts the shared pointers,
while the other half - the unique ones. This way, incrementing and decrementing either sub-counter is trivial
and can be done using a single fetch_(add/sub) operation with the right argument -
1 for shared pointers and 1 << (usize::NBITS / 2) for unique pointers.

The logic for upgrading the weak pointers must be updated:
a Weak can be upgraded to an Arc if and only if there is at least one strong shared pointer
and there are no unique pointers.

Rough sketch

Shared strong pointers:

// An atomically reference-counted pointer to a subfield of the allocated object.
// Grants immutable access to the subfield.
// May be freely cloned.
struct MappedArc<T: ?Sized, U: ?Sized, A: Allocator = Global> {
    data: NonNull<T>,
    allocation: NonNull<InnerArc<U>>,
    allocator: A
}

impl<T: ?Sized, A: Allocator> Arc<T, A> {
    // Converts into a mapped pointer to the whole object as its own subfield.
    pub fn into_mapped(self) -> MappedArc<T, T, A> { ... }
}

impl<T: ?Sized, U: ?Sized, A: Allocator> MappedArc<T, U, A> {
    // Converts this shared pointer into a pointer to the whole object.
    // Returns the original pointer wrapped in `Err` if there are any unique pointers.
    // If the object has been poisoned by a unique mapped pointer,
    // returns the `Arc` inside a `PoisonError`.
    pub fn into_arc(self) -> Result<LockResult<Arc<U, A>>, MappedArc<T, U, A>> { ... }
    
    // Converts this shared pointer into a unique pointer to the same subfield
    // if there are no other shared pointers. Since this pointer exists, 
    // there can only be unique pointers to other subfields, so their number does not matter.
    // Otherwise, returns the original pointer wrapped in `Err`.
    // Upon success, decrements the strong unique counter and increments the strong shared counter.
    pub fn into_mapped_unique(self) -> Result<MappedUniqueArc<T, U, A>, MappedArc<T, U, A>> { ... }
    
    // Maps this shared pointer into a shared pointer to a subfield of its pointee.
    pub fn map<V, F>(orig: Self, f: F) -> MappedArc<V, U, A> 
    where
        F: FnOnce(&T) -> &V,
        V: ?Sized
    { ... }
    
    // Optionally maps this shared pointer into a shared pointer to a subfield of its pointee.
    // If the closure returns `None`, returns the original pointer wrapped in `Err`.
    pub fn filter_map<V, F>(orig: Self, f: F) -> Result<MappedArc<V, U, A>, MappedArc<T, U, A>> 
    where
        F: FnOnce(&T) -> Option<&V>,
        V: ?Sized
    { ... }
    
    // Attempts to map this shared pointer to a shared pointer to a subfield of its pointee.
    // If the closure returns an error, returns the original pointer along with it.
    pub fn try_map<V, E, F>(orig: Self, f: F) -> Result<MappedArc<V, U, A>, (MappedArc<T, U, A>, E)> 
    where
        F: FnOnce(&T) -> Result<&V, E>,
        V: ?Sized
    { ... }
    
    // Splits this shared pointer into two that point to subfields of its pointee.
    pub fn map_split<V, W, F>(orig: Self, f: F) -> (MappedArc<V, U, A>, MappedArc<W, U, A>)
    where
        F: FnOnce(&T) -> (&V, &W),
        V: ?Sized,
        W: ?Sized,
        A: Clone
    { ... }
}

impl<T: ?Sized, U: ?Sized, A: Allocator + Clone> Clone for MappedArc<T, U, A> { ... }

impl<T: ?Sized, U: ?Sized, A: Allocator> Deref for MappedArc<T, U, A> {
    type Target = T;
    ...
}

// Decrements the strong shared counter, possibly dropping and/or deallocating the pointee.
impl<T: ?Sized, U: ?Sized, A: Allocator> Drop for MappedArc<T, U, A> { ... }

// `U` need not be `Send` nor `Sync` - the mapped pointer does not
// grant access to it, only deallocates it if needed.
unsafe impl<T: ?Sized + Send + Sync, U: ?Sized, A: Allocator> Send for MappedArc<T, U, A> {}

// `U` need not be `Send` nor `Sync` - the mapped pointer does not
// grant access to it, only deallocates it if needed.
// `T` needs to be `Sync` since the mapped pointer may grant shared access to it
// across threads.
unsafe impl<T: ?Sized + Send + Sync, U: ?Sized, A: Allocator> Sync for MappedArc<T, U, A> {}

Unique strong pointers:

// An atomically reference-counted unique pointer to a subfield of the allocated object.
// Grants mutable access to the subfield.
// Cannot be cloned.
struct MappedUniqueArc<T: ?Sized, U: ?Sized, A: Allocator = Global> {
    data: NonNull<T>,
    allocation: NonNull<InnerArc<U>>,
    allocator: A
}

impl<T: ?Sized, A: Allocator> UniqueArc<T, A> {
    // Converts into a mapped unique pointer to the whole object as its own subfield.
    pub fn into_mapped_unique(self) -> MappedUniqueArc<T, T, A> { ... }
}

impl<T: ?Sized, U: ?Sized, A: Allocator> MappedUniqueArc<T, U, A> {
    // Converts this unique pointer into a pointer to the whole object.
    // Returns the original pointer wrapped in `Err` if there are any other unique pointers.
    pub fn into_arc(self) -> Result<LockResult<Arc<U, A>>, MappedUniqueArc<T, U, A>> { ... }
    
    // Converts this unique pointer into a shared pointer to the same subfield.
    // Decrements the strong unique counter and increments the strong shared counter.
    pub fn into_mapped(self) -> MappedArc<T, U, A> { ... }
    
    // Optionally maps this unique pointer into a unique pointer to a subfield of its pointee.
    pub fn map<V, F>(orig: Self, f: F) -> MappedUniqueArc<V, U, A> 
    where
        F: FnOnce(&mut T) -> &mut V,
        V: ?Sized
    { ... }
    
    // Maps this unique pointer into a unique pointer to a subfield of its pointee.
    // If the closure returns `None`, returns the original pointer wrapped in `Err`.
    pub fn filter_map<V, F>(orig: Self, f: F) -> Result<MappedUniqueArc<V, U, A>, MappedUniqueArc<T, U, A>> 
    where
        F: FnOnce(&mut T) -> Option<&mut V>,
        V: ?Sized
    { ... }
    
    // Attempts to map this unique pointer to a unique pointer to a subfield of its pointee.
    // If the closure returns an error, returns the original pointer along with it.
    pub fn try_map<V, E, F>(orig: Self, f: F) -> Result<MappedUniqueArc<V, U, A>, (MappedUniqueArc<T, U, A>, E)> 
    where
        F: FnOnce(&mut T) -> Result<&mut V, E>,
        V: ?Sized
    { ... }
    
    // Splits this unique pointer into two that point to subfields of its pointee.
    pub fn map_split<V, W, F>(orig: Self, f: F) -> (MappedUniqueArc<V, U, A>, MappedUniqueArc<W, U, A>)
    where
        F: FnOnce(&mut T) -> (&mut V, &mut W),
        V: ?Sized,
        W: ?Sized,
        A: Clone
    { ... }
}

impl<T: ?Sized, U: ?Sized, A: Allocator> Deref for MappedUniqueArc<T, U, A> {
    type Target = T;
    ...
}

impl<T: ?Sized, U: ?Sized, A: Allocator> DerefMut for MappedUniqueArc<T, U, A> { ... }

// Decrements the strong unique counter, possibly dropping and/or deallocating the pointee.
// Marks the pointee as poisoned if the drop occured during a panic.
impl<T: ?Sized, U: ?Sized, A: Allocator> Drop for MappedUniqueArc<T, U, A> { ... }

// `U` need not be `Send` nor `Sync` - the mapped pointer does not
// grant access to it, only deallocates it if needed.
unsafe impl<T: ?Sized + Send + Sync, U: ?Sized, A: Allocator> Send for MappedArc<T, U, A> {}

// `U` need not be `Send` nor `Sync` - the mapped pointer does not
// grant access to it, only deallocates it if needed.
// `T` does not need to be `Sync`, since this pointer provides exclusive access
// to the subfield.
unsafe impl<T: ?Sized + Send, U: ?Sized, A: Allocator> Sync for MappedArc<T, U, A> {}

Slices

Slices can be trivially fragmented into subfields, so to reduce boilerplate, I propose
the addition of the following methods:

impl<T, A: Alloctor> MappedArc<[T], [T], A> {
    fn iter(orig: Self) -> impl Iterator<Item = MappedArc<T, [T], A>> { ... }
    
    fn chunks(orig: Self) -> impl Iterator<Item = MappedArc<[T], [T], A>> { ... }
    
    fn chunks_exact(orig: Self) -> impl Iterator<Item = MappedArc<[T], [T], A>> { ... }
    
    // Other iterator methods...
    
    fn split_at(orig: Self, mid: usize) -> (MappedArc<[T], [T], A>, MappedArc<[T], [T], A>) { ... }
    
    fn split_first(orig: Self) -> (MappedArc<T, [T], A>, MappedArc<[T], [T], A>) { ... }
    
    // Other splitting methods and their `checked` variants...
    
    fn as_chunks<const N: usize>(orig: Self) -> (MappedArc<[[T; N]], [T], A>, MappedArc<[T], [T], A>) { ... }
    
    // And so on
} 

MappedUniqueArc should implement the _mut variants only - it may always trivially be converted
into MappedArc via MappedUniqueArc::into_mapped.

The iterators would also need to have access to the reference counter to keep the allocation alive
while they iterate over it.

Example

The example from before may now be rewritten as follows:

let mut huge_buffer: Arc<[MaybeUninit<_>]> = ...;

for item in huge_buffer.get_mut().unwrap().iter_mut() {
    // Initialize contents.
    ...
}

let mut huge_buffer: MappedUniqueArc<[_], [_]> = huge_buffer
    .assume_init()
    .into_unique()
    .unwrap()
    .into_mapped_unique();

let mut handles = Ven::with_capacity(n_threads);

for buffer in MappedUniqueArc::chunks_exact_mut(chunk_size) {
    handles.push(thread::spawn(move || {
        some_computation_with_buffer(&mut *buffer, ...);
    }));
}

Rc

The same principles can be applied to Rc, except for poisoning.
Also, in the case of Rc there can be no data races when changing the reference counters,
so introducing a separate shared_strong_count is fine.

Alternatives

  • Instead of cramming two strong counters into a single usize - add a new one. This approach
    would overcomplicate locking - instead of synchronizing two atomic variables, we'ld need to
    take care of three.
  • Instead of changing inner implementation of a well established type - Arc - add a new one.
    This new type would do everything an Arc already does, so it is unnecessary, in my opinion.
  • Keep the status quo - use scoped threads where possible and/or unsafe code with external synchronization.

Because the concepts discussed in this ACP relate to so many types from the standard library,
I doubt the proposed API can be implemented as a library on crates.io an should be integral to the
language.

Links and related work

What happens now?

This issue contains an API change proposal (or ACP) and is part of the libs-api team feature lifecycle. Once this issue is filed, the libs-api team will review open proposals as capability becomes available. Current response times do not have a clear estimate, but may be up to several months.

Possible responses

The libs team may respond in various different ways. First, the team will consider the problem (this doesn't require any concrete solution or alternatives to have been proposed):

  • We think this problem seems worth solving, and the standard library might be the right place to solve it.
  • We think that this probably doesn't belong in the standard library.

Second, if there's a concrete solution:

  • We think this specific solution looks roughly right, approved, you or someone else should implement this. (Further review will still happen on the subsequent implementation PR.)
  • We're not sure this is the right solution, and the alternatives or other materials don't give us enough information to be sure about that. Here are some questions we have that aren't answered, or rough ideas about alternatives we'd want to see discussed.

Metadata

Metadata

Assignees

No one assigned

    Labels

    T-libs-apiapi-change-proposalA proposal to add or alter unstable APIs in the standard libraries

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions