Commit cde96e5

Revise MaybeUninit validity documentation
Let's rewrite this for better clarity. In particular, let's document our language guarantees upfront and in positive form. We'll then list the caveats and the non-guarantees after.
1 parent 7971b2a commit cde96e5

1 file changed: +66 -32 lines changed

library/core/src/mem/maybe_uninit.rs

Lines changed: 66 additions & 32 deletions
@@ -255,51 +255,85 @@ use crate::{fmt, intrinsics, ptr, slice};
 ///
 /// # Validity
 ///
-/// A `MaybeUninit<T>` has no validity requirement – any sequence of
-/// [bytes][reference-byte] of the appropriate length, initialized or
-/// uninitialized, are a valid representation of `MaybeUninit<T>`.
+/// `MaybeUninit<T>` has no validity requirements –- any sequence of [bytes] of
+/// the appropriate length, initialized or uninitialized, are a valid
+/// representation.
 ///
-/// However, "round-tripping" via `MaybeUninit` does not always result in the
-/// original value. `MaybeUninit` can have padding, and the contents of that
-/// padding are not preserved. Concretely, given distinct `T` and `U` where
-/// `size_of::<T>() == size_of::<U>()`, the following code is not guaranteed to
-/// be sound:
+/// Moving or copying a value of type `MaybeUninit<T>` (i.e., performing a
+/// "typed copy") will exactly preserve the contents of all non-padding bytes of
+/// type `T` in the value including the [provenance] of those bytes.
+///
+/// Therefore `MaybeUninit` can be used to perform a round trip from type `T` to
+/// type `MaybeUninit<U>` then back to type `T`, while preserving the original
+/// value, if two conditions are met. One, type `U` must have the same size as
+/// type `T`. Two, for all byte offsets where type `U` has padding, the
+/// corresponding bytes in the representation of the value must be
+/// uninitialized.
+///
+/// For example, due to the fact that the type `[u8; size_of::<T>]` has no
+/// padding, the following is sound for any type `T` and will return the
+/// original value:
 ///
 /// ```rust,no_run
 /// # use core::mem::{MaybeUninit, transmute};
-/// # struct T; struct U;
+/// # struct T;
 /// fn identity(t: T) -> T {
 ///     unsafe {
+///         let u: MaybeUninit<[u8; size_of::<T>()]> = transmute(t);
+///         transmute(u) // OK.
+///     }
+/// }
+/// ```
+///
+/// Note: Copying a value that contains references may implicitly reborrow them
+/// causing the provenance of the returned value to differ from that of the
+/// original. This applies equally to the trivial identity function:
+///
+/// ```rust,no_run
+/// fn trivial_identity<T>(t: T) -> T { t }
+/// ```
+///
+/// Note: Moving or copying a value whose representation has initialized bytes
+/// at byte offsets where the type has padding may lose the value of those
+/// bytes, so while the original value will be preserved, the original
+/// *representation* of that value as bytes may not be. Again, this applies
+/// equally to `trivial_identity`.
+///
+/// Note: Performing this round trip when type `U` has padding at byte offsets
+/// where the representation of the original value has initialized bytes may
+/// produce undefined behavior or a different value. For example, the following
+/// is unsound since `T` requires all bytes to be initialized:
+///
+/// ```rust,no_run
+/// # use core::mem::{MaybeUninit, transmute};
+/// #[repr(C)] struct T([u8; 4]);
+/// #[repr(C)] struct U(u8, u16);
+/// fn unsound_identity(t: T) -> T {
+///     unsafe {
 ///         let u: MaybeUninit<U> = transmute(t);
-///         transmute(u)
+///         transmute(u) // UB.
 ///     }
 /// }
 /// ```
 ///
-/// If the representation of `t` contains initialized bytes at byte offsets
-/// where `U` contains padding bytes, these may not be preserved in
-/// `MaybeUninit<U>`. Transmuting `u` back to `T` (i.e., `transmute(u)` above)
-/// may thus be undefined behavior or yield a value different from `t` due to
-/// those bytes being lost. This is an active area of discussion, and this code
-/// may become sound in the future.
-///
-/// However, so long as no such byte offsets exist, then the preceding
-/// `identity` example *is* sound. In particular, since `[u8; N]` has no padding
-/// bytes, transmuting `t` to `MaybeUninit<[u8; size_of::<T>]>` and back will
-/// always produce the original value `t` again. This is true even if `t`
-/// contains [provenance]: the resulting value will have the same provenance as
-/// the original `t`.
-///
-/// Note a potential footgun: if `t` contains a reference, then there may be
-/// implicit reborrows of the reference any time it is copied, which may alter
-/// its provenance. In that case, the value returned by `identity` may not be
-/// exactly the same as its argument. However, even in this case, it remains
-/// true that `identity` behaves the same as a function that just returns `t`
-/// immediately (i.e., `fn identity<T>(t: T) -> T { t }`).
+/// Conversely, the following is sound since `T` allows uninitialized bytes in
+/// the representation of a value, but the round trip may alter the value:
 ///
-/// [provenance]: crate::ptr#provenance
+/// ```rust,no_run
+/// # use core::mem::{MaybeUninit, transmute};
+/// #[repr(C)] struct T(MaybeUninit<[u8; 4]>);
+/// #[repr(C)] struct U(u8, u16);
+/// fn non_identity(t: T) -> T {
+///     unsafe {
+///         // May lose an initialized byte.
+///         let u: MaybeUninit<U> = transmute(t);
+///         transmute(u)
+///     }
+/// }
+/// ```
 ///
-/// [reference-byte]: ../../reference/memory-model.html#bytes
+/// [bytes]: ../../reference/memory-model.html#bytes
+/// [provenance]: crate::ptr#provenance
 #[stable(feature = "maybe_uninit", since = "1.36.0")]
 // Lang item so we can wrap other types in it. This is useful for coroutines.
 #[lang = "maybe_uninit"]
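
For readers who want to try the guarantee outside the doc comments, here is a minimal standalone sketch of the padding-free round trip the revised text describes. It is not part of the commit: the `Sample` type and the `round_trip` and `round_trip_ptr` functions are hypothetical names chosen for illustration, and concrete types are used because `transmute` needs sizes known at compile time.

```rust
use std::mem::{size_of, transmute, MaybeUninit};

/// Hypothetical example type (not from the commit). With `repr(C)` it has one
/// padding byte between `tag` and `value`.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
struct Sample {
    tag: u8,
    value: u16,
}

/// Round trip through `MaybeUninit<[u8; N]>`. The byte array has no padding,
/// so every non-padding byte of `Sample` is carried through unchanged.
fn round_trip(s: Sample) -> Sample {
    unsafe {
        let bytes: MaybeUninit<[u8; size_of::<Sample>()]> = transmute(s);
        transmute(bytes)
    }
}

/// The same round trip for a raw pointer. Per the documented guarantee, the
/// provenance of the pointer bytes is preserved, so the result may still be
/// dereferenced wherever the original pointer could be.
fn round_trip_ptr(p: *const u32) -> *const u32 {
    unsafe {
        let bytes: MaybeUninit<[u8; size_of::<*const u32>()]> = transmute(p);
        transmute(bytes)
    }
}
```

Calling `round_trip` on a `Sample` should give back an equal value, and the pointer returned by `round_trip_ptr` should read the same `u32` as the original. Swapping the byte array for a type with padding, such as the `U(u8, u16)` struct in the diff, would give up that guarantee.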
