Commit ce3a5ce

Update str safety docs to refer to the Invariant section of its docs.
* core::str::from_utf8_unchecked: doc: do not declare that invalid UTF-8 is immediate UB, analogous to `String::from_utf8_unchecked`.
* core::str::from_utf8_unchecked(_mut): update internal safety comments, and use pointer cast instead of transmute
* str::as_bytes_mut: doc: do not declare that invalid UTF-8 after borrow ends is immediate UB, analogous to `String::as_mut_vec`.
1 parent 5f23ef7 commit ce3a5ce

2 files changed: +18 -12 lines changed

library/core/src/str/converts.rs

Lines changed: 13 additions & 8 deletions
@@ -2,7 +2,7 @@
 
 use super::Utf8Error;
 use super::validations::run_utf8_validation;
-use crate::{mem, ptr};
+use crate::ptr;
 
 /// Converts a slice of bytes to a string slice.
 ///
@@ -145,7 +145,10 @@ pub const fn from_utf8_mut(v: &mut [u8]) -> Result<&mut str, Utf8Error> {
 ///
 /// # Safety
 ///
-/// The bytes passed in must be valid UTF-8.
+/// This function is unsafe because it does not check that the bytes passed
+/// to it are valid UTF-8. If this constraint is violated, it may cause
+/// memory unsafety issues with future users of the `str`, as the rest of
+/// the standard library [assumes that `str`s are valid UTF-8](str#invariant).
 ///
 /// # Examples
 ///
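
One way a caller might discharge the safety requirement described in this hunk is to establish UTF-8 validity by construction before calling the unchecked conversion. The sketch below is illustrative only: `ascii_prefix_len` and `ascii_prefix` are hypothetical helpers, not part of this commit or of the standard library.

```rust
use std::str;

/// Hypothetical helper: length of the leading run of ASCII bytes.
fn ascii_prefix_len(bytes: &[u8]) -> usize {
    bytes.iter().take_while(|b| b.is_ascii()).count()
}

/// Hypothetical helper: the leading ASCII portion of `bytes`, viewed as a `&str`.
fn ascii_prefix(bytes: &[u8]) -> &str {
    let n = ascii_prefix_len(bytes);
    // SAFETY: every byte in `bytes[..n]` is ASCII, and any sequence of
    // ASCII bytes is valid UTF-8, so the `str` invariant holds for the result.
    unsafe { str::from_utf8_unchecked(&bytes[..n]) }
}
```

Because the slice is cut at the first non-ASCII byte, the returned `&str` never exposes invalid UTF-8 to later users, which is the situation the new doc text warns about.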
@@ -169,9 +172,11 @@ pub const fn from_utf8_mut(v: &mut [u8]) -> Result<&mut str, Utf8Error> {
 #[rustc_const_stable(feature = "const_str_from_utf8_unchecked", since = "1.55.0")]
 #[rustc_diagnostic_item = "str_from_utf8_unchecked"]
 pub const unsafe fn from_utf8_unchecked(v: &[u8]) -> &str {
-    // SAFETY: the caller must guarantee that the bytes `v` are valid UTF-8.
-    // Also relies on `&str` and `&[u8]` having the same layout.
-    unsafe { mem::transmute(v) }
+    // SAFETY: the pointer dereference is safe because that pointer
+    // comes from a reference which is guaranteed to be valid for reads.
+    // If the input bytes are not valid UTF-8, then the returned `&str` will
+    // have invalid UTF-8, which is unsafe but not immediate UB.
+    unsafe { &*(v as *const [u8] as *const str) }
 }
 
 /// Converts a slice of bytes to a string slice without checking
@@ -197,10 +202,10 @@ pub const unsafe fn from_utf8_unchecked(v: &[u8]) -> &str {
 #[rustc_const_stable(feature = "const_str_from_utf8_unchecked_mut", since = "1.83.0")]
 #[rustc_diagnostic_item = "str_from_utf8_unchecked_mut"]
 pub const unsafe fn from_utf8_unchecked_mut(v: &mut [u8]) -> &mut str {
-    // SAFETY: the caller must guarantee that the bytes `v`
-    // are valid UTF-8, thus the cast to `*mut str` is safe.
-    // Also, the pointer dereference is safe because that pointer
+    // SAFETY: the pointer dereference is safe because that pointer
     // comes from a reference which is guaranteed to be valid for writes.
+    // If the input bytes are not valid UTF-8, then the returned `&mut str` will
+    // have invalid UTF-8, which is unsafe but not immediate UB.
     unsafe { &mut *(v as *mut [u8] as *mut str) }
 }
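
To illustrate the pointer-cast form that replaces `mem::transmute` in the hunks above, here is a minimal standalone sketch. `cast_to_str` and `checked_cast` are hypothetical names used only for this example; the cast expression itself is the one shown in the diff.

```rust
use std::str;

/// Hypothetical stand-in that mirrors the new body of `from_utf8_unchecked`.
///
/// # Safety
///
/// `v` should be valid UTF-8; otherwise later users of the returned `&str`
/// may misbehave, since `str` is assumed to hold valid UTF-8.
unsafe fn cast_to_str(v: &[u8]) -> &str {
    // SAFETY: the raw pointer comes from a reference, so it is valid for
    // reads for the lifetime of `v`. The cast keeps the address and the
    // length metadata unchanged.
    unsafe { &*(v as *const [u8] as *const str) }
}

/// Hypothetical safe wrapper: validate first, then reuse the cast.
fn checked_cast(v: &[u8]) -> Option<&str> {
    str::from_utf8(v).ok()?;
    // SAFETY: `from_utf8` just confirmed that `v` is valid UTF-8.
    Some(unsafe { cast_to_str(v) })
}
```

The cast spells out the source and target pointer types, whereas `transmute` would accept any pair of same-sized types; that is one plausible reason to prefer it here, though the commit message itself only says that the cast is used instead of transmute.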

library/core/src/str/mod.rs

Lines changed: 5 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -309,10 +309,11 @@ impl str {
     ///
     /// # Safety
     ///
-    /// The caller must ensure that the content of the slice is valid UTF-8
-    /// before the borrow ends and the underlying `str` is used.
-    ///
-    /// Use of a `str` whose contents are not valid UTF-8 is undefined behavior.
+    /// This function is unsafe because the returned `&mut [u8]` allows writing
+    /// bytes which are not valid UTF-8. If this constraint is violated, using
+    /// the original `str` after the `&mut [u8]` borrow expires may violate memory
+    /// safety, as the rest of the standard library [assumes that `str`s are
+    /// valid UTF-8](str#invariant).
     ///
     /// # Examples
     ///
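
As a sketch of a use of `str::as_bytes_mut` that keeps the contents valid UTF-8 by the time the borrow ends, consider an in-place ASCII upper-casing routine. `shout_ascii` is a hypothetical helper written for this example, not something added by the commit.

```rust
/// Hypothetical helper: upper-cases ASCII letters in place via `as_bytes_mut`.
fn shout_ascii(s: &mut str) {
    // SAFETY: only ASCII bytes are overwritten, and only with other ASCII
    // bytes. Bytes belonging to multi-byte sequences (all >= 0x80) are left
    // untouched, so the contents are still valid UTF-8 when the borrow ends.
    let bytes = unsafe { s.as_bytes_mut() };
    for b in bytes.iter_mut() {
        if b.is_ascii_lowercase() {
            *b = b.to_ascii_uppercase();
        }
    }
}
```

The standard library already offers `str::make_ascii_uppercase` for this task; the helper exists only to show the kind of reasoning a `SAFETY` comment around `as_bytes_mut` might carry.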
