-
Notifications
You must be signed in to change notification settings - Fork 13.8k
specialize slice::fill
to use memset
#147294
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from all commits
b7cae19
115aeee
ebc94fb
cea7f2a
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2882,6 +2882,8 @@ pub const unsafe fn copy_nonoverlapping<T>(src: *const T, dst: *mut T, count: us | |
pub const unsafe fn copy<T>(src: *const T, dst: *mut T, count: usize); | ||
|
||
/// This is an accidentally-stable alias to [`ptr::write_bytes`]; use that instead. | ||
/// | ||
/// `val` must be 1 byte wide, it is allowed to be uninit. | ||
// Note (intentionally not in the doc comment): `ptr::write_bytes` adds some extra | ||
// debug assertions; if you are writing compiler tests or code inside the standard library | ||
// that wants to avoid those debug assertions, directly call this intrinsic instead. | ||
|
@@ -2890,7 +2892,7 @@ pub const unsafe fn copy<T>(src: *const T, dst: *mut T, count: usize); | |
#[rustc_const_stable(feature = "const_intrinsic_copy", since = "1.83.0")] | ||
#[rustc_nounwind] | ||
#[rustc_intrinsic] | ||
pub const unsafe fn write_bytes<T>(dst: *mut T, val: u8, count: usize); | ||
pub const unsafe fn write_bytes<T, B>(dst: *mut T, val: B, count: usize); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Whether we can allow this for arbitrary 1-sized types depends on whether LLVM intends to allow memset on arbitrary bytes. @nikic in a future where LLVM has a byte type or something else that has a size of 8 bits but can hold non-integral things such as provenance or poison/undef, do you expect memset will work on such values? It's argument type might have to change for that... There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would not expect the memset argument type to change, but there is a separate memset.pattern intrinsic which accepts an arbitrary argument type and could be used for that purpose. It's not ready for general usage yet though. And there was some discussion about generalizing memset to effectively become memset.pattern in the future. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Okay, so by landing this we'd be making the bet that
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why does it matter what LLVM does in the future? We can just adapt our toolchain, right? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. LLVM currently has no explicitly documented way to There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If LLVM were to require that the byte for memset must be initialized, couldn't we just change the lowering of this intrinsic to not call memset, and do something else, like a loop? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sure, we could provide our own implementation. That doesn't sound great for performance. We can also revert the PR / go with the libs-only approach then, as long as we keep the intrinsic private and don't publicly expose its new capabilities. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes it wouldn't be great for performance but that's a problem we could also fix, and in any case I don't think it's useful to think about what internal APIs we should write on the basis of "maybe in the future LLVM decides to be bad at optimizing this". If this discussion is being driven by the fact that this intrinsic was accidentally exposed, I wonder what libs-api would have to say about that. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This subthread is just about the LLVM concerns, since we're reaching slightly deeper into an area that is poorly defined there, which we should only do deliberately IMO. Now that we have deliberated I'm okay with proceeding. The discussion for the signature change should happen here. |
||
|
||
/// Returns the minimum (IEEE 754-2008 minNum) of two `f16` values. | ||
/// | ||
|
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -15,9 +15,43 @@ impl<T: Clone> SpecFill<T> for [T] { | |||||
} | ||||||
|
||||||
impl<T: Copy> SpecFill<T> for [T] { | ||||||
fn spec_fill(&mut self, value: T) { | ||||||
default fn spec_fill(&mut self, value: T) { | ||||||
if size_of::<T>() == 1 { | ||||||
// SAFETY: The pointer is derived from a reference, so it's writable. | ||||||
// And we checked that T is 1 byte wide. | ||||||
unsafe { | ||||||
// use the intrinsic since it allows any T as long as it's 1 byte wide | ||||||
crate::intrinsics::write_bytes(self.as_mut_ptr(), value, self.len()); | ||||||
} | ||||||
return; | ||||||
} | ||||||
for item in self.iter_mut() { | ||||||
*item = value; | ||||||
} | ||||||
} | ||||||
} | ||||||
|
||||||
macro spec_fill_int { | ||||||
($($type:ty)*) => {$( | ||||||
impl SpecFill<$type> for [$type] { | ||||||
#[inline] | ||||||
fn spec_fill(&mut self, value: $type) { | ||||||
if crate::intrinsics::is_val_statically_known(value) { | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
let bytes = value.to_ne_bytes(); | ||||||
if value == <$type>::from_ne_bytes([bytes[0]; size_of::<$type>()]) { | ||||||
// SAFETY: The pointer is derived from a reference, so it's writable. | ||||||
unsafe { | ||||||
crate::intrinsics::write_bytes(self.as_mut_ptr(), bytes[0], self.len()); | ||||||
} | ||||||
return; | ||||||
} | ||||||
} | ||||||
for item in self.iter_mut() { | ||||||
*item = value; | ||||||
} | ||||||
} | ||||||
} | ||||||
)*} | ||||||
} | ||||||
|
||||||
spec_fill_int! { u16 i16 u32 i32 u64 i64 u128 i128 usize isize } | ||||||
Comment on lines
+34
to
+57
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Couldn't this approach (for There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sure that'd work for the particular reported case but not for newtypes. slice::fill is generic, so it's nice if it can handle anything that's has a scalar abi, but we can't detect "scalar but always-initialized" in the library, so it has to be more general. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That sounds like the motivation you have in mind goes beyond what is spelled out in the PR description. (You link to #87891 but not the original PR that landed the change in the first place. So without digging through the history of this code it's not clear what you want this to do and why, apart from the Miri perf issue that you linked. Would be nice to make all that context easily accessible from the PR description.) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Found the original PR at #83245, but that also doesn't say anything, so I guess it's mostly "because we can". 🤷 The extended intrinsic isn't pretty but it's not terrible either. So if t-libs says this is worth a special case in the compiler that's fine. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Previously it only covered u8, i8 and bool. That leaves out other cases where people may want to initialize large chunks of AtomicU8, |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
//@ compile-flags: -Copt-level=3 | ||
#![crate_type = "lib"] | ||
|
||
use std::mem::MaybeUninit; | ||
|
||
// CHECK-LABEL: @slice_fill_pass_undef | ||
#[no_mangle] | ||
pub fn slice_fill_pass_undef(s: &mut [MaybeUninit<u8>], v: MaybeUninit<u8>) { | ||
// CHECK: tail call void @llvm.memset.{{.*}}(ptr nonnull align 1 %s.0, i8 %v, {{.*}} %s.1, i1 false) | ||
// CHECK: ret | ||
s.fill(v); | ||
} | ||
|
||
// CHECK-LABEL: @slice_fill_uninit | ||
#[no_mangle] | ||
pub fn slice_fill_uninit(s: &mut [MaybeUninit<u8>]) { | ||
// CHECK-NOT: call | ||
// CHECK: ret void | ||
s.fill(MaybeUninit::uninit()); | ||
} | ||
|
||
// CHECK-LABEL: @slice_wide_memset | ||
#[no_mangle] | ||
pub fn slice_wide_memset(s: &mut [u16]) { | ||
// CHECK: tail call void @llvm.memset.{{.*}}(ptr nonnull align 2 %s.0, i8 -1 | ||
// CHECK: ret | ||
s.fill(0xFFFF); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does have a small chance of breaking stable code as the intrinsic is (accidentally) exposed on stable. Such code would have seen deprecation warnings since Rust 1.86.