
Commit 0dbdb7c

Fix slow vec allocation during start-up. (#737)
Use alloc::alloc_zeroed to manually allocate the buffer for the vector of `SpaceDescriptor(usize)` to elide the storing of zeroes to its elements during start-up.

This is one of the two parts for solving the problem of slow start-up time. Related issues:

- #669
- mmtk/mmtk-ruby#13

The problem is, Rust has specialised Vec creation mechanisms for built-in types, such as `usize`. This means `vec![0usize; 33554432]` will use pre-zeroed buffers to create the Vec, and will not store `0` values to its elements. This is very fast. However, Rust cannot do the same optimisation for `vec![SpaceDescriptor(0); 33554432]`. It will have to store a `SpaceDescriptor(0)` to each of its 33554432 elements, and that'll be very slow.

This PR implements our own Vec creation function using `std::alloc::alloc_zeroed`. This change alone can reduce the execution time of `MMTK::new()` from 140ms to 60ms on my machine.
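A minimal timing sketch of the comparison described above, assuming a hypothetical `Descriptor(usize)` newtype as a stand-in for `SpaceDescriptor`. The names `primitive_zeroed`, `newtype_zeroed`, `time_it` and `compare` are illustrative and not part of mmtk-core, and the measured durations will vary with the compiler version, optimisation level and machine.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for `SpaceDescriptor(usize)` (an assumption, not the real type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(transparent)]
pub struct Descriptor(pub usize);

/// Zeroed vector of a primitive type; the commit message says this path
/// can use a pre-zeroed buffer.
pub fn primitive_zeroed(n: usize) -> Vec<usize> {
    vec![0usize; n]
}

/// Zeroed vector of a newtype; the commit message says this path stores
/// a value into every element.
pub fn newtype_zeroed(n: usize) -> Vec<Descriptor> {
    vec![Descriptor(0); n]
}

/// Runs `f` once and returns its result together with the elapsed wall-clock time.
pub fn time_it<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let value = f();
    (value, start.elapsed())
}

/// Times both constructions for `n` elements and returns
/// (primitive duration, newtype duration).
pub fn compare(n: usize) -> (Duration, Duration) {
    let (a, t_primitive) = time_it(|| primitive_zeroed(n));
    let (b, t_newtype) = time_it(|| newtype_zeroed(n));
    // Keep both vectors observable so the allocations are not optimised away.
    black_box(&a);
    black_box(&b);
    (t_primitive, t_newtype)
}
```

Calling `compare(33554432)` from a release build and printing the two durations with `{:?}` gives a rough picture of the gap on a given machine. A single run like this is only indicative, not a rigorous benchmark.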
1 parent 844aa0e commit 0dbdb7c

2 files changed: +61 -1 lines changed


src/util/heap/layout/map64.rs

Lines changed: 7 additions & 1 deletion
@@ -7,6 +7,7 @@ use crate::util::heap::layout::heap_parameters::*;
 use crate::util::heap::layout::vm_layout_constants::*;
 use crate::util::heap::space_descriptor::SpaceDescriptor;
 use crate::util::raw_memory_freelist::RawMemoryFreeList;
+use crate::util::rust_util::zeroed_alloc::new_zeroed_vec;
 use crate::util::Address;
 use std::sync::atomic::{AtomicUsize, Ordering};
 
@@ -41,7 +42,12 @@ impl Map for Map64 {
         }
 
         Self {
-            descriptor_map: vec![SpaceDescriptor::UNINITIALIZED; MAX_CHUNKS],
+            // Note: descriptor_map is very large. Although it is initialized to
+            // SpaceDescriptor(0), the compiler and the standard library are not smart enough to
+            // elide the storing of 0 for each of the elements. Using standard vector creation,
+            // such as `vec![SpaceDescriptor::UNINITIALIZED; MAX_CHUNKS]`, will cause severe
+            // slowdown during start-up.
+            descriptor_map: unsafe { new_zeroed_vec::<SpaceDescriptor>(MAX_CHUNKS) },
             high_water,
             base_address,
             fl_page_resources: vec![None; MAX_SPACES],
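The replacement above is only equivalent to the old `vec!` call if `SpaceDescriptor::UNINITIALIZED` is the all-zero value, as the comment states. One way to guard that assumption is a compile-time assertion. The sketch below uses a hypothetical `Descriptor` type in place of `SpaceDescriptor`. Neither the assertion nor the helper `uninitialized_is_all_zero` is part of this commit.

```rust
/// Hypothetical stand-in for `SpaceDescriptor(usize)` (an assumption, not the real type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(transparent)]
pub struct Descriptor(usize);

impl Descriptor {
    /// Mirrors the role of `SpaceDescriptor::UNINITIALIZED` in the diff above.
    pub const UNINITIALIZED: Self = Descriptor(0);
}

// Compile-time guard: the build fails if UNINITIALIZED ever stops being zero,
// which would make a zeroed allocation differ from `vec![UNINITIALIZED; n]`.
const _: () = assert!(Descriptor::UNINITIALIZED.0 == 0);

/// Runtime form of the same check: every byte of the wrapped `usize` is zero.
pub fn uninitialized_is_all_zero() -> bool {
    Descriptor::UNINITIALIZED
        .0
        .to_ne_bytes()
        .iter()
        .all(|byte| *byte == 0)
}
```

The compile-time form costs nothing at run time and documents the invariant next to the type, which is why it is shown first.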

src/util/rust_util.rs

Lines changed: 54 additions & 0 deletions
@@ -1,3 +1,7 @@
+//! This module works around limitations of the Rust programming language, and provides missing
+//! functionalities that we may expect the Rust programming language and its standard libraries
+//! to provide.
+
 /// Const function for min value of two usize numbers.
 pub const fn min_of_usize(a: usize, b: usize) -> usize {
     if a > b {
@@ -123,3 +127,53 @@ mod initialize_once_tests {
         assert_eq!(INITIALIZE_COUNT.load(Ordering::SeqCst), 1);
     }
 }
+
+/// This module is for allocating large arrays or vectors with initial zero values.
+///
+/// Note: The standard library uses the `IsZero` trait to specialize the initialization of `Vec<T>`
+/// if the initial element values are zero. Primitive types, such as `i8`, `usize`, `f32`, as well
+/// as types with known representations such as `Option<NonZeroUsize>` implement the `IsZero`
+/// trait. However, it has several limitations.
+///
+/// 1.  Composite types, such as `SpaceDescriptor(usize)`, don't implement the `IsZero` trait,
+///     even if they have the `#[repr(transparent)]` annotation.
+/// 2.  The `IsZero` trait is private to the `std` module, and we cannot use it.
+///
+/// Therefore, `vec![0usize; 33554432]` takes only 4 **microseconds**, while
+/// `vec![SpaceDescriptor(0); 33554432]` will take 22 **milliseconds** to execute on some machines.
+/// If such an allocation happens during start-up, the delay will be noticeable to light-weight
+/// scripting languages, such as Ruby.
+///
+/// We implement our own fast allocation of large zeroed vectors in this module. If one day Rust
+/// provides a standard way to optimize for zeroed allocation of vectors of composite types, we
+/// can switch to the standard mechanism.
+pub mod zeroed_alloc {
+
+    use std::alloc::{alloc_zeroed, Layout};
+
+    /// Allocate a `Vec<T>` of all-zero values.
+    ///
+    /// This is intended to be a faster alternative to `vec![T(0); size]`. It allocates a
+    /// pre-zeroed buffer, and does not store zero values to its elements as part of initialization.
+    ///
+    /// It is useful when creating large (hundreds of megabytes) Vecs when the execution time is
+    /// critical (such as during start-up, where a 100ms delay is obvious to small applications).
+    /// However, because of its unsafe nature, it should only be used when necessary.
+    ///
+    /// Arguments:
+    ///
+    /// -   `T`: The element type.
+    /// -   `size`: The length and capacity of the created vector.
+    ///
+    /// Returns the created vector.
+    ///
+    /// # Unsafe
+    ///
+    /// This function is unsafe. It will not call any constructor of `T`. The user must ensure
+    /// that a value with all bits being zero is meaningful for type `T`.
+    pub(crate) unsafe fn new_zeroed_vec<T>(size: usize) -> Vec<T> {
+        let layout = Layout::array::<T>(size).unwrap();
+        let ptr = alloc_zeroed(layout) as *mut T;
+        Vec::from_raw_parts(ptr, size, size)
+    }
+}
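The function in this diff passes the result of `alloc_zeroed` straight to `Vec::from_raw_parts`. The standard library documents that `alloc_zeroed` may return a null pointer on failure and that a zero-sized layout is undefined behaviour for the global allocator. The sketch below is a more defensive variant under those documented rules. The name `new_zeroed_vec_checked` is hypothetical, and this is not the code the commit adds.

```rust
use std::alloc::{alloc_zeroed, handle_alloc_error, Layout};

/// Hypothetical defensive variant of `new_zeroed_vec` (not part of the commit).
///
/// # Safety
///
/// The caller must ensure that the all-zero bit pattern is a valid value of `T`.
pub unsafe fn new_zeroed_vec_checked<T>(size: usize) -> Vec<T> {
    let layout = Layout::array::<T>(size).expect("capacity overflow");

    if layout.size() == 0 {
        // Either `size == 0` or `T` is zero-sized. The global allocator must not
        // be called with a zero-sized layout, so build the Vec without allocating.
        let mut v: Vec<T> = Vec::new();
        // For `size == 0` this is a no-op. For zero-sized `T`, the capacity of an
        // empty Vec is `usize::MAX`, so any `size` is within capacity.
        unsafe { v.set_len(size) };
        return v;
    }

    // The layout has non-zero size here, as `alloc_zeroed` requires.
    let ptr = unsafe { alloc_zeroed(layout) } as *mut T;
    if ptr.is_null() {
        // Report allocation failure the same way the standard collections do.
        handle_alloc_error(layout);
    }

    // `ptr` came from the global allocator with the layout of `[T; size]`, and all
    // `size` elements are zero-initialised, which the caller promises is valid for `T`.
    unsafe { Vec::from_raw_parts(ptr, size, size) }
}
```

A call such as `unsafe { new_zeroed_vec_checked::<u64>(1000) }` yields a vector of length and capacity 1000 whose elements are all zero. The extra branches add a constant cost per call, which is negligible next to allocations of the size discussed in this commit.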
