diff --git a/src/SUMMARY.md b/src/SUMMARY.md
index 1dca1f14df59..3b0a59677ce8 100644
--- a/src/SUMMARY.md
+++ b/src/SUMMARY.md
@@ -454,6 +454,24 @@
- [Data structures are safe](unsafe-deep-dive/foundations/data-structures-are-safe.md)
- [Actions might not be](unsafe-deep-dive/foundations/actions-might-not-be.md)
- [Less powerful than it seems](unsafe-deep-dive/foundations/less-powerful.md)
+- [Understanding Unsafety](unsafe-deep-dive/understanding-unsafety.md)
+ - [Undefined behavior](unsafe-deep-dive/understanding-unsafety/undefined-behavior.md)
+ - [Out of bounds](unsafe-deep-dive/understanding-unsafety/out-of-bounds.md)
+ - [Initialization](unsafe-deep-dive/understanding-unsafety/initialization.md)
+- [Mechanics](unsafe-deep-dive/mechanics.md)
+ - [Example: Representing Booleans](unsafe-deep-dive/mechanics/representing-booleans.md)
+ - [Extension: Representing Char](unsafe-deep-dive/mechanics/representing-char.md)
+ - [Example: FFI](unsafe-deep-dive/mechanics/example-ffi.md)
+ - [Case Study: RawVec](unsafe-deep-dive/mechanics/case-study-rawvec.md)
+ - [Case Study: std::mem](unsafe-deep-dive/mechanics/case-study-std-mem.md)
+ - [Case Study: UnsafeCell](unsafe-deep-dive/mechanics/case-study-unsafe-cell.md)
+ - [Guidelines](unsafe-deep-dive/mechanics/guidelines.md)
+ - [Portal types](unsafe-deep-dive/mechanics/guideline-portal-types.md)
+ - [Smart constructors](unsafe-deep-dive/mechanics/guideline-smart-constructors.md)
+ - [Reuse pre-existing code](unsafe-deep-dive/mechanics/guideline-reuse-preexisting.md)
+ - [Narrow scope](unsafe-deep-dive/mechanics/guideline-narrow-scope.md)
+ - [Safety comments](unsafe-deep-dive/mechanics/guideline-safety-comments.md)
+ - [Invariant checklist](unsafe-deep-dive/mechanics/guideline-invariant-checklist.md)
---
diff --git a/src/running-the-course/course-structure.md b/src/running-the-course/course-structure.md
index cb2a63aaf60c..759fe35cfbb4 100644
--- a/src/running-the-course/course-structure.md
+++ b/src/running-the-course/course-structure.md
@@ -85,9 +85,9 @@ You should be familiar with the material in
### Unsafe (Work in Progress)
The [Unsafe](../unsafe-deep-dive/welcome.md) deep dive is a two-day class on the
-_unsafe_ Rust language. It covers the fundamentals of Rust's safety guarantees,
-the motivation for `unsafe`, review process for `unsafe` code, FFI basics, and
-building data structures that the borrow checker would normally reject.
+_unsafe_ Rust language. It covers the fundamentals of what Rust's safety
+guarantees are and why `unsafe` is needed, reviewing `unsafe` code, using FFI,
+and building data structures that the borrow checker would normally reject.
{{%course outline Unsafe}}
diff --git a/src/unsafe-deep-dive/mechanics.md b/src/unsafe-deep-dive/mechanics.md
new file mode 100644
index 000000000000..f2527ad9c667
--- /dev/null
+++ b/src/unsafe-deep-dive/mechanics.md
@@ -0,0 +1,13 @@
+# Mechanics
+
+We've seen a number of unsafe blocks. We'll now discuss what to look for in
+well-written unsafe code.
+
+{{% segment outline}}
+
+
+
+Inform the class that we will be doing more coding and group work in this
+segment, rather than simply reading slides.
+
+
diff --git a/src/unsafe-deep-dive/mechanics/case-study-rawvec.md b/src/unsafe-deep-dive/mechanics/case-study-rawvec.md
new file mode 100644
index 000000000000..ae1fc19a6928
--- /dev/null
+++ b/src/unsafe-deep-dive/mechanics/case-study-rawvec.md
@@ -0,0 +1,62 @@
+---
+minutes: 15
+---
+
+# Case Study: RawVec
+
+> WORK IN PROGRESS
+>
+> This section is likely to receive significant alterations before completion
+> and may even be removed entirely.
+
+Many important collections in the standard library, such as `Vec`, `String`,
+and `VecDeque`, rely on a private inner type called `RawVec`.
+
+Why is that inner type used?
+
+```rust,ignore
+// https://doc.rust-lang.org/src/alloc/vec/mod.rs.html
+// alloc::vec (stability attributes omitted)
+pub struct Vec<T, A: Allocator = Global> {
+    buf: RawVec<T, A>,
+    len: usize,
+}
+```
+
+```rust,ignore
+// alloc::raw_vec
+pub(crate) struct RawVec<T, A: Allocator = Global> {
+    inner: RawVecInner<A>,
+    _marker: PhantomData<T>,
+}
+
+struct RawVecInner<A: Allocator = Global> {
+    ptr: Unique<u8>,
+    /// Never used for ZSTs; it's `capacity()`'s responsibility to return usize::MAX in that case.
+    ///
+    /// # Safety
+    ///
+    /// `cap` must be in the `0..=isize::MAX` range.
+    cap: Cap,
+    alloc: A,
+}
+```
+
+The [implementation of `RawVec` is described in the Rustonomicon][rv].
+
+[rv]: https://doc.rust-lang.org/nomicon/vec/vec-raw.html
+
+
+
+`Vec` is normally described as being a struct with three fields: length,
+capacity, and pointer to an underlying buffer. Once you dig into the
+implementation details, you'll notice that things are much more complicated.
+
+`RawVec` provides a barrier between Safe and Unsafe.
+
+`RawVec` records the element type through its `PhantomData` marker, while
+`RawVecInner` contains the actual pointer and capacity of the underlying
+buffer.
+
+
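+The split between "owning an allocation" and "tracking which slots are
+initialized" can be sketched in a few dozen lines. This is a simplified
+illustration and not the standard library's implementation: `RawBuf` and
+`MiniVec` are hypothetical names, capacity always doubles, and zero-sized types
+are rejected rather than handled.
+
+```rust
+use std::alloc::{self, Layout};
+use std::ptr::{self, NonNull};
+
+/// Hypothetical stand-in for `RawVec`: owns the allocation, but knows nothing
+/// about which slots are initialized.
+struct RawBuf<T> {
+    ptr: NonNull<T>,
+    cap: usize,
+}
+
+impl<T> RawBuf<T> {
+    fn new() -> Self {
+        assert!(std::mem::size_of::<T>() != 0, "this sketch does not support ZSTs");
+        RawBuf { ptr: NonNull::dangling(), cap: 0 }
+    }
+
+    fn grow(&mut self) {
+        let new_cap = if self.cap == 0 { 4 } else { self.cap * 2 };
+        // `Layout::array` fails if the total size would exceed `isize::MAX`.
+        let new_layout = Layout::array::<T>(new_cap).expect("capacity overflow");
+        let new_ptr = if self.cap == 0 {
+            // SAFETY: `T` is not a ZST and `new_cap > 0`, so the size is non-zero.
+            unsafe { alloc::alloc(new_layout) }
+        } else {
+            let old_layout = Layout::array::<T>(self.cap).expect("capacity overflow");
+            // SAFETY: `self.ptr` was allocated by the global allocator with
+            // `old_layout`, and the new size is non-zero and within `isize::MAX`.
+            unsafe { alloc::realloc(self.ptr.as_ptr().cast::<u8>(), old_layout, new_layout.size()) }
+        };
+        self.ptr = match NonNull::new(new_ptr.cast::<T>()) {
+            Some(ptr) => ptr,
+            None => alloc::handle_alloc_error(new_layout),
+        };
+        self.cap = new_cap;
+    }
+}
+
+impl<T> Drop for RawBuf<T> {
+    fn drop(&mut self) {
+        if self.cap != 0 {
+            let layout = Layout::array::<T>(self.cap).expect("capacity overflow");
+            // SAFETY: `self.ptr` was allocated with exactly this layout.
+            unsafe { alloc::dealloc(self.ptr.as_ptr().cast::<u8>(), layout) }
+        }
+    }
+}
+
+/// Hypothetical stand-in for `Vec`: the safe API that tracks initialization.
+struct MiniVec<T> {
+    buf: RawBuf<T>,
+    len: usize,
+}
+
+impl<T> MiniVec<T> {
+    fn new() -> Self {
+        MiniVec { buf: RawBuf::new(), len: 0 }
+    }
+
+    fn len(&self) -> usize {
+        self.len
+    }
+
+    fn push(&mut self, value: T) {
+        if self.len == self.buf.cap {
+            self.buf.grow();
+        }
+        // SAFETY: `len < cap`, so the slot is allocated, in bounds, and unused.
+        unsafe { self.buf.ptr.as_ptr().add(self.len).write(value) };
+        self.len += 1;
+    }
+
+    fn get(&self, index: usize) -> Option<&T> {
+        if index < self.len {
+            // SAFETY: slots `0..len` have been initialized by `push`.
+            Some(unsafe { &*self.buf.ptr.as_ptr().add(index) })
+        } else {
+            None
+        }
+    }
+}
+
+impl<T> Drop for MiniVec<T> {
+    fn drop(&mut self) {
+        for i in 0..self.len {
+            // SAFETY: slots `0..len` are initialized and are dropped only once.
+            unsafe { ptr::drop_in_place(self.buf.ptr.as_ptr().add(i)) };
+        }
+        // `RawBuf::drop` runs afterwards and frees the allocation.
+    }
+}
+```
+
+Callers of `MiniVec` never write `unsafe`. Every unsafe block relies either on
+`RawBuf`'s allocation bookkeeping or on the `len <= cap` invariant.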
diff --git a/src/unsafe-deep-dive/mechanics/case-study-std-mem.md b/src/unsafe-deep-dive/mechanics/case-study-std-mem.md
new file mode 100644
index 000000000000..89b9a15afd40
--- /dev/null
+++ b/src/unsafe-deep-dive/mechanics/case-study-std-mem.md
@@ -0,0 +1 @@
+# Case Study: std::mem
diff --git a/src/unsafe-deep-dive/mechanics/case-study-unsafe-cell.md b/src/unsafe-deep-dive/mechanics/case-study-unsafe-cell.md
new file mode 100644
index 000000000000..cb5a9e1d7417
--- /dev/null
+++ b/src/unsafe-deep-dive/mechanics/case-study-unsafe-cell.md
@@ -0,0 +1 @@
+# Case Study: UnsafeCell
diff --git a/src/unsafe-deep-dive/mechanics/case-study.md b/src/unsafe-deep-dive/mechanics/case-study.md
new file mode 100644
index 000000000000..2004e04acfeb
--- /dev/null
+++ b/src/unsafe-deep-dive/mechanics/case-study.md
@@ -0,0 +1,25 @@
+# Case Study: Lesser-known parts of std::mem
+
+As a group, we'll study some parts of Rust's memory management functionality:
+
+- `std::mem::TransmuteFrom` trait and its `Assume` struct
+- `std::mem::discriminant`
+- `std::mem::forget_unsized`
+- `std::mem::MaybeUninit`
+
+
+
+Split learners into small groups and ask them to look into the implementation of
+one of the items above.
+
+You may need to show learners how to view the source code of the standard
+library.
+
+After a few minutes, they should be able to answer the following questions:
+
+- What is the purpose of the function/type/trait?
+- How does the documentation describe the safety contract, if any? Can that
+ documentation be improved?
+- Were there any interesting parts in its implementation?
+
+
diff --git a/src/unsafe-deep-dive/mechanics/example-ffi.md b/src/unsafe-deep-dive/mechanics/example-ffi.md
new file mode 100644
index 000000000000..c4954da2011b
--- /dev/null
+++ b/src/unsafe-deep-dive/mechanics/example-ffi.md
@@ -0,0 +1 @@
+# Example: FFI
diff --git a/src/unsafe-deep-dive/mechanics/guideline-invariant-checklist.md b/src/unsafe-deep-dive/mechanics/guideline-invariant-checklist.md
new file mode 100644
index 000000000000..c2e877e1b769
--- /dev/null
+++ b/src/unsafe-deep-dive/mechanics/guideline-invariant-checklist.md
@@ -0,0 +1,52 @@
+# Safety checklist
+
+When writing and reviewing `unsafe` code, we should make sure that we've
+considered the following areas _and documented_ what callers must later
+uphold:
+
+- Validity
+- Alignment
+- Lifetimes
+- Ownership
+- Platform
+- Compliance with specifications
+
+
+
+**Validity**
+
+Callers must ensure that values match a bit pattern that is valid for their
+type.
+
+**Alignment**
+
+Callers must ensure that values are correctly aligned.
+
+**Lifetimes**
+
+Do callers need to verify that a referent exists before, during, or after the
+call?
+
+**Ownership**
+
+Can this function generate confusion about ownership?
+
+> _Aside:_ Memory leaks
+>
+> A discussion about leaking memory may arise here. If calling a function
+> removes all ownership information, then the value can never be dropped and
+> its memory is leaked.
+>
+> Memory leaking is not strictly a memory safety concern. However, it's often a
+> problem in practice, especially if it is unintentional.
+>
+> Therefore, this should at least be documented. If it's possible to mishandle
+> the API and cause an unintentional leak, then there is a case for an unsafe
+> block.
+
+**Platform**
+
+Callers must be wary of platform-specific behavior.
+
+**Compliance with specifications**
+
+"Business rules", e.g. all values must be even numbers.
+
+
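+As an illustration, a `Safety` section can walk through the checklist item by
+item. The function below is a hypothetical example written for this page; the
+name `read_u32` and its contract are assumptions, not an existing API.
+
+```rust
+/// Reads a `u32` stored in native byte order at `ptr`.
+///
+/// # Safety
+///
+/// - Validity: `ptr` must be non-null and point to 4 initialized bytes.
+/// - Alignment: `ptr` must be aligned to `align_of::<u32>()`.
+/// - Lifetimes: the bytes must remain allocated for the duration of the call.
+/// - Ownership: nothing else may write to the bytes during the call.
+unsafe fn read_u32(ptr: *const u8) -> u32 {
+    // SAFETY: the caller upholds the contract documented above.
+    unsafe { ptr.cast::<u32>().read() }
+}
+```
+
+Each bullet is something a reviewer can check at the call site, which is what
+makes the documentation useful.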
diff --git a/src/unsafe-deep-dive/mechanics/guideline-narrow-scope.md b/src/unsafe-deep-dive/mechanics/guideline-narrow-scope.md
new file mode 100644
index 000000000000..b5f4812f6548
--- /dev/null
+++ b/src/unsafe-deep-dive/mechanics/guideline-narrow-scope.md
@@ -0,0 +1,49 @@
+---
+minutes: 3
+---
+
+# Keep unsafe narrow
+
+Compare these two code examples:
+
+```rust
+fn main() {
+ let raw = b"Crab";
+
+ // SAFETY: `raw` has the static lifetime of valid UTF-8 data and therefore `ptr` is valid
+ let crab = unsafe {
+ let ptr = raw.as_ptr();
+ let bytes = std::slice::from_raw_parts(ptr, 4);
+ std::str::from_utf8_unchecked(bytes)
+ };
+
+ println!("{crab}");
+}
+```
+
+```rust
+fn main() {
+ let raw = b"Crab";
+ let ptr = raw.as_ptr();
+
+ // SAFETY: `raw` has the static lifetime and therefore `ptr` is valid
+ let bytes = unsafe { std::slice::from_raw_parts(ptr, 4) };
+
+ // SAFETY: We created `raw` with valid UTF-8 data
+ let crab = unsafe { std::str::from_utf8_unchecked(bytes) };
+
+ println!("{crab}");
+}
+```
+
+
+
+Unsafe blocks should have a narrow lens.
+
+
+
+If an unsafe block has multiple safety conditions that can be assessed
+independently, then it's likely that each of those conditions should be in its
+own block.
+
+
diff --git a/src/unsafe-deep-dive/mechanics/guideline-portal-types.md b/src/unsafe-deep-dive/mechanics/guideline-portal-types.md
new file mode 100644
index 000000000000..6f775bbefb21
--- /dev/null
+++ b/src/unsafe-deep-dive/mechanics/guideline-portal-types.md
@@ -0,0 +1,16 @@
+---
+minutes: 2
+---
+
+# Portal types
+
+> TODO(timclicks): expand
+
+Create a safe type that wraps a type that performs unsafe operations. The safe
+type makes the unsafe type impossible to misuse. The wrapper acts as a portal to
+the world of unsafe.
+
+Examples:
+
+- `std::vec::Vec` wraps `alloc::raw_vec::RawVec`
+- The "sys crate" pattern
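+
+A minimal sketch of the "sys crate" pattern, assuming the platform's C library
+provides `strlen`. The module name `sys` and the wrapper `c_string_len` are
+hypothetical; real projects usually put the raw declarations in a separate
+`*-sys` crate.
+
+```rust
+use std::ffi::CStr;
+
+/// The raw layer: declarations that mirror the C API one-to-one.
+mod sys {
+    use std::ffi::c_char;
+
+    unsafe extern "C" {
+        pub fn strlen(s: *const c_char) -> usize;
+    }
+}
+
+/// The portal: a safe function whose argument type already guarantees
+/// everything that `strlen` requires.
+pub fn c_string_len(s: &CStr) -> usize {
+    // SAFETY: `CStr` guarantees a valid, NUL-terminated buffer that stays alive
+    // for the duration of the call.
+    unsafe { sys::strlen(s.as_ptr()) }
+}
+```
+
+The safety argument lives in one place. Callers only ever see `&CStr`, which
+they cannot construct incorrectly from safe code.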
diff --git a/src/unsafe-deep-dive/mechanics/guideline-reuse-preexisting.md b/src/unsafe-deep-dive/mechanics/guideline-reuse-preexisting.md
new file mode 100644
index 000000000000..b831a654efe2
--- /dev/null
+++ b/src/unsafe-deep-dive/mechanics/guideline-reuse-preexisting.md
@@ -0,0 +1,23 @@
+---
+minutes: 3
+---
+
+# Reuse pre-existing code
+
+> TODO(timclicks): expand
+
+Avoid re-implementing:
+
+- Interior mutability – `Cell` and `UnsafeCell`
+- Wrapping NULL pointers safely – `Option<&mut T>`
+
+
+
+When we are writing code, it can be tempting to write everything from scratch.
+Check whether a solution already exists. In particular, the standard library
+offers excellent defaults for memory management.
+
+If you find yourself writing a better implementation of these types, then
+consider submitting it to the Rust project.
+
+
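+For example, the null check for a raw pointer does not need to be written by
+hand: `as_mut` on a raw pointer returns an `Option<&mut T>`. The helper `bump`
+below is a hypothetical function used only for illustration.
+
+```rust
+/// Increments the value behind a possibly-null pointer.
+/// Returns `false` if the pointer was null.
+///
+/// # Safety
+///
+/// If `ptr` is non-null, it must be aligned, point to an initialized `i32`, and
+/// not be aliased for the duration of the call.
+unsafe fn bump(ptr: *mut i32) -> bool {
+    // SAFETY: the contract is forwarded to the caller. `as_mut` performs the
+    // null check and hands back an `Option<&mut i32>`.
+    match unsafe { ptr.as_mut() } {
+        Some(value) => {
+            *value += 1;
+            true
+        }
+        None => false,
+    }
+}
+```
+
+`Option<&mut T>` has the same size as a raw pointer, so the safer type costs
+nothing in memory.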
diff --git a/src/unsafe-deep-dive/mechanics/guideline-safety-comments.md b/src/unsafe-deep-dive/mechanics/guideline-safety-comments.md
new file mode 100644
index 000000000000..a137734c70a4
--- /dev/null
+++ b/src/unsafe-deep-dive/mechanics/guideline-safety-comments.md
@@ -0,0 +1,67 @@
+---
+minutes: 2
+---
+
+# Safety comments
+
+When defining unsafe functions, provide a `Safety` section in the docstring:
+
+```rust,editable
+/// Compress `data`, overwriting its memory and updating the length of the slice.
+unsafe fn compress_inplace(data: &mut [u8]) {
+ todo!();
+}
+```
+
+When using an unsafe block, document how you have upheld your side of the
+contract:
+
+```rust,editable
+unsafe {
+ std::mem::transmute::(x)
+}
+```
+
+
+
+## Code
+
+```rust
+/// Compress `data`, overwriting its memory and updating the length of the slice.
+///
+/// ## Safety
+///
+/// Callers must ensure that the data's compressed form is shorter than the
+/// original. As a heuristic, this function should not be used on a buffer
+/// that has fewer than 256 bytes.
+unsafe fn compress_inplace(data: &mut [u8]) {
+ todo!();
+}
+```
+
+```rust
+// SAFETY: We control the generation of `x` and can ensure that it's 4 bytes wide
+unsafe {
+ std::mem::transmute::(x)
+}
+```
+
+> _Aside: In-place compression_
+>
+> Creating an algorithm that does in-place compression is likely to nerd snipe
+> one or two people. Avoid getting distracted.
+>
+> You could mention that it's possible to use a stack-allocated temporary buffer
+> rather than something on the heap. If the implementation uses a static buffer,
+> the comment must be updated to mention that the code is not thread-safe.
+
+## Discussion
+
+An effective safety comment is falsifiable. That is, there should be something
+empirical that people can point to and check.
+
+Note that Clippy's lint for safety comments does little more than check that the
+string `SAFETY:` appears before the `unsafe` keyword. There is no further
+validation.
+
+
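+A small sketch of a falsifiable safety comment. The function `first_byte` is a
+hypothetical example; the attribute enables Clippy's
+`undocumented_unsafe_blocks` lint for it, which plain `rustc` accepts but
+ignores.
+
+```rust
+#[warn(clippy::undocumented_unsafe_blocks)]
+fn first_byte(data: &[u8]) -> Option<u8> {
+    if data.is_empty() {
+        return None;
+    }
+    // SAFETY: `data` is non-empty (checked above), so index 0 is in bounds.
+    Some(unsafe { *data.get_unchecked(0) })
+}
+```
+
+A reviewer can verify the comment by looking three lines up. Compare that with
+a comment such as "this is fine", which gives them nothing to check.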
diff --git a/src/unsafe-deep-dive/mechanics/guideline-smart-constructors.md b/src/unsafe-deep-dive/mechanics/guideline-smart-constructors.md
new file mode 100644
index 000000000000..691ffb017c72
--- /dev/null
+++ b/src/unsafe-deep-dive/mechanics/guideline-smart-constructors.md
@@ -0,0 +1,43 @@
+# Smart constructors
+
+> TODO(timclicks): Think of a better type name; expand details
+
+```rust,ignore
+impl ForeignRefCount {
+ fn new(...) {
+ // ..
+ }
+
+ unsafe fn incr(&mut self) {
+ // ...
+ }
+
+ unsafe fn decr(&mut self) {
+ // ...
+ }
+}
+```
+
+```rust,ignore
+impl ForeignRefCount {
+ unsafe fn new_unchecked(...) {
+ // ..
+ }
+
+ fn incr(&mut self) {
+ // ...
+ }
+
+ fn decr(&mut self) {
+ // ...
+ }
+}
+```
+
+
+
+It is tedious to check invariants at every call during an object's life.
+Instead, you can provide an unsafe `new_unchecked` constructor, so that the
+invariants are established once, at construction, and then relied upon by the
+safe methods.
+
+
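+A minimal, self-contained sketch of the pattern, using a hypothetical `Even`
+type whose invariant is that the wrapped number is even. Marking the
+constructor `unsafe` is only warranted when other unsafe code relies on the
+invariant; here it is done purely to show the shape.
+
+```rust
+/// Invariant: the wrapped value is always even.
+#[derive(Debug, Clone, Copy, PartialEq)]
+pub struct Even(u32);
+
+impl Even {
+    /// Checked constructor: pays for one test and cannot be misused.
+    pub fn new(value: u32) -> Option<Self> {
+        if value % 2 == 0 { Some(Even(value)) } else { None }
+    }
+
+    /// Unchecked constructor: the caller establishes the invariant once.
+    ///
+    /// # Safety
+    ///
+    /// `value` must be even.
+    pub unsafe fn new_unchecked(value: u32) -> Self {
+        debug_assert!(value % 2 == 0);
+        Even(value)
+    }
+
+    /// Safe method that relies on the invariant without re-checking it.
+    pub fn half(self) -> u32 {
+        self.0 / 2
+    }
+}
+```
+
+All of the obligations sit at the constructor. Every later method call is
+safe, because an `Even` cannot exist unless the invariant held at creation.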
diff --git a/src/unsafe-deep-dive/mechanics/guidelines.md b/src/unsafe-deep-dive/mechanics/guidelines.md
new file mode 100644
index 000000000000..be0af0a9c616
--- /dev/null
+++ b/src/unsafe-deep-dive/mechanics/guidelines.md
@@ -0,0 +1,17 @@
+# Guidelines
+
+> WORK IN PROGRESS
+>
+> These guidelines should not be interpreted as authoritative or official.
+
+Specific advice on creating well-written unsafe Rust code.
+
+
+
+The next few slides are intended as reference material. You do not need to spend
+much time here – the intent is to tell people that these guidelines exist.
+
+You should have already covered most of these points in the preceding
+discussion.
+
+
diff --git a/src/unsafe-deep-dive/mechanics/representing-booleans.md b/src/unsafe-deep-dive/mechanics/representing-booleans.md
new file mode 100644
index 000000000000..bbc85938cd42
--- /dev/null
+++ b/src/unsafe-deep-dive/mechanics/representing-booleans.md
@@ -0,0 +1,480 @@
+---
+minutes: 15
+---
+
+# Example: Representing Boolean values
+
+> TODO(timclicks): split this content into multiple sub-sections
+
+One of the terms that we introduced earlier was _undefined behavior_. This
+exercise aims to discuss what undefined behavior actually is and how it can
+arise.
+
+High performance code is particularly prone to accidentally introducing
+undefined behavior into a program, because its authors are typically very
+interested in finding ways to cut corners.
+
+---
+
+## What's wrong with undefined behavior?
+
+C++ compilers will typically (*) compile this code without warnings, and the
+resulting program will run without signaling an error:
+
+```cpp
+#include <cassert>
+
+int axiom_increment_is_greater(int x) {
+ return x + 1 > x;
+}
+
+int main() {
+ int a = 2147483647;
+ assert(axiom_increment_is_greater(a));
+}
+```
+
+Equivalent Rust programs produce different output:
+
+```rust,editable
+fn axiom_increment_is_greater(x: i32) -> bool {
+ x + 1 > x
+}
+
+fn main() {
+ let a = 2147483647;
+ assert!(axiom_increment_is_greater(a));
+}
+```
+
+(*) We can't be certain. That's one of the problems.
+
+
+
+We don't want to have undefined behavior in our code, because it makes the code
+_unsound_.
+
+Unsound code can crash abruptly or produce unexpected results, because compilers
+are written with the assumption that undefined behavior does not exist. They
+will create optimizations that could be completely contrary to your
+expectations.
+
+In this example, assume that we're creating some sort of proof assistant that
+makes deductions based on mathematical axioms. One of the axioms that we want to
+encode is that an integer's increment is always greater than the integer itself.
+
+gcc v13.2, clang v16.0.0 and msvc v19.0 [all compile the C++ code to][asm] the
+following assembly when optimizations are enabled (`-O2`):
+
+```asm
+axiom_increment_is_greater(int):
+ mov eax, 1
+ ret
+```
+
+[asm]: https://godbolt.org/z/q4MMY8vxs
+
+That is, the compiled function always returns `true`, even though the source
+code has undefined behavior for one input. When `x` is 2^31-1 (the largest
+value of a 32-bit signed integer) and is incremented, the operation produces a
+number that is outside of the range of a 32-bit signed integer.
+
+Integer overflow for signed integers is _undefined_ in C++. In the conventional
+two's complement representation, the increment often wraps to -(2^31), i.e.
+`i32::MIN`.
+
+Rust takes a stricter approach. Integer overflow is never undefined behavior:
+when overflow checks are enabled (the default in debug builds), overflow causes
+a panic, and otherwise the value wraps. This allows Safe Rust to be free of
+undefined behavior.
+
+
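+When overflow is a real possibility, Rust's integer types let you state the
+intended behavior explicitly. The function below is a hypothetical rewrite of
+the axiom that uses `checked_add` to surface the overflow case.
+
+```rust
+/// Returns `None` when `x + 1` does not fit in an `i32`.
+fn increment_is_greater(x: i32) -> Option<bool> {
+    // `checked_add` reports overflow instead of panicking or wrapping.
+    x.checked_add(1).map(|next| next > x)
+}
+```
+
+Related methods such as `wrapping_add` and `overflowing_add` make the other
+possible behaviors explicit in the same way.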
+
+---
+
+## Rust keeps undefined behavior out...
+
+...but, unsafe provides a way for it to get back in.
+
+
+
+We are going to work through an example of how undefined behavior can be
+introduced in an attempt to improve performance.
+
+
+
+---
+
+## Booleans
+
+A typical representation:
+
+- 1 => truth/positivity
+- 0 => falsehood/negativity
+
+
+
+Just as integers can have their quirks, so do Boolean data types.
+
+How are the Boolean values `true` and `false` represented by programming
+languages?
+
+Many languages, including Rust and C++, encode Boolean values as an integer,
+where:
+
+- 1 represents truth or positivity
+- 0 represents falsehood or negativity
+
+However, there is an impedance mismatch because even the smallest integer type
+(a single byte) can represent many more values than the two that are required.
+
+> Aside: Not a universal definition
+>
+> Programming language designers are free to have their own representations, or
+> not include a Boolean type in their language at all.
+>
+> CPUs do not have a Boolean datatype, rather they have Boolean operations that
+> are performed against operands that are typically integers.
+
+
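+Rust's choices can be observed directly. The helper `bool_layout` is a
+hypothetical function that reports the size, alignment, and integer encoding
+of `bool`.
+
+```rust
+use std::mem::{align_of, size_of};
+
+/// Returns (size, alignment, `false` as a byte, `true` as a byte).
+fn bool_layout() -> (usize, usize, u8, u8) {
+    (size_of::<bool>(), align_of::<bool>(), false as u8, true as u8)
+}
+```
+
+Casting from `bool` to `u8` is always safe. It is the opposite direction that
+needs care.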
+
+---
+
+## Exercise
+
+Define a type that represents a `bool`, plus two conversion functions that
+convert a `u8` to your new type and back again.
+
+
+
+
+
+---
+
+## Code review 1
+
+Critique this code and suggest improvements, if any:
+
+```rust,editable
+struct Boolean(u8);
+
+fn byte_to_boolean(b: u8) -> Boolean {
+ Boolean(b)
+}
+
+fn boolean_to_byte(boolean: Boolean) -> u8 {
+ boolean.0
+}
+
+fn boolean_to_bool(boolean: Boolean) -> bool {
+ match boolean.0 {
+ 0 => false,
+ _ => true,
+ }
+}
+```
+
+
+
+Which function should be `unsafe`? It could either be at the "constructor"
+(`byte_to_boolean`) or when the Boolean is converted to a Rust-native `bool`
+(`boolean_to_bool`).
+
+
+
+---
+
+## Code review 2
+
+```rust,editable
+struct Boolean(bool);
+
+fn byte_to_boolean(b: u8) -> Boolean {
+ match b {
+ 0 => Boolean(false),
+ _ => Boolean(true),
+ }
+}
+
+fn boolean_to_byte(boolean: Boolean) -> u8 {
+ boolean.0 as u8
+}
+
+fn boolean_to_bool(boolean: Boolean) -> bool {
+ boolean.0
+}
+```
+
+
+
+In this version, we mask the error. All non-zero inputs are coerced to `true`.
+We store the internal field of the `Boolean` struct as a `bool` to make as much
+use of Rust's type system as possible.
+
+However, this `byte_to_boolean` is not zero-cost. There is still a `match`
+operation that's required.
+
+
+
+---
+
+## Code review 3
+
+```rust,editable
+#[repr(C)]
+union Boolean {
+ raw: u8,
+ rust: bool,
+}
+
+fn byte_to_boolean(b: u8) -> Boolean {
+ Boolean { raw: b }
+}
+
+fn boolean_to_byte(boolean: Boolean) -> u8 {
+ unsafe { boolean.raw }
+}
+
+fn boolean_to_bool(boolean: Boolean) -> bool {
+ unsafe { boolean.rust }
+}
+```
+
+---
+
+## Code review 4
+
+```rust,editable
+struct Boolean(bool);
+
+fn byte_to_boolean(b: u8) -> Boolean {
+ let b: bool = unsafe { std::mem::transmute(b) };
+
+ Boolean(b)
+}
+
+fn boolean_to_byte(boolean: Boolean) -> u8 {
+ boolean.0 as u8
+}
+
+fn boolean_to_bool(boolean: Boolean) -> bool {
+ boolean.0
+}
+```
+
+---
+
+##
+
+---
+
+Or in Rust syntax:
+
+```rust
+struct Boolean(u8);
+
+const TRUE: Boolean = Boolean(1);
+const FALSE: Boolean = Boolean(0);
+```
+
+> Instructor Notes
+>
+> We define a type here so that there is no confusion in the type system between
+> `u8` and `Boolean`.
+
+From a theoretical perspective, the two states `true` and `false` can be
+represented by a single bit. However, the smallest integer available is `u8`,
+which has 254 additional states.
+
+This is a similar problem to the mismatch when casting from an `i64` to an
+`i32`, but there is a significant difference. When converting an integer from a
+64-bit type to a 32-bit type, there is not enough space in the narrower type for
+all possible input values. They can't all fit. In the case of casting from `u8`
+to `bool`, the number of bits isn't the issue. It's the language specification
+that imposes the additional restrictions.
+
+Depending on one's perspective, this either presents an opportunity or a
+challenge.
+
+Moreover, [Rust (following C) imposes the following restrictions][ref-bool] on
+its `bool` type:
+
+> The value `false` has the bit pattern `0x00` and the value `true` has the bit
+> pattern `0x01`. It is _undefined behavior_ for an object with the boolean type
+> to have any other bit pattern. [emphasis added]
+
+Many CPUs don't strictly have a "Boolean type". They have Boolean operations.
+
+- For true, CPUs ask. Does this value match
+
+[ref-bool]: https://doc.rust-lang.org/reference/types/boolean.html
+
+## Exercise
+
+Implement two conversion functions, `byte_to_boolean()` and `boolean_to_byte()`:
+
+```rust
+struct Boolean(u8);
+
+fn byte_to_boolean(b: u8) -> Boolean {
+ todo!();
+}
+
+fn boolean_to_byte(b: Boolean) -> u8 {
+ todo!();
+}
+```
+
+## Discussion
+
+Should this function be marked as unsafe?
+
+```rust
+struct Boolean(u8);
+
+fn byte_to_boolean(b: u8) -> Boolean {
+ match b {
+ 0 => Boolean(0),
+ _ => Boolean(1),
+ }
+}
+```
+
+---
+
+> Note: Content following this comment is from a previous revision and is being
+> retained temporarily.
+
+> TODO(timclicks): Review the following content for anything useful that should
+> be retained.
+
+This example demonstrates how the search for high performance can introduce
+undefined behavior. Software engineers can find themselves wanting to exploit
+characteristics of the operating environment.
+
+CPUs
+
+> Well, actually...
+>
+> CPUs don't really have a concept of a Boolean value. Instead, they have
+> Boolean operations.
+
+In Rust, the conventional way to think of them is something like this:
+
+Boolean values must match a precise representation to avoid undefined behavior:
+
+| Bit pattern    | Rust value |
+| -------------- | ---------- |
+| 00000001       | `true`     |
+| 00000000       | `false`    |
+| Other patterns | Undefined  |
+
+You have two tasks in this exercise.
+
+- First:
+  - Create a Rust type, `Boolean`, that represents a Boolean value in a
+    spec-compliant way.
+  - Create values of your type from `u8` with no overhead cost, while ensuring
+    that undefined behavior is impossible.
+- Second, review someone else's implementation.
+
+
+
+## Discussion
+
+- The critical point in these reviews is that learners accurately describe the
+ contract that callers need to uphold when converting from `u8`. It should be
+ well described in a Safety section of the docstring.
+- Functions should have an `#[inline(always)]` annotation as Rust's `Copy` trait
+  involves memcpy. We want the compiler to erase the function call.
+
+> _Aside: TransmuteFrom trait_
+>
+> The standard library contains a nightly feature, [`transmutability`], which
+> defines the [`std::mem::TransmuteFrom`] trait for performing this kind of
+> operation. This is one of the outputs from the [Safe Transmute Project] within
+> the Rust compiler team.
+
+[`transmutability`]: https://github.com/rust-lang/rust/issues/99571
+[Safe Transmute Project]: https://github.com/rust-lang/project-safe-transmute
+[`std::mem::TransmuteFrom`]: https://doc.rust-lang.org/std/mem/trait.TransmuteFrom.html
+
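+One possible shape for a solution, shown as a sketch rather than a model
+answer. The method names `from_u8_unchecked`, `from_u8` and `to_u8` are
+assumptions chosen for this example.
+
+```rust
+/// Wraps a `bool`, so the type system enforces the bit-pattern invariant.
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+#[repr(transparent)]
+pub struct Boolean(bool);
+
+impl Boolean {
+    /// Converts a byte without any runtime check.
+    ///
+    /// # Safety
+    ///
+    /// `byte` must be `0x00` or `0x01`. Any other bit pattern is undefined
+    /// behavior.
+    #[inline(always)]
+    pub unsafe fn from_u8_unchecked(byte: u8) -> Self {
+        // SAFETY: the caller guarantees that `byte` is a valid `bool` pattern.
+        Boolean(unsafe { std::mem::transmute::<u8, bool>(byte) })
+    }
+
+    /// Checked alternative that pays for one comparison.
+    #[inline]
+    pub fn from_u8(byte: u8) -> Option<Self> {
+        match byte {
+            0 => Some(Boolean(false)),
+            1 => Some(Boolean(true)),
+            _ => None,
+        }
+    }
+
+    /// Converting back is always safe.
+    #[inline(always)]
+    pub fn to_u8(self) -> u8 {
+        self.0 as u8
+    }
+}
+```
+
+The `Safety` section states the contract in terms that a caller can check: two
+permitted byte values, and nothing else.
+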
+### Picking a data structure
+
+**Newtype wrapping u8**
+
+The orthodox strategy will be to wrap `u8` in a struct:
+
+```rust
+struct Boolean(u8);
+```
+
+This gives `Boolean` the same size and alignment as `u8`. Adding
+`#[repr(transparent)]` turns that into a layout guarantee.
+
+**Newtype wrapping bool**
+
+Hopefully, some learners will wrap `bool` as a newtype:
+
+```rust
+struct Boolean(bool);
+```
+
+At first, this may look like a bit of a cheat code for the exercise. It won't
+avoid the need to convert from `u8`, however.
+
+Wrapping `bool` includes the bonus that you can guarantee--in so far as you can
+guarantee Rust's own behavior--that `Boolean` is spec-compliant with `bool`.
+
+It may also look redundant: why bother creating a new type when it is just a
+`bool` underneath? Because it gives us complete control over which traits the
+type implements.
+
+**Union**
+
+An alternative strategy would be to use a `union`:
+
+```rust
+union Byte {
+    raw: u8,
+    rust: bool,
+}
+```
+
+This isn't advised. It means that the value will _never_ be able to be
+considered safe to access. Callers will need to ensure that they comply with the
+rules at every interaction with the type.
+
+**Typestate**
+
+Some advanced programmers may attempt to encode Boolean values as zero-sized
+types in the type system. If you receive questions about this, gently nudge them
+back to including the byte.
+
+```rust
+struct True;
+struct False;
+```
+
+There are a couple of reasons for this. First, zero-sized types do not obey the
+width and alignment requirements of the spec for `bool`. Secondly, they're very
+difficult to work with in practice.
+
+If they wish to make use of the typestate pattern, then a possible alternative
+would be to wrap a `bool` in each state type:
+
+```rust
+struct Boolean(bool);
+struct True(bool);
+struct False(bool);
+```
+
+## Code review
+
+Suggest that there be some advice
+
+
diff --git a/src/unsafe-deep-dive/mechanics/representing-char.md b/src/unsafe-deep-dive/mechanics/representing-char.md
new file mode 100644
index 000000000000..ce75da82e9c2
--- /dev/null
+++ b/src/unsafe-deep-dive/mechanics/representing-char.md
@@ -0,0 +1,50 @@
+# Extension
+
+Create a similar data structure for Rust's [`char`] type. A `char` occupies 4
+bytes, but not all 4-byte sequences are valid as `char`.
+
+[`char`]: https://doc.rust-lang.org/std/primitive.char.html
+
+Here is some starter code:
+
+```rust
+struct Char;
+
+impl TryFrom<u32> for Char {
+    type Error = u32;
+
+    fn try_from(x: u32) -> Result<Self, Self::Error> {
+        todo!() // Attempt conversion, returning Err(x) when invalid
+    }
+}
+
+#[test]
+fn repr_matches() {
+    use std::alloc::Layout;
+
+    assert_eq!(Layout::new::<Char>(), Layout::new::<char>());
+}
+
+#[test]
+fn conversion() {
+    for i in u32::MIN..=u32::MAX {
+        let res = Char::try_from(i);
+
+        match i {
+            0..=0xD7FF | 0xE000..=0x10FFFF => assert!(res.is_ok()),
+            _ => assert!(res.is_err()),
+        };
+    }
+}
+```
+
+
+
+## Representation
+
+From Rust's documentation:
+
+> `char` is guaranteed to have the same size, alignment, and function call ABI
+> as `u32` on all platforms.
+
+
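+For comparison, the standard library exposes the same pair of conversions for
+`char`: a checked one and an unsafe, unchecked one. The wrapper names below are
+hypothetical; `char::from_u32` and `char::from_u32_unchecked` are the real
+functions.
+
+```rust
+/// Checked conversion: `None` for surrogates and values above 0x10FFFF.
+fn to_char(x: u32) -> Option<char> {
+    char::from_u32(x)
+}
+
+/// Unchecked conversion.
+///
+/// # Safety
+///
+/// `x` must be a valid Unicode scalar value.
+unsafe fn to_char_unchecked(x: u32) -> char {
+    // SAFETY: the caller guarantees that `x` is a valid scalar value.
+    unsafe { char::from_u32_unchecked(x) }
+}
+```
+
+Learners can use these as a reference when deciding where their own `Char`
+type should place its checks.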
diff --git a/src/unsafe-deep-dive/motivations/interop.md b/src/unsafe-deep-dive/motivations/interop.md
index 85505b21fb35..6096a3b40fe2 100644
--- a/src/unsafe-deep-dive/motivations/interop.md
+++ b/src/unsafe-deep-dive/motivations/interop.md
@@ -59,8 +59,8 @@ parsing all take energy and time.
According to the C standard, an integer that's at least 32 bits wide. On
today's systems, it's an `i32` on Windows and an `i64` on Linux.
-[`std::ffi::c_long`]: https://doc.rust-lang.org/std/ffi/type.c_long.html
[safe]: https://doc.rust-lang.org/stable/edition-guide/rust-2024/unsafe-extern.html
+[`std::ffi::c_long`]: https://doc.rust-lang.org/std/ffi/type.c_long.html
## Consideration: type safety
diff --git a/src/unsafe-deep-dive/understanding-unsafety.md b/src/unsafe-deep-dive/understanding-unsafety.md
new file mode 100644
index 000000000000..ba4c6352b065
--- /dev/null
+++ b/src/unsafe-deep-dive/understanding-unsafety.md
@@ -0,0 +1,10 @@
+---
+minutes: 1
+---
+
+# Understanding Unsafety
+
+We've introduced a few technical terms, such as _undefined behavior_. Let's take
+a good look at what they actually mean.
+
+{{%segment outline}}
diff --git a/src/unsafe-deep-dive/understanding-unsafety/initialization.md b/src/unsafe-deep-dive/understanding-unsafety/initialization.md
new file mode 100644
index 000000000000..b87e07340dcc
--- /dev/null
+++ b/src/unsafe-deep-dive/understanding-unsafety/initialization.md
@@ -0,0 +1,638 @@
+# Initialization
+
+> TODO(timclicks): split this content into multiple sub-sections
+
+---
+
+## Memory lifecycle
+
+- Unpaged
+- Mapped but unallocated
+- Allocated
+- Allocated and "available" (uninitialized)
+- Allocated and "active" (initialized)
+- Deallocated but mapped
+- Unpaged
+
+
+
+Variables, or rather the data that is used to represent them, have a
+surprisingly complex lifecycle.
+
+The details are complex and we don't want to turn this class into a
+graduate-level computer architecture course. However, understanding this
+system is useful, because it explains why programmers use uninitialized memory
+for performance-critical code.
+
+Operating systems, programming languages and hardware cooperate to provide
+programs with convenient access to data stored on physical devices, such as RAM
+chips. Programs are provided with a façade, an imaginary array of bytes
+addressed from 0 to _n_, that allows them to store and retrieve data.
+
+This imaginary array of bytes is called the _virtual address space_ and this
+setup is called _virtual memory_.
+
+Each operating system process has its own virtual address space, meaning that
+the same address means different things in different processes. Another way of
+thinking about this is that each process believes that it has exclusive access
+to the data available to the machine.
+
+The operating system kernel is responsible for mapping from the virtual memory
+addresses that your program understands to something that the hardware
+understands.
+
+To do this bookkeeping, the kernel stores information in its own data structures
+and relies on the concept of a _memory page_. Pages allow components within the
+computer to work together, including the OS kernel, the OS process, the
+program's threads, the CPU, and storage hardware. Pages allow sections of the
+physical memory to be reserved for specific purposes and for security
+restrictions to be enforced. They also allow groups of memory addresses to be
+given attributes, such as write or execute permissions.
+
+Pages also improve coordination between these components, by reducing the number
+of lookups that are needed when memory addresses are nearby.
+
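+You may be familiar with the term _segmentation fault_, often shortened to _seg
+fault_. The term dates from systems that divided memory into _segments_. Today,
+it typically describes an access to an address that has no page mapped to it, or
+an access that violates the page's permissions. Only a small fraction of the
+very large virtual address space is given a page.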
+
+Virtual memory is complex and has many stages. We'll skip over most of them to
+allow us to build a general mental model of what's happening at runtime during a
+variable's lifecycle:
+
+- Memory starts as _unmapped_ and available to OS processes that require it. The
+ operating system knows that there is available space on the hardware, but the
+ process's virtual address space does not yet include a mapping to it.
+
+As space to store data is needed, memory transitions from the unmapped state:
+
+- Memory is then _mapped_ by the OS. The operating system maps a portion of the
+  available space on the hardware to the process's virtual address space.
+- The program's allocator then _allocates_ memory.
+- This allocated memory then becomes available to the program, but is in an
+ _uninitialized_ state.
+- When the variables are created within that memory and are guaranteed to be
+ _valid_, the memory is said to be _initialized_.
+
+As data is no longer needed, memory moves back toward the unmapped state:
+
+- After some time, the variable's lifetime ends. It has been moved or dropped.
+  The bytes at the variable's original position may not have been modified, but
+  they are now invalid to access. Accessing those bytes at this point is
+  _undefined behavior_.
+- At some point, the unused memory is _deallocated_. These memory addresses
+  remain mapped.
+- Later on, when the memory page is no longer being used, the operating system
+ may remove the page from the mapping table, allowing other processes to make
+ use of the hardware.
+
+Accessing uninitialized data is undefined behavior and a very serious safety
+hazard.
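+
+The lifecycle above can be sketched with the global allocator API in
+`std::alloc`. This is a minimal illustration, not how everyday Rust code manages
+memory. The function name `lifecycle_demo` is hypothetical, and mapping and
+unmapping pages is left to the allocator and the operating system.
+
+```rust
+use std::alloc::{alloc, dealloc, Layout};
+
+/// Walks one `u32` through allocate -> initialize -> read -> deallocate.
+/// (Hypothetical helper, used only for illustration.)
+fn lifecycle_demo() -> u32 {
+    let layout = Layout::new::<u32>();
+
+    // SAFETY: `layout` has a non-zero size. The pointer is checked for null,
+    // written before it is read, and deallocated with the same layout.
+    unsafe {
+        // Allocated but uninitialized: reading here would be undefined behavior.
+        let ptr = alloc(layout) as *mut u32;
+        assert!(!ptr.is_null(), "allocation failed");
+
+        // Initialized: the bytes now hold a valid `u32`.
+        ptr.write(42);
+        let value = ptr.read();
+
+        // Deallocated: the address may stay mapped, but must not be accessed.
+        dealloc(ptr as *mut u8, layout);
+        value
+    }
+}
+
+fn main() {
+    println!("{}", lifecycle_demo());
+}
+```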
+
+### Other notes
+
+When virtual machines and hypervisors are in use, additional layers of mapping
+are involved.
+
+Unless your operating system or allocator provides specific guarantees, memory
+provided to a program is not necessarily in a clean state.
+
+Allocators: The allocator is part of the program itself. The operating system is
+agnostic to how the allocator manages the memory that it has been given.
+
+The kernel understands physical memory addresses. User-space programs only have
+access to virtual memory.
+
+The mapping between virtual addresses and physical pages is itself stored in
+memory, in data structures called _page tables_. To avoid walking the page
+tables on every access, the CPU caches recent translations in the TLB, which
+stands for "translation lookaside buffer".
+
+The CPU provides the operating system with privileged instructions for
+interacting with hardware, including main memory.
+
+Rust's ownership model adds its own characteristics to this overall model. After
+a variable is moved, its data is likely to still be present in the original
+location, but that location is no longer accessible to the program.
+
+
+
+---
+
+## Addressing data
+
+```rust
+static s: &str = "_";
+
+fn f() {}
+
+fn main() {
+    let l = 123;
+    let h = Box::new(123);
+
+    println!("{:p}", &l);
+    println!("{:p}", s);
+    println!("{:p}", &*h);
+    println!("{:p}", f as fn());
+}
+```
+
+
+
+All data stored in a program lives at an _address_, a number which the operating
+system can use to retrieve or store data at that address.
+
+Local variables, such as `l`, are stored on the "stack". Memory addresses on the
+stack are quite high. (When executed on 64-bit Linux, the program probably
+prints out a value near `0x7fffffffffff`.)
+
+Static variables, such as `s`, are stored at lower addresses.
+
+Functions are also stored in memory. In Rust, the type `fn()` is a function
+pointer. Its address can also be printed.
+
+### Questions
+
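+- Q: Why do the printed addresses not start at 1?\
+  A: The lowest addresses are deliberately left unmapped so that null pointer
+  dereferences fault. The kernel also reserves part of every process's address
+  space for itself (on x86-64 Linux, the upper half).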
+
+### Variable mapping
+
+- `l` - L for _local_ - stored on the "stack"
+- `h` - H for _heap_
+- `f` - F for _function_
+- `s` - S for _static_
+
+
+
+---
+
+## Memory lifecycle - stack
+
+- Allocation:
+
+---
+
+> All runtime-allocated memory in a Rust program begins its life as
+> uninitialized.
+>
+> —
+> [The Rustonomicon](https://doc.rust-lang.org/nomicon/uninitialized.html)
+
+
+
+Validity is related to other concepts that we've seen before, such as _undefined
+behavior_. Validity is a precondition for well-defined behavior.
+
+This segment of the course describes what initialization is and some of its
+related concepts, such as _alignment_ and _validity_, and how they relate to one
+that we've seen before: _undefined behavior_.
+
+The primary focus of the segment though is to introduce the
+`std::mem::MaybeUninit` type. Its role is to allow programmers to interact with
+memory that is uninitialized and convert it to some initialized state.
+
+To get this to work, we'll work through several code examples and other
+exercises.
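+
+As a first taste, here is a minimal sketch of the intended workflow: reserve
+uninitialized space, write a value into it, and only then assert that it is
+initialized. The function name `initialized_value` is hypothetical.
+
+```rust
+use std::mem::MaybeUninit;
+
+/// Reserves space for a `u32`, initializes it, then reads it back.
+/// (Hypothetical helper, used only for illustration.)
+fn initialized_value() -> u32 {
+    // No valid `u32` exists yet; `slot` only reserves space for one.
+    let mut slot = MaybeUninit::<u32>::uninit();
+
+    // Writing through `MaybeUninit::write` is safe and initializes the slot.
+    slot.write(7);
+
+    // SAFETY: `slot` was initialized by the `write` call above.
+    unsafe { slot.assume_init() }
+}
+
+fn main() {
+    println!("{}", initialized_value());
+}
+```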
+
+---
+
+```rust,editable
+fn mystery() -> u32 {
+ let mut x: u32;
+
+ unsafe { x }
+}
+
+fn main() {
+ let a = mystery();
+ println!("{a}")
+}
+```
+
+
+
+What is the value of `x`?
+
+**Action:** Pause and wait for people's responses.
+
+We can't know.
+
+This is a case of an _uninitialized_ value. When we define the variable on line
+2, the compiler makes space for an integer on the stack, but it makes no
+guarantee that there is a valid value there.
+
+**Action:** Attempt compilation.
+
+**Action:** Suggested change:
+
+```rust
+use std::mem;
+
+fn mystery() -> u32 {
+ let mut x: u32 = unsafe { mem::MaybeUninit::uninit().assume_init() };
+
+ x
+}
+
+fn main() {
+ let a = mystery();
+ println!("{a}")
+}
+```
+
+Initialization transforms a value's bytes from an undetermined state to
+something that's guaranteed to be valid.
+
+As we've seen from the Boolean case, not every bit pattern is a valid value in
+Rust's `bool` type.
+
+When a value is uninitialized, it's impossible to know what its bytes contain.
+
+Rust requires every value to be _valid_. An important part of validity is
+ensuring that values are initialized before use.
+
+Getting this wrong is so unsafe that you cannot simply use the `unsafe` keyword
+to convince Rust to compile your code.
+
+
+
+---
+
+## Validity
+
+- What is validity?
+- Why is it important?
+
+
+
+This segment of the course describes what that means and why it's important.
+
+Validity is related to other concepts that we've seen before, such as _undefined
+behavior_. Validity is a precondition for well-defined behavior.
+
+
+
+---
+
+## Validity
+
+
+
+
+
+Data types define what it means to be _valid_. For some types, such as integers,
+every bit pattern is a valid value. For many others though, there are some
+patterns which are not.
+
+In Rust, references are not allowed to be NULL and `char` values must be valid
+Unicode scalar values.
+
+Outside of bit patterns, there are also other considerations. For example, many
+types impose rules that extend past which bit patterns are allowed. The way to
+find these rules is to read the documentation. Therefore, we're also going to
+spend time examining docs.
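+
+The standard library's checked constructors show where these validity rules sit.
+The sketch below uses `char::from_u32` and `NonZeroU32::new`, which both reject
+bit patterns that would be invalid for their type. The wrapper names are
+hypothetical.
+
+```rust
+use std::num::NonZeroU32;
+
+/// Returns `None` for values that are not Unicode scalar values.
+/// (Hypothetical wrapper, used only for illustration.)
+fn checked_char(code: u32) -> Option<char> {
+    char::from_u32(code)
+}
+
+/// Returns `None` for zero, the one bit pattern `NonZeroU32` forbids.
+/// (Hypothetical wrapper, used only for illustration.)
+fn checked_non_zero(n: u32) -> Option<NonZeroU32> {
+    NonZeroU32::new(n)
+}
+
+fn main() {
+    // 0x41 is 'A'; 0xD800 is a surrogate, so it is not a valid `char`.
+    println!("{:?}", checked_char(0x41));
+    println!("{:?}", checked_char(0xD800));
+    println!("{:?}", checked_non_zero(0));
+}
+```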
+
+
+
+---
+
+## Why `MaybeUninit`?
+
+```rust,editable
+```
+
+
+
+Rust requires every variable to be initialized before use. More generally,
+compilers assume that all variables are properly initialized.
+
+But for FFI and for creating high performance data structures—sometimes
+referred to as getting stuff done—we need the ability to describe
+uninitialized buffers.
+
+
+
+---
+
+## Why care about initialization?
+
+```rust,editable
+fn create_1mb_buffer() -> Vec<u8> {
+    vec![0; 1_000_000]
+}
+```
+
+
+
+You're probably aware that this code allocates a new block of memory. It also
+has a second phase that is slightly more subtle. After allocation, every byte
+has its bits set to zero.
+
+However, there are cases where this second step is unnecessary. For example, if
+we're using this buffer for I/O, then we're going to overwrite the memory with
+whatever data is provided.
+
+
+
+---
+
+## Case study: selective initialization
+
+```rust
+use std::mem::MaybeUninit;
+
+/// Builds a sparse row where only certain positions have values
+struct ArrayFastBuilder<const N: usize> {
+    data: [MaybeUninit<f64>; N],
+    initialized: [bool; N],
+    count: usize,
+}
+
+impl<const N: usize> ArrayFastBuilder<N> {
+    fn new() -> Self {
+        Self {
+            // SAFETY: an array of `MaybeUninit` needs no initialization.
+            data: unsafe { MaybeUninit::uninit().assume_init() },
+            initialized: [false; N],
+            count: 0,
+        }
+    }
+
+    fn set(&mut self, index: usize, value: f64) -> Result<(), &'static str> {
+        if index >= N {
+            return Err("Index out of bounds");
+        }
+
+        if !self.initialized[index] {
+            self.count += 1;
+        }
+
+        self.data[index] = MaybeUninit::new(value);
+        self.initialized[index] = true;
+        Ok(())
+    }
+
+    fn get(&self, index: usize) -> Option<f64> {
+        if index < N && self.initialized[index] {
+            // SAFETY: `initialized[index]` is only set after `data[index]`.
+            Some(unsafe { self.data[index].assume_init() })
+        } else {
+            None
+        }
+    }
+
+    fn into_array(self, default: f64) -> [f64; N] {
+        let result: [MaybeUninit<f64>; N] = std::array::from_fn(|i| {
+            if self.initialized[i] {
+                self.data[i] // Already initialized
+            } else {
+                MaybeUninit::new(default)
+            }
+        });
+
+        // SAFETY: every element of `result` is initialized, and
+        // `MaybeUninit<f64>` has the same layout as `f64`.
+        unsafe {
+            std::ptr::read(
+                &result as *const [MaybeUninit<f64>; N] as *const [f64; N],
+            )
+        }
+    }
+
+    fn into_sparse_vec(self) -> Vec<(usize, f64)> {
+        let mut result = Vec::with_capacity(self.count);
+
+        for (i, is_init) in self.initialized.iter().enumerate() {
+            if *is_init {
+                // SAFETY: `is_init` confirms that `data[i]` was written.
+                let value = unsafe { self.data[i].assume_init() };
+                result.push((i, value));
+            }
+        }
+
+        result
+    }
+}
+```
+
+
+
+Here is an application of what we just saw. `ArrayFastBuilder` reserves space on
+the stack for the contents, but avoids zeroing that array when it is created.
+
+
+
+---
+
+## What is the contract?
+
+Whenever we're creating unsafe code, we need to consider what the contract is.
+
+What does `assume_init(self)` mean? What do we need to do to guarantee that
+initialization is no longer an assumption?
+
+
+
+What is this code asking of us? What are the expectations that we need to
+satisfy? If we don't know the expectations, where would we find them?
+
+
+
+---
+
+## Layout guarantees
+
+The following program runs successfully for `u64` values. Is that the case for
+all possible types `T`?
+
+```rust,editable
+use std::mem::{align_of, size_of, MaybeUninit};
+
+fn main() {
+    let _u = MaybeUninit::<u64>::uninit();
+
+    assert_eq!(size_of::<MaybeUninit<u64>>(), size_of::<u64>());
+    assert_eq!(align_of::<MaybeUninit<u64>>(), align_of::<u64>());
+}
+```
+
+Look through the documentation for `MaybeUninit` to verify your assumptions.
+
+
+
+Another way to ask this: what guarantees does `MaybeUninit<T>` provide about its
+memory layout?
+
+Here is [the relevant quote][q] from the Layout section of the docs:
+
+> `MaybeUninit<T>` is guaranteed to have the same size, alignment, and ABI as
+> `T`.
+
+[q]: https://doc.rust-lang.org/std/mem/union.MaybeUninit.html#layout-1
+
+
+
+---
+
+## What about safety when panicking?
+
+```rust
+```
+
+
+
+Rust's drop behavior presents a challenge during panics. In situations where
+there are partially initialized values, dropping them causes undefined behavior.
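+
+One common way to handle this is a drop guard that remembers how many elements
+have been initialized, and drops only those if a panic unwinds through the
+initialization loop. The sketch below is one possible shape for that idea. The
+names `Guard` and `fill` are hypothetical, and the array length of 3 is chosen
+arbitrarily.
+
+```rust
+use std::mem::MaybeUninit;
+
+/// Tracks how many leading elements of `slots` are initialized.
+/// (Hypothetical type, used only for illustration.)
+struct Guard<'a, T> {
+    slots: &'a mut [MaybeUninit<T>],
+    initialized: usize,
+}
+
+impl<T> Drop for Guard<'_, T> {
+    fn drop(&mut self) {
+        for slot in &mut self.slots[..self.initialized] {
+            // SAFETY: the first `initialized` slots were written by `fill`.
+            unsafe { slot.assume_init_drop() };
+        }
+    }
+}
+
+/// Builds a `[String; 3]`. `make` may panic part-way through.
+fn fill(mut make: impl FnMut(usize) -> String) -> [String; 3] {
+    let mut slots: [MaybeUninit<String>; 3] =
+        std::array::from_fn(|_| MaybeUninit::uninit());
+
+    let mut guard = Guard { slots: &mut slots, initialized: 0 };
+    for i in 0..3 {
+        // If `make` panics here, `guard` drops slots `0..i` and nothing else.
+        guard.slots[i].write(make(i));
+        guard.initialized += 1;
+    }
+    // Every slot is initialized, so disarm the guard without running `drop`.
+    std::mem::forget(guard);
+
+    // SAFETY: every slot was written in the loop above.
+    slots.map(|slot| unsafe { slot.assume_init() })
+}
+
+fn main() {
+    println!("{:?}", fill(|i| i.to_string()));
+}
+```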
+
+
+
+---
+
+## Questions for review
+
+Where should the safety comment be? What kinds of tests can we perform? One
+option is fuzzing.
+
+---
+
+## Exercise: Vec
+
+Look up the documentation for `assume_init` and describe why this creates
+undefined behavior:
+
+```rust
+use std::mem::MaybeUninit;
+
+fn main() {
+ let x = MaybeUninit::<Vec<u8>>::uninit();
+ let x_ = unsafe { x.assume_init() };
+
+ println!("{x_:?}")
+}
+```
+
+
+
+Many types have additional invariants that need to be upheld. For example,
+`Vec` has a different representation when it's first created with `::new()`
+compared to after its first entry is inserted. It lazily allocates memory and
+there is no allocation involved until space is actually needed.
+
+From the [doc comment of `assume_init()`][docs]:
+
+> It is up to the caller to guarantee that the `MaybeUninit<T>` really is in an
+> initialized state. Calling this when the content is not yet fully initialized
+> causes immediate undefined behavior. The type-level documentation contains
+> more information about this initialization invariant.
+>
+> On top of that, **remember that most types have additional invariants beyond
+> merely being considered initialized at the type level**. For example, a
+> `1`-initialized `Vec<T>` is considered initialized (under the current
+> implementation; this does not constitute a stable guarantee) because the only
+> requirement the compiler knows about it is that the data pointer must be
+> non-null. Creating such a `Vec<T>` does not cause immediate undefined
+> behavior, but will cause undefined behavior with most safe operations
+> (including dropping it).
+>
+> _Emphasis added_
+
+[docs]: https://doc.rust-lang.org/std/mem/union.MaybeUninit.html#method.assume_init
+
+### Extension exercise
+
+Ask the class to think of other types that require special handling:
+
+- `char` outside the range of a Unicode scalar
+ (`[0x0000..=0xD7FF, 0xE000..=0x10FFFF]`)
+- References (NULL is a valid pointer, but not a valid reference)
+- Types backed by `Vec<_>`, including `String`.
+- Pinned types, i.e. `Pin<P>`
+- Non-zero types, i.e. `NonZeroU32`, etc
+
+
+
+---
+
+## MaybeUninit use case: initializing a struct field by field
+
+```rust
+use std::mem::MaybeUninit;
+use std::ptr::addr_of_mut;
+
+#[derive(Debug, PartialEq)]
+pub struct FileFormat {
+    marker: [u8; 4],
+    len: u32,
+    data: Vec<u8>,
+}
+
+fn main() {
+    let rfc = {
+        let mut uninit: MaybeUninit<FileFormat> = MaybeUninit::uninit();
+        let ptr = uninit.as_mut_ptr();
+
+        // SAFETY (all three writes): `ptr` is aligned and valid for writes,
+        // and `addr_of_mut!` avoids creating a reference to uninitialized data.
+        unsafe {
+            addr_of_mut!((*ptr).marker).write([b'R', b'F', b'C', b'1']);
+        }
+
+        unsafe {
+            addr_of_mut!((*ptr).len).write(3);
+        }
+
+        unsafe {
+            addr_of_mut!((*ptr).data).write(vec![0, 1, 2]);
+        }
+
+        // SAFETY: every field has been initialized above.
+        unsafe { uninit.assume_init() }
+    };
+
+    assert_eq!(
+        rfc,
+        FileFormat {
+            marker: *b"RFC1",
+            len: 3,
+            data: vec![0, 1, 2],
+        }
+    );
+}
+```
+
+---
+
+## Use case: partial initialization
+
+```rust,editable
+use std::mem::MaybeUninit;
+
+const SIZE: usize = 10_000_000;
+
+fn with_zeroing() -> Vec<u8> {
+ let mut vec = vec![0u8; SIZE];
+ for i in 0..SIZE {
+ vec[i] = (i % 256) as u8;
+ }
+ vec
+}
+
+fn without_zeroing() -> Vec<u8> {
+ let mut vec = Vec::with_capacity(SIZE);
+ unsafe {
+ let ptr = vec.as_mut_ptr();
+ for i in 0..SIZE {
+ ptr.add(i).write((i % 256) as u8);
+ }
+ vec.set_len(SIZE);
+ }
+ vec
+}
+```
+
+
+
+
diff --git a/src/unsafe-deep-dive/understanding-unsafety/out-of-bounds.md b/src/unsafe-deep-dive/understanding-unsafety/out-of-bounds.md
new file mode 100644
index 000000000000..e757a9ad8bae
--- /dev/null
+++ b/src/unsafe-deep-dive/understanding-unsafety/out-of-bounds.md
@@ -0,0 +1,50 @@
+# Out of bounds access
+
+As application programmers, we often think of memory as a linear block of space
+that we can reserve portions of and later give back. That picture is somewhat of
+an illusion.
+
+---
+
+## A motivating example
+
+```cpp
+int numbers[10] = {};
+
+bool numbers_contains(int n)
+{
+    // Bug: `i <= 10` reads one element past the end of `numbers`
+    for (int i = 0; i <= 10; i++) {
+        if (numbers[i] == n) return true;
+    }
+    return false;
+}
+```
+
+> Derived from the Undefined Behavior chapter from cppreference.com
+>
+
+
+
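+The `numbers` array contains ten zeros, so the function should return `true`
+only when `n` is 0. However, the loop reads one element past the end of the
+array, which is undefined behavior. gcc 13 with `-O2` uses this to optimize the
+code so that it returns `true` in all cases.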
+
+```asm
+numbers_contains(int):
+ mov eax, 1
+ ret
+numbers:
+ .zero 40
+```
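+
+For comparison, here is a rough Safe Rust counterpart. It is a sketch rather
+than a translation of the cppreference example, and the names `NUMBERS`,
+`numbers_contains` and `checked_lookup` are chosen for illustration. Iterators
+never step past the end of the array, and `get` returns `None` for an index that
+is out of bounds.
+
+```rust
+static NUMBERS: [i32; 10] = [0; 10];
+
+/// Returns `true` when `n` is one of the ten stored values.
+fn numbers_contains(n: i32) -> bool {
+    // The iterator yields exactly ten elements, so no index can overrun.
+    NUMBERS.iter().any(|&x| x == n)
+}
+
+/// Looks up a single element, returning `None` when `i` is out of bounds.
+fn checked_lookup(i: usize) -> Option<i32> {
+    NUMBERS.get(i).copied()
+}
+
+fn main() {
+    println!("{}", numbers_contains(0));
+    println!("{}", numbers_contains(7));
+    println!("{:?}", checked_lookup(10));
+}
+```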
+
+
+
+---
+
+Hello there
+
+
+
+More details
+
+
diff --git a/src/unsafe-deep-dive/understanding-unsafety/undefined-behavior.md b/src/unsafe-deep-dive/understanding-unsafety/undefined-behavior.md
new file mode 100644
index 000000000000..e125e014f91b
--- /dev/null
+++ b/src/unsafe-deep-dive/understanding-unsafety/undefined-behavior.md
@@ -0,0 +1,549 @@
+---
+minutes: 30
+---
+
+# Example: Representing Boolean values
+
+> TODO(timclicks): split this content into multiple sub-sections
+
+One of the terms that we introduced earlier was _undefined behavior_. This
+exercise aims to discuss what undefined behavior actually is and how it can
+arise.
+
+High performance code is particularly prone to accidentally introducing
+undefined behavior into a program, because its authors are typically very
+interested in finding ways to cut corners.
+
+---
+
+## What's wrong with undefined behavior?
+
+C++ compilers will typically (*) compile this code without warnings, and the
+program will run without signaling an error:
+
+```cpp
+#include <cassert>
+
+int axiom_increment_is_greater(int x) {
+ return x + 1 > x;
+}
+
+int main() {
+ int a = 2147483647;
+ assert(axiom_increment_is_greater(a));
+}
+```
+
+Equivalent Rust programs produce different output:
+
+```rust,editable
+fn axiom_increment_is_greater(x: i32) -> bool {
+ x + 1 > x
+}
+
+fn main() {
+ let a = 2147483647;
+ assert!(axiom_increment_is_greater(a));
+}
+```
+
+(*) We can't be certain. That's one of the problems.
+
+
+
+We don't want to have undefined behavior in our code, because it makes the code
+_unsound_.
+
+Unsound code can crash abruptly or produce unexpected results, because compilers
+are written with the assumption that undefined behavior does not exist. They
+will create optimizations that could be completely contrary to your
+expectations.
+
+In this example, assume that we're creating some sort of proof assistant that
+makes deductions based on mathematical axioms. One of the axioms that we want to
+encode is that an integer's increment is always greater than the integer itself.
+
+gcc v13.2, clang v16.0.0 and msvc v19.0 [all compile the C++ code to][asm] the
+following assembly when optimizations are enabled (`-O2`):
+
+```asm
+axiom_increment_is_greater(int):
+ mov eax, 1
+ ret
+```
+
+[asm]: https://godbolt.org/z/q4MMY8vxs
+
+That is, the compiler assumes that the function always returns `true`, even
+though one input triggers undefined behavior. When `x` is 2^31-1 (`i32::MAX`)
+and is incremented, the operation produces a number that is outside of the
+range of a 32-bit signed integer.
+
+Integer overflow for signed integers is _undefined_ in C++. In the conventional
+two's complement representation, the increment often wraps to -(2^31)
+(`i32::MIN`).
+
+Rust takes a stricter approach. In debug builds, integer overflow induces a
+panic. In release builds, it wraps by default. Either way, the behavior is
+defined, which allows Safe Rust to be free of undefined behavior.
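+
+When overflow is a real possibility, Rust code can also state its intent
+explicitly. The sketch below uses the standard `checked_add` and `wrapping_add`
+methods on `i32`. The function name `checked_increment_is_greater` is
+hypothetical.
+
+```rust
+/// Returns `None` when `x + 1` would overflow, instead of guessing.
+/// (Hypothetical helper, used only for illustration.)
+fn checked_increment_is_greater(x: i32) -> Option<bool> {
+    x.checked_add(1).map(|next| next > x)
+}
+
+fn main() {
+    println!("{:?}", checked_increment_is_greater(1));
+    println!("{:?}", checked_increment_is_greater(i32::MAX));
+
+    // Wrapping is available too, but only when requested explicitly.
+    println!("{}", i32::MAX.wrapping_add(1) == i32::MIN);
+}
+```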
+
+
+
+---
+
+## Rust keeps undefined behavior out...
+
+...but, unsafe provides a way for it to get back in.
+
+
+
+We are going to work through an example of how undefined behavior can be
+introduced in an attempt to improve performance.
+
+
+
+---
+
+## Booleans
+
+A typical representation:
+
+- 1 => truth/positivity
+- 0 => falsehood/negativity
+
+
+
+## Discussion
+
+### Encoding
+
+Just as integers can have their quirks, so do Boolean data types.
+
+How are the Boolean values `true` and `false` represented by programming
+languages?
+
+Many languages, including Rust and C++, encode Boolean values as an integer,
+where:
+
+- 1 represents truth or positivity
+- 0 represents falsehood or negativity
+
+However, there is an impedance mismatch because even the smallest integer (a
+single byte) can represent many more numbers than the two that are required.
+
+> Aside: Not a universal definition
+>
+> Programming language designers are free to have their own representations, or
+> not include a Boolean type in their language at all.
+>
+> CPUs do not have a Boolean datatype, rather they have Boolean operations that
+> are performed against operands that are typically integers.
+
+As the input space is larger than the output space, this can cause problems.
+Treating any byte other than `0x00` or `0x01` as a `bool` is undefined.
+
+[Rust (following C) imposes the following restrictions][ref-bool] on its `bool`
+type:
+
+> The value `false` has the bit pattern `0x00` and the value `true` has the bit
+> pattern `0x01`. It is _undefined behavior_ for an object with the boolean type
+> to have any other bit pattern. [emphasis added]
+
+[ref-bool]: https://doc.rust-lang.org/reference/types/boolean.html
+
+Depending on one's perspective, this either presents an opportunity or a
+difficulty.
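+
+For reference, a fully safe conversion has to check the byte before producing a
+`bool`. The sketch below shows one way to do that. The function name
+`checked_byte_to_bool` is hypothetical, and the `match` means that this version
+is not zero-cost.
+
+```rust
+/// Converts a byte to `bool`, rejecting every bit pattern except 0x00 and 0x01.
+/// (Hypothetical helper, used only for illustration.)
+fn checked_byte_to_bool(byte: u8) -> Option<bool> {
+    match byte {
+        0x00 => Some(false),
+        0x01 => Some(true),
+        _ => None,
+    }
+}
+
+fn main() {
+    println!("{:?}", checked_byte_to_bool(0x01));
+    println!("{:?}", checked_byte_to_bool(0x7b));
+}
+```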
+
+
+
+---
+
+## Exercise
+
+- Define a type that represents a Boolean value
+- Write a zero-cost conversion function from `u8` to your new type
+- Write a zero-cost conversion function from your new type to `bool`
+
+```rust,editable
+fn byte_to_boolean(byte: u8) -> Boolean {
+ todo!("convert from u8")
+}
+
+fn boolean_to_bool(b: Boolean) -> bool {
+ todo!("convert to Rust's bool")
+}
+```
+
+
+
+Tell the group that they will need to start by defining the `Boolean` type
+that's used in the type signatures themselves. (This is not included in the
+sample code so that the audience is not biased towards using a `struct`.)
+
+This exercise should be completed quite quickly – no more than 3 minutes
+– because we will soon review several examples ourselves.
+
+### Recommended guidance
+
+- User-defined Booleans should occupy a single byte, the same space as `bool`.
+  This precludes using an `enum`.
+- The following function annotations are likely to be needed:
+ - `unsafe` on the `byte_to_boolean` function
+ - `#[inline]`
+- Rust's `Copy` trait involves memcpy and is therefore _not_ zero-cost
+
+> _Aside: Possible upcoming language feature - the TransmuteFrom trait_
+>
+> The standard library contains a nightly feature, [`transmutability`] which
+> defines the [`std::mem::TransmuteFrom`] trait for performing this kind of
+> operation. This is one of the outputs from the [Safe Transmute Project] within
+> the Rust compiler team.
+
+[`std::mem::TransmuteFrom`]: https://doc.rust-lang.org/std/mem/trait.TransmuteFrom.html
+[`transmutability`]: https://github.com/rust-lang/rust/issues/99571
+[Safe Transmute Project]: https://github.com/rust-lang/project-safe-transmute
+
+### Questions to raise
+
+- How would we indicate to callers that they can cause undefined behavior by
+ calling `byte_to_boolean` with invalid inputs?
+ - Safety comments. You could briefly mention safety comments and raise
+ questions about what learners would expect to see if they were reviewing
+ code.
+ - Adding assertions. While not a complete solution, you can suggest that
+ learners add assertions under debug and/or test.
+
+### Partial solution focusing on assertions
+
+```rust
+fn is_valid_bool_repr(byte: u8) -> bool {
+ (byte >> 1) == 0
+}
+
+fn byte_to_boolean(byte: u8) -> Boolean {
+ if cfg!(debug_assertions) || cfg!(test) {
+ assert!(is_valid_bool_repr(byte), "input must be 0x00 or 0x01")
+ }
+
+ todo!("convert from u8")
+}
+```
+
+### Full solution
+
+```rust
+struct Boolean(bool);
+
+fn is_valid_bool_repr(byte: u8) -> bool {
+ (byte >> 1) == 0
+}
+
+/// Create a `Boolean` from a `u8`
+///
+/// ## Safety
+///
+/// This function produces undefined behavior when `byte` is neither 0 nor 1.
+unsafe fn byte_to_boolean(byte: u8) -> Boolean {
+ if cfg!(debug_assertions) || cfg!(test) {
+ assert!(is_valid_bool_repr(byte), "input must be 0x00 or 0x01")
+ }
+
+ // SAFETY: Valid for all valid inputs into this function
+ let b = unsafe { std::mem::transmute(byte) };
+ Boolean(b)
+}
+
+fn boolean_to_bool(b: Boolean) -> bool {
+ b.0
+}
+
+fn main() {
+ let t = 123;
+ let ub = unsafe { byte_to_boolean(t) };
+ if ub.0 {
+ println!(r"¯\_(ツ)_/¯");
+ }
+}
+```
+
+### Picking a data structure
+
+**Newtype wrapping u8**
+
+The orthodox strategy will be to wrap `u8` in a struct:
+
+```rust
+struct Boolean(u8);
+```
+
+This ensures that the representation is the same as `u8`.
+
+**Newtype wrapping bool**
+
+Hopefully, some learners will wrap `bool` as a newtype:
+
+```rust
+struct Boolean(bool);
+```
+
+At first, this may look like a bit of a cheat code for the exercise. It won't
+avoid the need to convert from `u8`, however.
+
+Wrapping `bool` includes the bonus that you can guarantee--in so far as you can
+guarantee Rust's own behavior--that `Boolean` is spec-compliant with `bool`.
+
+It may also look redundant - why bother creating a new type when it doesn't
+perform as a `bool`? Because it gives us complete control over the trait system.
+
+**Union**
+
+An alternative strategy would be to use a `union`:
+
+```rust
+union Byte {
+    raw: u8,
+    rust: bool,
+}
+```
+
+This isn't advised. It means that the value will _never_ be able to be
+considered safe to access. Callers will need to ensure that they comply with the
+rules at every interaction with the type.
+
+**Typestate**
+
+Some advanced programmers may attempt to encode Boolean values as zero-sized
+types in the type system. If you receive questions about this, gently nudge them
+back to including the byte.
+
+```rust
+struct True;
+struct False;
+```
+
+There are a couple of reasons for this. First, zero-sized types do not obey the
+width and alignment requirements of the spec for `bool`. Secondly, they're very
+difficult to work with in practice.
+
+If they wish to make use of the typestate pattern, then a possible alternative
+would be to create independent types that each wrap the byte. This creates an
+ergonomic problem, but that might be justified if you want a follow-on function
+to be callable only with a "true" value.
+
+```rust
+struct True(bool);
+struct False(bool);
+```
+
+
+
+---
+
+## Code reviews
+
+We'll now be critiquing other implementations of the previous exercise.
+
+
+
+The critical point in these reviews is that learners accurately describe the
+contract that callers need to uphold when converting from `u8`. It should be
+well described in a Safety section of the docstring.
+
+
+
+---
+
+## Code review 1
+
+Critique this code and suggest improvements, if any:
+
+```rust,editable
+struct Boolean(u8);
+
+fn byte_to_boolean(b: u8) -> Boolean {
+ Boolean(b)
+}
+
+fn boolean_to_byte(boolean: Boolean) -> u8 {
+ boolean.0
+}
+
+fn boolean_to_bool(boolean: Boolean) -> bool {
+ match boolean.0 {
+ 0 => false,
+ _ => true,
+ }
+}
+```
+
+
+
+Which function should be `unsafe`? It could either be at the "constructor"
+(`byte_to_boolean`) or when the Boolean is converted to a Rust-native `bool`
+(`boolean_to_bool`).
+
+
+
+---
+
+## Code review 2
+
+```rust,editable
+struct Boolean(bool);
+
+fn byte_to_boolean(b: u8) -> Boolean {
+ match b {
+ 0 => Boolean(false),
+ _ => Boolean(true),
+ }
+}
+
+fn boolean_to_byte(boolean: Boolean) -> u8 {
+ boolean.0 as u8
+}
+
+fn boolean_to_bool(boolean: Boolean) -> bool {
+ boolean.0
+}
+```
+
+
+
+In this version, we mask the error. All non-zero inputs are coerced to `true`.
+We store the internal field of the `Boolean` struct as a `bool` to make as much
+use of Rust's type system as possible.
+
+However, this `byte_to_boolean` is not zero-cost. There is still a `match`
+operation that's required.
+
+
+
+---
+
+## Code review 3
+
+```rust,editable
+#[repr(C)]
+union Boolean {
+ raw: u8,
+ rust: bool,
+}
+
+fn byte_to_boolean(b: u8) -> Boolean {
+ Boolean { raw: b }
+}
+
+fn boolean_to_byte(boolean: Boolean) -> u8 {
+ unsafe { boolean.raw }
+}
+
+fn boolean_to_bool(boolean: Boolean) -> bool {
+ unsafe { boolean.rust }
+}
+```
+
+---
+
+## Code review 4
+
+```rust,editable
+struct Boolean(bool);
+
+fn byte_to_boolean(b: u8) -> Boolean {
+ let b: bool = unsafe { std::mem::transmute(b) };
+
+ Boolean(b)
+}
+
+fn boolean_to_byte(boolean: Boolean) -> u8 {
+ boolean.0 as u8
+}
+
+fn boolean_to_bool(boolean: Boolean) -> bool {
+ boolean.0
+}
+```
+
+---
+
+## Scratch Space
+
+> Note: Content following this comment is from a previous revision and is being
+> retained temporarily.
+
+> TODO(timclicks): Review the following content for anything useful that should
+> be retained.
+
+---
+
+Or in Rust syntax:
+
+```rust
+struct Boolean(u8);
+
+const TRUE: Boolean = Boolean(1);
+const FALSE: Boolean = Boolean(0);
+```
+
+>> Instructor Notes
+>
+> We define a type here so that there is no confusion in the type system between
+> `u8` and `Boolean`.
+
+From a theoretical perspective, the two states `true` and `false` can be
+represented by a single bit. However, the smallest integer available is `u8`,
+which has 254 additional states.
+
+This is a similar problem to the mismatch casting from a `i64` to `i32`, but
+there is a significant difference. When converting an integer from a 64-bit type
+to a 32-bit type, there is not enough space in the narrower type for all
+possible input values. They can't all fit. In the case of casting from `u8` to
+`bool`, the number of bits isn't the issue. It's the standard that imposes the
+additional restrictions.
+
+Depending on one's perspective, this either presents an opportunity or a
+challenge.
+
+Moreover, [Rust (following C) imposes the following restrictions][ref-bool] on
+its `bool` type:
+
+> The value `false` has the bit pattern `0x00` and the value `true` has the bit
+> pattern `0x01`. It is _undefined behavior_ for an object with the boolean type
+> to have any other bit pattern. [emphasis added]
+
+Many CPUs don't strictly have a "Boolean type". They have Boolean operations.
+
+- For true, CPUs ask. Does this value match
+
+[ref-bool]: https://doc.rust-lang.org/reference/types/boolean.html
+
+## Exercise
+
+Implement two conversion functions, `byte_to_boolean()` and `boolean_to_byte()`:
+
+```rust
+struct Boolean(u8);
+
+fn byte_to_boolean(b: u8) -> Boolean {
+ todo!();
+}
+
+fn boolean_to_byte(b: Boolean) -> u8 {
+ todo!();
+}
+```
+
+## Discussion
+
+Should this function be marked as unsafe?
+
+```rust
+struct Boolean(u8);
+
+fn byte_to_boolean(b: u8) -> Boolean {
+ match b {
+ 0 => Boolean(0),
+ _ => Boolean(1),
+ }
+}
+```
+
+---