Commit 22d6af4

timClicks and gribozavr authored
Add Unsafe Rust Deep Dive (#2806)
Adds the start of an unsafe deep dive to Comprehensive Rust.

The `unsafe` keyword is easy to type, but hard to master. When used appropriately, it forms a useful and indeed essential part of the Rust programming language.

By the end of this deep dive, you'll know how to work with `unsafe` code, review others' changes that include the `unsafe` keyword, and produce your own.

What you'll learn:

- What the terms undefined behavior, soundness, and safety mean
- Why the `unsafe` keyword exists in the Rust language
- How to write your own code using `unsafe` safely
- How to review `unsafe` code

Here is a tentative outline of a 10h (2 day) treatment:

Day 1: Using and Reviewing Unsafe

- Welcome
- Motivations: explain why the `unsafe` keyword exists
- Foundations: provide background knowledge; what is soundness? what is undefined behavior? what is validity in respect to pointers?
- Mechanics: what a safe `unsafe` block should look like
- Representations and Interoperability: explore how data is laid out in memory and how that can be sent across the wire and/or stored on disk.
- Reviewing unsafe
- Patterns for safer unsafe: Encapsulating unsafe code in safe-to-use abstractions, such as marking a type's constructor as `unsafe` so that invariants only need to be enforced once by the programmer.

Day 2: Deploying Unsafe to Build Abstractions

- Welcome
- Validity in detail: A refresher. Emphasis on the details of the invariants that are being upheld by a “typical” unsafe block, such as aliasing, alignment, data validity, padding.
- Concurrency and thread safety: understanding `Send` and `Sync`, knowing how to implement them on a user-defined type
- Case study: Small string optimization
- Case study: Zero-copy parsing
- Review

---------

Co-authored-by: Dmitri Gribenko <[email protected]>
1 parent 0a485b5 commit 22d6af4

15 files changed: 674 additions and 0 deletions

src/SUMMARY.md

Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -440,6 +440,23 @@
440440

441441
---
442442

443+
# Unsafe
444+
445+
- [Welcome](unsafe-deep-dive/welcome.md)
446+
- [Setup](unsafe-deep-dive/setup.md)
447+
- [Motivations](unsafe-deep-dive/motivations.md)
448+
- [Interoperability](unsafe-deep-dive/motivations/interop.md)
449+
- [Data Structures](unsafe-deep-dive/motivations/data-structures.md)
450+
- [Performance](unsafe-deep-dive/motivations/performance.md)
451+
- [Foundations](unsafe-deep-dive/foundations.md)
452+
- [What is unsafe?](unsafe-deep-dive/foundations/what-is-unsafe.md)
453+
- [When is unsafe used?](unsafe-deep-dive/foundations/when-is-unsafe-used.md)
454+
- [Data structures are safe](unsafe-deep-dive/foundations/data-structures-are-safe.md)
455+
- [Actions might not be](unsafe-deep-dive/foundations/actions-might-not-be.md)
456+
- [Less powerful than it seems](unsafe-deep-dive/foundations/less-powerful.md)
457+
458+
---
459+
443460
# Final Words
444461

445462
- [Thanks!](thanks.md)

src/running-the-course/course-structure.md

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -82,6 +82,15 @@ You should be familiar with the material in
8282

8383
{{%course outline Idiomatic Rust}}
8484

85+
### Unsafe (Work in Progress)
86+
87+
The [Unsafe](../unsafe-deep-dive/welcome.md) deep dive is a two-day class on the
88+
_unsafe_ Rust language. It covers the fundamentals of Rust's safety guarantees,
89+
the motivation for `unsafe`, review process for `unsafe` code, FFI basics, and
90+
building data structures that the borrow checker would normally reject.
91+
92+
{{%course outline Unsafe}}
93+
8594
## Format
8695

8796
The course is meant to be very interactive and we recommend letting the

src/unsafe-deep-dive/Cargo.toml

Whitespace-only changes.

src/unsafe-deep-dive/foundations.md

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,5 @@
1+
# Foundations
2+
3+
Some fundamental concepts and terms.
4+
5+
{{%segment outline}}
Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,19 @@
1+
---
2+
minutes: 2
3+
---
4+
5+
# ... but actions on them might not be
6+
7+
```rust
fn main() {
    let n: i64 = 12345;
    let safe = &n as *const _;
    println!("{safe:p}");
}
```
14+
15+
<details>
16+
17+
Modify the example to dereference `safe` without an `unsafe` block, and observe the compiler error.
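
For reference after the exercise, here is a minimal sketch of a version that does compile. The helper name `read_via_raw_pointer` is hypothetical and only exists for this illustration; the point is that creating the pointer stays safe, while reading through it needs an `unsafe` block.

```rust
/// Reads an `i64` back through a raw pointer derived from a reference.
/// (`read_via_raw_pointer` is a hypothetical helper for this sketch.)
fn read_via_raw_pointer(n: &i64) -> i64 {
    // Creating the raw pointer is an ordinary safe operation.
    let ptr = n as *const i64;

    // Writing `*ptr` outside of an `unsafe` block is rejected by the
    // compiler with error E0133.
    //
    // SAFETY: `ptr` was just created from a live reference, so it is
    // non-null, aligned, and points to an initialized `i64`.
    unsafe { *ptr }
}

fn main() {
    let n: i64 = 12345;
    println!("{}", read_via_raw_pointer(&n));
}
```

The `SAFETY` comment records why the dereference is sound, which is the part the compiler cannot check.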
18+
19+
</details>
Lines changed: 25 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,25 @@
1+
---
2+
minutes: 2
3+
---
4+
5+
# Data structures are safe ...
6+
7+
Data structures are inert. They cannot do any harm by themselves.
8+
9+
Safe Rust code can create raw pointers:
10+
11+
```rust
fn main() {
    let n: i64 = 12345;
    let safe = &raw const n;
    println!("{safe:p}");
}
```
18+
19+
<details>
20+
21+
Consider a raw pointer to an integer, i.e., the value `safe` has the raw pointer
22+
type `*const i64`. Raw pointers can be out-of-bounds, misaligned, or null, but
23+
the `unsafe` keyword is not required when creating them.
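
The note above can be illustrated with a short sketch. It builds a null, a misaligned, and an out-of-bounds pointer without any `unsafe` block. The helper name `questionable_pointers` and the address `1` are assumptions chosen for this illustration only.

```rust
use std::ptr;

/// Returns three raw pointers that would be unsound to dereference.
/// (`questionable_pointers` is a hypothetical helper for this sketch.)
fn questionable_pointers() -> (*const i64, *const i64, *const i64) {
    // A null pointer.
    let null: *const i64 = ptr::null();

    // Address 1 is not a multiple of the alignment of `i64`.
    let misaligned = 1usize as *const i64;

    // One element past the end of a one-element array. The array is also
    // dropped when this function returns, so the pointer dangles as well.
    let array = [12345_i64];
    let out_of_bounds = array.as_ptr().wrapping_add(1);

    (null, misaligned, out_of_bounds)
}

fn main() {
    // No `unsafe` is needed to create, copy, or print these pointers.
    let (null, misaligned, out_of_bounds) = questionable_pointers();
    println!("{null:p} {misaligned:p} {out_of_bounds:p}");
}
```

None of these pointers is ever dereferenced, which is why the program stays within Safe Rust.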
24+
25+
</details>
Lines changed: 52 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,52 @@
1+
---
2+
minutes: 10
3+
---
4+
5+
# Less powerful than it seems
6+
7+
The `unsafe` keyword does not allow you to break Rust.
8+
9+
```rust,ignore
use std::mem::transmute;

let orig = b"RUST";
let n: i32 = unsafe { transmute(orig) };

println!("{n}")
```
17+
18+
<details>
19+
20+
## Suggested outline
21+
22+
- Ask someone to explain what `std::mem::transmute` does
23+
- Discuss why it doesn't compile
24+
- Fix the code
25+
26+
## Expected compiler output
27+
28+
```ignore
   Compiling playground v0.0.1 (/playground)
error[E0512]: cannot transmute between types of different sizes, or dependently-sized types
 --> src/main.rs:5:27
  |
5 |     let n: i32 = unsafe { transmute(orig) };
  |                           ^^^^^^^^^
  |
  = note: source type: `&[u8; 4]` (64 bits)
  = note: target type: `i32` (32 bits)
```
39+
40+
## Suggested change
41+
42+
```diff
- let n: i32 = unsafe { transmute(orig) };
+ let n: i64 = unsafe { transmute(orig) };
```
46+
47+
## Notes on less familiar Rust
48+
49+
- the `b` prefix on a string literal creates a byte string literal, whose type
50+
is a reference to a byte array (`&[u8; 4]` here) rather than a string slice (`&str`)
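
A possible follow-up for the discussion: the size check passes once source and target are both 64 bits wide, but the source value is still a reference, not the four bytes it points to. If the class wants to reinterpret the bytes themselves, one option is to copy the array out of the reference first. The sketch below is an illustration under that assumption, with hypothetical helper names, and is not part of the suggested change.

```rust
/// Interprets four bytes as an `i32` in the platform's native byte order,
/// using the safe standard library conversion.
fn bytes_to_i32(bytes: &[u8; 4]) -> i32 {
    i32::from_ne_bytes(*bytes)
}

/// The same conversion written with `transmute`, for comparison.
fn bytes_to_i32_transmute(bytes: &[u8; 4]) -> i32 {
    // SAFETY: `[u8; 4]` and `i32` have the same size, and every bit
    // pattern is a valid `i32`.
    unsafe { std::mem::transmute::<[u8; 4], i32>(*bytes) }
}

fn main() {
    let orig = b"RUST";
    // The printed number depends on the endianness of the target.
    println!("{}", bytes_to_i32(orig));
    println!("{}", bytes_to_i32_transmute(orig));
}
```

Both helpers return the same value on a given target. The safe `from_ne_bytes` version is usually preferable because the compiler checks everything.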
51+
52+
</details>
Lines changed: 98 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,98 @@
1+
---
2+
minutes: 6
3+
---
4+
5+
# What is &ldquo;unsafety&rdquo;?
6+
7+
Unsafe Rust is a superset of Safe Rust.
8+
9+
Let's create a list of things that are enabled by the `unsafe` keyword.
10+
11+
<details>
12+
13+
## Definitions from authoritative docs:
14+
15+
From the [unsafe keyword's documentation]():
16+
17+
> Code or interfaces whose memory safety cannot be verified by the type system.
18+
>
19+
> ...
20+
>
21+
> Here are the abilities Unsafe Rust has in addition to Safe Rust:
22+
>
23+
> - Dereference raw pointers
24+
> - Implement unsafe traits
25+
> - Call unsafe functions
26+
> - Mutate statics (including external ones)
27+
> - Access fields of unions
28+
29+
From the [reference](https://doc.rust-lang.org/reference/unsafety.html):
30+
31+
> The following language level features cannot be used in the safe subset of
32+
> Rust:
33+
>
34+
> - Dereferencing a raw pointer.
35+
> - Reading or writing a mutable or external static variable.
36+
> - Accessing a field of a union, other than to assign to it.
37+
> - Calling an unsafe function (including an intrinsic or foreign function).
38+
> - Calling a safe function marked with a target_feature from a function that
39+
> does not have a target_feature attribute enabling the same features (see
40+
> attributes.codegen.target_feature.safety-restrictions).
41+
> - Implementing an unsafe trait.
42+
> - Declaring an extern block.
43+
> - Applying an unsafe attribute to an item.
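
Three of the listed abilities can be sketched in a few lines: dereferencing a raw pointer, calling an unsafe function, and reading a union field. The names `IntOrFloat`, `read_u32`, and `float_bits` are hypothetical and chosen only for this illustration.

```rust
/// A union whose two fields share the same four bytes.
union IntOrFloat {
    int: u32,
    float: f32,
}

/// Reads a `u32` through a raw pointer.
///
/// # Safety
///
/// `ptr` must be non-null, aligned, and point to an initialized `u32`.
unsafe fn read_u32(ptr: *const u32) -> u32 {
    // Ability: dereferencing a raw pointer.
    // SAFETY: guaranteed by this function's documented contract.
    unsafe { *ptr }
}

/// Returns the bit pattern of an `f32`.
fn float_bits(value: f32) -> u32 {
    // Initializing a union is safe.
    let u = IntOrFloat { float: value };
    // Ability: reading a field of a union.
    // SAFETY: both fields are four bytes wide and every bit pattern is a
    // valid `u32`.
    unsafe { u.int }
}

fn main() {
    let x = 7_u32;
    // Ability: calling an unsafe function.
    // SAFETY: `&x` is a valid, aligned pointer to an initialized `u32`.
    let via_ptr = unsafe { read_u32(&x) };
    println!("{via_ptr} {:#x}", float_bits(1.0));
}
```

In real code, `f32::to_bits` provides the same result as `float_bits` without any `unsafe`.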
44+
45+
## Group exercise
46+
47+
> You may have a group of learners who are not familiar with each other yet.
48+
> This is a way for you to gather some data about their confidence levels and
49+
> the psychological safety that they're feeling.
50+
51+
### Part 1: Informal definition
52+
53+
> Use this to gauge the confidence level of the group. If they are uncertain,
54+
> then tailor the next section to be more directed.
55+
56+
Ask the class: **By raising your hand, indicate if you would feel comfortable
57+
defining unsafe?**
58+
59+
If anyone's feeling confident, allow them to try to explain.
60+
61+
### Part 2: Evidence gathering
62+
63+
Ask the class to spend 3-5 minutes on the following:
64+
65+
- Find a use of the unsafe keyword. What contract/invariant/pre-condition is
66+
being established or satisfied?
67+
- Write down terms that need to be defined (unsafe, memory safety, soundness,
68+
undefined behavior)
69+
70+
### Part 3: Write a working definition
71+
72+
### Part 4: Remarks
73+
74+
Mention that we'll be reviewing our definition at the end of the day.
75+
76+
## Note: Avoid detailed discussion about precise semantics of memory safety
77+
78+
It's possible that the group will slide into a discussion about the precise
79+
semantics of what memory safety actually is and how to define pointer validity.
80+
This isn't a productive line of discussion. It can undermine confidence in less
81+
experienced learners.
82+
83+
Perhaps refer people who wish to discuss this to the discussion within the
84+
official [documentation for pointer types] (excerpt below) as a place for
85+
further research.
86+
87+
> Many functions in [this module] take raw pointers as arguments and read from
88+
> or write to them. For this to be safe, these pointers must be _valid_ for the
89+
> given access.
90+
>
91+
> ...
92+
>
93+
> The precise rules for validity are not determined yet.
94+
95+
[this module]: https://doc.rust-lang.org/std/ptr/index.html
96+
[documentation for pointer types]: https://doc.rust-lang.org/std/ptr/index.html#safety
97+
98+
</details>
Lines changed: 48 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,48 @@
1+
---
2+
minutes: 2
3+
---
4+
5+
# When is unsafe used?
6+
7+
The unsafe keyword indicates that the programmer is responsible for upholding
8+
Rust's safety guarantees.
9+
10+
The keyword has two roles:
11+
12+
- define pre-conditions that must be satisfied
13+
- assert to the compiler (= promise) that those defined pre-conditions are
14+
satisfied
15+
16+
## Further references
17+
18+
- [The unsafe keyword chapter of the Rust Reference](https://doc.rust-lang.org/reference/unsafe-keyword.html)
19+
20+
<details>
21+
22+
Places where pre-conditions can be defined (Role 1)
23+
24+
- [unsafe functions] (`unsafe fn foo() { ... }`). Example: `get_unchecked`
25+
method on slices, which requires callers to verify that the index is
26+
in-bounds.
27+
- [unsafe traits] (`unsafe trait`). Examples: [`Send`] and [`Sync`] marker traits
28+
in the standard library.
29+
30+
Places where pre-conditions must be satisfied (Role 2)
31+
32+
- unsafe blocks (`unsafe { ... }`)
33+
- implementing unsafe traits (`unsafe impl`)
34+
- accessing external items (`unsafe extern`)
35+
- adding
36+
[unsafe attributes](https://doc.rust-lang.org/reference/attributes.html) to an
37+
item. Examples: [`export_name`], [`link_section`] and [`no_mangle`]. Usage:
38+
`#[unsafe(no_mangle)]`
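
The two roles can be seen side by side in a small sketch. The trait `ZeroValid` and the functions `zeroed`, `get_at`, and `first_or_zero` are hypothetical names invented for this illustration; only `std::mem::zeroed` and `get_unchecked` come from the standard library.

```rust
/// Role 1: an unsafe trait defines a pre-condition for implementors.
///
/// # Safety
///
/// Implementors promise that the all-zero bit pattern is a valid value.
unsafe trait ZeroValid: Sized {}

// Role 2: `unsafe impl` asserts that the pre-condition holds for `u32`.
unsafe impl ZeroValid for u32 {}

/// A safe function that relies on the trait's promise.
fn zeroed<T: ZeroValid>() -> T {
    // Role 2: SAFETY: `T: ZeroValid` guarantees all-zero bits are valid.
    unsafe { std::mem::zeroed() }
}

/// Role 1: an unsafe function defines a pre-condition for callers.
///
/// # Safety
///
/// `index` must be less than `values.len()`.
unsafe fn get_at(values: &[u32], index: usize) -> u32 {
    // SAFETY: the caller guarantees that `index` is in bounds.
    unsafe { *values.get_unchecked(index) }
}

/// A safe wrapper that satisfies the pre-condition before calling.
fn first_or_zero(values: &[u32]) -> u32 {
    if values.is_empty() {
        zeroed()
    } else {
        // Role 2: SAFETY: the slice is non-empty, so index 0 is in bounds.
        unsafe { get_at(values, 0) }
    }
}

fn main() {
    println!("{}", first_or_zero(&[5, 6]));
    println!("{}", first_or_zero(&[]));
}
```

Callers of `first_or_zero` never write `unsafe` themselves, because the wrapper has already checked the pre-condition on their behalf.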
39+
40+
[unsafe functions]: https://doc.rust-lang.org/reference/unsafe-keyword.html#unsafe-functions-unsafe-fn
41+
[unsafe traits]: https://doc.rust-lang.org/reference/unsafe-keyword.html#unsafe-traits-unsafe-trait
42+
[`export_name`]: https://doc.rust-lang.org/reference/abi.html#the-export_name-attribute
43+
[`link_section`]: https://doc.rust-lang.org/reference/abi.html#the-link_section-attribute
44+
[`no_mangle`]: https://doc.rust-lang.org/reference/abi.html#the-no_mangle-attribute
45+
[`Send`]: https://doc.rust-lang.org/std/marker/trait.Send.html
46+
[`Sync`]: https://doc.rust-lang.org/std/marker/trait.Sync.html
47+
48+
</details>

src/unsafe-deep-dive/motivations.md

Lines changed: 24 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,24 @@
1+
---
2+
minutes: 1
3+
---
4+
5+
# Motivations
6+
7+
We know that writing code without the guarantees that Rust provides ...
8+
9+
> “Use-after-free (UAF), integer overflows, and out of bounds (OOB) reads/writes
10+
> comprise 90% of vulnerabilities with OOB being the most common.”
11+
>
12+
> -- **Jeff Vander Stoep and Chong Zhang**, Google.
13+
> "[Queue the Hardening Enhancements](https://security.googleblog.com/2019/05/queue-hardening-enhancements.html)"
14+
15+
... so why is `unsafe` part of the language?
16+
17+
{{%segment outline}}
18+
19+
<details>
20+
21+
The `unsafe` keyword exists because there is no compiler technology available
22+
today that makes it obsolete. Compilers cannot verify everything.
23+
24+
</details>
