|
1 | 1 | ---
|
2 |
| -minutes: 15 |
| 2 | +minutes: 30 |
3 | 3 | ---
|
4 | 4 |
|
5 |
| -## Typestate Pattern |
| 5 | +## Typestate Pattern: Problem |
6 | 6 |
|
7 |
| -Typestate is the practice of encoding a part of the state of the value in its type, preventing incorrect or inapplicable operations from being called on the value. |
| 7 | +How can we ensure that only valid operations are allowed on a value based on its |
| 8 | +current state? |
| 9 | + |
| 10 | +```rust,editable |
| 11 | +use std::fmt::Write as _; |
8 | 12 |
|
9 |
| -```rust |
10 |
| -# use std::fmt::Write; |
11 | 13 | #[derive(Default)]
|
12 |
| -struct Serializer { output: String } |
13 |
| -struct SerializeStruct { serializer: Serializer } |
| 14 | +struct Serializer { |
| 15 | + output: String, |
| 16 | +} |
14 | 17 |
|
15 | 18 | impl Serializer {
|
16 |
| - fn serialize_struct(mut self, name: &str) -> SerializeStruct { |
| 19 | + fn serialize_struct_start(&mut self, name: &str) { |
17 | 20 | let _ = writeln!(&mut self.output, "{name} {{");
|
18 |
| - SerializeStruct { serializer: self } |
19 | 21 | }
|
20 |
| -} |
21 | 22 |
|
22 |
| -impl SerializeStruct { |
23 |
| - fn serialize_field(mut self, key: &str, value: &str) -> Self { |
24 |
| - let _ = writeln!(&mut self.serializer.output, " {key}={value};"); |
25 |
| - self |
| 23 | + fn serialize_struct_field(&mut self, key: &str, value: &str) { |
| 24 | + let _ = writeln!(&mut self.output, " {key}={value};"); |
| 25 | + } |
| 26 | +
|
| 27 | + fn serialize_struct_end(&mut self) { |
| 28 | + self.output.push_str("}\n"); |
26 | 29 | }
|
27 | 30 |
|
28 |
| - fn finish_struct(mut self) -> Serializer { |
29 |
| - self.serializer.output.push_str("}\n"); |
30 |
| - self.serializer |
| 31 | + fn finish(self) -> String { |
| 32 | + self.output |
31 | 33 | }
|
32 | 34 | }
|
33 | 35 |
|
34 | 36 | fn main() {
|
35 |
| - let serializer = Serializer::default() |
36 |
| - .serialize_struct("User") |
37 |
| - .serialize_field("id", "42") |
38 |
| - .serialize_field("name", "Alice") |
39 |
| - .finish_struct(); |
40 |
| - println!("{}", serializer.output); |
| 37 | + let mut serializer = Serializer::default(); |
| 38 | + serializer.serialize_struct_start("User"); |
| 39 | + serializer.serialize_struct_field("id", "42"); |
| 40 | + serializer.serialize_struct_field("name", "Alice"); |
| 41 | +
|
| 42 | + // serializer.serialize_struct_end(); // ← Oops! Forgotten |
| 43 | +
|
| 44 | + println!("{}", serializer.finish()); |
41 | 45 | }
|
42 | 46 | ```
|
43 | 47 |
|
44 | 48 | <details>
|
45 | 49 |
|
46 |
| -- This example is inspired by |
47 |
| - [Serde's `Serializer` trait](https://docs.rs/serde/latest/serde/ser/trait.Serializer.html). |
48 |
| - For a deeper explanation of how Serde models serialization as a state machine, |
49 |
| - see <https://serde.rs/impl-serializer.html>. |
50 |
| - |
51 |
| -- The typestate pattern allows us to model state machines using Rust’s type |
52 |
| - system. In this case, the state machine is a simple serializer. |
53 |
| - |
54 |
| -- The key idea is that at each state in the process, we can only |
55 |
| - do the actions which are valid for that state. Transitions between |
56 |
| - states happen by consuming one value and producing another. |
| 50 | +- This `Serializer` is meant to write a structured value. The expected usage |
| 51 | + follows this sequence: |
57 | 52 |
|
58 | 53 | ```bob
|
59 |
| -+------------+ serialize struct +-----------------+ |
60 |
| -| Serializer +-------------------->| SerializeStruct |<-------+ |
61 |
| -+------------+ +-+-----+---------+ | |
62 |
| - ^ | | | |
63 |
| - | finish struct | | serialize field | |
64 |
| - +-----------------------------+ +------------------+ |
| 54 | +serialize struct start |
| 55 | +-+--------------------- |
| 56 | + | |
| 57 | + +--> serialize struct field |
| 58 | + -+--------------------- |
| 59 | + | |
| 60 | + +--> serialize struct field |
| 61 | + -+--------------------- |
| 62 | + | |
| 63 | + +--> serialize struct end |
65 | 64 | ```
|
66 | 65 |
|
67 |
| -- In the example above: |
68 |
| - |
69 |
| - - Once we begin serializing a struct, the `Serializer` is moved into the |
70 |
| - `SerializeStruct` state. At that point, we no longer have access to the |
71 |
| - original `Serializer`. |
| 66 | +- However, in this example we forgot to call `serialize_struct_end()` before |
| 67 | + `finish()`. As a result, the serialized output is missing its closing `}` |
| 68 | + and is syntactically invalid. |
72 | 69 |
|
73 |
| - - While in the `SerializeStruct` state, we can only call methods related to |
74 |
| - writing fields. We cannot use the same instance to serialize a tuple, list, |
75 |
| - or primitive. Those constructors simply do not exist here. |
| 70 | +- One approach to fix this would be to track internal state manually, and have |
| 71 | + methods like `serialize_struct_field()` or `finish()` return a `Result` that |
| 72 | + is an `Err` when the current state is invalid. |
76 | 73 |
|
77 |
| - - Only after calling `finish_struct` do we get the `Serializer` back. At that |
78 |
| - point, we can inspect the output or start a new serialization session. |
| 74 | +- But this has downsides: |
79 | 75 |
|
80 |
| - - If we forget to call `finish_struct` and drop the `SerializeStruct` instead, |
81 |
| - the original `Serializer` is lost. This ensures that incomplete or invalid |
82 |
| - output can never be observed. |
| 76 | + - It is easy to get wrong as an implementer. Rust’s type system cannot help |
| 77 | + enforce the correctness of our state transitions. |
83 | 78 |
|
84 |
| -- By contrast, if all methods were defined on `Serializer` itself, nothing would |
85 |
| - prevent users from mixing serialization modes or leaving a struct unfinished. |
| 79 | + - It also places an unnecessary burden on the user, who must handle `Result` |
| 80 | + values for misuse that is a bug in the source code, not a runtime condition. |
86 | 81 |
|
87 |
| -- This pattern avoids such misuse by making it **impossible to represent invalid |
88 |
| - transitions**. |
| 82 | +- A better solution is to model the valid state transitions directly in the type |
| 83 | + system. |
89 | 84 |
|
90 |
| -- One downside of typestate modeling is potential code duplication between |
91 |
| - states. In the next section, we will see how to use **generics** to reduce |
92 |
| - duplication while preserving correctness. |
| 85 | + On the next slide, we will apply the **typestate pattern** to enforce correct |
| 86 | + usage at compile time and make invalid states unrepresentable. |
93 | 87 |
|
94 | 88 | </details>
|
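The speaker notes mention tracking internal state manually and returning a `Result` as one possible fix. A minimal sketch of that runtime-checked variant is shown below, to make its downsides concrete. The `State` and `SerializeError` enums, the `state` field, and the two-space field indentation are assumptions introduced for illustration; they are not part of this change.

```rust
use std::fmt::Write as _;

// Hypothetical state tag, tracked by hand at runtime.
#[derive(Debug, Default, PartialEq)]
enum State {
    #[default]
    Idle,
    InStruct,
}

// Hypothetical error type for invalid call sequences.
#[derive(Debug, PartialEq)]
enum SerializeError {
    StructAlreadyOpen,
    NotInStruct,
    UnfinishedStruct,
}

#[derive(Default)]
struct Serializer {
    output: String,
    state: State,
}

impl Serializer {
    fn serialize_struct_start(&mut self, name: &str) -> Result<(), SerializeError> {
        if self.state != State::Idle {
            return Err(SerializeError::StructAlreadyOpen);
        }
        let _ = writeln!(&mut self.output, "{name} {{");
        self.state = State::InStruct;
        Ok(())
    }

    fn serialize_struct_field(&mut self, key: &str, value: &str) -> Result<(), SerializeError> {
        if self.state != State::InStruct {
            return Err(SerializeError::NotInStruct);
        }
        let _ = writeln!(&mut self.output, "  {key}={value};");
        Ok(())
    }

    fn serialize_struct_end(&mut self) -> Result<(), SerializeError> {
        if self.state != State::InStruct {
            return Err(SerializeError::NotInStruct);
        }
        self.output.push_str("}\n");
        // Forgetting this assignment would compile fine: the compiler
        // cannot check hand-written state transitions.
        self.state = State::Idle;
        Ok(())
    }

    fn finish(self) -> Result<String, SerializeError> {
        if self.state != State::Idle {
            return Err(SerializeError::UnfinishedStruct);
        }
        Ok(self.output)
    }
}

fn main() {
    let mut serializer = Serializer::default();
    serializer.serialize_struct_start("User").unwrap();
    serializer.serialize_struct_field("id", "42").unwrap();

    // serialize_struct_end() is forgotten again, but the mistake is
    // only reported when the program runs.
    match serializer.finish() {
        Ok(output) => println!("{output}"),
        Err(err) => println!("error: {err:?}"),
    }
}
```

Every call site now has to handle a `Result`, even though the mistake is a fixed property of the source code, and each transition depends on the implementer updating `state` correctly.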