| 1 | +# Generics as Type Classes |
| 2 | + |
| 3 | +## Description |
| 4 | + |
| 5 | +Rust's type system is designed more like those of functional languages (like |
| 6 | +Haskell) than those of imperative languages (like Java and C++). As a result, |
| 7 | +Rust can turn many kinds of programming problems into "static typing" problems. |
| 8 | +This is one of the biggest wins of choosing a functional language, and is |
| 9 | +critical to many of Rust's compile-time guarantees. |
| 10 | + |
| 11 | +A key part of this idea is the way generic types work. In C++ and Java, for |
| 12 | +example, generic types are a meta-programming construct for the compiler. |
| 13 | +`vector<int>` and `vector<char>` in C++ are just two different copies of the |
| 14 | +same boilerplate code for a `vector` type (known as a `template`) with two |
| 15 | +different types filled in. |
| 16 | + |
| 17 | +In Rust, a generic type parameter creates what is known in functional languages |
| 18 | +as a "type class constraint", and each different parameter filled in by an end |
| 19 | +user *actually changes the type*. In other words, `Vec<isize>` and `Vec<char>` |
| 20 | +*are two different types*, which are recognized as distinct by all parts of the |
| 21 | +type system. |
| 22 | + |
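
As a minimal sketch of the point above, the following checks that `Vec<isize>` and `Vec<char>` have different type identities. The function name `count_chars` is a hypothetical helper invented for this illustration.

```rust
use std::any::TypeId;

// Hypothetical helper: accepts only `Vec<char>`.
// Passing a `Vec<isize>` is rejected at compile time.
fn count_chars(v: Vec<char>) -> usize {
    v.len()
}

fn main() {
    let numbers: Vec<isize> = vec![1, 2, 3];
    let letters: Vec<char> = vec!['a', 'b'];

    // The two instantiations are distinct types as far as the type system is concerned.
    assert_ne!(TypeId::of::<Vec<isize>>(), TypeId::of::<Vec<char>>());

    assert_eq!(numbers.len(), 3);
    assert_eq!(count_chars(letters), 2);
    // count_chars(numbers); // error[E0308]: mismatched types
}
```

Uncommenting the last line should make the program fail to compile, which is the behavior the paragraph above describes.
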
| 23 | +This is called **monomorphization**, where different types are created from |
| 24 | +**polymorphic** code. This special behavior requires `impl` blocks to specify |
| 25 | +generic parameters: different values for the generic type cause different types, |
| 26 | +and different types can have different `impl` blocks. |
| 27 | + |
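
A small sketch of what "different `impl` blocks for different types" can look like. The `Labeled` type and its methods are hypothetical and exist only for this illustration.

```rust
// Hypothetical wrapper type, used only for illustration.
struct Labeled<T> {
    value: T,
}

// Methods available for every `T`.
impl<T> Labeled<T> {
    fn new(value: T) -> Self {
        Labeled { value }
    }

    fn get(&self) -> &T {
        &self.value
    }
}

// Methods available only when `T` is `String`.
impl Labeled<String> {
    fn shout(&self) -> String {
        self.value.to_uppercase()
    }
}

fn main() {
    let text = Labeled::new(String::from("hello"));
    let number = Labeled::new(42_u32);

    assert_eq!(text.get().as_str(), "hello");
    assert_eq!(*number.get(), 42);
    assert_eq!(text.shout(), "HELLO");
    // number.shout(); // error[E0599]: no method named `shout` found
}
```

Here `Labeled<String>` and `Labeled<u32>` share `new` and `get`, but only the former has `shout`.
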
| 28 | +In object-oriented languages, classes can inherit behavior from their parents. |
| 29 | +Rust's generics go further: behavior can be attached not only to a type class |
| 30 | +as a whole, but also to particular members of that type class. |
| 31 | + |
| 32 | +The nearest equivalent is the runtime polymorphism in JavaScript and Python, |
| 33 | +where new members can be added to objects willy-nilly by any constructor. |
| 34 | +Unlike those languages, however, all of Rust's additional methods can be type |
| 35 | +checked when they are used, because their generics are statically defined. That |
| 36 | +makes them more usable while remaining safe. |
| 37 | + |
| 38 | +## Example |
| 39 | + |
| 40 | +Suppose you are designing a storage server for a series of lab machines. |
| 41 | +Because of the software involved, there are two different protocols you need |
| 42 | +to support: BOOTP (for PXE network boot), and NFS (for remote mount storage). |
| 43 | + |
| 44 | +Your goal is to have one program, written in Rust, which can handle both of |
| 45 | +them. It will have protocol handlers and listen for both kinds of requests. The |
| 46 | +main application logic will then allow a lab administrator to configure storage |
| 47 | +and security controls for the actual files. |
| 48 | + |
| 49 | +The requests from machines in the lab for files contain the same basic |
| 50 | +information, no matter what protocol they came from: an authentication method, |
| 51 | +and a file name to retrieve. A straightforward implementation would look |
| 52 | +something like this: |
| 53 | + |
| 54 | +```rust,ignore |
| 55 | +
| 56 | +enum AuthInfo { |
| 57 | + Nfs(crate::nfs::AuthInfo), |
| 58 | + Bootp(crate::bootp::AuthInfo), |
| 59 | +} |
| 60 | +
| 61 | +struct FileDownloadRequest { |
| 62 | + file_name: PathBuf, |
| 63 | + authentication: AuthInfo, |
| 64 | +} |
| 65 | +``` |
| 66 | + |
| 67 | +This design might work well enough. But now suppose you needed to support |
| 68 | +adding metadata that was *protocol specific*. For example, with NFS, you |
| 69 | +wanted to determine what the client's mount point was in order to enforce |
| 70 | +additional security rules. |
| 71 | + |
| 72 | +The way the current struct is designed leaves the protocol decision until |
| 73 | +runtime. That means any method that applies to one protocol and not the other |
| 74 | +requires the programmer to do a runtime check. |
| 75 | + |
| 76 | +Here is how getting an NFS mount point would look: |
| 77 | + |
| 78 | +```rust,ignore |
| 79 | +struct FileDownloadRequest { |
| 80 | + file_name: PathBuf, |
| 81 | + authentication: AuthInfo, |
| 82 | + mount_point: Option<PathBuf>, |
| 83 | +} |
| 84 | +
| 85 | +impl FileDownloadRequest { |
| 86 | + // ... other methods ... |
| 87 | +
| 88 | + /// Gets an NFS mount point if this is an NFS request. Otherwise, |
| 89 | + /// return None. |
| 90 | + pub fn mount_point(&self) -> Option<&Path> { |
| 91 | + self.mount_point.as_deref()
| 92 | + } |
| 93 | +} |
| 94 | +``` |
| 95 | + |
| 96 | +Every caller of `mount_point()` must check for `None` and write code to handle |
| 97 | +it. This is true even if they know only NFS requests are ever used in a given |
| 98 | +code path! |
| 99 | + |
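
To make the cost concrete, here is a sketch of a caller working against the `Option`-based design. The struct is a simplified stand-in (authentication omitted), and `is_allowed` with its `/secure` rule is a hypothetical example, not part of the original API.

```rust
use std::path::{Path, PathBuf};

// Simplified stand-in for the request type; authentication is omitted.
struct FileDownloadRequest {
    file_name: PathBuf,
    mount_point: Option<PathBuf>,
}

impl FileDownloadRequest {
    fn mount_point(&self) -> Option<&Path> {
        self.mount_point.as_deref()
    }
}

// Hypothetical security rule: refuse anything mounted under `/secure`.
fn is_allowed(request: &FileDownloadRequest) -> bool {
    match request.mount_point() {
        Some(mount) => !mount.starts_with("/secure"),
        // This arm must exist even on a code path that only ever sees
        // NFS requests, and the author has to pick some fallback behavior.
        None => false,
    }
}

fn main() {
    let nfs = FileDownloadRequest {
        file_name: PathBuf::from("data.img"),
        mount_point: Some(PathBuf::from("/srv/lab")),
    };
    let bootp = FileDownloadRequest {
        file_name: PathBuf::from("pxelinux.0"),
        mount_point: None,
    };

    println!("checking {}", nfs.file_name.display());
    assert!(is_allowed(&nfs));
    assert!(!is_allowed(&bootp));
}
```
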
| 100 | +It would be far better to cause a compile-time error if the different |
| 101 | +request types were confused. After all, the entire path of the user's code, |
| 102 | +including what functions from the library they use, will know whether a request |
| 103 | +is an NFS request or a BOOTP request. |
| 104 | + |
| 105 | +In Rust, this is actually possible! The solution is to *add a generic type* in |
| 106 | +order to split the API. |
| 107 | + |
| 108 | +Here is what that looks like: |
| 109 | + |
| 110 | +```rust |
| 111 | +use std::path::{Path, PathBuf}; |
| 112 | + |
| 113 | +mod nfs { |
| 114 | + #[derive(Clone)] |
| 115 | + pub(crate) struct AuthInfo(String); // NFS session management omitted |
| 116 | +} |
| 117 | + |
| 118 | +mod bootp { |
| 119 | + pub(crate) struct AuthInfo(); // no authentication in bootp |
| 120 | +} |
| 121 | + |
| 122 | +// private module, lest outside users invent their own protocol kinds! |
| 123 | +mod proto_trait { |
| 124 | + use std::path::{Path, PathBuf}; |
| 125 | + use super::{bootp, nfs}; |
| 126 | + |
| 127 | + pub(crate) trait ProtoKind { |
| 128 | + type AuthInfo; |
| 129 | + fn auth_info(&self) -> Self::AuthInfo; |
| 130 | + } |
| 131 | + |
| 132 | + pub struct Nfs { |
| 133 | + auth: nfs::AuthInfo, |
| 134 | + mount_point: PathBuf, |
| 135 | + } |
| 136 | + |
| 137 | + impl Nfs { |
| 138 | + pub(crate) fn mount_point(&self) -> &Path { |
| 139 | + &self.mount_point |
| 140 | + } |
| 141 | + } |
| 142 | + |
| 143 | + impl ProtoKind for Nfs { |
| 144 | + type AuthInfo = nfs::AuthInfo; |
| 145 | + fn auth_info(&self) -> Self::AuthInfo { |
| 146 | + self.auth.clone() |
| 147 | + } |
| 148 | + } |
| 149 | + |
| 150 | + pub struct Bootp(); // no additional metadata |
| 151 | + |
| 152 | + impl ProtoKind for Bootp { |
| 153 | + type AuthInfo = bootp::AuthInfo; |
| 154 | + fn auth_info(&self) -> Self::AuthInfo { |
| 155 | + bootp::AuthInfo() |
| 156 | + } |
| 157 | + } |
| 158 | +} |
| 159 | + |
| 160 | +use proto_trait::ProtoKind; // keep internal to prevent impls |
| 161 | +pub use proto_trait::{Nfs, Bootp}; // re-export so callers can see them |
| 162 | + |
| 163 | +struct FileDownloadRequest<P: ProtoKind> { |
| 164 | + file_name: PathBuf, |
| 165 | + protocol: P, |
| 166 | +} |
| 167 | + |
| 168 | +// all common API parts go into a generic impl block |
| 169 | +impl<P: ProtoKind> FileDownloadRequest<P> { |
| 170 | + fn file_path(&self) -> &Path { |
| 171 | + &self.file_name |
| 172 | + } |
| 173 | + |
| 174 | + fn auth_info(&self) -> P::AuthInfo { |
| 175 | + self.protocol.auth_info() |
| 176 | + } |
| 177 | +} |
| 178 | + |
| 179 | +// all protocol-specific impls go into their own block |
| 180 | +impl FileDownloadRequest<Nfs> { |
| 181 | + fn mount_point(&self) -> &Path { |
| 182 | + self.protocol.mount_point() |
| 183 | + } |
| 184 | +} |
| 185 | + |
| 186 | +fn main() { |
| 187 | + // your code here |
| 188 | +} |
| 189 | +``` |
| 190 | + |
| 191 | +With this approach, if the user were to make a mistake and use the wrong |
| 192 | +type: |
| 193 | + |
| 194 | +```rust,ignore |
| 195 | +fn main() { |
| 196 | + let mut socket = crate::bootp::listen()?; |
| 197 | + while let Some(request) = socket.next_request()? { |
| 198 | + match request.mount_point().as_ref() {
| 199 | + "/secure" => socket.send("Access denied"), |
| 200 | + _ => {} // continue on... |
| 201 | + } |
| 202 | + // Rest of the code here |
| 203 | + } |
| 204 | +} |
| 205 | +``` |
| 206 | + |
| 207 | +They would get a compile error. The type `FileDownloadRequest<Bootp>` does not |
| 208 | +implement `mount_point()`, only the type `FileDownloadRequest<Nfs>` does. And |
| 209 | +that is created by the NFS module, not the BOOTP module of course! |
| 210 | + |
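
The following is a condensed, runnable sketch of the same idea. It is not the full design above: the `ProtoKind` bound and authentication are omitted for brevity, the requests are built directly from struct literals, and `is_denied` is a hypothetical rule invented for this illustration.

```rust
use std::path::{Path, PathBuf};

// Simplified stand-ins for the protocol types.
struct Nfs {
    mount_point: PathBuf,
}
struct Bootp;

struct FileDownloadRequest<P> {
    file_name: PathBuf,
    protocol: P,
}

// Common API, available for every protocol.
impl<P> FileDownloadRequest<P> {
    fn file_path(&self) -> &Path {
        &self.file_name
    }
}

// NFS-only API.
impl FileDownloadRequest<Nfs> {
    fn mount_point(&self) -> &Path {
        &self.protocol.mount_point
    }
}

// Hypothetical rule that only makes sense for NFS requests.
fn is_denied(request: &FileDownloadRequest<Nfs>) -> bool {
    request.mount_point().starts_with("/secure")
}

fn main() {
    let nfs = FileDownloadRequest {
        file_name: PathBuf::from("data.img"),
        protocol: Nfs { mount_point: PathBuf::from("/secure/lab") },
    };
    let bootp = FileDownloadRequest {
        file_name: PathBuf::from("pxelinux.0"),
        protocol: Bootp,
    };

    assert!(is_denied(&nfs));
    assert_eq!(bootp.file_path(), Path::new("pxelinux.0"));
    // bootp.mount_point(); // error[E0599]: no method named `mount_point` found
    // is_denied(&bootp);   // error[E0308]: mismatched types
}
```

Note that no `Option` and no fallback branch appear anywhere: code that handles NFS requests simply cannot be handed a BOOTP request.
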
| 211 | +## Advantages |
| 212 | + |
| 213 | +First, it allows fields that are common to multiple states to be de-duplicated. |
| 214 | +By making the non-shared fields generic, they are implemented once. |
| 215 | + |
| 216 | +Second, it makes the `impl` blocks easier to read, because they are broken down |
| 217 | +by state. Methods common to all states are typed once in one block, and methods |
| 218 | +unique to one state are in a separate block. |
| 219 | + |
| 220 | +Both of these mean there are fewer lines of code, and they are better organized. |
| 221 | + |
| 222 | +## Disadvantages |
| 223 | + |
| 224 | +This currently increases the size of the binary, due to the way monomorphization |
| 225 | +is implemented in the compiler. Hopefully the implementation will be able to |
| 226 | +improve in the future. |
| 227 | + |
| 228 | +## Alternatives |
| 229 | + |
| 230 | +* If a type seems to need a "split API" due to construction or partial |
| 231 | +initialization, consider the |
| 232 | +[Builder Pattern](../patterns/creational/builder.md) instead. |
| 233 | + |
| 234 | +* If the API between types does not change -- only the behavior does -- then |
| 235 | +the [Strategy Pattern](../patterns/behavioural/strategy.md) is better used |
| 236 | +instead. |
| 237 | + |
| 238 | +## See also |
| 239 | + |
| 240 | +This pattern is used throughout the standard library: |
| 241 | + |
| 242 | +* `Vec<u8>` can be converted from a `String` or a `CString`, unlike every other type of `Vec<T>`.[^1] |
| 243 | +* A `Vec<T>` can also be converted into a binary heap, but only if it contains |
| 244 | + a type that implements the `Ord` trait.[^2] |
| 245 | +* The `to_string` method is specialized only for `Cow<'_, str>`.[^3] |
| 246 | + |
| 247 | +It is also used by several popular crates to allow API flexibility: |
| 248 | + |
| 249 | +* The `embedded-hal` ecosystem used for embedded devices makes extensive use of |
| 250 | + this pattern. For example, it allows statically verifying the configuration of |
| 251 | + device registers used to control embedded pins. When a pin is put into a mode, |
| 252 | + it returns a `Pin<MODE>` struct, whose generic determines the functions |
| 253 | + usable in that mode, which are not on the `Pin` itself. [^4] |
| 254 | + |
| 255 | +* The `hyper` HTTP client library uses this to expose rich APIs for different |
| 256 | + pluggable requests. Clients with different connectors have different methods |
| 257 | + on them as well as different trait implementations, while a core set of |
| 258 | + methods apply to any connector. [^5] |
| 259 | + |
| 260 | +* The "type state" pattern -- where an object gains and loses API based on an |
| 261 | + internal state or invariant -- is implemented in Rust using the same basic |
| 262 | + concept, and a slightly different technique. [^6] |
| 263 | + |
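
As a rough sketch of the "type state" idea mentioned in the last bullet, the example below uses marker types to decide which methods exist at each stage. All names here (`Connection`, `Closed`, `Open`, `send`) are hypothetical and not taken from any of the crates listed above.

```rust
use std::marker::PhantomData;

// Hypothetical state markers; they carry no data.
struct Closed;
struct Open;

struct Connection<State> {
    address: String,
    _state: PhantomData<State>,
}

impl Connection<Closed> {
    fn new(address: &str) -> Self {
        Connection { address: address.to_string(), _state: PhantomData }
    }

    // Consumes the closed connection and returns an open one.
    fn open(self) -> Connection<Open> {
        Connection { address: self.address, _state: PhantomData }
    }
}

impl Connection<Open> {
    // Only an open connection can send.
    fn send(&self, message: &str) -> String {
        format!("{} <- {}", self.address, message)
    }

    // Consumes the open connection and returns a closed one.
    fn close(self) -> Connection<Closed> {
        Connection { address: self.address, _state: PhantomData }
    }
}

fn main() {
    let conn = Connection::new("10.0.0.1");
    // conn.send("hi"); // error[E0599]: no method named `send` found
    let conn = conn.open();
    assert_eq!(conn.send("hi"), "10.0.0.1 <- hi");
    let _closed = conn.close();
}
```

The difference from the main example is that the generic parameter changes over the lifetime of one logical object, because each transition consumes `self` and returns a value of a different type.
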
| 264 | +[^1]: See: [impl From\<CString\> for Vec\<u8\>]( |
| 265 | +https://doc.rust-lang.org/stable/src/std/ffi/c_str.rs.html#799-801) |
| 266 | + |
| 267 | +[^2]: See: [impl\<T\> From\<Vec\<T, Global\>\> for BinaryHeap\<T\>]( |
| 268 | +https://doc.rust-lang.org/stable/src/alloc/collections/binary_heap.rs.html#1345-1354) |
| 269 | + |
| 270 | +[^3]: See: [impl\<'_\> ToString for Cow\<'_, str\>]( |
| 271 | +https://doc.rust-lang.org/stable/src/alloc/string.rs.html#2235-2240) |
| 272 | + |
| 273 | +[^4]: Example: |
| 274 | +[https://docs.rs/stm32f30x-hal/0.1.0/stm32f30x_hal/gpio/gpioa/struct.PA0.html]( |
| 275 | +https://docs.rs/stm32f30x-hal/0.1.0/stm32f30x_hal/gpio/gpioa/struct.PA0.html) |
| 276 | + |
| 277 | +[^5]: See: |
| 278 | +[https://docs.rs/hyper/0.14.5/hyper/client/struct.Client.html]( |
| 279 | +https://docs.rs/hyper/0.14.5/hyper/client/struct.Client.html) |
| 280 | + |
| 281 | +[^6]: See: |
| 282 | +[The Case for the Type State Pattern]( |
| 283 | +https://web.archive.org/web/20210325065112/https://www.novatec-gmbh.de/en/blog/the-case-for-the-typestate-pattern-the-typestate-pattern-itself/) |
| 284 | +and |
| 285 | +[Rusty Typestate Series (an extensive thesis)]( |
| 286 | +https://web.archive.org/web/20210328164854/https://rustype.github.io/notes/notes/rust-typestate-series/rust-typestate-index) |