diff --git a/SUMMARY.md b/SUMMARY.md index 41191739..abce59cb 100644 --- a/SUMMARY.md +++ b/SUMMARY.md @@ -46,6 +46,7 @@ - [Functional Programming](./functional/index.md) - [Programming paradigms](./functional/paradigms.md) + - [Generics as Type Classes](./functional/generics-type-classes.md) - [Additional Resources](./additional_resources/index.md) - [Design principles](./additional_resources/design-principles.md) diff --git a/functional/generics-type-classes.md b/functional/generics-type-classes.md new file mode 100644 index 00000000..9999e921 --- /dev/null +++ b/functional/generics-type-classes.md @@ -0,0 +1,286 @@ +# Generics as Type Classes + +## Description + +Rust's type system is designed more like functional languages (like Haskell) +rather than imperative languages (like Java and C++). As a result, Rust can turn +many kinds of programming problems into "static typing" problems. This is one +of the biggest wins of choosing a functional language, and is critical to many +of Rust's compile time guarantees. + +A key part of this idea is the way generic types work. In C++ and Java, for +example, generic types are a meta-programming construct for the compiler. +`vector` and `vector` in C++ are just two different copies of the +same boilerplate code for a `vector` type (known as a `template`) with two +different types filled in. + +In Rust, a generic type parameter creates what is known in functional languages +as a "type class constraint", and each different parameter filled in by an end +user *actually changes the type*. In other words, `Vec` and `Vec` +*are two different types*, which are recognized as distinct by all parts of the +type system. + +This is called **monomorphization**, where different types are created from +**polymorphic** code. This special behavior requires `impl` blocks to specify +generic parameters: different values for the generic type cause different types, +and different types can have different `impl` blocks. + +In object oriented languages, classes can inherit behavior from their parents. +However, this allows the attachment of not only additional behavior to +particular members of a type class, but extra behavior as well. + +The nearest equivalent is the runtime polymorphism in Javascript and Python, +where new members can be added to objects willy-nilly by any constructor. +Unlike those languages, however, all of Rust's additional methods can be type +checked when they are used, because their generics are statically defined. That +makes them more usable while remaining safe. + +## Example + +Suppose you are designing a storage server for a series of lab machines. +Because of the software involved, there are two different protocols you need +to support: BOOTP (for PXE network boot), and NFS (for remote mount storage). + +Your goal is to have one program, written in Rust, which can handle both of +them. It will have protocol handlers and listen for both kinds of requests. The +main application logic will then allow a lab administrator to configure storage +and security controls for the actual files. + +The requests from machines in the lab for files contain the same basic +information, no matter what protocol they came from: an authentication method, +and a file name to retrieve. A straightforward implementation would look +something like this: + +```rust,ignore + +enum AuthInfo { + Nfs(crate::nfs::AuthInfo), + Bootp(crate::bootp::AuthInfo), +} + +struct FileDownloadRequest { + file_name: PathBuf, + authentication: AuthInfo, +} +``` + +This design might work well enough. But now suppose you needed to support +adding metadata that was *protocol specific*. For example, with NFS, you +wanted to determine what their mount point was in order to enforce additional +security rules. + +The way the current struct is designed leaves the protocol decision until +runtime. That means any method that applies to one protocol and not the other +requires the programmer to do a runtime check. + +Here is how getting an NFS mount point would look: + +```rust,ignore +struct FileDownloadRequest { + file_name: PathBuf, + authentication: AuthInfo, + mount_point: Option, +} + +impl FileDownloadRequest { + // ... other methods ... + + /// Gets an NFS mount point if this is an NFS request. Otherwise, + /// return None. + pub fn mount_point(&self) -> Option<&Path> { + self.mount_point.as_ref() + } +} +``` + +Every caller of `mount_point()` must check for `None` and write code to handle +it. This is true even if they know only NFS requests are ever used in a given +code path! + +It would be far more optimal to cause a compile-time error if the different +request types were confused. After all, the entire path of the user's code, +including what functions from the library they use, will know whether a request +is an NFS request or a BOOTP request. + +In Rust, this is actually possible! The solution is to *add a generic type* in +order to split the API. + +Here is what that looks like: + +```rust +use std::path::{Path, PathBuf}; + +mod nfs { + #[derive(Clone)] + pub(crate) struct AuthInfo(String); // NFS session management omitted +} + +mod bootp { + pub(crate) struct AuthInfo(); // no authentication in bootp +} + +// private module, lest outside users invent their own protocol kinds! +mod proto_trait { + use std::path::{Path, PathBuf}; + use super::{bootp, nfs}; + + pub(crate) trait ProtoKind { + type AuthInfo; + fn auth_info(&self) -> Self::AuthInfo; + } + + pub struct Nfs { + auth: nfs::AuthInfo, + mount_point: PathBuf, + } + + impl Nfs { + pub(crate) fn mount_point(&self) -> &Path { + &self.mount_point + } + } + + impl ProtoKind for Nfs { + type AuthInfo = nfs::AuthInfo; + fn auth_info(&self) -> Self::AuthInfo { + self.auth.clone() + } + } + + pub struct Bootp(); // no additional metadata + + impl ProtoKind for Bootp { + type AuthInfo = bootp::AuthInfo; + fn auth_info(&self) -> Self::AuthInfo { + bootp::AuthInfo() + } + } +} + +use proto_trait::ProtoKind; // keep internal to prevent impls +pub use proto_trait::{Nfs, Bootp}; // re-export so callers can see them + +struct FileDownloadRequest { + file_name: PathBuf, + protocol: P, +} + +// all common API parts go into a generic impl block +impl FileDownloadRequest

{ + fn file_path(&self) -> &Path { + &self.file_name + } + + fn auth_info(&self) -> P::AuthInfo { + self.protocol.auth_info() + } +} + +// all protocol-specific impls go into their own block +impl FileDownloadRequest { + fn mount_point(&self) -> &Path { + self.protocol.mount_point() + } +} + +fn main() { + // your code here +} +``` + +With this approach, if the user were to make a mistake and use the wrong +type; + +```rust,ignore +fn main() { + let mut socket = crate::bootp::listen()?; + while let Some(request) = socket.next_request()? { + match request.mount_point().as_ref() + "/secure" => socket.send("Access denied"), + _ => {} // continue on... + } + // Rest of the code here + } +} +``` + +They would get a syntax error. The type `FileDownloadRequest` does not +implement `mount_point()`, only the type `FileDownloadRequest` does. And +that is created by the NFS module, not the BOOTP module of course! + +## Advantages + +First, it allows fields that are common to multiple states to be de-duplicated. +By making the non-shared fields generic, they are implemented once. + +Second, it makes the `impl` blocks easier to read, because they are broken down +by state. Methods common to all states are typed once in one block, and methods +unique to one state are in a separate block. + +Both of these mean there are fewer lines of code, and they are better organized. + +## Disadvantages + +This currently increases the size of the binary, due to the way monomorphization +is implemented in the compiler. Hopefully the implementation will be able to +improve in the future. + +## Alternatives + +* If a type seems to need a "split API" due to construction or partial +initialization, consider the +[Builder Pattern](../patterns/creational/builder.md) instead. + +* If the API between types does not change -- only the behavior does -- then +the [Strategy Pattern](../patterns/behavioural/strategy.md) is better used +instead. + +## See also + +This pattern is used throughout the standard library: + +* `Vec` can be cast from a String, unlike every other type of `Vec`.[^1] +* They can also be cast into a binary heap, but only if they contain a type + that implements the `Ord` trait.[^2] +* The `to_string` method was specialized for `Cow` only of type `str`.[^3] + +It is also used by several popular crates to allow API flexibility: + +* The `embedded-hal` ecosystem used for embedded devices makes extensive use of + this pattern. For example, it allows statically verifying the configuration of + device registers used to control embedded pins. When a pin is put into a mode, + it returns a `Pin` struct, whose generic determines the functions + usable in that mode, which are not on the `Pin` itself. [^4] + +* The `hyper` HTTP client library uses this to expose rich APIs for different + pluggable requests. Clients with different connectors have different methods + on them as well as different trait implementations, while a core set of + methods apply to any connector. [^5] + +* The "type state" pattern -- where an object gains and loses API based on an + internal state or invariant -- is implemented in Rust using the same basic + concept, and a slightly different techinque. [^6] + +[^1]: See: [impl From\ for Vec\]( +https://doc.rust-lang.org/stable/src/std/ffi/c_str.rs.html#799-801) + +[^2]: See: [impl\ From\\> for BinaryHeap\]( +https://doc.rust-lang.org/stable/src/alloc/collections/binary_heap.rs.html#1345-1354) + +[^3]: See: [impl\<'_\> ToString for Cow\<'_, str>]( +https://doc.rust-lang.org/stable/src/alloc/string.rs.html#2235-2240) + +[^4]: Example: +[https://docs.rs/stm32f30x-hal/0.1.0/stm32f30x_hal/gpio/gpioa/struct.PA0.html]( +https://docs.rs/stm32f30x-hal/0.1.0/stm32f30x_hal/gpio/gpioa/struct.PA0.html) + +[^5]: See: +[https://docs.rs/hyper/0.14.5/hyper/client/struct.Client.html]( +https://docs.rs/hyper/0.14.5/hyper/client/struct.Client.html) + +[^6]: See: +[The Case for the Type State Pattern]( +https://web.archive.org/web/20210325065112/https://www.novatec-gmbh.de/en/blog/the-case-for-the-typestate-pattern-the-typestate-pattern-itself/) +and +[Rusty Typestate Series (an extensive thesis)]( +https://web.archive.org/web/20210328164854/https://rustype.github.io/notes/notes/rust-typestate-series/rust-typestate-index)