Commit daed54d

Guide changes: Generics and Traits sections
Mostly copy-editing, clarification---in particular, monomorphization
1 parent 29ad853 commit daed54d

1 file changed

+110
-122
lines changed

src/doc/guide.md

Lines changed: 110 additions & 122 deletions
@@ -4738,13 +4738,13 @@ enum OptionalFloat64 {
47384738
}
47394739
```
47404740
4741-
This is really unfortunate. Luckily, Rust has a feature that gives us a better
4742-
way: generics. Generics are called **parametric polymorphism** in type theory,
4743-
which means that they are types or functions that have multiple forms ("poly"
4744-
is multiple, "morph" is form) over a given parameter ("parametric").
4741+
Such repetition is unfortunate. Luckily, Rust has a feature that gives us a
4742+
better way: **generics**. Generics are called **parametric polymorphism** in
4743+
type theory, which means that they are types or functions that have multiple
4744+
forms over a given parameter ("parametric").
47454745
4746-
Anyway, enough with type theory declarations, let's check out the generic form
4747-
of `OptionalInt`. It is actually provided by Rust itself, and looks like this:
4746+
Let's see how generics help us escape `OptionalInt`. `Option` is already
4747+
provided in Rust's standard library and looks like this:
47484748
47494749
```rust
47504750
enum Option<T> {
@@ -4753,25 +4753,27 @@ enum Option<T> {
47534753
}
47544754
```
47554755
4756-
The `<T>` part, which you've seen a few times before, indicates that this is
4757-
a generic data type. Inside the declaration of our enum, wherever we see a `T`,
4758-
we substitute that type for the same type used in the generic. Here's an
4759-
example of using `Option<T>`, with some extra type annotations:
4756+
The `<T>` part, which you've seen a few times before, indicates that this is a
4757+
generic data type. `T` is called a **type parameter**. When we create instances
4758+
of `Option`, we need to provide a concrete type in place of the type
4759+
parameter. For example, if we wanted something like our `OptionalInt`, we would
4760+
need to instantiate an `Option<int>`. Inside the declaration of our enum,
4761+
wherever we see a `T`, we replace it with the type specified (or inferred by the
4762+
compiler).
47604763
47614764
```{rust}
47624765
let x: Option<int> = Some(5i);
47634766
```
47644767
4765-
In the type declaration, we say `Option<int>`. Note how similar this looks to
4766-
`Option<T>`. So, in this particular `Option`, `T` has the value of `int`. On
4767-
the right-hand side of the binding, we do make a `Some(T)`, where `T` is `5i`.
4768-
Since that's an `int`, the two sides match, and Rust is happy. If they didn't
4769-
match, we'd get an error:
4768+
In this particular `Option`, `T` has the value of `int`. On the right-hand side
4769+
of the binding, we make a `Some(T)`, where `T` is the type of `5i`. Since that's an
4770+
`int`, the two sides match, and Rust is happy. If they didn't match, we'd get an
4771+
error:
47704772
47714773
```{rust,ignore}
47724774
let x: Option<f64> = Some(5i);
4773-
// error: mismatched types: expected `core::option::Option<f64>`
4774-
// but found `core::option::Option<int>` (expected f64 but found int)
4775+
// error: mismatched types: expected `core::option::Option<f64>`,
4776+
// found `core::option::Option<int>` (expected f64, found int)
47754777
```
47764778
47774779
That doesn't mean we can't make `Option<T>`s that hold an `f64`! They just have to
@@ -4782,8 +4784,6 @@ let x: Option<int> = Some(5i);
47824784
let y: Option<f64> = Some(5.0f64);
47834785
```
47844786
4785-
This is just fine. One definition, multiple uses.
4786-
47874787
Generics don't have to only be generic over one type. Consider Rust's built-in
47884788
`Result<T, E>` type:
47894789
@@ -4804,20 +4804,20 @@ enum Result<H, N> {
48044804
}
48054805
```
48064806
4807-
if we wanted to. Convention says that the first generic parameter should be
4808-
`T`, for 'type,' and that we use `E` for 'error'. Rust doesn't care, however.
4807+
Convention says that the first generic parameter should be `T`, for "type," and
4808+
that we use `E` for "error."
48094809
4810-
The `Result<T, E>` type is intended to
4811-
be used to return the result of a computation, and to have the ability to
4812-
return an error if it didn't work out. Here's an example:
4810+
The `Result<T, E>` type is intended to be used to return the result of a
4811+
computation and to have the ability to return an error if it didn't work
4812+
out. Here's an example:
48134813
48144814
```{rust}
48154815
let x: Result<f64, String> = Ok(2.3f64);
48164816
let y: Result<f64, String> = Err("There was an error.".to_string());
48174817
```
48184818
4819-
This particular Result will return an `f64` if there's a success, and a
4820-
`String` if there's a failure. Let's write a function that uses `Result<T, E>`:
4819+
This particular `Result` will return an `f64` upon success and a `String` if
4820+
there's a failure. Let's write a function that uses `Result<T, E>`:
48214821
48224822
```{rust}
48234823
fn inverse(x: f64) -> Result<f64, String> {
@@ -4827,17 +4827,18 @@ fn inverse(x: f64) -> Result<f64, String> {
48274827
}
48284828
```
48294829
4830-
We don't want to take the inverse of zero, so we check to make sure that we
4831-
weren't passed zero. If we were, then we return an `Err`, with a message. If
4832-
it's okay, we return an `Ok`, with the answer.
4830+
We want to indicate that `inverse(0.0f64)` is undefined or is an erroneous usage
4831+
of the function, so we check to make sure that we weren't passed zero. If we
4832+
were, we return an `Err` with a message. If it's okay, we return an `Ok` with
4833+
the answer.
48334834
48344835
Why does this matter? Well, remember how `match` does exhaustive matches?
48354836
Here's how this function gets used:
48364837
48374838
```{rust}
48384839
# fn inverse(x: f64) -> Result<f64, String> {
4839-
# if x == 0.0f64 { return Err("x cannot be zero!".to_string()); }
4840-
# Ok(1.0f64 / x)
4840+
# if x == 0.0f64 { return Err("x cannot be zero!".to_string()); }
4841+
# Ok(1.0f64 / x)
48414842
# }
48424843
let x = inverse(25.0f64);
48434844
@@ -4858,8 +4859,8 @@ println!("{}", x + 2.0f64); // error: binary operation `+` cannot be applied
48584859
```
48594860
48604861
This function is great, but there's one other problem: it only works for 64 bit
4861-
floating point values. What if we wanted to handle 32 bit floating point as
4862-
well? We'd have to write this:
4862+
floating point values. If we wanted to handle 32 bit floating point values, we'd
4863+
have to write this:
48634864
48644865
```{rust}
48654866
fn inverse32(x: f32) -> Result<f32, String> {
@@ -4869,9 +4870,9 @@ fn inverse32(x: f32) -> Result<f32, String> {
48694870
}
48704871
```
48714872
4872-
Bummer. What we need is a **generic function**. Luckily, we can write one!
4873-
However, it won't _quite_ work yet. Before we get into that, let's talk syntax.
4874-
A generic version of `inverse` would look something like this:
4873+
What we need is a **generic function**. We can do that with Rust! However, it
4874+
won't _quite_ work yet. We need to talk about syntax. A first attempt at a
4875+
generic version of `inverse` might look something like this:
48754876
48764877
```{rust,ignore}
48774878
fn inverse<T>(x: T) -> Result<T, String> {
@@ -4881,24 +4882,34 @@ fn inverse<T>(x: T) -> Result<T, String> {
48814882
}
48824883
```
48834884
4884-
Just like how we had `Option<T>`, we use a similar syntax for `inverse<T>`.
4885-
We can then use `T` inside the rest of the signature: `x` has type `T`, and half
4886-
of the `Result` has type `T`. However, if we try to compile that example, we'll get
4887-
an error:
4885+
Just like how we had `Option<T>`, we use a similar syntax for `inverse<T>`. We
4886+
can then use `T` inside the rest of the signature: `x` has type `T`, and half of
4887+
the `Result` has type `T`. However, if we try to compile that example, we'll get
4888+
some errors:
48884889
48894890
```text
48904891
error: binary operation `==` cannot be applied to type `T`
4892+
if x == 0.0 { return Err("x cannot be zero!".to_string()); }
4893+
^~~~~~~~
4894+
error: mismatched types: expected `_`, found `T` (expected floating-point variable, found type parameter)
4895+
Ok(1.0 / x)
4896+
^
4897+
error: mismatched types: expected `core::result::Result<T, collections::string::String>`, found `core::result::Result<_, _>` (expected type parameter, found floating-point variable)
4898+
Ok(1.0 / x)
4899+
^~~~~~~~~~~
48914900
```
48924901
4893-
Because `T` can be _any_ type, it may be a type that doesn't implement `==`,
4894-
and therefore, the first line would be wrong. What do we do?
4902+
The problem is that `T` is unconstrained: it can be _any_ type. It could be a
4903+
`String`, and the expression `1.0 / x` has no meaning if `x` is a `String`. It
4904+
may be a type that doesn't implement `==`, and the first line would be
4905+
wrong. What do we do?
48954906
4896-
To fix this example, we need to learn about another Rust feature: traits.
4907+
To fix this example, we need to learn about another Rust feature: **traits**.
48974908
48984909
# Traits
48994910
4900-
Do you remember the `impl` keyword, used to call a function with method
4901-
syntax?
4911+
Our discussion of **traits** begins with the `impl` keyword. We used it before
4912+
to specify methods.
49024913
49034914
```{rust}
49044915
struct Circle {
@@ -4914,8 +4925,8 @@ impl Circle {
49144925
}
49154926
```
49164927
4917-
Traits are similar, except that we define a trait with just the method
4918-
signature, then implement the trait for that struct. Like this:
4928+
We define a trait in terms of its methods. We then `impl` a trait `for` a type
4929+
(or many types).
49194930
49204931
```{rust}
49214932
struct Circle {
@@ -4935,19 +4946,18 @@ impl HasArea for Circle {
49354946
}
49364947
```
49374948
4938-
As you can see, the `trait` block looks very similar to the `impl` block,
4939-
but we don't define a body, just a type signature. When we `impl` a trait,
4940-
we use `impl Trait for Item`, rather than just `impl Item`.
4949+
The `trait` block defines only type signatures. When we `impl` a trait, we use
4950+
`impl Trait for Item`, rather than just `impl Item`.
49414951
4942-
So what's the big deal? Remember the error we were getting with our generic
4943-
`inverse` function?
4952+
The first of the three errors we got with our generic `inverse` function was
4953+
this:
49444954
49454955
```text
49464956
error: binary operation `==` cannot be applied to type `T`
49474957
```
49484958
4949-
We can use traits to constrain our generics. Consider this function, which
4950-
does not compile, and gives us a similar error:
4959+
We can use traits to constrain generic type parameters. Consider this function,
4960+
which does not compile, and gives us a similar error:
49514961
49524962
```{rust,ignore}
49534963
fn print_area<T>(shape: T) {
@@ -4962,8 +4972,9 @@ error: type `T` does not implement any method in scope named `area`
49624972
```
49634973
49644974
Because `T` can be any type, we can't be sure that it implements the `area`
4965-
method. But we can add a **trait constraint** to our generic `T`, ensuring
4966-
that it does:
4975+
method. But we can add a **trait constraint** to our generic `T`, ensuring that
4976+
we can only compile the function if it's called with types which `impl` the
4977+
`HasArea` trait:
49674978
49684979
```{rust}
49694980
# trait HasArea {
@@ -4974,9 +4985,9 @@ fn print_area<T: HasArea>(shape: T) {
49744985
}
49754986
```
49764987
4977-
The syntax `<T: HasArea>` means `any type that implements the HasArea trait`.
4978-
Because traits define function type signatures, we can be sure that any type
4979-
which implements `HasArea` will have an `.area()` method.
4988+
The syntax `<T: HasArea>` means "any type that implements the HasArea trait."
4989+
Because traits define method signatures, we can be sure that any type which
4990+
implements `HasArea` will have an `area` method.
49804991
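The same kind of trait constraint is what the generic `inverse` from earlier needs. The following is a minimal sketch in current Rust syntax, which differs from the pre-1.0 dialect used in the guide text. The `ZeroOne` trait is a hypothetical helper introduced only for this illustration; it is not part of the guide or the standard library. `PartialEq` and `Div` are the standard traits behind `==` and `/`.

```rust
use std::ops::Div;

// Hypothetical helper trait (an assumption for this sketch): it supplies the
// constants that the generic function cannot write as float literals.
trait ZeroOne {
    fn zero() -> Self;
    fn one() -> Self;
}

impl ZeroOne for f32 {
    fn zero() -> Self { 0.0 }
    fn one() -> Self { 1.0 }
}

impl ZeroOne for f64 {
    fn zero() -> Self { 0.0 }
    fn one() -> Self { 1.0 }
}

// `PartialEq` permits `==`, and `Div<Output = T>` permits `/` with a result
// of the same type, which addresses the errors the unconstrained version hit.
fn inverse<T>(x: T) -> Result<T, String>
where
    T: ZeroOne + PartialEq + Div<Output = T>,
{
    if x == T::zero() {
        return Err("x cannot be zero!".to_string());
    }
    Ok(T::one() / x)
}
```

With these bounds, `inverse(4.0f64)` and `inverse(4.0f32)` both compile against the single definition, while a call such as `inverse("four".to_string())` is rejected at compile time because `String` implements neither `ZeroOne` nor `Div`.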
49814992
Here's an extended example of how this works:
49824993
@@ -5074,55 +5085,22 @@ impl HasArea for int {
50745085
It is considered poor style to implement methods on such primitive types, even
50755086
though it is possible.
50765087
5077-
This may seem like the Wild West, but there are two other restrictions around
5078-
implementing traits that prevent this from getting out of hand. First, traits
5079-
must be `use`d in any scope where you wish to use the trait's method. So for
5080-
example, this does not work:
5081-
5082-
```{rust,ignore}
5083-
mod shapes {
5084-
use std::f64::consts;
5085-
5086-
trait HasArea {
5087-
fn area(&self) -> f64;
5088-
}
5089-
5090-
struct Circle {
5091-
x: f64,
5092-
y: f64,
5093-
radius: f64,
5094-
}
5095-
5096-
impl HasArea for Circle {
5097-
fn area(&self) -> f64 {
5098-
consts::PI * (self.radius * self.radius)
5099-
}
5100-
}
5101-
}
5102-
5103-
fn main() {
5104-
let c = shapes::Circle {
5105-
x: 0.0f64,
5106-
y: 0.0f64,
5107-
radius: 1.0f64,
5108-
};
5088+
## Scoped Method Resolution and Orphan `impl`s
51095089
5110-
println!("{}", c.area());
5111-
}
5112-
```
5113-
5114-
Now that we've moved the structs and traits into their own module, we get an
5115-
error:
5090+
There are two restrictions for implementing traits that prevent this from
5091+
getting out of hand.
51165092
5117-
```text
5118-
error: type `shapes::Circle` does not implement any method in scope named `area`
5119-
```
5093+
1. **Scope-based Method Resolution**: Traits must be `use`d in any scope where
5094+
you wish to use the trait's methods.
5095+
2. **No Orphan `impl`s**: Either the trait or the type you're writing the `impl`
5096+
for must be inside your crate.
51205097
5121-
If we add a `use` line right above `main` and make the right things public,
5122-
everything is fine:
5098+
If we organize our crate differently by using modules, we'll need to ensure both
5099+
of the conditions are satisfied. Don't worry, you can lean on the compiler since
5100+
it won't let you get away with violating them.
51235101
51245102
```{rust}
5125-
use shapes::HasArea;
5103+
use shapes::HasArea; // satisfies #1
51265104
51275105
mod shapes {
51285106
use std::f64::consts;
@@ -5144,8 +5122,8 @@ mod shapes {
51445122
}
51455123
}
51465124
5147-
51485125
fn main() {
5126+
// use shapes::HasArea; // This would satisfy #1, too
51495127
let c = shapes::Circle {
51505128
x: 0.0f64,
51515129
y: 0.0f64,
@@ -5156,18 +5134,25 @@ fn main() {
51565134
}
51575135
```
51585136
5159-
This means that even if someone does something bad like add methods to `int`,
5160-
it won't affect you, unless you `use` that trait.
5137+
Requiring us to `use` traits whose methods we want means that even if someone
5138+
does something bad like add methods to `int`, it won't affect us unless we
5139+
`use` that trait.
5140+
5141+
The second condition allows us to `impl` built-in `trait`s for types we define,
5142+
or allows us to `impl` our own `trait`s for built-in types, but prevents us
5143+
from implementing a third-party or built-in `trait` for a third-party or
5144+
built-in type.
51615145
5162-
There's one more restriction on implementing traits. Either the trait or the
5163-
type you're writing the `impl` for must be inside your crate. So, we could
5164-
implement the `HasArea` type for `int`, because `HasArea` is in our crate. But
5165-
if we tried to implement `Float`, a trait provided by Rust, for `int`, we could
5166-
not, because both the trait and the type aren't in our crate.
5146+
We could `impl` the `HasArea` trait for `int`, because `HasArea` is in our
5147+
crate. But if we tried to implement `Float`, a standard library `trait`, for
5148+
`int`, we could not, because neither the `trait` nor the `type` is in our
5149+
crate.
51675150
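The orphan restriction described above can be sketched in current Rust syntax, with `i64` standing in for the guide's `int`. The `Meters` type and the area calculation for integers are assumptions made up for this illustration.

```rust
use std::fmt;

// A trait defined in this crate, mirroring the guide's `HasArea`.
trait HasArea {
    fn area(&self) -> f64;
}

// Allowed: the trait is local, even though `i64` is a built-in type.
impl HasArea for i64 {
    fn area(&self) -> f64 {
        // Treat the integer as the side of a square (illustrative only).
        (*self * *self) as f64
    }
}

// A type defined in this crate (hypothetical, for illustration).
struct Meters(f64);

// Allowed: the type is local, even though `Display` comes from the
// standard library.
impl fmt::Display for Meters {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{} m", self.0)
    }
}

// Rejected if uncommented: neither `Display` nor `Vec<i64>` is defined in
// this crate, so this would be an orphan `impl`.
// impl fmt::Display for Vec<i64> {
//     fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
//         write!(f, "{} items", self.len())
//     }
// }
```

The first two `impl`s each have one local participant, so the compiler accepts them. The commented-out one has none, which is the case the rule forbids.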
5168-
One last thing about traits: generic functions with a trait bound use
5169-
**monomorphization** ("mono": one, "morph": form), so they are statically
5170-
dispatched. What's that mean? Well, let's take a look at `print_area` again:
5151+
## Monomorphization
5152+
5153+
One last thing about generics and traits: the compiler performs
5154+
**monomorphization** on generic functions so they are statically dispatched. To
5155+
see what that means, let's take a look at `print_area` again:
51715156
51725157
```{rust,ignore}
51735158
fn print_area<T: HasArea>(shape: T) {
@@ -5184,10 +5169,11 @@ fn main() {
51845169
}
51855170
```
51865171
5187-
When we use this trait with `Circle` and `Square`, Rust ends up generating
5188-
two different functions with the concrete type, and replacing the call sites with
5189-
calls to the concrete implementations. In other words, you get something like
5190-
this:
5172+
Because we have called `print_area` with two different types in place of its
5173+
type parameter `T`, Rust will generate two versions of the function with the
5174+
appropriate concrete types, replacing the call sites with calls to the concrete
5175+
implementations. In other words, the compiler will actually compile something
5176+
more like this:
51915177
51925178
```{rust,ignore}
51935179
fn __print_area_circle(shape: Circle) {
@@ -5208,10 +5194,12 @@ fn main() {
52085194
}
52095195
```
52105196
5211-
The names don't actually change to this, it's just for illustration. But
5212-
as you can see, there's no overhead of deciding which version to call here,
5213-
hence 'statically dispatched'. The downside is that we have two copies of
5214-
the same function, so our binary is a little bit larger.
5197+
These names are for illustration; the compiler will generate its own cryptic
5198+
names for internal uses. The point is that there is no runtime overhead of
5199+
deciding which version to call. The function to be called is determined
5200+
statically, at compile time. Thus, generic functions are **statically
5201+
dispatched**. The downside is that we have two similar functions, so our binary
5202+
is larger.
52155203
52165204
# Tasks
52175205
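The monomorphization described in the last hunk can be observed directly in current Rust. This is a sketch under stated assumptions: `area_text` is a hypothetical variant of the guide's `print_area` that returns a `String` instead of printing, and `Circle` and `Square` are simplified to a single field each. Naming an instantiation explicitly, as in `area_text::<Circle>`, yields an ordinary function with a concrete signature.

```rust
use std::f64::consts::PI;

trait HasArea {
    fn area(&self) -> f64;
}

// Simplified shapes (the guide's `Circle` also carries `x` and `y`).
struct Circle {
    radius: f64,
}

struct Square {
    side: f64,
}

impl HasArea for Circle {
    fn area(&self) -> f64 {
        PI * (self.radius * self.radius)
    }
}

impl HasArea for Square {
    fn area(&self) -> f64 {
        self.side * self.side
    }
}

// Hypothetical stand-in for `print_area`: same bound, but returns the text.
fn area_text<T: HasArea>(shape: T) -> String {
    format!("This shape has an area of {}", shape.area())
}

// Each instantiation is a distinct function with its own concrete type,
// so each can be stored in a plain function pointer.
fn demo() -> (String, String) {
    let for_circle: fn(Circle) -> String = area_text::<Circle>;
    let for_square: fn(Square) -> String = area_text::<Square>;
    (
        for_circle(Circle { radius: 1.0 }),
        for_square(Square { side: 2.0 }),
    )
}
```

Because `for_circle` and `for_square` have different concrete types, passing a `Square` to `for_circle` is a compile-time error. No lookup happens at run time to pick between them.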
