Commit 3585815

Add doctests to overview.md
1 parent 3c935cc commit 3585815

1 file changed: +32 additions, −31 deletions

docs/src/models/overview.md

Lines changed: 32 additions & 31 deletions
@@ -15,7 +15,7 @@ Here's how you'd use Flux to build and train the most basic of models, step by s
 
 This example will predict the output of the function `4x + 2`. First, import `Flux` and define the function we want to simulate:
 
-```julia
+```jldoctest overview; setup = :(using Random; Random.seed!(0))
 julia> using Flux
 
 julia> actual(x) = 4x + 2
@@ -28,7 +28,7 @@ This example will build a model to approximate the `actual` function.
 
 Use the `actual` function to build sets of data for training and verification:
 
-```julia
+```jldoctest overview
 julia> x_train, x_test = hcat(0:5...), hcat(6:10...)
 ([0 1 … 4 5], [6 7 … 9 10])
 
@@ -42,13 +42,13 @@ Normally, your training and test data come from real world observations, but thi
 
 Now, build a model to make predictions with `1` input and `1` output:
 
-```julia
+```jldoctest overview
 julia> model = Dense(1 => 1)
-Dense(1 => 1)
+Dense(1 => 1)       # 2 parameters
 
 julia> model.weight
 1×1 Matrix{Float32}:
- -1.4925033
+ 0.95041317
 
 julia> model.bias
 1-element Vector{Float32}:
@@ -57,28 +57,29 @@ julia> model.bias
 
 Under the hood, a dense layer is a struct with fields `weight` and `bias`. `weight` represents a weights' matrix and `bias` represents a bias vector. There's another way to think about a model. In Flux, *models are conceptually predictive functions*:
 
-```julia
+```jldoctest overview
 julia> predict = Dense(1 => 1)
+Dense(1 => 1)       # 2 parameters
 ```
 
 `Dense(1 => 1)` also implements the function `σ(Wx+b)` where `W` and `b` are the weights and biases. `σ` is an activation function (more on activations later). Our model has one weight and one bias, but typical models will have many more. Think of weights and biases as knobs and levers Flux can use to tune predictions. Activation functions are transformations that tailor models to your needs.
 
 This model will already make predictions, though not accurate ones yet:
 
-```julia
+```jldoctest overview
 julia> predict(x_train)
 1×6 Matrix{Float32}:
- 0.0  -1.4925  -2.98501  -4.47751  -5.97001  -7.46252
+ 0.0  0.906654  1.81331  2.71996  3.62662  4.53327
 ```
 
 In order to make better predictions, you'll need to provide a *loss function* to tell Flux how to objectively *evaluate* the quality of a prediction. Loss functions compute the cumulative distance between actual values and predictions.
 
-```julia
+```jldoctest overview
 julia> loss(x, y) = Flux.Losses.mse(predict(x), y)
 loss (generic function with 1 method)
 
 julia> loss(x_train, y_train)
-282.16010605766024
+122.64734f0
 ```
 
 More accurate predictions will yield a lower loss. You can write your own loss functions or rely on those already provided by Flux. This loss function is called [mean squared error](https://www.statisticshowto.com/probability-and-statistics/statistics-definitions/mean-squared-error/). Flux works by iteratively reducing the loss through *training*.
@@ -87,39 +88,39 @@ More accurate predictions will yield a lower loss. You can write your own loss f
 
 Under the hood, the Flux [`Flux.train!`](@ref) function uses *a loss function* and *training data* to improve the *parameters* of your model based on a pluggable [`optimiser`](../training/optimisers.md):
 
-```julia
+```jldoctest overview
 julia> using Flux: train!
 
 julia> opt = Descent()
 Descent(0.1)
 
 julia> data = [(x_train, y_train)]
-1-element Array{Tuple{Array{Int64,2},Array{Int64,2}},1}:
+1-element Vector{Tuple{Matrix{Int64}, Matrix{Int64}}}:
 ([0 1 … 4 5], [2 6 … 18 22])
 ```
 
 Now, we have the optimiser and data we'll pass to `train!`. All that remains are the parameters of the model. Remember, each model is a Julia struct with a function and configurable parameters. Remember, the dense layer has weights and biases that depend on the dimensions of the inputs and outputs:
 
-```julia
+```jldoctest overview
 julia> predict.weight
-1-element Array{Float64,1}:
- -0.99009055
+1×1 Matrix{Float32}:
+ 0.9066542
 
 julia> predict.bias
-1-element Array{Float64,1}:
+1-element Vector{Float32}:
 0.0
 ```
 
 The dimensions of these model parameters depend on the number of inputs and outputs. Since models can have hundreds of inputs and several layers, it helps to have a function to collect the parameters into the data structure Flux expects:
 
-```
+```jldoctest overview
 julia> parameters = Flux.params(predict)
-Params([[-0.99009055], [0.0]])
+Params([Float32[0.9066542], Float32[0.0]])
 ```
 
 These are the parameters Flux will change, one step at a time, to improve predictions. At each step, the contents of this `Params` object changes too, since it is just a collection of references to the mutable arrays inside the model:
 
-```
+```jldoctest overview
 julia> predict.weight in parameters, predict.bias in parameters
 (true, true)
 
@@ -129,22 +130,22 @@ The first parameter is the weight and the second is the bias. Flux will adjust p
 
 This optimiser implements the classic gradient descent strategy. Now improve the parameters of the model with a call to [`Flux.train!`](@ref) like this:
 
-```
+```jldoctest overview
 julia> train!(loss, parameters, data, opt)
 ```
 
 And check the loss:
 
-```
+```jldoctest overview
 julia> loss(x_train, y_train)
-267.8037f0
+116.38745f0
 ```
 
 It went down. Why?
 
-```
+```jldoctest overview
 julia> parameters
-Params([[9.158408791666668], [2.895045275]])
+Params([Float32[7.5777884], Float32[1.9466728]])
 ```
 
 The parameters have changed. This single step is the essence of machine learning.
@@ -153,16 +154,16 @@ The parameters have changed. This single step is the essence of machine learning
 
 In the previous section, we made a single call to `train!` which iterates over the data we passed in just once. An *epoch* refers to one pass over the dataset. Typically, we will run the training for multiple epochs to drive the loss down even further. Let's run it a few more times:
 
-```
+```jldoctest overview
 julia> for epoch in 1:200
          train!(loss, parameters, data, opt)
        end
 
 julia> loss(x_train, y_train)
-0.007433314787010791
+0.00339581f0
 
 julia> parameters
-Params([[3.9735880692372345], [1.9925541368157165]])
+Params([Float32[4.0178537], Float32[2.0050256]])
 ```
 
 After 200 training steps, the loss went down, and the parameters are getting close to those in the function the model is built to predict.
@@ -171,13 +172,13 @@ After 200 training steps, the loss went down, and the parameters are getting clo
 
 Now, let's verify the predictions:
 
-```
+```jldoctest overview
 julia> predict(x_test)
-1×5 Array{Float64,2}:
- 25.8442  29.8194  33.7946  37.7698  41.745
+1×5 Matrix{Float32}:
+ 26.1121  30.13  34.1479  38.1657  42.1836
 
 julia> y_test
-1×5 Array{Int64,2}:
+1×5 Matrix{Int64}:
 26  30  34  38  42
 ```
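For readers following along outside the doctest harness, the snippets in this diff assemble into a short end-to-end script. This is a sketch based only on the API calls shown above (`Dense`, `Flux.params`, `Descent`, `train!`); the printed weights and losses will vary with random initialisation unless you fix the seed as the first doctest's `setup` does:

```julia
using Flux
using Flux: train!

# The target function the model should learn.
actual(x) = 4x + 2

# Build training and test data from the target function.
x_train, x_test = hcat(0:5...), hcat(6:10...)
y_train, y_test = actual.(x_train), actual.(x_test)

# A single dense layer: one input, one output (2 parameters).
predict = Dense(1 => 1)

# Mean squared error between predictions and targets.
loss(x, y) = Flux.Losses.mse(predict(x), y)

opt = Descent()                  # gradient descent with step size 0.1
data = [(x_train, y_train)]      # one (input, target) batch
parameters = Flux.params(predict)

# Each epoch is one pass over `data`; repeat to drive the loss down.
for epoch in 1:200
    train!(loss, parameters, data, opt)
end

@show loss(x_train, y_train)     # should be close to zero by now
@show predict(x_test)            # should approximate y_test
```

Because `parameters` holds references to the arrays inside `predict`, each `train!` call mutates the model in place; there is no return value to capture.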
