Now, build a model to make predictions with `1` input and `1` output:
```jldoctest overview
julia> model = Dense(1 => 1)
Dense(1 => 1)  # 2 parameters

julia> model.weight
1×1 Matrix{Float32}:
 0.95041317

julia> model.bias
1-element Vector{Float32}:
 0.0
```
Under the hood, a dense layer is a struct with fields `weight` and `bias`. `weight` represents a weight matrix and `bias` a bias vector. There is another way to think about a model: in Flux, *models are conceptually predictive functions*:
```jldoctest overview
julia> predict = Dense(1 => 1)
Dense(1 => 1)  # 2 parameters
```
`Dense(1 => 1)` also implements the function `σ(Wx+b)` where `W` and `b` are the weights and biases. `σ` is an activation function (more on activations later). Our model has one weight and one bias, but typical models will have many more. Think of weights and biases as knobs and levers Flux can use to tune predictions. Activation functions are transformations that tailor models to your needs.
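As a small sketch (not part of the original text), the formula `σ(Wx+b)` can be checked by hand against a layer's `weight` and `bias` fields:

```julia
using Flux

layer = Dense(1 => 1)   # σ defaults to identity
x = Float32[2.0]        # a single 1-dimensional input

# Apply the layer's formula manually: σ(Wx + b) with σ = identity.
manual = identity.(layer.weight * x .+ layer.bias)

layer(x) ≈ manual       # true: the layer is just this function
```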
This model will already make predictions, though not accurate ones yet:
```jldoctest overview
julia> predict(x_train)
1×6 Matrix{Float32}:
 0.0  0.906654  1.81331  2.71996  3.62662  4.53327
```
In order to make better predictions, you'll need to provide a *loss function* to tell Flux how to objectively *evaluate* the quality of a prediction. Loss functions compute the cumulative distance between actual values and predictions.
More accurate predictions will yield a lower loss. You can write your own loss functions or rely on those already provided by Flux. This loss function is called [mean squared error](https://www.statisticshowto.com/probability-and-statistics/statistics-definitions/mean-squared-error/). Flux works by iteratively reducing the loss through *training*.
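The exact definition is not shown in this excerpt, but a mean-squared-error loss for the `predict` model can be sketched with `Flux.Losses.mse`; the training arrays below are illustrative stand-ins:

```julia
using Flux

predict = Dense(1 => 1)
x_train = Float32[0 1 2 3 4 5]      # illustrative inputs
y_train = Float32[2 6 10 14 18 22]  # illustrative targets

# Mean squared error: the mean of the squared differences
# between predictions and targets.
loss(x, y) = Flux.Losses.mse(predict(x), y)

loss(x_train, y_train)  # a nonnegative Float32; smaller is better
```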
Under the hood, the Flux [`Flux.train!`](@ref) function uses *a loss function* and *training data* to improve the *parameters* of your model based on a pluggable [`optimiser`](../training/optimisers.md):
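The optimiser and training data themselves are defined outside this excerpt; a minimal setup consistent with the gradient-descent strategy described below might look like this (the `x_train`/`y_train` values are assumed):

```julia
using Flux
using Flux: train!

# Plain gradient descent with the default learning rate of 0.1.
opt = Descent()

# Training data as (input, target) tuples; one tuple = one batch.
x_train = Float32[0 1 2 3 4 5]      # assumed
y_train = Float32[2 6 10 14 18 22]  # assumed
data = [(x_train, y_train)]
```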
Now we have the optimiser and the data we'll pass to `train!`. All that remains are the parameters of the model. Remember, each model is a Julia struct with a function and configurable parameters. As seen above, the dense layer has weights and biases that depend on the dimensions of the inputs and outputs:
```jldoctest overview
julia> predict.weight
1×1 Matrix{Float32}:
 0.9066542

julia> predict.bias
1-element Vector{Float32}:
 0.0
```
The dimensions of these model parameters depend on the number of inputs and outputs. Since models can have hundreds of inputs and several layers, it helps to have a function to collect the parameters into the data structure Flux expects:
```jldoctest overview
julia> parameters = Flux.params(predict)
Params([Float32[0.9066542], Float32[0.0]])
```
These are the parameters Flux will change, one step at a time, to improve predictions. At each step, the contents of this `Params` object change too, since it is just a collection of references to the mutable arrays inside the model:
```jldoctest overview
julia> predict.weight in parameters, predict.bias in parameters
(true, true)
```

The first parameter is the weight and the second is the bias.
This optimiser implements the classic gradient descent strategy. Now improve the parameters of the model with a call to [`Flux.train!`](@ref) like this:
```jldoctest overview
julia> train!(loss, parameters, data, opt)
```
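Roughly speaking, each such call differentiates the loss with respect to `parameters` and lets the optimiser apply its update rule. A sketch of one step using the same implicit-parameter API (the setup is repeated here so the snippet stands alone, with assumed training data):

```julia
using Flux

predict = Dense(1 => 1)
loss(x, y) = Flux.Losses.mse(predict(x), y)
x_train = Float32[0 1 2 3 4 5]      # assumed
y_train = Float32[2 6 10 14 18 22]  # assumed
parameters = Flux.params(predict)
opt = Descent()

# Differentiate the loss with respect to every tracked parameter...
grads = Flux.gradient(() -> loss(x_train, y_train), parameters)

# ...then let the optimiser nudge each parameter downhill.
Flux.Optimise.update!(opt, parameters, grads)
```

`train!` performs this gradient-and-update step once for every `(x, y)` tuple in `data`.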
And check the loss:
```jldoctest overview
julia> loss(x_train, y_train)
116.38745f0
```
It went down. Why?
```jldoctest overview
julia> parameters
Params([Float32[7.5777884], Float32[1.9466728]])
```
The parameters have changed. This single step is the essence of machine learning.
In the previous section, we made a single call to `train!` which iterates over the data we passed in just once. An *epoch* refers to one pass over the dataset. Typically, we will run the training for multiple epochs to drive the loss down even further. Let's run it a few more times:
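A sketch of such a multi-epoch loop, with the full setup repeated so it runs on its own (training data assumed, as above):

```julia
using Flux
using Flux: train!

predict = Dense(1 => 1)
loss(x, y) = Flux.Losses.mse(predict(x), y)
x_train = Float32[0 1 2 3 4 5]      # assumed
y_train = Float32[2 6 10 14 18 22]  # assumed
data = [(x_train, y_train)]
parameters = Flux.params(predict)
opt = Descent()

# One call to train! per epoch; each epoch is one pass over `data`.
for epoch in 1:200
  train!(loss, parameters, data, opt)
end

loss(x_train, y_train)  # far smaller than after a single step
```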