@@ -76,21 +78,28 @@ Equivalent to the `RNN` stateful constructor, `LSTM` and `GRU` are also available.

Using these tools, we can now build the model shown in the above diagram with:

-```julia
-m = Chain(RNN(2 => 5), Dense(5 => 1))
+```jldoctest recurrence
+julia> m = Chain(RNN(2 => 5), Dense(5 => 1))
+Chain(
+  Recur(
+    RNNCell(2 => 5, tanh),              # 45 parameters
+  ),
+  Dense(5 => 1),                        # 6 parameters
+)         # Total: 6 trainable arrays, 51 parameters,
+          # plus 1 non-trainable, 5 parameters, summarysize 580 bytes.
```
In this example, each output has only one component.

## Working with sequences

Using the previously defined `m` recurrent model, we can now apply it to a single step from our sequence:

-```julia
+```jldoctest recurrence
julia> x = rand(Float32, 2);

julia> m(x)
1-element Vector{Float32}:
- 0.31759313
+ 0.45860028
```

The `m(x)` operation would be represented by `x1 -> A -> y1` in our diagram.
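
Because `Recur` keeps the hidden state between calls, each further call to `m` corresponds to the next step in the diagram. A minimal sketch of this, reusing the `m` defined above (the actual numbers depend on the random initialisation):

```julia
julia> m(rand(Float32, 2));   # a second call uses the carried-over state: x2 -> A -> y2

julia> Flux.reset!(m);        # put the hidden state back to its initial value
```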
@@ -102,14 +111,14 @@ iterating the model on a sequence of data.
To do so, we'll need to structure the input data as a `Vector` of observations at each time step. This `Vector` will therefore be of `length = seq_length` and each of its elements will represent the input features for a given step. In our example, this translates into a `Vector` of length 3, where each element is a `Matrix` of size `(features, batch_size)`, or just a `Vector` of length `features` if dealing with a single observation.
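
As a rough sketch of that layout, continuing the `m = Chain(RNN(2 => 5), Dense(5 => 1))` example (the batch size of 4 here is just an illustrative assumption):

```julia
julia> x = [rand(Float32, 2, 4) for _ in 1:3];   # seq_length = 3 steps, each a (features, batch_size) = (2, 4) matrix

julia> y = [m(xi) for xi in x];   # apply the model step by step; each element of y has size (1, 4)
```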
@@ -90,32 +92,39 @@ The new `model` we created will now be identical to the one we saved parameters

In longer training runs it's a good idea to periodically save your model, so that you can resume if training is interrupted (for example, if there's a power cut). You can do this by saving the model in the [callback provided to `train!`](training/training.md).

-```julia
-using Flux: throttle
-using BSON: @save
+```jldoctest saving
+julia> using Flux: throttle

-m = Chain(Dense(10 => 5, relu), Dense(5 => 2), softmax)
+(::Flux.var"#throttled#70"{Flux.var"#throttled#66#71"{Bool, Bool, var"#1#2", Int64}}) (generic function with 1 method)
```

This will update the `"model-checkpoint.bson"` file every thirty seconds.

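The callback block above is only partly visible in this diff; a minimal sketch of the kind of throttled checkpoint callback it refers to could look like the following (the name `evalcb` is illustrative):

```julia
using Flux: throttle
using BSON: @save

m = Chain(Dense(10 => 5, relu), Dense(5 => 2), softmax)

# Fires at most once every 30 seconds; each call overwrites the checkpoint file.
evalcb = throttle(30) do
  @save "model-checkpoint.bson" m
end
```
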
You can get more advanced by saving a series of models throughout training, for example

```julia
-@save "model-$(now()).bson" model
+julia> @save "model-$(now()).bson" model
```

will produce a series of models like `"model-2018-03-06T02:57:10.41.bson"`. You
could also store the current test set loss, so that it's easy to (for example)
revert to an older copy of the model if it starts to overfit.

```julia
-@save "model-$(now()).bson" model loss = testloss()
+julia> @save "model-$(now()).bson" model loss = testloss()
```

Note that to resume a model's training, you might need to restore other stateful parts of your training loop. Possible examples are stateful optimizers (which usually utilize an `IdDict` to store their state), and the randomness used to partition the original data into the training and validation sets.
@@ -124,7 +133,7 @@ You can store the optimiser state alongside the model, to resume training
exactly where you left off. BSON is smart enough to [cache values](https://github.com/JuliaIO/BSON.jl/blob/v0.3.4/src/write.jl#L71) and insert links when saving, but only if it knows everything to be saved up front. Thus models and optimizers must be saved together to have the latter work after restoring.
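
A minimal sketch of what that could look like, assuming a `model` and a Flux optimiser `opt` are already defined (the file name is illustrative):

```julia
using BSON: @save, @load

# Save both in a single call, so BSON sees everything up front and can link shared values.
@save "model-and-opt.bson" model opt

# Later: restore both together to resume training exactly where you left off.
@load "model-and-opt.bson" model opt
```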