
Commit df5e770

mcabbott, CarloLucibello, and darsnack authored
Expand readme, link to Lux.jl (#73)
* expand readme
* where's mr clippy when you need him?
* Apply suggestions from code review

Co-authored-by: Carlo Lucibello <[email protected]>
Co-authored-by: Kyle Daruwalla <[email protected]>
1 parent 694bb9b commit df5e770

File tree

1 file changed: +26 -3 lines


README.md

Lines changed: 26 additions & 3 deletions
@@ -20,14 +20,37 @@
 Optimisers.jl defines many standard gradient-based optimisation rules, and tools for applying them to deeply nested models.
 
 This is the future of training for [Flux.jl](https://github.com/FluxML/Flux.jl) neural networks,
-but it can be used separately on anything understood by [Functors.jl](https://github.com/FluxML/Functors.jl).
+and the present for [Lux.jl](https://github.com/avik-pal/Lux.jl).
+But it can be used separately on anything understood by [Functors.jl](https://github.com/FluxML/Functors.jl).
 
 ## Installation
 
 ```julia
-]add Optimisers
+] add Optimisers
 ```
 
 ## Usage
 
-Find out more about using Optimisers.jl [in the docs](https://fluxml.ai/Optimisers.jl/dev/).
+The core idea is that optimiser state (such as momentum) is explicitly handled.
+It is initialised by `setup`, and then at each step, `update` returns both the new
+state, and the model with its trainable parameters adjusted:
+
+```julia
+state = Optimisers.setup(Optimisers.ADAM(), model)  # just once
+
+state, model = Optimisers.update(state, model, grad)  # at every step
+```
+
+For models with deeply nested layers containing the parameters (like [Flux.jl](https://github.com/FluxML/Flux.jl) models),
+this state is a similarly nested tree.
+The function `destructure` collects all the trainable parameters into one vector,
+and returns this along with a function to re-build a similar model:
+
+```julia
+vector, re = Optimisers.destructure(model)
+
+model2 = re(2 .* vector)
+```
+
+[The documentation](https://fluxml.ai/Optimisers.jl/dev/) explains usage in more detail,
+describes all the optimization rules, and shows how to define new ones.
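
As a sketch of how the `setup`/`update` calls added above fit together in a training loop (not part of this commit; it assumes Zygote.jl for gradients, and uses a plain NamedTuple as a stand-in for any Functors.jl-compatible model — the loss function and step count are arbitrary illustrations):

```julia
using Optimisers, Zygote  # Zygote is assumed here as the gradient source

# Training sketch: `setup` runs once, `update` runs at every step and
# returns both the new optimiser state and the adjusted model.
function train(model, steps)
    state = Optimisers.setup(Optimisers.ADAM(0.1), model)  # just once
    for _ in 1:steps
        # Gradient of a toy loss with respect to the whole model:
        grad = gradient(m -> sum(abs2, m.weight * [1.0, 2.0] .+ m.bias), model)[1]
        state, model = Optimisers.update(state, model, grad)  # at every step
    end
    return model
end

# Any nested structure understood by Functors.jl can play the role of a model:
model = train((weight = rand(2, 2), bias = zeros(2)), 10)
```

The loop is wrapped in a function both for Julia scoping reasons and to mirror the functional style of the API: nothing is mutated in place, so the caller always rebinds `state` and `model` to the returned values.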
