Commit df6f4ce

sunxd3, github-actions[bot], and yebai authored
Update autodiff.jmd following adaptation of ADTypes (TuringLang#430)
* Update autodiff.jmd following adaptation of `ADTypes` TuringLang/Turing.jl#2047
* Update tutorials/docs-10-using-turing-autodiff/autodiff.jmd
  Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Using `ADTypes` for ad doc
* Update tutorials/docs-10-using-turing-autodiff/autodiff.jmd
  Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Update tutorials/docs-10-using-turing-autodiff/autodiff.jmd
  Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Apply suggestions from code review
  Co-authored-by: Hong Ge <[email protected]>
* Update tutorials/docs-10-using-turing-autodiff/autodiff.jmd
* Update tutorials/docs-10-using-turing-autodiff/autodiff.jmd
  Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Apply suggestions from code review
* Update tutorials/docs-10-using-turing-autodiff/autodiff.jmd
* Update tutorials/docs-10-using-turing-autodiff/autodiff.jmd
* Update tutorials/docs-10-using-turing-autodiff/autodiff.jmd
* Update tutorials/docs-10-using-turing-autodiff/autodiff.jmd
  Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Hong Ge <[email protected]>
1 parent 86ab4e5 commit df6f4ce

2 files changed, +21 -5 lines changed

Lines changed: 1 addition & 0 deletions
@@ -1,3 +1,4 @@
 [deps]
 StatsPlots = "f3b207a7-027a-5e70-b257-86293d7955fd"
 Turing = "fce5fe82-541a-59a6-adf8-730c64b5f9a0"
+ReverseDiff = "37e2e3b7-166d-5795-8a7a-e32c996b4267"

tutorials/docs-10-using-turing-autodiff/autodiff.jmd

Lines changed: 20 additions & 5 deletions
@@ -9,21 +9,33 @@ weave_options:

 ## Switching AD Modes

-Turing supports four automatic differentiation (AD) packages in the back end during sampling. The default AD backend is [ForwardDiff](https://github.com/JuliaDiff/ForwardDiff.jl) for forward-mode AD. Three reverse-mode AD backends are also supported, namely [Tracker](https://github.com/FluxML/Tracker.jl), [Zygote](https://github.com/FluxML/Zygote.jl) and [ReverseDiff](https://github.com/JuliaDiff/ReverseDiff.jl). `Zygote` and `ReverseDiff` are supported optionally if explicitly loaded by the user with `using Zygote` or `using ReverseDiff` next to `using Turing`.
+Turing currently supports four automatic differentiation (AD) backends for sampling: [ForwardDiff](https://github.com/JuliaDiff/ForwardDiff.jl) for forward-mode AD; and [ReverseDiff](https://github.com/JuliaDiff/ReverseDiff.jl), [Zygote](https://github.com/FluxML/Zygote.jl), and [Tracker](https://github.com/FluxML/Tracker.jl) for reverse-mode AD.
+While `Tracker` is still available, its use is discouraged due to a lack of active maintenance.
+`ForwardDiff` is automatically imported by Turing. To utilize `Zygote` or `ReverseDiff` for AD, users must explicitly import them with `using Zygote` or `using ReverseDiff`, alongside `using Turing`.
+
+As of Turing version v0.30, the global configuration flag for the AD backend has been removed in favour of [`ADTypes.jl`](https://github.com/SciML/ADTypes.jl), allowing users to specify the AD backend for individual samplers independently.
+Users can pass the `adtype` keyword argument to the sampler constructor to select the desired AD backend, with the default being `AutoForwardDiff(; chunksize=0)`.
+
+For `ForwardDiff`, pass `adtype=AutoForwardDiff(; chunksize)` to the sampler constructor. A `chunksize` of 0 permits the chunk size to be automatically determined. For more information regarding the selection of `chunksize`, please refer to the [related section of `ForwardDiff`'s documentation](https://juliadiff.org/ForwardDiff.jl/dev/user/advanced/#Configuring-Chunk-Size).
+For `ReverseDiff`, pass `adtype=AutoReverseDiff()` to the sampler constructor. An additional argument can be provided to `AutoReverseDiff` to specify whether to compile the tape only once and cache it for later use (`false` by default, which means the tape is not cached). Be aware that the use of caching in certain types of models can lead to incorrect results and/or errors.

-To switch between the different AD backends, one can call the function `Turing.setadbackend(backend_sym)`, where `backend_sym` can be `:forwarddiff` (`ForwardDiff`), `:tracker` (`Tracker`), `:zygote` (`Zygote`) or `:reversediff` (`ReverseDiff.jl`). When using `ReverseDiff`, to compile the tape only once and cache it for later use, the user has to call `Turing.setrdcache(true)`. However, note that the use of caching in certain types of models can lead to incorrect results and/or errors.
 Compiled tapes should only be used if you are absolutely certain that the computation doesn't change between different executions of your model.
 Thus, e.g., in the model definition and in all implicitly and explicitly called functions in the model, all loops should be of fixed size, and `if`-statements should consistently execute the same branches.
 For instance, `if`-statements with conditions that can be determined at compile time or conditions that depend only on the data will always execute the same branches during sampling (if the data is constant throughout sampling and, e.g., no mini-batching is used).
 However, `if`-statements that depend on the model parameters can take different branches during sampling; hence, the compiled tape might be incorrect.
 Thus you must not use compiled tapes when your model makes decisions based on the model parameters, and if you compute functions of parameters, you should take care that those functions do not branch in ways that execute different code for different values of the parameter.

+For `Zygote`, pass `adtype=AutoZygote()` to the sampler constructor.
+
+And the previously used interface functions including `ADBackend`, `setadbackend`, `setsafe`, `setchunksize`, and `setrdcache` are deprecated and removed.
+
 ## Compositional Sampling with Differing AD Modes

-Turing supports intermixed automatic differentiation methods for different variable spaces. The snippet below shows using `ForwardDiff` to sample the mean (`m`) parameter and using the Tracker-based `TrackerAD` autodiff for the variance (`s`) parameter:
+Turing supports intermixed automatic differentiation methods for different variable spaces. The snippet below shows using `ForwardDiff` to sample the mean (`m`) parameter, and using `ReverseDiff` for the variance (`s`) parameter:

 ```julia
 using Turing
+using ReverseDiff

 # Define a simple Normal model with unknown mean and variance.
 @model function gdemo(x, y)
@@ -36,11 +48,14 @@ end
 # Sample using Gibbs and varying autodiff backends.
 c = sample(
     gdemo(1.5, 2),
-    Gibbs(HMC{Turing.ForwardDiffAD{1}}(0.1, 5, :m), HMC{Turing.TrackerAD}(0.1, 5, :s²)),
+    Gibbs(
+        HMC(0.1, 5, :m; adtype=AutoForwardDiff(; chunksize=0)),
+        HMC(0.1, 5, :s²; adtype=AutoReverseDiff(false)),
+    ),
     1000,
 )
 ```

-Generally, `TrackerAD` is faster when sampling from variables of high dimensionality (greater than 20), and `ForwardDiffAD` is more efficient for lower-dimension variables. This functionality allows those who are performance sensitive to fine-tune their automatic differentiation for their specific models.
+Generally, reverse-mode AD, for instance `ReverseDiff`, is faster when sampling from variables of high dimensionality (greater than 20), while forward-mode AD, for instance `ForwardDiff`, is more efficient for lower-dimension variables. This functionality allows those who are performance sensitive to fine-tune their automatic differentiation for their specific models.

 If the differentiation method is not specified in this way, Turing will default to using whatever the global AD backend is. Currently, this defaults to `ForwardDiff`.
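
As a rough illustration of the per-sampler `adtype` keyword that this change documents, the sketch below samples one model twice with plain `HMC`, once per backend. Several details are assumptions rather than part of the commit: the model body (the diff elides it, so the usual `gdemo` definition is used for illustration), the sample count of 200, and the keyword form `AutoReverseDiff(; compile=false)`, which is used in place of the positional form shown in the diff on the assumption that the installed `ADTypes` version accepts it.

```julia
using Turing
using ReverseDiff  # must be loaded explicitly before AutoReverseDiff can be used

# Illustrative model (assumed body; the diff above does not show it).
@model function gdemo(x, y)
    s² ~ InverseGamma(2, 3)
    m ~ Normal(0, sqrt(s²))
    x ~ Normal(m, sqrt(s²))
    y ~ Normal(m, sqrt(s²))
end

model = gdemo(1.5, 2)

# Forward-mode AD; chunksize=0 lets ForwardDiff pick the chunk size.
chain_fd = sample(model, HMC(0.1, 5; adtype=AutoForwardDiff(; chunksize=0)), 200)

# Reverse-mode AD without a compiled (cached) tape, which is the safe choice
# when control flow might depend on parameter values.
chain_rd = sample(model, HMC(0.1, 5; adtype=AutoReverseDiff(; compile=false)), 200)
```

Both calls target the same posterior; only the gradient computation differs, so any performance comparison between them should be measured on the model of interest rather than assumed.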
