Abstract types and interfaces for Markov chain Monte Carlo methods.

[Docs (stable)](https://turinglang.github.io/AbstractMCMC.jl/stable)
[Docs (dev)](https://turinglang.github.io/AbstractMCMC.jl/dev)
[CI](https://github.com/TuringLang/AbstractMCMC.jl/actions?query=workflow%3ACI+branch%3Amaster)
[IntegrationTest](https://github.com/TuringLang/AbstractMCMC.jl/actions?query=workflow%3AIntegrationTest+branch%3Amaster)
[Codecov](https://codecov.io/gh/TuringLang/AbstractMCMC.jl)
[Coveralls](https://coveralls.io/github/TuringLang/AbstractMCMC.jl?branch=master)

## Overview

AbstractMCMC defines an interface for sampling and combining Markov chains.
It comes with a default sampling algorithm that provides support for progress
bars, parallel sampling (multithreaded and multicore), and user-provided callbacks
out of the box. Typically, developers only have to define the sampling step
of their inference method in an iterator-like fashion to make use of this
functionality. Additionally, the package defines an iterator and a transducer
for sampling Markov chains based on the interface.

## User-facing API

The user-facing sampling API consists of
```julia
StatsBase.sample(
    [rng::Random.AbstractRNG,]
    model::AbstractMCMC.AbstractModel,
    sampler::AbstractMCMC.AbstractSampler,
    nsamples[;
    kwargs...]
)
```
and
```julia
StatsBase.sample(
    [rng::Random.AbstractRNG,]
    model::AbstractMCMC.AbstractModel,
    sampler::AbstractMCMC.AbstractSampler,
    parallel::AbstractMCMC.AbstractMCMCParallel,
    nsamples::Integer,
    nchains::Integer[;
    kwargs...]
)
```
for regular and parallel sampling, respectively. In regular sampling, users may
provide a function
```julia
isdone(rng, model, sampler, samples, state, iteration; kwargs...)
```
that returns `true` when sampling should end, and `false` otherwise, instead of
a fixed number of samples `nsamples`. AbstractMCMC defines the abstract types
`AbstractMCMC.AbstractModel`, `AbstractMCMC.AbstractSampler`, and
`AbstractMCMC.AbstractMCMCParallel` for models, samplers, and parallel sampling
algorithms, respectively. Two algorithms, `MCMCThreads` and `MCMCDistributed`,
are provided for parallel sampling with multiple threads and multiple processes,
respectively.
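
The calls below sketch how this API is typically used; `MyModel` and `MySampler` are hypothetical placeholder names for types that are assumed to implement the interface described in the developer section further down.
```julia
using AbstractMCMC
import StatsBase
using Random

# `MyModel` and `MySampler` are placeholders for types that subtype
# `AbstractMCMC.AbstractModel` / `AbstractMCMC.AbstractSampler` and implement
# `AbstractMCMC.step` (see the developer documentation below).

# Regular sampling: draw 1_000 samples from a single chain.
chain = StatsBase.sample(Random.default_rng(), MyModel(), MySampler(), 1_000)

# Parallel sampling: four chains of 1_000 samples each, one chain per thread.
chains = StatsBase.sample(MyModel(), MySampler(), MCMCThreads(), 1_000, 4)

# Dynamic stopping: sample until the `isdone` criterion below returns `true`.
mydone(rng, model, sampler, samples, state, iteration; kwargs...) = iteration >= 500
chain = StatsBase.sample(MyModel(), MySampler(), mydone)
```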

The function
```julia
AbstractMCMC.steps([rng::Random.AbstractRNG, ]model::AbstractModel, sampler::AbstractSampler[; kwargs...])
```
returns an iterator that yields samples continuously, without a predefined
stopping condition. Similarly,
```julia
AbstractMCMC.Sample([rng::Random.AbstractRNG, ]model::AbstractModel, sampler::AbstractSampler[; kwargs...])
```
returns a transducer that yields samples continuously.
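
For instance, the iterator can be combined with standard iteration utilities; as before, `MyModel` and `MySampler` are hypothetical types assumed to implement the interface.
```julia
using AbstractMCMC

# The iterator has no predefined length, so limit it explicitly.
iterator = AbstractMCMC.steps(MyModel(), MySampler())
first_samples = collect(Iterators.take(iterator, 100))
```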

Common keyword arguments for regular and parallel sampling (not supported by the iterator and transducer)
are:
- `progress` (default: `AbstractMCMC.PROGRESS[]`, which is `true` initially): toggles progress logging
- `chain_type` (default: `Any`): determines the type of the returned chain
- `callback` (default: `nothing`): if `callback !== nothing`, then
  `callback(rng, model, sampler, sample, state, iteration; kwargs...)` is called after every sampling step,
  where `sample` is the most recent sample of the Markov chain and `iteration` is the current iteration
- `discard_initial` (default: `0`): number of initial samples that are discarded
- `thinning` (default: `1`): factor by which to thin samples

Progress logging can be enabled and disabled globally with `AbstractMCMC.setprogress!(progress)`.
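
As a sketch (again with the hypothetical `MyModel` and `MySampler`), the keyword arguments can be combined as follows.
```julia
using AbstractMCMC
import StatsBase

# Disable progress logging globally.
AbstractMCMC.setprogress!(false)

# Log the iteration number after every sampling step.
report(rng, model, sampler, sample, state, iteration; kwargs...) = @info "iteration" iteration

# Discard a burn-in phase of 200 samples and keep only every tenth sample afterwards.
chain = StatsBase.sample(
    MyModel(),
    MySampler(),
    1_000;
    callback=report,
    discard_initial=200,
    thinning=10,
)
```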

Additionally, AbstractMCMC defines the abstract type `AbstractChains` for Markov chains and the
method `AbstractMCMC.chainscat(::AbstractChains...)` for concatenating multiple chains
(it defaults to `cat(::AbstractChains...; dims = 3)`).

Note that AbstractMCMC exports only `MCMCThreads` and `MCMCDistributed` (and in
particular not `StatsBase.sample`).

## Developer documentation: Default implementation

AbstractMCMC provides a default implementation of the user-facing interface described
above. You can completely disregard it and define your own implementation of the
interface. However, as described below, in most use cases the default implementation
allows you to obtain support for parallel sampling, progress logging, callbacks, iterators,
and transducers for free by just defining the sampling step of your inference algorithm,
drastically reducing the amount of code you have to write. In general, the docstrings
of the functions described below might be helpful if you intend to make use of the default
implementations.

### Basic structure

The simplified structure for regular sampling (the actual implementation contains
some additional error checks and support for progress logging and callbacks) is
```julia
function StatsBase.sample(
    rng::Random.AbstractRNG,
    model::AbstractMCMC.AbstractModel,
    sampler::AbstractMCMC.AbstractSampler,
    N::Integer;
    chain_type::Type = Any,
    kwargs...
)
    # Obtain the initial sample and state.
    sample, state = AbstractMCMC.step(rng, model, sampler; kwargs...)

    # Save the sample.
    samples = AbstractMCMC.samples(sample, model, sampler, N; kwargs...)
    samples = AbstractMCMC.save!!(samples, sample, 1, model, sampler, N; kwargs...)

    # Step through the sampler.
    for i in 2:N
        # Obtain the next sample and state.
        sample, state = AbstractMCMC.step(rng, model, sampler, state; kwargs...)

        # Save the sample.
        samples = AbstractMCMC.save!!(samples, sample, i, model, sampler, N; kwargs...)
    end

    return AbstractMCMC.bundle_samples(samples, model, sampler, state, chain_type; kwargs...)
end
```
All other default implementations make use of the same structure and in particular
call the same methods.

### Sampling step

The only method for which no default implementation is provided (and hence which
downstream packages *have* to implement) is `AbstractMCMC.step`, which
defines the sampling step of the inference method. In the initial step it is
called as
```julia
AbstractMCMC.step(rng, model, sampler; kwargs...)
```
whereas in all subsequent steps it is called as
```julia
AbstractMCMC.step(rng, model, sampler, state; kwargs...)
```
where `state` denotes the current state of the sampling algorithm. It should return
a 2-tuple consisting of the next sample and the updated state of the sampling algorithm.
Hence `AbstractMCMC.step` can be viewed as an extended version of
[`Base.iterate`](https://docs.julialang.org/en/v1/base/collections/#lib-collections-iteration-1)
with additional positional and keyword arguments.
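
To make this concrete, here is a toy sketch (not part of AbstractMCMC) that draws independent standard normal samples; `ToyModel` and `ToySampler` are made-up names, and the state is simply the most recent sample.
```julia
using AbstractMCMC
import StatsBase
using Random

struct ToyModel <: AbstractMCMC.AbstractModel end
struct ToySampler <: AbstractMCMC.AbstractSampler end

# Initial step: there is no previous state yet.
function AbstractMCMC.step(rng::Random.AbstractRNG, model::ToyModel, sampler::ToySampler; kwargs...)
    sample = randn(rng)
    return sample, sample
end

# Subsequent steps: receive the previous state, return the next sample and state.
function AbstractMCMC.step(rng::Random.AbstractRNG, model::ToyModel, sampler::ToySampler, state::Real; kwargs...)
    sample = randn(rng)
    return sample, sample
end

# With only `step` defined, the default implementations provide regular and
# parallel sampling (plus the iterator and the transducer) for free.
chain = StatsBase.sample(ToyModel(), ToySampler(), 100)
chains = StatsBase.sample(ToyModel(), ToySampler(), MCMCThreads(), 100, 4)
```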

### Collecting samples (does not apply to the iterator and transducer)

After the initial sample is obtained, the default implementations for regular and parallel sampling
(not for the iterator and the transducer since it is not needed there) create a container for all
samples (the initial one and all subsequent samples) using `AbstractMCMC.samples`. By default,
`AbstractMCMC.samples` just returns a concretely typed `Vector` with the initial sample as its single
entry. If the total number of samples is fixed, we use `sizehint!` to suggest that the container
reserve capacity for all samples to improve performance.

In each step, the sample is saved in the container by `AbstractMCMC.save!!`. The notation `!!`
follows the convention of the package [BangBang.jl](https://github.com/JuliaFolds/BangBang.jl),
which is used in the default implementation of `AbstractMCMC.save!!`. It indicates that the
sample is pushed to the container, but a "widening" fallback is used if the container type
does not allow the sample to be saved. Therefore `AbstractMCMC.save!!` *always* has to return the container.

For most use cases the default implementations of `AbstractMCMC.samples` and `AbstractMCMC.save!!`
should work out of the box and hence need not be overloaded in downstream code. Please have
a look at the docstrings of `AbstractMCMC.samples` and `AbstractMCMC.save!!` if you intend
to overload these functions.
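
If a custom container is needed, an overload might look roughly like the following sketch; the named-tuple layout is invented for illustration, and `ToyModel` and `ToySampler` are the placeholder types from the example above.
```julia
using AbstractMCMC
import BangBang

# Store each draw as a named tuple with a single field `θ`.
function AbstractMCMC.samples(sample, model::ToyModel, sampler::ToySampler, N::Integer; kwargs...)
    container = Vector{typeof((θ = sample,))}(undef, 0)
    sizehint!(container, N)  # the total number of samples is known in advance
    return container
end

# `push!!` appends the sample and widens the container type if necessary;
# the (possibly new) container is always returned.
function AbstractMCMC.save!!(
    samples, sample, iteration::Integer, model::ToyModel, sampler::ToySampler, N::Integer; kwargs...
)
    return BangBang.push!!(samples, (θ = sample,))
end
```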

### Creating chains (does not apply to the iterator and transducer)

At the end of the sampling procedure for regular and parallel sampling (not for the iterator
and the transducer) we transform the collection of samples to the desired output type by
calling
```julia
AbstractMCMC.bundle_samples(samples, model, sampler, state, chain_type; kwargs...)
```
where `samples` is the collection of samples, `state` is the final state of the sampler,
and `chain_type` is the desired return type. The default implementation in AbstractMCMC
just returns the collection `samples`.

`sample` will log the start and stop time (as Unix timestamps) for a chain and pass the sampling statistics to the chain via `bundle_samples(...; stats=stats)`. The sampling time information can be retrieved with `stats.start`, `stats.stop`, and `stats.duration`.

The default implementation should be fine in most use cases, but downstream packages
could, e.g., save the final state of the sampler as well if they overload `AbstractMCMC.bundle_samples`.
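
As a hypothetical sketch, a package could bundle the samples, the final state, and the sampling statistics into its own chain type; `ToyChain` below is invented, and `ToyModel`/`ToySampler` are the placeholders from the earlier examples.
```julia
using AbstractMCMC

struct ToyChain{S,T} <: AbstractMCMC.AbstractChains
    samples::Vector{S}
    state::T
    stats::Any
end

function AbstractMCMC.bundle_samples(
    samples, model::ToyModel, sampler::ToySampler, state, ::Type{ToyChain};
    stats=nothing, kwargs...
)
    return ToyChain(samples, state, stats)
end

# Request the custom chain type when sampling:
# chain = StatsBase.sample(ToyModel(), ToySampler(), 100; chain_type=ToyChain)
```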