abstract type and minimal set of functions that will be supported by all model and trace types.  Some
other commonly used code, such as variable names, can also go here.

## `AbstractProbabilisticProgram` interface

There are at least two incompatible conventions used for the term “model”: in Turing.jl, it is an
instantiated “conditional distribution” object with fixed values for parameters and observations,
while in Soss.jl, it is the raw symbolic structure from which distributions can be derived.

Relevant discussions:
[1](https://julialang.zulipchat.com/#narrow/stream/234072-probprog/topic/Naming.20the.20.22likelihood.22.20thingy), [2](https://github.com/TuringLang/AbstractPPL.jl/discussions/10).


### Traces & probability expressions

Models are always, at least in a theoretical sense, distributions over *traces* – types which carry
collections of values together with their names.  Existing realizations of these are `VarInfo` in
Turing.jl, choice maps in Gen.jl, and the usage of named tuples in Soss.jl.

Traces solve the problem of having to name random variables in function calls, and in samples from
models.  In essence, every concrete trace type will just be a fancy kind of dictionary from variable
names (ideally, `VarName`s) to values.

```julia
t = @T(Y[1] = ..., Z = ...)
```

Note that this needs to be a macro, if written this way, since the keys may themselves be more
complex than just symbols (e.g., indexed variables).  (Don’t get hung up on that `@T` name, though;
this is just a working draft.)
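To make the “fancy dictionary” view concrete, here is a purely illustrative sketch in plain Julia.  The `ToyVarName` type below is a hypothetical stand-in for indexed variable names, not the actual `VarName` implementation:

```julia
# Toy stand-in for a variable name: a symbol plus an index path,
# so that Y[1] and Z can both serve as keys.  Not the real VarName type.
struct ToyVarName
    sym::Symbol
    indexing::Tuple
end

ToyVarName(sym::Symbol) = ToyVarName(sym, ())

# A trace is then just a mapping from names to values:
t = Dict{ToyVarName,Float64}(
    ToyVarName(:Y, (1,)) => 0.5,  # stands for Y[1] = 0.5
    ToyVarName(:Z)       => 1.2,  # stands for Z = 1.2
)

t[ToyVarName(:Z)]  # look up the value stored under Z
```

Structural equality of the immutable key type makes dictionary lookup work out of the box here; a real trace type would add more structure (e.g., linking, transformations) on top of this.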

The idea here is to standardize the construction (and manipulation) of *abstract probability
expressions*, plus the interface for turning them into concrete traces for a specific model – like
[`@formula`](https://juliastats.org/StatsModels.jl/stable/formula/#Modeling-tabular-data) and
[`apply_schema`](https://juliastats.org/StatsModels.jl/stable/internals/#Semantics-time-(apply_schema))
from StatsModels.jl are doing.

Maybe the following would suffice to do that:

```julia
maketrace(m, t)::tracetype(m, t)
```

where `maketrace` produces a concrete trace corresponding to `t` for the model `m`, and `tracetype`
is the corresponding `eltype`-like function giving you the concrete trace type for a certain model
and probability expression combination.

Possible extensions of this idea:

- Pearl-style do-notation: `@T(Y = y | do(X = x))`
- Allowing free variables, to specify model transformations: `query(m, @T(X | Y))`
- “Graph queries”: `@T(X | Parents(X))`, `@T(Y | Not(X))` (a nice way to express Gibbs conditionals!)
- Predicate style for “measure queries”: `@T(X < Y + Z)`

The latter applications are the reason I originally liked the idea of the macro being called `@P`
(or even `@𝓅` or `@ℙ`), since then it would look like a “Bayesian probability expression”: `@P(X <
Y + Z)`.  But this would not be so meaningful in the case of representing a trace instance.

Perhaps both `@T` and `@P` can coexist, and both produce different kinds of `ProbabilityExpression`
objects?

NB: the exact details of this kind of “schema application”, and what results from it, will need to
be specified in the interface of `AbstractModelTrace`, aka “the new `VarInfo`”.


### “Conversions”

The purpose of this part is to provide common names for how we want a model instance to be
understood.  In some modelling languages, model instances are primarily generative or “joint”, with
some parameters fixed (e.g. in Soss.jl), while other instance types represent models already
conditioned on observations (e.g. Turing.jl’s models).

Let’s start from a generative model:

```julia
# (hypothetical) generative spec a la Soss
@generativemodel function foo_gen(μ)
    X ~ Normal(0, μ)
    Y[1] ~ Normal(X)
    Y[2] ~ Normal(X + 1)
end
```

Applying the “constructor” `foo_gen` now means to fix the parameters, and should return a concrete
object of the generative type (a `JointDistribution` in Soss.jl):

```julia
g = foo_gen(μ=…)::GenerativeModel
```

With this kind of object, we should be able to sample from and calculate joint log-densities, i.e.,
over the combined trace space of `X`, `Y[1]`, and `Y[2]`.

For model types that contain enough structural information, it should then be possible to condition
on observed values and obtain a conditioned model:

```julia
condition(g, @T(Y = ...))::ConditionedModel
```

For this operation, there will probably exist syntactic sugar in the form of

```julia
g | @T(Y = ...)
```

Now, if we start from a Turing.jl-like model instead, with the “observation part” already specified,
we have a situation like this, with the observation fixed in the instantiation:

```julia
# conditioned spec a la DPPL
@model function foo(Y, μ)
    X ~ Normal(0, μ)
    Y[1] ~ Normal(X)
    Y[2] ~ Normal(X + 1)
end

m = foo(Y=…, μ=…)::ConditionedModel
```

From this we can, if supported, go back to the generative form via `decondition`, and back via `condition`:

```julia
decondition(m) == g::GenerativeModel
m == condition(g, @T(Y = ...))
```

In the case of Turing.jl, the object `m` would at the same time contain the information about the
generative and posterior distributions; `condition` and `decondition` can then simply return
different kinds of “tagged” model types which put the model specification into a certain context.

Soss.jl pretty much already works like the examples above, with one model object being either a
`JointModel` or a `ConditionedModel`, and the `|` syntax just being sugar for the latter.

A hypothetical `DensityModel`, or something like the types from LogDensityProblems.jl, would be a
case for a model type that does not support the structural operations `condition` and
`decondition`.
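The round trip between generative and conditioned models could be realized with simple wrapper types.  A minimal, hedged sketch (all type and function names here are hypothetical, with no relation to the actual Turing.jl or Soss.jl implementations):

```julia
# Toy "tagged model" types: `condition` pairs a generative model with
# observations, `decondition` unwraps it again.
struct GenerativeModel{F}
    f::F  # the underlying model specification
end

struct ConditionedModel{F,T}
    model::GenerativeModel{F}
    observations::T  # e.g., a trace of observed values
end

condition(g::GenerativeModel, obs) = ConditionedModel(g, obs)
decondition(m::ConditionedModel) = m.model

# Sugar as in the text: `g | obs` conditions g on obs.
Base.:|(g::GenerativeModel, obs) = condition(g, obs)

g = GenerativeModel(identity)
m = g | (Y = [0.5, 1.5],)
decondition(m) === g  # the round trip recovers the generative model
```

A model type without such structural information (the `DensityModel` case above) would simply define neither method, and the operations would fail by dispatch.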


### Sampling

For sampling, model instances are assumed to implement the `AbstractMCMC` interface – i.e., at least
[`step`](https://github.com/TuringLang/AbstractMCMC.jl#sampling-step), and accordingly `sample`,
`steps`, `Samples`.  The most important aspect is `sample`, though, which plays the role of `rand`
for distributions.

The results of `sample` generalize `rand` – while `rand(d, N)` is assumed to give you iid samples,
`sample(m, sampler, N)` returns a sample from a (Markov) chain of length `N` approximating `m`’s
distribution by a specific sampling algorithm (which of course subsumes the case that `m` can be
sampled from exactly, in which case the “chain” actually is iid).

Depending on which kind of sampling is supported, several methods may be available.  In the case of
a (posterior) `ConditionedModel` with no known exact sampling possible, we just have what is given
through `AbstractMCMC`:

```julia
sample([rng], m, N, sampler; [args…])  # chain of length N using `sampler`
```

In the case of a generative model, or a posterior model with exact solution, we can have some more
methods without the need to specify a sampler:

```julia
sample([rng], m; [args…])     # one random sample
sample([rng], m, N; [args…])  # N iid samples; equivalent to `rand` in certain cases
```

It should be possible to implement this by a special sampler `Exact` (name still to be discussed),
that can then also be reused for generative sampling:

```julia
step(g, spl = Exact(), state = nothing)  # iid sample from exact distribution with trivial state
sample(g, Exact(), [N])
```

with dispatch failing for model types for which exact sampling is not possible (or implemented).

This could even be useful for Monte Carlo methods not based on Markov chains, e.g., particle-based
sampling using a return type with weights, or rejection sampling.

Not all variants need to be supported – for example, a posterior model might not support
`sample(m)` when exact sampling is not possible, only `sample(m, N, alg)` for Markov chains.

`rand` is then just a special case when “trivial” exact sampling works for a model, e.g. a joint
model.
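The “dispatch failing” behaviour of such an `Exact` sampler can be illustrated with toy types (again, all names are hypothetical and unrelated to the real `AbstractMCMC.step` signature):

```julia
using Random

# Marker type for exact (iid) sampling.
struct Exact end

struct ToyJointModel end      # pretend this one admits exact sampling
struct ToyPosteriorModel end  # this one does not

# Only the joint model gets an exact-sampling method; the state is trivial.
exact_step(rng, m::ToyJointModel, ::Exact, state=nothing) = (randn(rng), nothing)

x, state = exact_step(Random.default_rng(), ToyJointModel(), Exact())
# exact_step(Random.default_rng(), ToyPosteriorModel(), Exact()) would throw
# a MethodError, since no method is defined for that model type.
```

Models that do not support exact sampling thus fail loudly at dispatch time, rather than returning a silently wrong “chain”.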


### Density Calculation

Since the different “contexts” of how a model is to be understood are to be expressed in the type,
there should be no need for separate functions `logjoint`, `loglikelihood`, etc.; one
`logdensity` should suffice for all.  Note that this generalizes `logpdf`, too, since the posterior
density will of course in general be unnormalized.

The evaluation will usually work with the internal, concrete trace type, like `VarInfo` in Turing.jl:

```julia
logdensity(m, vi)
```

But the user will more likely work on the interface using probability expressions:

```julia
logdensity(m, @T(X = ...))
```

(Note that this could replace the current `prob` string macro in Turing.jl.)

It should be possible to make this fall back on the internal method with the right definition and
implementation of `maketrace`:

```julia
logdensity(m, t::ProbabilityExpression) = logdensity(m, maketrace(m, t))
```

There is one open question – should normalized and unnormalized densities be distinguishable?  This
could be done by dispatch as well, e.g., if the caller wants to ensure normalization:

```julia
logdensity(g, @T(X = ..., Y = ..., Z = ...); normalized=Val{true})
```

Although there is probably a better way through traits; maybe like for arrays, with
`NormalizationStyle(g, t) = IsNormalized()`?
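Such a trait could be set up in analogy to `Base.IndexStyle` for arrays; a hedged sketch of the idea (the trait and method names below are hypothetical, not part of any existing interface):

```julia
# Trait values distinguishing normalized from unnormalized densities.
abstract type NormalizationStyle end
struct IsNormalized <: NormalizationStyle end
struct IsUnnormalized <: NormalizationStyle end

# Default: a posterior density is in general unnormalized; specific
# model/expression combinations can opt in to IsNormalized().
NormalizationStyle(g, t) = IsUnnormalized()

# `logdensity` then dispatches on the trait value:
logdensity(g, t) = logdensity(NormalizationStyle(g, t), g, t)
logdensity(::IsNormalized, g, t) = ...    # density including the constant
logdensity(::IsUnnormalized, g, t) = ...  # density up to a constant
```

As with `IndexStyle`, the trait lets generic code query the capability without every call site threading a keyword argument through.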


## TL/DR:

- Probability expressions: `@T` and `maketrace`
- `condition(::Model, ::Trace) -> ConditionedModel`
- `decondition(::ConditionedModel) -> GenerativeModel`
- `sample(::Model, ::Sampler = Exact(), [Int])`
- `logdensity(::Model, ::Trace)`

Decomposing models into prior and observation distributions is not yet specified; the former is
rather easy, since it is only a marginal of the generative distribution, while the latter requires
more structural information.  Perhaps both can be generalized under the `query` function I have
hinted at above.