*(Plot: "Observations - regular data and outliers")*

We will express our _observations_ as a [`ChoiceMap`](@ref) that constrains the
values of certain random choices to equal their observed values. Here, we
want to constrain the values of the choices with address `:data => i => :y`
(that is, the sampled $y$ coordinates) to equal the observed $y$ values.

Let's write a helper function that takes in a vector of $y$ values and
creates a [`ChoiceMap`](@ref) that we can use to constrain our model:

```@example mcmc_map_tutorial
function make_constraints(ys::Vector{Float64})
    constraints = choicemap()
    for i in 1:length(ys)
        # Constrain the sampled y-coordinate at address :data => i => :y
        # to its observed value.
        constraints[:data => i => :y] = ys[i]
    end
    constraints
end
```
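
With the observed data in hand (assuming, as elsewhere in this tutorial, that the
observed $y$ coordinates live in a vector called `ys`), building the constraint
map is then a one-liner:

```julia
observations = make_constraints(ys)
```
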
Markov chain Monte Carlo (MCMC) methods are a family of algorithms for
iteratively producing approximate samples from a distribution
(when applied to Bayesian inference problems, the posterior distribution of
unknown (hidden) model variables given data).

There is a rich theory behind MCMC methods (see [this paper](https://doi.org/10.1023/A:1020281327116)
for an introduction), but we focus on applying MCMC in Gen, introducing
theoretical ideas only when necessary for understanding. As we will see, Gen
provides abstractions that hide and automate much of the math necessary for
implementing MCMC algorithms correctly.

The general shape of an MCMC algorithm is as follows. We begin by sampling an
initial setting of all unobserved variables; in Gen, we produce an initial
trace consistent with the observed data. We then repeatedly propose small,
stochastic changes to that trace, accepting or rejecting each proposed change.
Intuitively, the accept/reject step ensures that the algorithm
tries not to go down dead ends: it is more likely to take an exploratory step
into a low-probability region if it knows it can easily get back to where it
came from.

Gen's [`metropolis_hastings`](@ref) function _automatically_ adds this
"accept/reject" check (including the correct computation of the probability
of acceptance or rejection), so that inference programmers need only
think about what sorts of updates might be useful to propose.

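For example, a single MH move that resimulates the model's `:slope` choice from
its prior (and then accepts or rejects the result) looks like this; a minimal
sketch, assuming `tr` holds the current trace:

```julia
# Resimulate :slope, then accept or reject with the MH acceptance probability.
(tr, accepted) = mh(tr, select(:slope))
```
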
One of the simplest strategies we can use is called Resimulation MH, and it
works as follows.

We begin, as in most iterative inference algorithms, by sampling an initial
trace from our model using the [`generate`](@ref) API function, fixing the
observed choices to their observed values.

```julia
# Gen's `generate` function accepts a model, a tuple of arguments to the model,
# and a ChoiceMap of observations. It returns a trace of the model that agrees
# with the observations, along with an importance weight.
# (The model name below assumes the `regression_with_outliers` model defined
# earlier in the tutorial.)
(tr, _) = generate(regression_with_outliers, (xs,), observations)
```

with the current hypothesized proportion of `is_outlier` choices that are
set to `true`.

Resimulating a block of variables is the simplest form of update that Gen's
[`metropolis_hastings`](@ref) operator (or [`mh`](@ref) for short) supports.
When supplied with a _current trace_ and a _selection_ of trace addresses to
resimulate, [`mh`](@ref) performs the resimulation and the appropriate
accept/reject check, then returns a possibly updated trace, along with a Boolean
indicating whether the update was accepted or not. A selection is created using
the [`select`](@ref) method. So a single update of the scheme we proposed above
would look like this:

```@example mcmc_map_tutorial
# Perform a single block resimulation update of a trace.
function block_resimulation_update(tr)
    # Block 1: Update the line's parameters
    line_params = select(:noise, :slope, :intercept)
    (tr, _) = mh(tr, line_params)

    # Blocks 2 to N+1: Update each point's outlier classification
    # (assuming the model's :data => i => :is_outlier addresses)
    (xs,) = get_args(tr)
    for i in 1:length(xs)
        (tr, _) = mh(tr, select(:data => i => :is_outlier))
    end

    # Block N+2: Update the outlier probability
    # (assuming the model's :prob_outlier address)
    (tr, _) = mh(tr, select(:prob_outlier))

    # Return the updated trace
    tr
end
nothing # hide
```
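
To turn this single update into an inference program, we can obtain an initial
trace with [`generate`](@ref) and apply the update in a loop. The sketch below
assumes the `regression_with_outliers` model from earlier; the iteration count
is illustrative:

```julia
function block_resimulation_inference(xs, ys, observations)
    (tr, _) = generate(regression_with_outliers, (xs,), observations)
    for iter in 1:500
        tr = block_resimulation_update(tr)
    end
    tr
end
```
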
We can see that although the algorithm keeps changing the inferences of which
points are inliers and outliers, it has a harder time refining the continuous
parameters. We address this challenge next.

## MCMC Part 2: Gaussian Drift MH

The first ingredient is a proposal that suggests new slope and intercept values
near the current ones (the choice of addresses and the 0.5 step sizes below are
illustrative):

```@example mcmc_map_tutorial
@gen function line_proposal(current_trace)
    # Propose small Gaussian perturbations around the current values.
    slope ~ normal(current_trace[:slope], 0.5)
    intercept ~ normal(current_trace[:intercept], 0.5)
end
nothing # hide
```

This is often called a "Gaussian drift" proposal, because it essentially amounts
to proposing steps of a random walk. (What makes it different from a random walk
is that we will still use an MH accept/reject step to make sure we don't wander
into areas of very low probability.)

To use the proposal, we write:

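A minimal sketch of that call (the variable names are illustrative):

```julia
(tr, did_accept) = mh(tr, line_proposal, ())
```
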
Two things to note:

2. The argument list to the proposal is an empty tuple, `()`. The
   `line_proposal` generative function does expect an argument, the previous
   trace, but this is supplied automatically to all MH custom proposals
   (a proposal generative function for use with [`mh`](@ref) must take as its
   first argument the current trace of the model).

Let's swap it into our update:

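A sketch of how the drift proposal slots into the block update from Part 1,
keeping resimulation for the discrete choices and the noise (the helper name and
iteration structure here are illustrative):

```julia
function gaussian_drift_update(tr)
    # Gaussian drift on the line's slope and intercept
    (tr, _) = mh(tr, line_proposal, ())

    # Resimulate the noise level and the outlier probability
    (tr, _) = mh(tr, select(:noise))
    (tr, _) = mh(tr, select(:prob_outlier))

    # Resimulate each point's outlier classification
    (xs,) = get_args(tr)
    for i in 1:length(xs)
        (tr, _) = mh(tr, select(:data => i => :is_outlier))
    end
    tr
end
```
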
```@example mcmc_map_tutorial
struct RANSACParams
    # Number of random subsets to try, and how many points to use for each
    # candidate line. (Field names other than `eps` are illustrative.)
    iters::Int
    subset_size::Int
    # Error threshold below which a point counts as an inlier.
    eps::Float64
end

function ransac(xs::Vector{Float64}, ys::Vector{Float64}, params::RANSACParams)
    best_num_inliers::Int = -1
    best_slope::Float64 = NaN
    best_intercept::Float64 = NaN
    for i in 1:params.iters
        # Fit a line to a random subset of points by least squares (using `\`).
        subset = rand(1:length(xs), params.subset_size)
        A = hcat(xs[subset], ones(params.subset_size))
        slope, intercept = A \ ys[subset]
        # Count the points that fall within `eps` of the candidate line.
        num_inliers = sum(abs.(ys .- (slope .* xs .+ intercept)) .< params.eps)
        if num_inliers > best_num_inliers
            best_slope, best_intercept = slope, intercept
            best_num_inliers = num_inliers
        end
    end
    # Return the line hypothesis with the most inliers.
    (best_slope, best_intercept)
end
nothing # hide
```

(Notice that although `ransac` makes random choices, they are not addressed
(and they happen outside of a Gen generative function), so Gen cannot reason
about them. This is OK (see [1]). Writing proposals that have traced internal
randomness (i.e., that make traced random choices that are not directly used
in the proposal) can lead to better inference, but requires the use of a more
complex version of Gen's [`mh`](@ref) operator, which is beyond the scope of
this tutorial.)

[1] [Using probabilistic programs as proposals](https://arxiv.org/abs/1801.03612), Marco F. Cusumano-Towner, Vikash K. Mansinghka, 2018.

One iteration of our update algorithm will now look like this:

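A sketch of the idea: wrap `ransac` in a generative proposal over the slope and
intercept, use it for an occasional large jump, then refine with the updates
from earlier (the helper name `ransac_update`, step sizes, iteration count, and
`RANSACParams` values below are illustrative):

```julia
@gen function ransac_proposal(prev_trace, xs, ys)
    (slope_guess, intercept_guess) = ransac(xs, ys, RANSACParams(10, 3, 1.0))
    # Propose values near the RANSAC estimate.
    slope ~ normal(slope_guess, 0.1)
    intercept ~ normal(intercept_guess, 1.0)
end

function ransac_update(tr)
    # Occasionally jump to the neighborhood of the RANSAC estimate.
    (xs,) = get_args(tr)
    ys = [tr[:data => i => :y] for i in 1:length(xs)]
    (tr, _) = mh(tr, ransac_proposal, (xs, ys))

    # Then spend some iterations refining with the drift/resimulation updates.
    for step in 1:20
        tr = gaussian_drift_update(tr)
    end
    tr
end
```
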
### Exercise
#### Improving the heuristic

Currently, the RANSAC heuristic does not use the current trace's information
at all. Try changing it to use the current state as follows: Instead of a
constant `eps` parameter that controls whether a point is considered an inlier,
make this decision based on the currently hypothesized noise level.
Specifically, set `eps` to be equal to the `noise` parameter of the trace.

Examine whether this improves inference.

```@example mcmc_map_tutorial
# Modify the function below (which currently is just a copy of `ransac_proposal`)
# so that the inlier threshold comes from the trace's current `noise` value.
# (The function name here is illustrative.)
@gen function ransac_proposal_noise_aware(prev_trace, xs, ys)
    (slope_guess, intercept_guess) = ransac(xs, ys, RANSACParams(10, 3, 1.0))
    slope ~ normal(slope_guess, 0.1)
    intercept ~ normal(intercept_guess, 1.0)
end
nothing # hide
```

### Exercise

Write a proposal that takes the subset of points
currently classified as *inliers*, finds the line of best fit for this
subset of points, and adds some noise.

_Hint_: you can obtain the least-squares linear regression fit by
solving a linear system using Julia's [backslash operator, `\`](https://docs.julialang.org/en/v1/base/math/#Base.:\\-Tuple{Any,Any}) (as is done in the `ransac` function, above).

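For instance, fitting `slope` and `intercept` to vectors `xs` and `ys` by least
squares with the backslash operator looks like this (variable names are
illustrative):

```julia
# Solve for [slope, intercept] minimizing the squared error of y ≈ slope*x + intercept.
A = hcat(xs, ones(length(xs)))
slope, intercept = A \ ys
```
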
We provide some starter code. You can test your solution by modifying the plotting code above.

## MAP Optimization

Instead of characterizing the posterior with samples, we can also search for a
single trace that (approximately) maximizes the posterior probability of the
unobserved choices given observations.

For example, let's say we wanted to take a trace and assign each point's
`is_outlier` score to the most likely possibility. We can do this by iterating
over both possible traces, scoring them, and choosing the one with the higher
score, using Gen's [`update`](@ref) function, which allows us to manually
update a trace to satisfy some constraints:

```@example mcmc_map_tutorial
# (The helper name and loop below are a sketch of the scheme described above.)
function is_outlier_map_update(tr)
    (xs,) = get_args(tr)
    for i in 1:length(xs)
        # Score the trace with this point classified as a non-outlier...
        constraints = choicemap()
        constraints[:data => i => :is_outlier] = false
        (tr_false, _) = update(tr, constraints)
        # ...and as an outlier, then keep whichever explanation scores higher.
        constraints[:data => i => :is_outlier] = true
        (tr_true, _) = update(tr, constraints)
        tr = get_score(tr_true) > get_score(tr_false) ? tr_true : tr_false
    end
    tr
end
nothing # hide
```

For continuous parameters, we can use Gen's [`map_optimize`](@ref) function,
which uses _automatic differentiation_ to shift the selected parameters in the
direction that causes the probability of the trace to increase most sharply:

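A minimal sketch of such a call, assuming we want to tune the slope, intercept,
and noise level:

```julia
# Gradient-based hill climbing on the selected continuous choices.
tr = map_optimize(tr, select(:slope, :intercept, :noise))
```
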
Putting these updates together, we can write an inference program that uses our
RANSAC algorithm from above to get an initial trace, then tunes it using
optimization:

```@example mcmc_map_tutorial
using StatsBase: mean

# ... (inference run and visualization) ...

println("Score after ransac: $(ransac_score). Final score: $(final_score).")
gif(viz)
```

Below, we evaluate the algorithm and see that it gets our best scores yet,
which is what it's meant to do:

```@example mcmc_map_tutorial
# (The body below is a sketch of the scheme described above, built from the
# helpers defined earlier; the iteration count is illustrative.)
function map_inference(xs, ys, observations)
    # Initialize with RANSAC, then alternate discrete and continuous MAP updates.
    (tr, _) = generate(regression_with_outliers, (xs,), observations)
    tr = ransac_update(tr)
    for iter in 1:20
        tr = is_outlier_map_update(tr)
        tr = map_optimize(tr, select(:slope, :intercept, :noise))
    end
    tr
end
nothing # hide
```

Although this experiment is imperfect, we can broadly see that the drift kernel
often explores both modes within a single run, whereas this is rarer for the MAP
kernel (in 25 runs, the MAP kernel visits on average 1.08 of the 2 modes,
whereas the drift kernel visits 1.6).