Commit a12fb9c

Introduction to JuliaGPs + Turing (#423)
* Initial pass

* Applying formatting suggestions

  Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Tweak AD code and discussion

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
1 parent 4482872 commit a12fb9c

File tree: 3 files changed, +2250 -0 lines changed

Lines changed: 165 additions & 0 deletions
---
title: Introduction to Gaussian Processes using JuliaGPs and Turing.jl
permalink: /tutorials/:name/
redirect_from: tutorials/15-gaussian-processes/
weave_options:
  error: false
---

[JuliaGPs](https://github.com/JuliaGaussianProcesses/#welcome-to-juliagps) packages integrate well with Turing.jl because they implement the Distributions.jl
interface.
You should be able to follow what is going on in this tutorial if you know what a GP is.
For a more in-depth understanding of the
[JuliaGPs](https://github.com/JuliaGaussianProcesses/#welcome-to-juliagps) functionality
used here, please consult the
[JuliaGPs](https://github.com/JuliaGaussianProcesses/#welcome-to-juliagps) docs.

In this tutorial, we will model the putting dataset discussed in Chapter 21 of
[Bayesian Data Analysis](http://www.stat.columbia.edu/%7Egelman/book/).
The dataset comprises the results of measuring how often a golfer successfully gets the ball
in the hole, depending on how far away from it they are.
The goal of inference is to estimate the probability of any given shot being successful at a
given distance.

Let's download the data and take a look at it:

```julia
using CSV, DataDeps, DataFrames

# Accept the DataDeps download prompt automatically so this runs non-interactively.
ENV["DATADEPS_ALWAYS_ACCEPT"] = true
register(
    DataDep(
        "putting",
        "Putting data from BDA",
        "http://www.stat.columbia.edu/~gelman/book/data/golf.dat",
        "fc28d83896af7094d765789714524d5a389532279b64902866574079c1a977cc",
    ),
)

fname = joinpath(datadep"putting", "golf.dat")
df = CSV.read(fname, DataFrame; delim=' ', ignorerepeated=true)
df[1:5, :]
```

We've printed the first 5 rows of the dataset (which comprises only 19 rows in total).
Observe that it has three columns:

1. `distance` -- how far away from the hole the shots were taken. I'll refer to `distance` as `d` throughout the rest of this tutorial.
2. `n` -- how many shots were taken from a given distance.
3. `y` -- how many shots were successful from a given distance (the resulting empirical proportions are shown in the quick look just below).

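As a quick, optional first look (not needed for anything that follows), we can compute the empirical success proportion at each distance:

```julia
# Empirical success proportion at each distance -- the quantity whose
# underlying probability we want the model to estimate.
DataFrame(; distance=df.distance, proportion=df.y ./ df.n)
```
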
We will use a Binomial model for the data, whose success probability is parametrised by a
transformation of a GP. Something along the lines of:

$$
f \sim \operatorname{GP}(0, k) \\
y_j \mid f(d_j) \sim \operatorname{Binomial}(n_j, g(f(d_j))) \\
g(x) := \frac{1}{1 + e^{-x}}
$$

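The link function `g` above is the standard logistic (inverse-logit) function, which LogExpFunctions exports as `logistic`. The small check below is purely an illustrative aside, not part of the model:

```julia
using LogExpFunctions

# g as written in the maths above, compared against the library's logistic function.
g(x) = 1 / (1 + exp(-x))
xs = -4.0:2.0:4.0
@assert g.(xs) ≈ logistic.(xs)
logistic.(xs)  # strictly between 0 and 1, increasing in x
```
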
To do this, let's define our Turing.jl model:

```julia
using AbstractGPs, LogExpFunctions, Turing

@model function putting_model(d, n; jitter=1e-4)
    # Priors over the kernel variance and length-scale.
    v ~ Gamma(2, 1)
    l ~ Gamma(4, 1)
    f = GP(v * with_lengthscale(SEKernel(), l))
    # Latent GP evaluated at the observed distances (plus a small amount of jitter).
    f_latent ~ f(d, jitter)
    y ~ product_distribution(Binomial.(n, logistic.(f_latent)))
    return (fx=f(d, jitter), f_latent=f_latent, y=y)
end
```

We first define an `AbstractGPs.GP`, which represents a distribution over functions, and
is entirely separate from Turing.jl.
We place a prior over its variance `v` and length-scale `l`.
`f(d, jitter)` constructs the multivariate Gaussian comprising the random variables
in `f` whose indices are in `d` (+ a bit of independent Gaussian noise with variance
`jitter` -- see [the docs](https://juliagaussianprocesses.github.io/AbstractGPs.jl/dev/api/#FiniteGP-and-AbstractGP)
for more details).
`f(d, jitter) isa AbstractMvNormal`, and is the bit of AbstractGPs.jl that implements the
Distributions.jl interface, so it's legal to put it on the right-hand side
of a `~`.
From this you should deduce that `f_latent` is distributed according to a multivariate
Gaussian.
The remaining lines comprise standard Turing.jl code that is encountered in other tutorials
and the Turing documentation.

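To make the `f(d, jitter) isa AbstractMvNormal` point concrete, here is a small standalone sketch (separate from the model above, with arbitrary kernel parameters and inputs) showing that a `GP` evaluated at a finite set of inputs behaves like any other Distributions.jl multivariate normal:

```julia
using AbstractGPs, Distributions

f = GP(2.0 * with_lengthscale(SEKernel(), 3.0))  # arbitrary variance and length-scale
x = [1.0, 2.5, 4.0]
fx = f(x, 1e-4)  # the GP at the inputs `x`, plus jitter -- a finite-dimensional projection

fx isa Distributions.AbstractMvNormal  # true, which is why `f_latent ~ f(d, jitter)` is legal
f_sample = rand(fx)                    # sample the latent function values at `x`
logpdf(fx, f_sample)                   # evaluate the log density, as for any MvNormal
```
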
Before performing inference, we might want to inspect the prior that our model places over
the data, to see whether there is anything that is obviously wrong.
These kinds of prior predictive checks are straightforward to perform using Turing.jl, since
it is possible to sample from the prior simply by calling the model:

```julia
m = putting_model(Float64.(df.distance), df.n)
m().y
```

We make use of this to see what kinds of datasets we might simulate from the prior:

```julia
using Plots

function plot_data(d, n, y, xticks, yticks)
    ylims = (0, round(maximum(n), RoundUp; sigdigits=2))
    margin = -0.5 * Plots.mm
    plt = plot(; xticks=xticks, yticks=yticks, ylims=ylims, margin=margin, grid=false)
    bar!(plt, d, n; color=:red, label="", alpha=0.5)
    bar!(plt, d, y; label="", color=:blue, alpha=0.7)
    return plt
end

# Construct model and run some prior predictive checks.
m = putting_model(Float64.(df.distance), df.n)
hists = map(1:20) do j
    xticks = j > 15 ? :auto : nothing
    yticks = rem(j, 5) == 1 ? :auto : nothing
    return plot_data(df.distance, df.n, m().y, xticks, yticks)
end
plot(hists...; layout=(4, 5))
```

In this case, the only prior knowledge I have is that the proportion of successful shots
ought to decrease monotonically as the distance from the hole increases, which should show
up in the data as the blue lines generally going down as we move from left to right on each
graph.
Unfortunately, there is not a simple way to enforce monotonicity in the samples from a GP,
and we can see this in some of the plots above, so we must hope that we have enough data to
ensure that this relationship approximately holds under the posterior.
In any case, you can judge for yourself whether you think this is the most useful
visualisation that we can perform -- if you think there is something better to look at,
please let us know!

Moving on, we generate samples from the posterior using the default `NUTS` sampler.
We'll make use of [ReverseDiff.jl](https://github.com/JuliaDiff/ReverseDiff.jl), as it has
better performance than [ForwardDiff.jl](https://github.com/JuliaDiff/ForwardDiff.jl/) on
this example. See Turing.jl's docs on Automatic Differentiation for more info.

```julia
using Random, ReverseDiff

Turing.setadbackend(:reversediff)
Turing.setrdcache(true)

# Condition the model on the observed data and sample from the posterior.
m_post = m | (y=df.y,)
chn = sample(Xoshiro(123456), m_post, NUTS(), 1_000)
```

We can use these samples, together with the `posterior` function from `AbstractGPs`, to sample
from the posterior distribution over the probability of success at any distance we choose:

```julia
d_pred = 1:0.2:21
samples = map(generated_quantities(m_post, chn)[1:10:end]) do x
    return logistic.(rand(posterior(x.fx, x.f_latent)(d_pred, 1e-4)))
end
p = plot()
plot!(d_pred, reduce(hcat, samples); label="", color=:blue, alpha=0.2)
scatter!(df.distance, df.y ./ df.n; label="", color=:red)
```
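
As one possible follow-up (a sketch rather than part of the tutorial proper; it assumes `samples`, `d_pred`, and `p` from the block above are still in scope, and the "posterior mean" label is ours), the posterior draws can be summarised by their pointwise mean and overlaid on the same plot:

```julia
using Statistics

# Pointwise posterior mean of the success probability at each prediction distance.
post_mean = vec(mean(reduce(hcat, samples); dims=2))
plot!(p, d_pred, post_mean; label="posterior mean", color=:black, linewidth=2)
```
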

We can see that the general trend is indeed downwards as the distance from the hole increases,
and that if we move away from the data, the posterior uncertainty quickly inflates.
This suggests that the model will probably do a reasonable job of interpolating
between observed data, but a less good job of extrapolating to larger distances.
