Skip to content

Commit 8492874

Browse files
committed
write guides for correlation structure and conditional MLmodel
1 parent 8e6e08b commit 8492874

File tree

17 files changed

+733
-876
lines changed

17 files changed

+733
-876
lines changed

docs/make.jl

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -23,8 +23,8 @@ makedocs(;
2323
#"Test quarto markdown" => "tutorials/test1.md",
2424
],
2525
"How to" => [
26-
#".. model independent parameters" => "tutorials/how_to_guides/blocks_corr_site.md",
27-
#".. model site-global corr" => "tutorials/how_to_guides/corr_site_global.md",
26+
".. model independent parameters" => "tutorials/blocks_corr.md",
27+
".. model site-global corr" => "tutorials/corr_site_global.md",
2828
],
2929
"Explanation" => [
3030
#"Theory" => "explanation/theory_hvi.md", TODO activate when paper is published

docs/src/tutorials/basic_cpu.qmd

Lines changed: 20 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -316,7 +316,25 @@ For the parameters, one row corresponds to
316316
one site. For the drivers and predictions, one column corresponds to one site.
317317

318318

319-
{{< include _pbm_matrix.qmd >}}
319+
```{julia}
320+
function f_doubleMM_sites(θc::CA.ComponentMatrix, xPc::CA.ComponentMatrix)
321+
# extract several covariates from xP
322+
ST = typeof(CA.getdata(xPc)[1:1,:]) # workaround for non-type-stable Symbol-indexing
323+
S1 = (CA.getdata(xPc[:S1,:])::ST)
324+
S2 = (CA.getdata(xPc[:S2,:])::ST)
325+
#
326+
# extract the parameters as row-repeated vectors
327+
n_obs = size(S1, 1)
328+
VT = typeof(CA.getdata(θc)[:,1]) # workaround for non-type-stable Symbol-indexing
329+
(r0, r1, K1, K2) = map((:r0, :r1, :K1, :K2)) do par
330+
p1 = CA.getdata(θc[:, par]) ::VT
331+
repeat(p1', n_obs) # matrix: same for each concentration row in S1
332+
end
333+
#
334+
# each variable is a matrix (n_obs x n_site)
335+
r0 .+ r1 .* S1 ./ (K1 .+ S1) .* S2 ./ (K2 .+ S2)
336+
end
337+
```
320338

321339
Again, the function should not rely on the order of parameters but use symbolic indexing
322340
to extract the parameter vectors. For type stability of this symbolic indexing,
@@ -345,7 +363,7 @@ epochs of the optimization.
345363
```{julia}
346364
(; probo) = solve(probo_sites, solver; rng,
347365
callback = callback_loss(100), # output during fitting
348-
epochs = 10,
366+
epochs = 20,
349367
#is_inferred = Val(true), # activate type-checks
350368
);
351369
```

docs/src/tutorials/blocks_corr.md

Lines changed: 171 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,171 @@
1+
# How to model indenpendent parameter-blocks in the posterior
2+
3+
4+
``` @meta
5+
CurrentModule = HybridVariationalInference
6+
```
7+
8+
This guide shows how to configure independent parameter-blocks in the correlations
9+
of the posterior.
10+
11+
## Motivation
12+
13+
Modelling all correlations among global and site PBM-parameters respectively
14+
requires many degrees of freedom.
15+
16+
To decrease the number of parameters to estimate, HVI allows to decompose the
17+
correlations into independent sub-blocks of parameters.
18+
19+
First load necessary packages.
20+
21+
``` julia
22+
using HybridVariationalInference
23+
using ComponentArrays: ComponentArrays as CA
24+
using Bijectors
25+
using SimpleChains
26+
using MLUtils
27+
using JLD2
28+
using Random
29+
using CairoMakie
30+
using PairPlots # scatterplot matrices
31+
```
32+
33+
This tutorial reuses and modifies the fitted object saved at the end of the
34+
[Basic workflow without GPU](@ref) tutorial.
35+
36+
``` julia
37+
fname = "intermediate/basic_cpu_results.jld2"
38+
print(abspath(fname))
39+
prob = probo_cor = load(fname, "probo");
40+
```
41+
42+
## Specifying blocks in correlation structure
43+
44+
HVI models the posterior of the parameters at unconstrained scale using a
45+
multivariate normal distribution. It estimates a parameterization of the
46+
associated blocks in the correlation matrx and requires a specification
47+
of the block-structure.
48+
49+
This is done by specifying the positions of the end of the blocks for
50+
the global (P) and the site-specific parameters (M) respectively using
51+
a `NamedTuple` of integer vectors.
52+
53+
The defaults specifies a single entry, meaning, there is only one big
54+
block respectively, spanning all parameters.
55+
56+
``` julia
57+
cor_ends0 = (P=[length(prob.θP)], M=[length(prob.θM)])
58+
```
59+
60+
(P = [1], M = [2])
61+
62+
The following specification models one-entry blocks for each each parameter
63+
in the correlation block the site parameters, i.e. treating all parameters
64+
independently with not modelling any correlations between them.
65+
66+
``` julia
67+
cor_ends = (P=[length(prob.θP)], M=1:length(prob.θM))
68+
```
69+
70+
(P = [1], M = 1:2)
71+
72+
## Reinitialize parameters for the posterior approximation.
73+
74+
HVI uses additional fitted parameters to represent the means and the
75+
covariance matrix of the posterior distribution of model parameters.
76+
With fewer correlations, also the number of those parameters changes,
77+
and those parameters must be reinitialized after changing the block structure in
78+
the correlation matrix.
79+
80+
Here, we obtain construct initial estimates. using [`init_hybrid_ϕunc`](@ref)
81+
82+
``` julia
83+
ϕunc = init_hybrid_ϕunc(cor_ends, zero(eltype(prob.θM)))
84+
```
85+
86+
In this two-site parameter case, the the blocked structure saves only one degree of freedom:
87+
88+
``` julia
89+
length(ϕunc), length(probo_cor.ϕunc)
90+
```
91+
92+
(5, 6)
93+
94+
## Update the problem and redo the inversion
95+
96+
``` julia
97+
prob_ind = HybridProblem(prob; cor_ends, ϕunc)
98+
```
99+
100+
``` julia
101+
using OptimizationOptimisers
102+
import Zygote
103+
104+
solver = HybridPosteriorSolver(; alg=Adam(0.02), n_MC=3)
105+
106+
(; probo) = solve(prob_ind, solver;
107+
callback = callback_loss(100), # output during fitting
108+
epochs = 20,
109+
); probo_ind = probo;
110+
```
111+
112+
## Compare the correated vs. uncorrelated posterior
113+
114+
First, draw a sample.
115+
116+
``` julia
117+
n_sample_pred = 400
118+
(y_cor, θsP_cor, θsMs_cor) = (; y, θsP, θsMs) = predict_hvi(
119+
Random.default_rng(), probo_cor; n_sample_pred)
120+
(y_ind, θsP_ind, θsMs_ind) = (; y, θsP, θsMs) = predict_hvi(
121+
Random.default_rng(), probo_ind; n_sample_pred)
122+
```
123+
124+
``` julia
125+
i_site = 1
126+
θ1 = vcat(θsP_ind, θsMs_ind[i_site,:,:])
127+
θ1_nt = NamedTuple(k => CA.getdata(θ1[k,:]) for k in keys(θ1[:,1])) #
128+
plt = pairplot(θ1_nt)
129+
```
130+
131+
![](blocks_corr_files/figure-commonmark/cell-11-output-1.png)
132+
133+
The corner plot of the independent-parameters estimate shows
134+
no correlations between site parameters, *r*₁ and *K*₁.
135+
136+
``` julia
137+
i_out = 4
138+
fig = Figure(); ax = Axis(fig[1,1], xlabel="mean(y)",ylabel="sd(y)")
139+
ymean_cor = [mean(y_cor[i_out,s,:]) for s in axes(y_cor, 2)]
140+
ysd_cor = [std(y_cor[i_out,s,:]) for s in axes(y_cor, 2)]
141+
scatter!(ax, ymean_cor, ysd_cor, label="correlated")
142+
ymean_ind = [mean(y_ind[i_out,s,:]) for s in axes(y_ind, 2)]
143+
ysd_ind = [std(y_ind[i_out,s,:]) for s in axes(y_ind, 2)]
144+
scatter!(ax, ymean_ind, ysd_ind, label="independent")
145+
axislegend(ax, unique=true)
146+
fig
147+
```
148+
149+
![](blocks_corr_files/figure-commonmark/cell-12-output-1.png)
150+
151+
``` julia
152+
plot_sd_vs_mean = (par) -> begin
153+
fig = Figure(); ax = Axis(fig[1,1], xlabel="mean($par)",ylabel="sd($par)")
154+
θmean_cor = [mean(θsMs_cor[s,par,:]) for s in axes(θsMs_cor, 1)]
155+
θsd_cor = [std(θsMs_cor[s,par,:]) for s in axes(θsMs_cor, 1)]
156+
scatter!(ax, θmean_cor, θsd_cor, label="correlated")
157+
θmean_ind = [mean(θsMs_ind[s,par,:]) for s in axes(θsMs_ind, 1)]
158+
θsd_ind = [std(θsMs_ind[s,par,:]) for s in axes(θsMs_ind, 1)]
159+
scatter!(ax, θmean_ind, θsd_ind, label="independent")
160+
axislegend(ax, unique=true)
161+
fig
162+
end
163+
plot_sd_vs_mean(:K1)
164+
```
165+
166+
![](blocks_corr_files/figure-commonmark/cell-13-output-1.png)
167+
168+
The inversion that neglects correlations among site parameters results in
169+
the same magnitude of estimated uncertainty of predictions.
170+
However, the uncertainty of the model parameters is severely underestimated
171+
in this example.

docs/src/tutorials/blocks_corr.qmd

Lines changed: 171 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,171 @@
1+
---
2+
title: "How to model indenpendent parameter-blocks in the posterior"
3+
engine: julia
4+
execute:
5+
echo: true
6+
output: false
7+
daemon: 3600
8+
format:
9+
commonmark:
10+
variant: -raw_html
11+
wrap: preserve
12+
bibliography: twutz_txt.bib
13+
---
14+
15+
``` @meta
16+
CurrentModule = HybridVariationalInference
17+
```
18+
19+
This guide shows how to configure independent parameter-blocks in the correlations
20+
of the posterior.
21+
22+
## Motivation
23+
Modelling all correlations among global and site PBM-parameters respectively
24+
requires many degrees of freedom.
25+
26+
To decrease the number of parameters to estimate, HVI allows to decompose the
27+
correlations into independent sub-blocks of parameters.
28+
29+
First load necessary packages.
30+
```{julia}
31+
using HybridVariationalInference
32+
using ComponentArrays: ComponentArrays as CA
33+
using Bijectors
34+
using SimpleChains
35+
using MLUtils
36+
using JLD2
37+
using Random
38+
using CairoMakie
39+
using PairPlots # scatterplot matrices
40+
```
41+
42+
This tutorial reuses and modifies the fitted object saved at the end of the
43+
[Basic workflow without GPU](@ref) tutorial.
44+
45+
```{julia}
46+
fname = "intermediate/basic_cpu_results.jld2"
47+
print(abspath(fname))
48+
prob = probo_cor = load(fname, "probo");
49+
```
50+
51+
## Specifying blocks in correlation structure
52+
HVI models the posterior of the parameters at unconstrained scale using a
53+
multivariate normal distribution. It estimates a parameterization of the
54+
associated blocks in the correlation matrx and requires a specification
55+
of the block-structure.
56+
57+
This is done by specifying the positions of the end of the blocks for
58+
the global (P) and the site-specific parameters (M) respectively using
59+
a `NamedTuple` of integer vectors.
60+
61+
The defaults specifies a single entry, meaning, there is only one big
62+
block respectively, spanning all parameters.
63+
```{julia}
64+
#| output: true
65+
cor_ends0 = (P=[length(prob.θP)], M=[length(prob.θM)])
66+
```
67+
68+
The following specification models one-entry blocks for each each parameter
69+
in the correlation block the site parameters, i.e. treating all parameters
70+
independently with not modelling any correlations between them.
71+
72+
```{julia}
73+
#| output: true
74+
cor_ends = (P=[length(prob.θP)], M=1:length(prob.θM))
75+
```
76+
77+
## Reinitialize parameters for the posterior approximation.
78+
HVI uses additional fitted parameters to represent the means and the
79+
covariance matrix of the posterior distribution of model parameters.
80+
With fewer correlations, also the number of those parameters changes,
81+
and those parameters must be reinitialized after changing the block structure in
82+
the correlation matrix.
83+
84+
Here, we obtain construct initial estimates. using [`init_hybrid_ϕunc`](@ref)
85+
86+
```{julia}
87+
ϕunc = init_hybrid_ϕunc(cor_ends, zero(eltype(prob.θM)))
88+
```
89+
90+
In this two-site parameter case, the the blocked structure saves only one degree of freedom:
91+
92+
```{julia}
93+
#| output: true
94+
length(ϕunc), length(probo_cor.ϕunc)
95+
```
96+
97+
## Update the problem and redo the inversion
98+
99+
100+
```{julia}
101+
prob_ind = HybridProblem(prob; cor_ends, ϕunc)
102+
```
103+
104+
```{julia}
105+
using OptimizationOptimisers
106+
import Zygote
107+
108+
solver = HybridPosteriorSolver(; alg=Adam(0.02), n_MC=3)
109+
110+
(; probo) = solve(prob_ind, solver;
111+
callback = callback_loss(100), # output during fitting
112+
epochs = 20,
113+
); probo_ind = probo;
114+
```
115+
116+
## Compare the correated vs. uncorrelated posterior
117+
118+
First, draw a sample.
119+
```{julia}
120+
n_sample_pred = 400
121+
(y_cor, θsP_cor, θsMs_cor) = (; y, θsP, θsMs) = predict_hvi(
122+
Random.default_rng(), probo_cor; n_sample_pred)
123+
(y_ind, θsP_ind, θsMs_ind) = (; y, θsP, θsMs) = predict_hvi(
124+
Random.default_rng(), probo_ind; n_sample_pred)
125+
```
126+
127+
```{julia}
128+
#| output: true
129+
i_site = 1
130+
θ1 = vcat(θsP_ind, θsMs_ind[i_site,:,:])
131+
θ1_nt = NamedTuple(k => CA.getdata(θ1[k,:]) for k in keys(θ1[:,1])) #
132+
plt = pairplot(θ1_nt)
133+
```
134+
The corner plot of the independent-parameters estimate shows
135+
no correlations between site parameters, $r_1$ and $K_1$.
136+
137+
```{julia}
138+
#| output: true
139+
i_out = 4
140+
fig = Figure(); ax = Axis(fig[1,1], xlabel="mean(y)",ylabel="sd(y)")
141+
ymean_cor = [mean(y_cor[i_out,s,:]) for s in axes(y_cor, 2)]
142+
ysd_cor = [std(y_cor[i_out,s,:]) for s in axes(y_cor, 2)]
143+
scatter!(ax, ymean_cor, ysd_cor, label="correlated")
144+
ymean_ind = [mean(y_ind[i_out,s,:]) for s in axes(y_ind, 2)]
145+
ysd_ind = [std(y_ind[i_out,s,:]) for s in axes(y_ind, 2)]
146+
scatter!(ax, ymean_ind, ysd_ind, label="independent")
147+
axislegend(ax, unique=true)
148+
fig
149+
```
150+
151+
```{julia}
152+
#| output: true
153+
plot_sd_vs_mean = (par) -> begin
154+
fig = Figure(); ax = Axis(fig[1,1], xlabel="mean($par)",ylabel="sd($par)")
155+
θmean_cor = [mean(θsMs_cor[s,par,:]) for s in axes(θsMs_cor, 1)]
156+
θsd_cor = [std(θsMs_cor[s,par,:]) for s in axes(θsMs_cor, 1)]
157+
scatter!(ax, θmean_cor, θsd_cor, label="correlated")
158+
θmean_ind = [mean(θsMs_ind[s,par,:]) for s in axes(θsMs_ind, 1)]
159+
θsd_ind = [std(θsMs_ind[s,par,:]) for s in axes(θsMs_ind, 1)]
160+
scatter!(ax, θmean_ind, θsd_ind, label="independent")
161+
axislegend(ax, unique=true)
162+
fig
163+
end
164+
plot_sd_vs_mean(:K1)
165+
```
166+
167+
The inversion that neglects correlations among site parameters results in
168+
the same magnitude of estimated uncertainty of predictions.
169+
However, the uncertainty of the model parameters is severely underestimated
170+
in this example.
171+
99.8 KB
Loading
93.6 KB
Loading
123 KB
Loading
67.8 KB
Loading

0 commit comments

Comments
 (0)