### Gibbs Sampler with Simplex Constraints
`pybmc` also provides a Gibbs sampler that enforces simplex constraints on the model weights (i.e., \(\sum w_k = 1\) and \(w_k \ge 0\)). This is achieved by performing a random walk in the space of transformed parameters, with a Metropolis-Hastings step that rejects any proposal falling outside the valid simplex region.
#### When to Use Each Mode
| Mode | Description | Use When |
|------|-------------|----------|
| **Unconstrained** (default) | Weights can take any real value | Maximum flexibility; some models may receive negative weights to cancel out biases |
| **Simplex** | Weights satisfy \(w_k \ge 0\) and \(\sum w_k = 1\) | You need interpretable weights that form a proper mixture, or predictions that stay within the range of the individual models |
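The bias-cancellation behavior in the unconstrained row can be seen with a toy calculation (all numbers here are invented for illustration):

```python
# Toy illustration: two models that both over-predict the same quantity.
truth = 2.0
model_a = 2.5
model_b = 3.0

# Any simplex combination stays within [2.5, 3.0], so it can never reach 2.0;
# the best it can do is the smaller prediction.
simplex_best = min(model_a, model_b)

# An unconstrained combination can cancel the shared bias exactly:
w_a, w_b = 2.0, -1.0          # weights still sum to 1, but w_b is negative
unconstrained = w_a * model_a + w_b * model_b
print(unconstrained)  # → 2.0
```

This is exactly why unconstrained weights can be harder to interpret: the negative weight is doing bias correction, not expressing model plausibility.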
#### Simplex Constraint Implementation
The simplex constraint is enforced through a Metropolis-within-Gibbs algorithm. In the SVD-reduced coefficient space, the relationship between the regression coefficients \(\boldsymbol{\beta}\) and the model weights \(\boldsymbol{\omega}\) is:

\[
\boldsymbol{\omega} = \boldsymbol{\beta} \hat{V} + \frac{1}{K},
\]

where \(\hat{V}\) contains the (normalized) right singular vectors and \(K\) is the number of models. The term \(\frac{1}{K}\) represents the equal-weight baseline.
At each iteration, the algorithm:
1. **Proposes** a new coefficient vector \(\boldsymbol{\beta}^*\) from a multivariate normal centered on the current value.
2. **Projects** the proposal to weight space via \(\boldsymbol{\omega}^* = \boldsymbol{\beta}^* \hat{V} + \frac{1}{K}\).
3. **Rejects** the proposal if any \(\omega_k^* < 0\) (the sum-to-one constraint is automatically satisfied by the SVD structure and the \(\frac{1}{K}\) offset).
4. **Accepts** valid proposals with probability \(\min\!\bigl(1,\; \exp\!\bigl[\bigl(\ell(\boldsymbol{\beta}^*) - \ell(\boldsymbol{\beta})\bigr) / \sigma^2\bigr]\bigr)\), where \(\ell\) is the log-likelihood.
5. **Samples** the error variance \(\sigma^2\) from its inverse-gamma full conditional.
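The per-iteration weight update can be sketched in plain Python. This is an illustrative toy, not `pybmc`'s implementation: the \(\hat{V}\) matrix, the stand-in log-likelihood, and all constants below are invented for the example.

```python
import math
import random

random.seed(0)

K = 3  # number of models
# Rows of V_hat are orthonormal and orthogonal to the all-ones vector,
# so omega = beta @ V_hat + 1/K always sums to exactly 1.
V_hat = [
    [1 / math.sqrt(2), -1 / math.sqrt(2), 0.0],
    [1 / math.sqrt(6), 1 / math.sqrt(6), -2 / math.sqrt(6)],
]

def to_weights(beta):
    """Project reduced coefficients into weight space, offset by 1/K."""
    return [sum(b * V_hat[i][k] for i, b in enumerate(beta)) + 1.0 / K
            for k in range(K)]

def log_likelihood(beta):
    # Stand-in log-likelihood; pybmc would evaluate the Gaussian data model.
    return -sum(b * b for b in beta)

def mh_step(beta, sigma2=1.0, stepsize=0.1):
    """One Metropolis-within-Gibbs update of the weight coefficients."""
    proposal = [b + stepsize * random.gauss(0.0, 1.0) for b in beta]
    if any(w < 0 for w in to_weights(proposal)):
        return beta                       # step 3: outside the simplex
    log_ratio = (log_likelihood(proposal) - log_likelihood(beta)) / sigma2
    if math.log(random.random()) < log_ratio:
        return proposal                   # step 4: accept
    return beta

beta = [0.0, 0.0]                         # start at the equal-weight baseline
for _ in range(1000):
    beta = mh_step(beta)

omega = to_weights(beta)
print(omega, sum(omega))  # nonnegative weights summing to 1
```

Because invalid proposals are simply rejected, every retained state lies on the simplex by construction.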
The `burn` parameter controls the number of burn-in iterations discarded before collecting samples, and the `stepsize` parameter scales the proposal covariance matrix to tune the acceptance rate.
```python
# Override to simplex for this specific training run
bmc.train(training_options={
    "iterations": 50000,
    "sampler": "simplex",
    "burn": 10000,
    "stepsize": 0.001,
})
```
### Inspecting Model Weights
After training, you can inspect the inferred model weights using `get_weights()`:
```python
# Get a summary (mean, std, median per model)
summary = bmc.get_weights()
for model, mean_w, std_w in zip(summary["models"], summary["mean"], summary["std"]):
    print(f"{model}: {mean_w:.4f} ± {std_w:.4f}")

# Get the full weight matrix (n_samples × n_models) for custom analysis
weight_matrix = bmc.get_weights(summary=False)
```
In simplex mode, every row of the weight matrix is guaranteed to satisfy
\(w_k \ge 0\) and \(\sum_k w_k = 1\).
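That guarantee is easy to check directly on the returned matrix. A minimal sketch (the rows below are made up; in practice they would come from `get_weights(summary=False)`):

```python
# Hypothetical per-draw weight rows standing in for the real weight matrix.
weight_matrix = [
    [0.20, 0.50, 0.30],
    [0.05, 0.65, 0.30],
]

# Verify every posterior draw lies on the simplex.
for row in weight_matrix:
    assert all(w >= 0 for w in row)
    assert abs(sum(row) - 1.0) < 1e-8
print("all posterior draws lie on the simplex")
```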
## 4. Make Predictions
After training, we can use the `predict` method to generate predictions with uncertainty quantification. The method returns the full posterior draws, as well as DataFrames for the lower, median, and upper credible intervals.
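The lower, median, and upper summaries correspond to quantiles of the posterior predictive draws. A self-contained sketch of that computation using only the standard library (the draws here are simulated, not produced by `pybmc`):

```python
import random
import statistics

random.seed(1)

# Fake posterior predictive draws for a single data point; predict() would
# supply draws like these for every point in the evaluation set.
draws = [random.gauss(10.0, 2.0) for _ in range(4000)]

# 95% equal-tailed credible interval: the 2.5th and 97.5th percentiles,
# with the posterior median as the point summary.
cuts = statistics.quantiles(draws, n=40)   # 39 cut points at 2.5% steps
lower, upper = cuts[0], cuts[-1]
median = statistics.median(draws)
print(f"{lower:.2f} < {median:.2f} < {upper:.2f}")
```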