* Dealing with indentation problems (and probably others). ~268 warnings left
* Critical error fixed. ~267 left
* Fixing files with 2 possible references (cross-reference). ~204 warnings left
* Fixing glossary problems (and others). ~153 warnings left
* Fixing "Unknown interpreted text" errors. ~151 warnings left
* Fixing "cross-reference target not found"-style warnings. ~124 left
* Fixing "local id not found"-style warnings. ~104 left
* Fixing "duplicate citation"-style warnings. ~60 left
* Solving various warnings. ~48 left
* Fixing various warnings. ~42 left
* All warnings fixed; some are simply suppressed in conf.py because they cannot be fixed, are expected, or are unrelated to the documentation build.
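The suppression mentioned in the last item can be done with Sphinx's `suppress_warnings` option. A minimal conf.py sketch, assuming the remaining warnings fall into these standard Sphinx subtypes (the project's actual list is not shown in this PR):

```python
# conf.py -- illustrative sketch, not this project's actual configuration.
# Sphinx's `suppress_warnings` option hides warnings by category; the
# subtypes below correspond to the warning families named in the commit list.
suppress_warnings = [
    "ref.term",      # glossary term cross-reference problems
    "ref.citation",  # duplicate / unresolved citations
    "ref.ref",       # generic "cross-reference target not found"
]
```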
docs/algorithms.md (8 additions, 12 deletions)
````diff
@@ -1,19 +1,21 @@
 # Algorithms
 
+(fit)=
 ## Fit
 
-In this section we describe how to fit a `leaspy` model with your data. Leaspy uses the [MCMC-SAEM algorithm](./glossary.md#mcmc-saem) to fit a model by jointly estimating the fixed effects and the distribution of the random effects. It is particularly well suited to this kind of models where the likelihood involves latent variables and is not available in closed form.
+In this section we describe how to fit a `leaspy` model with your data. Leaspy uses the {term}`MCMC-SAEM algorithm <MCMC-SAEM>` to fit a model by jointly estimating the fixed effects and the distribution of the random effects. It is particularly well suited to this kind of models where the likelihood involves latent variables and is not available in closed form.
 
 The algorithm is an adaptation of the Expectation-Maximisation (EM) algorithm that relies on an iterative procedure that alternates between the following main steps:
 
-- Expectation/Stochastic Approximation Step: the algorithm uses [Markov Chain Monte Carlo (MCMC)](./glossary.md#mcmc) to generate samples of the latent variables (random effects) conditional on the current parameter estimates. In particular, Gibbs sampling is employed, which iteratively updates each latent variable conditional on the current values of the others, allowing efficient exploration of the latent space. To avoid convergence to local maxima, a temperature scheme is applied: the sampling distribution is initially “flattened” during the burn-in phase, to allow exploration of a wider range of values, and the temperature is gradually reduced over iterations so that the chain focuses increasingly on high-likelihood regions. The sufficient statistics of the complete-data log-likelihood are then computed using a stochastic approximation scheme.
+- Expectation/Stochastic Approximation Step: the algorithm uses {term}`Markov Chain Monte Carlo (MCMC) <MCMC>` to generate samples of the latent variables (random effects) conditional on the current parameter estimates. In particular, Gibbs sampling is employed, which iteratively updates each latent variable conditional on the current values of the others, allowing efficient exploration of the latent space. To avoid convergence to local maxima, a temperature scheme is applied: the sampling distribution is initially “flattened” during the burn-in phase, to allow exploration of a wider range of values, and the temperature is gradually reduced over iterations so that the chain focuses increasingly on high-likelihood regions. The sufficient statistics of the complete-data log-likelihood are then computed using a stochastic approximation scheme.
 - Maximization Step: Given the updated sufficient statistics, the fixed effects and variance components are re-estimated by maximizing the approximate complete-data log-likelihood.
 
 By iterating these steps, the MCMC-SAEM algorithm converges to the maximum likelihood estimates of the model parameters.
 
+(prerequisites)=
 ### Prerequisites
 
-Depending on the model you want to fit, you need a dataframe with a specific structure (see [logistic](./models.md#logistic-data), [joint](./models.md#joint-data), and [mixture](./models.md#mixture-data) models).
+Depending on the model you want to fit, you need a dataframe with a specific structure (see [logistic](logistic-data), [joint](joint-data), and [mixture](mixture-data) models).
 
 ### Running Task
 
@@ -25,7 +27,7 @@ Let's use the logistic model as an example.
 from leaspy.models import LogisticModel
 ```
 
-We need to specify the arguments `name`, `dimension` (the number of outcomes $K$ in your dataset) and the `obs_models` (valid choices for the logistic model are 'gaussian-diagonal' to estimate one noise coefficient per outcome or 'gaussian-scalar' to estimate one noise coefficient for all the outcomes). When we fit a multivariate model we also need to specify `source_dimension` that corresponds to the degrees of freedom of intermarker spacing parameters. We refer you to the [mathematical background section](./mathematics.md#individual-trajectory--spatial-random-effects) for more details. We generally suggest a number of sources close to the square root of the number of outcomes ($\sqrt{dimension}$).
+We need to specify the arguments `name`, `dimension` (the number of outcomes $K$ in your dataset) and the `obs_models` (valid choices for the logistic model are 'gaussian-diagonal' to estimate one noise coefficient per outcome or 'gaussian-scalar' to estimate one noise coefficient for all the outcomes). When we fit a multivariate model we also need to specify `source_dimension` that corresponds to the degrees of freedom of intermarker spacing parameters. We refer you to the [mathematical background section](individual-trajectory-spatial-random-effects) for more details. We generally suggest a number of sources close to the square root of the number of outcomes ($\sqrt{dimension}$).
 
 You can also add a `seed` or control other arguments for the output and the logs like `save_periodicity`, `path`, etc.
 
@@ -34,7 +36,7 @@ model = LogisticModel(name="my-model", source_dimension=1, dimension=2, obs_mode
-Note that the joint and mixture models require additional model-specific arguments. Please refer to their respective documentation for details: [joint model](./models.md#model-summary) and [mixture model](./models.md#id20).
+Note that the joint and mixture models require additional model-specific arguments. Please refer to their respective documentation for details: [joint model](joint-model-summary) and [mixture model](mixture-model-summary).
@@ … @@
--__More of a bayesian one:__ random effects are estimated using a Gibbs sampler with an option on the burn-in phase and temperature scheme (see [fit description](##Fit)). Currently, the package enables to extract the mean or the mode of the posterior distribution. They can be used with the same procedure using `mean_posterior` or `mode_posterior` flag.
+-__More of a bayesian one:__ random effects are estimated using a Gibbs sampler with an option on the burn-in phase and temperature scheme (see [fit description](#fit)). Currently, the package enables to extract the mean or the mode of the posterior distribution. They can be used with the same procedure using `mean_posterior` or `mode_posterior` flag.
````
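The Expectation/Stochastic Approximation and Maximization steps described in the changed paragraphs can be sketched on a toy latent-variable model. This is an illustrative SAEM loop with a Metropolis sampler and a burn-in temperature scheme, not leaspy's implementation; every name in it is made up for the example:

```python
import math
import random

# Toy model: z_i ~ N(theta, 1) (random effect), y_i ~ N(z_i, 1) (observation).
# SAEM alternates an MCMC sweep over the latent z_i with a stochastic
# approximation of the sufficient statistic, then a closed-form M-step.

random.seed(0)
true_theta = 2.0
n = 200
y = [random.gauss(random.gauss(true_theta, 1.0), 1.0) for _ in range(n)]

def log_joint(z, yi, theta, temperature):
    """Tempered complete-data log-likelihood for one individual."""
    return (-0.5 * (z - theta) ** 2 - 0.5 * (yi - z) ** 2) / temperature

theta = 0.0
z = list(y)            # initialise latent variables at the observations
stat = sum(z) / n      # running sufficient statistic
n_iter, n_burn = 200, 100
for it in range(n_iter):
    # Temperature > 1 flattens the target during burn-in (wider exploration),
    # then is annealed to 1 so the chain focuses on high-likelihood regions.
    temperature = max(1.0, 5.0 - 4.0 * it / n_burn)
    for i in range(n):  # one Metropolis-within-Gibbs sweep over the z_i
        proposal = z[i] + random.gauss(0.0, 0.5)
        log_ratio = (log_joint(proposal, y[i], theta, temperature)
                     - log_joint(z[i], y[i], theta, temperature))
        if random.random() < math.exp(min(0.0, log_ratio)):
            z[i] = proposal
    # Stochastic approximation: full updates during burn-in, then a
    # decreasing step size to average out the MCMC noise.
    step = 1.0 if it < n_burn else 1.0 / (it - n_burn + 1)
    stat += step * (sum(z) / n - stat)
    theta = stat        # M-step: MLE of theta given the current statistic

print(f"estimated theta = {theta:.2f} (true value {true_theta})")
```

The estimate should land close to the true value of 2.0; in the real package the same skeleton operates on the full set of fixed and random effects rather than a single scalar.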