TuringLang · penelopeysm · Oct 10, 2024 · Oct 6, 2024 · Oct 7, 2024 · Oct 7, 2024
diff --git a/tutorials/04-hidden-markov-model/index.qmd b/tutorials/04-hidden-markov-model/index.qmd
@@ -14,7 +14,7 @@ This tutorial illustrates training Bayesian [Hidden Markov Models](https://en.wi
 
 In this tutorial, we assume there are $k$ discrete hidden states; the observations are continuous and normally distributed - centered around the hidden states. This assumption reduces the number of parameters to be estimated in the emission matrix.
 
-Let's load the libraries we'll need. We also set a random seed (for reproducibility) and the automatic differentiation backend to forward mode (more [here](../{{<meta using-turing-autodiff>}}) on why this is useful).
+Let's load the libraries we'll need. We also set a random seed (for reproducibility) and the automatic differentiation backend to forward mode (more [here]( {{<meta doc-base-url>}}/{{<meta using-turing-autodiff>}} ) on why this is useful).
 
 ```{julia}
 # Load libraries.
@@ -125,7 +125,7 @@ We will use a combination of two samplers ([HMC](https://turinglang.org/dev/docs
 
 In this case, we use HMC for `m` and `T`, representing the emission and transition matrices respectively. We use the Particle Gibbs sampler for `s`, the state sequence. You may wonder why it is that we are not assigning `s` to the HMC sampler, and why it is that we need compositional Gibbs sampling at all.
 
-The parameter `s` is not a continuous variable. It is a vector of **integers**, and thus Hamiltonian methods like HMC and [NUTS](https://turinglang.org/dev/docs/library/#Turing.Inference.NUTS) won't work correctly. Gibbs allows us to apply the right tools to the best effect. If you are a particularly advanced user interested in higher performance, you may benefit from setting up your Gibbs sampler to use [different automatic differentiation](https://turinglang.org/dev/docs/using-turing/autodiff#compositional-sampling-with-differing-ad-modes) backends for each parameter space.
+The parameter `s` is not a continuous variable. It is a vector of **integers**, and thus Hamiltonian methods like HMC and [NUTS](https://turinglang.org/dev/docs/library/#Turing.Inference.NUTS) won't work correctly. Gibbs allows us to apply the right tools to the best effect. If you are a particularly advanced user interested in higher performance, you may benefit from setting up your Gibbs sampler to use [different automatic differentiation]( {{<meta doc-base-url>}}/{{<meta using-turing-autodiff>}}#compositional-sampling-with-differing-ad-modes) backends for each parameter space.
 
 Time to run our sampler.
 

diff --git a/tutorials/06-infinite-mixture-model/index.qmd b/tutorials/06-infinite-mixture-model/index.qmd
@@ -81,7 +81,7 @@ x &\sim \mathrm{Normal}(\mu_z, \Sigma)
 \end{align}
 $$
 
-which resembles the model in the [Gaussian mixture model tutorial](../{{<meta gaussian-mixture-model>}}) with a slightly different notation.
+which resembles the model in the [Gaussian mixture model tutorial]( {{<meta doc-base-url>}}/{{<meta gaussian-mixture-model>}}) with a slightly different notation.
 
 ## Infinite Mixture Model
 

diff --git a/tutorials/08-multinomial-logistic-regression/index.qmd b/tutorials/08-multinomial-logistic-regression/index.qmd
@@ -145,7 +145,7 @@ chain
 ::: {.callout-warning collapse="true"}
 ## Sampling With Multiple Threads
 The `sample()` call above assumes that you have at least `nchains` threads available in your Julia instance. If you do not, the multiple chains
-will run sequentially, and you may notice a warning. For more information, see [the Turing documentation on sampling multiple chains.](../../{{<meta using-turing>}})
+will run sequentially, and you may notice a warning. For more information, see [the Turing documentation on sampling multiple chains.]( {{<meta doc-base-url>}}/{{<meta using-turing>}}#sampling-multiple-chains )
 :::
 
 Since we ran multiple chains, we may as well do a spot check to make sure each chain converges around similar points.

diff --git a/tutorials/09-variational-inference/index.qmd b/tutorials/09-variational-inference/index.qmd
@@ -13,7 +13,7 @@ Pkg.instantiate();
 In this post we'll have a look at what's know as **variational inference (VI)**, a family of _approximate_ Bayesian inference methods, and how to use it in Turing.jl as an alternative to other approaches such as MCMC. In particular, we will focus on one of the more standard VI methods called **Automatic Differentation Variational Inference (ADVI)**.
 
 Here we will focus on how to use VI in Turing and not much on the theory underlying VI.
-If you are interested in understanding the mathematics you can checkout [our write-up](../../{{<meta using-turing-variational-inference>}}) or any other resource online (there a lot of great ones).
+If you are interested in understanding the mathematics you can check out [our write-up]( {{<meta doc-base-url>}}/{{<meta using-turing-variational-inference>}} ) or any other resource online (there are a lot of great ones).
 
 Using VI in Turing.jl is very straight forward.
 If `model` denotes a definition of a `Turing.Model`, performing VI is as simple as
@@ -26,7 +26,7 @@ q = vi(m, vi_alg)  # perform VI on `m` using the VI method `vi_alg`, which retur
 
 Thus it's no more work than standard MCMC sampling in Turing.
 
-To get a bit more into what we can do with `vi`, we'll first have a look at a simple example and then we'll reproduce the [tutorial on Bayesian linear regression](../../{{<meta linear-regression>}}) using VI instead of MCMC. Finally we'll look at some of the different parameters of `vi` and how you for example can use your own custom variational family.
+To get a bit more into what we can do with `vi`, we'll first have a look at a simple example and then we'll reproduce the [tutorial on Bayesian linear regression]( {{<meta doc-base-url>}}/{{<meta linear-regression>}}) using VI instead of MCMC. Finally we'll look at some of the different parameters of `vi` and how you for example can use your own custom variational family.
 
 We first import the packages to be used:
 

diff --git a/tutorials/14-minituring/index.qmd b/tutorials/14-minituring/index.qmd
@@ -82,7 +82,7 @@ Thus depending on the inference algorithm we want to use different `assume` and
 We can achieve this by providing this `context` information as a function argument to `assume` and `observe`.
 
 **Note:** *Although the context system in this tutorial is inspired by DynamicPPL, it is very simplistic.
-We expand this mini Turing example in the [contexts](../{{<meta contexts>}}) tutorial with some more complexity, to illustrate how and why contexts are central to Turing's design. For the full details one still needs to go to the actual source of DynamicPPL though.*
+We expand this mini Turing example in the [contexts]( {{<meta doc-base-url>}}/{{<meta contexts>}} ) tutorial with some more complexity, to illustrate how and why contexts are central to Turing's design. For the full details one still needs to go to the actual source of DynamicPPL though.*
 
 Here we can see the implementation of a sampler that draws values of unobserved variables from the prior and computes the log-probability for every variable.
 

diff --git a/tutorials/docs-00-getting-started/index.qmd b/tutorials/docs-00-getting-started/index.qmd
@@ -82,5 +82,5 @@ The underlying theory of Bayesian machine learning is not explained in detail in
 A thorough introduction to the field is [*Pattern Recognition and Machine Learning*](https://www.springer.com/us/book/9780387310732) (Bishop, 2006); an online version is available [here (PDF, 18.1 MB)](https://www.microsoft.com/en-us/research/uploads/prod/2006/01/Bishop-Pattern-Recognition-and-Machine-Learning-2006.pdf).
 :::
 
-The next page on [Turing's core functionality](../../{{<meta using-turing>}}) explains the basic features of the Turing language.
-From there, you can either look at [worked examples of how different models are implemented in Turing](../../{{<meta tutorials-intro>}}), or [specific tips and tricks that can help you get the most out of Turing](../../{{<meta using-turing-mode-estimation>}}).
+The next page on [Turing's core functionality]( {{<meta doc-base-url>}}/{{<meta using-turing>}} ) explains the basic features of the Turing language.
+From there, you can either look at [worked examples of how different models are implemented in Turing]( {{<meta doc-base-url>}}/{{<meta tutorials-intro>}} ), or [specific tips and tricks that can help you get the most out of Turing]( {{<meta doc-base-url>}}/{{<meta using-turing-mode-estimation>}} ).
diff --git a/tutorials/docs-04-for-developers-abstractmcmc-turing/index.qmd b/tutorials/docs-04-for-developers-abstractmcmc-turing/index.qmd
@@ -33,7 +33,7 @@ n_samples = 1000
 chn = sample(mod, alg, n_samples, progress=false)
 ```
 
-The function `sample` is part of the AbstractMCMC interface. As explained in the [interface guide](../{{<meta using-turing-interface>}}), building a sampling method that can be used by `sample` consists in overloading the structs and functions in `AbstractMCMC`. The interface guide also gives a standalone example of their implementation, [`AdvancedMH.jl`]().
+The function `sample` is part of the AbstractMCMC interface. As explained in the [interface guide]( {{<meta doc-base-url>}}/{{<meta using-turing-interface>}} ), building a sampling method that can be used by `sample` consists in overloading the structs and functions in `AbstractMCMC`. The interface guide also gives a standalone example of their implementation, [`AdvancedMH.jl`]().
 
 Turing sampling methods (most of which are written [here](https://github.com/TuringLang/Turing.jl/tree/master/src/mcmc)) also implement `AbstractMCMC`. Turing defines a particular architecture for `AbstractMCMC` implementations, that enables working with models defined by the `@model` macro, and uses DynamicPPL as a backend. The goal of this page is to describe this architecture, and how you would go about implementing your own sampling method in Turing, using Importance Sampling as an example. I don't go into all the details: for instance, I don't address selectors or parallelism.
 

diff --git a/tutorials/docs-07-for-developers-variational-inference/index.qmd b/tutorials/docs-07-for-developers-variational-inference/index.qmd
@@ -7,7 +7,7 @@ engine: julia
 
 In this post, we'll examine variational inference (VI), a family of approximate Bayesian inference methods. We will focus on one of the more standard VI methods, Automatic Differentiation Variational Inference (ADVI).
 
-Here, we'll examine the theory behind VI, but if you're interested in using ADVI in Turing, [check out this tutorial](../../{{<meta variational-inference>}}).
+Here, we'll examine the theory behind VI, but if you're interested in using ADVI in Turing, [check out this tutorial]( {{<meta doc-base-url>}}/{{<meta variational-inference>}} ).
 
 # Motivation
 
@@ -380,4 +380,4 @@ $$
 
 And maximizing this wrt. $\mu$ and $\Sigma$ is what's referred to as **Automatic Differentiation Variational Inference (ADVI)**!
 
-Now if you want to try it out, [check out the tutorial on how to use ADVI in Turing.jl](../../{{<meta variational-inference>}})!
+Now if you want to try it out, [check out the tutorial on how to use ADVI in Turing.jl]( {{<meta doc-base-url>}}/{{<meta variational-inference>}} )!
diff --git a/tutorials/docs-12-using-turing-guide/index.qmd b/tutorials/docs-12-using-turing-guide/index.qmd
@@ -427,7 +427,7 @@ mle_estimate = maximum_likelihood(model)
 map_estimate = maximum_a_posteriori(model)
 ```
 
-For more details see the [mode estimation page](../{{<meta using-turing-mode-estimation>}}).
+For more details see the [mode estimation page]( {{<meta doc-base-url>}}/{{<meta using-turing-mode-estimation>}} ).
 
 ## Beyond the Basics
 
@@ -453,7 +453,7 @@ simple_choice_f = simple_choice([1.5, 2.0, 0.3])
 chn = sample(simple_choice_f, Gibbs(HMC(0.2, 3, :p), PG(20, :z)), 1000)
 ```
 
-The `Gibbs` sampler can be used to specify unique automatic differentiation backends for different variable spaces. Please see the [Automatic Differentiation](../{{<meta using-turing-autodiff>}}) article for more.
+The `Gibbs` sampler can be used to specify unique automatic differentiation backends for different variable spaces. Please see the [Automatic Differentiation]( {{<meta doc-base-url>}}/{{<meta using-turing-autodiff>}} ) article for more.
 
 For more details of compositional sampling in Turing.jl, please check the corresponding [paper](https://proceedings.mlr.press/v84/ge18b.html).
 

diff --git a/tutorials/docs-13-using-turing-performance-tips/index.qmd b/tutorials/docs-13-using-turing-performance-tips/index.qmd
@@ -49,7 +49,7 @@ supports several AD backends, including [ForwardDiff](https://github.com/JuliaDi
 
 For many common types of models, the default ForwardDiff backend performs great, and there is no need to worry about changing it. However, if you need more speed, you can try
 different backends via the standard [ADTypes](https://github.com/SciML/ADTypes.jl) interface by passing an `AbstractADType` to the sampler with the optional `adtype` argument, e.g.
-`NUTS(adtype = AutoZygote())`. See [Automatic Differentiation](../../{{<meta using-turing-autodiff>}}) for details. Generally, `adtype = AutoForwardDiff()` is likely to be the fastest and most reliable for models with
+`NUTS(adtype = AutoZygote())`. See [Automatic Differentiation]( {{<meta doc-base-url>}}/{{<meta using-turing-autodiff>}} ) for details. Generally, `adtype = AutoForwardDiff()` is likely to be the fastest and most reliable for models with
 few parameters (say, less than 20 or so), while reverse-mode backends such as `AutoZygote()` or `AutoReverseDiff()` will perform better for models with many parameters or linear algebra
 operations. If in doubt, it's easy to try a few different backends to see how they compare.