
Commit a8f0a77

Deploy preview for PR 603
1 parent e2e115d commit a8f0a77

File tree

3 files changed: +49 -48 lines changed

pr-previews/603/search.json

Lines changed: 1 addition & 1 deletion
@@ -1099,7 +1099,7 @@
"href": "usage/troubleshooting/index.html#t0001",
"title": "Troubleshooting",
"section": "T0001",
-
"text": "T0001\n\nfailed to find valid initial parameters in {N} tries. This may indicate an error with the model or AD backend…\n\nThis error is seen when a Hamiltonian Monte Carlo sampler is unable to determine a valid set of initial parameters for the sampling. Here, ‘valid’ means that the log probability density of the model, as well as its gradient with respect to each parameter, is finite and not NaN.\n\nNaN gradient\nOne of the most common causes of this error is having a NaN gradient. To find out whether this is happening, you can evaluate the gradient manually. Here is an example with a model that is known to be problematic:\n\nusing Turing\nusing DynamicPPL.TestUtils.AD: run_ad\n\n@model function t0001_bad()\n a ~ Normal()\n x ~ truncated(Normal(a), 0, Inf)\nend\n\nmodel = t0001_bad()\nadtype = AutoForwardDiff()\nresult = run_ad(model, adtype; test=false, benchmark=false)\nresult.grad_actual\n\n\n[ Info: Running AD on t0001_bad with ADTypes.AutoForwardDiff()\n params : [2.347078466381167, 1.4302538119954007]\n actual : (-4.8318669216945285, [NaN, NaN])\n\n\n\n\n2-element Vector{Float64}:\n NaN\n NaN\n\n\n(See the DynamicPPL docs for more details on the run_ad function and its return type.)\nIn this case, the NaN gradient is caused by the Inf argument to truncated. (See, e.g., this issue on Distributions.jl.) Here, the upper bound of Inf is not needed, so it can be removed:\n\n@model function t0001_good()\n a ~ Normal()\n x ~ truncated(Normal(a); lower=0)\nend\n\nmodel = t0001_good()\nadtype = AutoForwardDiff()\nrun_ad(model, adtype; test=false, benchmark=false).grad_actual\n\n\n[ Info: Running AD on t0001_good with ADTypes.AutoForwardDiff()\n params : [1.7972278889990705, 1.1034333085857766]\n actual : (-3.0535117243908374, [-0.6622784511026973, -2.6694582270335094])\n\n\n\n\n2-element Vector{Float64}:\n -0.6622784511026973\n -2.6694582270335094\n\n\nMore generally, you could try using a different AD backend; if you don’t know why a model is returning NaN gradients, feel free to open an issue.\n\n\n-Inf log density\nAnother cause of this error is having models with very extreme parameters. This example is taken from this Turing.jl issue:\n\n@model function t0001_bad2()\n x ~ Exponential(100)\n y ~ Uniform(0, x)\nend\n\nt0001_bad2 (generic function with 2 methods)\n\n\nThe problem here is that HMC attempts to find initial values for x inside the region of [-2, 2], after x has been transformed to unconstrained space. For a distribution of Exponential(100), the appropriate transformation is log(x) (see the variable transformation docs for more info). Thus, HMC attempts to find initial values of log(x) in the region of [-2, 2], which corresponds to x in the region of [exp(-2), exp(2)] = [0.135, 7.39]. However, all of these values of x will give rise to a zero probability density for y because the value of y = 50.0 is outside the support of Uniform(0, x). Thus, the log density of the model is -Inf, as can be seen with logjoint:\n\nmodel = t0001_bad2() | (y = 50.0,)\nlogjoint(model, (x = exp(-2),))\n\n-Inf\n\n\n\nlogjoint(model, (x = exp(2),))\n\n-Inf\n\n\nThe most direct way of fixing this is to manually provide a set of initial parameters that are valid. For example, you can obtain a set of initial parameters with rand(Dict, model), and then pass this as the initial_params keyword argument to sample. Otherwise, though, you may want to consider reparameterising the model to avoid such issues.",
+
"text": "T0001\n\nfailed to find valid initial parameters in {N} tries. This may indicate an error with the model or AD backend…\n\nThis error is seen when a Hamiltonian Monte Carlo sampler is unable to determine a valid set of initial parameters for the sampling. Here, ‘valid’ means that the log probability density of the model, as well as its gradient with respect to each parameter, is finite and not NaN.\n\nNaN gradient\nOne of the most common causes of this error is having a NaN gradient. To find out whether this is happening, you can evaluate the gradient manually. Here is an example with a model that is known to be problematic:\n\nusing Turing\nusing DynamicPPL.TestUtils.AD: run_ad\n\n@model function t0001_bad()\n a ~ Normal()\n x ~ truncated(Normal(a), 0, Inf)\nend\n\nmodel = t0001_bad()\nadtype = AutoForwardDiff()\nresult = run_ad(model, adtype; test=false, benchmark=false)\nresult.grad_actual\n\n\n[ Info: Running AD on t0001_bad with ADTypes.AutoForwardDiff()\n params : [2.618395892494956, -0.0007016468404532925]\n actual : (-6.57288826373337, [NaN, NaN])\n\n\n\n\n2-element Vector{Float64}:\n NaN\n NaN\n\n\n(See the DynamicPPL docs for more details on the run_ad function and its return type.)\nIn this case, the NaN gradient is caused by the Inf argument to truncated. (See, e.g., this issue on Distributions.jl.) Here, the upper bound of Inf is not needed, so it can be removed:\n\n@model function t0001_good()\n a ~ Normal()\n x ~ truncated(Normal(a); lower=0)\nend\n\nmodel = t0001_good()\nadtype = AutoForwardDiff()\nrun_ad(model, adtype; test=false, benchmark=false).grad_actual\n\n\n[ Info: Running AD on t0001_good with ADTypes.AutoForwardDiff()\n params : [-0.06679197373919739, 0.13819543914962595]\n actual : (-1.6921446454003433, [0.4408976311984438, -0.3950536225228891])\n\n\n\n\n2-element Vector{Float64}:\n 0.4408976311984438\n -0.3950536225228891\n\n\nMore generally, you could try using a different AD backend; if you don’t know why a model is returning NaN gradients, feel free to open an issue.\n\n\n-Inf log density\nAnother cause of this error is having models with very extreme parameters. This example is taken from this Turing.jl issue:\n\n@model function t0001_bad2()\n x ~ Exponential(100)\n y ~ Uniform(0, x)\nend\n\nt0001_bad2 (generic function with 2 methods)\n\n\nThe problem here is that HMC attempts to find initial values for parameters inside the region of [-2, 2], after the parameters have been transformed to unconstrained space. For a distribution of Exponential(100), the appropriate transformation is log(x) (see the variable transformation docs for more info).\nThus, HMC attempts to find initial values of log(x) in the region of [-2, 2], which corresponds to x in the region of [exp(-2), exp(2)] = [0.135, 7.39]. However, all of these values of x will give rise to a zero probability density for y because the value of y = 50.0 is outside the support of Uniform(0, x). Thus, the log density of the model is -Inf, as can be seen with logjoint:\n\nmodel = t0001_bad2() | (y = 50.0,)\nlogjoint(model, (x = exp(-2),))\n\n-Inf\n\n\n\nlogjoint(model, (x = exp(2),))\n\n-Inf\n\n\nThe most direct way of fixing this is to manually provide a set of initial parameters that are valid. For example, you can obtain a set of initial parameters with rand(Dict, model), and then pass this as the initial_params keyword argument to sample. Otherwise, though, you may want to consider reparameterising the model to avoid such issues.",
"crumbs": [
"Get Started",
"User Guide",

pr-previews/603/sitemap.xml

Lines changed: 40 additions & 40 deletions
@@ -2,162 +2,162 @@
 <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
 <url>
 <loc>https://turinglang.org/docs/getting-started/index.html</loc>
-<lastmod>2025-05-21T14:33:22.967Z</lastmod>
+<lastmod>2025-05-21T14:46:17.066Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/developers/compiler/minituring-contexts/index.html</loc>
-<lastmod>2025-05-21T14:33:22.964Z</lastmod>
+<lastmod>2025-05-21T14:46:17.062Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/developers/compiler/design-overview/index.html</loc>
-<lastmod>2025-05-21T14:33:22.963Z</lastmod>
+<lastmod>2025-05-21T14:46:17.062Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/developers/inference/variational-inference/index.html</loc>
-<lastmod>2025-05-21T14:33:22.964Z</lastmod>
+<lastmod>2025-05-21T14:46:17.063Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/developers/inference/abstractmcmc-turing/index.html</loc>
-<lastmod>2025-05-21T14:33:22.964Z</lastmod>
+<lastmod>2025-05-21T14:46:17.063Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/developers/transforms/dynamicppl/index.html</loc>
-<lastmod>2025-05-21T14:33:22.967Z</lastmod>
+<lastmod>2025-05-21T14:46:17.065Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/developers/transforms/bijectors/index.html</loc>
-<lastmod>2025-05-21T14:33:22.965Z</lastmod>
+<lastmod>2025-05-21T14:46:17.063Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/tutorials/variational-inference/index.html</loc>
-<lastmod>2025-05-21T14:33:22.969Z</lastmod>
+<lastmod>2025-05-21T14:46:17.067Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/tutorials/hidden-markov-models/index.html</loc>
-<lastmod>2025-05-21T14:33:22.968Z</lastmod>
+<lastmod>2025-05-21T14:46:17.067Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/tutorials/gaussian-process-latent-variable-models/index.html</loc>
-<lastmod>2025-05-21T14:33:22.968Z</lastmod>
+<lastmod>2025-05-21T14:46:17.067Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/tutorials/bayesian-time-series-analysis/index.html</loc>
-<lastmod>2025-05-21T14:33:22.968Z</lastmod>
+<lastmod>2025-05-21T14:46:17.066Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/tutorials/gaussian-mixture-models/index.html</loc>
-<lastmod>2025-05-21T14:33:22.968Z</lastmod>
+<lastmod>2025-05-21T14:46:17.067Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/tutorials/gaussian-processes-introduction/index.html</loc>
-<lastmod>2025-05-21T14:33:22.968Z</lastmod>
+<lastmod>2025-05-21T14:46:17.067Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/tutorials/coin-flipping/index.html</loc>
-<lastmod>2025-05-21T14:33:22.968Z</lastmod>
+<lastmod>2025-05-21T14:46:17.066Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/tutorials/bayesian-differential-equations/index.html</loc>
-<lastmod>2025-05-21T14:33:22.967Z</lastmod>
+<lastmod>2025-05-21T14:46:17.066Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/usage/modifying-logprob/index.html</loc>
-<lastmod>2025-05-21T14:33:22.969Z</lastmod>
+<lastmod>2025-05-21T14:46:17.068Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/usage/automatic-differentiation/index.html</loc>
-<lastmod>2025-05-21T14:33:22.969Z</lastmod>
+<lastmod>2025-05-21T14:46:17.068Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/usage/sampler-visualisation/index.html</loc>
-<lastmod>2025-05-21T14:33:22.970Z</lastmod>
+<lastmod>2025-05-21T14:46:17.068Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/usage/dynamichmc/index.html</loc>
-<lastmod>2025-05-21T14:33:22.969Z</lastmod>
+<lastmod>2025-05-21T14:46:17.068Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/usage/tracking-extra-quantities/index.html</loc>
-<lastmod>2025-05-21T14:33:22.970Z</lastmod>
+<lastmod>2025-05-21T14:46:17.068Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/usage/custom-distribution/index.html</loc>
-<lastmod>2025-05-21T14:33:22.969Z</lastmod>
+<lastmod>2025-05-21T14:46:17.068Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/usage/external-samplers/index.html</loc>
-<lastmod>2025-05-21T14:33:22.969Z</lastmod>
+<lastmod>2025-05-21T14:46:17.068Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/usage/performance-tips/index.html</loc>
-<lastmod>2025-05-21T14:33:22.969Z</lastmod>
+<lastmod>2025-05-21T14:46:17.068Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/usage/mode-estimation/index.html</loc>
-<lastmod>2025-05-21T14:33:22.969Z</lastmod>
+<lastmod>2025-05-21T14:46:17.068Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/usage/probability-interface/index.html</loc>
-<lastmod>2025-05-21T14:33:22.970Z</lastmod>
+<lastmod>2025-05-21T14:46:17.068Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/usage/troubleshooting/index.html</loc>
-<lastmod>2025-05-21T14:33:22.970Z</lastmod>
+<lastmod>2025-05-21T14:46:17.068Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/tutorials/probabilistic-pca/index.html</loc>
-<lastmod>2025-05-21T14:33:22.969Z</lastmod>
+<lastmod>2025-05-21T14:46:17.067Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/tutorials/bayesian-poisson-regression/index.html</loc>
-<lastmod>2025-05-21T14:33:22.968Z</lastmod>
+<lastmod>2025-05-21T14:46:17.066Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/tutorials/bayesian-logistic-regression/index.html</loc>
-<lastmod>2025-05-21T14:33:22.967Z</lastmod>
+<lastmod>2025-05-21T14:46:17.066Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/tutorials/bayesian-neural-networks/index.html</loc>
-<lastmod>2025-05-21T14:33:22.968Z</lastmod>
+<lastmod>2025-05-21T14:46:17.066Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/tutorials/infinite-mixture-models/index.html</loc>
-<lastmod>2025-05-21T14:33:22.968Z</lastmod>
+<lastmod>2025-05-21T14:46:17.067Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/tutorials/bayesian-linear-regression/index.html</loc>
-<lastmod>2025-05-21T14:33:22.967Z</lastmod>
+<lastmod>2025-05-21T14:46:17.066Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/tutorials/multinomial-logistic-regression/index.html</loc>
-<lastmod>2025-05-21T14:33:22.969Z</lastmod>
+<lastmod>2025-05-21T14:46:17.067Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/core-functionality/index.html</loc>
-<lastmod>2025-05-21T14:33:22.963Z</lastmod>
+<lastmod>2025-05-21T14:46:17.062Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/developers/transforms/distributions/index.html</loc>
-<lastmod>2025-05-21T14:33:22.965Z</lastmod>
+<lastmod>2025-05-21T14:46:17.063Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/developers/inference/abstractmcmc-interface/index.html</loc>
-<lastmod>2025-05-21T14:33:22.964Z</lastmod>
+<lastmod>2025-05-21T14:46:17.063Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/developers/inference/implementing-samplers/index.html</loc>
-<lastmod>2025-05-21T14:33:22.964Z</lastmod>
+<lastmod>2025-05-21T14:46:17.063Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/developers/contributing/index.html</loc>
-<lastmod>2025-05-21T14:33:22.964Z</lastmod>
+<lastmod>2025-05-21T14:46:17.062Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/developers/compiler/model-manual/index.html</loc>
-<lastmod>2025-05-21T14:33:22.964Z</lastmod>
+<lastmod>2025-05-21T14:46:17.062Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/developers/compiler/minituring-compiler/index.html</loc>
-<lastmod>2025-05-21T14:33:22.964Z</lastmod>
+<lastmod>2025-05-21T14:46:17.062Z</lastmod>
 </url>
 </urlset>

pr-previews/603/usage/troubleshooting/index.html

Lines changed: 8 additions & 7 deletions
@@ -744,8 +744,8 @@ <h3 class="anchored" data-anchor-id="nan-gradient"><code>NaN</code> gradient</h3
 <div class="cell-output cell-output-stdout">
 <div class="ansi-escaped-output">
 <pre><span class="ansi-cyan-fg ansi-bold">[ </span><span class="ansi-cyan-fg ansi-bold">Info: </span>Running AD on t0001_bad with ADTypes.AutoForwardDiff()
-  params : [2.347078466381167, 1.4302538119954007]
-  actual : (-4.8318669216945285, [NaN, NaN])
+  params : [2.618395892494956, -0.0007016468404532925]
+  actual : (-6.57288826373337, [NaN, NaN])
 </pre>
 </div>
 </div>
@@ -769,15 +769,15 @@ <h3 class="anchored" data-anchor-id="nan-gradient"><code>NaN</code> gradient</h3
 <div class="cell-output cell-output-stdout">
 <div class="ansi-escaped-output">
 <pre><span class="ansi-cyan-fg ansi-bold">[ </span><span class="ansi-cyan-fg ansi-bold">Info: </span>Running AD on t0001_good with ADTypes.AutoForwardDiff()
-  params : [1.7972278889990705, 1.1034333085857766]
-  actual : (-3.0535117243908374, [-0.6622784511026973, -2.6694582270335094])
+  params : [-0.06679197373919739, 0.13819543914962595]
+  actual : (-1.6921446454003433, [0.4408976311984438, -0.3950536225228891])
 </pre>
 </div>
 </div>
 <div class="cell-output cell-output-display" data-execution_count="1">
 <pre><code>2-element Vector{Float64}:
- -0.6622784511026973
- -2.6694582270335094</code></pre>
+ 0.4408976311984438
+ -0.3950536225228891</code></pre>
 </div>
 </div>
 <p>More generally, you could try using a different AD backend; if you don’t know why a model is returning <code>NaN</code> gradients, feel free to open an issue.</p>
@@ -794,7 +794,8 @@ <h3 class="anchored" data-anchor-id="inf-log-density"><code>-Inf</code> log dens
 <pre><code>t0001_bad2 (generic function with 2 methods)</code></pre>
 </div>
 </div>
-<p>The problem here is that HMC attempts to find initial values for <code>x</code> inside the region of <code>[-2, 2]</code>, after <code>x</code> has been transformed to unconstrained space. For a distribution of <code>Exponential(100)</code>, the appropriate transformation is <code>log(x)</code> (see the <a href="../../developers/transforms/distributions">variable transformation docs</a> for more info). Thus, HMC attempts to find initial values of <code>log(x)</code> in the region of <code>[-2, 2]</code>, which corresponds to <code>x</code> in the region of <code>[exp(-2), exp(2)]</code> = <code>[0.135, 7.39]</code>. However, all of these values of <code>x</code> will give rise to a zero probability density for <code>y</code> because the value of <code>y = 50.0</code> is outside the support of <code>Uniform(0, x)</code>. Thus, the log density of the model is <code>-Inf</code>, as can be seen with <code>logjoint</code>:</p>
+<p>The problem here is that HMC attempts to find initial values for parameters inside the region of <code>[-2, 2]</code>, <em>after</em> the parameters have been transformed to unconstrained space. For a distribution of <code>Exponential(100)</code>, the appropriate transformation is <code>log(x)</code> (see the <a href="../../developers/transforms/distributions">variable transformation docs</a> for more info).</p>
+<p>Thus, HMC attempts to find initial values of <code>log(x)</code> in the region of <code>[-2, 2]</code>, which corresponds to <code>x</code> in the region of <code>[exp(-2), exp(2)]</code> = <code>[0.135, 7.39]</code>. However, all of these values of <code>x</code> will give rise to a zero probability density for <code>y</code> because the value of <code>y = 50.0</code> is outside the support of <code>Uniform(0, x)</code>. Thus, the log density of the model is <code>-Inf</code>, as can be seen with <code>logjoint</code>:</p>
 <div id="12" class="cell" data-execution_count="1">
 <div class="sourceCode cell-code" id="cb8"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb8-1"><a href="#cb8-1" aria-hidden="true" tabindex="-1"></a>model <span class="op">=</span> <span class="fu">t0001_bad2</span>() <span class="op">|</span> (y <span class="op">=</span> <span class="fl">50.0</span>,)</span>
 <span id="cb8-2"><a href="#cb8-2" aria-hidden="true" tabindex="-1"></a><span class="fu">logjoint</span>(model, (x <span class="op">=</span> <span class="fu">exp</span>(<span class="op">-</span><span class="fl">2</span>),))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
