
Commit a8f0a77

Deploy preview for PR 603
1 parent e2e115d commit a8f0a77

File tree

3 files changed: +49 -48 lines changed

pr-previews/603/search.json

Lines changed: 1 addition & 1 deletion
@@ -1099,7 +1099,7 @@
"href": "usage/troubleshooting/index.html#t0001",
"title": "Troubleshooting",
"section": "T0001",
-
"text": "T0001\n\nfailed to find valid initial parameters in {N} tries. This may indicate an error with the model or AD backend…\n\nThis error is seen when a Hamiltonian Monte Carlo sampler is unable to determine a valid set of initial parameters for the sampling. Here, ‘valid’ means that the log probability density of the model, as well as its gradient with respect to each parameter, is finite and not NaN.\n\nNaN gradient\nOne of the most common causes of this error is having a NaN gradient. To find out whether this is happening, you can evaluate the gradient manually. Here is an example with a model that is known to be problematic:\n\nusing Turing\nusing DynamicPPL.TestUtils.AD: run_ad\n\n@model function t0001_bad()\n a ~ Normal()\n x ~ truncated(Normal(a), 0, Inf)\nend\n\nmodel = t0001_bad()\nadtype = AutoForwardDiff()\nresult = run_ad(model, adtype; test=false, benchmark=false)\nresult.grad_actual\n\n\n[ Info: Running AD on t0001_bad with ADTypes.AutoForwardDiff()\n params : [2.347078466381167, 1.4302538119954007]\n actual : (-4.8318669216945285, [NaN, NaN])\n\n\n\n\n2-element Vector{Float64}:\n NaN\n NaN\n\n\n(See the DynamicPPL docs for more details on the run_ad function and its return type.)\nIn this case, the NaN gradient is caused by the Inf argument to truncated. (See, e.g., this issue on Distributions.jl.) Here, the upper bound of Inf is not needed, so it can be removed:\n\n@model function t0001_good()\n a ~ Normal()\n x ~ truncated(Normal(a); lower=0)\nend\n\nmodel = t0001_good()\nadtype = AutoForwardDiff()\nrun_ad(model, adtype; test=false, benchmark=false).grad_actual\n\n\n[ Info: Running AD on t0001_good with ADTypes.AutoForwardDiff()\n params : [1.7972278889990705, 1.1034333085857766]\n actual : (-3.0535117243908374, [-0.6622784511026973, -2.6694582270335094])\n\n\n\n\n2-element Vector{Float64}:\n -0.6622784511026973\n -2.6694582270335094\n\n\nMore generally, you could try using a different AD backend; if you don’t know why a model is returning NaN gradients, feel free to open an issue.\n\n\n-Inf log density\nAnother cause of this error is having models with very extreme parameters. This example is taken from this Turing.jl issue:\n\n@model function t0001_bad2()\n x ~ Exponential(100)\n y ~ Uniform(0, x)\nend\n\nt0001_bad2 (generic function with 2 methods)\n\n\nThe problem here is that HMC attempts to find initial values for x inside the region of [-2, 2], after x has been transformed to unconstrained space. For a distribution of Exponential(100), the appropriate transformation is log(x) (see the variable transformation docs for more info). Thus, HMC attempts to find initial values of log(x) in the region of [-2, 2], which corresponds to x in the region of [exp(-2), exp(2)] = [0.135, 7.39]. However, all of these values of x will give rise to a zero probability density for y because the value of y = 50.0 is outside the support of Uniform(0, x). Thus, the log density of the model is -Inf, as can be seen with logjoint:\n\nmodel = t0001_bad2() | (y = 50.0,)\nlogjoint(model, (x = exp(-2),))\n\n-Inf\n\n\n\nlogjoint(model, (x = exp(2),))\n\n-Inf\n\n\nThe most direct way of fixing this is to manually provide a set of initial parameters that are valid. For example, you can obtain a set of initial parameters with rand(Dict, model), and then pass this as the initial_params keyword argument to sample. Otherwise, though, you may want to consider reparameterising the model to avoid such issues.",
+
"text": "T0001\n\nfailed to find valid initial parameters in {N} tries. This may indicate an error with the model or AD backend…\n\nThis error is seen when a Hamiltonian Monte Carlo sampler is unable to determine a valid set of initial parameters for the sampling. Here, ‘valid’ means that the log probability density of the model, as well as its gradient with respect to each parameter, is finite and not NaN.\n\nNaN gradient\nOne of the most common causes of this error is having a NaN gradient. To find out whether this is happening, you can evaluate the gradient manually. Here is an example with a model that is known to be problematic:\n\nusing Turing\nusing DynamicPPL.TestUtils.AD: run_ad\n\n@model function t0001_bad()\n a ~ Normal()\n x ~ truncated(Normal(a), 0, Inf)\nend\n\nmodel = t0001_bad()\nadtype = AutoForwardDiff()\nresult = run_ad(model, adtype; test=false, benchmark=false)\nresult.grad_actual\n\n\n[ Info: Running AD on t0001_bad with ADTypes.AutoForwardDiff()\n params : [2.618395892494956, -0.0007016468404532925]\n actual : (-6.57288826373337, [NaN, NaN])\n\n\n\n\n2-element Vector{Float64}:\n NaN\n NaN\n\n\n(See the DynamicPPL docs for more details on the run_ad function and its return type.)\nIn this case, the NaN gradient is caused by the Inf argument to truncated. (See, e.g., this issue on Distributions.jl.) Here, the upper bound of Inf is not needed, so it can be removed:\n\n@model function t0001_good()\n a ~ Normal()\n x ~ truncated(Normal(a); lower=0)\nend\n\nmodel = t0001_good()\nadtype = AutoForwardDiff()\nrun_ad(model, adtype; test=false, benchmark=false).grad_actual\n\n\n[ Info: Running AD on t0001_good with ADTypes.AutoForwardDiff()\n params : [-0.06679197373919739, 0.13819543914962595]\n actual : (-1.6921446454003433, [0.4408976311984438, -0.3950536225228891])\n\n\n\n\n2-element Vector{Float64}:\n 0.4408976311984438\n -0.3950536225228891\n\n\nMore generally, you could try using a different AD backend; if you don’t know why a model is returning NaN gradients, feel free to open an issue.\n\n\n-Inf log density\nAnother cause of this error is having models with very extreme parameters. This example is taken from this Turing.jl issue:\n\n@model function t0001_bad2()\n x ~ Exponential(100)\n y ~ Uniform(0, x)\nend\n\nt0001_bad2 (generic function with 2 methods)\n\n\nThe problem here is that HMC attempts to find initial values for parameters inside the region of [-2, 2], after the parameters have been transformed to unconstrained space. For a distribution of Exponential(100), the appropriate transformation is log(x) (see the variable transformation docs for more info).\nThus, HMC attempts to find initial values of log(x) in the region of [-2, 2], which corresponds to x in the region of [exp(-2), exp(2)] = [0.135, 7.39]. However, all of these values of x will give rise to a zero probability density for y because the value of y = 50.0 is outside the support of Uniform(0, x). Thus, the log density of the model is -Inf, as can be seen with logjoint:\n\nmodel = t0001_bad2() | (y = 50.0,)\nlogjoint(model, (x = exp(-2),))\n\n-Inf\n\n\n\nlogjoint(model, (x = exp(2),))\n\n-Inf\n\n\nThe most direct way of fixing this is to manually provide a set of initial parameters that are valid. For example, you can obtain a set of initial parameters with rand(Dict, model), and then pass this as the initial_params keyword argument to sample. Otherwise, though, you may want to consider reparameterising the model to avoid such issues.",
"crumbs": [
"Get Started",
"User Guide",

pr-previews/603/sitemap.xml

Lines changed: 40 additions & 40 deletions
@@ -2,162 +2,162 @@
 <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
 <url>
 <loc>https://turinglang.org/docs/getting-started/index.html</loc>
-<lastmod>2025-05-21T14:33:22.967Z</lastmod>
+<lastmod>2025-05-21T14:46:17.066Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/developers/compiler/minituring-contexts/index.html</loc>
-<lastmod>2025-05-21T14:33:22.964Z</lastmod>
+<lastmod>2025-05-21T14:46:17.062Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/developers/compiler/design-overview/index.html</loc>
-<lastmod>2025-05-21T14:33:22.963Z</lastmod>
+<lastmod>2025-05-21T14:46:17.062Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/developers/inference/variational-inference/index.html</loc>
-<lastmod>2025-05-21T14:33:22.964Z</lastmod>
+<lastmod>2025-05-21T14:46:17.063Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/developers/inference/abstractmcmc-turing/index.html</loc>
-<lastmod>2025-05-21T14:33:22.964Z</lastmod>
+<lastmod>2025-05-21T14:46:17.063Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/developers/transforms/dynamicppl/index.html</loc>
-<lastmod>2025-05-21T14:33:22.967Z</lastmod>
+<lastmod>2025-05-21T14:46:17.065Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/developers/transforms/bijectors/index.html</loc>
-<lastmod>2025-05-21T14:33:22.965Z</lastmod>
+<lastmod>2025-05-21T14:46:17.063Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/tutorials/variational-inference/index.html</loc>
-<lastmod>2025-05-21T14:33:22.969Z</lastmod>
+<lastmod>2025-05-21T14:46:17.067Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/tutorials/hidden-markov-models/index.html</loc>
-<lastmod>2025-05-21T14:33:22.968Z</lastmod>
+<lastmod>2025-05-21T14:46:17.067Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/tutorials/gaussian-process-latent-variable-models/index.html</loc>
-<lastmod>2025-05-21T14:33:22.968Z</lastmod>
+<lastmod>2025-05-21T14:46:17.067Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/tutorials/bayesian-time-series-analysis/index.html</loc>
-<lastmod>2025-05-21T14:33:22.968Z</lastmod>
+<lastmod>2025-05-21T14:46:17.066Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/tutorials/gaussian-mixture-models/index.html</loc>
-<lastmod>2025-05-21T14:33:22.968Z</lastmod>
+<lastmod>2025-05-21T14:46:17.067Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/tutorials/gaussian-processes-introduction/index.html</loc>
-<lastmod>2025-05-21T14:33:22.968Z</lastmod>
+<lastmod>2025-05-21T14:46:17.067Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/tutorials/coin-flipping/index.html</loc>
-<lastmod>2025-05-21T14:33:22.968Z</lastmod>
+<lastmod>2025-05-21T14:46:17.066Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/tutorials/bayesian-differential-equations/index.html</loc>
-<lastmod>2025-05-21T14:33:22.967Z</lastmod>
+<lastmod>2025-05-21T14:46:17.066Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/usage/modifying-logprob/index.html</loc>
-<lastmod>2025-05-21T14:33:22.969Z</lastmod>
+<lastmod>2025-05-21T14:46:17.068Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/usage/automatic-differentiation/index.html</loc>
-<lastmod>2025-05-21T14:33:22.969Z</lastmod>
+<lastmod>2025-05-21T14:46:17.068Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/usage/sampler-visualisation/index.html</loc>
-<lastmod>2025-05-21T14:33:22.970Z</lastmod>
+<lastmod>2025-05-21T14:46:17.068Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/usage/dynamichmc/index.html</loc>
-<lastmod>2025-05-21T14:33:22.969Z</lastmod>
+<lastmod>2025-05-21T14:46:17.068Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/usage/tracking-extra-quantities/index.html</loc>
-<lastmod>2025-05-21T14:33:22.970Z</lastmod>
+<lastmod>2025-05-21T14:46:17.068Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/usage/custom-distribution/index.html</loc>
-<lastmod>2025-05-21T14:33:22.969Z</lastmod>
+<lastmod>2025-05-21T14:46:17.068Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/usage/external-samplers/index.html</loc>
-<lastmod>2025-05-21T14:33:22.969Z</lastmod>
+<lastmod>2025-05-21T14:46:17.068Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/usage/performance-tips/index.html</loc>
-<lastmod>2025-05-21T14:33:22.969Z</lastmod>
+<lastmod>2025-05-21T14:46:17.068Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/usage/mode-estimation/index.html</loc>
-<lastmod>2025-05-21T14:33:22.969Z</lastmod>
+<lastmod>2025-05-21T14:46:17.068Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/usage/probability-interface/index.html</loc>
-<lastmod>2025-05-21T14:33:22.970Z</lastmod>
+<lastmod>2025-05-21T14:46:17.068Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/usage/troubleshooting/index.html</loc>
-<lastmod>2025-05-21T14:33:22.970Z</lastmod>
+<lastmod>2025-05-21T14:46:17.068Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/tutorials/probabilistic-pca/index.html</loc>
-<lastmod>2025-05-21T14:33:22.969Z</lastmod>
+<lastmod>2025-05-21T14:46:17.067Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/tutorials/bayesian-poisson-regression/index.html</loc>
-<lastmod>2025-05-21T14:33:22.968Z</lastmod>
+<lastmod>2025-05-21T14:46:17.066Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/tutorials/bayesian-logistic-regression/index.html</loc>
-<lastmod>2025-05-21T14:33:22.967Z</lastmod>
+<lastmod>2025-05-21T14:46:17.066Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/tutorials/bayesian-neural-networks/index.html</loc>
-<lastmod>2025-05-21T14:33:22.968Z</lastmod>
+<lastmod>2025-05-21T14:46:17.066Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/tutorials/infinite-mixture-models/index.html</loc>
-<lastmod>2025-05-21T14:33:22.968Z</lastmod>
+<lastmod>2025-05-21T14:46:17.067Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/tutorials/bayesian-linear-regression/index.html</loc>
-<lastmod>2025-05-21T14:33:22.967Z</lastmod>
+<lastmod>2025-05-21T14:46:17.066Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/tutorials/multinomial-logistic-regression/index.html</loc>
-<lastmod>2025-05-21T14:33:22.969Z</lastmod>
+<lastmod>2025-05-21T14:46:17.067Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/core-functionality/index.html</loc>
-<lastmod>2025-05-21T14:33:22.963Z</lastmod>
+<lastmod>2025-05-21T14:46:17.062Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/developers/transforms/distributions/index.html</loc>
-<lastmod>2025-05-21T14:33:22.965Z</lastmod>
+<lastmod>2025-05-21T14:46:17.063Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/developers/inference/abstractmcmc-interface/index.html</loc>
-<lastmod>2025-05-21T14:33:22.964Z</lastmod>
+<lastmod>2025-05-21T14:46:17.063Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/developers/inference/implementing-samplers/index.html</loc>
-<lastmod>2025-05-21T14:33:22.964Z</lastmod>
+<lastmod>2025-05-21T14:46:17.063Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/developers/contributing/index.html</loc>
-<lastmod>2025-05-21T14:33:22.964Z</lastmod>
+<lastmod>2025-05-21T14:46:17.062Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/developers/compiler/model-manual/index.html</loc>
-<lastmod>2025-05-21T14:33:22.964Z</lastmod>
+<lastmod>2025-05-21T14:46:17.062Z</lastmod>
 </url>
 <url>
 <loc>https://turinglang.org/docs/developers/compiler/minituring-compiler/index.html</loc>
-<lastmod>2025-05-21T14:33:22.964Z</lastmod>
+<lastmod>2025-05-21T14:46:17.062Z</lastmod>
 </url>
 </urlset>

pr-previews/603/usage/troubleshooting/index.html

Lines changed: 8 additions & 7 deletions
@@ -744,8 +744,8 @@ <h3 class="anchored" data-anchor-id="nan-gradient"><code>NaN</code> gradient</h3
 <div class="cell-output cell-output-stdout">
 <div class="ansi-escaped-output">
 <pre><span class="ansi-cyan-fg ansi-bold">[ </span><span class="ansi-cyan-fg ansi-bold">Info: </span>Running AD on t0001_bad with ADTypes.AutoForwardDiff()
-  params : [2.347078466381167, 1.4302538119954007]
-  actual : (-4.8318669216945285, [NaN, NaN])
+  params : [2.618395892494956, -0.0007016468404532925]
+  actual : (-6.57288826373337, [NaN, NaN])
 </pre>
 </div>
 </div>
@@ -769,15 +769,15 @@ <h3 class="anchored" data-anchor-id="nan-gradient"><code>NaN</code> gradient</h3
 <div class="cell-output cell-output-stdout">
 <div class="ansi-escaped-output">
 <pre><span class="ansi-cyan-fg ansi-bold">[ </span><span class="ansi-cyan-fg ansi-bold">Info: </span>Running AD on t0001_good with ADTypes.AutoForwardDiff()
-  params : [1.7972278889990705, 1.1034333085857766]
-  actual : (-3.0535117243908374, [-0.6622784511026973, -2.6694582270335094])
+  params : [-0.06679197373919739, 0.13819543914962595]
+  actual : (-1.6921446454003433, [0.4408976311984438, -0.3950536225228891])
 </pre>
 </div>
 </div>
 <div class="cell-output cell-output-display" data-execution_count="1">
 <pre><code>2-element Vector{Float64}:
- -0.6622784511026973
- -2.6694582270335094</code></pre>
+ 0.4408976311984438
+ -0.3950536225228891</code></pre>
 </div>
 </div>
 <p>More generally, you could try using a different AD backend; if you don’t know why a model is returning <code>NaN</code> gradients, feel free to open an issue.</p>
@@ -794,7 +794,8 @@ <h3 class="anchored" data-anchor-id="inf-log-density"><code>-Inf</code> log dens
 <pre><code>t0001_bad2 (generic function with 2 methods)</code></pre>
 </div>
 </div>
-<p>The problem here is that HMC attempts to find initial values for <code>x</code> inside the region of <code>[-2, 2]</code>, after <code>x</code> has been transformed to unconstrained space. For a distribution of <code>Exponential(100)</code>, the appropriate transformation is <code>log(x)</code> (see the <a href="../../developers/transforms/distributions">variable transformation docs</a> for more info). Thus, HMC attempts to find initial values of <code>log(x)</code> in the region of <code>[-2, 2]</code>, which corresponds to <code>x</code> in the region of <code>[exp(-2), exp(2)]</code> = <code>[0.135, 7.39]</code>. However, all of these values of <code>x</code> will give rise to a zero probability density for <code>y</code> because the value of <code>y = 50.0</code> is outside the support of <code>Uniform(0, x)</code>. Thus, the log density of the model is <code>-Inf</code>, as can be seen with <code>logjoint</code>:</p>
+<p>The problem here is that HMC attempts to find initial values for parameters inside the region of <code>[-2, 2]</code>, <em>after</em> the parameters have been transformed to unconstrained space. For a distribution of <code>Exponential(100)</code>, the appropriate transformation is <code>log(x)</code> (see the <a href="../../developers/transforms/distributions">variable transformation docs</a> for more info).</p>
+<p>Thus, HMC attempts to find initial values of <code>log(x)</code> in the region of <code>[-2, 2]</code>, which corresponds to <code>x</code> in the region of <code>[exp(-2), exp(2)]</code> = <code>[0.135, 7.39]</code>. However, all of these values of <code>x</code> will give rise to a zero probability density for <code>y</code> because the value of <code>y = 50.0</code> is outside the support of <code>Uniform(0, x)</code>. Thus, the log density of the model is <code>-Inf</code>, as can be seen with <code>logjoint</code>:</p>
 <div id="12" class="cell" data-execution_count="1">
 <div class="sourceCode cell-code" id="cb8"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb8-1"><a href="#cb8-1" aria-hidden="true" tabindex="-1"></a>model <span class="op">=</span> <span class="fu">t0001_bad2</span>() <span class="op">|</span> (y <span class="op">=</span> <span class="fl">50.0</span>,)</span>
 <span id="cb8-2"><a href="#cb8-2" aria-hidden="true" tabindex="-1"></a><span class="fu">logjoint</span>(model, (x <span class="op">=</span> <span class="fu">exp</span>(<span class="op">-</span><span class="fl">2</span>),))</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
