diff --git a/HISTORY.md b/HISTORY.md
index 381b264a6e..35310bfa1b 100644
--- a/HISTORY.md
+++ b/HISTORY.md
@@ -18,6 +18,75 @@ As long as the above functions are defined correctly, Turing will be able to use
 
 The `Turing.Inference.isgibbscomponent(::MySampler)` interface function still exists, but in this version the default has been changed to `true`, so you should not need to overload this.
 
+## **AdvancedVI 0.6**
+
+Turing.jl v0.42 updates `AdvancedVI.jl` compatibility to 0.6 (we skipped the breaking 0.5 update as it did not introduce new features).
+`AdvancedVI.jl@0.6` introduces major structural changes, including breaking changes to the interface and multiple new features.
+The summary below covers the changes that affect end users of Turing.
+For a more comprehensive list of changes, please refer to the [changelog](https://github.com/TuringLang/AdvancedVI.jl/blob/main/HISTORY.md) of `AdvancedVI`.
+
+### Breaking Changes
+
+A new level of interface for defining variational algorithms was introduced in `AdvancedVI` v0.5. As a result, the function `Turing.vi` now receives a keyword argument `algorithm`. The object `algorithm <: AdvancedVI.AbstractVariationalAlgorithm` now contains all the algorithm-specific configuration. Therefore, keyword arguments of `vi` that were algorithm-specific, such as `objective`, `operator`, and `averager`, have been moved to fields of the relevant `<: AdvancedVI.AbstractVariationalAlgorithm` structs.
+For example,
+
+```julia
+vi(model, q, n_iters; objective=RepGradELBO(10), operator=AdvancedVI.ClipScale())
+```
+
+is now
+
+```julia
+vi(
+    model,
+    q,
+    n_iters;
+    algorithm=KLMinRepGradDescent(adtype; n_samples=10, operator=AdvancedVI.ClipScale()),
+)
+```
+
+Similarly,
+
+```julia
+vi(
+    model,
+    q,
+    n_iters;
+    objective=RepGradELBO(10; entropy=AdvancedVI.ClosedFormEntropyZeroGradient()),
+    operator=AdvancedVI.ProximalLocationScaleEntropy(),
+)
+```
+
+is now
+
+```julia
+vi(model, q, n_iters; algorithm=KLMinRepGradProxDescent(adtype; n_samples=10))
+```
+
+Additionally,
+
+  - The default hyperparameters of `DoG` and `DoWG` have been altered.
+  - The deprecated `AdvancedVI@0.2`-era interface has been removed.
+  - `estimate_objective` now returns the value to be minimized by the optimization algorithm. For example, for ELBO maximization algorithms, `estimate_objective` returns the *negative ELBO*. This is a breaking change from the previous behavior, where the ELBO itself was returned.
+  - The initial values for `q_meanfield_gaussian`, `q_fullrank_gaussian`, and `q_locationscale` have changed. Specifically, the default initial value for the scale matrix has been changed from `I` to `0.6*I`.
+  - When using algorithms that expect to operate in unconstrained spaces, the user is now explicitly expected to provide a `Bijectors.TransformedDistribution` wrapping an unconstrained distribution. (Refer to the docstring of `vi`; see also the sketch below this list.)
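+
+Regarding the last point, the initializers shipped with Turing (`q_meanfield_gaussian`, `q_fullrank_gaussian`, and `q_locationscale`) already return the expected wrapper type, so most users do not need to construct it by hand. A minimal sketch of what such an object looks like (the model below is purely illustrative):
+
+```julia
+using Turing, Bijectors
+
+@model function illustrative_model(x)
+    σ ~ truncated(Normal(0, 1); lower=0)   # constrained parameter (σ > 0)
+    x ~ Normal(0, σ)
+end
+model = illustrative_model(1.0)
+
+q0 = q_meanfield_gaussian(model)           # ready to pass to `vi`
+q0 isa Bijectors.TransformedDistribution   # true
+q0.dist                                    # the unconstrained location-scale part
+q0.transform                               # maps unconstrained draws back to the model's support
+```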
+
+### New Features
+
+`AdvancedVI@0.6` adds numerous new features, including the following new VI algorithms:
+
+  - `KLMinWassFwdBwd`: Also known as "Wasserstein variational inference," this algorithm minimizes the KL divergence under the Wasserstein-2 metric.
+  - `KLMinNaturalGradDescent`: This algorithm, also known as "online variational Newton," is the canonical "black-box" natural gradient variational inference algorithm, which minimizes the KL divergence via mirror descent with the KL divergence as the Bregman divergence.
+  - `KLMinSqrtNaturalGradDescent`: A recent variant of `KLMinNaturalGradDescent` that operates in the Cholesky-factor parameterization of Gaussians instead of precision matrices.
+  - `FisherMinBatchMatch`: This algorithm, called "batch-and-match," minimizes a variant of the second-order Fisher divergence via a proximal point-type algorithm.
+
+Any of the new algorithms above can readily be used by simply swapping the `algorithm` keyword argument of `vi`.
+For example, to use batch-and-match:
+
+```julia
+vi(model, q, n_iters; algorithm=FisherMinBatchMatch())
+```
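+
+As an end-to-end sketch of the upgraded workflow (the model, data, and hyperparameters here are purely illustrative, and the default `adtype` is used):
+
+```julia
+using Turing
+
+@model function toy_gaussian(x)
+    m ~ Normal(0, 1)
+    for i in eachindex(x)
+        x[i] ~ Normal(m, 1)
+    end
+end
+model = toy_gaussian(randn(20) .+ 1)
+
+q0 = q_meanfield_gaussian(model)   # initial approximation, already wrapped in a transform
+q, _, _ = vi(
+    model, q0, 1_000; algorithm=KLMinNaturalGradDescent(; stepsize=1e-3, n_samples=10)
+)
+samples = rand(q, 100)             # draw from the fitted approximation
+```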
+
 
 # 0.41.1
 
 The `ModeResult` struct returned by `maximum_a_posteriori` and `maximum_likelihood` can now be wrapped in `InitFromParams()`.
diff --git a/Project.toml b/Project.toml
index 23a8af183d..30335f6a2b 100644
--- a/Project.toml
+++ b/Project.toml
@@ -55,7 +55,7 @@ Accessors = "0.1"
 AdvancedHMC = "0.8.3"
 AdvancedMH = "0.8.9"
 AdvancedPS = "0.7"
-AdvancedVI = "0.4"
+AdvancedVI = "0.6"
 BangBang = "0.4.2"
 Bijectors = "0.14, 0.15"
 Compat = "4.15.0"
diff --git a/docs/src/api.md b/docs/src/api.md
index 885d587ea6..2eda3be6f4 100644
--- a/docs/src/api.md
+++ b/docs/src/api.md
@@ -114,6 +114,12 @@ See the [docs of AdvancedVI.jl](https://turinglang.org/AdvancedVI.jl/stable/) fo
 | `q_locationscale` | [`Turing.Variational.q_locationscale`](@ref) | Find a numerically non-degenerate initialization for a location-scale variational family |
 | `q_meanfield_gaussian` | [`Turing.Variational.q_meanfield_gaussian`](@ref) | Find a numerically non-degenerate initialization for a mean-field Gaussian family |
 | `q_fullrank_gaussian` | [`Turing.Variational.q_fullrank_gaussian`](@ref) | Find a numerically non-degenerate initialization for a full-rank Gaussian family |
+| `KLMinRepGradDescent` | [`Turing.Variational.KLMinRepGradDescent`](@ref) | KL divergence minimization via stochastic gradient descent with the reparameterization gradient |
+| `KLMinRepGradProxDescent` | [`Turing.Variational.KLMinRepGradProxDescent`](@ref) | KL divergence minimization via stochastic proximal gradient descent with the reparameterization gradient over location-scale variational families |
+| `KLMinScoreGradDescent` | [`Turing.Variational.KLMinScoreGradDescent`](@ref) | KL divergence minimization via stochastic gradient descent with the score gradient |
+| `KLMinWassFwdBwd` | [`Turing.Variational.KLMinWassFwdBwd`](@ref) | KL divergence minimization via Wasserstein proximal gradient descent |
+| `KLMinNaturalGradDescent` | [`Turing.Variational.KLMinNaturalGradDescent`](@ref) | KL divergence minimization via natural gradient descent |
+| `KLMinSqrtNaturalGradDescent` | [`Turing.Variational.KLMinSqrtNaturalGradDescent`](@ref) | KL divergence minimization via natural gradient descent in the square-root parameterization |
 
 ### Automatic differentiation types
 
diff --git a/src/Turing.jl b/src/Turing.jl
index 58a58eb2af..0528788fed 100644
--- a/src/Turing.jl
+++ b/src/Turing.jl
@@ -116,10 +116,15 @@ export externalsampler,
 
     # Variational inference - AdvancedVI
     vi,
-    ADVI,
    q_locationscale,
    q_meanfield_gaussian,
    q_fullrank_gaussian,
+    KLMinRepGradProxDescent,
+    KLMinRepGradDescent,
+    KLMinScoreGradDescent,
+    KLMinNaturalGradDescent,
+    KLMinSqrtNaturalGradDescent,
+    KLMinWassFwdBwd,
     # ADTypes
     AutoForwardDiff,
     AutoReverseDiff,
diff --git a/src/variational/VariationalInference.jl b/src/variational/VariationalInference.jl
index d516319684..5dde445c66 100644
--- a/src/variational/VariationalInference.jl
+++ b/src/variational/VariationalInference.jl
@@ -1,21 +1,41 @@
 module Variational
 
-using DynamicPPL
+using AdvancedVI:
+    AdvancedVI,
+    KLMinRepGradDescent,
+    KLMinRepGradProxDescent,
+    KLMinScoreGradDescent,
+    KLMinWassFwdBwd,
+    KLMinNaturalGradDescent,
+    KLMinSqrtNaturalGradDescent
+
 using ADTypes
+using Bijectors: Bijectors
 using Distributions
+using DynamicPPL: DynamicPPL
 using LinearAlgebra
-using LogDensityProblems
+using LogDensityProblems: LogDensityProblems
 using Random
-
-import ..Turing: DEFAULT_ADTYPE, PROGRESS
-
-import AdvancedVI
-import Bijectors
-
-export vi, q_locationscale, q_meanfield_gaussian, q_fullrank_gaussian
-
-include("deprecated.jl")
+using ..Turing: DEFAULT_ADTYPE, PROGRESS
+
+export vi,
+    q_locationscale,
+    q_meanfield_gaussian,
+    q_fullrank_gaussian,
+    KLMinRepGradProxDescent,
+    KLMinRepGradDescent,
+    KLMinScoreGradDescent,
+    KLMinWassFwdBwd,
+    KLMinNaturalGradDescent,
+    KLMinSqrtNaturalGradDescent
+
+requires_unconstrained_space(::AdvancedVI.AbstractVariationalAlgorithm) = false
+requires_unconstrained_space(::AdvancedVI.KLMinRepGradProxDescent) = true
+requires_unconstrained_space(::AdvancedVI.KLMinRepGradDescent) = true
+requires_unconstrained_space(::AdvancedVI.KLMinWassFwdBwd) = true
+requires_unconstrained_space(::AdvancedVI.KLMinNaturalGradDescent) = true
+requires_unconstrained_space(::AdvancedVI.KLMinSqrtNaturalGradDescent) = true
 
 """
     q_initialize_scale(
@@ -62,7 +82,7 @@ function q_initialize_scale(
     num_max_trials::Int=10,
     reduce_factor::Real=one(eltype(scale)) / 2,
 )
-    prob = LogDensityFunction(model)
+    prob = DynamicPPL.LogDensityFunction(model)
     ℓπ = Base.Fix1(LogDensityProblems.logdensity, prob)
     varinfo = DynamicPPL.VarInfo(model)
 
@@ -97,7 +117,9 @@ end
 
 Find a numerically non-degenerate variational distribution `q` for approximating the target `model` within the location-scale variational family formed by the type of `scale` and `basedist`.
 The distribution can be manually specified by setting `location`, `scale`, and `basedist`.
-Otherwise, it chooses a standard Gaussian by default.
+Otherwise, it chooses a Gaussian with zero mean and scale `0.6*I` (covariance of `0.6^2*I`) by default.
+This guarantees that samples from the initial variational approximation will fall in the range (-2, 2) with 99.9% probability, which mimics the behavior of the `Turing.InitFromUniform()` strategy.
+
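+For reference, the 99.9% figure is the probability mass of a `Normal(0, 0.6)` distribution inside (-2, 2), which can be checked directly:
+
+```julia
+using Distributions
+cdf(Normal(0, 0.6), 2) - cdf(Normal(0, 0.6), -2)   # ≈ 0.9991
+```
+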
 Whether the default choice is used or not, the `scale` may be adjusted via `q_initialize_scale` so that the log-densities of `model` are finite over the samples from `q`.
 If `meanfield` is set as `true`, the scale of `q` is restricted to be a diagonal matrix and only the diagonal of `scale` is used.
@@ -145,9 +167,11 @@ function q_locationscale(
 
     L = if isnothing(scale)
         if meanfield
-            q_initialize_scale(rng, model, μ, Diagonal(ones(num_params)), basedist; kwargs...)
+            q_initialize_scale(
+                rng, model, μ, Diagonal(fill(0.6, num_params)), basedist; kwargs...
+            )
         else
-            L0 = LowerTriangular(Matrix{Float64}(I, num_params, num_params))
+            L0 = LowerTriangular(Matrix{Float64}(0.6 * I, num_params, num_params))
             q_initialize_scale(rng, model, μ, L0, basedist; kwargs...)
         end
     else
@@ -178,6 +202,10 @@ end
 
 Find a numerically non-degenerate mean-field Gaussian `q` for approximating the target `model`.
 
+If `scale` is set to `nothing`, the default is a zero-mean Gaussian with a `Diagonal` scale matrix (the "mean-field" approximation) no larger than `0.6*I` (covariance of `0.6^2*I`).
+This guarantees that samples from the initial variational approximation will fall in the range (-2, 2) with 99.9% probability, which mimics the behavior of the `Turing.InitFromUniform()` strategy.
+Whether the default choice is used or not, the `scale` may be adjusted via `q_initialize_scale` so that the log-densities of `model` are finite over the samples from `q`.
+
 # Arguments
 
 - `model`: The target `DynamicPPL.Model`.
@@ -217,6 +245,10 @@ end
 
 Find a numerically non-degenerate Gaussian `q` with a scale with full-rank factors (traditionally referred to as a "full-rank family") for approximating the target `model`.
 
+If `scale` is set to `nothing`, the default is a zero-mean Gaussian with a `LowerTriangular` scale matrix (resulting in a covariance with "full-rank" factors) no larger than `0.6*I` (covariance of `0.6^2*I`).
+This guarantees that samples from the initial variational approximation will fall in the range (-2, 2) with 99.9% probability, which mimics the behavior of the `Turing.InitFromUniform()` strategy.
+Whether the default choice is used or not, the `scale` may be adjusted via `q_initialize_scale` so that the log-densities of `model` are finite over the samples from `q`.
+
 # Arguments
 
 - `model`: The target `DynamicPPL.Model`.
@@ -248,76 +280,82 @@ end
 
 """
     vi(
         [rng::Random.AbstractRNG,]
-        model::DynamicPPL.Model;
+        model::DynamicPPL.Model,
         q,
-        n_iterations::Int;
-        objective::AdvancedVI.AbstractVariationalObjective = AdvancedVI.RepGradELBO(
-            10; entropy = AdvancedVI.ClosedFormEntropyZeroGradient()
+        max_iter::Int;
+        adtype::ADTypes.AbstractADType=DEFAULT_ADTYPE,
+        algorithm::AdvancedVI.AbstractVariationalAlgorithm = KLMinRepGradProxDescent(
+            adtype; n_samples=10
         ),
         show_progress::Bool = Turing.PROGRESS[],
-        optimizer::Optimisers.AbstractRule = AdvancedVI.DoWG(),
-        averager::AdvancedVI.AbstractAverager = AdvancedVI.PolynomialAveraging(),
-        operator::AdvancedVI.AbstractOperator = AdvancedVI.ProximalLocationScaleEntropy(),
-        adtype::ADTypes.AbstractADType = Turing.DEFAULT_ADTYPE,
         kwargs...
     )
 
-Approximating the target `model` via variational inference by optimizing `objective` with the initialization `q`.
+Approximate the target `model` via the variational inference algorithm `algorithm`, starting from the initial variational approximation `q`.
 This is a thin wrapper around `AdvancedVI.optimize`.
 
+If the chosen variational inference algorithm operates in an unconstrained space, the provided initial variational approximation `q` must be a `Bijectors.TransformedDistribution` wrapping an unconstrained distribution.
+For example, the initializations supplied by `q_meanfield_gaussian`, `q_fullrank_gaussian`, and `q_locationscale` are of this form.
+
+The default `algorithm`, `KLMinRepGradProxDescent` ([relevant docs](https://turinglang.org/AdvancedVI.jl/dev/klminrepgradproxdescent/)), assumes `q` uses `AdvancedVI.MvLocationScale`, which can be constructed by invoking `q_fullrank_gaussian` or `q_meanfield_gaussian`.
+For other variational families, refer to the documentation of `AdvancedVI` to determine the best algorithm and other options.
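+
+For example, to use stochastic gradient descent with the reparameterization gradient instead of the proximal default (a sketch; `model` and the iteration count are placeholders):
+
+```julia
+q, _, _ = vi(
+    model,
+    q_meanfield_gaussian(model),
+    1000;
+    algorithm=KLMinRepGradDescent(AutoForwardDiff(); n_samples=10, operator=AdvancedVI.ClipScale()),
+)
+```
+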
 # Arguments
 
 - `model`: The target `DynamicPPL.Model`.
 - `q`: The initial variational approximation.
-- `n_iterations`: Number of optimization steps.
+- `max_iter`: Maximum number of optimization steps.
 
 # Keyword Arguments
 
-- `objective`: Variational objective to be optimized.
+- `adtype`: Automatic differentiation backend applied to the log-density. The default value of `algorithm` also uses this backend to differentiate the variational objective.
+- `algorithm`: The variational inference algorithm to use.
 - `show_progress`: Whether to show the progress bar.
-- `optimizer`: Optimization algorithm.
-- `averager`: Parameter averaging strategy.
-- `operator`: Operator applied after each optimization step.
-- `adtype`: Automatic differentiation backend.
 
 See the docs of `AdvancedVI.optimize` for additional keyword arguments.
 
 # Returns
 
-- `q`: Variational distribution formed by the last iterate of the optimization run.
-- `q_avg`: Variational distribution formed by the averaged iterates according to `averager`.
-- `state`: Collection of states used for optimization. This can be used to resume from a past call to `vi`.
-- `info`: Information generated during the optimization run.
+- `q`: Output variational distribution of `algorithm`.
+- `info`: Information generated while executing `algorithm`.
+- `state`: Collection of states used by `algorithm`. This can be used to resume from a past call to `vi`.
 """
 function vi(
     rng::Random.AbstractRNG,
     model::DynamicPPL.Model,
     q,
-    n_iterations::Int;
-    objective=AdvancedVI.RepGradELBO(
-        10; entropy=AdvancedVI.ClosedFormEntropyZeroGradient()
+    max_iter::Int,
+    args...;
+    adtype::ADTypes.AbstractADType=DEFAULT_ADTYPE,
+    algorithm::AdvancedVI.AbstractVariationalAlgorithm=KLMinRepGradProxDescent(
+        adtype; n_samples=10
     ),
+    unconstrained::Bool=requires_unconstrained_space(algorithm),
     show_progress::Bool=PROGRESS[],
-    optimizer=AdvancedVI.DoWG(),
-    averager=AdvancedVI.PolynomialAveraging(),
-    operator=AdvancedVI.ProximalLocationScaleEntropy(),
-    adtype::ADTypes.AbstractADType=DEFAULT_ADTYPE,
     kwargs...,
 )
-    return AdvancedVI.optimize(
-        rng,
-        LogDensityFunction(model),
-        objective,
-        q,
-        n_iterations;
-        show_progress=show_progress,
-        adtype,
-        optimizer,
-        averager,
-        operator,
-        kwargs...,
+    # If the algorithm operates in unconstrained space, strip the transform from `q`
+    # and target the linked (unconstrained) log-joint instead.
+    prob, q, trans = if unconstrained
+        @assert q isa Bijectors.TransformedDistribution "The algorithm $(algorithm) operates in an unconstrained space. Therefore, the initial variational approximation is expected to be a Bijectors.TransformedDistribution of an unconstrained distribution."
+        varinfo = DynamicPPL.ldf_default_varinfo(model, DynamicPPL.getlogjoint_internal)
+        varinfo = DynamicPPL.link!!(varinfo, model)
+        prob = DynamicPPL.LogDensityFunction(
+            model, DynamicPPL.getlogjoint_internal, varinfo; adtype
+        )
+        prob, q.dist, q.transform
+    else
+        prob = DynamicPPL.LogDensityFunction(model; adtype)
+        prob, q, nothing
+    end
+    q, info, state = AdvancedVI.optimize(
+        rng, algorithm, max_iter, prob, q, args...; show_progress=show_progress, kwargs...
     )
+    # Re-attach the transform so the returned `q` is defined on the model's original support.
+    q = if unconstrained
+        Bijectors.TransformedDistribution(q, trans)
+    else
+        q
+    end
+    return q, info, state
 end
 
-function vi(model::DynamicPPL.Model, q, n_iterations::Int; kwargs...)
-    return vi(Random.default_rng(), model, q, n_iterations; kwargs...)
+function vi(model::DynamicPPL.Model, q, max_iter::Int; kwargs...)
+    return vi(Random.default_rng(), model, q, max_iter; kwargs...)
 end
 
 end
diff --git a/src/variational/deprecated.jl b/src/variational/deprecated.jl
deleted file mode 100644
index 9a9f4777b5..0000000000
--- a/src/variational/deprecated.jl
+++ /dev/null
@@ -1,61 +0,0 @@
-
-import DistributionsAD
-export ADVI
-
-Base.@deprecate meanfield(model) q_meanfield_gaussian(model)
-
-struct ADVI{AD}
-    "Number of samples used to estimate the ELBO in each optimization step."
-    samples_per_step::Int
-    "Maximum number of gradient steps."
-    max_iters::Int
-    "AD backend used for automatic differentiation."
-    adtype::AD
-end
-
-function ADVI(
-    samples_per_step::Int=1,
-    max_iters::Int=1000;
-    adtype::ADTypes.AbstractADType=ADTypes.AutoForwardDiff(),
-)
-    Base.depwarn(
-        "The type ADVI will be removed in future releases. Please refer to the new interface for `vi`",
-        :ADVI;
-        force=true,
-    )
-    return ADVI{typeof(adtype)}(samples_per_step, max_iters, adtype)
-end
-
-function vi(model::DynamicPPL.Model, alg::ADVI; kwargs...)
-    Base.depwarn(
-        "This specialization along with the type `ADVI` will be deprecated in future releases. Please refer to the new interface for `vi`.",
-        :vi;
-        force=true,
-    )
-    q = q_meanfield_gaussian(Random.default_rng(), model)
-    objective = AdvancedVI.RepGradELBO(
-        alg.samples_per_step; entropy=AdvancedVI.ClosedFormEntropy()
-    )
-    operator = AdvancedVI.IdentityOperator()
-    _, q_avg, _, _ = vi(model, q, alg.max_iters; objective, operator, kwargs...)
-    return q_avg
-end
-
-function vi(
-    model::DynamicPPL.Model,
-    alg::ADVI,
-    q::Bijectors.TransformedDistribution{<:DistributionsAD.TuringDiagMvNormal};
-    kwargs...,
-)
-    Base.depwarn(
-        "This specialization along with the type `ADVI` will be deprecated in future releases. Please refer to the new interface for `vi`.",
-        :vi;
-        force=true,
-    )
-    objective = AdvancedVI.RepGradELBO(
-        alg.samples_per_step; entropy=AdvancedVI.ClosedFormEntropy()
-    )
-    operator = AdvancedVI.IdentityOperator()
-    _, q_avg, _, _ = vi(model, q, alg.max_iters; objective, operator, kwargs...)
-    return q_avg
-end
diff --git a/test/Project.toml b/test/Project.toml
index 73361d794e..c780ef8482 100644
--- a/test/Project.toml
+++ b/test/Project.toml
@@ -44,7 +44,7 @@ AbstractMCMC = "5.9"
 AbstractPPL = "0.11, 0.12, 0.13"
 AdvancedMH = "0.8.9"
 AdvancedPS = "0.7"
-AdvancedVI = "0.4"
+AdvancedVI = "0.6"
 Aqua = "0.8"
 BangBang = "0.4"
 Bijectors = "0.14, 0.15"
diff --git a/test/variational/advi.jl b/test/variational/vi.jl
similarity index 61%
rename from test/variational/advi.jl
rename to test/variational/vi.jl
index ed8f745df2..1815e5953c 100644
--- a/test/variational/advi.jl
+++ b/test/variational/vi.jl
@@ -10,12 +10,16 @@ using Distributions: Dirichlet, Normal
 using LinearAlgebra
 using MCMCChains: Chains
 using Random
+using ReverseDiff
 using StableRNGs: StableRNG
 using Test: @test, @testset
 using Turing
 using Turing.Variational
 
 @testset "ADVI" begin
+    adtype = AutoReverseDiff()
+    operator = AdvancedVI.ClipScale()
+
     @testset "q initialization" begin
         m = gdemo_default
         d = length(Turing.DynamicPPL.VarInfo(m)[:])
@@ -41,86 +45,62 @@ using Turing.Variational
 
     @testset "default interface" begin
        for q0 in [q_meanfield_gaussian(gdemo_default), q_fullrank_gaussian(gdemo_default)]
-            _, q, _, _ = vi(gdemo_default, q0, 100; show_progress=Turing.PROGRESS[])
+            q, _, _ = vi(gdemo_default, q0, 100; show_progress=Turing.PROGRESS[], adtype)
             c1 = rand(q, 10)
         end
     end
 
-    @testset "custom interface $name" for (name, objective, operator, optimizer) in [
-        (
-            "ADVI with closed-form entropy",
-            AdvancedVI.RepGradELBO(10),
-            AdvancedVI.ProximalLocationScaleEntropy(),
-            AdvancedVI.DoG(),
-        ),
+    @testset "custom algorithm $name" for (name, algorithm) in [
+        ("KLMinRepGradProxDescent", KLMinRepGradProxDescent(adtype; n_samples=10)),
+        ("KLMinRepGradDescent", KLMinRepGradDescent(adtype; operator, n_samples=10)),
+        ("KLMinNaturalGradDescent", KLMinNaturalGradDescent(; stepsize=1e-3, n_samples=10)),
         (
-            "ADVI with proximal entropy",
-            AdvancedVI.RepGradELBO(10; entropy=AdvancedVI.ClosedFormEntropyZeroGradient()),
-            AdvancedVI.ClipScale(),
-            AdvancedVI.DoG(),
-        ),
-        (
-            "ADVI with STL entropy",
-            AdvancedVI.RepGradELBO(10; entropy=AdvancedVI.StickingTheLandingEntropy()),
-            AdvancedVI.ClipScale(),
-            AdvancedVI.DoG(),
+            "KLMinSqrtNaturalGradDescent",
+            KLMinSqrtNaturalGradDescent(; stepsize=1e-3, n_samples=10),
         ),
+        ("KLMinWassFwdBwd", KLMinWassFwdBwd(; stepsize=1e-3, n_samples=10)),
     ]
         T = 1000
-        q, q_avg, _, _ = vi(
+        q, _, _ = vi(
             gdemo_default,
             q_meanfield_gaussian(gdemo_default),
             T;
-            objective,
-            optimizer,
-            operator,
+            algorithm,
+            adtype,
             show_progress=Turing.PROGRESS[],
         )
-
         N = 1000
-        c1 = rand(q_avg, N)
         c2 = rand(q, N)
     end
 
-    @testset "inference $name" for (name, objective, operator, optimizer) in [
+    @testset "inference $name" for (name, algorithm) in [
+        ("KLMinRepGradProxDescent", KLMinRepGradProxDescent(adtype; n_samples=10)),
+        ("KLMinRepGradDescent", KLMinRepGradDescent(adtype; operator, n_samples=10)),
+        ("KLMinNaturalGradDescent", KLMinNaturalGradDescent(; stepsize=1e-3, n_samples=10)),
         (
-            "ADVI with closed-form entropy",
-            AdvancedVI.RepGradELBO(10),
-            AdvancedVI.ProximalLocationScaleEntropy(),
-            AdvancedVI.DoG(),
-        ),
-        (
-            "ADVI with proximal entropy",
-            RepGradELBO(10; entropy=AdvancedVI.ClosedFormEntropyZeroGradient()),
-            AdvancedVI.ClipScale(),
-            AdvancedVI.DoG(),
-        ),
-        (
-            "ADVI with STL entropy",
-            AdvancedVI.RepGradELBO(10; entropy=AdvancedVI.StickingTheLandingEntropy()),
-            AdvancedVI.ClipScale(),
-            AdvancedVI.DoG(),
+            "KLMinSqrtNaturalGradDescent",
+            KLMinSqrtNaturalGradDescent(; stepsize=1e-3, n_samples=10),
        ),
+        ("KLMinWassFwdBwd", KLMinWassFwdBwd(; stepsize=1e-3, n_samples=10)),
     ]
         rng = StableRNG(0x517e1d9bf89bf94f)
 
         T = 1000
-        q, q_avg, _, _ = vi(
+        q, _, _ = vi(
             rng,
             gdemo_default,
             q_meanfield_gaussian(gdemo_default),
             T;
-            optimizer,
+            algorithm,
+            adtype,
             show_progress=Turing.PROGRESS[],
         )
 
         N = 1000
-        for q_out in [q_avg, q]
-            samples = transpose(rand(rng, q_out, N))
-            chn = Chains(reshape(samples, size(samples)..., 1), ["s", "m"])
+        samples = transpose(rand(rng, q, N))
+        chn = Chains(reshape(samples, size(samples)..., 1), ["s", "m"])
 
-            check_gdemo(chn; atol=0.5)
-        end
+        check_gdemo(chn; atol=0.5)
     end
 
     # regression test for:
@@ -143,7 +123,7 @@ using Turing.Variational
         @test all(x0 .≈ x0_inv)
 
         # And regression for https://github.com/TuringLang/Turing.jl/issues/2160.
-        _, q, _, _ = vi(rng, m, q_meanfield_gaussian(m), 1000)
+        q, _, _ = vi(rng, m, q_meanfield_gaussian(m), 1000; adtype)
         x = rand(rng, q, 1000)
         @test mean(eachcol(x)) ≈ [0.5, 0.5] atol = 0.1
     end
@@ -158,7 +138,7 @@ using Turing.Variational
         end
         model = demo_issue2205() | (y=1.0,)
 
-        _, q, _, _ = vi(rng, model, q_meanfield_gaussian(model), 1000)
+        q, _, _ = vi(rng, model, q_meanfield_gaussian(model), 1000; adtype)
         # True mean.
         mean_true = 1 / 2
         var_true = 1 / 2