Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 12 additions & 0 deletions docs/Project.toml
Original file line number Diff line number Diff line change
@@ -1,17 +1,26 @@
[deps]
AlgebraOfGraphics = "cbdf2221-f076-402e-a563-3d30da359d67"
BenchmarkTools = "6e4b80f9-dd63-53aa-95a3-0cdb28fa8baf"
BoxCox = "1248164d-f7a6-4bdb-8e8d-8c4a187b3ce6"
CairoMakie = "13f3f980-e62b-5c42-98c6-ff1f3baf88f0"
DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
DisplayAs = "0b91fe84-8a4c-11e9-3e1d-67c38462b6d6"
Distributed = "8ba89e20-285c-5b6f-9357-94700520ee1b"
Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f"
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
Effects = "8f03c58b-bd97-4933-a826-f71b64d2cca2"
FiniteDiff = "6a86dc24-6348-571c-b903-95158fe2bd41"
ForwardDiff = "f6369f11-7733-5829-9624-2563aa707210"
FreqTables = "da1fdf0e-e0ff-5433-a45f-9bb5ff651cb1"
Gadfly = "c91e804a-d5a3-530f-b6f0-dfbca275c004"
MixedModels = "ff71e718-51f3-5ec2-a782-8ffcbfa3c316"
MixedModelsDatasets = "7e9fb7ac-9f67-43bf-b2c8-96ba0796cbb6"
MixedModelsExtras = "781a26e1-49f4-409a-8f4c-c3159d78c17e"
ProgressMeter = "92933f4c-e287-5a05-a399-4b506db050ca"
Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
RegressionFormulae = "545c379f-4ec2-4339-9aea-38f2fb6a8ba2"
StandardizedPredictors = "5064a6a7-f8c2-40e2-8bdc-797ec6f1ae18"
Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"
StatsAPI = "82ae8749-77ed-4fe6-ae5f-f523153014b0"
StatsBase = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91"
StatsModels = "3eaba693-59b7-5ba5-a881-562e759f1c8d"
Expand All @@ -24,3 +33,6 @@ FreqTables = "0.4"
Gadfly = "1"
StatsAPI = "1.5"
StatsBase = "0.33"

[sources.MixedModels]
path = ".."
5 changes: 5 additions & 0 deletions docs/make.jl
Original file line number Diff line number Diff line change
@@ -1,9 +1,12 @@
using Documenter
using MixedModels

using Distributions
using FiniteDiff
using ForwardDiff
using StatsAPI
using StatsBase
using StatsModels

makedocs(;
sitename="MixedModels",
Expand All @@ -22,7 +25,9 @@ makedocs(;
"rankdeficiency.md",
"mime.md",
"derivatives.md",
"limitations.md",
"formula_syntax.md",
"ecosystem.md",
"api.md",
],
)
Expand Down
260 changes: 260 additions & 0 deletions docs/src/ecosystem.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,260 @@
# Additional Functionality in Other Packages

```@setup Ecosystem
using DisplayAs
```

Several packages extend the functionality of MixedModels.jl, both in ways specific to mixed models and in ways applicable to more general regression models. In the following, we will use the models from the previous sections to showcase this functionality.

```@example Ecosystem
using MixedModels
progress = false
```

```@example Ecosystem
insteval = MixedModels.dataset("insteval")
ie1 = fit(MixedModel,
@formula(y ~ 1 + studage + lectage + service + (1|s) + (1|d) + (1|dept)),
insteval; progress)

```

```@example Ecosystem
ie2 = fit(MixedModel,
@formula(y ~ 1 + studage + lectage + service +
(1 | s) +
(1 + service | d) +
(1 + service | dept)),
insteval; progress)
```

```@example Ecosystem
sleepstudy = MixedModels.dataset("sleepstudy")
ss1 = fit(MixedModel, @formula(reaction ~ 1 + days + (1|subj)), sleepstudy; progress)
```

```@example Ecosystem
ss2 = fit(MixedModel, @formula(reaction ~ 1 + days + (1 + days|subj)), sleepstudy; progress)
```

```@example Ecosystem
using DataFrames
contra = DataFrame(MixedModels.dataset("contra"))
contra[!, :anych] .= contra[!, :livch] .!= "0"
contrasts = Dict(:livch => EffectsCoding(; base="0"),
:urban => HelmertCoding(),
:anych => HelmertCoding())
gm1 = fit(MixedModel,
@formula(use ~ 1 + urban + anych * age + abs2(age) + (1 | dist & urban)),
contra,
Bernoulli();
contrasts,
progress)
```

## MixedModelsExtras.jl

https://palday.github.io/MixedModelsExtras.jl/v2

MixedModelsExtras.jl is a collection of odds-and-ends that may be useful when working with mixed effects models, but which we do not want to include in MixedModels.jl at this time.
Some functions may one day migrate to MixedModels.jl, when we are happy with their performance and interface (e.g. `vif`), but some are intentionally omitted from MixedModels.jl (e.g. `r2`, `adjr2`).

```@example Ecosystem
using MixedModelsExtras
```

```@example Ecosystem
r2(ss2; conditional=true)
```

```@example Ecosystem
r2(ss2; conditional=false)
```

```@example Ecosystem
icc(ie2)
```

```@example Ecosystem
icc(ie2, :dept)
```

```@example Ecosystem
vif(ie1)
```

```@example Ecosystem
DataFrame(; coef=fixefnames(ie1)[2:end], VIF=vif(ie1))
```

```@example Ecosystem
gvif(ie1)
```

```@example Ecosystem
DataFrame(; term=termnames(ie1)[2][2:end], GVIF=gvif(ie1))
```

## RegressionFormulae.jl

https://github.com/kleinschmidt/RegressionFormulae.jl

RegressionFormulae.jl provides a few extensions to the somewhat more restricted variant of the Wilkinson-Roger notation found in Julia. In particular, it adds `/` for nested designs within the fixed effects and `^` for computing interactions only up to a certain order.

```@example Ecosystem
using RegressionFormulae

fit(MixedModel,
@formula(y ~ 1 + service / (studage + lectage) +
(1 | s) +
(1 | d) +
(1 | dept)),
insteval; progress)
```

```@example Ecosystem
fit(MixedModel,
@formula(y ~ 1 + (studage + lectage + service)^2 +
(1 | s) +
(1 | d) +
(1 | dept)),
insteval; progress)
```

## BoxCox.jl

https://palday.github.io/BoxCox.jl/v0.3/

BoxCox.jl implements a the Box-Cox transformation in an efficient way. Via package extensions, it supports specializations for MixedModels.jl and several plotting functions, but does not incur a dependency penalty for this functionality when MixedModels.jl or Makie.jl are not loaded.


```@example Ecosystem
using BoxCox

bc = fit(BoxCoxTransformation, ss2)
```

```@example Ecosystem
using CairoMakie
boxcoxplot(bc; conf_level=0.95)
```

The estimated λ is very close to -1, i.e. the reciprocal of reaction time, which has a natural interpretation as speed. In other words, the Box-Cox transformation suggests that we should consider modelling the sleepstudy data as speed (reaction per unit time) instead of reaction time:

```@example Ecosystem
fit(MixedModel, @formula(1000 / reaction ~ 1 + days + (1 + days|subj)), sleepstudy)
```

(We multiply by 1000 to get the responses per _second_ instead of the responses per _millisecond_.)

!!! tip
BoxCox.jl also works with classical linear models.

## Effects.jl

https://beacon-biosignals.github.io/Effects.jl/v1.2/

Effects.jl provides a convenient method to compute *effects*, i.e. predictions and associated prediction intervals computed at points on a reference grid. For models with a nonlinear link function, Effects.jl will also compute appropriate errors on the response scale based on the difference method.

For MixedModels.jl, the predictions are computed based on the fixed effects only.

The functionality of Effects.jl was inspired by the `effects` and `emmeans` packages in R and the methods within are based on @fox:effect:2003.

```@example Ecosystem
using Effects
```


```@example Ecosystem
design = Dict(:age => -15:1:20,
:anych => [true, false])

eff_logit = effects(design, gm1; eff_col="use", level=0.95)
```

```@example Ecosystem
eff_prob = effects(design, gm1; eff_col="use", level=0.95, invlink=AutoInvLink())
```

Effects are particularly nice for visualizing the model fit and its predictions.

```@example Ecosystem
using AlgebraOfGraphics # like ggplot2, but an algebra instead of a grammar
using CairoMakie

plt1 = data(eff_logit) *
mapping(:age, :use; color=:anych) *
(visual(Lines) + mapping(; lower=:lower, upper=:upper) * visual(LinesFill))
draw(plt1)
```

```@example Ecosystem
plt2 = data(eff_prob) *
mapping(:age, :use; color=:anych => "children") *
(visual(Lines) + mapping(; lower=:lower, upper=:upper) * visual(LinesFill))
draw(plt2)
```

```@example Ecosystem
using Statistics: mean
contra_by_age = transform(contra,
:age => ByRow(x -> round(Int, x)),
:use => ByRow(==("Y"));
renamecols=false)
contra_by_age = combine(groupby(contra_by_age, [:age, :anych]),
:use => mean => :use)
plt3 = plt2 +
data(contra_by_age) *
mapping(:age, :use;
color=:anych => "children") * visual(Scatter)

draw(plt3;
axis=(; title="Estimated contraceptive use by age and children",
limits=(nothing, (0, 1)) # ylim=0,1, xlim=auto
))
```

Effects and estimated marginal (least squares) means are closely related and partially concepts. Effects.jl provides convenience function `emmeans` and `empairs` for computing EM means and pairwise differences of EM means.

```@example Ecosystem
emmeans(gm1)
```

```@example Ecosystem
empairs(gm1; dof=Inf)
```

!!! tip
Effects.jl will work with any package that supports the StatsAPI.jl-based `RegressionModel` interface.

## StandardizedPredictors.jl

https://beacon-biosignals.github.io/StandardizedPredictors.jl/v1/

StandardizedPredictors.jl provides a convenient way to express centering, scaling, and z-standardization as a "contrast" via the pseudo-contrasts `Center`, `Scale`, `ZScore`.
Because these use the usual contrast machinery, they work well with any packages that use that machinery correctly (e.g. Effects.jl). The default behavior is to empirically compute the center and scale, but these can also be explicitly provided, either as a number or as a function (e.g. `median` to use the median for centering.)

```@example Ecosystem
using StandardizedPredictors

contrasts = Dict(:days => Center())
fit(MixedModel,
@formula(reaction ~ 1 + days + (1 + days|subj)), sleepstudy;
contrasts)
```

!!! tip
StandardizedPredictors.jl will work with any package that supports the StatsModels.jl-based `@formula` and contrast machinery.

# RCall.jl and JellyMe4.jl

https://juliainterop.github.io/RCall.jl/stable/

https://github.com/palday/JellyMe4.jl/

RCall.jl provides a convenient interface for interoperability with R from Julia. JellyMe4.jl extends the functionality of RCall so that MixedModels.jl-fitted models and lme4-fitted models can be translated to each other. In practical terms, this means that you can enjoy the speed of Julia for model fitting, but use all the extra packages you love from R's larger ecosystem.

<!--
MixedModelsSerializtion.jl
MixedModelsSim.jl
-->
50 changes: 50 additions & 0 deletions docs/src/limitations.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,50 @@
# Limitations of MixedModels.jl

We expect that MixedModels.jl will generally be best in class for the types of models that it can fit.
We use cutting edge algorithms based on penalized least squares and sparse matrix methods that take advantage of the particular sparsity and structure that arises in the case of the linear mixed effects model with an unconstrained covariance structure.
Glossing over a fair number of technical details, MixedModels.jl uses a different, novel formulation of the underlying numerical problem which tends to be much more efficient computationally and allows us to fit models with multiple crossed, partially crossed or nested grouping variables without any special treatment.

## Very few options for covariance structure

Nonetheless, there is no free lunch and the tradeoff that we make is that it is *much* more difficult to formulate constraints on the covariance structure (whether on the random effects or on the response/residuals) in our formulation. MixedModels.jl currently supports precisely two covariance structures explicitly:

1. unconstrained
2. zero correlation (diagonal covariance structure)

It is also possible to express some models with compound symmetry by clever manipulation of the formula syntax (i.e. `(1+c|g)` for categorical `c` with compound symmetry is the same as `(1|g) + (1|g&c)`).

MixedModels.jl does support constraining the residual variance to known scalar value, which is useful in meta-analysis.

[Metida.jl](https://github.com/PharmCat/Metida.jl) may provide an alternative if this functionality is required (not an endorsement).

## No support for sandwich/robust variance-covariance estimators

[*This may change in the foreseeable future!*](https://github.com/JuliaStats/MixedModels.jl/pull/768)

If this would be a valuable feature, then please [file an issue](https://github.com/JuliaStats/MixedModels.jl/issues/new). Issues are prioritized by the developers' own needs and potential impact for users, so showing a large need for a feature will tend to increase its priority.

[FixedEffectsModels.jl](https://github.com/FixedEffects/FixedEffectModels.jl) may be a viable alternative (not an endorsement). It provides "fast estimation of linear models with IV and high dimensional categorical variables" and provides similar functionality to Stata's `reghdfe` and R's `lfe` and `fixest`.

## No support for generalized linear mixed models with a dispersion parameter

While MixedModels.jl does nominally support any GLM family and link function support by GLM.jl, the results for model families with a dispersion parameter (normal with non-identity link, gamma, inverse Gaussian) are known to be incorrect. The package issues a warning if you attempt to fit such models.

## No support for polytomous responses

Multinomial and ordered responses are not supported. We are unaware of a Julia package offering support for this.

## No support for regularization of the fixed effects

[HighDimMixedModels.jl](https://github.com/solislemuslab/HighDimMixedModels.jl) may provide an alternative if this functionality is required (not an endorsement).

## No support for generalized additive mixed models

Generalized additive models can be expressed a mixed model, so supporting this would require "only" adding a translation layer.

## No support for nonlinear mixed effects models

[Pumas.jl (commercial)](https://pumas.ai/our-products/products-suite/pumas) provides this (not an endorsement).

## No support for Satterthwaite nor Kenward-Roger degree of freedom approximations.

There are some philosophical and practical [issues](https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#ddf) with these approaches, but an implementation can be found in [MixedModelsSmallSample.jl](https://arnostrouwen.github.io/MixedModelsSmallSample.jl/dev/) (not an endorsement).
4 changes: 2 additions & 2 deletions test/prima.jl
Original file line number Diff line number Diff line change
Expand Up @@ -51,10 +51,10 @@ end
dataset(:contra), Binomial(); progress=false)
prmodel = unfit!(deepcopy(model))
fit!(prmodel; optimizer=:bobyqa, backend=:prima, progress=false)
@test isapprox(loglikelihood(model), loglikelihood(prmodel)) atol=0.0001
@test isapprox(loglikelihood(model), loglikelihood(prmodel)) atol=0.001
refit!(prmodel; fast=true, progress=false)
refit!(model; fast=true, progress=false)
@test isapprox(loglikelihood(model), loglikelihood(prmodel)) atol=0.0001
@test isapprox(loglikelihood(model), loglikelihood(prmodel)) atol=0.001

optsum = deepcopy(prmodel.optsum)
optsum.final = [0.2612]
Expand Down
Loading