|Maintainer:||Ben Bolker, Julia Piaskowski, Emi Tanaka, Phillip Alday, Wolfgang Viechtbauer|
|Contributions:||Suggestions and improvements for this task view are very welcome and can be made through issues or pull requests on GitHub or via e-mail to the maintainer address. For further details see the Contributing guide.|
|Citation:||Ben Bolker, Julia Piaskowski, Emi Tanaka, Phillip Alday, Wolfgang Viechtbauer (2022). CRAN Task View: Mixed, Multilevel, and Hierarchical Models in R. Version 2022-10-31. URL https://CRAN.R-project.org/view=MixedModels.|
|Installation:||The packages from this task view can be installed automatically using the ctv package. For example, |
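As a sketch of the installation route described above, the following calls use `install.views()` and `update.views()` from the ctv package. The view name `"MixedModels"` is taken from the citation URL. Installing only the core packages first is one reasonable choice, not a requirement.

```r
# Assumes a working CRAN mirror and network access.
install.packages("ctv")

# Install only the core packages of this task view ...
ctv::install.views("MixedModels", coreOnly = TRUE)

# ... or install every listed package that is missing or out of date.
ctv::update.views("MixedModels")
```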
Contributors: Maintainers plus Michael Agronah, Matthew Fidler, Thierry Onkelinx
Mixed (or mixed-effect) models are a broad class of statistical models used to analyze data where observations can be assigned a priori to discrete groups, and where the parameters describing the differences between groups are treated as random (or latent) variables. They are one category of multilevel, or hierarchical models; longitudinal data are often analyzed in this framework. In econometrics, longitudinal or cross-sectional time series data are often referred to as panel data and are sometimes fitted with mixed models. Mixed models can be fitted in either frequentist or Bayesian frameworks.
This task view only includes models that incorporate continuous (usually although not always Gaussian) latent variables. This excludes packages that handle hidden Markov models, latent Markov models, and finite (discrete) mixture models (some of these are covered by the Cluster task view). Dynamic linear models and other state-space models that do not incorporate a discrete grouping variable are also excluded (some of these are covered by the TimeSeries task view). Bioinformatic applications of mixed models hosted on Bioconductor are excluded as well.
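Linear mixed models (LMMs) make the following assumptions: the expected values of the responses are linear combinations of the fixed predictor variables and the random effects; the conditional distribution of the responses is Gaussian (equivalently, the errors are Gaussian); and the random effects are normally distributed.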
The most commonly used packages and/or functions for frequentist LMMs are:
nlme::lme() provides REML or ML estimation. It allows multiple nested random effects and provides structures for modeling heteroscedastic and/or correlated errors. Parameter uncertainty is reported as Wald estimates.
lme4::lmer() provides REML or ML estimation. It allows multiple nested or crossed random effects, can compute profile confidence intervals, and can conduct parametric bootstrapping.
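To make the comparison concrete, here is a minimal sketch that fits the same random-intercept model with both functions. It assumes the nlme and lme4 packages are installed and uses the `Orthodont` data set shipped with nlme. The model formula is chosen purely for illustration.

```r
# Example data from nlme: dental growth measurements per Subject.
data(Orthodont, package = "nlme")

# Random-intercept LMM fitted by REML with nlme::lme() ...
fit_lme <- nlme::lme(distance ~ age, random = ~ 1 | Subject,
                     data = Orthodont, method = "REML")

# ... and the same model with lme4::lmer().
fit_lmer <- lme4::lmer(distance ~ age + (1 | Subject),
                       data = Orthodont, REML = TRUE)

# Fixed-effect estimates from the two fits should agree closely.
print(nlme::fixef(fit_lme))
print(lme4::fixef(fit_lmer))

# Wald-type intervals from nlme, profile intervals from lme4.
print(nlme::intervals(fit_lme, which = "fixed"))
print(confint(fit_lmer, parm = "beta_", method = "profile", quiet = TRUE))
```

The two packages use different formula conventions: nlme takes a separate `random` argument, while lme4 embeds the random-effect terms in the model formula.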
Most Bayesian R packages use Markov chain Monte Carlo (MCMC) estimation: MCMCglmm, rstanarm, and brms; the latter two packages use the Stan infrastructure. blme, built on lme4, uses maximum a posteriori (MAP) estimation. bamlss provides a flexible set of modular functions for Bayesian regression modeling.
Generalized linear mixed models (GLMMs) can be described as hierarchical extensions of generalized linear models (GLMs), or as extensions of LMMs to different response distributions, typically in the exponential family. The random-effect distributions are typically assumed to be Gaussian on the scale of the linear predictor.
MASS::glmmPQL() fits via penalized quasi-likelihood.
lme4::glmer() uses Laplace approximation and adaptive Gauss-Hermite quadrature; fits negative binomial as well as exponential-family models.
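As an illustration of the two approximations mentioned for `lme4::glmer()`, the sketch below fits a binomial GLMM to the `cbpp` data set shipped with lme4, once with the default Laplace approximation and once with adaptive Gauss-Hermite quadrature. It assumes lme4 is installed. The choice of 9 quadrature points is arbitrary.

```r
# Example data from lme4: disease incidence in cattle herds over four periods.
data(cbpp, package = "lme4")

# Laplace approximation (nAGQ = 1 is the default).
fit_laplace <- lme4::glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
                           data = cbpp, family = binomial)

# Adaptive Gauss-Hermite quadrature with 9 points; lme4 supports this
# only for models with a single scalar random effect, as here.
fit_aghq <- lme4::glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
                        data = cbpp, family = binomial, nAGQ = 9)

# Compare the fixed-effect estimates from the two approximations.
print(cbind(laplace = lme4::fixef(fit_laplace), aghq = lme4::fixef(fit_aghq)))
```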
Most Bayesian mixed model packages use some form of Markov chain Monte Carlo (or other Monte Carlo methods).
The following packages (in addition to bamlss) find maximum a posteriori fits to Bayesian (G)LMMs by optimization:
vglmer estimates GLMMs by variational Bayesian methods.
Nonlinear mixed models incorporate arbitrary nonlinear responses that cannot be accommodated in the framework of GLMMs. Only a few packages can accommodate generalized nonlinear mixed models (i.e., parametric nonlinear mixed models with non-Gaussian responses). However, many packages allow smooth nonparametric components (see “Additive models” below). Otherwise, users may need to implement GNLMMs themselves in a more general hierarchical modeling framework.
nlme::nlme() from nlme and lme4::nlmer() from lme4 fit nonlinear mixed effects models by maximum likelihood.
nlmixr2::nlmixr2() from nlmixr2 fits nonlinear mixed effects models by a first-order conditional estimation (FOCEI) maximum likelihood approximation (a different approximation from that of lme4::nlmer()), and allows generalized likelihoods as well as a selection of built-in link functions.
gnlmm3() from repeated fits GNLMMs by Gauss-Hermite integration.
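For a sense of what a nonlinear mixed model call looks like, the following sketch fits an asymptotic growth curve with `nlme::nlme()`. It assumes nlme is installed and uses the `Loblolly` data set and the self-starting model `SSasymp()` that ship with R. The starting values are rough guesses supplied for illustration.

```r
library(nlme)

# Loblolly: heights of pine trees over time, grouped by Seed.
# SSasymp() is a self-starting asymptotic regression model from stats.
fit_nlme <- nlme(height ~ SSasymp(age, Asym, R0, lrc),
                 data = Loblolly,
                 fixed = Asym + R0 + lrc ~ 1,   # population-level parameters
                 random = Asym ~ 1,             # asymptote varies by tree
                 start = c(Asym = 103, R0 = -8.5, lrc = -3.3))

print(fixef(fit_nlme))
```

Only the asymptote is given a random effect here, which keeps the model small. Letting more parameters vary by group is possible but can make convergence harder.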
Generalized estimating equations (GEEs) are an alternative approach to fitting clustered, longitudinal, or otherwise correlated data. These models produce estimates of the marginal effects (averaged across the group-level variation) rather than conditional effects (conditioned on group-level information).
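A minimal GEE sketch, assuming the geepack package (listed among the core packages of this view) is installed. It uses the `ohio` data set shipped with geepack. The exchangeable working correlation is an illustrative choice.

```r
# Example data from geepack: repeated binary wheeze status per child (id).
data(ohio, package = "geepack")

# Marginal logistic model; rows must be ordered so that observations
# from the same cluster are contiguous, which holds for this data set.
fit_gee <- geepack::geeglm(resp ~ age + smoke, id = id, data = ohio,
                           family = binomial, corstr = "exchangeable")

# Coefficients are population-averaged (marginal) log-odds ratios.
print(coef(summary(fit_gee)))
```

The coefficients describe population-averaged effects, so they are not directly comparable to the subject-specific coefficients from a GLMM fitted to the same data.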
family chosen from one of the *lss options] all allow modeling of the dispersion/scale component.
family = "scat", nlmixr2) allow heavy-tailed response distributions such as Student-t.
yeoJohnson() transformations with maximum likelihood or the EM method “saem”.
corStruct functions), CARBayesST, sphet, spind, spaMM, glmmfields, glmmTMB, inlabru (spatial point processes via log-Gaussian Cox processes), brms, LMMsolver, bamlss; see also the Spatial and SpatioTemporal CRAN task views.
These packages do not directly provide functions to fit mixed models, but instead implement interfaces to general-purpose sampling and optimization toolboxes that can be used to fit mixed models. While models require extra effort to set up, and often require programming in a domain-specific language other than R, these frameworks are more flexible than most of the other packages listed here.
r.squaredGLMM() function), partR2, performance (r2() function). Note that there are many different methods for computing R² values for (G)LMMs: see, e.g., Nakagawa, Johnson and Schielzeth (2017) and Jaeger et al. (2017).
The first and second derivatives of log-likelihood with respect to parameters can be useful for various model evaluation tasks (e.g., computing sensitivities, robust variance-covariance matrices, or delta-method variances).
Many packages include small example data sets (e.g., lme4, nlme). These packages provide previously described data sets often used in evaluating mixed models.
Functions and frameworks for convenient tabular and graphical output of mixed model results:
These functions provide convenient frameworks to fit and interpret mixed models.
These topics are closely related because there are few available analytical methods for computing statistical power for mixed models; power usually needs to be estimated by simulation.
lme4 (for formula arguments); rxode2, mrgsolve, PKPDsim (ODE/pharmacokinetic models)
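The idea of estimating power by simulation can be sketched by hand. The code below is a hypothetical, hand-rolled illustration and not the method of any package listed here. It assumes lme4 is installed, and the helper `simulate_once()`, its parameter values, and the |t| > 1.96 rejection rule are all assumptions made for the example.

```r
set.seed(101)

# Simulate one data set from a random-intercept model, refit it, and
# report whether the slope of x is "significant" by a rough Wald rule.
simulate_once <- function(n_groups = 20, n_per = 10, beta = 0.5,
                          sd_group = 1, sd_resid = 1) {
  d <- data.frame(g = factor(rep(seq_len(n_groups), each = n_per)),
                  x = rnorm(n_groups * n_per))
  b <- rnorm(n_groups, sd = sd_group)            # group-level intercepts
  d$y <- beta * d$x + b[as.integer(d$g)] + rnorm(nrow(d), sd = sd_resid)
  fit <- lme4::lmer(y ~ x + (1 | g), data = d)
  abs(coef(summary(fit))["x", "t value"]) > 1.96
}

# Estimated power: the share of simulated data sets in which the effect is detected.
n_sim <- 50
power_hat <- mean(replicate(n_sim, simulate_once()))
print(power_hat)
```

In practice, many more replicates than 50 are needed for a stable estimate, and the rejection rule should match the test planned for the real analysis.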
|Core:||brms, broom.mixed, geepack, glmmTMB, lavaan, lme4, MCMCglmm, multilevelmod, nlme, sommer.|
|Regular:||afex, aod, aods3, ARpLMEC, asremlPlus, bamlss, BGLR, blavaan, blme, blmeco, BMTME, buildmer, cAIC4, car, CARBayesST, CLME, clubSandwich, clusterPower, coxme, CpGassoc, cplm, CRTgeeDR, dalmatian, DHARMa, dhglm, dotwhisker, effects, emmeans, ez, faux, gamlss, gamm4, gee, geeM, geesmv, glmertree, glmm, GLMMadaptive, glmmEP, glmmfields, glmmLasso, GLMMRR, glmulti, gpboost, greta, hglm, HLMdiag, huxtable, iccbeta, influence.ME, influence.SEM, inlabru, insight, JointAI, kinship2, languageR, lmeInfo, LMERConvenienceFunctions, lmeresampler, lmerTest, lmeSplines, lmmpar, longpower, lqmm, marginaleffects, MarginalMediation, margins, MASS, mbest, mclogit, MCMC.qpcr, mdhglm, mdmb, merDeriv, merTools, mgcv, mice, mixedsde, mixlm, mlmmm, mlmRev, mmrm, modelsummary, MplusAutomation, mrgsolve, multgee, multilevelTools, MuMIn, mvctm, mvglmmRank, nimble, nlmeU, nlmixr2, nlmixr2data, ordinal, pan, parameters, partR2, pass.lme, pbkrtest, pedigreemm, performance, pez, Phxnlme, phyr, piecewiseSEM, PKPDsim, plm, QGglmm, qgtools, qrLMM, qrNLMM, QTLRel, R2BayesX, r2glmm, R2jags, R2OpenBUGS, repeated, rjags, RLRsim, robustBLME, robustlmm, rockchalk, rptR, rrBLUP, rstan, rstanarm, RVAideMemoire, rxode2, saemix, SASmixed, sem, semtree, simr, sjPlot, skewlmm, spaMM, sphet, spind, splmm, StroupGLMM, TMB, tmbstan, varTestnlme, VetResearchLMM, vglmer, WeMix, wgeesel.|