Main imputation functions

Multiple imputation follows three steps: multiply impute the data, fit the complete-data model to each imputed data set, and pool the results into a final inference. The main functions for imputing the data are:

mice()

mice: Multivariate Imputation by Chained Equations

mice.mids()

Multivariate Imputation by Chained Equations (Iteration Step)

parlmice()

Wrapper function that runs MICE in parallel

futuremice()

Wrapper function that runs MICE in parallel
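
As a minimal sketch of the three-step workflow, the R code below imputes the built-in nhanes data with mice(), analyzes each imputed data set with with(), and pools the estimates with pool(). The model bmi ~ age + chl and the settings m = 5 and seed = 123 are arbitrary choices for illustration, not recommendations.

```r
library(mice)

# Step 1: create m = 5 imputed versions of the built-in nhanes data
imp <- mice(nhanes, m = 5, seed = 123, printFlag = FALSE)

# Optional: continue the same chains for 5 more iterations
imp_more <- mice.mids(imp, maxit = 5, printFlag = FALSE)

# Step 2: fit the complete-data model to each imputed data set
fit <- with(imp_more, lm(bmi ~ age + chl))

# Step 3: pool the m sets of estimates into one inference
pooled <- pool(fit)
summary(pooled)
```

The object returned by mice() stores the imputations only, so the analysis and pooling steps can be repeated with a different model without re-imputing.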

Missing data exploration

Functions to count and explore the structure of the missing data.

md.pattern()

Missing data pattern

md.pairs()

Missing data pattern by variable pairs

cc()

Select complete cases

cci()

Complete case indicator

ic()

Select incomplete cases

ici()

Incomplete case indicator

mcar()

Jamshidian and Jalal's Non-Parametric MCAR Test

ncc()

Number of complete cases

nic()

Number of incomplete cases

nimp()

Number of imputations per block

fico()

Fraction of incomplete cases among cases with the variable observed

flux()

Influx and outflux of multivariate missing data patterns

fluxplot()

Fluxplot of the missing data pattern
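
A short sketch of how some of these exploration functions might be combined on the built-in nhanes data. The choice of functions and the order of the calls are illustrative only.

```r
library(mice)

# Tabulate the missing data patterns (plot = FALSE returns only the matrix)
patterns <- md.pattern(nhanes, plot = FALSE)
patterns

# Count complete and incomplete cases
n_complete <- ncc(nhanes)
n_incomplete <- nic(nhanes)

# Logical indicator per row: TRUE when the case is complete
is_complete <- cci(nhanes)

# Keep only the complete cases, or only the incomplete ones
complete_rows <- cc(nhanes)
incomplete_rows <- ic(nhanes)

# Influx and outflux statistics per variable
flux(nhanes)
```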

Elementary imputation functions

The elementary imputation functions are the workhorses that create the actual imputations. They are not called directly but selected through the method argument of the mice() function, where the method name is the part of the function name after mice.impute. (for example, method "pmm" calls mice.impute.pmm()). Each function imputes one or more columns in the data. Other packages also provide mice.impute.xxx functions.

mice.impute.2l.bin()

Imputation by a two-level logistic model using glmer

mice.impute.2l.lmer()

Imputation by a two-level normal model using lmer

mice.impute.2l.norm()

Imputation by a two-level normal model

mice.impute.2l.pan()

Imputation by a two-level normal model using pan

mice.impute.2lonly.mean()

Imputation of most likely value within the class

mice.impute.2lonly.norm()

Imputation at level 2 by Bayesian linear regression

mice.impute.2lonly.pmm()

Imputation at level 2 by predictive mean matching

mice.impute.cart()

Imputation by classification and regression trees

mice.impute.jomoImpute()

Multivariate multilevel imputation using jomo

mice.impute.lasso.logreg()

Imputation by direct use of lasso logistic regression

mice.impute.lasso.norm()

Imputation by direct use of lasso linear regression

mice.impute.lasso.select.logreg()

Imputation by indirect use of lasso logistic regression

mice.impute.lasso.select.norm()

Imputation by indirect use of lasso linear regression

mice.impute.lda()

Imputation by linear discriminant analysis

mice.impute.logreg()

Imputation by logistic regression

mice.impute.logreg.boot()

Imputation by logistic regression using the bootstrap

mice.impute.mean()

Imputation by the mean

mice.impute.midastouch()

Imputation by predictive mean matching with distance aided donor selection

mice.impute.mnar.logreg() mice.impute.mnar.norm()

Imputation under MNAR mechanism by NARFCS

mice.impute.mpmm()

Imputation by multivariate predictive mean matching

mice.impute.norm()

Imputation by Bayesian linear regression

mice.impute.norm.boot()

Imputation by linear regression, bootstrap method

mice.impute.norm.nob()

Imputation by linear regression without parameter uncertainty

mice.impute.norm.predict()

Imputation by linear regression through prediction

mice.impute.panImpute()

Impute multilevel missing data using pan

mice.impute.passive()

Passive imputation

mice.impute.pmm()

Imputation by predictive mean matching

mice.impute.polr()

Imputation of ordered data by polytomous regression

mice.impute.polyreg()

Imputation of unordered data by polytomous regression

mice.impute.quadratic()

Imputation of quadratic terms

mice.impute.rf()

Imputation by random forests

mice.impute.ri()

Imputation by the random indicator method for nonignorable data

mice.impute.sample()

Imputation by simple random sampling
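
A minimal sketch of how elementary functions are selected through the method argument, using the built-in nhanes2 data. Assigning "norm" to bmi is an arbitrary choice to show the mechanism, not a modeling recommendation.

```r
library(mice)

# Start from the default method for each column
meth <- make.method(nhanes2)
meth

# Override the method for one variable; "norm" dispatches to mice.impute.norm()
meth["bmi"] <- "norm"

imp <- mice(nhanes2, method = meth, m = 3, seed = 1, printFlag = FALSE)

# The methods actually used are stored in the mids object
imp$method
```

Variables without missing values get the empty method "" and are left as they are.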

Imputation model helpers

Specification of the imputation models can be made more convenient using the following set of helpers.

construct.blocks()

Construct blocks from formulas and predictorMatrix

make.blocks()

Creates a blocks argument

make.blots()

Creates a blots argument

make.formulas()

Creates a formulas argument

make.method()

Creates a method argument

make.calltype()

Create calltype of the imputation model

make.post()

Creates a post argument

make.predictorMatrix()

Creates a predictorMatrix argument

make.visitSequence()

Creates a visitSequence argument

make.where()

Creates a where argument

name.blocks()

Name imputation blocks

name.formulas()

Name formula list elements

quickpred()

Quick selection of predictors from the data

squeeze()

Squeeze the imputed values to be within specified boundaries
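
As an illustration of the helpers, the sketch below builds a predictor matrix, edits one entry, and passes it to mice(). Removing hyp as a predictor of bmi and the mincor = 0.3 threshold are arbitrary examples.

```r
library(mice)

# Default predictor matrix: rows are imputation targets, columns are predictors
pred <- make.predictorMatrix(nhanes)

# Do not use hyp when imputing bmi
pred["bmi", "hyp"] <- 0

# Alternative: select predictors automatically by a minimum correlation
pred_quick <- quickpred(nhanes, mincor = 0.3)

imp <- mice(nhanes, predictorMatrix = pred, m = 3, seed = 1, printFlag = FALSE)
imp$predictorMatrix
```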

Plots comparing observed to imputed/amputed data

These plots contrast the observed data with the imputed or amputed data, usually by color: observed values are drawn in blue and imputed values in red.

bwplot(<mids>)

Box-and-whisker plot of observed and imputed data

densityplot(<mids>)

Density plot of observed and imputed data

mids() plot(<mids>) print(<mids>) summary(<mids>)

Multiply imputed data set (mids)

stripplot(<mids>)

Stripplot of observed and imputed data

xyplot(<mids>)

Scatterplot of observed and imputed data
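
A sketch of how the plotting methods might be called on a mids object. The formulas chl ~ .imp and bmi ~ chl | .imp are illustrative; .imp is the imputation number, with 0 denoting the original data.

```r
library(mice)

imp <- mice(nhanes, m = 3, seed = 1, printFlag = FALSE)

# One strip of chl values per imputation number
p_strip <- stripplot(imp, chl ~ .imp, pch = 20, cex = 1.2)

# Densities of observed and imputed values for all imputed variables
p_dens <- densityplot(imp)

# Scatterplot of bmi against chl, one panel per imputation
p_xy <- xyplot(imp, bmi ~ chl | .imp)

# The methods return lattice objects; print them to draw
print(p_strip)
```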

Repeated analyses and combining analytic estimates

Multiple imputation creates m > 1 completed data sets, fits the model of interest to each of these, and combines the analytic estimates. The following functions assist in executing the analysis and pooling steps:

with(<mids>)

Evaluate an expression in multiply imputed datasets

pool() pool.syn()

Combine estimates by pooling rules

pool.r.squared()

Pools R^2 of m models fitted to multiply-imputed data

pool.scalar() pool.scalar.syn()

Multiple imputation pooling: univariate version

pool.table()

Combines estimates from a tidy table

nelsonaalen()

Cumulative hazard rate or Nelson-Aalen estimator

pool.compare()

Compare two nested models fitted to imputed data

anova(<mira>)

Compare several nested models

fix.coef()

Fix coefficients and update model

D1()

Compare two nested models using D1-statistic

D2()

Compare two nested models using D2-statistic

D3()

Compare two nested models using D3-statistic
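
The pooling rules can be illustrated on scalar estimates with pool.scalar(). The sketch below uses made-up estimates Q and variances U from m = 3 hypothetical analyses and checks the result against the formula for the total variance, t = ubar + (1 + 1/m) * b.

```r
library(mice)

# Hypothetical estimates and their variances from m = 3 imputed data sets
Q <- c(1.0, 1.2, 1.1)
U <- c(0.04, 0.05, 0.045)
m <- length(Q)

res <- pool.scalar(Q, U, n = 100)

# Pooled estimate: the mean of the m estimates
qbar <- mean(Q)

# Within-imputation variance, between-imputation variance, total variance
ubar <- mean(U)
b <- var(Q)
total <- ubar + (1 + 1 / m) * b

c(pooled = res$qbar, by_hand = qbar)
c(pooled = res$t, by_hand = total)
```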

Data manipulation

The multiply-imputed data can be combined in various ways, and exported into other formats.

complete(<mids>)

Extracts the completed data from a mids object

cbind() rbind()

Combine R objects by rows and columns

ibind()

Enlarge number of imputations by combining mids objects

as.mids()

Converts an imputed dataset (long format) into a mids object

as.mira()

Create a mira object from repeated analyses

as.mitml.result()

Converts into a mitml.result object

filter(<mids>)

Subset rows of a mids object

mids2mplus()

Export mids object to Mplus

mids2spss()

Export mids object to SPSS
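
A minimal sketch of extracting and recombining imputed data. The number of imputations and the seeds are arbitrary.

```r
library(mice)

imp_a <- mice(nhanes, m = 3, seed = 1, printFlag = FALSE)
imp_b <- mice(nhanes, m = 3, seed = 2, printFlag = FALSE)

# First completed data set
first <- complete(imp_a, 1)

# All imputations stacked in long format, including the original data (.imp == 0)
long <- complete(imp_a, action = "long", include = TRUE)

# Convert the long data back into a mids object
imp_back <- as.mids(long)

# Combine two mids objects into one with more imputations
imp_all <- ibind(imp_a, imp_b)
imp_all$m
```

Long format is convenient for creating derived variables in every imputed data set before converting back with as.mids().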

Class descriptions

The data created at the various analytic phases are stored as list objects of a specific class. The most important classes and class-test functions are:

mids() plot(<mids>) print(<mids>) summary(<mids>)

Multiply imputed data set (mids)

mira()

Create an object of class "mira"

mipo() summary(<mipo>) print(<mipo>) print(<mipo.summary>) process_mipo()

mipo: Multiple imputation pooled object

is.mids()

Check for mids object

is.mipo()

Check for mipo object

is.mira()

Check for mira object

is.mitml.result()

Check for mitml.result object
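
The following sketch shows which class each phase produces, using the class-test functions listed above. The model chl ~ bmi is only a placeholder.

```r
library(mice)

imp <- mice(nhanes, m = 2, seed = 1, printFlag = FALSE)  # imputation phase
fit <- with(imp, lm(chl ~ bmi))                          # analysis phase
est <- pool(fit)                                         # pooling phase

# Each phase returns an object of its own class
checks <- c(mids = is.mids(imp), mira = is.mira(fit), mipo = is.mipo(est))
checks

# The fitted model for the first imputed data set
first_model <- getfit(fit, 1)
class(first_model)
```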

Extraction functions

Helpers to extract and print information from objects of specific classes.

convergence()

Computes convergence diagnostics for a mids object

getfit()

Extract list of fitted models

getqbar()

Extract estimate from mipo object

glance(<mipo>)

Glance method to extract information from a mipo object

mids() plot(<mids>) print(<mids>) summary(<mids>)

Multiply imputed data set (mids)

print(<mira>) print(<mice.anova>) print(<mice.anova.summary>)

Print a mira object

summary(<mira>) summary(<mice.anova>)

Summary of a mira object

tidy(<mipo>)

Tidy method to extract results from a mipo object

Low-level imputation functions

Several functions are dedicated to common low-level operations to generate the imputations:

estimice()

Computes least squares parameters

norm.draw() .norm.draw()

Draws values of beta and sigma by Bayesian linear regression

.pmm.match()

Finds an imputed value from matches in the predictive metric (deprecated)

Multivariate amputation

Amputation is the inverse of imputation: it starts with a complete dataset and creates missing data patterns according to a posited missing data mechanism. Amputation is useful for simulation studies.

ampute()

Generate missing data for simulation purposes

bwplot(<mads>)

Box-and-whisker plot of amputed and non-amputed data

xyplot(<mads>)

Scatterplot of amputed and non-amputed data against weighted sum scores

is.mads()

Check for mads object

mads() print(<mads>) summary(<mads>)

Multivariate amputed data set (mads)
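
A minimal amputation sketch on simulated data. The data frame, the proportion prop = 0.3, and the MAR mechanism are assumptions chosen for illustration.

```r
library(mice)

set.seed(2024)

# A complete, numeric data set to start from (simulated for this example)
complete_data <- data.frame(
  x = rnorm(200),
  y = rnorm(200),
  z = rnorm(200)
)

# Make roughly 30% of the rows incomplete under a MAR mechanism
amp <- ampute(complete_data, prop = 0.3, mech = "MAR")

# The amputed data are stored in the amp element of the mads object
amputed <- amp$amp
md.pattern(amputed, plot = FALSE)
```

The amputed data can then be passed to mice() to evaluate how well an imputation procedure recovers the known complete-data estimates.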

Datasets

Built-in datasets

boys

Growth of Dutch boys

brandsma

Brandsma school data used by Snijders and Bosker (2012)

employee

Employee selection data

fdd fdd.pred

SE Fireworks disaster data

fdgs

Fifth Dutch growth study 2009

leiden85

Leiden 85+ study

mammalsleep sleep

Mammal sleep data

mnar_demo_data

MNAR demo data

nhanes

NHANES example - all variables numerical

nhanes2

NHANES example - mixed numerical and discrete variables

pattern pattern1 pattern2 pattern3 pattern4

Datasets with various missing data patterns

popmis

Hox pupil popularity data with missing popularity scores

pops pops.pred

Project on preterm and small for gestational age infants (POPS)

potthoffroy

Potthoff-Roy data

selfreport mgg

Self-reported and measured BMI

tbc tbc.target terneuzen

Terneuzen birth cohort

toenail

Toenail data

toenail2

Toenail data

walking

Walking disability data

windspeed

Subset of Irish wind speed data

Miscellaneous functions

appendbreak()

Appends specified break to the data

extractBS()

Extract broken stick estimates from a lmer object

glm.mids()

Generalized linear model for mids object

lm.mids()

Linear regression for mids object

matchindex()

Find index of matched donor units

mdc()

Graphical parameter for missing data plots

mice.theme()

Set the theme for the plotting Trellis functions

supports.transparent()

Supports semi-transparent foreground colors?

version()

Echoes the package version number