Main imputation functions

The workflow of multiple imputation is: multiply impute the data, apply the complete-data model to each imputed data set, and pool the results to obtain the final inference. The main functions for imputing the data are: |
|
---|---|
mice: Multivariate Imputation by Chained Equations |
|
Multivariate Imputation by Chained Equations (Iteration Step) |
|
Wrapper function that runs MICE in parallel |
|
Wrapper function that runs MICE in parallel |
|
Missing data exploration

Functions to count and explore the structure of the missing data. |
|
Missing data pattern |
|
Missing data pattern by variable pairs |
|
Select complete cases |
|
Complete case indicator |
|
Select incomplete cases |
|
Incomplete case indicator |
|
Jamshidian and Jalal's Non-Parametric MCAR Test |
|
Number of complete cases |
|
Number of incomplete cases |
|
Number of imputations per block |
|
Fraction of incomplete cases among cases with observed |
|
Influx and outflux of multivariate missing data patterns |
|
Fluxplot of the missing data pattern |
|
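The kind of bookkeeping these exploration functions perform can be illustrated with a small, language-neutral sketch. The Python below is not the package's API: the helper names `missing_patterns` and `n_complete` and the toy data are hypothetical, and `None` stands in for a missing value.

```python
from collections import Counter


def missing_patterns(rows):
    """Count distinct response patterns; 1 = observed, 0 = missing (None)."""
    return Counter(tuple(0 if v is None else 1 for v in row) for row in rows)


def n_complete(rows):
    """Number of rows without any missing value."""
    return sum(all(v is not None for v in row) for row in rows)


# Hypothetical toy data with three columns (age, bmi, chl).
data = [
    (1, None, None),
    (2, 22.7, 187),
    (1, None, 187),
    (3, None, None),
    (1, 20.4, 113),
]

patterns = missing_patterns(data)
complete = n_complete(data)
incomplete = len(data) - complete

for pattern, count in sorted(patterns.items(), reverse=True):
    print(pattern, count)
print("complete:", complete, "incomplete:", incomplete)
```

Each key of `patterns` is one missing data pattern and its value is the number of rows showing that pattern, which is the information a pattern table summarises.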
Elementary imputation functions

The elementary imputation function is the workhorse that creates the actual imputations. Elementary functions are called through the |
|
Imputation by a two-level logistic model using |
|
Imputation by a two-level normal model using |
|
Imputation by a two-level normal model |
|
Imputation by a two-level normal model using |
|
Imputation of most likely value within the class |
|
Imputation at level 2 by Bayesian linear regression |
|
Imputation at level 2 by predictive mean matching |
|
Imputation by classification and regression trees |
|
Multivariate multilevel imputation using |
|
Imputation by direct use of lasso logistic regression |
|
Imputation by direct use of lasso linear regression |
|
Imputation by indirect use of lasso logistic regression |
|
Imputation by indirect use of lasso linear regression |
|
Imputation by linear discriminant analysis |
|
Imputation by logistic regression |
|
Imputation by logistic regression using the bootstrap |
|
Imputation by the mean |
|
Imputation by predictive mean matching with distance aided donor selection |
|
Imputation under MNAR mechanism by NARFCS |
|
Imputation by multivariate predictive mean matching |
|
Imputation by Bayesian linear regression |
|
Imputation by linear regression, bootstrap method |
|
Imputation by linear regression without parameter uncertainty |
|
Imputation by linear regression through prediction |
|
Impute multilevel missing data using |
|
Passive imputation |
|
Imputation by predictive mean matching |
|
Imputation of ordered data by polytomous regression |
|
Imputation of unordered data by polytomous regression |
|
Imputation of quadratic terms |
|
Imputation by random forests |
|
Imputation by the random indicator method for nonignorable data |
|
Imputation by simple random sampling |
|
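To give a feel for what an elementary imputation function does, here is a deliberately simplified sketch of predictive mean matching with a single predictor. It is written in Python and is not the package's implementation: the names `fit_line` and `pmm_impute` are hypothetical, and the sketch uses a plain least squares fit, so it leaves out the parameter uncertainty that a proper imputation method should reflect.

```python
import random


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x with a single predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def pmm_impute(x, y, donors=3, rng=None):
    """Fill missing y (None) with an observed y from a nearby donor.

    Donors are the observed cases whose predicted value is closest to the
    predicted value of the incomplete case; one donor is drawn at random.
    """
    rng = rng or random.Random()
    obs = [i for i, yi in enumerate(y) if yi is not None]
    a, b = fit_line([x[i] for i in obs], [y[i] for i in obs])
    pred = [a + b * xi for xi in x]
    out = list(y)
    for i, yi in enumerate(y):
        if yi is None:
            nearest = sorted(obs, key=lambda j: abs(pred[j] - pred[i]))[:donors]
            out[i] = y[rng.choice(nearest)]
    return out


# Hypothetical toy data: x is fully observed, y has two missing entries.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, None, 6.2, 7.9, None, 12.1]
completed = pmm_impute(x, y, donors=2, rng=random.Random(1))
print(completed)
```

Because every imputed value is copied from an observed case, the imputations always stay within the range of the observed data.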
Imputation model helpers

Specification of the imputation models can be made more convenient using the following set of helpers. |
|
Construct blocks from |
|
Creates a |
|
Creates a |
|
Creates a |
|
Creates a |
|
Create calltype of the imputation model |
|
Creates a |
|
Creates a |
|
Creates a |
|
Creates a |
|
Name imputation blocks |
|
Name formula list elements |
|
Quick selection of predictors from the data |
|
Squeeze the imputed values to be within specified boundaries. |
|
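The last helper in this list, squeezing values into specified boundaries, amounts to clamping. A minimal sketch in Python, with a hypothetical `squeeze` function that is not the package's own signature:

```python
def squeeze(values, lower, upper):
    """Clamp each value to the closed interval [lower, upper]."""
    if lower > upper:
        raise ValueError("lower bound exceeds upper bound")
    return [min(max(v, lower), upper) for v in values]


# Hypothetical imputed values, squeezed into the range 0 to 10.
imputed = [-2.5, 3.0, 11.7, 10.0]
squeezed = squeeze(imputed, 0.0, 10.0)
print(squeezed)
```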
Plots comparing observed to imputed/amputed data

These plots contrast the observed data with the imputed/amputed data, usually with a blue/red distinction. |
|
Box-and-whisker plot of observed and imputed data |
|
Density plot of observed and imputed data |
|
Multiply imputed data set ( |
|
Stripplot of observed and imputed data |
|
Scatterplot of observed and imputed data |
|
Repeated analyses and combining analytic estimates

Multiple imputation creates m > 1 completed data sets, fits the model of interest to each of these, and combines the analytic estimates. The following functions assist in executing the analysis and pooling steps: |
|
Evaluate an expression in multiple imputed datasets |
|
Combine estimates by pooling rules |
|
Pools R^2 of m models fitted to multiply-imputed data |
|
Multiple imputation pooling: univariate version |
|
Combines estimates from a tidy table |
|
Cumulative hazard rate or Nelson-Aalen estimator |
|
Compare two nested models fitted to imputed data |
|
Compare several nested models |
|
Fix coefficients and update model |
|
Compare two nested models using D1-statistic |
|
Compare two nested models using D2-statistic |
|
Compare two nested models using D3-statistic |
|
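The pooling step for a single scalar estimate can be sketched with Rubin's rules. The Python below is an illustration only, not the package's pooling code: `pool_scalar` and the toy numbers are hypothetical, and degrees of freedom and confidence intervals are omitted.

```python
from statistics import mean, variance


def pool_scalar(estimates, variances):
    """Pool m estimates of one parameter using Rubin's rules.

    estimates: the m point estimates, one per completed data set
    variances: the m squared standard errors of those estimates
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need m > 1 estimates with matching variances")
    qbar = mean(estimates)        # pooled point estimate
    ubar = mean(variances)        # within-imputation variance
    b = variance(estimates)       # between-imputation variance (m - 1 denominator)
    total = ubar + (1 + 1 / m) * b
    return {"estimate": qbar, "within": ubar, "between": b, "total": total}


# Hypothetical results from m = 3 completed-data analyses.
pooled = pool_scalar([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled)
```

The total variance adds the between-imputation variance, inflated by the factor 1 + 1/m, to the average within-imputation variance, so the pooled standard error reflects the uncertainty due to the missing data.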
Data manipulation

The multiply-imputed data can be combined in various ways, and exported into other formats. |
|
Extracts the completed data from a |
|
Combine R objects by rows and columns |
|
Enlarge number of imputations by combining |
|
Converts an imputed dataset (long format) into a |
|
Create a |
|
Converts into a |
|
Subset rows of a |
|
Export |
|
Export |
|
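As a rough picture of what extracting completed data and converting to long format involve, consider the following Python sketch. The representation is an assumption made for illustration: imputations are stored as one dictionary per imputation, keyed by (row, column), and the functions `complete` and `complete_long` are hypothetical stand-ins rather than the package's functions.

```python
def complete(data, imputations, action):
    """Return a completed copy of data using the action-th (1-based) imputation."""
    fills = imputations[action - 1]
    return [
        [fills[(i, j)] if v is None else v for j, v in enumerate(row)]
        for i, row in enumerate(data)
    ]


def complete_long(data, imputations):
    """Stack all completed data sets, prefixing each row with
    the imputation number and the original row index."""
    stacked = []
    for m in range(1, len(imputations) + 1):
        for i, row in enumerate(complete(data, imputations, m)):
            stacked.append([m, i] + row)
    return stacked


# Hypothetical incomplete data (None = missing) and m = 2 sets of imputations.
data = [[1, None], [2, 22.7], [3, None]]
imputations = [
    {(0, 1): 20.4, (2, 1): 25.5},
    {(0, 1): 22.0, (2, 1): 27.2},
]

first = complete(data, imputations, 1)
long_form = complete_long(data, imputations)
print(first)
print(long_form)
```

Storing only the imputed cells alongside the original data keeps the object compact; the completed data sets are generated on demand.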
Class descriptions

The data created at the various analytic phases are stored as list objects of a specific class. The most important classes and class-test functions are: |
|
Multiply imputed data set ( |
|
Create an object of class "mira" |
|
|
|
Check for |
|
Check for |
|
Check for |
|
Check for |
|
Extraction functions

Helpers to extract and print information from objects of specific classes. |
|
Computes convergence diagnostics for a |
|
Extract list of fitted models |
|
Extract estimate from |
|
Glance method to extract information from a |
|
Multiply imputed data set ( |
|
|
Print a |
Summary of a |
|
Tidy method to extract results from a |
|
Low-level imputation functions

Several functions are dedicated to common low-level operations to generate the imputations: |
|
Computes least squares parameters |
|
Draws values of beta and sigma by Bayesian linear regression |
|
Finds an imputed value from matches in the predictive metric (deprecated) |
|
Multivariate amputation

Amputation is the inverse of imputation: it starts with a complete dataset and creates missing data patterns according to a posited missing data mechanism. Amputation is useful for simulation studies. |
|
Generate missing data for simulation purposes |
|
Box-and-whisker plot of amputed and non-amputed data |
|
Scatterplot of amputed and non-amputed data against weighted sum scores |
|
Check for |
|
Multivariate amputed data set ( |
|
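The simplest case of amputation, deleting values completely at random, can be sketched as follows. This Python example is only an illustration under strong simplifying assumptions: `ampute_mcar` is a hypothetical helper, each selected row loses exactly one randomly chosen value, and mechanisms that depend on the data are not covered.

```python
import random


def ampute_mcar(rows, prop=0.3, rng=None):
    """Make roughly a fraction `prop` of the rows incomplete, completely at random.

    Each selected row gets one randomly chosen value replaced by None.
    """
    rng = rng or random.Random()
    out = []
    for row in rows:
        row = tuple(row)
        if rng.random() < prop:
            j = rng.randrange(len(row))
            row = row[:j] + (None,) + row[j + 1:]
        out.append(row)
    return out


# Hypothetical complete data with three variables.
full = [(1, 20.4, 113), (2, 22.7, 187), (3, 24.9, 204), (1, 21.1, 131)]
amputed = ampute_mcar(full, prop=0.5, rng=random.Random(42))
print(amputed)
```

In a simulation study the amputed data would then be imputed, and the results compared with those obtained from the complete data.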
Datasets

Built-in datasets |
|
Growth of Dutch boys |
|
Brandsma school data used by Snijders and Bosker (2012) |
|
Employee selection data |
|
SE Fireworks disaster data |
|
Fifth Dutch growth study 2009 |
|
Leiden 85+ study |
|
Mammal sleep data |
|
MNAR demo data |
|
NHANES example - all variables numerical |
|
NHANES example - mixed numerical and discrete variables |
|
Datasets with various missing data patterns |
|
Hox pupil popularity data with missing popularity scores |
|
Project on preterm and small for gestational age infants (POPS) |
|
Potthoff-Roy data |
|
Self-reported and measured BMI |
|
Terneuzen birth cohort |
|
Toenail data |
|
Toenail data |
|
Walking disability data |
|
Subset of Irish wind speed data |
|
Miscellaneous functions |
|
Appends specified break to the data |
|
Extract broken stick estimates from a |
|
Generalized linear model for |
|
Linear regression for |
|
Find index of matched donor units |
|
Graphical parameter for missing data plots |
|
Set the theme for the plotting Trellis functions |
|
Supports semi-transparent foreground colors? |
|
Echoes the package version number |