This function converts imputed data stored in long format into
an object of class mids. The original incomplete dataset
needs to be available so that we know where the missing data are.
The function is useful for converting imputed data that have been
manipulated in long format back into a mids object. It may also be
used to store multiply imputed data sets from other software
in the format used by mice.
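The round trip described above can be sketched as follows. This is a minimal illustration, not a prescribed workflow: the derived variable bmi_sq, the number of imputations and the seed are assumptions chosen for the example.

```r
library(mice)

# Impute nhanes with a small number of imputations (m and seed are arbitrary)
imp <- mice(nhanes, m = 3, printFlag = FALSE, seed = 1)

# Stack the original data (.imp = 0) and the completed data sets
long <- complete(imp, action = "long", include = TRUE)

# Hypothetical derived variable, computed on every row of the long data
long$bmi_sq <- long$bmi^2

# Convert back; bmi_sq is missing wherever bmi was missing in the original data
imp2 <- as.mids(long)

# Analyse and pool as usual
fit <- with(imp2, lm(chl ~ bmi_sq))
pooled <- pool(fit)
```

Because the original rows are included (include = TRUE), as.mids() can tell which values of the new variable are imputed and which are observed.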
as.mids(long, where = NULL, .imp = ".imp", .id = ".id")
long: A multiply imputed data set in long format, for example
produced by a call to complete(..., action = 'long', include = TRUE),
or by other software.
where: A data frame or matrix with logicals of the same dimensions
as data indicating where in the data the imputations should be
created. The default, where = is.na(data), specifies that the
missing data should be imputed. The where argument may be used to
overimpute observed data, or to skip imputations for selected missing values.
Note: Imputation methods that generate imputations outside of
mice, like mice.impute.panImpute(), may depend on a complete
predictor space. In that case, a custom where matrix cannot be
specified.
.imp: An optional column number or column name in long,
indicating the imputation index. The values are assumed to be consecutive
integers between 0 and m. Values 1 through m correspond to the
imputation index; value 0 indicates the original data (with missing values).
By default, the procedure will search for a variable named ".imp".
.id: An optional column number or column name in long,
indicating the subject identification. If not specified, then the
function searches for a variable named ".id". If this variable
is found, the values in the column will define the row names in
the data element of the resulting mids object.
An object of class mids
The function expects the input data long to be sorted by
imputation number (variable ".imp" by default), and in the
same sequence within each imputation block.
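If the long data have been reordered along the way, the expected ordering can be restored before conversion. The sketch below assumes the columns carry the default names ".imp" and ".id"; the shuffling step only simulates data that arrive out of order.

```r
library(mice)

imp <- mice(nhanes, m = 2, printFlag = FALSE, seed = 1)
long <- complete(imp, action = "long", include = TRUE)

# Simulate long data that arrive in arbitrary row order
set.seed(2)
shuffled <- long[sample(nrow(long)), ]

# Sort by imputation number first, then by subject within each block,
# so that every imputation block lists the subjects in the same sequence
sorted <- shuffled[order(shuffled$.imp, shuffled$.id), ]
rownames(sorted) <- NULL

imp2 <- as.mids(sorted)
```

Sorting on .id within .imp gives every block the same sequence, which is the property the function relies on; the resulting row order may differ from that of the original data if .id is stored as character.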
# impute the nhanes dataset
imp <- mice(nhanes, print = FALSE)
# extract the data in long format
X <- complete(imp, action = "long", include = TRUE)
# create dataset with .imp variable as numeric
X2 <- X
# nhanes example without .id
test1 <- as.mids(X)
is.mids(test1)
#> [1] TRUE
identical(complete(test1, action = "long", include = TRUE), X)
#> [1] TRUE
# nhanes example without .id where .imp is numeric
test2 <- as.mids(X2)
is.mids(test2)
#> [1] TRUE
identical(complete(test2, action = "long", include = TRUE), X)
#> [1] TRUE
# nhanes example, where we explicitly specify .id by name
test3 <- as.mids(X, .id = ".id")
is.mids(test3)
#> [1] TRUE
identical(complete(test3, action = "long", include = TRUE), X)
#> [1] TRUE
# nhanes example with .id where .imp is numeric
test4 <- as.mids(X2, .id = 6)
is.mids(test4)
#> [1] TRUE
identical(complete(test4, action = "long", include = TRUE), X)
#> [1] TRUE
# example without an .id variable
# variable .id not preserved
X3 <- X[, -6]
test5 <- as.mids(X3)
is.mids(test5)
#> [1] TRUE
identical(complete(test5, action = "long", include = TRUE)[, -6], X[, -6])
#> [1] TRUE
# the where argument also copies observed data into the $imp element
where <- matrix(TRUE, nrow = nrow(nhanes), ncol = ncol(nhanes))
colnames(where) <- colnames(nhanes)
test11 <- as.mids(X, where = where)
identical(complete(test11, action = "long", include = TRUE), X)
#> [1] TRUE
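As a further illustration, completed data sets produced by other software can be stacked by hand into the long format that as.mids() expects. In this sketch the list completed stands in for externally imputed data (it is generated with mice only to keep the example self-contained), and add_index() is a hypothetical helper, not part of mice.

```r
library(mice)

# Original incomplete data
original <- nhanes

# Stand-in for m = 2 completed data sets obtained from other software
imp <- mice(nhanes, m = 2, printFlag = FALSE, seed = 1)
completed <- list(complete(imp, 1), complete(imp, 2))

# Hypothetical helper: prepend the imputation index and a subject id
add_index <- function(d, i) cbind(.imp = i, .id = seq_len(nrow(d)), d)

# Block 0 holds the original data; blocks 1..m hold the completed data
blocks <- c(
  list(add_index(original, 0)),
  Map(add_index, completed, seq_along(completed))
)
long <- do.call(rbind, blocks)

imp2 <- as.mids(long)

# The imputed values survive the conversion
all.equal(complete(imp2, 1)$bmi, complete(imp, 1)$bmi)
```

Stacking block 0 first and keeping the same subject order in every block satisfies the sorting requirement without an extra sort step.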