The main purpose of the lavPredict() function is to compute (or
'predict') individual scores for the latent variables in the model
('factor scores'). NOTE: the goal of this
function is NOT to predict future values of dependent variables as in the
regression framework! (For models with only continuous observed variables, the function lavPredictY() supports this.)
lavPredict(object, newdata = NULL, type = "lv", method = "EBM",
transform = FALSE, se = "none", acov = "none",
label = TRUE, fsm = FALSE, mdist = FALSE, rel = FALSE,
append.data = FALSE, assemble = FALSE,
level = 1L, optim.method = "bfgs", ETA = NULL,
drop.list.single.group = TRUE)
object
An object of class lavaan.
newdata
An optional data.frame, containing the same variables as the data.frame used when fitting the model in object.
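As a minimal sketch of scoring cases supplied through newdata, the block below fits the three-factor Holzinger-Swineford model from the examples and then computes factor scores for a small data.frame. The object name new_cases and the choice of the first ten rows are illustrative assumptions only.

```r
library(lavaan)

# Three-factor CFA on the built-in Holzinger-Swineford data
HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '
fit <- cfa(HS.model, data = HolzingerSwineford1939)

# Hypothetical "new" cases: here simply the first ten rows of the same data
new_cases <- HolzingerSwineford1939[1:10, ]

# Factor scores for the new cases, based on the parameters estimated in 'fit'
fs_new <- lavPredict(fit, newdata = new_cases)
dim(fs_new)
```

The new data.frame must contain all observed variables used in the model (x1 to x9 here); extra columns such as id or school are simply carried along in the input.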
type
A character string. If "lv", estimated values for the latent
variables in the model are computed. If "ov", model predicted values for
the indicators of the latent variables in the model are computed. If
"yhat", the estimated values for the observed indicators are computed, given
user-specified values for the latent variables provided by the ETA
argument. If "fy", densities (or probabilities) for each observed
indicator are computed, given user-specified values for the latent variables
provided by the ETA argument.
method
A character string. In the linear case (when the indicators are
continuous), the possible options are "regression" or "Bartlett".
In the categorical case, the two options are "EBM" for
the Empirical Bayes Modal approach, and "ML" for the maximum
likelihood approach.
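To illustrate the two options for continuous indicators, the following sketch computes both regression and Bartlett factor scores for the same fitted model and compares them. The object names fs_reg and fs_bart are illustrative, and the comparison is only a quick descriptive check, not a recommendation for either method.

```r
library(lavaan)

HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '
fit <- cfa(HS.model, data = HolzingerSwineford1939)

# Same model, two scoring methods for continuous indicators
fs_reg  <- lavPredict(fit, method = "regression")
fs_bart <- lavPredict(fit, method = "Bartlett")

# Per-factor correlation between the two sets of scores
diag(cor(fs_reg, fs_bart))

# Per-factor variances of each set of scores, side by side
rbind(regression = apply(fs_reg, 2, var),
      Bartlett   = apply(fs_bart, 2, var))
```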
transform
Logical. If TRUE, transform the factor scores (per
group) so that their mean and variance-covariance matrix match the
model-implied mean and variance-covariance matrix. This may be useful if the
individual factor scores will be used in a follow-up (regression) analysis.
Note: the standard errors (if requested) are not transformed (yet). The resulting
factor scores are often called correlation-preserving factor scores.
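A small sketch of what the transformation is meant to achieve: the covariance matrix of the transformed scores is compared with the model-implied latent covariance matrix obtained via lavInspect(fit, "cov.lv"). The object names are illustrative, and small discrepancies may remain depending on whether N or N - 1 is used as the divisor.

```r
library(lavaan)

HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '
fit <- cfa(HS.model, data = HolzingerSwineford1939)

# Correlation-preserving factor scores
fs_t <- lavPredict(fit, transform = TRUE)

# Model-implied covariance matrix of the latent variables
psi <- unclass(lavInspect(fit, "cov.lv"))
psi <- psi[colnames(fs_t), colnames(fs_t)]

# Largest absolute difference between the two covariance matrices
cov_fs   <- cov(fs_t)
max_diff <- max(abs(cov_fs - psi))
max_diff

# For comparison: untransformed scores do not reproduce psi
fs_raw <- lavPredict(fit)
max(abs(cov(fs_raw) - psi))
```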
se
Character. If "none", no standard errors are computed.
If "standard", naive standard errors are computed (assuming the
parameters of the measurement model are known). The standard errors are
returned as an attribute. Currently only available for complete continuous
data.
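As a hedged sketch, the standard errors can be requested and then retrieved with attr(). The attribute name "se" is assumed here from the description above; its exact layout (for example, a list with one element per group) may differ between lavaan versions, so str() is used to inspect it rather than assuming a shape.

```r
library(lavaan)

HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '
fit <- cfa(HS.model, data = HolzingerSwineford1939)

# Factor scores with naive standard errors attached as an attribute
fs <- lavPredict(fit, se = "standard")

# Retrieve and inspect the attribute (layout assumed, so inspect first)
fs_se <- attr(fs, "se")
str(fs_se)
```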
acov
Similar to the "se" argument, but optionally returns the full
sampling covariance matrix of factor scores as an attribute. Currently
only available for complete continuous data.
label
Logical. If TRUE, the columns in the output are labeled.
fsm
Logical. If TRUE, return the factor score matrix as an attribute. Only for numeric data.
mdist
Logical. If TRUE, the (squared)
Mahalanobis distances of the factor scores (if type = "lv") or
the casewise residuals (if type = "resid") are returned as an
attribute.
rel
Logical. Only used if type = "lv". If TRUE,
the factor reliabilities are returned as an attribute. (The squared
values are often called the factor determinacies.)
append.data
Logical. Only used when type = "lv". If TRUE,
the original data (or the data provided
in the newdata argument) is appended to the factor scores.
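A brief sketch of append.data = TRUE as an alternative to merging factor scores by hand. The object name fs_data is illustrative; which data columns are appended is inspected with colnames() rather than assumed.

```r
library(lavaan)

HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '
fit <- cfa(HS.model, data = HolzingerSwineford1939)

# Factor scores with the data appended as extra columns
fs_data <- lavPredict(fit, append.data = TRUE)

# Inspect which columns were returned
colnames(fs_data)
head(fs_data, 3)
```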
assemble
Logical. If TRUE, the separate groups of a multiple-group analysis are reassembled to form a single data.frame with a group column, having the same dimensions as the original (or newdata) dataset.
level
Integer. Only used in a multilevel SEM.
If level = 1, only factor scores for latent variables
defined at the first (within) level are computed; if level = 2,
only factor scores for latent variables defined at the second (between) level
are computed.
optim.method
Character string. Only used in the categorical case.
If "nlminb" (the default in 0.5), the nlminb() function is used
for the optimization. If "bfgs" or "BFGS" (the default in 0.6),
the optim() function is used with the BFGS method.
ETA
An optional matrix or list, containing latent variable values
for each observation. Used for computations when type = "ov".
drop.list.single.group
Logical. If FALSE, the results are returned as
a list, where each element corresponds to a group (even if there is only
a single group). If TRUE, the list will be unlisted if there is
only a single group.
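The difference between the two settings can be sketched for a single-group model as follows. The object names are illustrative; the point is that code written for multiple groups (looping over list elements) can also be used for a single group when drop.list.single.group = FALSE.

```r
library(lavaan)

HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '
fit <- cfa(HS.model, data = HolzingerSwineford1939)

# Default: a single matrix for a single-group model
fs_default <- lavPredict(fit)
is.list(fs_default)

# Keep the list structure: one element per group, even with one group
fs_list <- lavPredict(fit, drop.list.single.group = FALSE)
is.list(fs_list)
length(fs_list)

# Group-wise code then works unchanged
for (g in seq_along(fs_list)) {
    print(dim(fs_list[[g]]))
}
```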
The predict() function calls the lavPredict() function
with its default options.
If there are no latent variables in the model, type = "ov" will
simply return the values of the observed variables. Note that this function
cannot be used to 'predict' values of dependent variables, given the
values of independent variables (in the regression sense). In other words,
the structural component is completely ignored (for now).
lavPredictY to predict y-variables given x-variables.
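For contrast with lavPredict(), here is a hedged sketch of lavPredictY(), which predicts observed y-variables from observed x-variables. It assumes a lavaan version that provides lavPredictY() with ynames and xnames arguments; the split into training and test rows, and treating x7 to x9 as the y-variables, are arbitrary illustrative choices.

```r
library(lavaan)

HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '

# Illustrative split: every fifth row is held out as test data
n         <- nrow(HolzingerSwineford1939)
test_idx  <- seq(1, n, by = 5)
train_dat <- HolzingerSwineford1939[-test_idx, ]
test_dat  <- HolzingerSwineford1939[test_idx, ]

# Fit on the training rows, including a mean structure
fit <- cfa(HS.model, data = train_dat, meanstructure = TRUE)

# Predict the 'speed' indicators from the other six indicators
yhat <- lavPredictY(fit, newdata = test_dat,
                    ynames = c("x7", "x8", "x9"),
                    xnames = c("x1", "x2", "x3", "x4", "x5", "x6"))
head(yhat)
```

The predicted values in yhat can then be compared with the observed x7 to x9 in test_dat to assess out-of-sample prediction error.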
data(HolzingerSwineford1939)
## fit model
HS.model <- ' visual  =~ x1 + x2 + x3
              textual =~ x4 + x5 + x6
              speed   =~ x7 + x8 + x9 '
fit <- cfa(HS.model, data = HolzingerSwineford1939)
#> Warning: lavaan->lav_model_vcov():
#> The variance-covariance matrix of the estimated parameters (vcov) does not
#> appear to be positive definite! The smallest eigenvalue (= -1.747972e-02)
#> is smaller than zero. This may be a symptom that the model is not
#> identified.
head(lavPredict(fit))
#> visual textual speed
#> [1,] -0.81767524 -0.13754501 0.06150726
#> [2,] 0.04951940 -1.01272402 0.62549360
#> [3,] -0.76139670 -1.87228634 -0.84057276
#> [4,] 0.41934153 0.01848569 -0.27133710
#> [5,] -0.41590481 -0.12225009 0.19432951
#> [6,] 0.02325632 -1.32981727 0.70885348
head(lavPredict(fit, type = "ov"))
#> x1 x2 x3 x4 x5 x6 x7 x8
#> [1,] 4.118094 5.635456 1.654027 2.923363 4.187433 2.0581851 4.247409 5.599652
#> [2,] 4.985289 6.115449 2.286533 2.048184 3.213292 1.2476414 4.811396 6.265128
#> [3,] 4.174373 5.666607 1.695075 1.188622 2.256533 0.4515610 3.345329 4.535242
#> [4,] 5.355111 6.320146 2.556271 3.079394 4.361108 2.2026924 3.914565 5.206912
#> [5,] 4.519865 5.857836 1.947067 2.938658 4.204458 2.0723504 4.380232 5.756376
#> [6,] 4.959026 6.100912 2.267378 1.731091 2.860343 0.9539666 4.894756 6.363489
#> x9
#> [1,] 5.440645
#> [2,] 6.050613
#> [3,] 4.465019
#> [4,] 5.080664
#> [5,] 5.584297
#> [6,] 6.140770
## ------------------------------------------
## merge factor scores to original data.frame
## ------------------------------------------
idx <- lavInspect(fit, "case.idx")
fscores <- lavPredict(fit)
## loop over factors
for (fs in colnames(fscores)) {
    HolzingerSwineford1939[idx, fs] <- fscores[ , fs]
}
head(HolzingerSwineford1939)
#> id sex ageyr agemo school grade x1 x2 x3 x4 x5 x6
#> 1 1 1 13 1 Pasteur 7 3.333333 7.75 0.375 2.333333 5.75 1.2857143
#> 2 2 2 13 7 Pasteur 7 5.333333 5.25 2.125 1.666667 3.00 1.2857143
#> 3 3 2 13 1 Pasteur 7 4.500000 5.25 1.875 1.000000 1.75 0.4285714
#> 4 4 1 13 2 Pasteur 7 5.333333 7.75 3.000 2.666667 4.50 2.4285714
#> 5 5 2 12 2 Pasteur 7 4.833333 4.75 0.875 2.666667 4.00 2.5714286
#> 6 6 2 14 1 Pasteur 7 5.333333 5.00 2.250 1.000000 3.00 0.8571429
#> x7 x8 x9 visual textual speed
#> 1 3.391304 5.75 6.361111 -0.81767524 -0.13754501 0.06150726
#> 2 3.782609 6.25 7.916667 0.04951940 -1.01272402 0.62549360
#> 3 3.260870 3.90 4.416667 -0.76139670 -1.87228634 -0.84057276
#> 4 3.000000 5.30 4.861111 0.41934153 0.01848569 -0.27133710
#> 5 3.695652 6.30 5.916667 -0.41590481 -0.12225009 0.19432951
#> 6 4.347826 6.65 7.500000 0.02325632 -1.32981727 0.70885348
## multigroup models return a list of factor scores (one per group)
data(HolzingerSwineford1939)
mgfit <- update(fit, group = "school", group.equal = c("loadings","intercepts"))
#> Warning: lavaan->lav_lavaan_step11_estoptim():
#> Model estimation FAILED! Returning starting values.
#> Error in lav_mvnorm_loglik_samplestats(sample.mean = lavsamplestats@mean[[g]], sample.cov = lavsamplestats@cov[[g]], sample.nobs = lavsamplestats@nobs[[g]], Mu = Mu, Sigma = lavimplied$cov[[g]], x.idx = lavsamplestats@x.idx[[g]], x.mean = lavsamplestats@mean.x[[g]], x.cov = lavsamplestats@cov.x[[g]], Sinv.method = "eigen", Sigma.inv = NULL): non-conformable arguments
idx <- lavInspect(mgfit, "case.idx") # list: 1 vector per group
#> Error: object 'mgfit' not found
fscores <- lavPredict(mgfit) # list: 1 matrix per group
#> Error: object 'mgfit' not found
## loop over groups and factors
for (g in seq_along(fscores)) {
    for (fs in colnames(fscores[[g]])) {
        HolzingerSwineford1939[ idx[[g]], fs] <- fscores[[g]][ , fs]
    }
}
head(HolzingerSwineford1939)
#> id sex ageyr agemo school grade x1 x2 x3 x4 x5 x6
#> 1 1 1 13 1 Pasteur 7 3.333333 7.75 0.375 2.333333 5.75 1.2857143
#> 2 2 2 13 7 Pasteur 7 5.333333 5.25 2.125 1.666667 3.00 1.2857143
#> 3 3 2 13 1 Pasteur 7 4.500000 5.25 1.875 1.000000 1.75 0.4285714
#> 4 4 1 13 2 Pasteur 7 5.333333 7.75 3.000 2.666667 4.50 2.4285714
#> 5 5 2 12 2 Pasteur 7 4.833333 4.75 0.875 2.666667 4.00 2.5714286
#> 6 6 2 14 1 Pasteur 7 5.333333 5.00 2.250 1.000000 3.00 0.8571429
#> x7 x8 x9 visual textual speed
#> 1 3.391304 5.75 6.361111 -0.81767524 -0.13754501 0.06150726
#> 2 3.782609 6.25 7.916667 0.04951940 -1.01272402 0.62549360
#> 3 3.260870 3.90 4.416667 -0.76139670 -1.87228634 -0.84057276
#> 4 3.000000 5.30 4.861111 0.41934153 0.01848569 -0.27133710
#> 5 3.695652 6.30 5.916667 -0.41590481 -0.12225009 0.19432951
#> 6 4.347826 6.65 7.500000 0.02325632 -1.32981727 0.70885348
## -------------------------------------
## Use factor scores in subsequent models
## -------------------------------------
## see Examples in semTools package: ?plausibleValues