This function extracts the variance-covariance matrix of the estimated parameters
from a model estimated with femlm, feols or feglm.
# S3 method for class 'fixest'
vcov(
object,
vcov = NULL,
se = NULL,
cluster,
ssc = NULL,
attr = FALSE,
forceCovariance = FALSE,
keepBounded = FALSE,
nthreads = getFixest_nthreads(),
vcov_fix = TRUE,
...
)
A fixest object. Obtained using the functions femlm, feols or feglm.
Versatile argument to specify the VCOV. In general, it is either a character
scalar equal to a VCOV type, or a formula of the form: vcov_type ~ variables. The
VCOV types implemented are: "iid", "hetero" (or "HC1"), "cluster", "twoway",
"NW" (or "newey_west"), "DK" (or "driscoll_kraay"), and "conley". It also accepts
objects from vcov_cluster, vcov_NW, NW,
vcov_DK, DK, vcov_conley and
conley, as well as covariance matrices computed externally.
Finally, it accepts functions to compute the covariances. See the vcov documentation
in the vignette.
Character scalar. Which kind of standard error should be computed:
“standard”, “hetero”, “cluster”, “twoway”, “threeway”
or “fourway”? By default if there are clusters in the estimation:
se = "cluster", otherwise se = "iid". Note that this argument is deprecated,
you should use vcov instead.
Tells how to cluster the standard-errors (if clustering is requested).
Can be either a list of vectors, a character vector of variable names, a formula or
an integer vector. Assume we want to perform 2-way clustering over var1 and var2
contained in the data.frame base used for the estimation. All the following
cluster arguments are valid and do the same thing:
cluster = base[, c("var1", "var2")], cluster = c("var1", "var2"), cluster = ~var1+var2.
If the two variables were used as fixed-effects in the estimation, you can leave it
blank with vcov = "twoway" (assuming var1 [resp. var2] was
the 1st [resp. 2nd] fixed-effect). You can interact two variables using ^ with
the following syntax: cluster = ~var1^var2 or cluster = "var1^var2".
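As a brief illustration of the equivalence described above, the sketch below uses the base_did data shipped with fixest, with id and period standing in for var1 and var2. It is only an example under those assumptions, not an exhaustive list of accepted inputs.

```r
library(fixest)

data(base_did)
est = feols(y ~ x1, base_did)

# Three ways to request two-way clustering over id and period
se_formula = se(vcov(est, cluster = ~id + period))
se_names   = se(vcov(est, cluster = c("id", "period")))
se_data    = se(vcov(est, cluster = base_did[, c("id", "period")]))

# The specifications should lead to the same standard errors
all.equal(se_formula, se_names)
all.equal(se_formula, se_data)
```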
An object of class ssc.type obtained with the function ssc. Represents
how the degrees-of-freedom correction should be done. You must use the function ssc
for this argument. The arguments and defaults of the function ssc are:
ssc(adj = TRUE, fixef.K = "nested", cluster.adj = TRUE, cluster.df = "min",
t.df = "min", fixef.force_exact = FALSE). See the help of the function ssc for details.
Logical, defaults to FALSE. Whether to include the attributes describing how
the VCOV was computed.
(Advanced users.) Logical, default is FALSE. In the peculiar case
where the obtained Hessian is not invertible (usually because of collinearity of
some variables), use this option to force the covariance matrix, by using a generalized
inverse of the Hessian. This can be useful to spot where possible problems come from.
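To make the idea of a generalized inverse concrete, here is a minimal base R sketch on a made-up singular Hessian. It computes a Moore-Penrose pseudo-inverse through an eigendecomposition; this is an assumption chosen for illustration and is not necessarily the routine fixest uses internally.

```r
# Hypothetical design matrix: the 3rd column is exactly twice the 2nd
X = cbind(1, c(1, 2, 3, 4), c(2, 4, 6, 8))
H = crossprod(X)   # 3 x 3 cross-product, singular because of the collinearity

# solve(H) would fail here, so build a pseudo-inverse from the eigendecomposition
eig  = eigen(H, symmetric = TRUE)
tol  = max(abs(eig$values)) * 1e-10
keep = abs(eig$values) > tol          # drop the (numerically) zero eigenvalues

U = eig$vectors[, keep, drop = FALSE]
H_ginv = U %*% diag(1 / eig$values[keep], nrow = sum(keep)) %*% t(U)

# The dropped eigenvector loads on the collinear columns (2 and 3),
# which is one way to spot where the problem comes from
null_direction = eig$vectors[, !keep, drop = FALSE]
```

The pseudo-inverse satisfies H %*% H_ginv %*% H = H up to rounding, which is the defining property being relied on here.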
(Advanced users – feNmlm with non-linear part and bounded
coefficients only.) Logical, default is FALSE. If TRUE, then the bounded coefficients
(if any) are treated as unrestricted coefficients and their S.E. is computed (otherwise
it is not).
The number of threads. Can be: a) an integer lower than, or equal to,
the maximum number of threads; b) 0: meaning all available threads will be used;
c) a number strictly between 0 and 1 which represents the fraction of all threads to use.
The default is to use 50% of all threads. You can permanently set the number
of threads used within this package using the function setFixest_nthreads.
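The three accepted forms of nthreads can be read as the following rule. The helper resolve_nthreads is hypothetical (it is not part of fixest), and both the rounding of fractions and the capping at the maximum are assumptions made for this sketch.

```r
# Hypothetical helper: maps an nthreads value to a thread count
resolve_nthreads = function(nthreads, max_threads) {
  stopifnot(is.numeric(nthreads), length(nthreads) == 1, nthreads >= 0)

  # b) 0 means: use all available threads
  if (nthreads == 0) return(max_threads)

  # c) a value strictly between 0 and 1 is a fraction of all threads
  #    (rounding up, with at least one thread, is an assumption)
  if (nthreads < 1) return(max(1, ceiling(nthreads * max_threads)))

  # a) an integer, capped at the maximum (capping is an assumption)
  min(nthreads, max_threads)
}

resolve_nthreads(0, max_threads = 8)     # all threads
resolve_nthreads(0.5, max_threads = 8)   # half of the threads
resolve_nthreads(3, max_threads = 8)     # explicit count
```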
Logical scalar, default is TRUE. If the VCOV ends up not being
positive definite, whether to "fix" it using an eigenvalue decomposition
(a la Cameron, Gelbach & Miller 2011).
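The eigenvalue-based fix can be sketched in base R as follows. The function fix_vcov is a hypothetical name, and setting negative eigenvalues to exactly zero is an assumption in the spirit of Cameron, Gelbach & Miller (2011); the actual fixest implementation may differ in its details.

```r
# Hypothetical helper: project a symmetric matrix onto the
# positive semi-definite cone by zeroing its negative eigenvalues
fix_vcov = function(V) {
  eig = eigen(V, symmetric = TRUE)
  if (all(eig$values > 0)) return(V)   # already positive definite

  vals = pmax(eig$values, 0)           # negative eigenvalues become 0
  V_fixed = eig$vectors %*% diag(vals, nrow = length(vals)) %*% t(eig$vectors)
  dimnames(V_fixed) = dimnames(V)
  V_fixed
}

# Made-up symmetric matrix with eigenvalues 3 and -1
V = matrix(c(1, 2, 2, 1), 2, 2)
V_fixed = fix_vcov(V)
eigen(V_fixed, symmetric = TRUE)$values
```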
Other arguments to be passed to summary.fixest.
The computation of the VCOV matrix is first done in summary.fixest.
It returns a \(K\times K\) square matrix where \(K\) is the number of variables
of the fitted model.
If attr = TRUE, this matrix has an attribute “type” specifying how this
variance/covariance matrix has been computed.
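For instance, the returned matrix and its “type” attribute can be inspected as below. This is a small sketch using the base_did data shipped with fixest; the exact content of the attribute depends on the VCOV requested, so no output is shown.

```r
library(fixest)

data(base_did)
est = feols(y ~ x1, base_did)

# Request the attributes along with the clustered VCOV
V = vcov(est, cluster = ~id, attr = TRUE)

dim(V)            # K x K, here K = 2 (intercept and x1)
attr(V, "type")   # description of how the VCOV was computed

# Standard errors are the square roots of the diagonal
sqrt(diag(V))
```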
For an explanation on how the standard-errors are computed and what is the exact meaning of the arguments, please have a look at the dedicated vignette: On standard-errors.
Ding, Peng, 2021, "The Frisch–Waugh–Lovell theorem for standard errors." Statistics & Probability Letters 168.
You can also compute VCOVs with the following functions: vcov_cluster,
vcov_hac, vcov_conley.
See also the main estimation functions femlm, feols or feglm.
summary.fixest, confint.fixest, resid.fixest, predict.fixest, fixef.fixest.
# Load panel data
data(base_did)
# Simple estimation on a panel
est = feols(y ~ x1, base_did)
# ======== #
# IID VCOV #
# ======== #
# By default the VCOV assumes iid errors:
se(vcov(est))
#> (Intercept) x1
#> 0.1491554 0.0501191
# You can make the call for an iid VCOV explicitly:
se(vcov(est, "iid"))
#> (Intercept) x1
#> 0.1491554 0.0501191
#
# Heteroskedasticity-robust VCOV
#
# To obtain a heteroskedasticity-robust VCOV, request it explicitly:
se(vcov(est, "hetero"))
#> (Intercept) x1
#> 0.14902573 0.05101605
# => note that it also accepts vcov = "White" and vcov = "HC1" as aliases.
# =============== #
# Clustered VCOVs #
# =============== #
# To cluster the VCOV, you can use a formula of the form cluster ~ var1 + var2 etc
# Let's cluster by the panel ID:
se(vcov(est, cluster ~ id))
#> (Intercept) x1
#> 0.1943525 0.0467892
# Alternative ways:
# -> cluster is implicitly assumed when a one-sided formula is provided
se(vcov(est, ~ id))
#> (Intercept) x1
#> 0.1943525 0.0467892
# -> using the argument cluster instead of vcov
se(vcov(est, cluster = ~ id))
#> (Intercept) x1
#> 0.1943525 0.0467892
# For two-/three-way clustering, just add more variables:
se(vcov(est, ~ id + period))
#> (Intercept) x1
#> 0.61496508 0.04721779
# -------------------|
# Implicit deduction |
# -------------------|
# When the estimation contains FEs, the dimension on which to cluster
# is directly inferred from the FEs used in the estimation, so you don't need
# to explicitly add them.
est_fe = feols(y ~ x1 | id + period, base_did)
# Clustered along "id"
se(vcov(est_fe, "cluster"))
#> x1
#> 0.04578726
# Clustered along "id" and "period"
se(vcov(est_fe, "twoway"))
#> x1
#> 0.03417711
# =========== #
# Panel VCOVs #
# =========== #
# ---------------------|
# Newey West (NW) VCOV |
# ---------------------|
# To obtain NW VCOVs, use a formula of the form NW ~ id + period
se(vcov(est, NW ~ id + period))
#> (Intercept) x1
#> 0.17411100 0.05269927
# If you want to change the lag:
se(vcov(est, NW(3) ~ id + period))
#> (Intercept) x1
#> 0.19450009 0.05104156
# Alternative way:
# -> using the vcov_NW function
se(vcov(est, vcov_NW(unit = "id", time = "period", lag = 3)))
#> (Intercept) x1
#> 0.19450009 0.05104156
# -------------------------|
# Driscoll-Kraay (DK) VCOV |
# -------------------------|
# To obtain DK VCOVs, use a formula of the form DK ~ period
se(vcov(est, DK ~ period))
#> (Intercept) x1
#> 0.78953790 0.03611533
# If you want to change the lag:
se(vcov(est, DK(3) ~ period))
#> (Intercept) x1
#> 0.97148590 0.02841491
# Alternative way:
# -> using the vcov_DK function
se(vcov(est, vcov_DK(time = "period", lag = 3)))
#> (Intercept) x1
#> 0.97148590 0.02841491
# -------------------|
# Implicit deduction |
# -------------------|
# When the estimation contains a panel identifier, you don't need
# to re-write it later on
est_panel = feols(y ~ x1, base_did, panel.id = ~id + period)
# Both methods, NW and DK, now work automatically
se(vcov(est_panel, "NW"))
#> (Intercept) x1
#> 0.17411100 0.05269927
se(vcov(est_panel, "DK"))
#> (Intercept) x1
#> 0.78953790 0.03611533
# =================================== #
# VCOVs robust to spatial correlation #
# =================================== #
data(quakes)
est_geo = feols(depth ~ mag, quakes)
# ------------|
# Conley VCOV |
# ------------|
# To obtain a Conley VCOV, use a formula of the form conley(cutoff) ~ lat + lon
# with lat/lon the latitude/longitude variable names in the data set
se(vcov(est_geo, conley(100) ~ lat + long))
#> (Intercept) mag
#> 108.90052 19.23233
# Alternative way:
# -> using the vcov_conley function
se(vcov(est_geo, vcov_conley(lat = "lat", lon = "long", cutoff = 100)))
#> (Intercept) mag
#> 108.90052 19.23233
# -------------------|
# Implicit deduction |
# -------------------|
# By default the latitude and longitude are fetched directly from the data based
# on pattern matching. So you don't have to specify them.
# Further, an automatic cutoff is deduced by default.
# The following works:
se(vcov(est_geo, "conley"))
#> (Intercept) mag
#> 110.67271 20.17456
# ======================== #
# Small Sample Corrections #
# ======================== #
# You can change the way the small sample corrections are done with the argument ssc.
# The argument ssc must be created by the ssc function
se(vcov(est, ssc = ssc(adj = FALSE)))
#> (Intercept) x1
#> 0.14908629 0.05009587
# You can directly add the call to ssc in the vcov formula.
# You need to add it like a variable:
se(vcov(est, iid ~ ssc(adj = FALSE)))
#> (Intercept) x1
#> 0.14908629 0.05009587
se(vcov(est, DK ~ period + ssc(adj = FALSE)))
#> (Intercept) x1
#> 0.78917195 0.03609859