Add sampling weights to a matchit object — add

Adds sampling weights to a matchit object so that they are incorporated into balance assessment and creation of the weights. This would typically only be used when an argument to s.weights was not supplied to matchit() (i.e., because they were not to be included in the estimation of the propensity score) but sampling weights are required for generalizing an effect to the correct population. Without adding sampling weights to the matchit object, balance assessment tools (i.e., summary.matchit() and plot.matchit()) will not calculate balance statistics correctly, and the weights produced by match_data() and get_matches() will not incorporate the sampling weights.

add_s.weights(m, s.weights = NULL, data = NULL)

Arguments

m: a matchit object; the output of a call to matchit(), typically with the s.weights argument unspecified.
s.weights: a numeric vector of sampling weights to be added to the matchit object. Can also be specified as a string containing the name of a variable in data to be used or a one-sided formula with the variable on the right-hand side (e.g., ~ SW).
data: a data frame containing the sampling weights if given as a string or formula. If unspecified, add_s.weights() will attempt to find the dataset using the environment of the matchit object.
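The three ways of supplying s.weights described above can be sketched as follows. This is only an illustration: the SW column is a hypothetical variable created here from random draws, and the covariate set is shortened for brevity.

```r
library("MatchIt")
data("lalonde", package = "MatchIt")

# Hypothetical sampling-weight column, generated only for illustration
set.seed(123)
lalonde$SW <- rchisq(nrow(lalonde), 2)

# Match without supplying s.weights to matchit()
m <- matchit(treat ~ age + educ + married + re74, data = lalonde)

# 1) As a numeric vector
m_vec <- add_s.weights(m, s.weights = lalonde$SW)

# 2) As a string naming a variable in data
m_chr <- add_s.weights(m, s.weights = "SW", data = lalonde)

# 3) As a one-sided formula with the variable on the right-hand side
m_fml <- add_s.weights(m, s.weights = ~ SW, data = lalonde)
```

Passing data explicitly, as done here, avoids relying on add_s.weights() locating the dataset from the environment of the matchit object.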

Value

a matchit object with an s.weights component containing the supplied sampling weights. If s.weights = NULL, the original matchit object is returned.
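As a rough sketch of what the returned object changes downstream, the code below compares the weights from match_data() before and after adding sampling weights. The sampling weights are random draws used purely for illustration, and the object names (m_same, m_sw, md_plain, md_sw) are made up for this example.

```r
library("MatchIt")
data("lalonde", package = "MatchIt")

# Illustrative sampling weights only; real ones would come from the survey design
set.seed(2024)
sw <- rchisq(nrow(lalonde), 2)

m <- matchit(treat ~ age + educ + married + re74, data = lalonde)

# With s.weights = NULL (the default), no s.weights component is added
m_same <- add_s.weights(m)
is.null(m_same$s.weights)

# With sampling weights supplied, they are stored in the s.weights component
m_sw <- add_s.weights(m, s.weights = sw)

# The same units are matched in both objects, but the weights returned
# by match_data() differ once sampling weights are present
md_plain <- match_data(m)
md_sw <- match_data(m_sw)

summary(md_plain$weights)
summary(md_sw$weights)
```

The md_sw weights are the ones to use when estimating an effect that should generalize to the population the sampling weights represent.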

Author

Noah Greifer

Examples


library("MatchIt")
data("lalonde")

# Generate random sampling weights, just
# for this example
sw <- rchisq(nrow(lalonde), 2)

# NN PS match using logistic regression PS that doesn't
# include sampling weights
m.out <- matchit(treat ~ age + educ + race + nodegree +
                   married + re74 + re75,
                 data = lalonde)

m.out
#> A `matchit` object
#>  - method: 1:1 nearest neighbor matching without replacement
#>  - distance: Propensity score
#>              - estimated with logistic regression
#>  - number of obs.: 614 (original), 370 (matched)
#>  - target estimand: ATT
#>  - covariates: age, educ, race, nodegree, married, re74, re75

# Add s.weights to the matchit object
m.out <- add_s.weights(m.out, sw)

m.out # note the additional output
#> A `matchit` object
#>  - method: 1:1 nearest neighbor matching without replacement
#>  - distance: Propensity score
#>              - estimated with logistic regression
#>              - sampling weights not included in estimation
#>  - number of obs.: 614 (original), 370 (matched)
#>  - sampling weights: present
#>  - target estimand: ATT
#>  - covariates: age, educ, race, nodegree, married, re74, re75

# Check balance; note that sample sizes incorporate
# s.weights
summary(m.out, improvement = FALSE)
#> 
#> Call:
#> matchit(formula = treat ~ age + educ + race + nodegree + married + 
#>     re74 + re75, data = lalonde)
#> 
#> Summary of Balance for All Data:
#>            Means Treated Means Control Std. Mean Diff. Var. Ratio eCDF Mean
#> distance          0.6097        0.1809          2.2039     0.7364    0.3949
#> age              26.1712       27.3490         -0.1700     0.4302    0.0850
#> educ             10.2988       10.2206          0.0406     0.4377    0.0403
#> raceblack         0.8678        0.1997          1.9727          .    0.6681
#> racehispan        0.0776        0.1608         -0.3111          .    0.0832
#> racewhite         0.0546        0.6395         -2.5748          .    0.5849
#> nodegree          0.7447        0.5954          0.3424          .    0.1493
#> married           0.1754        0.4994         -0.8521          .    0.3240
#> re74           1551.7851     5500.2298         -1.0683     0.2540    0.2254
#> re75           1501.2411     2542.2920         -0.3010     0.9167    0.1403
#>            eCDF Max
#> distance     0.7142
#> age          0.1807
#> educ         0.1493
#> raceblack    0.6681
#> racehispan   0.0832
#> racewhite    0.5849
#> nodegree     0.1493
#> married      0.3240
#> re74         0.4446
#> re75         0.3190
#> 
#> Summary of Balance for Matched Data:
#>            Means Treated Means Control Std. Mean Diff. Var. Ratio eCDF Mean
#> distance          0.6097        0.3622          1.2723     0.5913    0.1556
#> age              26.1712       24.5947          0.2275     0.4649    0.0967
#> educ             10.2988       10.1863          0.0583     0.4708    0.0357
#> raceblack         0.8678        0.4726          1.1670          .    0.3952
#> racehispan        0.0776        0.2300         -0.5698          .    0.1524
#> racewhite         0.0546        0.2974         -1.0688          .    0.2428
#> nodegree          0.7447        0.6655          0.1815          .    0.0791
#> married           0.1754        0.1723          0.0081          .    0.0031
#> re74           1551.7851     2256.6691         -0.1907     0.8279    0.0618
#> re75           1501.2411     1659.6022         -0.0458     1.5703    0.0526
#>            eCDF Max Std. Pair Dist.
#> distance     0.4794          1.1028
#> age          0.3654          1.4392
#> educ         0.0863          1.3007
#> raceblack    0.3952          1.1013
#> racehispan   0.1524          0.9497
#> racewhite    0.2428          1.0946
#> nodegree     0.0791          1.0537
#> married      0.0031          0.8529
#> re74         0.2884          1.0531
#> re75         0.2377          0.6871
#> 
#> Sample Sizes:
#>               Control Treated
#> All (ESS)      214.81    94.1
#> All            429.     185. 
#> Matched (ESS)  100.63    94.1
#> Matched        185.     185. 
#> Unmatched      244.       0. 
#> Discarded        0.       0. 
#>
