drop1.dr.Rd
This function implements backward elimination using a dr object for which a dr.coordinate.test is defined, currently for SIR, SAVE, IRE and PIRE.
dr.step(object, scope = NULL, d = NULL, minsize = 2, stop = 0, trace = 1, ...)

# S3 method for class 'dr'
drop1(object, scope = NULL, update = TRUE, test = "general", trace = 1, ...)
object: A dr object for which dr.coordinate.test is defined, for method equal to one of sir, save or ire.
scope: A one-sided formula specifying predictors that will never be removed.
d: To use conditional coordinate tests, specify the dimension of the central (mean) subspace. The default is NULL, meaning no conditioning. This is currently available only for method sir, for save without categorical predictors, or for ire with or without categorical predictors.
minsize: Minimum subset size; must be greater than or equal to 2.
stop: Stopping criterion: variables are removed until the p-value for the next variable to be removed is less than stop. The default is stop = 0.
update: If TRUE, the update method is used to return a dr object obtained from object by updating the formula to drop the variable with the largest p-value. This can significantly slow the computations for IRE but has little effect on SAVE and SIR.
test: Type of test to be used for selecting the next predictor to remove, for method = "save" only. "normal" assumes normally distributed predictors; "general" assumes elliptically contoured predictors (see the sketch after this argument list). For other methods, this argument is ignored.
trace: If positive (the default), print informative output at each step. If trace is 0 or FALSE, suppress all printing.
...: Additional arguments passed to dr.coordinate.test.
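For method = "save", the choice of test can change the p-values. A minimal sketch, assuming the ais data used in the examples below (the object m1 is hypothetical):

data(ais)
m1 <- dr(LBM ~ log(SSF) + log(Wt) + log(Hg) + log(Ht), data = ais,
         method = "save", nslices = 8)
drop1(m1, test = "normal")   # assumes normally distributed predictors
drop1(m1, test = "general")  # assumes only elliptically contoured predictors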
Suppose a dr object has \(p = a + b\) predictors, with \(a\) predictors specified in the scope statement. drop1 will compute either marginal coordinate tests (if d = NULL) or conditional marginal coordinate tests (if d is positive) for dropping each of the \(b\) predictors not in the scope, and return the p-values. The result is an object created from the original object with the predictor with the largest p-value removed. dr.step will call drop1.dr repeatedly until \(\max(a, d+1)\) predictors remain.

As a side effect, a data frame of labels, tests, df, and p-values is printed. If update = TRUE, a dr object is returned with the predictor with the largest p-value removed.
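For example, a minimal sketch (the fitted object m and the predictors x1 and x2 are hypothetical):

m2 <- drop1(m, d = 2)   # conditional coordinate tests; returns m refit
                        # without the predictor with the largest p-value
m3 <- dr.step(m, scope = ~ x1 + x2, stop = 0.05)
                        # never remove x1 or x2; stop once the largest
                        # remaining p-value falls below 0.05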
Cook, R. D. (2004). Testing predictor contributions in sufficient dimension reduction. Annals of Statistics, 32, 1062-1092.

Shao, Y., Cook, R. D. and Weisberg, S. (2007). Marginal tests with sliced average variance estimation. Biometrika, 94, 285-296.
data(ais)
# To make this identical to ARC, need to modify slices to match by
# using slice.info=dr.slices.arc() rather than nslices=8
summary(s1 <- dr(LBM~log(SSF)+log(Wt)+log(Hg)+log(Ht)+log(WCC)+log(RCC)+
log(Hc)+log(Ferr), data=ais,method="sir",
slice.method=dr.slices.arc,nslices=8))
#>
#> Call:
#> dr(formula = LBM ~ log(SSF) + log(Wt) + log(Hg) + log(Ht) + log(WCC) +
#> log(RCC) + log(Hc) + log(Ferr), data = ais, method = "sir",
#> slice.method = dr.slices.arc, nslices = 8)
#>
#> Method:
#> sir with 8 slices, n = 202.
#>
#> Slice Sizes:
#> 25 25 25 25 27 27 30 18
#>
#> Estimated Basis Vectors for Central Subspace:
#> Dir1 Dir2 Dir3 Dir4
#> log(SSF) 0.155356 0.045363 -0.08080 0.007174
#> log(Wt) -0.969123 0.006309 0.28789 0.249082
#> log(Hg) -0.157412 -0.456823 -0.00915 -0.045435
#> log(Ht) -0.054094 0.315217 -0.68876 -0.542777
#> log(WCC) 0.005472 0.007850 -0.01038 -0.061888
#> log(RCC) -0.006035 -0.419167 0.08569 0.566282
#> log(Hc) 0.094247 0.716934 -0.65463 -0.555732
#> log(Ferr) -0.003480 0.009819 0.01067 -0.088837
#>
#> Dir1 Dir2 Dir3 Dir4
#> Eigenvalues 0.9391 0.2220 0.09066 0.06427
#> R^2(OLS|dr) 0.9991 0.9991 0.99925 0.99926
#>
#> Large-sample Marginal Dimension Tests:
#> Stat df p.value
#> 0D vs >= 1D 269.35 56 0.0000000
#> 1D vs >= 2D 79.66 42 0.0004021
#> 2D vs >= 3D 34.82 30 0.2492051
#> 3D vs >= 4D 16.51 20 0.6847223
# The following will almost duplicate information in Table 5 of Cook (2004).
# Slight differences occur because a different approximation for the
# sum of independent chi-square(1) random variables is used:
ans1 <- drop1(s1)
#>
#> LBM ~ log(SSF) + log(Wt) + log(Hg) + log(Ht) + log(WCC) +
#> log(RCC) + log(Hc) + log(Ferr)
#> Statistic P.value
#> - log(WCC) 2.388834 8.674682e-01
#> - log(Hg) 5.202510 4.813751e-01
#> - log(Ht) 8.077548 1.994139e-01
#> - log(RCC) 9.770283 1.097139e-01
#> - log(Hc) 10.039536 9.936986e-02
#> - log(Ferr) 10.863385 7.296643e-02
#> - log(SSF) 25.322296 1.435986e-04
#> - log(Wt) 42.322135 4.193156e-08
ans2 <- drop1(s1,d=2)
#>
#> LBM ~ log(SSF) + log(Wt) + log(Hg) + log(Ht) + log(WCC) +
#> log(RCC) + log(Hc) + log(Ferr)
#> Statistic P.value
#> - log(WCC) 0.1084395 7.561248e-01
#> - log(Ferr) 0.7359519 3.635322e-01
#> - log(Ht) 1.9821144 1.198231e-01
#> - log(Hg) 4.4894002 1.647214e-02
#> - log(Hc) 6.3180635 4.150525e-03
#> - log(RCC) 6.8164660 2.865534e-03
#> - log(SSF) 15.6856307 4.656384e-06
#> - log(Wt) 31.7148776 5.643419e-11
ans3 <- drop1(s1,d=3)
#>
#> LBM ~ log(SSF) + log(Wt) + log(Hg) + log(Ht) + log(WCC) +
#> log(RCC) + log(Hc) + log(Ferr)
#> Statistic P.value
#> - log(WCC) 0.1638879 9.205514e-01
#> - log(Ferr) 0.8845944 6.144432e-01
#> - log(Hg) 4.4901630 7.304205e-02
#> - log(Ht) 5.1053927 5.054880e-02
#> - log(RCC) 6.9233683 1.698072e-02
#> - log(Hc) 8.6677890 5.943620e-03
#> - log(SSF) 20.9863952 3.462158e-06
#> - log(Wt) 35.3792122 5.595753e-10
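# Because update = TRUE by default, each of ans1, ans2 and ans3 is a dr
# object refit without the predictor with the largest p-value (here
# log(WCC)), so elimination can be continued by hand, e.g. drop1(ans1)
# (output not shown).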
# Remove predictors stepwise until we run out of variables to drop.
dr.step(s1,scope=~log(Wt)+log(Ht))
#>
#> LBM ~ log(SSF) + log(Wt) + log(Hg) + log(Ht) + log(WCC) +
#> log(RCC) + log(Hc) + log(Ferr)
#> Statistic P.value
#> - log(WCC) 2.388834 0.8674682270
#> - log(Hg) 5.202510 0.4813750749
#> - log(RCC) 9.770283 0.1097139021
#> - log(Hc) 10.039536 0.0993698643
#> - log(Ferr) 10.863385 0.0729664252
#> - log(SSF) 25.322296 0.0001435986
#>
#> LBM ~ log(SSF) + log(Wt) + log(Hg) + log(Ht) + log(RCC) +
#> log(Hc) + log(Ferr)
#> Statistic P.value
#> - log(Hg) 5.280848 4.727389e-01
#> - log(RCC) 9.680354 1.142247e-01
#> - log(Hc) 10.467980 8.542042e-02
#> - log(Ferr) 10.965572 7.080631e-02
#> - log(SSF) 27.341251 5.775252e-05
#>
#> LBM ~ log(SSF) + log(Wt) + log(Ht) + log(RCC) + log(Hc) +
#> log(Ferr)
#> Statistic P.value
#> - log(Hc) 9.498782 7.117135e-02
#> - log(Ferr) 10.728497 4.278951e-02
#> - log(RCC) 10.814724 4.126625e-02
#> - log(SSF) 31.988544 1.956933e-06
#>
#> LBM ~ log(SSF) + log(Wt) + log(Ht) + log(RCC) + log(Ferr)
#> Statistic P.value
#> - log(Ferr) 10.62658 2.198143e-02
#> - log(RCC) 19.05848 3.750310e-04
#> - log(SSF) 28.72390 2.821080e-06
#>
#> LBM ~ log(SSF) + log(Wt) + log(Ht) + log(RCC)
#> Statistic P.value
#> - log(RCC) 18.45772 1.542279e-04
#> - log(SSF) 30.14733 3.038586e-07
#>
#> LBM ~ log(SSF) + log(Wt) + log(Ht)
#> Statistic P.value
#> - log(SSF) 50.17375 8.198997e-13
#>
#> No more variables to remove
#>
#> dr(formula = LBM ~ log(Wt) + log(Ht), data = ais, method = "sir",
#> slice.method = dr.slices.arc, nslices = 8)
#> Estimated Basis Vectors for Central Subspace:
#> Dir1 Dir2
#> log(Wt) 0.7067240 -0.2467316
#> log(Ht) 0.7074893 0.9690838
#> Eigenvalues:
#> [1] 0.84018385 0.01589101
#>
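# A hedged sketch (not run): backward elimination driven by the stop
# criterion rather than a scope, halting once every remaining
# predictor's p-value falls below 0.05:
# s2 <- dr.step(s1, stop = 0.05)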