tilt.boot.Rd
This function will run an initial bootstrap with equal resampling probabilities (if required) and will use the output of the initial run to find resampling probabilities which put the value of the statistic at required values. It then runs an importance resampling bootstrap using the calculated probabilities as the resampling distribution.
The data as a vector, matrix or data frame. If it is a matrix or data frame then each row is considered as one (multivariate) observation.
A function which when applied to data returns a vector containing the
statistic(s) of interest. It must take at least two arguments. The first
argument will always be data and the second should be a vector of
indices, weights or frequencies describing the bootstrap sample. Any
other arguments must be supplied to tilt.boot and will be passed
unchanged to statistic each time it is called.
The number of bootstrap replicates required. This will generally be
a vector, the first value stating how many uniform bootstrap
simulations are to be performed at the initial stage. The remaining
values of R are the number of simulations to be performed
resampling from each reweighted distribution. The first value of
R must always be present, a value of 0 implying that no
uniform resampling is to be carried out. Thus length(R)
should always equal 1+length(theta).
A character string indicating the type of bootstrap simulation
required. There are only two possible values that this can take:
"ordinary" and "balanced". If other simulation types are required
for the initial un-weighted bootstrap then it will be necessary to
run boot, calculate the weights appropriately, and run boot again
using the calculated weights.
A character string indicating the type of second argument expected
by statistic. The possible values that stype can take are "i"
(indices), "w" (weights) and "f" (frequencies).
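As a rough illustration of the three forms the second argument can take, the sketch below writes the sample mean three ways. The data x and the functions mean.i, mean.f and mean.w are hypothetical names made up for this example, and the sketch does not call tilt.boot itself.

```r
# Hypothetical data, for illustration only.
x <- c(2.1, 3.4, 1.9, 5.0, 4.2)

# The same statistic (the mean) written for each stype.
mean.i <- function(d, i) mean(d[i])             # stype = "i": indices
mean.f <- function(d, f) sum(d * f) / sum(f)    # stype = "f": frequencies
mean.w <- function(d, w) sum(d * w) / sum(w)    # stype = "w": weights

i <- c(1, 1, 3, 5, 5)                 # one bootstrap sample, as indices
f <- tabulate(i, nbins = length(x))   # the same sample, as frequencies
w <- f / sum(f)                       # the same sample, as weights

# All three descriptions of the sample give the same value.
c(mean.i(x, i), mean.f(x, f), mean.w(x, w))
```

The weighted form is the one used in the gravity example, where the statistic is called with stype = "w".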
An integer vector or factor representing the strata for multi-sample problems.
The empirical influence values for the statistic of interest. They
are used only for exponential tilting when tilt is TRUE. If tilt
is TRUE and they are not supplied then tilt.boot uses empinf to
calculate them.
The required parameter value(s) for the tilted distribution(s).
There should be one value of theta for each of the non-uniform
distributions. If R[1] is 0 theta is a required argument.
Otherwise theta values can be estimated from the initial uniform
bootstrap and the values in alpha.
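To make the idea of tilting to a value theta concrete, here is a minimal sketch of exponential tilting for the sample mean. It is not the package's own implementation: the data x, the target theta and the helper tilted.mean are hypothetical, and the package's exp.tilt may parameterise the tilt differently. The sketch assumes weights proportional to exp(lambda * L), with lambda chosen so that the weighted mean equals theta.

```r
# Hypothetical data and target value, for illustration only.
x <- c(2.1, 3.4, 1.9, 5.0, 4.2)
t0 <- mean(x)       # observed statistic
L <- x - t0         # empirical influence values of the mean
theta <- 4          # value the tilted distribution should be centred on

# Mean of the data under weights proportional to exp(lambda * L).
tilted.mean <- function(lambda) {
    p <- exp(lambda * L)
    p <- p / sum(p)
    t0 + sum(p * L)
}

# Solve for the lambda that moves the weighted mean to theta.
lambda <- uniroot(function(l) tilted.mean(l) - theta,
                  interval = c(-10, 10), tol = 1e-10)$root

# Resampling probabilities of the tilted distribution.
p <- exp(lambda * L)
p <- p / sum(p)
sum(p * x)          # close to theta
```

Resampling with probabilities p then concentrates the bootstrap replicates around theta rather than around t0.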
The alpha level to which tilting is required. This parameter is
ignored if R[1] is 0 or if theta is supplied; otherwise it is used
to find the values of theta as quantiles of the initial uniform
bootstrap. In this case R[1] should be large enough that
min(c(alpha, 1-alpha))*R[1] > 5. If this is not the case then a
warning is generated to the effect that the theta are extreme
values and so the tilted output may be unreliable.
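The two size conditions described for R and alpha can be checked before running anything. The sketch below is only an illustration of those conditions; the helper enough is a hypothetical name and is not part of the package.

```r
R <- c(499, 250, 250)
alpha <- c(0.05, 0.95)

# One tilted distribution per alpha value, plus the initial uniform stage.
stopifnot(length(R) == 1 + length(alpha))

# Condition from the text: min(c(alpha, 1 - alpha)) * R[1] > 5.
enough <- function(alpha, R1) min(c(alpha, 1 - alpha)) * R1 > 5

enough(alpha, R[1])   # 0.05 * 499 = 24.95, so TRUE
enough(alpha, 99)     # 0.05 * 99 = 4.95, so FALSE
```

With only 99 initial replicates the 5% and 95% quantiles rest on fewer than five observations in each tail, which is the situation the warning is meant to flag.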
A logical variable which if TRUE (the default) indicates that
exponential tilting should be used, otherwise local frequency
smoothing (smooth.f) is used. If tilt is FALSE then R[1] must be
positive. In fact in this case the value of R[1] should be fairly
large (in the region of 500 or more).
This argument is used only if tilt is FALSE, in which case it is
passed unchanged to smooth.f as the standardized bandwidth for the
smoothing operation. The value should generally be in the range
(0.2, 1). See smooth.f for more details.
The index of the statistic of interest in the output from
statistic. By default the first element of the output of statistic
is used.
Any additional arguments required by statistic. These are passed
unchanged to statistic each time it is called.
An object of class "boot" with the following components:
The observed value of the statistic on the original data.
The values of the bootstrap replicates of the statistic. There will
be sum(R) of these, the first R[1] corresponding to the uniform
bootstrap and the remainder to the tilted bootstrap(s).
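Because the replicates are stacked in the order given by R, a stage label for each replicate can be rebuilt from R alone. This is a small sketch under that assumption; the name stage is hypothetical and is not a component of the returned object.

```r
R <- c(249, 375, 375)

# 1 marks the uniform stage, 2 and 3 the two tilted stages.
stage <- rep(seq_along(R), times = R)

length(stage)   # equals sum(R)
table(stage)    # counts per stage match R
```

Such a vector could then be used to subset the replicate values by stage, for example to look at the uniform replicates on their own.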
The input vector of the number of bootstrap replicates.
The original data as supplied.
The statistic function as supplied.
The simulation type used in the bootstrap(s); it can be either
"ordinary" or "balanced".
The type of statistic supplied; it is the same as the input value
stype.
A copy of the original call to tilt.boot.
The strata as supplied.
The matrix of weights used. If R[1] is greater than 0 then the
first row will be the uniform weights and each subsequent row the
tilted weights. If R[1] equals 0 then the uniform weights are
omitted and only the tilted weights are output.
The values of theta used for the tilted distributions. These
are either the input values or the values derived from the uniform
bootstrap and alpha.
Booth, J.G., Hall, P. and Wood, A.T.A. (1993) Balanced importance resampling for the bootstrap. Annals of Statistics, 21, 286–298.
Davison, A.C. and Hinkley, D.V. (1997) Bootstrap Methods and Their Application. Cambridge University Press.
Hinkley, D.V. and Shi, S. (1989) Importance sampling and the nested bootstrap. Biometrika, 76, 435–446.
# Note that these examples can take a while to run.
library(boot)  # provides tilt.boot and the gravity and acme data sets
# Example 9.9 of Davison and Hinkley (1997).
grav1 <- gravity[as.numeric(gravity[,2]) >= 7, ]
grav.fun <- function(dat, w, orig) {
    strata <- tapply(dat[, 2], as.numeric(dat[, 2]))
    d <- dat[, 1]
    ns <- tabulate(strata)
    w <- w/tapply(w, strata, sum)[strata]   # renormalize weights within strata
    mns <- as.vector(tapply(d * w, strata, sum)) # drop names
    mn2 <- tapply(d * d * w, strata, sum)
    s2hat <- sum((mn2 - mns^2)/ns)
    c(mns[2]-mns[1],s2hat,(mns[2]-mns[1]-orig)/sqrt(s2hat))
}
grav.z0 <- grav.fun(grav1, rep(1, 26), 0)
tilt.boot(grav1, grav.fun, R = c(249, 375, 375), stype = "w",
          strata = grav1[,2], tilt = TRUE, index = 3, orig = grav.z0[1])
#>
#> TILTED BOOTSTRAP
#>
#> Exponential tilting used
#> First 249 replicates untilted,
#> Next 375 replicates tilted to -3.367,
#> Next 375 replicates tilted to 1.544.
#>
#> Call:
#> tilt.boot(data = grav1, statistic = grav.fun, R = c(249, 375,
#> 375), stype = "w", strata = grav1[, 2], tilt = TRUE, index = 3,
#> orig = grav.z0[1])
#>
#>
#> Bootstrap Statistics :
#>     original       bias    std. error
#> t1* 2.846154 -0.7768538      2.552199
#> t2* 2.392353 -0.4548937      1.174004
#> t3* 0.000000 -1.2672756      2.513175
# Example 9.10 of Davison and Hinkley (1997) requires a balanced
# importance resampling bootstrap to be run. In this example we
# show how this might be run.
acme.fun <- function(data, i, bhat) {
    d <- data[i,]
    n <- nrow(d)
    d.lm <- glm(d$acme~d$market)
    beta.b <- coef(d.lm)[2]
    d.diag <- boot::glm.diag(d.lm)
    SSx <- (n-1)*var(d$market)
    tmp <- (d$market-mean(d$market))*d.diag$res*d.diag$sd
    sr <- sqrt(sum(tmp^2))/SSx
    c(beta.b, sr, (beta.b-bhat)/sr)
}
acme.b <- acme.fun(acme, 1:nrow(acme), 0)
acme.boot1 <- tilt.boot(acme, acme.fun, R = c(499, 250, 250),
                        stype = "i", sim = "balanced", alpha = c(0.05, 0.95),
                        tilt = TRUE, index = 3, bhat = acme.b[1])