stackingWeights.Rd
Compute model weights based on a cross-validation-like procedure.
stackingWeights(object, ..., data, R, p = 0.5)
A matrix with two rows, containing model weights calculated using mean and median.
Each model in a set is fitted to the training data: a subset of p * N
observations in data. From these models a prediction is produced on the
remaining part of data (the test or hold-out data). These hold-out predictions
are fitted to the hold-out observations, by optimising the weights by which the
models are combined. This process is repeated R times, yielding a distribution
of weights for each model (which Smyth & Wolpert (1998) referred to as an
‘empirical Bayesian estimate of posterior model probability’). A mean or median
of model weights for each model is taken and re-scaled to sum to one.
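The procedure described above can be sketched in base R. This is only an illustration of the general idea, not the code used by stackingWeights: the candidate formulas, the built-in mtcars data, the squared-error criterion, the softmax parametrisation of the weights and all helper names (forms, softmax, rescale, wmat) are assumptions made for this sketch.

```r
# Illustrative sketch only: base R, not the internal code of stackingWeights().
set.seed(1)

# Hypothetical candidate models on the built-in 'mtcars' data.
forms <- list(mpg ~ wt, mpg ~ wt + hp, mpg ~ wt + hp + qsec)
data <- mtcars
N <- nrow(data)
M <- length(forms)
R <- 20    # number of random splits
p <- 0.5   # proportion of observations used for training

# Map unconstrained parameters to non-negative weights that sum to one.
softmax <- function(a) {
  e <- exp(a - max(a))
  e / sum(e)
}

wmat <- matrix(NA_real_, nrow = R, ncol = M)
for (r in seq_len(R)) {
  train <- sample.int(N, size = round(p * N))
  test_data <- data[-train, , drop = FALSE]
  # Fit every model to the training part, then predict the hold-out part.
  pred <- sapply(forms, function(f)
    predict(lm(f, data = data[train, , drop = FALSE]), newdata = test_data))
  y <- test_data$mpg
  # Assumed criterion: squared error of the weighted combination of predictions.
  loss <- function(a) sum((y - pred %*% softmax(a))^2)
  fit <- optim(rep(0, M), loss, method = "BFGS")
  wmat[r, ] <- softmax(fit$par)
}

# Summarise the R weight vectors by mean and median, re-scaled to sum to one.
rescale <- function(w) w / sum(w)
wts <- rbind(mean = rescale(colMeans(wmat)),
             median = rescale(apply(wmat, 2, median)))
wts
```

Each row of wmat holds the weights from one split, so the columns give the per-model distribution of weights. The final matrix has the same layout as the value described above, with one row for the mean and one for the median.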
This approach requires a sample size of at least \(2\times\) the number of models.
Wolpert, D. H. 1992 Stacked generalization. Neural Networks 5, 241–259.
Smyth, P. and Wolpert, D. 1998 An Evaluation of Linearly Combining Density Estimators via Stacking. Technical Report No. 98–25. Information and Computer Science Department, University of California, Irvine, CA.
Dormann, C. et al. 2018 Model averaging in ecology: a review of Bayesian, information-theoretic, and tactical approaches for predictive inference. Ecological Monographs 88, 485–504.
Weights, model.avg
Other model weights: BGWeights(), bootWeights(), cos2Weights(), jackknifeWeights()
library(MuMIn)
# simulated Cement dataset to increase sample size for the training data
fm0 <- glm(y ~ X1 + X2 + X3 + X4, data = Cement, na.action = na.fail)
dat <- as.data.frame(apply(Cement[, -1], 2, sample, 50, replace = TRUE))
# simulate the response from the resampled predictors (hence newdata = dat)
dat$y <- rnorm(nrow(dat), predict(fm0, newdata = dat), sigma(fm0))
# global model fitted to training data:
fm <- glm(y ~ X1 + X2 + X3 + X4, data = dat, na.action = na.fail)
# generate a list of *some* subsets of the global model
models <- lapply(dredge(fm, evaluate = FALSE, fixed = "X1", m.lim = c(1, 3)), eval)
#> Fixed terms are "X1" and "(Intercept)"
wts <- stackingWeights(models, data = dat, R = 10)
ma <- model.avg(models)
Weights(ma) <- wts["mean", ]
predict(ma)