subset.model.selection.Rd
Extract a subset of a model selection table.
# S3 method for class 'model.selection'
subset(x, subset, select, recalc.weights = TRUE, recalc.delta = FALSE, ...)
# S3 method for class 'model.selection'
x[i, j, recalc.weights = TRUE, recalc.delta = FALSE, ...]
# S3 method for class 'model.selection'
x[[..., exact = TRUE]]
x: a model.selection object to be subsetted.
subset, select: logical expressions indicating rows (subset) and columns (select) to keep. See subset.
i, j: indices specifying elements to extract.
recalc.weights: logical value specifying whether Akaike weights should be normalized across the new set of models to sum to one.
recalc.delta: logical value specifying whether Δ_IC should be calculated for the new set of models (not done by default).
exact: logical, see [ (Extract).
...: further arguments passed to [.data.frame (drop).
A model.selection object containing only the selected models (rows).
If columns are selected (via argument select or the second index x[, j]) and not all essential columns (i.e. all except "varying" and "extra") are present in the result, a plain data.frame is returned. Similarly, modifying values in the essential columns with [<-, [[<- or $<- produces a regular data frame.
Unlike the method for data.frame, single bracket extraction with only one index x[i] selects rows (models) rather than columns.
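The indexing behaviour described above can be sketched as follows. This is a minimal illustration, assuming the MuMIn package is installed and using the same Cement model as the examples on this page; the object names top3 and coefs are made up for the illustration.

```r
library(MuMIn)

# Global model from this page's examples; the Cement data set ships with MuMIn
fm1 <- lm(y ~ X1 + X2 + X3 + X4, data = Cement, na.action = na.fail)
ms0 <- dredge(fm1)

# A single index selects rows (models), unlike `[.data.frame`
top3 <- ms0[1:3]
nrow(top3)
inherits(top3, "model.selection")

# Selecting only some columns drops essential ones, so the result
# is expected to be a plain data frame rather than a model.selection
coefs <- ms0[, c("X1", "X2")]
class(coefs)
```

If the returned class matters for later steps (for example, before model averaging), checking it with inherits() as above is a cheap safeguard.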
To select rows according to the presence or absence of variables (rather than their values), the pseudo-function has can be used within subset. For example, subset(x, has(a, !b)) selects rows with a and without b (this is equivalent to !is.na(a) & is.na(b)). has can take any number of arguments.
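As a hedged illustration of the equivalence stated above, the two forms can be compared on the Cement model used in the examples on this page. The sketch assumes MuMIn is installed; with_has and with_na are illustrative names only.

```r
library(MuMIn)

fm1 <- lm(y ~ X1 + X2 + X3 + X4, data = Cement, na.action = na.fail)
ms0 <- dredge(fm1)

# Models that contain X1 but not X2, written with has() ...
with_has <- subset(ms0, has(X1, !X2))

# ... and with the equivalent test on missing coefficients
with_na <- subset(ms0, !is.na(X1) & is.na(X2))

# Both calls should pick the same models
identical(rownames(with_has), rownames(with_na))
```

With four candidate terms, fixing X1 as present and X2 as absent leaves X3 and X4 free, so four models are expected in each result.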
Complex model terms need to be enclosed within curly brackets (e.g. {s(a,k=2)}), except within has. Backtick quoting is also possible, but then the name must match exactly (including whitespace) the term name as returned by getAllTerms.
Enclosing a name in I prevents it from being interpreted as a column name.
To select rows where one variable can be present conditional on the presence of other variables, the function dc (dependency chain) can be used.
dc takes any number of variables as arguments, and allows a variable to be included only if all the preceding arguments are also included (e.g. subset = dc(a, b, c) allows for models of the form a, a+b and a+b+c but not b, c, b+c or a+c).
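The dependency-chain rule can be checked without fitting any models. The following base R sketch is not MuMIn's implementation; it only enumerates presence/absence combinations of four hypothetical terms X1 to X4 and applies the logical condition that dc(X1, X2, X3, X4) is described as being equivalent to.

```r
# All 16 presence/absence combinations of four terms
combos <- expand.grid(X1 = c(FALSE, TRUE), X2 = c(FALSE, TRUE),
                      X3 = c(FALSE, TRUE), X4 = c(FALSE, TRUE))

# A term is allowed only if the term preceding it in the chain is present
keep <- with(combos, (!X2 | X1) & (!X3 | X2) & (!X4 | X3))
allowed <- combos[keep, ]

# Label each surviving combination by the terms it contains
# (the empty string stands for the intercept-only model)
labels <- apply(allowed, 1, function(r) paste(names(r)[r], collapse = "+"))
unname(labels)
```

Only the nested sequence of models survives: the intercept-only model, X1, X1+X2, X1+X2+X3 and X1+X2+X3+X4.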
dredge, subset and [.data.frame for subsetting and extracting from data.frames.
fm1 <- lm(formula = y ~ X1 + X2 + X3 + X4, data = Cement, na.action = na.fail)
# generate models where each variable is included only if the previous
# are included too, e.g. X2 only if X1 is there, and X3 only if X2 and X1
dredge(fm1, subset = dc(X1, X2, X3, X4))
#> Fixed term is "(Intercept)"
#> Global model call: lm(formula = y ~ X1 + X2 + X3 + X4, data = Cement, na.action = na.fail)
#> ---
#> Model selection table
#>    (Intrc)    X1     X2     X3      X4 df  logLik  AICc delta weight
#> 4    52.58 1.468 0.6623                 4 -28.156  69.3  0.00  0.826
#> 8    48.19 1.696 0.6569 0.2500          5 -26.952  72.5  3.16  0.170
#> 16   62.41 1.551 0.5102 0.1019 -0.1441  6 -26.918  79.8 10.52  0.004
#> 2    81.48 1.869                        3 -48.206 105.1 35.77  0.000
#> 1    95.42                              2 -53.168 111.5 42.22  0.000
#> Models ranked by AICc(x)
# which is equivalent to
# dredge(fm1, subset = (!X2 | X1) & (!X3 | X2) & (!X4 | X3))
# alternatively, generate "all possible" combinations
ms0 <- dredge(fm1)
#> Fixed term is "(Intercept)"
# ...and afterwards select the subset of models
subset(ms0, dc(X1, X2, X3, X4))
#> Global model call: lm(formula = y ~ X1 + X2 + X3 + X4, data = Cement, na.action = na.fail)
#> ---
#> Model selection table
#>    (Intrc)    X1     X2     X3      X4 df  logLik  AICc delta weight
#> 4    52.58 1.468 0.6623                 4 -28.156  69.3  0.00  0.826
#> 8    48.19 1.696 0.6569 0.2500          5 -26.952  72.5  3.16  0.170
#> 16   62.41 1.551 0.5102 0.1019 -0.1441  6 -26.918  79.8 10.52  0.004
#> 2    81.48 1.869                        3 -48.206 105.1 35.77  0.000
#> 1    95.42                              2 -53.168 111.5 42.22  0.000
#> Models ranked by AICc(x)
# which is equivalent to
# subset(ms0, (has(!X2) | has(X1)) & (has(!X3) | has(X2)) & (has(!X4) | has(X3)))
# Different ways of finding a confidence set of models:
# delta(AIC) cutoff
subset(ms0, delta <= 4, recalc.weights = FALSE)
#> Global model call: lm(formula = y ~ X1 + X2 + X3 + X4, data = Cement, na.action = na.fail)
#> ---
#> Model selection table
#>    (Intrc)    X1     X2    X3      X4 df  logLik AICc delta weight
#> 4    52.58 1.468 0.6623                4 -28.156 69.3  0.00  0.566
#> 12   71.65 1.452 0.4161       -0.2365  5 -26.933 72.4  3.13  0.119
#> 8    48.19 1.696 0.6569  0.25          5 -26.952 72.5  3.16  0.116
#> 10  103.10 1.440              -0.6140  4 -29.817 72.6  3.32  0.107
#> 14  111.70 1.052        -0.41 -0.6428  5 -27.310 73.2  3.88  0.081
#> Models ranked by AICc(x)
# cumulative sum of Akaike weights
subset(ms0, cumsum(weight) <= .95, recalc.weights = FALSE)
#> Global model call: lm(formula = y ~ X1 + X2 + X3 + X4, data = Cement, na.action = na.fail)
#> ---
#> Model selection table
#>    (Intrc)    X1     X2    X3      X4 df  logLik AICc delta weight
#> 4    52.58 1.468 0.6623                4 -28.156 69.3  0.00  0.566
#> 12   71.65 1.452 0.4161       -0.2365  5 -26.933 72.4  3.13  0.119
#> 8    48.19 1.696 0.6569  0.25          5 -26.952 72.5  3.16  0.116
#> 10  103.10 1.440              -0.6140  4 -29.817 72.6  3.32  0.107
#> Models ranked by AICc(x)
# relative likelihood
subset(ms0, (weight / weight[1]) > (1/8), recalc.weights = FALSE)
#> Global model call: lm(formula = y ~ X1 + X2 + X3 + X4, data = Cement, na.action = na.fail)
#> ---
#> Model selection table
#>    (Intrc)    X1     X2    X3      X4 df  logLik AICc delta weight
#> 4    52.58 1.468 0.6623                4 -28.156 69.3  0.00  0.566
#> 12   71.65 1.452 0.4161       -0.2365  5 -26.933 72.4  3.13  0.119
#> 8    48.19 1.696 0.6569  0.25          5 -26.952 72.5  3.16  0.116
#> 10  103.10 1.440              -0.6140  4 -29.817 72.6  3.32  0.107
#> 14  111.70 1.052        -0.41 -0.6428  5 -27.310 73.2  3.88  0.081
#> Models ranked by AICc(x)
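
The confidence-set examples above all pass recalc.weights = FALSE, so the weights shown still refer to the full set of 16 models. The following sketch, which assumes MuMIn is installed and uses the illustrative names kept and renorm, contrasts that with the default behaviour of renormalizing the weights over the retained models.

```r
library(MuMIn)

fm1 <- lm(y ~ X1 + X2 + X3 + X4, data = Cement, na.action = na.fail)
ms0 <- dredge(fm1)

# Keep the original weights, computed over all candidate models
kept <- subset(ms0, delta <= 4, recalc.weights = FALSE)

# Default: weights are rescaled to sum to one within the subset
renorm <- subset(ms0, delta <= 4, recalc.weights = TRUE)

sum(as.numeric(kept$weight))    # less than 1
sum(as.numeric(renorm$weight))  # 1, up to floating-point error
```

Which variant is appropriate depends on whether the weights are meant to describe the full candidate set or only the retained models.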