Subsetting model selection table

Extract a subset of a model selection table.

# S3 method for class 'model.selection'
subset(x, subset, select, recalc.weights = TRUE, recalc.delta = FALSE, ...)
# S3 method for class 'model.selection'
x[i, j, recalc.weights = TRUE, recalc.delta = FALSE, ...]
# S3 method for class 'model.selection'
x[[..., exact = TRUE]]

Arguments

x: a model.selection object to be subsetted.
subset,select: expressions indicating the rows (subset, a logical expression) and columns (select) to keep. See subset.
i,j: indices specifying elements to extract.
recalc.weights: logical value specifying whether Akaike weights should be normalized across the new set of models to sum to one.
recalc.delta: logical value specifying whether Δ_IC should be recalculated for the new set of models (not done by default).
exact: logical, see Extract.
...: further arguments passed to [.data.frame (drop).
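
The effect of recalc.weights can be reproduced by hand. Akaike weights are proportional to exp(-Δ/2), so renormalizing over a subset needs only the Δ values of the retained models. The sketch below is base R and is not MuMIn's internal code; akaike_weights is a hypothetical helper name, and the delta values are copied from the dc(X1, X2, X3, X4) table in the Examples.

```r
# Hypothetical helper: Akaike weights from information-criterion differences
akaike_weights <- function(delta) {
  rel_lik <- exp(-delta / 2)   # relative likelihood of each model
  rel_lik / sum(rel_lik)       # normalize so the weights sum to one
}

# delta values of the five models retained by dc(X1, X2, X3, X4)
delta <- c(0.00, 3.16, 10.52, 35.77, 42.22)
w <- akaike_weights(delta)
round(w, 3)

# shifting every delta by a constant leaves the weights unchanged
all.equal(w, akaike_weights(delta + 3))
```

The rounded result reproduces the weight column of that table (0.826, 0.170, 0.004, 0.000, 0.000). Because the weights depend only on differences in Δ, they come out the same whether or not recalc.delta is applied.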

Value

A model.selection object containing only the selected models (rows). If columns are selected (via argument select or the second index x[, j]) and not all essential columns (i.e. all except "varying" and "extra") are present in the result, a plain data.frame is returned. Similarly, modifying values in the essential columns with [<-, [[<- or $<- produces a regular data frame.

Details

Unlike the method for data.frame, single bracket extraction with only one index x[i] selects rows (models) rather than columns.

To select rows according to presence or absence of the variables (rather than their value), a pseudo-function has may be used with subset, e.g. subset(x, has(a, !b)) will select rows with a and without b (this is equivalent to !is.na(a) & is.na(b)). has can take any number of arguments.
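
The stated equivalence can be illustrated without MuMIn. The sketch below uses a toy data frame as a stand-in for a model selection table, assuming that a missing term is stored as NA in its coefficient column; the data and the names tab and keep are made up for illustration.

```r
# Toy stand-in for a model selection table (hypothetical values):
# NA marks a term that is absent from the model in that row.
tab <- data.frame(
  a = c(1.2, NA, 0.8, NA),
  b = c(NA, 0.5, 0.3, NA)
)

# has(a, !b) corresponds to: a present AND b absent
keep <- !is.na(tab$a) & is.na(tab$b)
tab[keep, , drop = FALSE]
```

Only the first row is kept: the second lacks a, the third contains b, and the fourth contains neither term.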

Complex model terms need to be enclosed within curly brackets (e.g. {s(a,k=2)}), except within has. Backtick quoting is also possible, but then the name must match the term name exactly (including whitespace) as returned by getAllTerms.

Enclosing a name in I prevents it from being interpreted as a column name.

To select rows where one variable can be present conditional on the presence of other variables, the function dc (dependency chain) can be used. dc takes any number of variables as arguments, and allows a variable to be included only if all the preceding arguments are also included (e.g. subset = dc(a, b, c) allows for models of form a, a+b and a+b+c but not b, c, b+c or a+c).
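
The dependency-chain rule can be checked by enumerating every combination of three terms. This base R sketch does not call dc itself; it applies the equivalent logical condition, (!b | a) & (!c | b), to a hypothetical table of presence/absence flags named combos.

```r
# All 8 presence/absence combinations of the terms a, b and c
combos <- expand.grid(a = c(FALSE, TRUE),
                      b = c(FALSE, TRUE),
                      c = c(FALSE, TRUE))

# b is allowed only with a, and c only with b (hence also with a)
allowed <- with(combos, (!b | a) & (!c | b))
combos[allowed, ]
```

Four combinations remain: the empty set, a, a+b and a+b+c. The combination with no terms also passes, which is consistent with the intercept-only model appearing in the dredge output in the Examples.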

Author

Kamil Bartoń

Examples

library(MuMIn)  # provides dredge, dc, has and the Cement data set

fm1 <- lm(formula = y ~ X1 + X2 + X3 + X4, data = Cement, na.action = na.fail)

# generate models where each variable is included only if the previous
# are included too, e.g. X2 only if X1 is there, and X3 only if X2 and X1
dredge(fm1, subset = dc(X1, X2, X3, X4))
#> Fixed term is "(Intercept)"
#> Global model call: lm(formula = y ~ X1 + X2 + X3 + X4, data = Cement, na.action = na.fail)
#> ---
#> Model selection table 
#>    (Intrc)    X1     X2     X3      X4 df  logLik  AICc delta weight
#> 4    52.58 1.468 0.6623                 4 -28.156  69.3  0.00  0.826
#> 8    48.19 1.696 0.6569 0.2500          5 -26.952  72.5  3.16  0.170
#> 16   62.41 1.551 0.5102 0.1019 -0.1441  6 -26.918  79.8 10.52  0.004
#> 2    81.48 1.869                        3 -48.206 105.1 35.77  0.000
#> 1    95.42                              2 -53.168 111.5 42.22  0.000
#> Models ranked by AICc(x) 

# which is equivalent to
# dredge(fm1, subset = (!X2 | X1) & (!X3 | X2) & (!X4 | X3))

# alternatively, generate "all possible" combinations
ms0 <- dredge(fm1)
#> Fixed term is "(Intercept)"
# ...and afterwards select the subset of models
subset(ms0, dc(X1, X2, X3, X4))
#> Global model call: lm(formula = y ~ X1 + X2 + X3 + X4, data = Cement, na.action = na.fail)
#> ---
#> Model selection table 
#>    (Intrc)    X1     X2     X3      X4 df  logLik  AICc delta weight
#> 4    52.58 1.468 0.6623                 4 -28.156  69.3  0.00  0.826
#> 8    48.19 1.696 0.6569 0.2500          5 -26.952  72.5  3.16  0.170
#> 16   62.41 1.551 0.5102 0.1019 -0.1441  6 -26.918  79.8 10.52  0.004
#> 2    81.48 1.869                        3 -48.206 105.1 35.77  0.000
#> 1    95.42                              2 -53.168 111.5 42.22  0.000
#> Models ranked by AICc(x) 
# which is equivalent to
# subset(ms0, (has(!X2) | has(X1)) & (has(!X3) | has(X2)) & (has(!X4) | has(X3)))

# Different ways of finding a confidence set of models:
# delta(AIC) cutoff
subset(ms0, delta <= 4, recalc.weights = FALSE)
#> Global model call: lm(formula = y ~ X1 + X2 + X3 + X4, data = Cement, na.action = na.fail)
#> ---
#> Model selection table 
#>    (Intrc)    X1     X2    X3      X4 df  logLik AICc delta weight
#> 4    52.58 1.468 0.6623                4 -28.156 69.3  0.00  0.566
#> 12   71.65 1.452 0.4161       -0.2365  5 -26.933 72.4  3.13  0.119
#> 8    48.19 1.696 0.6569  0.25          5 -26.952 72.5  3.16  0.116
#> 10  103.10 1.440              -0.6140  4 -29.817 72.6  3.32  0.107
#> 14  111.70 1.052        -0.41 -0.6428  5 -27.310 73.2  3.88  0.081
#> Models ranked by AICc(x) 
# cumulative sum of Akaike weights
subset(ms0, cumsum(weight) <= .95, recalc.weights = FALSE)
#> Global model call: lm(formula = y ~ X1 + X2 + X3 + X4, data = Cement, na.action = na.fail)
#> ---
#> Model selection table 
#>    (Intrc)    X1     X2   X3      X4 df  logLik AICc delta weight
#> 4    52.58 1.468 0.6623               4 -28.156 69.3  0.00  0.566
#> 12   71.65 1.452 0.4161      -0.2365  5 -26.933 72.4  3.13  0.119
#> 8    48.19 1.696 0.6569 0.25          5 -26.952 72.5  3.16  0.116
#> 10  103.10 1.440             -0.6140  4 -29.817 72.6  3.32  0.107
#> Models ranked by AICc(x) 
# relative likelihood
subset(ms0, (weight / weight[1]) > (1/8), recalc.weights = FALSE)
#> Global model call: lm(formula = y ~ X1 + X2 + X3 + X4, data = Cement, na.action = na.fail)
#> ---
#> Model selection table 
#>    (Intrc)    X1     X2    X3      X4 df  logLik AICc delta weight
#> 4    52.58 1.468 0.6623                4 -28.156 69.3  0.00  0.566
#> 12   71.65 1.452 0.4161       -0.2365  5 -26.933 72.4  3.13  0.119
#> 8    48.19 1.696 0.6569  0.25          5 -26.952 72.5  3.16  0.116
#> 10  103.10 1.440              -0.6140  4 -29.817 72.6  3.32  0.107
#> 14  111.70 1.052        -0.41 -0.6428  5 -27.310 73.2  3.88  0.081
#> Models ranked by AICc(x)
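
The last two criteria above can be checked by hand from the printed weights. The sketch below is base R and uses only the five weights shown in the tables above (the remaining models of ms0 are not listed, so they are left out); the names w, cum_keep and rel_keep are illustrative.

```r
# Weights of the five top-ranked models, copied from the printed output
w <- c(0.566, 0.119, 0.116, 0.107, 0.081)

# cumulative sum of weights: keep models while the running total is <= 0.95
cum_keep <- cumsum(w) <= 0.95

# relative likelihood: keep models whose weight exceeds 1/8 of the best one
rel_keep <- (w / w[1]) > (1/8)

sum(cum_keep)  # number of models kept by the cumulative-weight rule
sum(rel_keep)  # number of models kept by the relative-likelihood rule
```

The cumulative rule keeps four of these models and the relative-likelihood rule keeps all five, matching the row counts of the corresponding tables.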
