The validate function, when used on an object created by one of the
rms series, does resampling validation of a
regression model, with or without backward step-down variable
deletion.
The print method will call the latex or html method
if options(prType=) is set to "latex" or "html".
For "latex" printing through print(), the LaTeX table
environment is turned off. When using html with Quarto or RMarkdown,
results='asis' need not be written in the chunk header.
See predab.resample for information about confidence limits.
# fit <- fitting.function(formula=response ~ terms, x=TRUE, y=TRUE)
validate(fit, method="boot", B=40,
bw=FALSE, rule="aic", type="residual", sls=0.05, aics=0,
force=NULL, estimates=TRUE, pr=FALSE, ...)
# S3 method for class 'validate'
print(x, digits=4, B=Inf, ...)
# S3 method for class 'validate'
latex(object, digits=4, B=Inf, file='', append=FALSE,
title=first.word(deparse(substitute(x))),
caption=NULL, table.env=FALSE,
size='normalsize', extracolsize=size, ...)
# S3 method for class 'validate'
html(object, digits=4, B=Inf, caption=NULL, ...)
fit: a fit derived by e.g. lrm, cph, psm, or ols. The options x=TRUE and
y=TRUE must have been specified.
method: may be "crossvalidation", "boot" (the default), ".632", or
"randomization". See predab.resample for details. Can be abbreviated, e.g.
"cross", "b", ".6".
B: number of repetitions. For method="crossvalidation", B is the
number of groups of omitted observations. For print.validate,
latex.validate, and html.validate, B is an upper limit on the number
of resamples for which information is printed about which variables
were selected in each model re-fit. Specify zero to suppress
printing. The default is to print all resamples.
bw: TRUE to do fast step-down using the fastbw function,
for both the overall model and for each repetition. fastbw
keeps parameters together that represent the same factor.
rule: applies if bw=TRUE. "aic" to use Akaike's information criterion as a
stopping rule (i.e., a factor is deleted if the \(\chi^2\) falls below
twice its degrees of freedom), or "p" to use \(P\)-values.
type: "residual" or "individual": the stopping rule applies to the residual
\(\chi^2\) for all variables deleted, or to individual factors, respectively.
sls: significance level for a factor to be kept in a model, or for judging the residual \(\chi^2\).
aics: cutoff on AIC when rule="aic".
force: see fastbw
estimates: see print.fastbw
pr: TRUE to print results of each repetition
...: parameters for each specific validate function, and parameters to
pass to predab.resample (note especially the group,
cluster, and subset parameters). For latex,
optional arguments to latex.default. Ignored for
html.validate.
For psm, you can pass the maxiter parameter here (passed to
survreg.control, default is 15 iterations) as well as a tol parameter
for judging matrix singularity in solvet (default is 1e-12)
and a rel.tolerance parameter that is passed to
survreg.control (default is 1e-5).
For print.validate, ... is ignored.
x, object: an object produced by one of the validate functions
digits: number of decimal places to print
file: file to write LaTeX output. Default is standard output.
append: set to TRUE to append LaTeX output to an existing
file
title, caption, table.env, extracolsize: see
latex.default. If table.env is
FALSE and caption is given, the character string
contained in caption will be placed before the table,
centered.
size: size of LaTeX output. Default is 'normalsize'. Must
be a defined LaTeX size when prepended by double slash.
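The two stopping rules described for rule can be illustrated with base R alone. The sketch below is not fastbw's implementation; it only applies the rules as stated above to a single factor, using the \(\chi^2\) and d.f. printed for the age * sex term in the step-down output of the examples. The variable names (chisq, df, delete_by_aic, delete_by_p) are hypothetical and chosen for illustration.

```r
# Chi-square and degrees of freedom for the candidate factor
# (copied from the "age * sex" row of the step-down output in the examples)
chisq <- 0.98
df    <- 1

# rule="aic": AIC = chi-square - 2 * d.f.; the factor is deleted when
# this falls below the cutoff aics (default 0)
aics <- 0
aic  <- chisq - 2 * df
delete_by_aic <- aic < aics

# rule="p": the factor is deleted when its P-value exceeds sls
sls <- 0.1
p   <- 1 - pchisq(chisq, df)
delete_by_p <- p > sls

c(aic = aic, p = round(p, 3))
```

With a single candidate factor, type="individual" and type="residual" lead to the same decision, so the sketch does not distinguish them.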
validate provides bias-corrected indexes that are specific to each type
of model. For validate.cph and validate.psm, see validate.lrm,
which is similar.
For validate.cph and validate.psm, there is
an extra argument dxy, which if TRUE causes the dxy.cens
function to be invoked to compute the Somers' \(D_{xy}\) rank correlation
at each resample. The values in the row
\(D_{xy}\) are equal to \(2 \times (C - 0.5)\), where \(C\) is the
C-index or concordance probability.
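As a worked instance of this relationship, the corrected \(D_{xy}\) of 0.3706 reported in the first table of the examples can be converted back to a concordance probability by inverting the formula:

\[
C = \frac{D_{xy}}{2} + 0.5 = \frac{0.3706}{2} + 0.5 = 0.6853
\]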
For validate.cph with dxy=TRUE,
you must specify an argument u if the model is stratified, since
survival curves can then cross and \(X\beta\) is not 1-1 with
predicted survival.
There is also a validate method for
tree, which only does cross-validation and which has a different
list of arguments.
a matrix with rows corresponding to the statistical indexes and columns for the original index, resample estimates, indexes applied to the whole or omitted sample using the model derived from the resample, average optimism, corrected index, and number of successful resamples.
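For the default bootstrap method, the columns of this matrix are related by simple arithmetic: the optimism is the training value minus the test value, and the corrected index is the original index minus the optimism. The base R sketch below reproduces that arithmetic from three rows of the first table in the examples. It is an illustration of the relationship, not the code that validate runs, and the printed values are rounded to four decimals, so other rows may differ in the last digit.

```r
# Values copied from the Dxy, Slope, and g rows of the first example table
index.orig <- c(Dxy = 0.3854, Slope = 1.0000, g = 0.7377)
training   <- c(Dxy = 0.3937, Slope = 1.0000, g = 0.8071)
test       <- c(Dxy = 0.3789, Slope = 0.9015, g = 0.7236)

# Average optimism: how much better the index looks on the resample
# used for fitting than on the original data
optimism <- training - test

# Bias-corrected index: apparent index minus estimated optimism
index.corrected <- index.orig - optimism

round(cbind(index.orig, training, test, optimism, index.corrected), 4)
```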
prints a summary, and optionally statistics for each re-fit
# See examples for validate.cph, validate.lrm, validate.ols
# Example of validating a parametric survival model:
require(survival)
n <- 1000
set.seed(731)
age <- 50 + 12*rnorm(n)
label(age) <- "Age"
sex <- factor(sample(c('Male','Female'), n, TRUE))
cens <- 15*runif(n)
h <- .02*exp(.04*(age-50)+.8*(sex=='Female'))
dt <- -log(runif(n))/h
e <- ifelse(dt <= cens,1,0)
dt <- pmin(dt, cens)
units(dt) <- "Year"
S <- Surv(dt,e)
f <- psm(S ~ age*sex, x=TRUE, y=TRUE) # Weibull model
# Validate full model fit
validate(f, B=10) # usually B=150
#>           index.orig training    test optimism index.corrected  n
#> Dxy           0.3854   0.3937  0.3789   0.0148          0.3706 10
#> R2            0.0918   0.1032  0.0881   0.0152          0.0766 10
#> Intercept     0.0000   0.0000  0.3467  -0.3467          0.3467 10
#> Slope         1.0000   1.0000  0.9015   0.0985          0.9015 10
#> D             0.0449   0.0503  0.0430   0.0073          0.0376 10
#> U            -0.0012  -0.0011 -0.0017   0.0005         -0.0017 10
#> Q             0.0461   0.0515  0.0447   0.0068          0.0393 10
#> g             0.7377   0.8071  0.7236   0.0835          0.6542 10
# Validate stepwise model with typical (not so good) stopping rule
# bw=TRUE does not preserve hierarchy of terms at present
validate(f, B=10, bw=TRUE, rule="p", sls=.1, type="individual")
#>
#> Backwards Step-down - Original Model
#>
#>  Deleted   Chi-Sq d.f. P      Residual d.f. P      AIC
#>  age * sex 0.98   1    0.3217 0.98     1    0.3217 -1.02
#>
#> Approximate Estimates after Deleting Factors
#>
#>             Coef     S.E.    Wald Z P
#> (Intercept)  5.40012 0.37320 14.470 0.000e+00
#> age         -0.04254 0.00598 -7.114 1.127e-12
#> sex=Male     0.58686 0.14850  3.952 7.750e-05
#>
#> Factors in Final Model
#>
#> [1] age sex
#>           index.orig training    test optimism index.corrected  n
#> Dxy           0.3858   0.3942  0.3805   0.0137          0.3721 10
#> R2            0.0907   0.0981  0.0894   0.0087          0.0820 10
#> Intercept     0.0000   0.0000  0.0759  -0.0759          0.0759 10
#> Slope         1.0000   1.0000  0.9799   0.0201          0.9799 10
#> D             0.0443   0.0481  0.0437   0.0044          0.0399 10
#> U            -0.0012  -0.0012 -0.0010  -0.0002         -0.0010 10
#> Q             0.0455   0.0493  0.0447   0.0046          0.0409 10
#> g             0.6888   0.7217  0.6996   0.0220          0.6668 10
#>
#> Factors Retained in Backwards Elimination
#>
#> age sex age * sex
#> * *
#> * * *
#> * *
#> * *
#> * *
#> * *
#> * *
#> * *
#> * * *
#> * *
#>
#> Frequencies of Numbers of Factors Retained
#>
#> 2 3
#> 8 2
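The print arguments digits and B, and the matrix structure of the returned object, can be tried out with a smaller, self-contained run. The sketch below assumes the rms package is installed; the simulated variables x1, x2, and y are hypothetical and exist only for this illustration. It uses ols because it fits quickly, but the same calls apply to the other fit types listed for fit.

```r
library(rms)   # assumes the rms package is installed

# Simulated data; x1, x2, and y are hypothetical names used only here
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- x1 + 0.5 * x2 + rnorm(n)

# x=TRUE and y=TRUE must be specified for validate to work on the fit
f <- ols(y ~ x1 + x2, x = TRUE, y = TRUE)
v <- validate(f, B = 20)          # small B only to keep the run short

# The result is a matrix of class "validate"; pull out one column by name
corrected <- v[, "index.corrected"]

# Print with 3 decimal places instead of the default 4
print(v, digits = 3)

# With step-down, B = 0 in print() suppresses the listing of which
# variables were selected in each re-fit
vb <- validate(f, B = 20, bw = TRUE)
print(vb, B = 0)
```

Storing the result rather than printing it directly makes it possible to reuse individual entries, such as the corrected indexes, in later computations.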