Pseudo r-squared measures for various models
Produces McFadden, Cox and Snell, and Nagelkerke pseudo r-squared measures, along with p-values, for models.
Arguments
- fit
The fitted model object for which to determine pseudo r-squared.
- null
The null model object against which to compare the fitted model object. The null model must be nested in the fitted model to be valid. Specifying the null is optional for some model object types and is required for others.
- restrictNobs
If TRUE, limits the observations for the null model to those used in the fitted model. Works with only some model object types.
Value
A list of six objects describing the models used, the pseudo r-squared values, the likelihood ratio test for the model, the number of observations for the models, messages, and any warnings.
Details
Pseudo R-squared values are not directly comparable to the R-squared for OLS models. Nor can they be interpreted as the proportion of the variability in the dependent variable that is explained by the model. Instead, pseudo R-squared measures are relative measures among similar models indicating how well the model explains the data.
Cox and Snell is also referred to as ML. Nagelkerke is also referred to as Cragg and Uhler.
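For reference, the three measures are commonly defined as below, where L_M and L_0 are the maximized likelihoods of the fitted and null models and n is the number of observations. These are the usual textbook definitions; that the function computes them in exactly this form is an assumption, although the example output on this page is consistent with them.

```latex
\begin{align*}
R^2_{\text{McFadden}}   &= 1 - \frac{\ln L_M}{\ln L_0} \\
R^2_{\text{Cox--Snell}} &= 1 - \left(\frac{L_0}{L_M}\right)^{2/n}
                         = 1 - \exp\!\left(\frac{2\,(\ln L_0 - \ln L_M)}{n}\right) \\
R^2_{\text{Nagelkerke}} &= \frac{R^2_{\text{Cox--Snell}}}{1 - L_0^{2/n}}
                         = \frac{R^2_{\text{Cox--Snell}}}{1 - \exp\!\left(2 \ln L_0 / n\right)}
\end{align*}
```

The Nagelkerke measure rescales Cox and Snell by its maximum attainable value, which is why the two are nearly identical when the null log likelihood is large in magnitude relative to n.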
Model objects accepted are lm, glm, gls, lme, lmer, lmerTest, nls, clm, clmm, vglm, glmer, glmmTMB, negbin, zeroinfl, betareg, and rq.
Model objects that require the null model to be defined are nls, lmer, glmer, and clmm. Other objects use the update function to define the null model.
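As a rough illustration of how an intercept-only null model can be derived with update, the sketch below uses base R and the built-in warpbreaks data. The formula `. ~ 1` is an assumption about what a typical null looks like; it is not taken from the function's source.

```r
# Fitted model on a built-in data set (illustration only)
fit <- glm(breaks ~ wool + tension, family = poisson, data = warpbreaks)

# Keep the response, family, and data; replace the right-hand side
# of the formula with an intercept only
null <- update(fit, . ~ 1)

formula(null)   # formula of the derived null model
nobs(fit)       # observations used by the fitted model
nobs(null)      # observations used by the null model
```

For model types where this kind of update is not available or not reliable, the null model has to be fitted by hand and passed through the null argument.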
Likelihoods are found using ML (REML = FALSE).
The fitted model and the null model should be properly nested. That is, the terms of one need to be a subset of the other, and they should have the same set of observations.
One issue arises when there are NA values in one variable but not another, and observations with NA are removed in the model fitting. The result may be fitted and null models with different sets of observations. Setting restrictNobs to TRUE ensures that only observations in the fit model are used in the null model. This appears to work for lm and some glm models, but causes the function to fail for other model object types.
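The mismatch described above can be reproduced with base R alone. The sketch below introduces missing values into one predictor of the built-in mtcars data, then refits the null model on the rows the fitted model actually used. Refitting on model.frame(fit) is one way to get the effect described for restrictNobs; whether the function does it this way internally is an assumption.

```r
d <- mtcars
d$wt[1:4] <- NA                      # missing values in one predictor only

fit  <- lm(mpg ~ wt, data = d)       # rows with NA in wt are dropped
null <- lm(mpg ~ 1,  data = d)       # no NA in mpg, so all rows are kept

nobs(fit)                            # fewer observations than the null
nobs(null)

# Refit the null using only the rows that survived in the fitted model
null.restricted <- lm(mpg ~ 1, data = model.frame(fit))
nobs(null.restricted)                # now matches nobs(fit)
```

Comparing the observation counts before interpreting any pseudo R-squared value is a cheap check that the two models are based on the same data.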
Some pseudo R-squared measures may not be appropriate or useful for some model types.
Calculations are based on log likelihood values for models. Results may differ from those based on deviance.
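To see why the two approaches can disagree, the sketch below computes a McFadden-style measure from log likelihoods and a deviance-based analogue for the same Poisson model, using base R and the built-in warpbreaks data. The variable names are illustrative, and the deviance-based formula is a common alternative rather than anything this function reports.

```r
fit  <- glm(breaks ~ wool + tension, family = poisson, data = warpbreaks)
null <- update(fit, . ~ 1)

ll.fit  <- as.numeric(logLik(fit))
ll.null <- as.numeric(logLik(null))

# Log-likelihood-based McFadden measure
mcfadden.ll <- 1 - ll.fit / ll.null

# Deviance-based analogue: deviance is measured relative to the
# saturated model, whose log likelihood is not zero for Poisson data
mcfadden.dev <- 1 - deviance(fit) / fit$null.deviance

c(loglik = mcfadden.ll, deviance = mcfadden.dev)
```

The two values coincide only when the saturated model has a log likelihood of zero, as with ungrouped binary responses.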
Acknowledgments
My thanks to Jan-Herman Kuiper of Keele University for suggesting the restrictNobs fix.
Author
Salvatore Mangiafico, mangiafico@njaes.rutgers.edu
Examples
### Logistic regression example
library(rcompanion)
data(AndersonBias)
model = glm(Result ~ County + Gender + County:Gender,
            weights = Count,
            data    = AndersonBias,
            family  = binomial(link="logit"))
nagelkerke(model)
#> $Models
#>
#> Model: "glm, Result ~ County + Gender + County:Gender, binomial(link = \"logit\"), AndersonBias, Count"
#> Null: "glm, Result ~ 1, binomial(link = \"logit\"), AndersonBias, Count"
#>
#> $Pseudo.R.squared.for.model.vs.null
#>                              Pseudo.R.squared
#> McFadden                            0.0797857
#> Cox and Snell (ML)                  0.7136520
#> Nagelkerke (Cragg and Uhler)        0.7136520
#>
#> $Likelihood.ratio.test
#>  Df.diff LogLik.diff  Chisq   p.value
#>       -7     -10.004 20.009 0.0055508
#>
#> $Number.of.observations
#>
#> Model: 16
#> Null: 16
#>
#> $Messages
#> [1] "Note: For models fit with REML, these statistics are based on refitting with ML"
#>
#> $Warnings
#> [1] "None"
#>
### Quadratic plateau example
### With nls, the null needs to be defined
data(BrendonSmall)
quadplat = function(x, a, b, clx) {
  ifelse(x < clx, a + b * x   + (-0.5*b/clx) * x   * x,
                  a + b * clx + (-0.5*b/clx) * clx * clx)}
model = nls(Sodium ~ quadplat(Calories, a, b, clx),
            data = BrendonSmall,
            start = list(a   = 519,
                         b   = 0.359,
                         clx = 2304))
nullfunct = function(x, m){m}
null.model = nls(Sodium ~ nullfunct(Calories, m),
                 data = BrendonSmall,
                 start = list(m = 1346))
nagelkerke(model, null=null.model)
#> $Models
#>
#> Model: "nls, Sodium ~ quadplat(Calories, a, b, clx), BrendonSmall, list(a = 519, b = 0.359, clx = 2304), default, list(50, 1e-05, 0.0009765625, FALSE, FALSE, 0, FALSE), FALSE"
#> Null: "nls, Sodium ~ nullfunct(Calories, m), BrendonSmall, list(m = 1346), default, list(50, 1e-05, 0.0009765625, FALSE, FALSE, 0, FALSE), FALSE"
#>
#> $Pseudo.R.squared.for.model.vs.null
#>                              Pseudo.R.squared
#> McFadden                             0.175609
#> Cox and Snell (ML)                   0.864674
#> Nagelkerke (Cragg and Uhler)         0.864683
#>
#> $Likelihood.ratio.test
#>  Df.diff LogLik.diff  Chisq    p.value
#>       -2     -45.001 90.003 2.8583e-20
#>
#> $Number.of.observations
#>
#> Model: 45
#> Null: 45
#>
#> $Messages
#> [1] "Note: For models fit with REML, these statistics are based on refitting with ML"
#>
#> $Warnings
#> [1] "None"
#>