Count pseudo r-squared for logistic and other binary outcome models

Produces the count pseudo r-squared measure for models with a binary outcome.

Usage

countRSquare(
  fit,
  digits = 3,
  suppressWarnings = TRUE,
  plotit = FALSE,
  jitter = FALSE,
  pch = 1,
  ...
)

Arguments

fit: The fitted model object for which to determine pseudo r-squared. glm and glmmTMB are supported. Others may work as well.
digits: The number of digits in the output values.
suppressWarnings: If TRUE, suppresses warning messages.
plotit: If TRUE, produces a simple plot of actual vs. predicted values.
jitter: If TRUE, jitters the "actual" values in the plot.
pch: Passed to plot.
...: Additional arguments.

Value

A list including a description of the submitted model, a data frame with the pseudo r-squared results, and a confusion matrix of the results.

Details

The count pseudo r-squared is simply the number of correctly predicted observations divided by the total number of observations.

This version is appropriate for models with a binary outcome.

The adjusted value (reported as Count.R2.corrected) subtracts the count of the most frequent outcome from both the numerator and the denominator.
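
As a minimal sketch of the arithmetic described above, and not the package's internal code, both values can be computed from vectors of actual and predicted classes. The helper name count_r2_sketch, the toy data, and the 0.5 probability cutoff are assumptions made for illustration only.

```r
# Hypothetical helper (not part of rcompanion): count pseudo r-squared
# from 0/1 vectors of actual and predicted classes.
count_r2_sketch <- function(actual, predicted) {
  stopifnot(length(actual) == length(predicted))
  correct <- sum(actual == predicted)   # correctly predicted observations
  total   <- length(actual)             # all observations
  n_max   <- max(table(actual))         # count of the most frequent outcome
  c(Count.R2           = correct / total,
    Count.R2.corrected = (correct - n_max) / (total - n_max))
}

# Toy data: predicted probabilities classified at an assumed cutoff of 0.5
actual    <- c(0, 0, 0, 1, 1, 1, 1, 0)
prob      <- c(0.2, 0.6, 0.1, 0.8, 0.4, 0.9, 0.7, 0.3)
predicted <- as.integer(prob > 0.5)

count_r2_sketch(actual, predicted)
```

With these toy vectors, 6 of 8 observations are classified correctly and each outcome occurs 4 times, so the sketch gives 6/8 for the count value and (6 - 4)/(8 - 4) for the adjusted value.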

It is recommended that the model be fit on data in long format, with one row per observation. That is, the weight option should not be used in the model.

The function makes no provision for NA values. It is recommended that NA values be removed before the model is fit.
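
One possible way to drop incomplete rows before fitting is sketched below with base R's complete.cases. The data frame dat and its columns y and x are made up for this illustration; they are not from the package.

```r
# Illustrative data frame with missing values (made up for this sketch)
dat <- data.frame(y = c(0, 1, NA, 1, 0, 1, 0, 1),
                  x = c(1.2, 2.5, 3.1, NA, 0.7, 2.9, 1.1, 3.4))

# Keep only rows with no NA in the variables the model would use
dat_complete <- dat[complete.cases(dat[, c("y", "x")]), ]

nrow(dat)           # rows before removal
nrow(dat_complete)  # rows after removal
```

The model would then be fit on dat_complete rather than on dat, so that the fitted values and the observed outcomes cover the same rows.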

References

https://stats.oarc.ucla.edu/other/mult-pkg/faq/general/faq-what-are-pseudo-r-squareds/
https://rcompanion.org/handbook/H_08.html
https://rcompanion.org/rcompanion/e_06.html

Author

Salvatore Mangiafico, mangiafico@njaes.rutgers.edu

Examples

data(AndersonBias)

### Convert data to long format

Long = AndersonBias[rep(row.names(AndersonBias), AndersonBias$Count),
                    c("Result", "County", "Gender")]
rownames(Long) = seq_len(nrow(Long))
str(Long)
#> 'data.frame':	181 obs. of  3 variables:
#>  $ Result: Factor w/ 2 levels "Pass","Fail": 1 1 1 1 1 1 1 1 1 2 ...
#>  $ County: Factor w/ 4 levels "Bloom","Cobblestone",..: 1 1 1 1 1 1 1 1 1 1 ...
#>  $ Gender: Factor w/ 2 levels "Female","Male": 1 1 1 1 1 1 1 1 1 1 ...

### Fit model and determine count r-square

model = glm(Result ~ County + Gender + County:Gender,
            data = Long,
            family = binomial())

countRSquare(model)
#> $Model
#> [1] "glm, Result ~ County + Gender + County:Gender, binomial(), Long"
#> 
#> $Result
#>   Count.R2 Count.R2.corrected
#> 1    0.652              0.284
#> 
#> $Confusion.matrix
#>       Predicted
#> Actual   0   1 Sum
#>    0    63  30  93
#>    1    33  55  88
#>    Sum  96  85 181
#>
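
The reported values can be checked by hand from the confusion matrix. The sketch below copies the four cell counts from the output above and applies the definitions given in Details; the object names are chosen for illustration.

```r
# Counts copied from the confusion matrix printed above
conf <- matrix(c(63, 30,
                 33, 55),
               nrow = 2, byrow = TRUE,
               dimnames = list(Actual = c("0", "1"), Predicted = c("0", "1")))

correct <- sum(diag(conf))      # correctly predicted: 63 + 55 = 118
total   <- sum(conf)            # all observations: 181
n_max   <- max(rowSums(conf))   # most frequent actual outcome: 93

round(correct / total, 3)                       # count value, 0.652
round((correct - n_max) / (total - n_max), 3)   # adjusted value, 0.284
```

Both results agree with the Count.R2 and Count.R2.corrected columns in $Result.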