Eta-squared for ordinal variables

Calculates eta-squared as an effect size statistic following a Kruskal-Wallis test, or for a table with one ordinal variable and one nominal variable. Confidence intervals are available by bootstrap.

Usage

ordinalEtaSquared(
  x,
  g = NULL,
  group = "row",
  ci = FALSE,
  conf = 0.95,
  type = "perc",
  R = 1000,
  histogram = FALSE,
  digits = 3,
  reportIncomplete = FALSE,
  ...
)

Arguments

x: Either a two-way table or a two-way matrix. Can also be a vector of observations of an ordinal variable.
g: If x is a vector, g is the vector of observations for the grouping, nominal variable.
group: If x is a table or matrix, group indicates whether the "row" or the "column" variable is the nominal, grouping variable.
ci: If TRUE, returns confidence intervals by bootstrap. May be slow.
conf: The level for the confidence interval.
type: The type of confidence interval to use. Can be any of "norm", "basic", "perc", or "bca". Passed to boot.ci.
R: The number of replications to use for bootstrap.
histogram: If TRUE, produces a histogram of bootstrapped values.
digits: The number of significant digits in the output.
reportIncomplete: If FALSE (the default), NA will be reported if the calculation of the statistic fails for any replicate during the bootstrap procedure.
...: Additional arguments passed to the kruskal.test function.

Value

A single statistic, eta-squared, or, when confidence intervals are requested, a small data frame consisting of eta-squared and the lower and upper confidence limits.

Details

Eta-squared is used as a measure of association for the Kruskal-Wallis test or for a two-way table with one ordinal and one nominal variable.

Currently, the function makes no provisions for NA values in the data. It is recommended that NAs be removed beforehand.
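
Since the function makes no provision for NA values, incomplete observations can be dropped before the call. The following is a minimal sketch using base R's complete.cases; the data frame dat and its values are hypothetical, made up for illustration, and the call to ordinalEtaSquared runs only if rcompanion is installed.

```r
# Hypothetical data with missing values (not from the package)
dat <- data.frame(
  Likert  = c(3, 4, NA, 5, 2, 1, 2, NA, 4, 5, NA, 5),
  Speaker = rep(c("Pooh", "Piglet", "Tigger"), each = 4)
)

# Keep only rows where both variables are observed
clean <- dat[complete.cases(dat[, c("Likert", "Speaker")]), ]

# Pass the cleaned vectors to the function, if rcompanion is installed
if (requireNamespace("rcompanion", quietly = TRUE)) {
  rcompanion::ordinalEtaSquared(x = clean$Likert, g = clean$Speaker)
}
```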

Eta-squared is typically positive, though it may be negative in some cases, as is the case with adjusted r-squared. It is not recommended that the confidence interval be used for statistical inference.

When eta-squared is close to 0 or very large, or with small counts in some cells, the confidence intervals determined by this method may not be reliable, or the procedure may fail.
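
To illustrate why a bootstrap replicate can fail, here is a rough sketch of a percentile bootstrap written in base R. It is not the package's implementation (the Arguments section indicates that the interval type is passed to boot.ci). The toy data, the helper eta_sq, and the formula (H - k + 1) / (n - k) are assumptions made for this illustration.

```r
set.seed(1)  # for reproducibility of this illustration only

# Hypothetical toy data (not from the package)
score <- c(1, 2, 2, 3, 2, 3, 3, 4, 3, 4, 5, 5, 4, 5, 5)
group <- factor(rep(c("A", "B", "C"), each = 5))

# Hypothetical helper: eta-squared from the Kruskal-Wallis H statistic,
# using a formula assumed from the literature
eta_sq <- function(x, g) {
  g <- droplevels(g)
  H <- unname(kruskal.test(x, g)$statistic)
  (H - nlevels(g) + 1) / (length(x) - nlevels(g))
}

# Resample observations with replacement. A resample that contains only
# one group makes kruskal.test() stop with an error; record NA instead.
R <- 500
boots <- replicate(R, {
  i <- sample(seq_along(score), replace = TRUE)
  tryCatch(eta_sq(score[i], group[i]), error = function(e) NA_real_)
})

# Percentile interval at the 95% level, ignoring failed replicates
ci <- quantile(boots, c(0.025, 0.975), na.rm = TRUE)
ci
```

With small groups or sparse cells, more resamples lose a group entirely or become constant, which is one way the failures described above can arise.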

Note

Note that eta-squared as calculated by this function is equivalent to the epsilon-squared, or adjusted r-squared, as determined by an anova on the rank-transformed values. Eta-squared for Kruskal-Wallis is typically defined this way in the literature.
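
The equivalence can be checked numerically. The sketch below assumes the formula (H - k + 1) / (n - k), where H is the Kruskal-Wallis statistic, k the number of groups, and n the number of observations; this is the form commonly given in the literature and is not taken from the package source. The data are hypothetical.

```r
# Hypothetical toy data (not from the package)
score <- c(1, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5)
group <- factor(rep(c("A", "B", "C"), each = 4))

# Kruskal-Wallis H statistic (tie-corrected by kruskal.test)
H <- unname(kruskal.test(score ~ group)$statistic)
n <- length(score)
k <- nlevels(group)

# Eta-squared from H (formula assumed from the literature)
eta2 <- (H - k + 1) / (n - k)

# Adjusted r-squared from a one-way model on the rank-transformed values
adjR2 <- summary(lm(rank(score) ~ group))$adj.r.squared

# The two values should agree up to floating-point error
all.equal(eta2, adjR2)
```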

References

Cohen, B.H. 2013. Explaining Psychological Statistics, 4th ed. Wiley.

https://rcompanion.org/handbook/F_08.html

Author

Salvatore Mangiafico, mangiafico@njaes.rutgers.edu

Examples

data(Breakfast)
library(coin)
chisq_test(Breakfast, scores = list("Breakfast" = c(-2, -1, 0, 1, 2)))
#> 
#> 	Asymptotic Generalized Pearson Chi-Squared Test
#> 
#> data:  Breakfast (ordered) by Travel (Walk, Bus, Drive)
#> chi-squared = 8.6739, df = 2, p-value = 0.01308
#> 
ordinalEtaSquared(Breakfast)
#> eta.squared 
#>      0.0865 

data(PoohPiglet)
kruskal.test(Likert ~ Speaker, data = PoohPiglet)
#> 
#> 	Kruskal-Wallis rank sum test
#> 
#> data:  Likert by Speaker
#> Kruskal-Wallis chi-squared = 16.842, df = 2, p-value = 0.0002202
#> 
ordinalEtaSquared(x = PoohPiglet$Likert, g = PoohPiglet$Speaker)
#> eta.squared 
#>        0.55 

### Same data, as matrix of counts
data(PoohPiglet)
XT = xtabs( ~ Speaker + Likert , data = PoohPiglet)
ordinalEtaSquared(XT)
#> eta.squared 
#>        0.55
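
A further example, requesting a bootstrap confidence interval through the documented ci, conf, type, and R arguments. This is a sketch only: the seed and the reduced number of replications are arbitrary choices to keep the run short, and the code does nothing if rcompanion is not installed. No output is shown because the limits depend on the random resampling.

```r
res <- NULL
if (requireNamespace("rcompanion", quietly = TRUE)) {
  data(PoohPiglet, package = "rcompanion")
  set.seed(1)  # arbitrary seed, for repeatable resampling
  res <- rcompanion::ordinalEtaSquared(
    x    = PoohPiglet$Likert,
    g    = PoohPiglet$Speaker,
    ci   = TRUE,    # request bootstrap confidence limits
    conf = 0.95,
    type = "perc",
    R    = 500      # fewer replications than the default, for speed
  )
  print(res)
}
```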