Epsilon-squared

Calculates epsilon-squared as an effect size statistic following a Kruskal-Wallis test, or for a table with one ordinal variable and one nominal variable. Confidence intervals are calculated by bootstrap.

Usage

epsilonSquared(
  x,
  g = NULL,
  group = "row",
  ci = FALSE,
  conf = 0.95,
  type = "perc",
  R = 1000,
  histogram = FALSE,
  digits = 3,
  reportIncomplete = FALSE,
  ...
)

Arguments

x: Either a two-way table or a two-way matrix. Can also be a vector of observations of an ordinal variable.
g: If x is a vector, g is the vector of observations for the grouping, nominal variable.
group: If x is a table or matrix, group indicates whether the "row" or the "column" variable is the nominal, grouping variable.
ci: If TRUE, returns confidence intervals by bootstrap. May be slow.
conf: The level for the confidence interval.
type: The type of confidence interval to use. Can be any of "norm", "basic", "perc", or "bca". Passed to boot.ci.
R: The number of replications to use for bootstrap.
histogram: If TRUE, produces a histogram of bootstrapped values.
digits: The number of significant digits in the output.
reportIncomplete: If FALSE (the default), NA is reported when the calculation of the statistic fails at any point during the bootstrap procedure.
...: Additional arguments passed to the kruskal.test function.

Value

A single statistic, epsilon-squared. If ci = TRUE, a small data frame consisting of epsilon-squared and the lower and upper confidence limits.

Details

Epsilon-squared is used as a measure of association for the Kruskal-Wallis test or for a two-way table with one ordinal and one nominal variable.

Currently, the function makes no provisions for NA values in the data. It is recommended that NAs be removed beforehand.
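
One way to remove NAs beforehand is sketched below in base R. The vectors score and group are made-up illustrative data, not part of the package, and the final call is guarded so that it runs only if rcompanion is installed.

```r
# Illustrative data with missing values (hypothetical, not from the package)
score <- c(3, 4, NA, 4, 5, 1, 2, 2, NA, 1)
group <- factor(c("A", "A", "A", "A", "A", "B", "B", NA, "B", "B"))

# Keep only observations where both the response and the group are present
keep        <- !is.na(score) & !is.na(group)
score_clean <- score[keep]
group_clean <- droplevels(group[keep])

# Pass the cleaned vectors to epsilonSquared (only if rcompanion is available)
if (requireNamespace("rcompanion", quietly = TRUE)) {
  print(rcompanion::epsilonSquared(x = score_clean, g = group_clean))
}
```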

Because epsilon-squared is never negative, if type = "perc", the confidence interval will never cross zero, and should not be used for statistical inference. However, if type = "norm", the confidence interval may cross zero.

When epsilon-squared is close to 0 or very large, or with small counts in some cells, the confidence intervals determined by this method may not be reliable, or the procedure may fail.
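
The behavior of a percentile interval, and the way a bootstrap replicate can fail, can be illustrated with a hand-rolled bootstrap in base R. This is only a sketch: it is not the package's implementation (the type argument is documented above as being passed to boot.ci), the data are made up, the helper eps2 is hypothetical, and it assumes the common definition of epsilon-squared as H / (n - 1), where H is the Kruskal-Wallis statistic.

```r
set.seed(1)  # make the resampling reproducible

# Illustrative data (hypothetical, not from the package)
score <- c(3, 4, 5, 4, 5, 1, 2, 2, 3, 1, 3, 3, 4, 2, 3)
group <- factor(rep(c("A", "B", "C"), each = 5))

# Hypothetical helper: epsilon-squared under the assumed definition H / (n - 1).
# Returns NA if kruskal.test fails, e.g. when a resample contains one group only.
eps2 <- function(x, g) {
  tryCatch(
    unname(kruskal.test(x, g)$statistic) / (length(x) - 1),
    error = function(e) NA_real_
  )
}

# Resample observations with replacement and recompute the statistic
R <- 1000
boot_vals <- replicate(R, {
  i <- sample(seq_along(score), replace = TRUE)
  eps2(score[i], droplevels(group[i]))
})

# Percentile interval; na.rm = TRUE drops any failed replicates
ci <- quantile(boot_vals, c(0.025, 0.975), na.rm = TRUE)
ci
```

Because every bootstrapped value is itself an epsilon-squared, the lower percentile limit cannot fall below zero, which is why a percentile interval says nothing about whether the effect differs from zero. With small groups, some resamples may omit a group entirely or fail outright, which is the situation the reportIncomplete argument addresses.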

Note

Note that epsilon-squared as calculated by this function is equivalent to the eta-squared, or r-squared, as determined by an anova on the rank-transformed values. Epsilon-squared for Kruskal-Wallis is typically defined this way in the literature.
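
The equivalence can be checked numerically in base R. The sketch below assumes the common definition of epsilon-squared as H / (n - 1), where H is the Kruskal-Wallis statistic and n is the total number of observations. That formula is an assumption here, not something stated on this page, and the data are made up for illustration.

```r
# Illustrative data (hypothetical, not from the package)
score <- c(3, 4, 5, 4, 5, 1, 2, 2, 3, 1, 3, 3, 4, 2, 3)
group <- factor(rep(c("A", "B", "C"), each = 5))

# Kruskal-Wallis H statistic and total sample size
H <- unname(kruskal.test(score, group)$statistic)
n <- length(score)

# Epsilon-squared under the assumed definition H / (n - 1)
eps2 <- H / (n - 1)

# R-squared from a one-way anova on the rank-transformed values
r2 <- summary(lm(rank(score) ~ group))$r.squared

# Compare the two quantities within numerical tolerance
all.equal(eps2, r2)
```

The two agree because the tie-corrected H statistic is (n - 1) times the ratio of the between-group to the total sum of squares of the ranks.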

References

King, B.M., P.J. Rosopa, and E.W. Minium. 2018. Statistical Reasoning in the Behavioral Sciences, 7th ed. Wiley.

https://rcompanion.org/handbook/F_08.html

Author

Salvatore Mangiafico, mangiafico@njaes.rutgers.edu

Examples

data(Breakfast)
library(coin)
#> Loading required package: survival
chisq_test(Breakfast, scores = list("Breakfast" = c(-2, -1, 0, 1, 2)))
#> 
#> 	Asymptotic Generalized Pearson Chi-Squared Test
#> 
#> data:  Breakfast (ordered) by Travel (Walk, Bus, Drive)
#> chi-squared = 8.6739, df = 2, p-value = 0.01308
#> 
epsilonSquared(Breakfast)
#> epsilon.squared 
#>            0.11 

data(PoohPiglet)
kruskal.test(Likert ~ Speaker, data = PoohPiglet)
#> 
#> 	Kruskal-Wallis rank sum test
#> 
#> data:  Likert by Speaker
#> Kruskal-Wallis chi-squared = 16.842, df = 2, p-value = 0.0002202
#> 
epsilonSquared(x = PoohPiglet$Likert, g = PoohPiglet$Speaker)
#> epsilon.squared 
#>           0.581 

### Same data, as matrix of counts
data(PoohPiglet)
XT = xtabs( ~ Speaker + Likert , data = PoohPiglet)
epsilonSquared(XT)
#> epsilon.squared 
#>           0.581