r effect size for Wilcoxon two-sample rank-sum test

Calculates the r effect size for a Mann-Whitney two-sample rank-sum test, or for a table with an ordinal variable and a nominal variable with two levels; confidence intervals are computed by bootstrap.

Usage

wilcoxonR(
  x,
  g = NULL,
  group = "row",
  coin = FALSE,
  ci = FALSE,
  conf = 0.95,
  type = "perc",
  R = 1000,
  histogram = FALSE,
  digits = 3,
  reportIncomplete = FALSE,
  ...
)

Arguments

x: Either a two-way table or a two-way matrix. Can also be a vector of observations.
g: If x is a vector, g is the vector of observations for the grouping, nominal variable. Only the first two levels of the nominal variable are used.
group: If x is a table or matrix, group indicates whether the "row" or the "column" variable is the nominal, grouping variable.
coin: If FALSE, the default, the Z value is extracted from a function similar to the wilcox.test function in the stats package. If TRUE, the Z value is extracted from the wilcox_test function in the coin package. This method may be much slower, especially if a confidence interval is produced.
ci: If TRUE, returns confidence intervals by bootstrap. May be slow.
conf: The level for the confidence interval.
type: The type of confidence interval to use. Can be any of "norm", "basic", "perc", or "bca". Passed to boot.ci.
R: The number of replications to use for bootstrap.
histogram: If TRUE, produces a histogram of bootstrapped values.
digits: The number of significant digits in the output.
reportIncomplete: If FALSE (the default), NA is reported when the calculation of the statistic fails for any replicate during the bootstrap procedure.
...: Additional arguments passed to the wilcox_test function.

Value

A single statistic, r, or, if ci = TRUE, a small data frame consisting of r and the lower and upper confidence limits.

Details

r is calculated as Z divided by the square root of the total number of observations.
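
As a rough illustration of this calculation, the sketch below computes Z by hand from the normal approximation to the rank-sum statistic (with a tie correction and no continuity correction) and then divides by the square root of N. The data are hypothetical, and it is an assumption that wilcoxonR uses this same approximation internally, so its output may differ slightly.

```r
# Hypothetical data, for illustration only
a <- c(8, 9, 7, 10, 9, 6)   # first group
b <- c(5, 6, 4, 7, 5, 3)    # second group

n1 <- length(a)
n2 <- length(b)
N  <- n1 + n2

# Rank all observations together; tied values receive midranks
ranks <- rank(c(a, b))

# Mann-Whitney statistic for the first group
W <- sum(ranks[seq_len(n1)]) - n1 * (n1 + 1) / 2

# Standard deviation of W under the null hypothesis, corrected for ties
ties  <- table(ranks)
sigma <- sqrt(n1 * n2 / 12 *
              ((N + 1) - sum(ties^3 - ties) / (N * (N - 1))))

# Z (no continuity correction), then r = Z / sqrt(N)
Z <- (W - n1 * n2 / 2) / sigma
r <- Z / sqrt(N)
r
```

Because the values in the first group tend to be larger, Z and r are positive here.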

This statistic reports a smaller effect size than does the Glass rank biserial correlation coefficient (wilcoxonRG), and cannot reach -1 or 1. This effect is exacerbated when sample sizes are not equal.
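
To see why r cannot reach 1, consider the most extreme case, in which every value in the first group exceeds every value in the second. The sketch below computes r for that case under the normal approximation, assuming no ties and no continuity correction; max_r is a hypothetical helper written for this illustration, not part of the package.

```r
# r under complete separation of the two groups
# (normal approximation, no ties, no continuity correction)
max_r <- function(n1, n2) {
  N     <- n1 + n2
  U     <- n1 * n2                          # largest possible Mann-Whitney U
  sigma <- sqrt(n1 * n2 * (N + 1) / 12)     # null standard deviation of U
  Z     <- (U - n1 * n2 / 2) / sigma
  Z / sqrt(N)
}

max_r(10, 10)   # about 0.845 with equal group sizes
max_r(5, 50)    # about 0.493 with unequal group sizes
```

Under these assumptions the expression simplifies to sqrt(3 * n1 * n2 / (N * (N + 1))), which stays below sqrt(3/4), roughly 0.866, for equal groups and shrinks further as the group sizes become more unbalanced.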

Currently, the function makes no provisions for NA values in the data. It is recommended that NAs be removed beforehand.
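
One way to remove NA values beforehand is to keep only the observations that are complete in both vectors. This is a minimal sketch with hypothetical data; the names x_clean and g_clean are chosen for illustration only.

```r
# Hypothetical data containing missing values
x <- c(8000, NA, 7500, 9000, 6000, 8500)
g <- c("female", "male", NA, "male", "female", "male")

# Keep only the positions where both x and g are observed
keep    <- complete.cases(x, g)
x_clean <- x[keep]
g_clean <- g[keep]
```

The cleaned vectors can then be passed as the x and g arguments.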

When the data in the first group are greater than in the second group, r is positive. When the data in the second group are greater than in the first group, r is negative. Be cautious with this interpretation, as R will alphabetize groups if g is not already a factor.
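
To control which group is treated as the first, g can be converted to a factor with explicit levels before calling the function. The group labels below are hypothetical.

```r
# Hypothetical grouping variable stored as character
g <- c("treatment", "control", "treatment", "control")

# By default, factor levels are sorted alphabetically,
# so "control" would be treated as the first group
levels(factor(g))

# Set the level order explicitly so "treatment" is the first group
g <- factor(g, levels = c("treatment", "control"))
levels(g)
```

With this ordering, a positive r would indicate that values in "treatment" tend to be greater than those in "control".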

When r is close to extremes, or with small counts in some cells, the confidence intervals determined by this method may not be reliable, or the procedure may fail.

References

https://rcompanion.org/handbook/F_04.html

Author

Salvatore Mangiafico, mangiafico@njaes.rutgers.edu

Examples

data(Breakfast)
Table = Breakfast[1:2,]
library(coin)
chisq_test(Table, scores = list("Breakfast" = c(-2, -1, 0, 1, 2)))
#> 
#> 	Asymptotic Linear-by-Linear Association Test
#> 
#> data:  Breakfast (ordered) by Travel (Walk, Bus)
#> Z = -1.5204, p-value = 0.1284
#> alternative hypothesis: two.sided
#> 
wilcoxonR(Table)
#>      r 
#> -0.216 

data(Catbus)
wilcox.test(Steps ~ Gender, data = Catbus)
#> Warning: cannot compute exact p-value with ties
#> 
#> 	Wilcoxon rank sum test with continuity correction
#> 
#> data:  Steps by Gender
#> W = 127.5, p-value = 0.01773
#> alternative hypothesis: true location shift is not equal to 0
#> 
wilcoxonR(x = Catbus$Steps, g = Catbus$Gender)
#>     r 
#> 0.471