Correlation and measures of association
Produces measures of association for all variables in a data frame, with confidence intervals when available.
Usage
correlation(
  data = NULL,
  printClasses = FALSE,
  progress = TRUE,
  methodNum = "pearson",
  methodOrd = "kendall",
  methodNumOrd = "spearman",
  methodNumNom = "eta",
  methodNumBin = "pearson",
  testChisq = "chisq",
  ci = FALSE,
  conf = 0.95,
  R = 1000,
  correct = FALSE,
  reportIncomplete = TRUE,
  na.action = "na.omit",
  digits = 3,
  pDigits = 4,
  ...
)
Arguments
- data
A data frame.
- printClasses
If TRUE, prints a table of classes for all variables.
- progress
If TRUE, prints a progress bar when bootstrap methods are called.
- methodNum
The method for the correlation of two numeric variables. The default is "pearson". Other options are "spearman" and "kendall".
- methodOrd
The method for the correlation of two ordinal variables. The default is "kendall", with Kendall's tau-c used. The other option is "spearman".
- methodNumOrd
The method for the correlation of a numeric and an ordinal variable. The default is "spearman", as shown in Usage. Other options are "pearson" and "kendall".
- methodNumNom
The method for the correlation of a numeric and a nominal variable. The default is "eta", which is the square root of the r-squared value from anova. The other option is "epsilon", which is the same, except with the numeric variable rank-transformed.
- methodNumBin
The method for the correlation of a numeric and a binary variable. The default is "pearson". The other option is "glass", which uses the Glass rank biserial correlation.
- testChisq
The method for the test of two nominal variables. The default is "chisq". The other option is "fisher".
- ci
If TRUE, calculates confidence intervals for methods requiring bootstrap. If FALSE, returns only those confidence intervals from methods not requiring bootstrap.
- conf
The confidence level for confidence intervals.
- R
The number of replications to use for bootstrap confidence intervals for applicable methods.
- correct
Passed to chisq.test.
- reportIncomplete
If FALSE, NA will be reported in cases where the calculation of the statistic fails at least once during the bootstrap procedure.
- na.action
If "na.omit", the function will use only complete cases, assessed on a bivariate basis. The other option is "na.pass".
- digits
The number of decimal places in the output of most statistics.
- pDigits
The number of decimal places in the output for p-values.
- ...
Other arguments.
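The descriptions of "eta" and "epsilon" for methodNumNom can be illustrated with base R. The following is a minimal sketch of the calculation as it is described in the argument list, not necessarily how the function computes it internally. The toy vectors y and g are hypothetical and exist only for this illustration.

```r
# Hypothetical toy data: a numeric variable and a three-level nominal factor
y <- c(1.2, 1.9, 2.4, 3.1, 3.8, 4.4, 5.0, 5.9, 6.3)
g <- factor(rep(c("A", "B", "C"), each = 3))

# "eta": square root of the r-squared from a one-way anova model
eta <- sqrt(summary(lm(y ~ g))$r.squared)

# "epsilon": the same calculation with the numeric variable rank-transformed
epsilon <- sqrt(summary(lm(rank(y) ~ g))$r.squared)

print(c(eta = eta, epsilon = epsilon))
```

Both values fall between 0 and 1, with values near 1 indicating that group membership accounts for most of the variation in the numeric variable.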
Details
It’s important that variables are assigned the correct class to get an appropriate measure of association. That is, factor variables should be of class "factor", not "character". Ordered factors should be ordered factors (and have their levels in the correct order!).
Date variables are treated as numeric.
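As a minimal sketch of preparing classes before calling the function, the base R code below converts character columns to factors, sets the level order of an ordinal variable, and shows the numeric value underlying a Date. The data frame df and its column names are hypothetical, and the as.numeric() call only illustrates what a numeric view of a Date looks like; it is not a claim about the function's internals.

```r
# Hypothetical data frame with character columns that should be factors
df <- data.frame(
  Size  = c("Small", "Large", "Medium", "Small"),
  Group = c("a", "b", "a", "b"),
  Day   = as.Date(c("2024-01-01", "2024-02-01", "2024-03-01", "2024-04-01")),
  stringsAsFactors = FALSE
)

# Nominal variable: class "factor", not "character"
df$Group <- factor(df$Group)

# Ordinal variable: ordered factor with levels listed in their intended order
df$Size <- factor(df$Size, levels = c("Small", "Medium", "Large"), ordered = TRUE)

# Check the first class of each column before computing associations
print(sapply(df, function(x) class(x)[1]))

# A Date viewed as a number is a count of days since 1970-01-01
print(as.numeric(df$Day))
```

Without the explicit levels argument, factor() would sort the levels alphabetically ("Large", "Medium", "Small"), which is not the intended order for an ordinal variable.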
The default measures of association tend to be of the "parametric" type, for example Pearson correlation where appropriate.
Nonparametric measures of association will be reported with the options methodNum = "spearman", methodNumNom = "epsilon", methodNumBin = "glass", methodNumOrd = "spearman".
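A call requesting the nonparametric options listed above might look like the following sketch. It assumes that correlation() is provided by the rcompanion package, which this page does not state, so the call is guarded and is skipped when that package is not installed. The data are a subset of the variables used in the Examples section.

```r
# Small data frame with numeric, ordinal, nominal, and binary variables
Length <- c(0.29, 0.25, NA, 0.40, 0.50, 0.57, 0.62, 0.88, 0.99, 0.90)
Rating <- factor(rep(c("Low", "Medium", "High"), c(3, 3, 4)),
                 levels = c("Low", "Medium", "High"), ordered = TRUE)
Color  <- factor(rep(c("Red", "Green", "Blue"), c(4, 4, 2)))
Flag   <- factor(rep(c(TRUE, FALSE, TRUE), c(5, 4, 1)))
Data   <- data.frame(Length, Rating, Color, Flag)

# Assumption: correlation() comes from the rcompanion package
if (requireNamespace("rcompanion", quietly = TRUE)) {
  print(rcompanion::correlation(
    Data,
    methodNum    = "spearman",
    methodNumNom = "epsilon",
    methodNumBin = "glass",
    methodNumOrd = "spearman"
  ))
}
```

Because methodNumOrd = "spearman" matches the default shown in Usage, only the other three arguments change the measures that are reported.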
Author
Salvatore Mangiafico, mangiafico@njaes.rutgers.edu
Examples
# Numeric variable with one missing value
Length = c(0.29, 0.25, NA, 0.40, 0.50, 0.57, 0.62, 0.88, 0.99, 0.90)
# Ordered factors, with levels listed in their intended order
Rating = factor(ordered=TRUE, levels=c("Low", "Medium", "High"),
                x = rep(c("Low", "Medium", "High"), c(3,3,4)))
Distance = factor(ordered=TRUE, levels=c("Low", "Medium", "High"),
                  x = rep(c("Low", "Medium", "High"), c(5,2,3)))
# Nominal factors (three levels) and binary factors (two levels)
Color = factor(rep(c("Red", "Green", "Blue"), c(4,4,2)))
Location = factor(rep(c("Home", "Away", "Other"), c(2,4,4)))
Flag = factor(rep(c(TRUE, FALSE, TRUE), c(5,4,1)))
Answer = factor(rep(c("Yes", "No", "Yes"), c(4,3,3)), levels=c("Yes", "No"))
# Date variable, which is treated as numeric
Start = seq(as.Date("2024-01-01"), by = "month", length.out = 10)
Data = data.frame(Length, Rating, Color, Flag, Answer, Location, Distance, Start)
correlation(Data)
#>        Var1     Var2              Type  N             Measure Statistic
#>  1   Length   Rating Numeric x Ordinal  9            Spearman     0.935
#>  2   Length    Color Numeric x Nominal  9                 Eta     0.913
#>  3   Length     Flag  Numeric x Binary  9             Pearson    -0.576
#>  4   Length   Answer  Numeric x Binary  9             Pearson    -0.101
#>  5   Length Location Numeric x Nominal  9                 Eta     0.919
#>  6   Length Distance Numeric x Ordinal  9            Spearman     0.935
#>  7   Length    Start Numeric x Numeric  9             Pearson     0.959
#>  8   Rating    Color Ordinal x Nominal 10             Freeman     0.812
#>  9   Rating     Flag  Ordinal x Binary 10 Glass rank biserial    -0.333
#> 10   Rating   Answer  Ordinal x Binary 10 Glass rank biserial     0.667
#> 11   Rating Location Ordinal x Nominal 10             Freeman     0.938
#> 12   Rating Distance Ordinal x Ordinal 10             Kendall     0.780
#> 13   Rating    Start Ordinal x Numeric 10            Spearman     0.944
#> 14    Color     Flag  Nominal x Binary 10              Cramer     0.692
#> 15    Color   Answer  Nominal x Binary 10              Cramer     0.802
#> 16    Color Location Nominal x Nominal 10              Cramer     0.612
#> 17    Color Distance Nominal x Ordinal 10             Freeman     0.812
#> 18    Color    Start Nominal x Numeric 10                 Eta     0.935
#> 19     Flag   Answer   Binary x Binary 10                 Phi    -0.356
#> 20     Flag Location  Binary x Nominal 10              Cramer     0.612
#> 21     Flag Distance  Binary x Ordinal 10 Glass rank biserial    -0.750
#> 22     Flag    Start  Binary x Numeric 10             Pearson    -0.569
#> 23   Answer Location  Binary x Nominal 10              Cramer     0.408
#> 24   Answer Distance  Binary x Ordinal 10 Glass rank biserial    -0.048
#> 25   Answer    Start  Binary x Numeric 10             Pearson     0.111
#> 26 Location Distance Nominal x Ordinal 10             Freeman     0.781
#> 27 Location    Start Nominal x Numeric 10                 Eta     0.933
#> 28 Distance    Start Ordinal x Numeric 10            Spearman     0.921
#>    Lower.CL Upper.CL             Test p.value Signif
#>  1    0.716    0.987         cor.test  0.0002    ***
#>  2    0.812    1.000            Anova  0.0047     **
#>  3   -0.897    0.142         cor.test  0.1044   n.s.
#>  4   -0.717    0.603         cor.test  0.7955   n.s.
#>  5    0.827    1.000            Anova  0.0037     **
#>  6    0.716    0.987         cor.test  0.0002    ***
#>  7    0.812    0.992         cor.test  0.0000   ****
#>  8       NA       NA Cochran-Armitage  0.0239      *
#>  9       NA       NA      wilcox.test  0.0708   n.s.
#> 10       NA       NA      wilcox.test  0.7172   n.s.
#> 11       NA       NA Cochran-Armitage  0.0116      *
#> 12    0.641    0.919 Linear by linear  0.0102      *
#> 13    0.775    0.987         cor.test  0.0000   ****
#> 14       NA       NA       chisq.test  0.0911   n.s.
#> 15       NA       NA       chisq.test  0.0402      *
#> 16       NA       NA       chisq.test  0.1117   n.s.
#> 17       NA       NA Cochran-Armitage  0.0251      *
#> 18    0.885    0.982            Anova  0.0007    ***
#> 19       NA       NA       chisq.test  0.2598   n.s.
#> 20       NA       NA       chisq.test  0.1534   n.s.
#> 21       NA       NA      wilcox.test  0.0491      *
#> 22   -0.882    0.095         cor.test  0.0862   n.s.
#> 23       NA       NA       chisq.test  0.4346   n.s.
#> 24       NA       NA      wilcox.test  1.0000   n.s.
#> 25   -0.557    0.692         cor.test  0.7597   n.s.
#> 26       NA       NA Cochran-Armitage  0.0181      *
#> 27    0.883    0.981            Anova  0.0008    ***
#> 28    0.694    0.982         cor.test  0.0002    ***