describe.by.Rd
Report basic summary statistics by a grouping variable. Useful if the grouping variable is some experimental variable and data are to be aggregated for plotting. Partly a wrapper for by and describe.
describeBy(x, group=NULL,mat=FALSE,type=3,digits=15,data,...)
describe.by(x, group=NULL,mat=FALSE,type=3,...) # deprecated
x: a data.frame or matrix. See note for statsBy.
group: a grouping variable or a list of grouping variables. (May be ignored if calling in formula mode.)
mat: provide a matrix output rather than a list
type: which type of skew and kurtosis should be found
digits: when giving matrix output, how many digits should be reported?
data: needed if using formula input
...: parameters to be passed to describe
To get descriptive statistics for several different grouping variables, make sure that group is a list. In the case of matrix output with multiple grouping variables, the grouping variable values are added to the output.
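The list-of-groups idea can be sketched with base R alone. This is not the describeBy implementation; it is a minimal illustration, assuming a made-up data frame `toy` and a hypothetical helper `basic_stats`, of how by handles a list of grouping variables and how the group values can be carried along when the results are stacked into a single data.frame.

```r
# Toy data, made up for illustration only
toy <- data.frame(
  g1 = c(1, 1, 1, 2, 2, 2, 1, 2),
  g2 = c("a", "a", "b", "b", "a", "b", "b", "a"),
  y  = c(10, 12, 11, 20, 22, 21, 13, 19)
)

# Hypothetical helper: a few of the statistics that describe reports
basic_stats <- function(v) {
  v <- v[!is.na(v)]
  c(n = length(v), mean = mean(v), sd = sd(v), median = median(v),
    min = min(v), max = max(v), se = sd(v) / sqrt(length(v)))
}

# List-style output: one element per combination of the grouping variables
by_list <- by(toy, list(g1 = toy$g1, g2 = toy$g2),
              function(d) basic_stats(d$y))

# Matrix-style output: stack the groups and keep the grouping values as columns
stacked <- aggregate(y ~ g1 + g2, data = toy, FUN = basic_stats)
des_df  <- data.frame(stacked[c("g1", "g2")], stacked$y)
des_df
```

The stacked form has one row per group combination, which is usually the more convenient shape for plotting or further aggregation.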
As of July, 2020, the grouping variable(s) may be specified in formula mode (see the examples).
The type parameter specifies which version of skew and kurtosis should be found. See describe for more details.
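For orientation, the three skew estimators commonly numbered 1, 2 and 3 (the Joanes and Gill definitions) can be written in a few lines of base R. This sketch assumes that type refers to those definitions; the function name `skew_by_type` is hypothetical and the code is not taken from the psych source.

```r
# Hypothetical helper; not part of psych
skew_by_type <- function(x, type = 3) {
  x <- x[!is.na(x)]
  n <- length(x)
  d <- x - mean(x)
  # Type 1: classical moment estimator g1 = m3 / m2^(3/2)
  g1 <- sqrt(n) * sum(d^3) / sum(d^2)^(3 / 2)
  switch(as.character(type),
    "1" = g1,
    "2" = g1 * sqrt(n * (n - 1)) / (n - 2),  # small-sample adjusted
    "3" = g1 * ((n - 1) / n)^(3 / 2),        # m3 divided by sd(x)^3
    stop("type must be 1, 2 or 3"))
}

x <- c(2, 3, 5, 8, 13, 21)
sapply(1:3, function(t) skew_by_type(x, type = t))
```

The three values differ only by factors that depend on n, so they converge as the group size grows; the choice matters mainly for small groups.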
An alternative function (statsBy) returns a list of means, n, and standard deviations for each group. This is particularly useful if finding weighted correlations of group means using cor.wt. More importantly, it does a proper within and between group decomposition of the correlation.
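The between and within pieces mentioned above can be illustrated in base R. This is only a sketch of the idea on made-up data (`toy` and all variable names are assumptions); it uses `cov.wt` from the stats package rather than cor.wt, and statsBy itself reports considerably more than this.

```r
# Toy data, made up for illustration only
toy <- data.frame(
  grp = rep(c("a", "b", "c"), times = c(3, 4, 5)),
  x   = c(1, 2, 3,  4, 5, 6, 7,  6, 7, 8, 9, 10),
  y   = c(2, 1, 4,  5, 7, 6, 8,  9, 8, 11, 10, 12)
)

# Per-group n, means and standard deviations
grp_n    <- table(toy$grp)
grp_mean <- aggregate(cbind(x, y) ~ grp, data = toy, FUN = mean)
grp_sd   <- aggregate(cbind(x, y) ~ grp, data = toy, FUN = sd)

# Between groups: correlation of the group means, weighted by group size
w <- as.numeric(grp_n[as.character(grp_mean$grp)]) / sum(grp_n)
r_between <- cov.wt(grp_mean[c("x", "y")], wt = w, cor = TRUE)$cor["x", "y"]

# Within groups: correlation of the deviations from each group's own mean
dev_x <- toy$x - ave(toy$x, toy$grp)
dev_y <- toy$y - ave(toy$y, toy$grp)
r_within <- cor(dev_x, dev_y)

c(between = r_between, within = r_within)
```

The raw correlation of x and y mixes these two sources, which is why the two values can differ substantially from each other and from `cor(toy$x, toy$y)`.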
cohen.d will work for two groups. It converts the data into mean differences and pools the within group standard deviations. It returns the cohen.d statistic as well as the multivariate generalization (Mahalanobis D).
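The mean-difference-over-pooled-SD idea can be sketched as follows. This uses the textbook pooled standard deviation for a single variable; the function name `cohen_d_pooled` is hypothetical, and cohen.d in psych may differ in detail (for example, in how it handles several variables at once).

```r
# Hypothetical helper: standardized mean difference for exactly two groups
cohen_d_pooled <- function(x, g) {
  ok <- !is.na(x) & !is.na(g)
  x <- x[ok]
  g <- factor(g[ok])
  stopifnot(nlevels(g) == 2)
  m <- tapply(x, g, mean)    # group means
  v <- tapply(x, g, var)     # group variances
  n <- tapply(x, g, length)  # group sizes
  # Pool the within-group variances, weighting by degrees of freedom
  sd_pooled <- sqrt(sum((n - 1) * v) / (sum(n) - 2))
  as.numeric((m[2] - m[1]) / sd_pooled)
}

score <- c(10, 12, 14, 16, 20, 22, 24, 26)
group <- rep(c(1, 2), each = 4)
cohen_d_pooled(score, group)
```

The sign follows the order of the factor levels: a positive value means the second level has the higher mean.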
A data.frame of the relevant statistics broken down by group:
item name
item number
number of valid cases
mean
standard deviation
median
mad: median absolute deviation (from the median)
minimum
maximum
skew
standard error
describe, statsBy, densityBy and violinBy, cohen.d, cohen.d.by, and cohen.d.ci as well as error.bars and error.bars.by for other graphical displays.
data(sat.act)
describeBy(sat.act,sat.act$gender) #just one grouping variable
#>
#> Descriptive statistics by group
#> group: 1
#> vars n mean sd median trimmed mad min max range skew
#> gender 1 247 1.00 0.00 1 1.00 0.00 1 1 0 NaN
#> education 2 247 3.00 1.54 3 3.12 1.48 0 5 5 -0.54
#> age 3 247 25.86 9.74 22 24.23 5.93 14 58 44 1.43
#> ACT 4 247 28.79 5.06 30 29.23 4.45 3 36 33 -1.06
#> SATV 5 247 615.11 114.16 630 622.07 118.61 200 800 600 -0.63
#> SATQ 6 245 635.87 116.02 660 645.53 94.89 300 800 500 -0.72
#> kurtosis se
#> gender NaN 0.00
#> education -0.60 0.10
#> age 1.43 0.62
#> ACT 1.89 0.32
#> SATV 0.13 7.26
#> SATQ -0.12 7.41
#> ------------------------------------------------------------
#> group: 2
#> vars n mean sd median trimmed mad min max range skew
#> gender 1 453 2.00 0.00 2 2.00 0.00 2 2 0 NaN
#> education 2 453 3.26 1.35 3 3.40 1.48 0 5 5 -0.74
#> age 3 453 25.45 9.37 22 23.70 5.93 13 65 52 1.77
#> ACT 4 453 28.42 4.69 29 28.63 4.45 15 36 21 -0.39
#> SATV 5 453 610.66 112.31 620 617.91 103.78 200 800 600 -0.65
#> SATQ 6 442 596.00 113.07 600 602.21 133.43 200 800 600 -0.58
#> kurtosis se
#> gender NaN 0.00
#> education 0.27 0.06
#> age 3.03 0.44
#> ACT -0.42 0.22
#> SATV 0.42 5.28
#> SATQ 0.13 5.38
describeBy(sat.act ~ gender) #describe the entire set using formula input
#>
#> Descriptive statistics by group
#> gender: 1
#> vars n mean sd median trimmed mad min max range skew
#> gender 1 247 1.00 0.00 1 1.00 0.00 1 1 0 NaN
#> education 2 247 3.00 1.54 3 3.12 1.48 0 5 5 -0.54
#> age 3 247 25.86 9.74 22 24.23 5.93 14 58 44 1.43
#> ACT 4 247 28.79 5.06 30 29.23 4.45 3 36 33 -1.06
#> SATV 5 247 615.11 114.16 630 622.07 118.61 200 800 600 -0.63
#> SATQ 6 245 635.87 116.02 660 645.53 94.89 300 800 500 -0.72
#> kurtosis se
#> gender NaN 0.00
#> education -0.60 0.10
#> age 1.43 0.62
#> ACT 1.89 0.32
#> SATV 0.13 7.26
#> SATQ -0.12 7.41
#> ------------------------------------------------------------
#> gender: 2
#> vars n mean sd median trimmed mad min max range skew
#> gender 1 453 2.00 0.00 2 2.00 0.00 2 2 0 NaN
#> education 2 453 3.26 1.35 3 3.40 1.48 0 5 5 -0.74
#> age 3 453 25.45 9.37 22 23.70 5.93 13 65 52 1.77
#> ACT 4 453 28.42 4.69 29 28.63 4.45 15 36 21 -0.39
#> SATV 5 453 610.66 112.31 620 617.91 103.78 200 800 600 -0.65
#> SATQ 6 442 596.00 113.07 600 602.21 133.43 200 800 600 -0.58
#> kurtosis se
#> gender NaN 0.00
#> education 0.27 0.06
#> age 3.03 0.44
#> ACT -0.42 0.22
#> SATV 0.42 5.28
#> SATQ 0.13 5.38
describeBy(SATV + SATQ ~ gender,data =sat.act) #specify the data set if using formula
#>
#> Descriptive statistics by group
#> gender: 1
#> vars n mean sd median trimmed mad min max range skew kurtosis
#> SATV 1 247 615.11 114.16 630 622.07 118.61 200 800 600 -0.63 0.13
#> SATQ 2 245 635.87 116.02 660 645.53 94.89 300 800 500 -0.72 -0.12
#> se
#> SATV 7.26
#> SATQ 7.41
#> ------------------------------------------------------------
#> gender: 2
#> vars n mean sd median trimmed mad min max range skew kurtosis
#> SATV 1 453 610.66 112.31 620 617.91 103.78 200 800 600 -0.65 0.42
#> SATQ 2 442 596.00 113.07 600 602.21 133.43 200 800 600 -0.58 0.13
#> se
#> SATV 5.28
#> SATQ 5.38
#describeBy(sat.act,list(sat.act$gender,sat.act$education)) #two grouping variables
describeBy(sat.act ~ gender + education) #two grouping variables
#>
#> Descriptive statistics by group
#> gender: 1
#> education: 0
#> vars n mean sd median trimmed mad min max range skew
#> gender 1 27 1.00 0.00 1 1.00 0.00 1 1 0 NaN
#> education 2 27 0.00 0.00 0 0.00 0.00 0 0 0 NaN
#> age 3 27 16.93 1.04 17 17.04 1.48 14 18 4 -0.86
#> ACT 4 27 29.04 5.00 29 29.22 5.93 20 36 16 -0.30
#> SATV 5 27 640.07 132.24 670 646.17 177.91 400 800 400 -0.29
#> SATQ 6 27 642.67 127.90 660 647.91 177.91 400 800 400 -0.24
#> kurtosis se
#> gender NaN 0.00
#> education NaN 0.00
#> age 0.34 0.20
#> ACT -1.13 0.96
#> SATV -1.40 25.45
#> SATQ -1.36 24.61
#> ------------------------------------------------------------
#> gender: 2
#> education: 0
#> vars n mean sd median trimmed mad min max range skew
#> gender 1 30 2.00 0.00 2 2.00 0.00 2 2 0 NaN
#> education 2 30 0.00 0.00 0 0.00 0.00 0 0 0 NaN
#> age 3 30 16.97 1.07 17 17.12 0.74 13 18 5 -1.75
#> ACT 4 30 26.07 5.06 26 25.92 5.93 15 36 21 0.08
#> SATV 5 30 595.30 123.46 595 597.08 148.26 350 800 450 -0.09
#> SATQ 6 29 599.72 123.20 600 600.96 148.26 333 800 467 -0.09
#> kurtosis se
#> gender NaN 0.00
#> education NaN 0.00
#> age 4.13 0.19
#> ACT -0.56 0.92
#> SATV -0.81 22.54
#> SATQ -0.99 22.88
#> ------------------------------------------------------------
#> gender: 1
#> education: 1
#> vars n mean sd median trimmed mad min max range skew
#> gender 1 20 1.00 0.00 1 1.00 0.00 1 1 0 NaN
#> education 2 20 1.00 0.00 1 1.00 0.00 1 1 0 NaN
#> age 3 20 19.65 6.12 18 18.19 0.00 17 45 28 3.55
#> ACT 4 20 26.70 7.11 28 27.12 8.15 15 35 20 -0.30
#> SATV 5 20 603.00 141.24 600 611.25 185.32 300 780 480 -0.39
#> SATQ 6 19 625.84 95.87 650 630.94 88.96 400 765 365 -0.66
#> kurtosis se
#> gender NaN 0.00
#> education NaN 0.00
#> age 11.78 1.37
#> ACT -1.51 1.59
#> SATV -1.12 31.58
#> SATQ -0.47 21.99
#> ------------------------------------------------------------
#> gender: 2
#> education: 1
#> vars n mean sd median trimmed mad min max range skew
#> gender 1 25 2.00 0.00 2 2.00 0.00 2 2 0 NaN
#> education 2 25 1.00 0.00 1 1.00 0.00 1 1 0 NaN
#> age 3 25 19.32 4.62 18 18.14 0.00 17 37 20 2.86
#> ACT 4 25 28.12 5.13 27 28.33 4.45 18 36 18 -0.21
#> SATV 5 25 597.00 119.38 610 600.76 133.43 350 799 449 -0.31
#> SATQ 6 24 592.54 140.83 625 606.60 111.19 230 799 569 -0.93
#> kurtosis se
#> gender NaN 0.00
#> education NaN 0.00
#> age 7.27 0.92
#> ACT -0.78 1.03
#> SATV -0.95 23.88
#> SATQ 0.20 28.75
#> ------------------------------------------------------------
#> gender: 1
#> education: 2
#> vars n mean sd median trimmed mad min max range skew
#> gender 1 23 1.00 0.00 1 1.00 0.00 1 1 0 NaN
#> education 2 23 2.00 0.00 2 2.00 0.00 2 2 0 NaN
#> age 3 23 25.26 8.68 22 23.58 4.45 18 55 37 1.94
#> ACT 4 23 26.65 6.39 28 27.68 4.45 3 32 29 -2.14
#> SATV 5 23 560.00 152.29 600 570.53 148.26 200 800 600 -0.53
#> SATQ 6 23 569.13 160.65 600 575.79 177.91 300 800 500 -0.36
#> kurtosis se
#> gender NaN 0.00
#> education NaN 0.00
#> age 3.63 1.81
#> ACT 5.39 1.33
#> SATV -0.59 31.75
#> SATQ -1.44 33.50
#> ------------------------------------------------------------
#> gender: 2
#> education: 2
#> vars n mean sd median trimmed mad min max range skew
#> gender 1 21 2.00 0.00 2 2.00 0.00 2 2 0 NaN
#> education 2 21 2.00 0.00 2 2.00 0.00 2 2 0 NaN
#> age 3 21 30.10 12.22 26 28.41 10.38 18 57 39 1.16
#> ACT 4 21 27.33 5.23 28 27.53 4.45 15 36 21 -0.32
#> SATV 5 21 593.57 115.34 600 598.24 118.61 375 770 395 -0.44
#> SATQ 6 20 586.50 120.96 585 587.81 163.09 375 800 425 0.01
#> kurtosis se
#> gender NaN 0.00
#> education NaN 0.00
#> age 0.01 2.67
#> ACT -0.34 1.14
#> SATV -0.91 25.17
#> SATQ -1.11 27.05
#> ------------------------------------------------------------
#> gender: 1
#> education: 3
#> vars n mean sd median trimmed mad min max range skew
#> gender 1 80 1.00 0.00 1 1.00 0.00 1 1 0 NaN
#> education 2 80 3.00 0.00 3 3.00 0.00 3 3 0 NaN
#> age 3 80 20.81 3.06 20 20.28 1.48 17 34 17 2.00
#> ACT 4 80 28.56 5.03 30 28.84 5.19 17 36 19 -0.45
#> SATV 5 80 617.44 111.79 630 624.45 111.19 300 800 500 -0.62
#> SATQ 6 79 642.59 118.28 680 653.15 118.61 300 800 500 -0.81
#> kurtosis se
#> gender NaN 0.00
#> education NaN 0.00
#> age 4.55 0.34
#> ACT -0.92 0.56
#> SATV -0.06 12.50
#> SATQ -0.17 13.31
#> ------------------------------------------------------------
#> gender: 2
#> education: 3
#> vars n mean sd median trimmed mad min max range skew
#> gender 1 195 2.00 0.00 2 2.00 0.00 2 2 0 NaN
#> education 2 195 3.00 0.00 3 3.00 0.00 3 3 0 NaN
#> age 3 195 21.09 4.75 20 20.04 1.48 17 46 29 3.41
#> ACT 4 195 28.18 4.78 29 28.43 4.45 16 36 20 -0.46
#> SATV 5 195 609.96 119.78 620 619.57 118.61 200 800 600 -0.81
#> SATQ 6 190 590.89 114.46 600 598.94 118.61 200 800 600 -0.72
#> kurtosis se
#> gender NaN 0.00
#> education NaN 0.00
#> age 12.83 0.34
#> ACT -0.47 0.34
#> SATV 0.66 8.58
#> SATQ 0.38 8.30
#> ------------------------------------------------------------
#> gender: 1
#> education: 4
#> vars n mean sd median trimmed mad min max range skew
#> gender 1 51 1.00 0.00 1 1.00 0.00 1 1 0 NaN
#> education 2 51 4.00 0.00 4 4.00 0.00 4 4 0 NaN
#> age 3 51 32.22 9.03 29 30.78 8.90 23 57 34 1.20
#> ACT 4 51 28.94 4.42 29 29.34 4.45 16 36 20 -0.74
#> SATV 5 51 620.31 81.72 620 623.32 88.96 430 800 370 -0.26
#> SATQ 6 51 635.90 104.12 640 642.46 88.96 400 800 400 -0.46
#> kurtosis se
#> gender NaN 0.00
#> education NaN 0.00
#> age 0.63 1.27
#> ACT 0.12 0.62
#> SATV -0.29 11.44
#> SATQ -0.45 14.58
#> ------------------------------------------------------------
#> gender: 2
#> education: 4
#> vars n mean sd median trimmed mad min max range skew
#> gender 1 87 2.00 0.00 2 2.00 0.00 2 2 0 NaN
#> education 2 87 4.00 0.00 4 4.00 0.00 4 4 0 NaN
#> age 3 87 29.08 7.76 26 27.83 5.93 21 52 31 1.26
#> ACT 4 87 29.45 4.32 30 29.59 4.45 19 36 17 -0.27
#> SATV 5 87 614.98 106.62 620 621.39 88.96 300 800 500 -0.58
#> SATQ 6 86 597.59 106.24 600 605.76 118.61 300 800 500 -0.71
#> kurtosis se
#> gender NaN 0.00
#> education NaN 0.00
#> age 0.70 0.83
#> ACT -0.67 0.46
#> SATV 0.28 11.43
#> SATQ 0.20 11.46
#> ------------------------------------------------------------
#> gender: 1
#> education: 5
#> vars n mean sd median trimmed mad min max range skew
#> gender 1 46 1.00 0.00 1.0 1.00 0.00 1 1 0 NaN
#> education 2 46 5.00 0.00 5.0 5.00 0.00 5 5 0 NaN
#> age 3 46 35.85 10.00 35.5 35.13 11.12 22 58 36 0.47
#> ACT 4 46 30.83 3.11 32.0 30.95 2.97 25 36 11 -0.38
#> SATV 5 46 623.48 99.58 645.0 631.18 96.37 390 770 380 -0.61
#> SATQ 6 46 657.83 89.61 680.0 661.71 103.78 475 800 325 -0.45
#> kurtosis se
#> gender NaN 0.00
#> education NaN 0.00
#> age -0.67 1.48
#> ACT -0.81 0.46
#> SATV -0.43 14.68
#> SATQ -0.77 13.21
#> ------------------------------------------------------------
#> gender: 2
#> education: 5
#> vars n mean sd median trimmed mad min max range skew
#> gender 1 95 2.00 0.00 2 2.00 0.00 2 2 0 NaN
#> education 2 95 5.00 0.00 5 5.00 0.00 5 5 0 NaN
#> age 3 95 34.34 10.67 30 32.74 8.90 22 65 43 1.18
#> ACT 4 95 29.01 4.19 29 29.14 4.45 18 36 18 -0.31
#> SATV 5 95 620.39 95.72 620 623.61 74.13 300 800 500 -0.46
#> SATQ 6 93 606.72 105.55 600 608.93 148.26 350 800 450 -0.14
#> kurtosis se
#> gender NaN 0.00
#> education NaN 0.00
#> age 0.61 1.09
#> ACT -0.73 0.43
#> SATV 0.43 9.82
#> SATQ -0.94 10.95
des.mat <- describeBy(age ~ education,mat=TRUE,data = sat.act) #matrix (data.frame) output
des.mat <- describeBy(age ~ education + gender, data=sat.act,
mat=TRUE,digits=2) #matrix output rounded to 2 decimals