Calculate univariate or multivariate (Mardia's test) skew and kurtosis for a vector, matrix, or data.frame

Find the skew and kurtosis for each variable in a data.frame or matrix. Unlike skewness and kurtosis in e1071, skew and kurtosi calculate a separate value for each variable (column) of a data.frame or matrix. mardia applies Mardia's tests for multivariate skew and kurtosis.

skew(x, na.rm = TRUE,type=3)
kurtosi(x, na.rm = TRUE,type=3)
mardia(x,na.rm = TRUE,plot=TRUE)

Arguments

x: A data.frame or matrix
na.rm: Logical. If TRUE (the default), missing values are removed before the statistics are computed
type: See the discussion in describe

plot: Plot the expected normal distribution values versus the Mahalanobis distance of the subjects.

Details

Given a matrix or data.frame x, find the skew or kurtosis for each column (for skew and kurtosi) or the multivariate skew and kurtosis (for mardia).

As of version 1.2.3, three different estimators are available when finding the skew and the kurtosis. These match the choices available in skewness and kurtosis in the e1071 package (see Joanes and Gill (1998) for the advantages of each one).

If we define \(m_r = [\sum (X - \bar{X})^r]/n\), where \(\bar{X}\) is the mean of X, then

Type 1 finds skewness and kurtosis by \(g_1 = m_3/(m_2)^{3/2} \) and \(g_2 = m_4/(m_2)^2 -3\).

Type 2 is \(G_1 = g_1 \sqrt{n(n-1)}/(n-2)\) and \(G_2 = (n-1)[(n+1) g_2 + 6]/((n-2)(n-3))\).

Type 3 is \(b_1 = [(n-1)/n]^{3/2} m_3/m_2^{3/2}\) and \(b_2 = [(n-1)/n]^{2} m_4/m_2^2 - 3\).

For consistency with e1071 and with Joanes and Gill (1998), the types are now defined as above.
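
The three definitions can be sketched directly in base R. This is only an illustration written from the Joanes and Gill (1998) formulas, with type 3 kurtosis taken as \((g_2 + 3)[(n-1)/n]^2 - 3\); `moment_stats` is a hypothetical helper name and is not part of psych or e1071.

```r
# Hand-rolled sketch of the three estimators for one numeric vector.
# moment_stats() is a hypothetical helper, not part of psych or e1071.
moment_stats <- function(x, type = 3, na.rm = TRUE) {
  if (na.rm) x <- x[!is.na(x)]
  n  <- length(x)
  d  <- x - mean(x)
  m2 <- sum(d^2) / n    # central moments: m_r = sum((x - mean)^r) / n
  m3 <- sum(d^3) / n
  m4 <- sum(d^4) / n
  g1 <- m3 / m2^(3/2)   # type 1 skew
  g2 <- m4 / m2^2 - 3   # type 1 (excess) kurtosis
  switch(as.character(type),
    "1" = c(skew = g1, kurtosis = g2),
    "2" = c(skew = g1 * sqrt(n * (n - 1)) / (n - 2),
            kurtosis = (n - 1) * ((n + 1) * g2 + 6) / ((n - 2) * (n - 3))),
    "3" = c(skew = g1 * ((n - 1) / n)^(3/2),
            kurtosis = (g2 + 3) * ((n - 1) / n)^2 - 3),
    stop("type must be 1, 2, or 3"))
}

# One column per variable, rows are skew and kurtosis
round(sapply(attitude, moment_stats, type = 3), 2)
```

Because types 2 and 3 are rescalings of type 1, the three estimates converge as n grows; the choice matters mainly for small samples such as the 30 cases in attitude.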

However, from revision 1.0.93 to 1.2.3, kurtosi by default gave an unbiased estimate of the kurtosis (DeCarlo, 1997). Prior versions used a different equation which produced a biased estimate. (See the kurtosis function in the e1071 package for the distinction between these two formulae. In those releases the default, type 1, gave what is called type 2 in e1071, and the other was their type 3.) For comparison with previous releases, specifying type = 2 gave the old estimate. These type numbers have since been changed to the definitions given above.

Value

skew: if the input is a matrix or data.frame, skew is a vector of skews, one per column
kurtosi: if the input is a matrix or data.frame, kurtosi is a vector of kurtosis values, one per column
bp1: Mardia's bp1 estimate of multivariate skew
bp2: Mardia's bp2 estimate of multivariate kurtosis
skew: Mardia's skew statistic
small.skew: Mardia's small sample skew statistic
p.skew: Probability of skew
p.small: Probability of small.skew
kurtosis: Mardia's multivariate kurtosis statistic
p.kurtosis: Probability of kurtosis statistic
D: Mahalanobis distance of cases from centroid
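
To make the quantities in this list concrete, the following base R sketch computes Mardia's (1970) statistics by hand. It is not the psych source code: `mardia_sketch` is a hypothetical helper, the element names of its result are chosen for illustration, and it assumes the covariance matrix is computed with divisor n, so its values need not match mardia exactly.

```r
# Sketch of Mardia's (1970) multivariate skew and kurtosis.
# mardia_sketch() is a hypothetical helper, not the psych implementation.
mardia_sketch <- function(x) {
  x  <- as.matrix(na.omit(x))        # drop incomplete cases
  n  <- nrow(x)
  p  <- ncol(x)
  xc <- scale(x, center = TRUE, scale = FALSE)
  S  <- crossprod(xc) / n            # assumption: covariance with divisor n
  D  <- xc %*% solve(S) %*% t(xc)    # n x n cross-products in the S^-1 metric
  b1p <- sum(D^3) / n^2              # multivariate skew
  b2p <- sum(diag(D)^2) / n          # multivariate kurtosis
  df  <- p * (p + 1) * (p + 2) / 6
  skew <- n * b1p / 6                # approximately chi-square on df
  k <- (p + 1) * (n + 1) * (n + 3) / (n * ((n + 1) * (p + 1) - 6))
  small.skew <- skew * k             # small-sample correction
  # approximately standard normal under multivariate normality
  kurtosis <- (b2p - p * (p + 2)) / sqrt(8 * p * (p + 2) / n)
  list(b1p = b1p, b2p = b2p,
       skew = skew,
       small.skew = small.skew,
       p.skew = pchisq(skew, df, lower.tail = FALSE),
       p.small = pchisq(small.skew, df, lower.tail = FALSE),
       kurtosis = kurtosis,
       p.kurtosis = 2 * pnorm(-abs(kurtosis)),
       d = sqrt(diag(D)))            # Mahalanobis distance of each case
}

m <- mardia_sketch(attitude)
str(m)
```

The skew statistic is referred to a chi-square distribution with p(p+1)(p+2)/6 degrees of freedom, and the kurtosis statistic to a standard normal, which is where the reported probabilities come from.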

References

Joanes, D.N. and Gill, C.A. (1998). Comparing measures of sample skewness and kurtosis. The Statistician, 47, 183-189.

DeCarlo, L. (1997). On the meaning and use of kurtosis. Psychological Methods, 2(3), 292-307.

Mardia, K.V. (1970). Measures of multivariate skewness and kurtosis with applications. Biometrika, 57(3), 519-530.

Author

William Revelle

Note

The mean function supplies means for the columns of a data.frame, but the overall mean for a matrix. mean throws a warning for non-numeric data, whereas colMeans stops with non-numeric data. Thus, the function uses either mean (for data frames) or colMeans (for matrices). This is true for skew and kurtosi as well.

Note

Probability values less than 10^-300 are set to 0.

Examples

round(skew(attitude),2)   #type 3 (default)
#> [1] -0.36 -0.22  0.38 -0.05  0.20 -0.87  0.85
round(kurtosi(attitude),2)  #type 3 (default)
#>     rating complaints privileges   learning     raises   critical    advance 
#>      -0.77      -0.68      -0.41      -1.22      -0.60       0.17       0.47 
#for the differences between the three types of skew and kurtosis:
round(skew(attitude,type=1),2)  #type 1
#> [1] -0.38 -0.23  0.40 -0.06  0.21 -0.91  0.89
round(skew(attitude,type=2),2)  #type 2 
#> [1] -0.40 -0.24  0.42 -0.06  0.22 -0.96  0.94
mardia(attitude)

#> Call: mardia(x = attitude)
#> 
#> Mardia tests of multivariate skew and kurtosis
#> Use describe(x) the to get univariate tests
#> n.obs = 30   num.vars =  7 
#> b1p =  20.09   skew =  100.45  with probability  <=  0.11
#>  small sample skew =  113.23  with probability <=  0.018
#> b2p =  61.91   kurtosis =  -0.27  with probability <=  0.79
x <- matrix(rnorm(1000),ncol=10)
describe(x)
#>     vars   n  mean   sd median trimmed  mad   min  max range  skew kurtosis
#> X1     1 100 -0.01 1.06   0.01    0.00 1.23 -3.67 2.61  6.29 -0.26     0.37
#> X2     2 100 -0.05 0.91  -0.07   -0.06 0.93 -2.58 1.91  4.49  0.03    -0.18
#> X3     3 100 -0.04 1.09  -0.05    0.01 1.09 -4.68 2.12  6.80 -0.82     2.01
#> X4     4 100  0.15 1.01   0.11    0.15 0.96 -2.82 3.07  5.90 -0.05     0.40
#> X5     5 100  0.01 1.07  -0.05    0.01 1.15 -2.30 2.26  4.56  0.04    -0.70
#> X6     6 100  0.09 0.97   0.10    0.11 0.89 -2.65 2.37  5.02 -0.24     0.00
#> X7     7 100  0.04 1.04  -0.04    0.04 0.88 -2.27 2.85  5.13  0.08    -0.11
#> X8     8 100  0.02 1.06  -0.05   -0.03 1.06 -2.58 2.56  5.14  0.25    -0.29
#> X9     9 100  0.16 0.99   0.15    0.16 1.10 -2.12 3.01  5.13  0.09    -0.35
#> X10   10 100  0.01 0.94  -0.02    0.00 0.98 -2.05 2.96  5.02  0.25     0.01
#>       se
#> X1  0.11
#> X2  0.09
#> X3  0.11
#> X4  0.10
#> X5  0.11
#> X6  0.10
#> X7  0.10
#> X8  0.11
#> X9  0.10
#> X10 0.09
mardia(x)

#> Call: mardia(x = x)
#> 
#> Mardia tests of multivariate skew and kurtosis
#> Use describe(x) the to get univariate tests
#> n.obs = 100   num.vars =  10 
#> b1p =  14.29   skew =  238.23  with probability  <=  0.19
#>  small sample skew =  246.71  with probability <=  0.1
#> b2p =  116.7   kurtosis =  -1.06  with probability <=  0.29
