Compute some basic descriptive statistics.

Values of type factor, character and logical are treated as categorical. For logicals, the two categories are given the labels `Yes` for TRUE, and `No` for FALSE. Factor levels with zero counts are retained.
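
The conventions above can be illustrated with base R alone. This is a minimal sketch, not the function's own code; the variable names are hypothetical, and the `Yes`/`No` mapping is written out by hand on the assumption that it follows the description.

```r
# Illustrative only: base R, not the package's implementation.
flag <- c(TRUE, FALSE, TRUE, NA)

# Assumed mapping from the description: TRUE -> "Yes", FALSE -> "No".
flag_cat <- factor(flag, levels = c(TRUE, FALSE), labels = c("Yes", "No"))
yes_no_counts <- table(flag_cat)   # NA is not counted as a category

# A factor level with no observations is still tabulated, with count zero.
grp <- factor(c("a", "a", "b"), levels = c("a", "b", "c"))
grp_counts <- table(grp)
```

Here `grp_counts` has an entry for level `c` even though no observation carries it, which is the behaviour the description refers to as retaining zero-count levels.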

stats.default(x, quantile.type = 7, ...)

Arguments

x: A vector of numeric, factor, character or logical values.
quantile.type: An integer from 1 to 9, passed as the type argument to function quantile.
...: Further arguments (ignored).
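
To show why quantile.type matters, the following sketch calls base R's quantile directly with two different type values on a small vector. The data and variable names are made up for illustration; only the type argument of quantile is taken from the description above.

```r
# Base R only: the same probability, two quantile definitions.
v <- c(1, 2, 3, 4, 10)

q_type7 <- quantile(v, probs = 0.25, type = 7)  # R's default definition
q_type6 <- quantile(v, probs = 0.25, type = 6)  # an alternative definition

# The two estimates differ on small samples.
differ <- unname(q_type7) != unname(q_type6)
```

On small samples the choice of type can change the reported quartiles and tertiles noticeably, so it is worth setting explicitly when results must match another tool.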

Value

A list. For numeric x, the list contains the numeric elements:

N: the number of non-missing values
NMISS: the number of missing values
SUM: the sum of the non-missing values
MEAN: the mean of the non-missing values
SD: the standard deviation of the non-missing values
MIN: the minimum of the non-missing values
MAX: the maximum of the non-missing values
MEDIAN: the median of the non-missing values
CV: the percent coefficient of variation of the non-missing values
GMEAN: the geometric mean of the non-missing values if non-negative, or NA
GSD: the geometric standard deviation of the non-missing values if non-negative, or NA
GCV: the percent geometric coefficient of variation of the non-missing values if non-negative, or NA
qXX: various quantiles (percentiles) of the non-missing values (q01: 1%, q02.5: 2.5%, q05: 5%, q10: 10%, q25: 25% (first quartile), q33.3: 33.33333% (first tertile), q50: 50% (median, or second quartile), q66.7: 66.66667% (second tertile), q75: 75% (third quartile), q90: 90%, q95: 95%, q97.5: 97.5%, q99: 99%)
Q1: the first quartile of the non-missing values (alias q25)
Q2: the second quartile of the non-missing values (alias q50 or MEDIAN)
Q3: the third quartile of the non-missing values (alias q75)
IQR: the inter-quartile range of the non-missing values (i.e., Q3 - Q1)
T1: the first tertile of the non-missing values (alias q33.3)
T2: the second tertile of the non-missing values (alias q66.7)
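
The derived statistics in this list can be sketched with base R. The formulas below are the usual textbook definitions, assumed rather than taken from the function's source; they do agree with the CV and GCV values in the example output further down when applied to the printed MEAN, SD and GSD. The variable names are hypothetical.

```r
# Illustrative base-R formulas; positive data so the geometric statistics exist.
v <- c(1, 2, 4, 8)

cv    <- 100 * sd(v) / mean(v)              # percent coefficient of variation
gmean <- exp(mean(log(v)))                  # geometric mean
gsd   <- exp(sd(log(v)))                    # geometric standard deviation
gcv   <- 100 * sqrt(exp(sd(log(v))^2) - 1)  # percent geometric CV

# Inter-quartile range as Q3 - Q1, using the default quantile type (7).
iqr <- unname(quantile(v, 0.75) - quantile(v, 0.25))
```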

If x is categorical (i.e. factor, character or logical), the list contains a sublist for each category, where each sublist contains the numeric elements:

FREQ: the frequency count
PCT: the percent relative frequency, including NA in the denominator
PCTnoNA: the percent relative frequency, excluding NA from the denominator
NMISS: the number of missing values
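
The difference between PCT and PCTnoNA comes down to the denominator. The sketch below reproduces the idea in base R on made-up data; it is an illustration under that assumption, not the function's own code, and the variable names are hypothetical.

```r
# Illustrative base-R computation with one missing value out of six.
g <- factor(c("a", "b", "a", NA, "b", "a"))

freq      <- table(g)                     # counts per level, NA not counted
pct       <- 100 * freq / length(g)       # NA included in the denominator
pct_no_na <- 100 * freq / sum(!is.na(g))  # NA excluded from the denominator
nmiss     <- sum(is.na(g))                # number of missing values
```

The same arithmetic matches the factor example on this page: with 99 values of which 10 are missing, a count of 40 gives 40/99 for PCT and 40/89 for PCTnoNA.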

Examples

x <- exp(rnorm(100, 1, 1))
stats.default(x)
#> $N
#> [1] 100
#> 
#> $NMISS
#> [1] 0
#> 
#> $SUM
#> [1] 340.8459
#> 
#> $MEAN
#> [1] 3.408459
#> 
#> $SD
#> [1] 4.073481
#> 
#> $CV
#> [1] 119.5109
#> 
#> $GMEAN
#> [1] 2.128828
#> 
#> $GSD
#> [1] 2.630025
#> 
#> $GCV
#> [1] 124.3948
#> 
#> $MEDIAN
#>      0.5 
#> 2.310684 
#> 
#> $MIN
#> [1] 0.306127
#> 
#> $MAX
#> [1] 27.14736
#> 
#> $q01
#>      0.01 
#> 0.3147366 
#> 
#> $q02.5
#>     0.025 
#> 0.3705747 
#> 
#> $q05
#>      0.05 
#> 0.4809816 
#> 
#> $q10
#>       0.1 
#> 0.6293231 
#> 
#> $q25
#>      0.25 
#> 0.9246764 
#> 
#> $q50
#>      0.5 
#> 2.310684 
#> 
#> $q75
#>     0.75 
#> 4.010674 
#> 
#> $q90
#>      0.9 
#> 7.113366 
#> 
#> $q95
#>     0.95 
#> 9.728987 
#> 
#> $q97.5
#>    0.975 
#> 14.94395 
#> 
#> $q99
#>     0.99 
#> 20.24373 
#> 
#> $Q1
#>      0.25 
#> 0.9246764 
#> 
#> $Q2
#>      0.5 
#> 2.310684 
#> 
#> $Q3
#>     0.75 
#> 4.010674 
#> 
#> $IQR
#>     0.75 
#> 3.085998 
#> 
#> $T1
#>      1/3 
#> 1.500533 
#> 
#> $T2
#>     2/3 
#> 3.41731 
#> 

y <- factor(sample(0:1, 99, replace=TRUE), labels=c("Female", "Male"))
y[1:10] <- NA
stats.default(y)
#> $Female
#> $Female$FREQ
#> [1] 40
#> 
#> $Female$PCT
#> [1] 40.40404
#> 
#> $Female$PCTnoNA
#> [1] 44.94382
#> 
#> 
#> $Male
#> $Male$FREQ
#> [1] 49
#> 
#> $Male$PCT
#> [1] 49.49495
#> 
#> $Male$PCTnoNA
#> [1] 55.05618
#> 
#> 
stats.default(is.na(y))
#> $Yes
#> $Yes$FREQ
#> [1] 10
#> 
#> $Yes$PCT
#> [1] 10.10101
#> 
#> $Yes$PCTnoNA
#> [1] 10.10101
#> 
#> 
#> $No
#> $No$FREQ
#> [1] 89
#> 
#> $No$PCT
#> [1] 89.89899
#> 
#> $No$PCTnoNA
#> [1] 89.89899
#> 
#>