winsor.Rd
Trimmed means and Winsorized means are among the robust estimates of central tendency. This function finds the Winsorized scores: values below the trim quantile and above the 1 - trim quantile are replaced with the values at those quantiles. Means, standard deviations, and variances are then found from the Winsorized scores.
winsor(x, trim = 0.2, na.rm = TRUE)
winsor.mean(x, trim = 0.2, na.rm = TRUE)
winsor.means(x, trim = 0.2, na.rm = TRUE)
winsor.sd(x, trim = 0.2, na.rm = TRUE)
winsor.var(x, trim = 0.2, na.rm = TRUE)
Among the many robust estimates of central tendency, some authors recommend the Winsorized mean. Rather than simply dropping the top and bottom trim proportion of the scores, as a trimmed mean does, Winsorizing replaces these extreme values with the values at the trim and 1 - trim quantiles.
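The replacement described above can be sketched in base R. This is a minimal illustration, not the package's implementation: `winsorize_sketch` is a hypothetical helper, and it assumes the cut points are the quantiles returned by `quantile()` with its default settings, which is consistent with the 4.8 and 16.2 values shown in the examples.

```r
# Hypothetical helper for illustration only; not part of the package.
winsorize_sketch <- function(x, trim = 0.2, na.rm = TRUE) {
  stopifnot(is.numeric(x), trim >= 0, trim <= 0.5)
  # Cut points at the trim and 1 - trim quantiles (default quantile type assumed)
  cuts <- stats::quantile(x, probs = c(trim, 1 - trim),
                          na.rm = na.rm, names = FALSE)
  # Pull values below the lower cut up, and values above the upper cut down;
  # missing values stay missing
  pmin(pmax(x, cuts[1]), cuts[2])
}

w <- winsorize_sketch(1:20)
range(w)
#> [1]  4.8 16.2
mean(w)
#> [1] 10.5
```

Unlike trimming, the vector keeps its original length, so the extreme cases still contribute to the mean, but only at the cut-point values.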
A scalar or vector of Winsorized scores, or of Winsorized means, standard deviations, or variances (depending upon the call).
Wilcox, Rand R. (2005). Introduction to Robust Estimation and Hypothesis Testing. Elsevier/Academic Press, Amsterdam; Boston.
data(sat.act)
winsor.means(sat.act) #compare with the means of the winsorized scores
#>     gender  education        age        ACT       SATV       SATQ
#>   1.647143   3.391429  23.954286  28.957143 615.570000 614.521106
y <- winsor(sat.act)
describe(y)
#>           vars   n   mean    sd median trimmed    mad   min max range  skew
#> gender       1 700   1.65  0.48      2    1.68   0.00   1.0   2   1.0 -0.61
#> education    2 700   3.39  1.03      3    3.36   1.48   2.0   5   3.0  0.27
#> age          3 700  23.95  5.11     22   23.57   4.45  19.0  32  13.0  0.56
#> ACT          4 700  28.96  3.18     29   28.97   4.45  24.8  33   8.2 -0.06
#> SATV         5 700 615.57 72.79    620  618.21 118.61 510.0 700 190.0 -0.24
#> SATQ         6 687 614.52 80.88    620  616.87 118.61 500.0 710 210.0 -0.24
#>           kurtosis   se
#> gender       -1.62 0.02
#> education    -1.07 0.04
#> age          -1.30 0.19
#> ACT          -1.56 0.12
#> SATV         -1.43 2.75
#> SATQ         -1.47 3.09
xy <- data.frame(sat.act,y)
#pairs.panels(xy) #to see the effect of winsorizing
x <- matrix(1:100,ncol=5)
winsor(x)
#>       [,1] [,2] [,3] [,4] [,5]
#>  [1,]  4.8 24.8 44.8 64.8 84.8
#>  [2,]  4.8 24.8 44.8 64.8 84.8
#>  [3,]  4.8 24.8 44.8 64.8 84.8
#>  [4,]  4.8 24.8 44.8 64.8 84.8
#>  [5,]  5.0 25.0 45.0 65.0 85.0
#>  [6,]  6.0 26.0 46.0 66.0 86.0
#>  [7,]  7.0 27.0 47.0 67.0 87.0
#>  [8,]  8.0 28.0 48.0 68.0 88.0
#>  [9,]  9.0 29.0 49.0 69.0 89.0
#> [10,] 10.0 30.0 50.0 70.0 90.0
#> [11,] 11.0 31.0 51.0 71.0 91.0
#> [12,] 12.0 32.0 52.0 72.0 92.0
#> [13,] 13.0 33.0 53.0 73.0 93.0
#> [14,] 14.0 34.0 54.0 74.0 94.0
#> [15,] 15.0 35.0 55.0 75.0 95.0
#> [16,] 16.0 36.0 56.0 76.0 96.0
#> [17,] 16.2 36.2 56.2 76.2 96.2
#> [18,] 16.2 36.2 56.2 76.2 96.2
#> [19,] 16.2 36.2 56.2 76.2 96.2
#> [20,] 16.2 36.2 56.2 76.2 96.2
winsor.means(x)
#> [1] 10.5 30.5 50.5 70.5 90.5
y <- 1:11
winsor(y,trim=.5)
#> [1] 6 6 6 6 6 6 6 6 6 6 6
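One way to see why the last call returns a constant vector: with trim = 0.5 the lower and upper cut points are the same quantile, the median, so every score is replaced by it. The snippet below is a base-R sketch under the assumption that Winsorizing clamps scores at the quantiles returned by `quantile()`; it does not call `winsor()` itself.

```r
y <- 1:11
# With trim = 0.5, the trim and 1 - trim quantiles coincide at the median
cuts <- quantile(y, probs = c(0.5, 1 - 0.5), names = FALSE)
# Clamping to the interval [cuts[1], cuts[2]] leaves a single possible value
clamped <- pmin(pmax(y, cuts[1]), cuts[2])
all(clamped == median(y))
#> [1] TRUE
```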