huberM.Rd(Generalized) Huber M-estimator of location with MAD scale, being
sensible also when the scale is zero where huber()
returns an error.
huberM(x, k = 1.5, weights = NULL, tol = 1e-06,
mu = if(is.null(weights)) median(x) else wgt.himedian(x, weights),
s = if(is.null(weights)) mad(x, center=mu)
else wgt.himedian(abs(x - mu), weights),
se = FALSE,
warn0scale = getOption("verbose"))numeric vector.
positive factor; the algorithm winsorizes at k
standard deviations.
numeric vector of non-negative weights of same length
as x, or NULL.
convergence tolerance.
initial location estimator.
scale estimator held constant through the iterations.
logical indicating if the standard error should be computed
and returned (as SE component). Currently only available
when weights is NULL.
logical; if true, and s is 0 and
length(x) > 1, this will be warned about.
list of location and scale parameters, and number of iterations used.
location estimate
the s argument, typically the mad.
the number of “Huber iterations” used.
Note that currently, when non-NULL weights are
specified, the default for initial location mu and scale
s is wgt.himedian, where strictly speaking a
weighted “non-hi” median should be used for consistency.
Since s is not updated, the results slightly differ, see the
examples below.
When se = TRUE, the standard error is computed using the
\(\tau\) correction factor but no finite sample correction.
Huber, P. J. (1981) Robust Statistics. Wiley.
huberM(c(1:9, 1000))
#> $mu
#> [1] 5.617749
#>
#> $s
#> [1] 3.7065
#>
#> $it
#> [1] 6
#>
#> $SE
#> [1] NA
#>
mad (c(1:9, 1000))
#> [1] 3.7065
mad (rep(9, 100))
#> [1] 0
huberM(rep(9, 100))
#> $mu
#> [1] 9
#>
#> $s
#> [1] 0
#>
#> $it
#> [1] 0
#>
#> $SE
#> [1] NA
#>
## When you have "binned" aka replicated observations:
set.seed(7)
x <- c(round(rnorm(1000),1), round(rnorm(50, m=10, sd = 10)))
t.x <- table(x) # -> unique values and multiplicities
x.uniq <- as.numeric(names(t.x)) ## == sort(unique(x))
x.mult <- unname(t.x)
str(Hx <- huberM(x.uniq, weights = x.mult), digits = 7)
#> List of 4
#> $ mu: num 0.05013769
#> $ s : num 0.7
#> $ it: int 11
#> $ SE: num NA
str(Hx. <- huberM(x, s = Hx$s, se=TRUE), digits = 7) ## should be ~= Hx
#> List of 4
#> $ mu: num 0.05013769
#> $ s : num 0.7
#> $ it: int 11
#> $ SE: num 0.03382601
stopifnot(all.equal(Hx[-4], Hx.[-4]))
str(Hx2 <- huberM(x, se=TRUE), digits = 7)## somewhat different, since 's' differs
#> List of 4
#> $ mu: num 0.05891084
#> $ s : num 1.03782
#> $ it: int 7
#> $ SE: num 0.03379949
## Confirm correctness of std.error :
# \donttest{
system.time(
SS <- replicate(10000, vapply(huberM(rnorm(400), se=TRUE), as.double, 1.))
) # ~ 2.8 seconds (was 12.2 s)
#> user system elapsed
#> 1.704 0.012 1.716
rbind(mean(SS["SE",]), sd(SS["mu",]))# both ~ 0.0508
#> [,1]
#> [1,] 0.05079230
#> [2,] 0.05084461
stopifnot(all.equal(mean(SS["SE",]),
sd ( SS["mu",]), tolerance= 0.002))
# }