Median absolute deviation (MAD) outlier in Time Series

hampel(x, k, t0 = 3)

Arguments

x

numeric vector representing a time series

k

window length 2*k+1 in indices

t0

threshold, default is 3 (Pearson's rule), see below.

Details

The `median absolute deviation' computation is done in the [-k...k] vicinity of each point at least k steps away from the end points of the interval. At the lower and upper end the time series values are preserved.

A high threshold makes the filter more forgiving, a low one will declare more points to be outliers. t0<-3 (the default) corresponds to Ron Pearson's 3 sigma edit rule, t0<-0 to John Tukey's median filter.

Value

Returning a list L with L$y the corrected time series and L$ind the indices of outliers in the `median absolut deviation' sense.

Note

Don't take the expression outlier too serious. It's just a hint to values in the time series that appear to be unusual in the vicinity of their neighbors under a normal distribution assumption.

References

Pearson, R. K. (1999). “Data cleaning for dynamic modeling and control”. European Control Conference, ETH Zurich, Switzerland.

See also

Examples

set.seed(8421)
x <- numeric(1024)
z <- rnorm(1024)
x[1] <- z[1]
for (i in 2:1024) {
  x[i] <- 0.4*x[i-1] + 0.8*x[i-1]*z[i-1] + z[i]
}
omad <- hampel(x, k=20)

if (FALSE) { # \dontrun{
plot(1:1024, x, type="l")
points(omad$ind, x[omad$ind], pch=21, col="darkred")
grid()} # }