S-regression estimators

Computes an S-estimator for linear regression, using the “fast S” algorithm.

lmrob.S(x, y, control,
        trace.lev = control$trace.lev,
        only.scale = FALSE, mf)

Arguments

x: design matrix (\(n \times p\))
y: numeric vector of responses (or residuals for only.scale=TRUE).
control: list as returned by lmrob.control; the following components are used for lmrob.S(): "trace.lev", "nResample", "groups", "n.group", "fast.s.large.n", "seed", "bb", "psi", "tuning.chi", "best.r.s", "k.fast.s", "k.max", "maxit.scale", "refine.tol", "solve.tol", "scale.tol", "mts", "subsampling".
trace.lev: integer indicating if the progress of the algorithm should be traced (increasingly); default trace.lev = 0 does no tracing.
only.scale: logical indicating if only the scale of y should be computed. In this case, y will typically contain residuals.

mf: defunct.

Details

This function is used by lmrob.fit and typically not to be used on its own (because an S-estimator has too low efficiency ‘on its own’).

By default, the subsampling algorithm uses a customized LU decomposition which ensures a non singular subsample (if this is at all possible). This makes the Fast-S algorithm also feasible for categorical and mixed continuous-categorical data.

One can revert to the old subsampling scheme by setting the parameter subsampling in control to "simple".

Value

By default (when only.scale is false), a list with components

coefficients: numeric vector (length \(p\)) of S-regression coefficient estimates.
scale: the S-scale residual estimate

fitted.values: numeric vector (length \(n\)) of the fitted values.
residuals: numeric vector (length \(n\)) of the residuals.
rweights: numeric vector (length \(n\)) of the robustness weights.
k.iter: (maximal) number of refinement iterations used.
converged: logical indicating if all refinement iterations had converged.
control: the same list as the control argument.

If only.scale is true, the computed scale (a number) is returned.

Author

Matias Salibian-Barrera and Manuel Koller; Martin Maechler for minor new options and more documentation.

Examples

set.seed(33)
x1 <- sort(rnorm(30)); x2 <- sort(rnorm(30)); x3 <- sort(rnorm(30))
X. <- cbind(x1, x2, x3)
y <-  10 + X. %*% (10*(2:4)) + rnorm(30)/10
y[1] <- 500   # a moderate outlier
X.[2,1] <- 20 # an X outlier
X1  <- cbind(1, X.)

(m.lm <- lm(y ~ X.))
#> 
#> Call:
#> lm(formula = y ~ X.)
#> 
#> Coefficients:
#> (Intercept)         X.x1         X.x2         X.x3  
#>     -23.756       -7.774     -276.785      253.358  
#> 
set.seed(12)
m.lmS <- lmrob.S(x=X1, y=y,
                 control = lmrob.control(nRes = 20), trace.lev=1)
#> lmrob_S(n = 30, nRes = 20): fast_s() [non-large n]:
#>  Subsampling 20 times to find candidate betas:
#>  Now refine() to convergence for 2 very best ones:
#>   Best[0]: convergence (22 iter.): -> improved scale to 0.106807908892489
#>   Best[1]: convergence (30 iter.): -> improved scale to 0.106805469318873
#> lmrob.S(): scale = 0.106805; coeff.=
#> [1]  9.980894 20.046884 30.056596 39.968754
m.lmS[c("coefficients","scale")]
#> $coefficients
#>                  x1        x2        x3 
#>  9.980894 20.046884 30.056596 39.968754 
#> 
#> $scale
#> [1] 0.1068055
#> 
all.equal(unname(m.lmS$coef), 10 * (1:4), tolerance = 0.005)
#> [1] TRUE
stopifnot(all.equal(unname(m.lmS$coef), 10 * (1:4), tolerance = 0.005),
          all.equal(m.lmS$scale, 1/10, tolerance = 0.09))

## only.scale = TRUE:  Compute the S scale, given residuals;
s.lmS <- lmrob.S(X1, y=residuals(m.lmS), only.scale = TRUE,
                 control = lmrob.control(trace.lev = 3))
#> lmrob_S(nRes = 0, n = 30): --> find_scale(*, scale=0.0793635) only:find_scale(*, ini.scale =0.079363506354, tol=1e-10):
#>   it | new scale
#>    0 | 0.08666906840
#>    1 | 0.09232255131
#>    2 | 0.09653960469
#>    3 | 0.09960084789
#>    4 |  0.1017828694
#>    5 |  0.1033196364
#>    6 |  0.1043934421
#>    7 |  0.1051398339
#>    8 |  0.1056568276
#>    9 |  0.1060140817
#>   10 |  0.1062605572
#>   11 |  0.1064304183
#>   12 |  0.1065473908
#>   13 |  0.1066279000
#>   14 |  0.1066832923
#>   15 |  0.1067213941
#>   16 |  0.1067475979
#>   17 |  0.1067656170
#>   18 |  0.1067780069
#>   19 |  0.1067865257
#>   20 |  0.1067923826
#>   21 |  0.1067964093
#>   22 |  0.1067991776
#>   23 |  0.1068010808
#>   24 |  0.1068023893
#>   25 |  0.1068032888
#>   26 |  0.1068039072
#>   27 |  0.1068043323
#>   28 |  0.1068046246
#>   29 |  0.1068048255
#>   30 |  0.1068049637
#>   31 |  0.1068050586
#>   32 |  0.1068051239
#>   33 |  0.1068051688
#>   34 |  0.1068051997
#>   35 |  0.1068052209
#>   36 |  0.1068052354
#>   37 |  0.1068052455
#>   38 |  0.1068052524
#>   39 |  0.1068052571
#>   40 |  0.1068052604
#>   41 |  0.1068052626
#>   42 |  0.1068052641
#>   43 |  0.1068052652
#>   44 |  0.1068052659
#>   45 |  0.1068052664
#>   46 |  0.1068052668
#>   47 |  0.1068052670
#>   48 |  0.1068052672
#>   49 |  0.1068052673
#>   50 |  0.1068052674
#>   51 |  0.1068052674
#>   52 |  0.1068052674
#>   53 |  0.1068052675
#>   54 |  0.1068052675
#>   55 |  0.1068052675
#>   56 |  0.1068052675
#>  used 56 iterations
#> lmrob.S(): scale = 0.106805
all.equal(s.lmS, m.lmS$scale) # close: 1.89e-6 [64b Lnx]
#> [1] "Mean relative difference: 1.889525e-06"

Arguments

Details

Value

See also

Author

Examples