lmrob.S.RdComputes an S-estimator for linear regression, using the “fast S” algorithm.
lmrob.S(x, y, control,
trace.lev = control$trace.lev,
only.scale = FALSE, mf)design matrix (\(n \times p\))
numeric vector of responses (or residuals for only.scale=TRUE).
list as returned by lmrob.control; the
following components are used for lmrob.S():
"trace.lev",
"nResample",
"groups",
"n.group",
"fast.s.large.n",
"seed",
"bb",
"psi", "tuning.chi",
"best.r.s",
"k.fast.s",
"k.max",
"maxit.scale",
"refine.tol", "solve.tol", "scale.tol",
"mts",
"subsampling".
integer indicating if the progress of the algorithm
should be traced (increasingly); default trace.lev = 0 does
no tracing.
logical indicating if only the scale
of y should be computed. In this case, y will
typically contain residuals.
This function is used by lmrob.fit and typically not
to be used on its own (because an S-estimator has too low
efficiency ‘on its own’).
By default, the subsampling algorithm uses a customized LU decomposition which ensures a non singular subsample (if this is at all possible). This makes the Fast-S algorithm also feasible for categorical and mixed continuous-categorical data.
One can revert to the old subsampling scheme by setting the parameter
subsampling in control to "simple".
By default (when only.scale is false), a list with components
numeric vector (length \(p\)) of S-regression coefficient estimates.
the S-scale residual estimate
numeric vector (length \(n\)) of the fitted values.
numeric vector (length \(n\)) of the residuals.
numeric vector (length \(n\)) of the robustness weights.
(maximal) number of refinement iterations used.
logical indicating if all refinement iterations had converged.
the same list as the control argument.
If only.scale is true, the computed scale (a number) is returned.
lmrob, also for references.
set.seed(33)
x1 <- sort(rnorm(30)); x2 <- sort(rnorm(30)); x3 <- sort(rnorm(30))
X. <- cbind(x1, x2, x3)
y <- 10 + X. %*% (10*(2:4)) + rnorm(30)/10
y[1] <- 500 # a moderate outlier
X.[2,1] <- 20 # an X outlier
X1 <- cbind(1, X.)
(m.lm <- lm(y ~ X.))
#>
#> Call:
#> lm(formula = y ~ X.)
#>
#> Coefficients:
#> (Intercept) X.x1 X.x2 X.x3
#> -23.756 -7.774 -276.785 253.358
#>
set.seed(12)
m.lmS <- lmrob.S(x=X1, y=y,
control = lmrob.control(nRes = 20), trace.lev=1)
#> lmrob_S(n = 30, nRes = 20): fast_s() [non-large n]:
#> Subsampling 20 times to find candidate betas:
#> Now refine() to convergence for 2 very best ones:
#> Best[0]: convergence (22 iter.): -> improved scale to 0.106807908892489
#> Best[1]: convergence (30 iter.): -> improved scale to 0.106805469318873
#> lmrob.S(): scale = 0.106805; coeff.=
#> [1] 9.980894 20.046884 30.056596 39.968754
m.lmS[c("coefficients","scale")]
#> $coefficients
#> x1 x2 x3
#> 9.980894 20.046884 30.056596 39.968754
#>
#> $scale
#> [1] 0.1068055
#>
all.equal(unname(m.lmS$coef), 10 * (1:4), tolerance = 0.005)
#> [1] TRUE
stopifnot(all.equal(unname(m.lmS$coef), 10 * (1:4), tolerance = 0.005),
all.equal(m.lmS$scale, 1/10, tolerance = 0.09))
## only.scale = TRUE: Compute the S scale, given residuals;
s.lmS <- lmrob.S(X1, y=residuals(m.lmS), only.scale = TRUE,
control = lmrob.control(trace.lev = 3))
#> lmrob_S(nRes = 0, n = 30): --> find_scale(*, scale=0.0793635) only:find_scale(*, ini.scale =0.079363506354, tol=1e-10):
#> it | new scale
#> 0 | 0.08666906840
#> 1 | 0.09232255131
#> 2 | 0.09653960469
#> 3 | 0.09960084789
#> 4 | 0.1017828694
#> 5 | 0.1033196364
#> 6 | 0.1043934421
#> 7 | 0.1051398339
#> 8 | 0.1056568276
#> 9 | 0.1060140817
#> 10 | 0.1062605572
#> 11 | 0.1064304183
#> 12 | 0.1065473908
#> 13 | 0.1066279000
#> 14 | 0.1066832923
#> 15 | 0.1067213941
#> 16 | 0.1067475979
#> 17 | 0.1067656170
#> 18 | 0.1067780069
#> 19 | 0.1067865257
#> 20 | 0.1067923826
#> 21 | 0.1067964093
#> 22 | 0.1067991776
#> 23 | 0.1068010808
#> 24 | 0.1068023893
#> 25 | 0.1068032888
#> 26 | 0.1068039072
#> 27 | 0.1068043323
#> 28 | 0.1068046246
#> 29 | 0.1068048255
#> 30 | 0.1068049637
#> 31 | 0.1068050586
#> 32 | 0.1068051239
#> 33 | 0.1068051688
#> 34 | 0.1068051997
#> 35 | 0.1068052209
#> 36 | 0.1068052354
#> 37 | 0.1068052455
#> 38 | 0.1068052524
#> 39 | 0.1068052571
#> 40 | 0.1068052604
#> 41 | 0.1068052626
#> 42 | 0.1068052641
#> 43 | 0.1068052652
#> 44 | 0.1068052659
#> 45 | 0.1068052664
#> 46 | 0.1068052668
#> 47 | 0.1068052670
#> 48 | 0.1068052672
#> 49 | 0.1068052673
#> 50 | 0.1068052674
#> 51 | 0.1068052674
#> 52 | 0.1068052674
#> 53 | 0.1068052675
#> 54 | 0.1068052675
#> 55 | 0.1068052675
#> 56 | 0.1068052675
#> used 56 iterations
#> lmrob.S(): scale = 0.106805
all.equal(s.lmS, m.lmS$scale) # close: 1.89e-6 [64b Lnx]
#> [1] "Mean relative difference: 1.889525e-06"