These versions of the safeguarded Newton method solve the Newton equations with the R function solve(). In snewton a backtracking line search is used, while in snewtonm we rely on a Marquardt stabilization of the Hessian.

snewton(par, fn, gr, hess, control = list(trace=0, maxit=500), ...)

snewtm(par, fn, gr, hess, bds, control = list(trace=0, maxit=500))

Arguments

par

A numeric vector of starting estimates.

fn

A function that returns the value of the objective at the supplied set of parameters par using auxiliary data in .... The first argument of fn must be par.

gr

A function that returns the gradient of the objective at the supplied set of parameters par using auxiliary data in .... The first argument of gr must be par. This function returns the gradient as a numeric vector.

hess

A function to compute the Hessian of the objective at the supplied parameters par. It must return the Hessian as a square, symmetric numeric matrix.

bds

The result of bmchk() for the current problem, containing the bounds information (lower, upper, etc.). A minimal sketch of how this argument might be constructed appears after this argument list.

control

An optional list of control settings.

...

Further arguments to be passed to fn.
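
As an illustration only (in normal use optimr() constructs bds internally), a minimal sketch of supplying this argument is given below. It assumes bmchk() accepts the starting parameters and the bounds, and it borrows the Rosenbrock functions f, gr, h and the start x0 defined in the Examples; the bound values blo and bup are invented for the sketch.

blo <- c(-2, -2); bup <- c(2, 2)            # illustrative bounds
bds <- bmchk(x0, lower = blo, upper = bup)  # bounds check; supplies lower, upper, etc.
ans <- snewtm(x0, fn = f, gr = gr, hess = h, bds = bds,
              control = list(trace = 0, maxit = 500))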

Details

snewtm is intended to be called from optimr().

The functions fn, gr and hess must return, respectively, a numeric value, a numeric vector and a matrix. The control argument is a list; see the source file snewton.R for the complete set of options. A short usage sketch follows this list. Some of the values that may be important for users are:

trace

Set to 0 (default) for no output, or > 0 for diagnostic output (larger values give more output).

watch

Set TRUE if the routine is to stop for user input (e.g., Enter) after each iteration. Default is FALSE.

maxit

A limit on the number of iterations (default 500 + 2*n where n is the number of parameters). This is the maximum number of gradient evaluations allowed.

maxfeval

A limit on the number of function evaluations allowed (default 3000 + 10*n).

eps

A tolerance used for judging a small gradient norm (default 1e-07). A gradient norm smaller than (1 + abs(fmin))*eps*eps is considered small enough that a local optimum has been found, where fmin is the current estimate of the minimal function value.

acctol

The acceptable-point tolerance (default 0.0001) used in the test f <= fmin + gradproj * steplength * acctol. This test is used to ensure that progress is made at each iteration.

stepdec

Step reduction factor for the backtracking line search (default 0.2).

defstep

Initial stepsize (default 1).

reltest

Additive shift used in the equality test (default 100.0).
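
As a usage sketch, the controls above are supplied as a named list. Here f, gr, h and x0 stand for the user's objective, gradient, Hessian and starting vector (as in the Examples below); the commented lines merely restate the eps and acctol tests described above.

ans <- snewton(x0, fn = f, gr = gr, hess = h,
               control = list(trace = 1, maxit = 1000,
                              eps = 1e-7, acctol = 1e-4, stepdec = 0.2))
# Convergence and acceptable-point tests (g is the current gradient,
# fmin the best function value found so far):
#   sqrt(sum(g^2)) < (1 + abs(fmin)) * eps * eps   # small gradient norm
#   f <= fmin + gradproj * steplength * acctol     # sufficient progress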

The (unconstrained) solver snewtonmu proved to be slower than the bounded solver called without bounds, so it has been withdrawn.

The snewton safeguarded Newton method uses a simple backtracking line search but no stabilization of the linear solve. It has demonstrated POOR performance and reliability and is NOT recommended.
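
To make the Marquardt idea concrete, the sketch below shows one stabilized Newton step in which a multiple of the identity is added to the Hessian until the Newton equations can be solved and the step reduces the objective. This is an illustration only, not the package's implementation; the names newton_step_m, lambda and maxtries are invented for the sketch.

newton_step_m <- function(par, fn, gr, hess, lambda = 1e-4, maxtries = 30) {
  g <- gr(par)
  H <- hess(par)
  for (i in seq_len(maxtries)) {
    Ha <- H + lambda * diag(length(par))       # augmented (stabilized) Hessian
    step <- try(solve(Ha, -g), silent = TRUE)  # Newton equations via solve()
    if (!inherits(step, "try-error") && fn(par + step) < fn(par)) {
      return(par + step)                       # accept the stabilized step
    }
    lambda <- lambda * 10                      # increase stabilization and retry
  }
  par                                          # no acceptable step found
}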

Value

A list with components:

par

The best set of parameters found.

value

The value of the objective at the best set of parameters found.

grad

The value of the gradient at the best set of parameters found. A vector.

hessian

The value of the Hessian at the best set of parameters found. A matrix.

counts

Four integer counts (returned as niter, nfn, ngr and nhess) giving the number of Newton-equation solutions, function evaluations, gradient evaluations and Hessian evaluations.

message

A message giving some information on the status of the solution.

References

Nash, J C (1979, 1990) Compact Numerical Methods for Computers: Linear Algebra and Function Minimisation, Bristol: Adam Hilger. Second Edition, Bristol: Institute of Physics Publications.

See also

Examples

#Rosenbrock banana valley function
f <- function(x){
  return(100*(x[2] - x[1]*x[1])^2 + (1-x[1])^2)
}
#gradient
gr <- function(x){
  return(c(-400*x[1]*(x[2] - x[1]*x[1]) - 2*(1-x[1]), 200*(x[2] - x[1]*x[1])))
}
#Hessian
h <- function(x) {
  a11 <- 2 - 400*x[2] + 1200*x[1]*x[1]; a21 <- -400*x[1]
  return(matrix(c(a11, a21, a21, 200), 2, 2))
}

fg <- function(x){ #function and gradient
  val <- f(x)
  attr(val,"gradient") <- gr(x)
  val
}
fgh <- function(x){ #function, gradient and hessian
  val <- f(x)
  attr(val,"gradient") <- gr(x)
  attr(val,"hessian") <- h(x)
  val
}
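# Note: fg and fgh attach the gradient (and Hessian) to the function value as
# attributes, in the style accepted by nlm(); they are not used by the snewton
# calls below and are shown only for solvers that accept combined objective
# functions.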

x0 <- c(-1.2, 1)

sr <- snewton(x0, fn=f, gr=gr, hess=h, control=list(trace=1))
#> 0   1   0   fbest= 24.2 
#> Gradient projection =  -38.82876 
#> 1   2   1   fbest= 4.731884 
#> Gradient projection =  -8.433185 
#> 2   5   2   fbest= 4.404888 
#> Gradient projection =  -3.802796 
#> 3   7   3   fbest= 3.81876 
#> Gradient projection =  -1.143361 
#> 4   8   4   fbest= 3.116464 
#> Gradient projection =  -1.018618 
#> 5   9   5   fbest= 2.426322 
#> Gradient projection =  -0.9325458 
#> 6   10   6   fbest= 2.119506 
#> Gradient projection =  -1.21009 
#> 7   11   7   fbest= 1.419276 
#> Gradient projection =  -1.178522 
#> 8   13   8   fbest= 1.213719 
#> Gradient projection =  -0.6322948 
#> 9   14   9   fbest= 1.194519 
#> Gradient projection =  -1.123225 
#> 10   15   10   fbest= 0.5925966 
#> Gradient projection =  -0.7547131 
#> 11   17   11   fbest= 0.4648368 
#> Gradient projection =  -0.2953687 
#> 12   18   12   fbest= 0.3799336 
#> Gradient projection =  -0.3636001 
#> 13   19   13   fbest= 0.1767059 
#> Gradient projection =  -0.2269612 
#> 14   21   14   fbest= 0.1363627 
#> Gradient projection =  -0.1372565 
#> 15   23   15   fbest= 0.1115589 
#> Gradient projection =  -0.1102234 
#> 16   24   16   fbest= 0.09218166 
#> Gradient projection =  -0.1354565 
#> 17   25   17   fbest= 0.0204531 
#> Gradient projection =  -0.03521596 
#> 18   27   18   fbest= 0.0141311 
#> Gradient projection =  -0.02266352 
#> 19   28   19   fbest= 0.00854653 
#> Gradient projection =  -0.0163739 
#> 20   29   20   fbest= 0.0002310919 
#> Gradient projection =  -0.0004556213 
#> 21   30   21   fbest= 5.066762e-06 
#> Gradient projection =  -1.012952e-05 
#> 22   31   22   fbest= 8.60774e-11 
#> Gradient projection =  -1.721533e-10 
#> 23   32   23   fbest= 7.440484e-19 
#> Gradient projection =  -1.488097e-18 
#> 24   33   24   fbest= 0 
#> Small gradient norm 
print(sr)
#> $par
#> [1] 1 1
#> 
#> $value
#> [1] 0
#> 
#> $grad
#> [1] 0 0
#> 
#> $hessian
#>      [,1] [,2]
#> [1,]  802 -400
#> [2,] -400  200
#> 
#> $counts
#> $counts$niter
#> [1] 25
#> 
#> $counts$nfn
#> [1] 33
#> 
#> $counts$ngr
#> [1] 25
#> 
#> $counts$nhess
#> [1] 24
#> 
#> 
#> $convcode
#> [1] 0
#> 
#> $message
#> [1] "Small gradient norm"
#> 
# Call through optimr to get correct calling sequence, esp. with bounds
srm <- optimr(x0, fn=f, gr=gr, hess=h, control=list(trace=1))
#> Warning: Package  not found
#> Solver   missing
print(srm)
#> $convergence
#> [1] 8888
#> 
#> $value
#> [1] 8.988466e+307
#> 
#> $par
#> [1] NA NA
#> attr(,"status")
#> [1] "?" "?"
#> 
#> $counts
#> [1] NA NA
#> 
#> $message
#> [1] "Missing method  "
#> 
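# Note: the convergence code 8888 and the "Missing method" message above show
# that the requested solver was not found when this output was generated.
# Naming the solver explicitly via the method argument, e.g.
#   optimr(x0, fn=f, gr=gr, hess=h, method="snewtonm", control=list(trace=1))
# should select the Marquardt-stabilized Newton solver, assuming that method
# name is registered with optimr() in the installed version of the package.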

# bounds constrained example

lo <- rep((min(x0)-0.1), 2)
up <- rep((max(x0)+0.1), 2)
# Call through optimr to get correct calling sequence, esp. with bounds
srmb <- optimr(x0, fn=f, gr=gr, hess=h, lower=lo, upper=up, control=list(trace=1))
#> Warning: Package  not found
#> Solver   missing
proptimr(srmb)
#> Result  srmb (  ->  ) calc. min. = 8.988466e+307  at 
#> NA ?   NA ?   NA NA   NA NA   NA NA   
#> After  NA  fn evals, and  NA  gr evals and  NA  hessian evals
#> Termination code is  8888 : Missing method   
#> 
#> -------------------------------------------------


#Example 2: Wood function
#
wood.f <- function(x){
  res <- 100*(x[1]^2-x[2])^2+(1-x[1])^2+90*(x[3]^2-x[4])^2+(1-x[3])^2+
    10.1*((1-x[2])^2+(1-x[4])^2)+19.8*(1-x[2])*(1-x[4])
  return(res)
}
#gradient:
wood.g <- function(x){
  g1 <- 400*x[1]^3-400*x[1]*x[2]+2*x[1]-2
  g2 <- -200*x[1]^2+220.2*x[2]+19.8*x[4]-40
  g3 <- 360*x[3]^3-360*x[3]*x[4]+2*x[3]-2
  g4 <- -180*x[3]^2+200.2*x[4]+19.8*x[2]-40
  return(c(g1,g2,g3,g4))
}
#hessian:
wood.h <- function(x){
  h11 <- 1200*x[1]^2-400*x[2]+2;    h12 <- -400*x[1]; h13 <- h14 <- 0
  h22 <- 220.2; h23 <- 0;    h24 <- 19.8
  h33 <- 1080*x[3]^2-360*x[4]+2;    h34 <- -360*x[3]
  h44 <- 200.2
  H <- matrix(c(h11,h12,h13,h14,h12,h22,h23,h24,
                h13,h23,h33,h34,h14,h24,h34,h44),ncol=4)
  return(H)
}
#################################################
w0 <- c(-3, -1, -3, -1)

wd <- snewton(w0, fn=wood.f, gr=wood.g, hess=wood.h, control=list(trace=1))
#> 0   1   0   fbest= 19192 
#> Gradient projection =  -35111.91 
#> 1   2   1   fbest= 1291.439 
#> Gradient projection =  -1748.969 
#> 2   3   2   fbest= 295.9513 
#> Gradient projection =  -376.8133 
#> 3   4   3   fbest= 67.68559 
#> Gradient projection =  -84.58005 
#> 4   5   4   fbest= 17.33661 
#> Gradient projection =  -14.90304 
#> 5   6   5   fbest= 8.689077 
#> Gradient projection =  -1.467156 
#> 6   7   6   fbest= 7.892798 
#> Gradient projection =  -0.03184003 
#> 7   8   7   fbest= 7.876516 
#> Gradient projection =  0.001054892 
#> 8   9   8   fbest= 7.875255 
#> Gradient projection =  0.4054431 
#> 9   13   9   fbest= 7.872773 
#> Gradient projection =  -0.01694465 
#> 10   14   10   fbest= 7.865633 
#> Gradient projection =  -0.04328606 
#> 11   15   11   fbest= 7.83484 
#> Gradient projection =  -0.2624784 
#> 12   18   12   fbest= 7.824584 
#> Gradient projection =  -0.2491789 
#> 13   20   13   fbest= 7.800297 
#> Gradient projection =  -0.1497639 
#> 14   21   14   fbest= 7.692242 
#> Gradient projection =  -0.4213491 
#> 15   23   15   fbest= 7.6193 
#> Gradient projection =  -0.3589344 
#> 16   25   16   fbest= 7.553482 
#> Gradient projection =  -0.4765439 
#> 17   27   17   fbest= 7.466637 
#> Gradient projection =  -0.5791609 
#> 18   29   18   fbest= 7.36127 
#> Gradient projection =  -0.6791919 
#> 19   31   19   fbest= 7.237694 
#> Gradient projection =  -0.7851201 
#> 20   33   20   fbest= 7.094844 
#> Gradient projection =  -0.8943914 
#> 21   35   21   fbest= 6.932114 
#> Gradient projection =  -1.004506 
#> 22   37   22   fbest= 6.749355 
#> Gradient projection =  -1.112611 
#> 23   39   23   fbest= 6.54694 
#> Gradient projection =  -1.21558 
#> 24   41   24   fbest= 6.32581 
#> Gradient projection =  -1.31008 
#> 25   43   25   fbest= 6.087517 
#> Gradient projection =  -1.392648 
#> 26   45   26   fbest= 5.834241 
#> Gradient projection =  -1.459773 
#> 27   47   27   fbest= 5.568808 
#> Gradient projection =  -1.507982 
#> 28   49   28   fbest= 5.294674 
#> Gradient projection =  -1.533991 
#> 29   51   29   fbest= 5.015896 
#> Gradient projection =  -1.534959 
#> 30   53   30   fbest= 4.737046 
#> Gradient projection =  -1.509 
#> 31   55   31   fbest= 4.463032 
#> Gradient projection =  -1.456013 
#> 32   57   32   fbest= 4.198769 
#> Gradient projection =  -1.378712 
#> 33   59   33   fbest= 3.948653 
#> Gradient projection =  -1.283254 
#> 34   61   34   fbest= 3.715944 
#> Gradient projection =  -1.178576 
#> 35   63   35   fbest= 3.502264 
#> Gradient projection =  -1.074274 
#> 36   65   36   fbest= 3.307499 
#> Gradient projection =  -0.9781538 
#> 37   67   37   fbest= 3.130135 
#> Gradient projection =  -0.8949017 
#> 38   69   38   fbest= 2.967822 
#> Gradient projection =  -0.8262399 
#> 39   71   39   fbest= 2.817911 
#> Gradient projection =  -0.7718522 
#> 40   73   40   fbest= 2.677817 
#> Gradient projection =  -0.7303443 
#> 41   75   41   fbest= 2.545212 
#> Gradient projection =  -0.699909 
#> 42   77   42   fbest= 2.418092 
#> Gradient projection =  -0.6786873 
#> 43   79   43   fbest= 2.294795 
#> Gradient projection =  -0.6649234 
#> 44   81   44   fbest= 2.173973 
#> Gradient projection =  -0.6570044 
#> 45   83   45   fbest= 2.054571 
#> Gradient projection =  -0.6534481 
#> 46   85   46   fbest= 1.935801 
#> Gradient projection =  -0.6528721 
#> 47   87   47   fbest= 1.817127 
#> Gradient projection =  -0.6539652 
#> 48   89   48   fbest= 1.698249 
#> Gradient projection =  -0.6554675 
#> 49   91   49   fbest= 1.579096 
#> Gradient projection =  -0.6561669 
#> 50   93   50   fbest= 1.459818 
#> Gradient projection =  -0.6549099 
#> 51   95   51   fbest= 1.340774 
#> Gradient projection =  -0.6506273 
#> 52   97   52   fbest= 1.222516 
#> Gradient projection =  -0.6423714 
#> 53   99   53   fbest= 1.105769 
#> Gradient projection =  -0.6293617 
#> 54   101   54   fbest= 0.9914002 
#> Gradient projection =  -0.6110329 
#> 55   103   55   fbest= 0.8803785 
#> Gradient projection =  -0.5870819 
#> 56   105   56   fbest= 0.773727 
#> Gradient projection =  -0.557506 
#> 57   106   57   fbest= 0.7624271 
#> Gradient projection =  -1.016074 
#> 58   107   58   fbest= 0.2213491 
#> Gradient projection =  -0.4245304 
#> 59   109   59   fbest= 0.1464941 
#> Gradient projection =  -0.2105171 
#> 60   110   60   fbest= 0.1177979 
#> Gradient projection =  -0.2209695 
#> 61   111   61   fbest= 0.004807252 
#> Gradient projection =  -0.009211707 
#> 62   112   62   fbest= 0.000297515 
#> Gradient projection =  -0.0005929099 
#> 63   113   63   fbest= 1.028332e-07 
#> Gradient projection =  -2.056065e-07 
#> 64   114   64   fbest= 1.604976e-13 
#> Gradient projection =  -3.209951e-13 
#> 65   115   65   fbest= 8.213467e-26 
#> Gradient projection =  -1.669905e-25 
#> 66   116   66   fbest= 1.401584e-29 
#> Gradient projection =  -1.848666e-28 
#> No progress before linesearch! 
print(wd)
#> $par
#> [1] 1 1 1 1
#> 
#> $value
#> [1] 1.401584e-29
#> 
#> $grad
#> [1]  1.167955e-13 -5.684342e-14  5.417888e-14 -4.263256e-14
#> 
#> $hessian
#>      [,1]   [,2] [,3]   [,4]
#> [1,]  802 -400.0    0    0.0
#> [2,] -400  220.2    0   19.8
#> [3,]    0    0.0  722 -360.0
#> [4,]    0   19.8 -360  200.2
#> 
#> $counts
#> $counts$niter
#> [1] 67
#> 
#> $counts$nfn
#> [1] 116
#> 
#> $counts$ngr
#> [1] 67
#> 
#> $counts$nhess
#> [1] 67
#> 
#> 
#> $convcode
#> [1] 92
#> 
#> $message
#> [1] "No progress before linesearch!"
#> 

# Call through optimr to get correct calling sequence, esp. with bounds
wdm <- optimr(w0, fn=wood.f, gr=wood.g, hess=wood.h, control=list(trace=1))
#> Warning: Package  not found
#> Solver   missing
print(wdm)
#> $convergence
#> [1] 8888
#> 
#> $value
#> [1] 8.988466e+307
#> 
#> $par
#> [1] NA NA NA NA
#> attr(,"status")
#> [1] "?" "?" "?" "?"
#> 
#> $counts
#> [1] NA NA
#> 
#> $message
#> [1] "Missing method  "
#>