Cauchy Distribution Family Function

Estimates either the location parameter or both the location and scale parameters of the Cauchy distribution by maximum likelihood estimation.

cauchy(llocation = "identitylink", lscale = "loglink",
       imethod = 1, ilocation = NULL, iscale = NULL,
       gprobs.y = ppoints(19), gscale.mux = exp(-3:3), zero = "scale")
cauchy1(scale.arg = 1, llocation = "identitylink", ilocation = NULL,
        imethod = 1, gprobs.y = ppoints(19), zero = NULL)

Arguments

llocation, lscale

Parameter link functions for the location parameter $a$ and the scale parameter $b$. See Links for more choices.

ilocation, iscale

Optional initial values for $a$ and $b$. By default, an initial value is chosen internally for each.

imethod

Integer, one of 1, 2 or 3, specifying the method used to obtain initial values; three algorithms are implemented. Try all three values to help avoid converging to a local solution. If convergence fails, choose another value, or supply ilocation and/or iscale.

gprobs.y, gscale.mux, zero

See CommonVGAMffArguments for information.

scale.arg

Known (positive) scale parameter, called $b$ below.

Details

The Cauchy distribution has density function $$f(y;a,b) = \left\{ \pi b [1 + ((y-a)/b)^2] \right\}^{-1} $$ where $y$ and $a$ are real and finite, and $b>0$. The distribution is symmetric about $a$ and has heavy tails. Its median and mode are both $a$, but the mean does not exist. The fitted values are the estimates of $a$. Fisher scoring is used.
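
As a short worked derivation from the density above, and assuming $n$ independent observations $y_1, \ldots, y_n$ with common $a$ and $b$ (an intercept-only setting chosen here purely for illustration), the log-likelihood and its derivative with respect to $a$ are $$\ell(a,b) = -n \log(\pi b) - \sum_{i=1}^{n} \log\left[ 1 + ((y_i-a)/b)^2 \right], \qquad \frac{\partial \ell}{\partial a} = \sum_{i=1}^{n} \frac{2 (y_i - a)}{b^2 + (y_i - a)^2}.$$ Each term of the score is bounded and tends to zero as $|y_i - a|$ grows, so extreme observations have little influence on the estimate of $a$; the same property means the equation $\partial \ell / \partial a = 0$ can have several roots.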

If the scale parameter is known (cauchy1) then there may be multiple local maximum likelihood solutions for the location parameter. However, if both location and scale parameters are to be estimated (cauchy) then there is a unique maximum likelihood solution provided $n > 2$ and less than half the data are located at any one point.
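
The possibility of multiple local maxima when the scale is known can be seen with a small sketch in base R. The data `y`, the fixed scale `b`, and the helper `loglik_a` are hypothetical choices made for illustration only; they are not part of VGAM. The code evaluates the log-likelihood for $a$ on a fine grid and reports the grid points that exceed both of their neighbours.

```r
# Hypothetical toy data: three widely separated points, scale fixed at b = 1
y <- c(-10, 0, 10)
b <- 1

# Log-likelihood of the location a with the scale held fixed
loglik_a <- function(a) sum(dcauchy(y, location = a, scale = b, log = TRUE))

# Evaluate on a fine grid of candidate locations
a_grid <- seq(-15, 15, by = 0.01)
ll <- vapply(a_grid, loglik_a, numeric(1))

# Interior grid points that are strictly higher than both neighbours
n <- length(ll)
mid <- 2:(n - 1)
is_peak <- ll[mid] > ll[mid - 1] & ll[mid] > ll[mid + 1]
peaks <- a_grid[mid][is_peak]
peaks
```

With data this spread out relative to the scale, the grid search finds a separate peak near each observation, which is why the starting value for $a$ matters for cauchy1.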

Value

An object of class "vglmff" (see vglmff-class). The object is used by modelling functions such as vglm and vgam.

Warning

It is well known that the likelihood function of the Cauchy distribution may have local maxima; make full use of imethod, ilocation, iscale, etc.

References

Forbes, C., Evans, M., Hastings, N. and Peacock, B. (2011). Statistical Distributions, Hoboken, NJ, USA: John Wiley and Sons, Fourth edition.

Barnett, V. D. (1966). Evaluation of the maximum-likelihood estimator where the likelihood equation has multiple roots. Biometrika, 53, 151–165.

Copas, J. B. (1975). On the unimodality of the likelihood for the Cauchy distribution. Biometrika, 62, 701–704.

Efron, B. and Hinkley, D. V. (1978). Assessing the accuracy of the maximum likelihood estimator: Observed versus expected Fisher information. Biometrika, 65, 457–481.

Author

T. W. Yee

Note

Good initial values are needed. By default cauchy searches for starting values for $a$ and $b$ on a 2-D grid. Likewise, by default, cauchy1 searches for a starting value for $a$ on a 1-D grid. If convergence to the global maximum is not achieved then it pays to try a wide range of initial values via the ilocation and/or iscale and/or imethod arguments.
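
One way to follow this advice is to refit the model under each value of imethod and keep the fit with the largest log-likelihood. The following is a minimal sketch that assumes the VGAM package is installed; the simulated data frame `simdata` and the objects `fits`, `ll` and `best` are hypothetical names introduced here for illustration.

```r
library(VGAM)

# Hypothetical simulated data: intercept-only Cauchy sample
set.seed(1)
simdata <- data.frame(y = rcauchy(200, location = 3, scale = 2))

# Fit the same model with each of the three initial-value methods
fits <- lapply(1:3, function(im) {
  vglm(y ~ 1, cauchy(imethod = im), data = simdata)
})

# Compare the maximized log-likelihoods across starting methods
ll <- vapply(fits, function(f) as.numeric(logLik(f)), numeric(1))
ll

# Keep the fit with the highest log-likelihood
best <- fits[[which.max(ll)]]
coef(best, matrix = TRUE)
```

If the log-likelihoods differ noticeably between methods, at least one fit has probably stopped at a local solution, and supplying ilocation and/or iscale explicitly is worth trying as well.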

Examples

# Both location and scale parameters unknown
set.seed(123)
cdata <- data.frame(x2 = runif(nn <- 1000))
cdata <- transform(cdata, loc = exp(1 + 0.5 * x2), scale = exp(1))
cdata <- transform(cdata, y2 = rcauchy(nn, loc, scale))
fit2 <- vglm(y2 ~ x2, cauchy(lloc = "loglink"), data = cdata)
coef(fit2, matrix = TRUE)
#>             loglink(location) loglink(scale)
#> (Intercept)         0.9251979       1.047886
#> x2                  0.6149455       0.000000
head(fitted(fit2))  # Location estimates
#>       [,1]
#> 1 3.010308
#> 2 4.095802
#> 3 3.243641
#> 4 4.341437
#> 5 4.497555
#> 6 2.594030
summary(fit2)
#> 
#> Call:
#> vglm(formula = y2 ~ x2, family = cauchy(lloc = "loglink"), data = cdata)
#> 
#> Coefficients: 
#>               Estimate Std. Error z value Pr(>|z|)    
#> (Intercept):1  0.92520    0.08558  10.811  < 2e-16 ***
#> (Intercept):2  1.04789    0.04472  23.431  < 2e-16 ***
#> x2             0.61495    0.13006   4.728 2.26e-06 ***
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> Names of linear predictors: loglink(location), loglink(scale)
#> 
#> Log-likelihood: -3608.661 on 1997 degrees of freedom
#> 
#> Number of Fisher scoring iterations: 6 
#> 
#> No Hauck-Donner effect found in any of the estimates
#> 

# Location parameter unknown, scale parameter known
cdata <- transform(cdata, scale1 = 0.4)
cdata <- transform(cdata, y1 = rcauchy(nn, loc, scale1))
fit1 <- vglm(y1 ~ x2, cauchy1(scale.arg = 0.4), data = cdata, trace = TRUE)
#> Iteration 1: loglikelihood = -1562.2176
#> Iteration 2: loglikelihood = -1527.4069
#> Iteration 3: loglikelihood = -1527.342
#> Iteration 4: loglikelihood = -1527.3415
#> Iteration 5: loglikelihood = -1527.3415
coef(fit1, matrix = TRUE)
#>             location
#> (Intercept) 2.678798
#> x2          1.680406