Logarithmic Distribution

Estimating the (single) parameter of the logarithmic distribution.

logff(lshape = "logitlink", gshape = -expm1(-7 * ppoints(4)),
      zero = NULL)

Arguments

lshape

Link function applied to the parameter \(c\), which lies between 0 and 1. See Links for more choices and information. It is hoped that logfflink() will soon be available for event-rate data.

gshape, zero

Details at CommonVGAMffArguments. Practical experience shows that it is quite important for the initial value of \(c\) to be close to the solution.
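The default arguments above can be inspected without fitting a model. The following is a minimal sketch in base R only. It assumes that gshape acts as a grid of candidate starting values for \(c\) (the usual meaning of g-prefixed arguments in CommonVGAMffArguments) and that base R's qlogis() and plogis() agree with logitlink() and its inverse.

```r
# Base R only; VGAM is not needed for this sketch.
# Default grid from the usage line: four candidate values for the shape.
gshape <- -expm1(-7 * ppoints(4))   # equals 1 - exp(-7 * p), computed stably
print(round(gshape, 4))             # all values lie in (0, 1), skewed towards 1

# Logit link and its inverse via qlogis()/plogis()
# (assumed to match logitlink() and logitlink(, inverse = TRUE)).
eta <- qlogis(gshape)               # log(c / (1 - c)), unbounded
shape_back <- plogis(eta)           # 1 / (1 + exp(-eta)), back in (0, 1)

stopifnot(all(gshape > 0 & gshape < 1),
          isTRUE(all.equal(shape_back, gshape)))
```

The same inverse transformation connects coef(fit) and Coef(fit) in the Examples: plogis(0.2325013) is approximately 0.558.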

Details

The logarithmic distribution is a generalized power series distribution based on the logarithmic series (scaled to give a probability function). Its probability function is \(f(y) = a c^y / y\), for \(y=1,2,3,\ldots\), where \(0 < c < 1\) (called shape), and \(a = -1 / \log(1-c)\). The mean is \(a c/(1-c)\) (returned as the fitted values) and the variance is \(a c (1-ac) /(1-c)^2\). When the sample mean is large, the value of \(c\) tends to be very close to 1, so it could be argued that logitlink is not the best choice.
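The formulas above can be checked numerically. The sketch below uses base R only. The helper names log_pmf, log_mean and log_var are hypothetical and are not part of VGAM (which provides dlog() for the probability function). The check relies on the series identity \(\sum_{y \ge 1} c^y / y = -\log(1-c)\), which is why \(a\) normalizes the probabilities.

```r
# Hypothetical helpers, base R only (not VGAM's dlog()).
log_pmf <- function(y, shape) (-1 / log1p(-shape)) * shape^y / y
log_mean <- function(shape) {
  a <- -1 / log1p(-shape)
  a * shape / (1 - shape)
}
log_var <- function(shape) {
  a <- -1 / log1p(-shape)
  a * shape * (1 - a * shape) / (1 - shape)^2
}

shape <- 0.55
y <- 1:2000                  # truncation point; the tail mass is negligible here
p <- log_pmf(y, shape)

total <- sum(p)                    # should be 1
m_num <- sum(y * p)                # numerical mean
v_num <- sum(y^2 * p) - m_num^2    # numerical variance

stopifnot(
  isTRUE(all.equal(total, 1)),
  isTRUE(all.equal(m_num, log_mean(shape))),
  isTRUE(all.equal(v_num, log_var(shape)))
)

# The mean grows without bound as the shape approaches 1.
sapply(c(0.5, 0.9, 0.99, 0.999), log_mean)
```

The last line illustrates the remark about large sample means: a large mean is only attainable with a shape value very close to 1.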

Value

An object of class "vglmff" (see vglmff-class). The object is used by modelling functions such as vglm and vgam.

References

Johnson, N. L., Kemp, A. W. and Kotz, S. (2005). Univariate Discrete Distributions, 3rd edition, ch. 7. Hoboken, NJ, USA: Wiley.

Forbes, C., Evans, M., Hastings, N. and Peacock, B. (2011). Statistical Distributions, 4th edition. Hoboken, NJ, USA: John Wiley and Sons.

Author

T. W. Yee

Note

The function log computes the natural logarithm. In the VGAM library, the link function loglink corresponds to this.

Multiple responses are permitted.

The term “logarithmic distribution” has various meanings in the literature. The distribution described here is sometimes also called the log-series distribution. Other authors use the name “logarithmic distribution” for a continuous distribution on \([a, b]\).

Examples

nn <- 1000
ldata <- data.frame(y = rlog(nn, shape = logitlink(0.2, inv = TRUE)))
fit <- vglm(y ~ 1, logff, data = ldata, trace = TRUE, crit = "c")
#> Iteration 1: coefficients = 0.27977067
#> Iteration 2: coefficients = 0.23342323
#> Iteration 3: coefficients = 0.23250166
#> Iteration 4: coefficients = 0.2325013
#> Iteration 5: coefficients = 0.2325013
coef(fit, matrix = TRUE)
#>             logitlink(shape)
#> (Intercept)        0.2325013
Coef(fit)
#>     shape 
#> 0.5578649 
if (FALSE) {  # Not run: plotting requires a graphics device
  with(ldata, spikeplot(y, col = "blue", capped = TRUE))
  x <- seq(1, with(ldata, max(y)), by = 1)
  with(ldata, lines(x + 0.1, dlog(x, Coef(fit)[1]), col = "orange",
                    type = "h", lwd = 2))
}

# Example: Corbet (1943) butterfly Malaya data
corbet <- data.frame(nindiv = 1:24,
                 ofreq = c(118, 74, 44, 24, 29, 22, 20, 19, 20, 15, 12,
                           14, 6, 12, 6, 9, 9, 6, 10, 10, 11, 5, 3, 3))
fit <- vglm(nindiv ~ 1, logff, data = corbet, weights = ofreq)
coef(fit, matrix = TRUE)
#>             logitlink(shape)
#> (Intercept)         3.002278
shapehat <- Coef(fit)["shape"]
pdf2 <- dlog(x = with(corbet, nindiv), shape = shapehat)
print(with(corbet, cbind(nindiv, ofreq, fitted = pdf2 * sum(ofreq))),
      digits = 1)
#>       nindiv ofreq fitted
#>  [1,]      1   118    156
#>  [2,]      2    74     75
#>  [3,]      3    44     47
#>  [4,]      4    24     34
#>  [5,]      5    29     26
#>  [6,]      6    22     20
#>  [7,]      7    20     17
#>  [8,]      8    19     14
#>  [9,]      9    20     12
#> [10,]     10    15     10
#> [11,]     11    12      9
#> [12,]     12    14      8
#> [13,]     13     6      7
#> [14,]     14    12      6
#> [15,]     15     6      5
#> [16,]     16     9      5
#> [17,]     17     9      4
#> [18,]     18     6      4
#> [19,]     19    10      3
#> [20,]     20    10      3
#> [21,]     21    11      3
#> [22,]     22     5      3
#> [23,]     23     3      2
#> [24,]     24     3      2