Maximum likelihood estimation for the geometric and truncated geometric distributions.
geometric(link = "logitlink", expected = TRUE, imethod = 1,
iprob = NULL, zero = NULL)
truncgeometric(upper.limit = Inf,
link = "logitlink", expected = TRUE, imethod = 1,
iprob = NULL, zero = NULL)
link: Parameter link function applied to the
probability parameter \(p\), which lies in the unit interval.
See Links for more choices.
expected: Logical.
Fisher scoring is used if expected = TRUE, else Newton-Raphson.
iprob, imethod, zero: See CommonVGAMffArguments for details.
upper.limit: Numeric. Upper limit(s) of the response. As a vector, it is recycled across responses first. The default value (Inf) means both family functions should give the same result.
A random variable \(Y\) has a 1-parameter geometric distribution
if \(P(Y=y) = p (1-p)^y\)
for \(y=0,1,2,\ldots\).
Here, \(p\) is the probability of success,
and \(Y\) is the number of (independent) trials that are failures
before the first success occurs.
Thus the response \(Y\) should be a non-negative integer.
The mean of \(Y\) is \(E(Y) = (1-p)/p\)
and its variance is \(Var(Y) = (1-p)/p^2\).
The geometric distribution is a special case of the
negative binomial distribution (see negbinomial).
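The formulas above can be checked numerically. The following is a minimal sketch using only base R; the value p = 0.3 and the cutoff of 500 terms are arbitrary choices made for illustration, not values from this help page.

```r
# Illustrative values (assumptions, not from the help page)
p <- 0.3
y <- 0:500  # long enough that the omitted tail is negligible

# Probability function P(Y = y) = p (1 - p)^y
pmf <- p * (1 - p)^y

# Agrees with base R's dgeom(), which uses the same parameterization
geom_ok <- isTRUE(all.equal(pmf, dgeom(y, prob = p)))

# Special case of the negative binomial with size = 1
nb_ok <- isTRUE(all.equal(pmf, dnbinom(y, size = 1, prob = p)))

# Mean and variance computed from the (truncated) sum
mean_y <- sum(y * pmf)
var_y  <- sum((y - mean_y)^2 * pmf)

c(mean = mean_y, theory = (1 - p) / p)
c(var  = var_y,  theory = (1 - p) / p^2)
```

Both comparisons should agree up to floating-point rounding.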
The geometric distribution is also a special case of the
Borel distribution, which is a Lagrangian distribution.
If \(Y\) has a geometric distribution with parameter \(p\) then
\(Y+1\) has a positive-geometric distribution with the same parameter.
Multiple responses are permitted.
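For intuition about what is being maximized, consider the intercept-only case with a single response. There the geometric log-likelihood \(n \log p + \sum y_i \log(1-p)\) has the closed-form maximizer \(\hat{p} = 1/(1 + \bar{y})\). The sketch below uses base R only and simulated data with an arbitrary seed and true probability; it does not show how geometric() is implemented.

```r
# Simulated data (seed, sample size and true prob are illustrative assumptions)
set.seed(1)
y <- rgeom(500, prob = 0.2)

# Geometric log-likelihood as a function of p
loglik <- function(p) sum(dgeom(y, prob = p, log = TRUE))

# Closed-form maximum likelihood estimate
p_closed <- 1 / (1 + mean(y))

# Numerical maximization of the same log-likelihood, as a cross-check
p_numeric <- optimize(loglik, interval = c(1e-6, 1 - 1e-6),
                      maximum = TRUE)$maximum

c(closed = p_closed, numeric = p_numeric)

# The estimate on the logit scale
qlogis(p_closed)
```

With the default logitlink, an intercept-only fit would be expected to report an intercept close to qlogis(p_closed).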
For truncgeometric(),
the (upper) truncated geometric distribution can have response integer
values from 0 to upper.limit.
It has density prob * (1 - prob)^y / [1-(1-prob)^(1+upper.limit)].
For a generalized truncated geometric distribution with integer values \(L\) to \(U\), say, subtract \(L\) from the response and feed in \(U-L\) as the upper limit.
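The truncated density and the shift trick described above can be sketched in base R. The helper dtruncgeom() below is hypothetical (it is not a VGAM function), and the parameter values are arbitrary.

```r
# Hypothetical helper (not part of VGAM): upper-truncated geometric density
dtruncgeom <- function(y, prob, upper.limit = Inf) {
  dens <- prob * (1 - prob)^y / (1 - (1 - prob)^(1 + upper.limit))
  ifelse(y >= 0 & y <= upper.limit, dens, 0)
}

prob <- 0.25          # illustrative value
upper.limit <- 5
y <- 0:upper.limit

# The density sums to 1 over 0, ..., upper.limit
total <- sum(dtruncgeom(y, prob, upper.limit))

# Same as the geometric density renormalized by P(Y <= upper.limit)
renorm_ok <- isTRUE(all.equal(dtruncgeom(y, prob, upper.limit),
                              dgeom(y, prob) / pgeom(upper.limit, prob)))

# With upper.limit = Inf the truncation has no effect
inf_ok <- isTRUE(all.equal(dtruncgeom(0:20, prob), dgeom(0:20, prob)))

# Generalized version on L, ..., U: shift the response by L, use U - L
L <- 1
U <- 8
total_LU <- sum(dtruncgeom((L:U) - L, prob, U - L))

c(total = total, total_LU = total_LU)
```

The last check mirrors the advice to subtract \(L\) from the response and pass \(U-L\) as the upper limit.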
An object of class "vglmff" (see vglmff-class).
The object is used by modelling functions such as vglm
and vgam.
Forbes, C., Evans, M., Hastings, N. and Peacock, B. (2011). Statistical Distributions, Hoboken, NJ, USA: John Wiley and Sons, Fourth edition.
library(VGAM)  # provides vglm(), geometric(), truncgeometric() and logitlink()
gdata <- data.frame(x2 = runif(nn <- 1000) - 0.5)
gdata <- transform(gdata, x3 = runif(nn) - 0.5,
x4 = runif(nn) - 0.5)
gdata <- transform(gdata, eta = -1.0 - 1.0 * x2 + 2.0 * x3)
gdata <- transform(gdata, prob = logitlink(eta, inverse = TRUE))
gdata <- transform(gdata, y1 = rgeom(nn, prob))
with(gdata, table(y1))
#> y1
#> 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
#> 270 196 125 94 85 53 28 22 21 19 10 13 8 11 4 5 6 1 6 3
#> 22 23 24 27 28 29 30 31 33 35 39 40 45 55
#> 1 2 3 1 1 1 2 1 1 1 1 2 2 1
fit1 <- vglm(y1 ~ x2 + x3 + x4, geometric, data = gdata, trace = TRUE)
#> Iteration 1: loglikelihood = -2253.5284
#> Iteration 2: loglikelihood = -2239.0745
#> Iteration 3: loglikelihood = -2238.9309
#> Iteration 4: loglikelihood = -2238.9309
#> Iteration 5: loglikelihood = -2238.9309
coef(fit1, matrix = TRUE)
#> logitlink(prob)
#> (Intercept) -1.0424831
#> x2 -1.1031388
#> x3 2.0009718
#> x4 -0.1133101
summary(fit1)
#>
#> Call:
#> vglm(formula = y1 ~ x2 + x3 + x4, family = geometric, data = gdata,
#> trace = TRUE)
#>
#> Coefficients:
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) -1.04248 0.03782 -27.565 <2e-16 ***
#> x2 -1.10314 0.12796 -8.621 <2e-16 ***
#> x3 2.00097 0.13473 14.852 <2e-16 ***
#> x4 -0.11331 0.12652 -0.896 0.37
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> Name of linear predictor: logitlink(prob)
#>
#> Log-likelihood: -2238.931 on 996 degrees of freedom
#>
#> Number of Fisher scoring iterations: 5
#>
#> No Hauck-Donner effect found in any of the estimates
#>
# Truncated geometric (between 0 and upper.limit)
upper.limit <- 5
tdata <- subset(gdata, y1 <= upper.limit)
nrow(tdata) # Less than nn
#> [1] 823
fit2 <- vglm(y1 ~ x2 + x3 + x4, truncgeometric(upper.limit),
data = tdata, trace = TRUE)
#> Iteration 1: loglikelihood = -1330.0291
#> Iteration 2: loglikelihood = -1328.6492
#> Iteration 3: loglikelihood = -1328.6364
#> Iteration 4: loglikelihood = -1328.6363
#> Iteration 5: loglikelihood = -1328.6363
coef(fit2, matrix = TRUE)
#> logitlink(prob)
#> (Intercept) -1.1591259
#> x2 -0.6966647
#> x3 2.6156253
#> x4 -0.1506785
# Generalized truncated geometric (between lower.limit and upper.limit)
lower.limit <- 1
upper.limit <- 8
gtdata <- subset(gdata, lower.limit <= y1 & y1 <= upper.limit)
with(gtdata, table(y1))
#> y1
#> 1 2 3 4 5 6 7 8
#> 196 125 94 85 53 28 22 21
nrow(gtdata) # Less than nn
#> [1] 624
fit3 <- vglm(y1 - lower.limit ~ x2 + x3 + x4,
truncgeometric(upper.limit - lower.limit),
data = gtdata, trace = TRUE)
#> Iteration 1: loglikelihood = -1121.4393
#> Iteration 2: loglikelihood = -1120.8169
#> Iteration 3: loglikelihood = -1120.8051
#> Iteration 4: loglikelihood = -1120.8049
#> Iteration 5: loglikelihood = -1120.8049
coef(fit3, matrix = TRUE)
#> logitlink(prob)
#> (Intercept) -0.9214736
#> x2 -0.7255879
#> x3 1.7786057
#> x4 0.2922239