Negative-Binomial Model for Counts

The function fits a negative-binomial log linear model accounting for overdispersion in counts $y$.

negbin(formula, random, data, phi.ini = NULL, warnings = FALSE, 
         na.action = na.omit, fixpar = list(),
         hessian = TRUE, control = list(maxit = 2000), ...)

Arguments

formula: A formula for the fixed effects. The left-hand side of the formula must be the counts y i.e., positive integers (y >= 0). The right-hand side can involve an offset term.
random: A right-hand formula for the overdispersion parameter(s) $\phi$.
data: A data frame containing the response (y) and explanatory variable(s).
phi.ini: Initial values for the overdispersion parameter(s) $\phi$. Default to 0.1.
warnings: Logical to control printing of warnings occurring during log-likelihood maximization. Default to FALSE (no printing).
na.action: A function name. Indicates which action should be taken in the case of missing value(s).
fixpar: A list with 2 components (scalars or vectors) of the same size, indicating which parameters are fixed (i.e., not optimized) in the global parameter vector $(b, \phi)$ and the corresponding fixed values.
For example, fixpar = list(c(4, 5), c(0, 0)) means that 4th and 5th parameters of the model are set to 0.
hessian: A logical. When set to FALSE, the hessian and the variances-covariances matrices of the parameters are not computed.
control: A list to control the optimization parameters. See optim. By default, set the maximum number of iterations to 2000.
...: Further arguments passed to optim.

Details

For a given count $y$, the model is: $$y~|~\lambda \sim Poisson(~\lambda)$$ with $\lambda$ following a Gamma distribution $Gamma(r,~\theta)$.
If $G$ denote the gamma function, then: $$P(\lambda) = r^{-\theta} * \lambda^{\theta - 1} * \frac{exp(-\frac{\lambda}{r})}{G(\theta)}$$ $$E[\lambda] = \theta * r$$ $$Var[\lambda] = \theta * r^2$$ The marginal negative-binomial distribution is: $$P(y) = G(y + \theta) * \left(\frac{1}{1 + r}\right)^\theta * \frac{(\frac{r}{1 + r})^y}{y! * G(\theta)} $$ The function uses the parameterization $\mu = \theta * r = exp(X b) = exp(\eta)$ and $\phi = 1 / \theta$, where $X$ is a design-matrix, $b$ is a vector of fixed effects, $\eta = X b$ is the linear predictor and $\phi$ the overdispersion parameter.
The marginal mean and variance are: $$E[y] = \mu$$ $$Var[y] = \mu + \phi * \mu^2$$ The parameters $b$ and $\phi$ are estimated by maximizing the log-likelihood of the marginal model (using the function optim()). Several explanatory variables are allowed in $b$. Only one is allowed in $\phi$.
An offset can be specified in the formula argument to model rates $y/T$. The offset and the marginal mean are $log(T)$ and $\mu = exp(log(T) + \eta)$, respectively.

Value

An object of formal class “glimML”: see glimML-class for details.

References

Lawless, J.F., 1987. Negative binomial and mixed Poisson regression. The Canadian Journal of Statistics, 15(3): 209-225.

Author

Matthieu Lesnoff matthieu.lesnoff@cirad.fr, Renaud Lancelot renaud.lancelot@cirad.fr

Examples

  # without offset
  data(salmonella)
  negbin(y ~ log(dose + 10) + dose, ~ 1, salmonella)
#> Negative-binomial model
#> -----------------------
#> negbin(formula = y ~ log(dose + 10) + dose, random = ~1, data = salmonella)
#> 
#> Convergence was obtained after 165 iterations.
#> 
#> Fixed-effect coefficients:
#>                  Estimate Std. Error    z value Pr(> |z|)
#> (Intercept)     2.198e+00  3.566e-01  6.162e+00 7.192e-10
#> log(dose + 10)  3.125e-01  9.940e-02  3.144e+00 1.666e-03
#> dose           -9.804e-04  4.813e-04 -2.037e+00 4.166e-02
#> 
#> Overdispersion coefficients:
#>                  Estimate Std. Error   z value   Pr(> z)
#> phi.(Intercept) 4.878e-02  3.238e-02 1.506e+00 6.601e-02
#> 
#> Log-likelihood statistics
#>    Log-lik      nbpar    df res.   Deviance        AIC       AICc 
#> -6.289e+01          4         14  3.324e+01  1.338e+02  1.369e+02 
  library(MASS) # function glm.nb in MASS
  fm.nb <- glm.nb(y ~ log(dose + 10) + dose,
                  link = log, data = salmonella)
  coef(fm.nb)
#>    (Intercept) log(dose + 10)           dose 
#>   2.1976273870   0.3125097501  -0.0009802996 
  1 / fm.nb$theta # theta = 1 / phi
#> [1] 0.0487684
  c(logLik(fm.nb), AIC(fm.nb))
#> [1] -62.88959 133.77918
  # with offset
  data(dja)
  negbin(y ~ group + offset(log(trisk)), ~ group, dja)
#> Negative-binomial model
#> -----------------------
#> negbin(formula = y ~ group + offset(log(trisk)), random = ~group, 
#>     data = dja)
#> 
#> Convergence was obtained after 243 iterations.
#> 
#> Fixed-effect coefficients:
#>               Estimate Std. Error    z value Pr(> |z|)
#> (Intercept) -5.405e-01  2.296e-01 -2.354e+00 1.859e-02
#> groupTREAT  -1.036e+00  2.616e-01 -3.959e+00 7.523e-05
#> 
#> Overdispersion coefficients:
#>                 Estimate Std. Error   z value   Pr(> z)
#> phi.groupCTRL  8.384e-01  4.178e-01 2.007e+00 2.238e-02
#> phi.groupTREAT 7.673e-08  2.000e-13 0.000e+00 1.000e+00
#> 
#> Log-likelihood statistics
#>    Log-lik      nbpar    df res.   Deviance        AIC       AICc 
#> -1.212e+02          4         71  1.118e+02  2.503e+02  2.509e+02 
  # phi fixed to zero in group TREAT
  negbin(y ~ group + offset(log(trisk)), ~ group, dja,
    fixpar = list(4, 0))
#> Negative-binomial model
#> -----------------------
#> negbin(formula = y ~ group + offset(log(trisk)), random = ~group, 
#>     data = dja, fixpar = list(4, 0))
#> 
#> Convergence was obtained after 113 iterations.
#> 
#> Fixed-effect coefficients:
#>               Estimate Std. Error    z value Pr(> |z|)
#> (Intercept) -5.526e-01  2.277e-01 -2.427e+00 1.524e-02
#> groupTREAT  -1.020e+00  2.598e-01 -3.929e+00 8.542e-05
#> 
#> Overdispersion coefficients:
#>                Estimate Std. Error   z value   Pr(> z)
#> phi.groupCTRL 8.287e-01   4.12e-01 2.012e+00 2.213e-02
#> 
#> Overdispersion coefficients set to fixed values:
#>                Value
#> phi.groupTREAT     0
#> 
#> Log-likelihood statistics
#>    Log-lik      nbpar    df res.   Deviance        AIC       AICc 
#> -1.211e+02          3         72  1.118e+02  2.483e+02  2.486e+02 
  # glim without overdispersion
  summary(glm(y ~ group + offset(log(trisk)),
    family = poisson, data = dja))
#> 
#> Call:
#> glm(formula = y ~ group + offset(log(trisk)), family = poisson, 
#>     data = dja)
#> 
#> Coefficients:
#>             Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)  -0.6975     0.1170  -5.960 2.53e-09 ***
#> groupTREAT   -0.8754     0.1712  -5.112 3.19e-07 ***
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> (Dispersion parameter for poisson family taken to be 1)
#> 
#>     Null deviance: 162.67  on 74  degrees of freedom
#> Residual deviance: 136.86  on 73  degrees of freedom
#> AIC: 271.33
#> 
#> Number of Fisher Scoring iterations: 5
#> 
  # phi fixed to zero in both groups
  negbin(y ~ group + offset(log(trisk)), ~ group, dja,
    fixpar = list(c(3, 4), c(0, 0))) 
#> Negative-binomial model
#> -----------------------
#> negbin(formula = y ~ group + offset(log(trisk)), random = ~group, 
#>     data = dja, fixpar = list(c(3, 4), c(0, 0)))
#> 
#> Convergence was obtained after 41 iterations.
#> 
#> Fixed-effect coefficients:
#>               Estimate Std. Error    z value Pr(> |z|)
#> (Intercept) -6.975e-01  1.170e-01 -5.960e+00 2.528e-09
#> groupTREAT  -8.754e-01  1.712e-01 -5.112e+00 3.186e-07
#> 
#> Overdispersion coefficients set to fixed values:
#>                Value
#> phi.groupCTRL      0
#> phi.groupTREAT     0
#> 
#> Log-likelihood statistics
#>    Log-lik      nbpar    df res.   Deviance        AIC       AICc 
#> -1.337e+02          2         73  1.369e+02  2.713e+02  2.715e+02