Quasi-Likelihood Model for Proportions

The function fits the generalized linear model “II” proposed by Williams (1982) accounting for overdispersion in clustered binomial data $(n, y)$.

quasibin(formula, data, link = c("logit", "cloglog"), phi = NULL, tol =  0.001)

Arguments

formula: A formula for the fixed effects. The left-hand side of the formula must be of the form cbind(y, n - y) where the modelled probability is y/n.
link: The link function for the mean $p$: “logit” or “cloglog”.
data: A data frame containing the response (n and y) and explanatory variable(s).
phi: When phi is NULL (the default), the overdispersion parameter $\phi$ is estimated from the data. Otherwise, its value is considered as fixed.
tol: A positive scalar (default to 0.001). The algorithm stops at iteration $r + 1$ when the condition $\chi{^2}[r+1] - \chi{^2}[r] <= tol$ is met by the $\chi^2$ statistics .

Details

For a given cluster $(n, y)$, the model is: $$y~|~\lambda \sim Binomial(n,~\lambda)$$ with $\lambda$ a random variable of mean $E[\lambda] = p$ and variance $Var[\lambda] = \phi * p * (1 - p)$.
The marginal mean and variance are: $$E[y] = p$$ $$Var[y] = p * (1 - p) * [1 + (n - 1) * \phi]$$ The overdispersion parameter $\phi$ corresponds to the intra-cluster correlation coefficient, which is here restricted to be positive.
The function uses the function glm and the parameterization: $p = h(X b) = h(\eta)$, where $h$ is the inverse of a given link function, $X$ is a design-matrix, $b$ is a vector of fixed effects and $\eta = X b$ is the linear predictor.
The estimate of $b$ maximizes the quasi log-likelihood of the marginal model. The parameter $\phi$ is estimated with the moment method or can be set to a constant (a regular glim is fitted when $\phi$ is set to zero). The literature recommends to estimate $\phi$ from the saturated model. Several explanatory variables are allowed in $b$. None is allowed in $\phi$.

Value

An object of formal class “glimQL”: see glimQL-class for details.

References

Moore, D.F., 1987, Modelling the extraneous variance in the presence of extra-binomial variation. Appl. Statist. 36, 8-14.
Williams, D.A., 1982, Extra-binomial variation in logistic linear models. Appl. Statist. 31, 144-148.

Author

Matthieu Lesnoff matthieu.lesnoff@cirad.fr, Renaud Lancelot renaud.lancelot@cirad.fr

Examples

  data(orob2) 
  fm1 <- glm(cbind(y, n - y) ~ seed * root,
             family = binomial, data = orob2)
  fm2 <- quasibin(cbind(y, n - y) ~ seed * root,
                  data = orob2, phi = 0)
  fm3 <- quasibin(cbind(y, n - y) ~ seed * root,
                  data = orob2)
  rbind(fm1 = coef(fm1), fm2 = coef(fm2), fm3 = coef(fm3))
#>     (Intercept)     seedO75 rootCUCUMBER seedO75:rootCUCUMBER
#> fm1  -0.4122448 -0.14592695    0.5400782            0.7781037
#> fm2  -0.4122448 -0.14592695    0.5400782            0.7781037
#> fm3  -0.4653218 -0.07008965    0.5102360            0.8195562
  # show the model
  fm3
#> Quasi-likelihood generalized linear model
#> -----------------------------------------
#> quasibin(formula = cbind(y, n - y) ~ seed * root, data = orob2)
#> 
#> Fixed-effect coefficients:
#>                      Estimate Std. Error z value Pr(>|z|)
#> (Intercept)           -0.4653     0.2439 -1.9081   0.0564
#> seedO75               -0.0701     0.3115 -0.2250   0.8219
#> rootCUCUMBER           0.5102     0.3347  1.5244   0.1274
#> seedO75:rootCUCUMBER   0.8196     0.4352  1.8831   0.0597
#> 
#> Overdispersion parameter:
#>    phi 
#> 0.0249 
#> 
#> Pearson's chi-squared goodness-of-fit statistic = 17.0007 
#> 
  # dispersion parameter and goodness-of-fit statistic
  c(phi = fm3@phi, 
    X2 = sum(residuals(fm3, type = "pearson")^2))
#>         phi          X2 
#>  0.02493559 17.00065130 
  # model predictions
  predfm1 <- predict(fm1, type = "response", se = TRUE)
  predfm3 <- predict(fm3, type = "response", se = TRUE)
  New <- expand.grid(seed = levels(orob2$seed),
                     root = levels(orob2$root))
  predict(fm3, New, se = TRUE, type = "response")
#> $fit
#>         1         2         3         4 
#> 0.3857241 0.3692557 0.5112267 0.6887712 
#> 
#> $se.fit
#>          1          2          3          4 
#> 0.05778166 0.04512309 0.05728607 0.04278711 
#> 
#> $residual.scale
#> [1] 1
#> 
  data.frame(orob2, p1 = predfm1$fit, 
                    se.p1 = predfm1$se.fit,
                    p3 = predfm3$fit,
                    se.p3 = predfm3$se.fit)
#>    seed     root  n  y        p1      se.p1        p3      se.p3
#> 1   O75     BEAN 39 10 0.3639706 0.02917342 0.3692557 0.04512309
#> 2   O75     BEAN 62 23 0.3639706 0.02917342 0.3692557 0.04512309
#> 3   O75     BEAN 81 23 0.3639706 0.02917342 0.3692557 0.04512309
#> 4   O75     BEAN 51 26 0.3639706 0.02917342 0.3692557 0.04512309
#> 5   O75     BEAN 39 17 0.3639706 0.02917342 0.3692557 0.04512309
#> 6   O75 CUCUMBER  6  5 0.6813559 0.02712870 0.6887712 0.04278711
#> 7   O75 CUCUMBER 74 53 0.6813559 0.02712870 0.6887712 0.04278711
#> 8   O75 CUCUMBER 72 55 0.6813559 0.02712870 0.6887712 0.04278711
#> 9   O75 CUCUMBER 51 32 0.6813559 0.02712870 0.6887712 0.04278711
#> 10  O75 CUCUMBER 79 46 0.6813559 0.02712870 0.6887712 0.04278711
#> 11  O75 CUCUMBER 13 10 0.6813559 0.02712870 0.6887712 0.04278711
#> 12  O73     BEAN 16  8 0.3983740 0.04414243 0.3857241 0.05778166
#> 13  O73     BEAN 30 10 0.3983740 0.04414243 0.3857241 0.05778166
#> 14  O73     BEAN 28  8 0.3983740 0.04414243 0.3857241 0.05778166
#> 15  O73     BEAN 45 23 0.3983740 0.04414243 0.3857241 0.05778166
#> 16  O73     BEAN  4  0 0.3983740 0.04414243 0.3857241 0.05778166
#> 17  O73 CUCUMBER 12  3 0.5319149 0.04202173 0.5112267 0.05728607
#> 18  O73 CUCUMBER 41 22 0.5319149 0.04202173 0.5112267 0.05728607
#> 19  O73 CUCUMBER 30 15 0.5319149 0.04202173 0.5112267 0.05728607
#> 20  O73 CUCUMBER 51 32 0.5319149 0.04202173 0.5112267 0.05728607
#> 21  O73 CUCUMBER  7  3 0.5319149 0.04202173 0.5112267 0.05728607
  fm4 <- quasibin(cbind(y, n - y) ~ seed + root,
                  data = orob2, phi = fm3@phi)
  # Pearson's chi-squared goodness-of-fit statistic
  # compare with fm3's X2
  sum(residuals(fm4, type = "pearson")^2)
#> [1] 20.72527