Generalized Beta Distribution Family Function

Maximum likelihood estimation of the 3-parameter generalized beta distribution as proposed by Libby and Novick (1982).

lino(lshape1 = "loglink", lshape2 = "loglink", llambda = "loglink",
     ishape1 = NULL,   ishape2 = NULL,   ilambda = 1, zero = NULL)

Arguments

lshape1, lshape2

Parameter link functions applied to the two (positive) shape parameters $a$ and $b$. See Links for more choices.

llambda

Parameter link function applied to the parameter $\lambda$. See Links for more choices.

ishape1, ishape2, ilambda

Initial values for the parameters. A NULL value means one is computed internally. The argument ilambda must be numeric, and the default corresponds to a standard beta distribution.

zero

Can be an integer-valued vector specifying which linear/additive predictors are modelled as intercepts only. Here, the values must be from the set {1,2,3} which correspond to $a$, $b$, $\lambda$, respectively. See CommonVGAMffArguments for more information.

Details

Proposed by Libby and Novick (1982), this distribution has density $$f(y;a,b,\lambda) = \frac{\lambda^{a} y^{a-1} (1-y)^{b-1}}{ B(a,b) \{1 - (1-\lambda) y\}^{a+b}}$$ for $a > 0$, $b > 0$, $\lambda > 0$, $0 < y < 1$. Here $B$ is the beta function (see beta). The mean is a complicated function involving the Gauss hypergeometric function. If $X$ has a lino distribution with parameters shape1, shape2, lambda, then $Y=\lambda X/(1-(1-\lambda)X)$ has a standard beta distribution with parameters shape1, shape2.

Since $\log(\lambda)=0$ corresponds to the standard beta distribution, a summary of the fitted model performs a t-test for whether the data belongs to a standard beta distribution (provided the loglink link for $\lambda$ is used; this is the default).

Value

An object of class "vglmff" (see vglmff-class). The object is used by modelling functions such as vglm, and vgam.

References

Libby, D. L. and Novick, M. R. (1982). Multivariate generalized beta distributions with applications to utility assessment. Journal of Educational Statistics, 7, 271–294.

Gupta, A. K. and Nadarajah, S. (2004). Handbook of Beta Distribution and Its Applications, NY: Marcel Dekker, Inc.

Author

T. W. Yee

Note

The fitted values, which is usually the mean, have not been implemented yet. Currently the median is returned as the fitted values.

Although Fisher scoring is used, the working weight matrices are positive-definite only in a certain region of the parameter space. Problems with this indicate poor initial values or an ill-conditioned model or insufficient data etc.

This model is can be difficult to fit. A reasonably good value of ilambda seems to be needed so if the self-starting initial values fail, try experimenting with the initial value arguments. Experience suggests ilambda is better a little larger, rather than smaller, compared to the true value.

Examples

ldata <- data.frame(y1 = rbeta(n = 1000, exp(0.5), exp(1)))  # Std beta
fit <- vglm(y1 ~ 1, lino, data = ldata, trace = TRUE)
#> Iteration 1: loglikelihood = 215.0564
#> Iteration 2: loglikelihood = 215.05693
#> Iteration 3: loglikelihood = 215.05693
coef(fit, matrix = TRUE)
#>             loglink(shape1) loglink(shape2) loglink(lambda)
#> (Intercept)       0.4801395       0.9822259      0.03628847
Coef(fit)
#>   shape1   shape2   lambda 
#> 1.616300 2.670394 1.036955 
head(fitted(fit))
#>          [,1]
#> [1,] 0.348158
#> [2,] 0.348158
#> [3,] 0.348158
#> [4,] 0.348158
#> [5,] 0.348158
#> [6,] 0.348158
summary(fit)
#> 
#> Call:
#> vglm(formula = y1 ~ 1, family = lino, data = ldata, trace = TRUE)
#> 
#> Coefficients: 
#>               Estimate Std. Error z value Pr(>|z|)    
#> (Intercept):1  0.48014    0.07356   6.527 6.71e-11 ***
#> (Intercept):2  0.98223    0.11488   8.550  < 2e-16 ***
#> (Intercept):3  0.03629    0.21162   0.171    0.864    
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> Names of linear predictors: loglink(shape1), loglink(shape2), loglink(lambda)
#> 
#> Log-likelihood: 215.0569 on 2997 degrees of freedom
#> 
#> Number of Fisher scoring iterations: 3 
#> 
#> No Hauck-Donner effect found in any of the estimates
#> 

# Nonstandard beta distribution
ldata <- transform(ldata, y2 = rlino(1000, shape1 = exp(1),
                                     shape2 = exp(2), lambda = exp(1)))
fit2 <- vglm(y2 ~ 1,
             lino(lshape1 = "identitylink", lshape2 = "identitylink",
                  ilamb = 10), data = ldata, trace = TRUE)
#> Iteration 1: loglikelihood = 1292.9045
#> Iteration 2: loglikelihood = 1309.3041
#> Iteration 3: loglikelihood = 1310.124
#> Iteration 4: loglikelihood = 1311.4496
#> Iteration 5: loglikelihood = 1311.4793
#> Iteration 6: loglikelihood = 1311.4795
#> Iteration 7: loglikelihood = 1311.4795
coef(fit2, matrix = TRUE)
#>               shape1   shape2 loglink(lambda)
#> (Intercept) 2.985477 9.271763       0.8490487