The two parameters of the univariate standard simplex distribution are estimated by full maximum likelihood estimation.
simplex(lmu = "logitlink", lsigma = "loglink", imu = NULL, isigma = NULL,
        imethod = 1, ishrinkage = 0.95, zero = "sigma")
lmu, lsigma: Link functions for mu and sigma.
See Links for more choices.
imu, isigma: Optional initial values for mu and sigma.
A NULL means a value is obtained internally.
imethod, ishrinkage, zero: See CommonVGAMffArguments for information.
The probability density function can be written
$$f(y; \mu, \sigma) = [2 \pi \sigma^2 (y (1-y))^3]^{-0.5}
\exp[-0.5 (y-\mu)^2 / (\sigma^2 y (1-y) \mu^2 (1-\mu)^2)]
$$
for \(0 < y < 1\),
\(0 < \mu < 1\),
and \(\sigma > 0\).
The mean of \(Y\) is \(\mu\) (called mu, and
returned as the fitted values).
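The density and the statement about its mean can be checked numerically. The following is a minimal sketch in base R, not the package's own implementation: the helper name simplex_density and the test values mu = 0.3 and sigma = 1 are illustrative assumptions.

```r
# Density of the standard simplex distribution, transcribed from the
# formula above. Hypothetical helper, not part of VGAM.
simplex_density <- function(y, mu, sigma) {
  # Unit deviance term: (y - mu)^2 / (y (1 - y) mu^2 (1 - mu)^2)
  unit_dev <- (y - mu)^2 / (y * (1 - y) * mu^2 * (1 - mu)^2)
  exp(-0.5 * unit_dev / sigma^2) /
    sqrt(2 * pi * sigma^2 * (y * (1 - y))^3)
}

mu    <- 0.3  # illustrative value in (0, 1)
sigma <- 1    # illustrative value > 0

# Total probability over (0, 1); should be close to 1.
area <- integrate(simplex_density, lower = 0, upper = 1,
                  mu = mu, sigma = sigma)$value

# First moment; should be close to mu.
mean_y <- integrate(function(y) y * simplex_density(y, mu, sigma),
                    lower = 0, upper = 1)$value

print(c(area = area, mean = mean_y))
```

The integrand is evaluated only at interior points of (0, 1), so the 0/0 form that arises exactly at the endpoints is not encountered.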
The second parameter, sigma, of this standard simplex
distribution is known as the dispersion parameter.
The unit variance function is
\(V(\mu) = \mu^3 (1-\mu)^3\).
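As a small illustration of this function, the sketch below evaluates it on a few values of mu in base R. The helper name unit_variance is an assumption made for the example.

```r
# Unit variance function V(mu) = mu^3 (1 - mu)^3 (hypothetical helper).
unit_variance <- function(mu) mu^3 * (1 - mu)^3

mu_grid <- c(0.01, 0.1, 0.5, 0.9, 0.99)
v <- unit_variance(mu_grid)

# V is symmetric about 0.5, largest there, and shrinks rapidly
# as mu approaches 0 or 1.
print(data.frame(mu = mu_grid, V = v))
```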
Fisher scoring is applied to both parameters.
An object of class "vglmff" (see vglmff-class).
The object is used by modelling functions such as vglm,
and vgam.
Jorgensen, B. (1997). The Theory of Dispersion Models. London: Chapman & Hall.
Song, P. X.-K. (2007). Correlated Data Analysis: Modeling, Analytics, and Applications. Springer.
This distribution is potentially useful for dispersion modelling.
Numerical problems may occur when mu is very close to 0 or 1.
sdata <- data.frame(x2 = runif(nn <- 1000))
sdata <- transform(sdata, eta1 = 1 + 2 * x2,
                          eta2 = 1 - 2 * x2)
sdata <- transform(sdata, y = rsimplex(nn, mu = logitlink(eta1, inverse = TRUE),
                                       dispersion = exp(eta2)))
(fit <- vglm(y ~ x2, simplex(zero = NULL), data = sdata, trace = TRUE))
#> Iteration 1: loglikelihood = 1316.3575
#> Iteration 2: loglikelihood = 1682.8941
#> Iteration 3: loglikelihood = 1936.7601
#> Iteration 4: loglikelihood = 2056.5911
#> Iteration 5: loglikelihood = 2085.3816
#> Iteration 6: loglikelihood = 2087.2666
#> Iteration 7: loglikelihood = 2087.2804
#> Iteration 8: loglikelihood = 2087.2804
#> Iteration 9: loglikelihood = 2087.2804
#>
#> Call:
#> vglm(formula = y ~ x2, family = simplex(zero = NULL), data = sdata,
#> trace = TRUE)
#>
#>
#> Coefficients:
#> (Intercept):1 (Intercept):2 x2:1 x2:2
#> 0.9786911 1.0332532 2.0166739 -2.0563965
#>
#> Degrees of Freedom: 2000 Total; 1996 Residual
#> Log-likelihood: 2087.28
coef(fit, matrix = TRUE)
#> logitlink(mu) loglink(sigma)
#> (Intercept) 0.9786911 1.033253
#> x2 2.0166739 -2.056397
summary(fit)
#>
#> Call:
#> vglm(formula = y ~ x2, family = simplex(zero = NULL), data = sdata,
#> trace = TRUE)
#>
#> Coefficients:
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept):1 0.97869 0.02836 34.51 <2e-16 ***
#> (Intercept):2 1.03325 0.04354 23.73 <2e-16 ***
#> x2:1 2.01667 0.03357 60.07 <2e-16 ***
#> x2:2 -2.05640 0.07603 -27.05 <2e-16 ***
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> Names of linear predictors: logitlink(mu), loglink(sigma)
#>
#> Log-likelihood: 2087.28 on 1996 degrees of freedom
#>
#> Number of Fisher scoring iterations: 9
#>
#> No Hauck-Donner effect found in any of the estimates
#>
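As a rough check on the fit, the coefficients printed above can be back-transformed by hand and compared with the values used to simulate the data. This is a sketch in base R only: the coefficient values are copied from the output above, and the names x2_new, mu_hat and sigma_hat are illustrative.

```r
# Estimated coefficients, copied from the coef(fit, matrix = TRUE) output.
beta_mu    <- c(intercept = 0.9786911, x2 = 2.0166739)   # logitlink(mu)
beta_sigma <- c(intercept = 1.0332532, x2 = -2.0563965)  # loglink(sigma)

x2_new <- 0.5  # illustrative covariate value

# Invert the links: inverse logit for mu, exp for sigma.
mu_hat    <- plogis(beta_mu[["intercept"]] + beta_mu[["x2"]] * x2_new)
sigma_hat <- exp(beta_sigma[["intercept"]] + beta_sigma[["x2"]] * x2_new)

# Values implied by the simulation: eta1 = 1 + 2 * x2, eta2 = 1 - 2 * x2.
mu_true    <- plogis(1 + 2 * x2_new)
sigma_true <- exp(1 - 2 * x2_new)

print(c(mu_hat = mu_hat, mu_true = mu_true,
        sigma_hat = sigma_hat, sigma_true = sigma_true))
```

With nn = 1000 observations the estimates land close to the simulation values, which is consistent with the small standard errors reported by summary(fit).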