binom2.rho.RdFits a bivariate probit model to two binary responses. The correlation parameter rho is the measure of dependency.
binom2.rho(lmu = "probitlink", lrho = "rhobitlink",
imu1 = NULL, imu2 = NULL,
irho = NULL, imethod = 1, zero = "rho",
exchangeable = FALSE, grho = seq(-0.95, 0.95, by = 0.05),
nsimEIM = NULL)
binom2.Rho(rho = 0, imu1 = NULL, imu2 = NULL,
exchangeable = FALSE, nsimEIM = NULL)Link function applied to the marginal probabilities. Should be left alone.
Link function applied to the \(\rho\) association
parameter.
See Links for more choices.
Optional initial values for the two marginal probabilities. May be a vector.
Optional initial value for \(\rho\). If given, this should lie between \(-1\) and \(1\). See below for more comments.
Specifies which linear/additive predictors are modelled as
intercept-only.
A NULL means none.
Numerically, the \(\rho\) parameter is easiest
modelled as an intercept only, hence the default.
See CommonVGAMffArguments for more information.
Logical.
If TRUE, the two marginal probabilities are constrained
to be equal.
See CommonVGAMffArguments for more information.
A value of at least 100 for nsimEIM is recommended;
the larger the value the better.
Numeric vector. Values are recycled to the needed length, and ought to be in range, which is \((-1, 1)\).
The bivariate probit model was one of the earliest regression models to handle two binary responses jointly. It has a probit link for each of the two marginal probabilities, and models the association between the responses by the \(\rho\) parameter of a standard bivariate normal distribution (with zero means and unit variances). One can think of the joint probabilities being \(\Phi(\eta_1,\eta_2;\rho)\) where \(\Phi\) is the cumulative distribution function of a standard bivariate normal distribution.
Explicitly, the default model is $$probit[P(Y_j=1)] = \eta_j,\ \ \ j=1,2$$ for the marginals, and $$rhobit[rho] = \eta_3.$$ The joint probability \(P(Y_1=1,Y_2=1)= \Phi(\eta_1,\eta_2;\rho)\), and from these the other three joint probabilities are easily computed. The model is fitted by maximum likelihood estimation since the full likelihood is specified. Fisher scoring is implemented.
The default models \(\eta_3\) as a single parameter
only, i.e., an intercept-only model for rho, but this can be
circumvented by setting zero = NULL in order to model
rho as a function of all the explanatory variables.
The bivariate probit model should not be confused with
a bivariate logit model with a probit link (see
binom2.or). The latter uses the odds ratio to
quantify the association. Actually, the bivariate logit model
is recommended over the bivariate probit model because the
odds ratio is a more natural way of measuring the association
between two binary responses.
An object of class "vglmff" (see
vglmff-class). The object is used by modelling
functions such as vglm, and vgam.
When fitted, the fitted.values slot of the object
contains the four joint probabilities, labelled as
\((Y_1,Y_2)\) = (0,0), (0,1), (1,0), (1,1),
respectively.
Ashford, J. R. and Sowden, R. R. (1970). Multi-variate probit analysis. Biometrics, 26, 535–546.
Freedman, D. A. (2010). Statistical Models and Causal Inference: a Dialogue with the Social Sciences, Cambridge: Cambridge University Press.
Freedman, D. A. and Sekhon, J. S. (2010). Endogeneity in probit response models. Political Analysis, 18, 138–150.
See binom2.or about the form of input the response
should have.
By default, a constant \(\rho\) is fitted because
zero = "rho". Set zero = NULL if you want
the \(\rho\) parameter to be modelled as a function
of the explanatory variables. The value \(\rho\)
lies in the interval \((-1,1)\), therefore a
rhobitlink link is default.
Converge problems can occur.
If so, assign irho a range of
values and monitor convergence (e.g., set trace = TRUE).
Else try imethod.
Practical experience shows that local solutions can occur,
and that irho needs to be quite close to the (global)
solution.
Also, imu1 and imu2 may be used.
This help file is mainly about binom2.rho().
binom2.Rho() fits a bivariate probit model with
known \(\rho\).
The inputted rho is saved in the misc slot of
the fitted object, with rho as the component name.
In some econometrics applications (e.g., Freedman 2010, Freedman and Sekhon 2010) one response is used as an explanatory variable, e.g., a recursive binomial probit model. Such will not work here. Historically, the bivariate probit model was the first VGAM I ever wrote, based on Ashford and Sowden (1970). I don't think they ever thought of it either! Hence the criticisms raised go beyond the use of what was originally intended.
coalminers <- transform(coalminers, Age = (age - 42) / 5)
fit <- vglm(cbind(nBnW, nBW, BnW, BW) ~ Age,
binom2.rho, data = coalminers, trace = TRUE)
#> Iteration 1: loglikelihood = -95.59942
#> Iteration 2: loglikelihood = -95.59848
#> Iteration 3: loglikelihood = -95.59848
summary(fit)
#>
#> Call:
#> vglm(formula = cbind(nBnW, nBW, BnW, BW) ~ Age, family = binom2.rho,
#> data = coalminers, trace = TRUE)
#>
#> Coefficients:
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept):1 -1.279160 0.014435 -88.61 <2e-16 ***
#> (Intercept):2 -0.881462 0.011242 -78.41 <2e-16 ***
#> (Intercept):3 2.044267 0.043254 47.26 <2e-16 ***
#> Age:1 0.273350 0.006178 44.25 <2e-16 ***
#> Age:2 0.184643 0.004901 37.68 <2e-16 ***
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> Names of linear predictors: probitlink(mu1), probitlink(mu2), rhobitlink(rho)
#>
#> Log-likelihood: -95.5985 on 22 degrees of freedom
#>
#> Number of Fisher scoring iterations: 3
#>
#> No Hauck-Donner effect found in any of the estimates
#>
coef(fit, matrix = TRUE)
#> probitlink(mu1) probitlink(mu2) rhobitlink(rho)
#> (Intercept) -1.2791601 -0.8814624 2.044267
#> Age 0.2733501 0.1846432 0.000000