Bivariate Probit Regression

Fits a bivariate probit model to two binary responses. The correlation parameter rho is the measure of dependency.

binom2.rho(lmu = "probitlink", lrho = "rhobitlink",
           imu1 = NULL, imu2 = NULL,
           irho = NULL, imethod = 1, zero = "rho",
           exchangeable = FALSE, grho = seq(-0.95, 0.95, by = 0.05),
           nsimEIM = NULL)
binom2.Rho(rho = 0, imu1 = NULL, imu2 = NULL,
           exchangeable = FALSE, nsimEIM = NULL)

Arguments

lmu

Link function applied to the marginal probabilities. Should be left alone.

lrho

Link function applied to the $\rho$ association parameter. See Links for more choices.

imu1, imu2

Optional initial values for the two marginal probabilities. May be a vector.

irho

Optional initial value for $\rho$. If given, this should lie between $-1$ and $1$. See below for more comments.

zero

Specifies which linear/additive predictors are modelled as intercept-only. A NULL means none. Numerically, the $\rho$ parameter is easiest modelled as an intercept only, hence the default. See CommonVGAMffArguments for more information.

exchangeable

Logical. If TRUE, the two marginal probabilities are constrained to be equal.

imethod, nsimEIM, grho

See CommonVGAMffArguments for more information. A value of at least 100 for nsimEIM is recommended; the larger the value the better.

rho

Numeric vector. Values are recycled to the needed length, and ought to be in range, which is $(-1, 1)$.

Details

The bivariate probit model was one of the earliest regression models to handle two binary responses jointly. It has a probit link for each of the two marginal probabilities, and models the association between the responses by the $\rho$ parameter of a standard bivariate normal distribution (with zero means and unit variances). One can think of the joint probabilities being $\Phi(\eta_1,\eta_2;\rho)$ where $\Phi$ is the cumulative distribution function of a standard bivariate normal distribution.

Explicitly, the default model is $$probit[P(Y_j=1)] = \eta_j,\ \ \ j=1,2$$ for the marginals, and $$rhobit[rho] = \eta_3.$$ The joint probability $P(Y_1=1,Y_2=1)= \Phi(\eta_1,\eta_2;\rho)$, and from these the other three joint probabilities are easily computed. The model is fitted by maximum likelihood estimation since the full likelihood is specified. Fisher scoring is implemented.

The default models $\eta_3$ as a single parameter only, i.e., an intercept-only model for rho, but this can be circumvented by setting zero = NULL in order to model rho as a function of all the explanatory variables.

The bivariate probit model should not be confused with a bivariate logit model with a probit link (see binom2.or). The latter uses the odds ratio to quantify the association. Actually, the bivariate logit model is recommended over the bivariate probit model because the odds ratio is a more natural way of measuring the association between two binary responses.

Value

An object of class "vglmff" (see vglmff-class). The object is used by modelling functions such as vglm, and vgam.

When fitted, the fitted.values slot of the object contains the four joint probabilities, labelled as $(Y_1,Y_2)$ = (0,0), (0,1), (1,0), (1,1), respectively.

References

Ashford, J. R. and Sowden, R. R. (1970). Multi-variate probit analysis. Biometrics, 26, 535–546.

Freedman, D. A. (2010). Statistical Models and Causal Inference: a Dialogue with the Social Sciences, Cambridge: Cambridge University Press.

Freedman, D. A. and Sekhon, J. S. (2010). Endogeneity in probit response models. Political Analysis, 18, 138–150.

Author

Thomas W. Yee

Note

See binom2.or about the form of input the response should have.

By default, a constant $\rho$ is fitted because zero = "rho". Set zero = NULL if you want the $\rho$ parameter to be modelled as a function of the explanatory variables. The value $\rho$ lies in the interval $(-1,1)$, therefore a rhobitlink link is default.

Converge problems can occur. If so, assign irho a range of values and monitor convergence (e.g., set trace = TRUE). Else try imethod. Practical experience shows that local solutions can occur, and that irho needs to be quite close to the (global) solution. Also, imu1 and imu2 may be used.

This help file is mainly about binom2.rho(). binom2.Rho() fits a bivariate probit model with known $\rho$. The inputted rho is saved in the misc slot of the fitted object, with rho as the component name.

In some econometrics applications (e.g., Freedman 2010, Freedman and Sekhon 2010) one response is used as an explanatory variable, e.g., a recursive binomial probit model. Such will not work here. Historically, the bivariate probit model was the first VGAM I ever wrote, based on Ashford and Sowden (1970). I don't think they ever thought of it either! Hence the criticisms raised go beyond the use of what was originally intended.

Examples

coalminers <- transform(coalminers, Age = (age - 42) / 5)
fit <- vglm(cbind(nBnW, nBW, BnW, BW) ~ Age,
            binom2.rho, data = coalminers, trace = TRUE)
#> Iteration 1: loglikelihood = -95.59942
#> Iteration 2: loglikelihood = -95.59848
#> Iteration 3: loglikelihood = -95.59848
summary(fit)
#> 
#> Call:
#> vglm(formula = cbind(nBnW, nBW, BnW, BW) ~ Age, family = binom2.rho, 
#>     data = coalminers, trace = TRUE)
#> 
#> Coefficients: 
#>                Estimate Std. Error z value Pr(>|z|)    
#> (Intercept):1 -1.279160   0.014435  -88.61   <2e-16 ***
#> (Intercept):2 -0.881462   0.011242  -78.41   <2e-16 ***
#> (Intercept):3  2.044267   0.043254   47.26   <2e-16 ***
#> Age:1          0.273350   0.006178   44.25   <2e-16 ***
#> Age:2          0.184643   0.004901   37.68   <2e-16 ***
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> Names of linear predictors: probitlink(mu1), probitlink(mu2), rhobitlink(rho)
#> 
#> Log-likelihood: -95.5985 on 22 degrees of freedom
#> 
#> Number of Fisher scoring iterations: 3 
#> 
#> No Hauck-Donner effect found in any of the estimates
#> 
coef(fit, matrix = TRUE)
#>             probitlink(mu1) probitlink(mu2) rhobitlink(rho)
#> (Intercept)      -1.2791601      -0.8814624        2.044267
#> Age               0.2733501       0.1846432        0.000000