Gaussian Copula (Bivariate) Family Function

Estimate the correlation parameter of the (bivariate) Gaussian copula distribution by maximum likelihood estimation.

binormalcop(lrho = "rhobitlink", irho = NULL,
            imethod = 1, parallel = FALSE, zero = NULL)

Arguments

lrho, irho, imethod

Details at CommonVGAMffArguments. See Links for more link function choices.

parallel, zero

Details at CommonVGAMffArguments. If parallel = TRUE then the constraint is applied to the intercept too.

Details

The cumulative distribution function is $$P(Y_1 \leq y_1, Y_2 \leq y_2) = \Phi_2 ( \Phi^{-1}(y_1), \Phi^{-1}(y_2); \rho ) $$ for $-1 < \rho < 1$, $\Phi_2$ is the cumulative distribution function of a standard bivariate normal (see pbinorm), and $\Phi$ is the cumulative distribution function of a standard univariate normal (see pnorm).

The support of the function is the interior of the unit square; however, values of 0 and/or 1 are not allowed. The marginal distributions are the standard uniform distributions. When $\rho = 0$ the random variables are independent.

This VGAM family function can handle multiple responses, for example, a six-column matrix where the first 2 columns is the first out of three responses, the next 2 columns being the next response, etc.

Value

An object of class "vglmff" (see vglmff-class). The object is used by modelling functions such as vglm and vgam.

References

Schepsmeier, U. and Stober, J. (2014). Derivatives and Fisher information of bivariate copulas. Statistical Papers 55, 525–542.

Author

T. W. Yee

Note

The response matrix must have a multiple of two-columns. Currently, the fitted value is a matrix with the same number of columns and values equal to 0.5. This is because each marginal distribution corresponds to a standard uniform distribution.

This VGAM family function is fragile; each response must be in the interior of the unit square. Setting crit = "coef" is sometimes a good idea because inaccuracies in pbinorm might mean unnecessary half-stepping will occur near the solution.

Examples

nn <- 1000
ymat <- rbinormcop(nn, rho = rhobitlink(-0.9, inverse = TRUE))
bdata <- data.frame(y1 = ymat[, 1], y2 = ymat[, 2],
                    y3 = ymat[, 1], y4 = ymat[, 2],
                    x2 = runif(nn))
summary(bdata)
#>        y1                  y2                 y3                  y4          
#>  Min.   :0.0008767   Min.   :0.001081   Min.   :0.0008767   Min.   :0.001081  
#>  1st Qu.:0.2612007   1st Qu.:0.255460   1st Qu.:0.2612007   1st Qu.:0.255460  
#>  Median :0.5003742   Median :0.486282   Median :0.5003742   Median :0.486282  
#>  Mean   :0.4997595   Mean   :0.492337   Mean   :0.4997595   Mean   :0.492337  
#>  3rd Qu.:0.7423279   3rd Qu.:0.740228   3rd Qu.:0.7423279   3rd Qu.:0.740228  
#>  Max.   :0.9981033   Max.   :0.999829   Max.   :0.9981033   Max.   :0.999829  
#>        x2           
#>  Min.   :0.0007134  
#>  1st Qu.:0.2675616  
#>  Median :0.4967031  
#>  Mean   :0.5019641  
#>  3rd Qu.:0.7467823  
#>  Max.   :0.9978814  
if (FALSE)  plot(ymat, col = "blue")  # \dontrun{}
fit1 <-  # 2 responses, e.g., (y1,y2) is the 1st
  vglm(cbind(y1, y2, y3, y4) ~ 1, fam = binormalcop,
       crit = "coef",  # Sometimes a good idea
       data = bdata, trace = TRUE)
#> Iteration 1: coefficients = -0.89711150, -0.88855148
#> Iteration 2: coefficients = -0.89700836, -0.89690581
#> Iteration 3: coefficients = -0.89700730, -0.89700626
#> Iteration 4: coefficients = -0.89700729, -0.89700728
#> Iteration 5: coefficients = -0.89700729, -0.89700729
coef(fit1, matrix = TRUE)
#>             rhobitlink(rho1) rhobitlink(rho2)
#> (Intercept)       -0.8970073       -0.8970073
Coef(fit1)
#>       rho1       rho2 
#> -0.4206682 -0.4206682 
head(fitted(fit1))
#>    y1  y2  y3  y4
#> 1 0.5 0.5 0.5 0.5
#> 2 0.5 0.5 0.5 0.5
#> 3 0.5 0.5 0.5 0.5
#> 4 0.5 0.5 0.5 0.5
#> 5 0.5 0.5 0.5 0.5
#> 6 0.5 0.5 0.5 0.5
summary(fit1)
#> 
#> Call:
#> vglm(formula = cbind(y1, y2, y3, y4) ~ 1, family = binormalcop, 
#>     data = bdata, crit = "coef", trace = TRUE)
#> 
#> Coefficients: 
#>               Estimate Std. Error z value Pr(>|z|)    
#> (Intercept):1  -0.8970     0.0583  -15.39   <2e-16 ***
#> (Intercept):2  -0.8970     0.0583  -15.39   <2e-16 ***
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> Names of linear predictors: rhobitlink(rho1), rhobitlink(rho2)
#> 
#> Log-likelihood: 192.1689 on 1998 degrees of freedom
#> 
#> Number of Fisher scoring iterations: 5 
#> 
#> No Hauck-Donner effect found in any of the estimates
#> 

# Another example; rho is a linear function of x2
bdata <- transform(bdata, rho = -0.5 + x2)
ymat <- rbinormcop(n = nn, rho = with(bdata, rho))
bdata <- transform(bdata, y5 = ymat[, 1], y6 = ymat[, 2])
fit2 <- vgam(cbind(y5, y6) ~ s(x2), data = bdata,
             binormalcop(lrho = "identitylink"), trace = TRUE)
#> VGAM  s.vam  loop  1 :  loglikelihood = 48.464904
#> VGAM  s.vam  loop  2 :  loglikelihood = 49.117563
#> VGAM  s.vam  loop  3 :  loglikelihood = 49.161243
#> VGAM  s.vam  loop  4 :  loglikelihood = 49.170907
#> VGAM  s.vam  loop  5 :  loglikelihood = 49.173403
#> VGAM  s.vam  loop  6 :  loglikelihood = 49.174068
#> VGAM  s.vam  loop  7 :  loglikelihood = 49.174246
#> VGAM  s.vam  loop  8 :  loglikelihood = 49.174294
#> VGAM  s.vam  loop  9 :  loglikelihood = 49.174307
#> VGAM  s.vam  loop  10 :  loglikelihood = 49.17431
#> VGAM  s.vam  loop  11 :  loglikelihood = 49.174311
#> VGAM  s.vam  loop  12 :  loglikelihood = 49.174311
if (FALSE) plot(fit2, lcol = "blue", scol = "orange", se = TRUE) # \dontrun{}