Test of Proportion Homogeneity using Rao and Scott's Adjustment

Tests the homogeneity of proportions between $I$ groups ($H_0$: $p_1 = p_2 = \dots = p_I$) from clustered binomial data $(n, y)$ using the adjusted $\chi^2$ statistic proposed by Rao and Scott (1992).

raoscott(formula = NULL, response = NULL, weights = NULL, 
              group = NULL, data, pooled = FALSE, deff = NULL)

Arguments

formula: An optional formula where the left-hand side is either a matrix of the form cbind(y, n-y), where the modelled probability is y/n, or a vector of proportions to be modelled (y/n). In both cases, the right-hand side must specify a single grouping variable. When the left-hand side of the formula is a vector of proportions, the argument weights must be used to supply the denominators of the proportions.
response: An optional argument, used instead of formula: either a matrix of the form cbind(y, n-y), where the modelled probability is y/n, or a vector of proportions to be modelled (y/n).
weights: An optional argument used when the left-hand side of formula or response is a vector of proportions: weights gives the denominators of the proportions.
group: An optional argument used only when response is supplied: a factor giving the grouping variable.
data: A data frame containing the response (n and y) and the grouping variable.
pooled: Logical indicating if a pooled design effect is estimated over the $I$ groups.
deff: An optional numerical vector of $I$ fixed design effects, one per group, used in place of the estimated design effects.

Details

The method is based on the concepts of design effect and effective sample size.

The design effect in each group $i$ is estimated by $deff_i = vratio_i / vbin_i$, where $vratio_i$ is the variance of the ratio estimate of the probability in group $i$ (Cochran, 1999, p. 32 and p. 66) and $vbin_i$ is the standard binomial variance. A pooled design effect (i.e., a single value shared by the $I$ groups) is estimated if argument pooled = TRUE (see Rao and Scott, 1992, Eq. 6). Fixed design effects can be specified with the argument deff.
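As a minimal sketch of how a design effect might be estimated for one group, the base R code below works from cluster-level counts. The data are hypothetical (they are not taken from any data set used on this page), and the form of the ratio-estimate variance is an assumption based on the usual expression for a ratio estimator; it is not copied from the function's source. The binomial variance $p(1 - p)/n$, on the other hand, does reproduce the vbin column shown in the Examples.

```r
# Hypothetical cluster-level data for a single group:
# n = cluster sizes, y = number of successes in each cluster.
n <- c(10, 12, 9, 11, 8)
y <- c(9, 12, 5, 10, 3)

N <- length(n)          # number of clusters in the group
p <- sum(y) / sum(n)    # ratio estimate of the group proportion

# Standard binomial variance of p, ignoring the clustering.
vbin <- p * (1 - p) / sum(n)

# Variance of the ratio estimate (assumed form): the residuals
# y_j - p * n_j capture the between-cluster variation.
r <- y - p * n
vratio <- N / (N - 1) * sum(r^2) / sum(n)^2

# Design effect: how much the clustering inflates the variance.
deff <- vratio / vbin
deff
```

A value of deff above 1 indicates more variation between clusters than the binomial model predicts, so the effective sample size is smaller than the raw number of subjects.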
The $deff_i$ are used to compute the effective sample sizes $nadj_i = n_i / deff_i$, the effective numbers of successes $yadj_i = y_i / deff_i$ in each group $i$, and the overall effective proportion $padj = \sum_{i} yadj_i / \sum_{i} nadj_i$. The test statistic is obtained by substituting these quantities into the usual $\chi^2$ statistic, yielding: $$X^2 = \sum_{i}\frac{(yadj_i - nadj_i \cdot padj)^2}{nadj_i \cdot padj \cdot (1 - padj)}$$ which is compared to a $\chi^2$ distribution with $I - 1$ degrees of freedom.
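The computation can be sketched in base R from the group totals and design effects printed for the rats data in the Examples. The helper rs_x2 is a hypothetical name introduced here for illustration; it is not part of the package. The overall effective proportion is taken as the summed effective successes divided by the summed effective sample sizes. The pooled design effect follows one reading of Rao and Scott's Eq. 6 and should be treated as an assumption, although it agrees with the pooled value printed in the Examples.

```r
# Group totals and estimated design effects copied from the printed
# output for the rats data (deff values are rounded to 3 decimals).
n    <- c(CTRL = 158,   TREAT = 145)
y    <- c(CTRL = 142,   TREAT = 112)
deff <- c(CTRL = 1.232, TREAT = 3.953)

# Hypothetical helper: adjusted chi-squared statistic for given deff.
rs_x2 <- function(n, y, deff) {
  nadj <- n / deff                  # effective sample sizes
  yadj <- y / deff                  # effective numbers of successes
  padj <- sum(yadj) / sum(nadj)     # overall effective proportion
  X2 <- sum((yadj - nadj * padj)^2 / (nadj * padj * (1 - padj)))
  df <- length(n) - 1
  c(X2 = X2, df = df, p.value = pchisq(X2, df, lower.tail = FALSE))
}

by_group <- rs_x2(n, y, deff)       # design effect per group
standard <- rs_x2(n, y, c(1, 1))    # no adjustment (deff fixed at 1)

# Pooled design effect (assumed form of Rao and Scott's Eq. 6).
p_i <- y / n
p   <- sum(y) / sum(n)
deff_pooled <- sum((1 - n / sum(n)) * p_i * (1 - p_i) * deff) /
  ((length(n) - 1) * p * (1 - p))
pooled <- rs_x2(n, y, rep(deff_pooled, length(n)))

round(rbind(by_group, pooled, standard), 4)
```

Up to the rounding of the printed design effects, the three rows correspond to the by-group, pooled, and standard tests in the Examples. When every group shares the same design effect, the adjusted statistic is simply the standard $\chi^2$ statistic divided by that design effect.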

Value

An object of formal class “drs”: see drs-class for details. The slot tab provides the proportion of successes, the variances of the proportion and the design effect for each group.

References

Cochran, W.G., 1999. Sampling techniques, 2nd ed. John Wiley & Sons, New York.
Rao, J.N.K., Scott, A.J., 1992. A simple method for the analysis of clustered binary data. Biometrics 48, 577-585.

Author

Matthieu Lesnoff matthieu.lesnoff@cirad.fr, Renaud Lancelot renaud.lancelot@cirad.fr

Examples

  data(rats)
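  # design effect estimated separately in each group
  # (the four calls below are equivalent ways of specifying the same test)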
  raoscott(cbind(y, n - y) ~ group, data = rats)
#> 
#> Test of proportion homogeneity (Rao and Scott, 1993) 
#> ---------------------------------------------------- 
#> raoscott(formula = cbind(y, n - y) ~ group, data = rats)
#> N = 32 clusters, n = 303 subjects, y = 254 cases, I = 2 groups.
#> 
#> Data and design effects:
#>   group  N   n   y      p     vbin    vratio  deff
#> 1  CTRL 16 158 142 0.8987 0.000576 0.0007099 1.232
#> 2 TREAT 16 145 112 0.7724 0.001212 0.0047922 3.953
#> 
#> Adjusted chi-squared test:
#> X2 = 4, df = 1, P(> X2) = 0.0444
  raoscott(y/n ~ group, weights = n, data = rats)
#> 
#> Test of proportion homogeneity (Rao and Scott, 1993) 
#> ---------------------------------------------------- 
#> raoscott(formula = y/n ~ group, weights = n, data = rats)
#> N = 32 clusters, n = 303 subjects, y = 254 cases, I = 2 groups.
#> 
#> Data and design effects:
#>   group + n  N   n   y      p     vbin    vratio  deff
#> 1      CTRL 16 158 142 0.8987 0.000576 0.0007099 1.232
#> 2     TREAT 16 145 112 0.7724 0.001212 0.0047922 3.953
#> 
#> Adjusted chi-squared test:
#> X2 = 4, df = 1, P(> X2) = 0.0444
  raoscott(response = cbind(y, n - y), group = group, data = rats)
#> 
#> Test of proportion homogeneity (Rao and Scott, 1993) 
#> ---------------------------------------------------- 
#> raoscott(response = cbind(y, n - y), group = group, data = rats)
#> N = 32 clusters, n = 303 subjects, y = 254 cases, I = 2 groups.
#> 
#> Data and design effects:
#>   group + NULL  N   n   y      p     vbin    vratio  deff
#> 1         CTRL 16 158 142 0.8987 0.000576 0.0007099 1.232
#> 2        TREAT 16 145 112 0.7724 0.001212 0.0047922 3.953
#> 
#> Adjusted chi-squared test:
#> X2 = 4, df = 1, P(> X2) = 0.0444
  raoscott(response = y/n, weights = n, group = group, data = rats)
#> 
#> Test of proportion homogeneity (Rao and Scott, 1993) 
#> ---------------------------------------------------- 
#> raoscott(response = y/n, weights = n, group = group, data = rats)
#> N = 32 clusters, n = 303 subjects, y = 254 cases, I = 2 groups.
#> 
#> Data and design effects:
#>   group + n  N   n   y      p     vbin    vratio  deff
#> 1      CTRL 16 158 142 0.8987 0.000576 0.0007099 1.232
#> 2     TREAT 16 145 112 0.7724 0.001212 0.0047922 3.953
#> 
#> Adjusted chi-squared test:
#> X2 = 4, df = 1, P(> X2) = 0.0444
  # pooled design effect: a single deff shared by all groups
  raoscott(cbind(y, n - y) ~ group, data = rats, pooled = TRUE)
#> 
#> Test of proportion homogeneity (Rao and Scott, 1993) 
#> ---------------------------------------------------- 
#> raoscott(formula = cbind(y, n - y) ~ group, data = rats, pooled = TRUE)
#> N = 32 clusters, n = 303 subjects, y = 254 cases, I = 2 groups.
#> 
#> Data and design effects:
#>   group  N   n   y      p     vbin    vratio  deff
#> 1  CTRL 16 158 142 0.8987 0.000576 0.0007099 3.069
#> 2 TREAT 16 145 112 0.7724 0.001212 0.0047922 3.069
#> 
#> Adjusted chi-squared test:
#> X2 = 2.9, df = 1, P(> X2) = 0.0886
  # standard chi-squared test: design effects fixed at 1 (no clustering adjustment)
  raoscott(cbind(y, n - y) ~ group, data = rats, deff = c(1, 1))
#> 
#> Test of proportion homogeneity (Rao and Scott, 1993) 
#> ---------------------------------------------------- 
#> raoscott(formula = cbind(y, n - y) ~ group, data = rats, deff = c(1, 
#>     1))
#> N = 32 clusters, n = 303 subjects, y = 254 cases, I = 2 groups.
#> 
#> Data and design effects:
#>   group  N   n   y      p     vbin    vratio deff
#> 1  CTRL 16 158 142 0.8987 0.000576 0.0007099    1
#> 2 TREAT 16 145 112 0.7724 0.001212 0.0047922    1
#> 
#> Adjusted chi-squared test:
#> X2 = 8.9, df = 1, P(> X2) = 0.0029
  data(antibio)
  raoscott(cbind(y, n - y) ~ treatment, data = antibio)
#> 
#> Test of proportion homogeneity (Rao and Scott, 1993) 
#> ---------------------------------------------------- 
#> raoscott(formula = cbind(y, n - y) ~ treatment, data = antibio)
#> N = 24 clusters, n = 542 subjects, y = 67 cases, I = 4 groups.
#> 
#> Data and design effects:
#>   treatment N   n  y       p      vbin    vratio  deff
#> 1         1 7 144 18 0.12500 0.0007595 0.0028676 3.775
#> 2         2 6 129  8 0.06202 0.0004509 0.0007568 1.678
#> 3         3 5 130 24 0.18462 0.0011579 0.0014880 1.285
#> 4         4 6 139 17 0.12230 0.0007723 0.0020771 2.690
#> 
#> Adjusted chi-squared test:
#> X2 = 5.9, df = 3, P(> X2) = 0.1174