Calculates power or sample size (only one can be NULL at a time) to test a single coefficient in logistic regression. power.z.logistic() and power.z.logreg() are the same function, as are pwrss.z.logistic() and pwrss.z.logreg().
The distribution of the predictor variable can be one of c("normal", "poisson", "uniform", "exponential", "binomial", "bernoulli", "lognormal") for the Demidenko (2007) procedure, but only one of c("normal", "binomial", "bernoulli") for the Hsieh et al. (1998) procedure. The default parameters for these distributions are
distribution = list(dist = "normal", mean = 0, sd = 1)
distribution = list(dist = "poisson", lambda = 1)
distribution = list(dist = "uniform", min = 0, max = 1)
distribution = list(dist = "exponential", rate = 1)
distribution = list(dist = "binomial", size = 1, prob = 0.50)
distribution = list(dist = "bernoulli", prob = 0.50)
distribution = list(dist = "lognormal", meanlog = 0, sdlog = 1)
Parameters defined in list() form can be modified, but the element names must be kept the same. To use the default parameters, it is sufficient to supply only the distribution's name (e.g. dist = "normal").
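As an illustration of modifying parameters while keeping element names intact, the sketch below starts from the default list for a normal predictor and overrides two elements with base R's modifyList(). The object name default.normal is hypothetical and used only for this example.

```r
# Default parameters for a normal predictor, as listed above
# (default.normal is a hypothetical name used for illustration)
default.normal <- list(dist = "normal", mean = 0, sd = 1)

# Override selected parameters; element names stay unchanged
pred.dist <- modifyList(default.normal, list(mean = 10, sd = 2))

# Inspect the resulting specification
str(pred.dist)
```

The resulting list has the same form as the pred.dist object passed to the distribution argument in the examples.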
NOTE: pwrss.z.logistic() and its alias pwrss.z.logreg() are deprecated. However, they will remain available as wrappers for power.z.logistic().
Formulas are validated using G*Power and tables in PASS documentation.
power.z.logistic(prob = NULL, base.prob = NULL,
                 odds.ratio = (prob/(1-prob))/(base.prob/(1-base.prob)),
                 beta0 = log(base.prob/(1-base.prob)), beta1 = log(odds.ratio),
                 n = NULL, power = NULL, r.squared.predictor = 0,
                 alpha = 0.05, alternative = c("two.sided", "one.sided"),
                 method = c("demidenko(vc)", "demidenko", "hsieh"),
                 distribution = "normal", ceiling = TRUE,
                 verbose = TRUE, pretty = FALSE)
base.prob: base probability under the null hypothesis (probability that an event occurs without the influence of the predictor, or when the value of the predictor is zero).
prob: probability under the alternative hypothesis (probability that an event occurs when the value of the predictor is increased from 0 to 1). Warning: this is the base probability plus the incremental increase, not the increase alone.
beta0: regression coefficient (intercept) defined as beta0 = log(base.prob/(1-base.prob))
beta1: regression coefficient for the predictor X defined as beta1 = log((prob/(1-prob))/(base.prob/(1-base.prob)))
odds.ratio: odds ratio defined as odds.ratio = exp(beta1) = (prob/(1-prob))/(base.prob/(1-base.prob))
n: integer; sample size.
power: statistical power, defined as the probability of correctly rejecting a false null hypothesis, denoted as \(1 - \beta\).
r.squared.predictor: proportion of variance in the predictor accounted for by other covariates. This is not a pseudo R-squared. To compute it, regress the predictor on the covariates and extract the adjusted R-squared from that model.
alpha: type 1 error rate, defined as the probability of incorrectly rejecting a true null hypothesis, denoted as \(\alpha\).
alternative: character; direction or type of the hypothesis test: "two.sided" or "one.sided".
method: character; analytic method. "demidenko(vc)" stands for the Demidenko (2007) procedure with variance correction; "demidenko" stands for the Demidenko (2007) procedure without variance correction; "hsieh" stands for the Hsieh et al. (1998) procedure. The "demidenko" and "hsieh" methods produce similar results, but "demidenko(vc)" is more precise.
distribution: character; distribution family. Can be one of c("normal", "poisson", "uniform", "exponential", "binomial", "bernoulli", "lognormal") for the Demidenko (2007) procedure but only one of c("normal", "binomial", "bernoulli") for the Hsieh et al. (1998) procedure.
ceiling: logical; whether sample size should be rounded up. TRUE by default.
verbose: logical; whether the output should be printed on the console. TRUE by default.
pretty: logical; whether the output should show Unicode characters (if encoding allows for it). FALSE by default.
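The three ways of specifying the effect (probabilities, odds ratio, regression coefficients) are linked by the definitions given for the arguments. The base R sketch below applies those definitions to base.prob = 0.15 and prob = 0.20; the name prob.back is hypothetical and only checks the round trip.

```r
# Probability specification
base.prob <- 0.15
prob <- 0.20

# Odds ratio from the two probabilities
odds.ratio <- (prob / (1 - prob)) / (base.prob / (1 - base.prob))

# Regression coefficients on the log-odds scale
beta0 <- log(base.prob / (1 - base.prob))  # same as qlogis(base.prob)
beta1 <- log(odds.ratio)

# Round trip: probability of the event when the predictor equals 1
prob.back <- plogis(beta0 + beta1)

round(c(odds.ratio = odds.ratio, beta0 = beta0, beta1 = beta1,
        prob.back = prob.back), 6)
```

These are the values used in the odds ratio specification (odds.ratio = 1.416667) and the regression coefficient specification (beta0 = -1.734601, beta1 = 0.3483067) in the examples.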
list of parameters used in calculation.
type of the statistical test (Z-Test).
mean of the alternative distribution.
standard deviation of the alternative distribution.
mean of the null distribution.
standard deviation of the null distribution.
critical value(s).
statistical power \((1-\beta)\).
sample size.
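To show how the returned quantities relate to one another, the sketch below computes two-sided Z-test power from the means and standard deviations of the null and alternative distributions. All numeric values and object names here are hypothetical, and the code is an illustration of a generic Z-test power calculation, not the package's internal code.

```r
# Hypothetical inputs (not taken from package output)
alpha <- 0.05
null.mean <- 0
null.sd <- 1
alt.mean <- 2.8
alt.sd <- 1

# Two-sided critical values under the null distribution
z.crit <- qnorm(c(alpha / 2, 1 - alpha / 2), mean = null.mean, sd = null.sd)

# Power: probability that the statistic falls in either rejection region
# when it follows the alternative distribution
power <- pnorm(z.crit[1], mean = alt.mean, sd = alt.sd) +
  pnorm(z.crit[2], mean = alt.mean, sd = alt.sd, lower.tail = FALSE)

power
```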
Demidenko, E. (2007). Sample size determination for logistic regression revisited. Statistics in Medicine, 26(18), 3385-3397. doi:10.1002/sim.2771
Hsieh, F. Y., Bloch, D. A., & Larsen, M. D. (1998). A simple method of sample size calculation for linear and logistic regression. Statistics in Medicine, 17(14), 1623-1634.
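For orientation, the simple formula commonly attributed to Hsieh et al. (1998) for a continuous predictor can be sketched in base R. This is a minimal sketch under stated assumptions: the predictor is standard normal (so the event rate at its mean equals base.prob and beta1 is the log odds ratio per standard deviation), the test is two-sided, and the sample size is inflated by 1 / (1 - R-squared) for other covariates. The function name hsieh.n.normal is hypothetical, and the result is not guaranteed to match method = "hsieh" in the package.

```r
# Hypothetical helper; a sketch of the simple Hsieh et al. (1998) formula
# for a standard normal predictor, not the package's implementation
hsieh.n.normal <- function(base.prob, beta1, alpha = 0.05, power = 0.80,
                           r.squared = 0) {
  z.alpha <- qnorm(1 - alpha / 2)  # two-sided critical value
  z.beta <- qnorm(power)           # quantile corresponding to target power
  n <- (z.alpha + z.beta)^2 / (base.prob * (1 - base.prob) * beta1^2)
  # Assumed variance inflation for covariates correlated with the predictor
  ceiling(n / (1 - r.squared))
}

n.plain <- hsieh.n.normal(base.prob = 0.15, beta1 = 0.3483067)
n.adjusted <- hsieh.n.normal(base.prob = 0.15, beta1 = 0.3483067,
                             r.squared = 0.25)
c(n.plain = n.plain, n.adjusted = n.adjusted)
```

With these inputs the unadjusted value comes out slightly above 500, in the same range as the 511 reported by the default "demidenko(vc)" method in the examples.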
###########################################
# predictor X follows normal distribution #
###########################################
## probability specification
power.z.logistic(base.prob = 0.15, prob = 0.20,
                 alpha = 0.05, power = 0.80,
                 dist = "normal")
#> +--------------------------------------------------+
#> | SAMPLE SIZE CALCULATION |
#> +--------------------------------------------------+
#>
#> Logistic Regression Coefficient (Wald's Z-Test)
#>
#> Method : Demidenko (Variance Corrected)
#> Predictor Dist. : Normal
#>
#> ---------------------------------------------------
#> Hypotheses
#> ---------------------------------------------------
#> H0 (Null Claim) : Odds Ratio = 1
#> H1 (Alt. Claim) : Odds Ratio != 1
#>
#> ---------------------------------------------------
#> Results
#> ---------------------------------------------------
#> Sample Size = 511 <<
#> Type 1 Error (alpha) = 0.050
#> Type 2 Error (beta) = 0.199
#> Statistical Power = 0.801
#>
## odds ratio specification
power.z.logistic(base.prob = 0.15, odds.ratio = 1.416667,
                 alpha = 0.05, power = 0.80,
                 dist = "normal")
#> +--------------------------------------------------+
#> | SAMPLE SIZE CALCULATION |
#> +--------------------------------------------------+
#>
#> Logistic Regression Coefficient (Wald's Z-Test)
#>
#> Method : Demidenko (Variance Corrected)
#> Predictor Dist. : Normal
#>
#> ---------------------------------------------------
#> Hypotheses
#> ---------------------------------------------------
#> H0 (Null Claim) : Odds Ratio = 1
#> H1 (Alt. Claim) : Odds Ratio != 1
#>
#> ---------------------------------------------------
#> Results
#> ---------------------------------------------------
#> Sample Size = 511 <<
#> Type 1 Error (alpha) = 0.050
#> Type 2 Error (beta) = 0.199
#> Statistical Power = 0.801
#>
## regression coefficient specification
power.z.logistic(beta0 = -1.734601, beta1 = 0.3483067,
                 alpha = 0.05, power = 0.80,
                 dist = "normal")
#> +--------------------------------------------------+
#> | SAMPLE SIZE CALCULATION |
#> +--------------------------------------------------+
#>
#> Logistic Regression Coefficient (Wald's Z-Test)
#>
#> Method : Demidenko (Variance Corrected)
#> Predictor Dist. : Normal
#>
#> ---------------------------------------------------
#> Hypotheses
#> ---------------------------------------------------
#> H0 (Null Claim) : Odds Ratio = 1
#> H1 (Alt. Claim) : Odds Ratio != 1
#>
#> ---------------------------------------------------
#> Results
#> ---------------------------------------------------
#> Sample Size = 511 <<
#> Type 1 Error (alpha) = 0.050
#> Type 2 Error (beta) = 0.199
#> Statistical Power = 0.801
#>
## change parameters associated with predictor X
pred.dist <- list(dist = "normal", mean = 10, sd = 2)
power.z.logistic(base.prob = 0.15, beta1 = 0.3483067,
                 alpha = 0.05, power = 0.80,
                 dist = pred.dist)
#> +--------------------------------------------------+
#> | SAMPLE SIZE CALCULATION |
#> +--------------------------------------------------+
#>
#> Logistic Regression Coefficient (Wald's Z-Test)
#>
#> Method : Demidenko (Variance Corrected)
#> Predictor Dist. : Normal
#>
#> ---------------------------------------------------
#> Hypotheses
#> ---------------------------------------------------
#> H0 (Null Claim) : Odds Ratio = 1
#> H1 (Alt. Claim) : Odds Ratio != 1
#>
#> ---------------------------------------------------
#> Results
#> ---------------------------------------------------
#> Sample Size = 134 <<
#> Type 1 Error (alpha) = 0.050
#> Type 2 Error (beta) = 0.199
#> Statistical Power = 0.801
#>
##############################################
# predictor X follows Bernoulli distribution #
# (such as treatment/control groups) #
##############################################
## odds ratio specification
power.z.logistic(base.prob = 0.15, odds.ratio = 1.416667,
                 alpha = 0.05, power = 0.80,
                 dist = "bernoulli")
#> +--------------------------------------------------+
#> | SAMPLE SIZE CALCULATION |
#> +--------------------------------------------------+
#>
#> Logistic Regression Coefficient (Wald's Z-Test)
#>
#> Method : Demidenko (Variance Corrected)
#> Predictor Dist. : Bernoulli
#>
#> ---------------------------------------------------
#> Hypotheses
#> ---------------------------------------------------
#> H0 (Null Claim) : Odds Ratio = 1
#> H1 (Alt. Claim) : Odds Ratio != 1
#>
#> ---------------------------------------------------
#> Results
#> ---------------------------------------------------
#> Sample Size = 1816 <<
#> Type 1 Error (alpha) = 0.050
#> Type 2 Error (beta) = 0.200
#> Statistical Power = 0.8
#>
## change parameters associated with predictor X
pred.dist <- list(dist = "bernoulli", prob = 0.30)
power.z.logistic(base.prob = 0.15, odds.ratio = 1.416667,
                 alpha = 0.05, power = 0.80,
                 dist = pred.dist)
#> +--------------------------------------------------+
#> | SAMPLE SIZE CALCULATION |
#> +--------------------------------------------------+
#>
#> Logistic Regression Coefficient (Wald's Z-Test)
#>
#> Method : Demidenko (Variance Corrected)
#> Predictor Dist. : Bernoulli
#>
#> ---------------------------------------------------
#> Hypotheses
#> ---------------------------------------------------
#> H0 (Null Claim) : Odds Ratio = 1
#> H1 (Alt. Claim) : Odds Ratio != 1
#>
#> ---------------------------------------------------
#> Results
#> ---------------------------------------------------
#> Sample Size = 2114 <<
#> Type 1 Error (alpha) = 0.050
#> Type 2 Error (beta) = 0.200
#> Statistical Power = 0.8
#>
####################################
# predictor X is an ordinal factor #
####################################
## generating an ordinal predictor
x.ord <- sample(
  x = c(1, 2, 3, 4), # levels
  size = 1e5, # sample size large enough to get stable estimates
  prob = c(0.25, 0.25, 0.25, 0.25), # category probabilities
  replace = TRUE
)
## dummy coding the ordinal predictor
x.ord <- factor(x.ord, ordered = TRUE)
contrasts(x.ord) <- contr.treatment(4, base = 4)
x.dummy <- model.matrix( ~ x.ord)[,-1]
x.data <- as.data.frame(x.dummy)
## fit linear regression to get adjusted r-squared
x.fit <- lm(x.ord1 ~ x.ord2 + x.ord3, data = x.data)
## extract parameters
bern.prob <- mean(x.data$x.ord1)
r.squared.pred <- summary(x.fit)$adj.r.squared
## change parameters associated with predictor X
pred.dist <- list(dist = "bernoulli", prob = bern.prob)
power.z.logistic(base.prob = 0.15, odds.ratio = 1.416667,
                 alpha = 0.05, power = 0.80,
                 r.squared.pred = r.squared.pred,
                 dist = pred.dist)
#> +--------------------------------------------------+
#> | SAMPLE SIZE CALCULATION |
#> +--------------------------------------------------+
#>
#> Logistic Regression Coefficient (Wald's Z-Test)
#>
#> Method : Demidenko (Variance Corrected)
#> Predictor Dist. : Bernoulli
#>
#> ---------------------------------------------------
#> Hypotheses
#> ---------------------------------------------------
#> H0 (Null Claim) : Odds Ratio = 1
#> H1 (Alt. Claim) : Odds Ratio != 1
#>
#> ---------------------------------------------------
#> Results
#> ---------------------------------------------------
#> Sample Size = 3549 <<
#> Type 1 Error (alpha) = 0.050
#> Type 2 Error (beta) = 0.200
#> Statistical Power = 0.8
#>
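The same approach to r.squared.predictor applies when the predictor is continuous and correlated with other covariates. The sketch below simulates such a case in base R; the covariates z1 and z2, the coefficients, and the simulated sample size are all hypothetical and serve only to illustrate regressing the predictor on the covariates and extracting the adjusted R-squared.

```r
## simulate a continuous predictor correlated with two covariates
## (all names and coefficients are hypothetical)
set.seed(123)
n.sim <- 1e4
z1 <- rnorm(n.sim)
z2 <- rnorm(n.sim)
x <- 0.5 * z1 + 0.3 * z2 + rnorm(n.sim)

## regress the predictor on the covariates
x.fit <- lm(x ~ z1 + z2)

## adjusted r-squared, the quantity described for r.squared.predictor
r.squared.pred <- summary(x.fit)$adj.r.squared
r.squared.pred
```

The resulting value can then be supplied as r.squared.predictor = r.squared.pred in the call to power.z.logistic().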