Adds p-values to tables created by tbl_summary()
by comparing values across groups.
# S3 method for class 'tbl_summary'
add_p(
  x,
  test = NULL,
  pvalue_fun = label_style_pvalue(digits = 1),
  group = NULL,
  include = everything(),
  test.args = NULL,
  adj.vars = NULL,
  ...
)
x (tbl_summary)
Table created with tbl_summary().
test (formula-list-selector)
Specifies the statistical tests to perform for each variable, e.g. list(all_continuous() ~ "t.test", all_categorical() ~ "fisher.test").
See below for details on default tests and ?tests for details on available tests and creating custom tests.
pvalue_fun (function)
Function to round and format p-values. Default is label_style_pvalue().
The function must take a numeric vector as input and return a character vector of rounded/formatted p-values (e.g. pvalue_fun = label_style_pvalue(digits = 2)).
group (tidy-select)
Variable name of an ID or grouping variable. The column can be used to calculate p-values with correlated data.
Default is NULL. See ?tests for methods that utilize the group argument.
include (tidy-select)
Variables to include in output. Default is everything().
test.args (formula-list-selector)
Additional arguments to pass to tests that accept arguments.
For example, to add an argument for all t-tests, use test.args = all_tests("t.test") ~ list(var.equal = TRUE).
adj.vars (tidy-select)
Variables to include in adjusted calculations (e.g. in ANCOVA models).
Default is NULL.
...
These dots are for future extensions and must be empty.
A gtsummary table of class "tbl_summary".
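As an illustration of the pvalue_fun and include arguments, the following is a minimal sketch rather than a recommended style. The helper fmt_p is hypothetical (it is not part of gtsummary); it only follows the stated contract of taking a numeric vector and returning formatted strings. The sketch assumes the gtsummary package and its bundled trial data are available.

```r
library(gtsummary)

# Hypothetical formatter (not part of gtsummary): three decimals,
# values below 0.001 shown as "<0.001", missing values kept as NA
fmt_p <- function(x) {
  ifelse(
    is.na(x),
    NA_character_,
    ifelse(x < 0.001, "<0.001", formatC(x, digits = 3, format = "f"))
  )
}

tbl <- trial |>
  tbl_summary(by = trt, include = c(age, grade, marker)) |>
  add_p(
    # format p-values with the custom function
    pvalue_fun = fmt_p,
    # request p-values for age and marker only
    include = c(age, marker)
  )
```

Returning NA for missing inputs matters here because the formatter is applied to the whole p-value column, including rows that carry no p-value.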
See the ?tests help file for details on available tests and creating custom tests. The ?tests help file also includes pseudo-code for each test to make clear precisely how each calculation is performed.
The default test used in add_p() primarily depends on these factors:
- whether the variable is categorical/dichotomous vs continuous
- number of levels in the tbl_summary(by) variable
- whether the add_p(group) argument is specified
- whether the add_p(adj.vars) argument is specified
When neither add_p(group) nor add_p(adj.vars) is specified, the default tests are:
- "wilcox.test" when the by variable has two levels and the variable is continuous.
- "kruskal.test" when the by variable has more than two levels and the variable is continuous.
- "chisq.test.no.correct" for categorical variables with all expected cell counts >=5, and "fisher.test" for categorical variables with any expected cell count <5.
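To see the default selection with a by variable that has more than two levels, here is a short sketch. It assumes gtsummary is installed and uses the bundled trial data, where grade has three levels; the choice of grade, age, and response is for illustration only.

```r
library(gtsummary)

# `grade` has three levels (I, II, III) in the `trial` data, so with no
# `test`, `group`, or `adj.vars` specified the defaults described above apply
tbl <- trial |>
  tbl_summary(by = grade, include = c(age, response)) |>
  add_p()

tbl
```

The table footnote reports which test was used for each variable, which is a quick way to confirm the defaults that were chosen.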
# Example 1 ----------------------------------
trial |>
  tbl_summary(by = trt, include = c(age, grade)) |>
  add_p()
Characteristic | Drug A, N = 98¹ | Drug B, N = 102¹ | p-value²
¹ Median (Q1, Q3); n (%)
² Wilcoxon rank sum test; Pearson’s Chi-squared test
# Example 2 ----------------------------------
trial |>
  select(trt, age, marker) |>
  tbl_summary(by = trt, missing = "no") |>
  add_p(
    # perform t-test for all variables
    test = everything() ~ "t.test",
    # assume equal variance in the t-test
    test.args = all_tests("t.test") ~ list(var.equal = TRUE)
  )
Characteristic | Drug A, N = 98¹ | Drug B, N = 102¹ | p-value²
¹ Median (Q1, Q3)
² Two Sample t-test