Tidy summarizes information about the components of a model. A model component might be a single term in a regression, a single hypothesis, a cluster, or a class. Exactly what tidy considers to be a model component varies across models but is usually self-evident. If a model has several distinct types of components, you will need to specify which components to return.
# S3 method for class 'emmGrid'
tidy(x, conf.int = FALSE, conf.level = 0.95, ...)
x: An emmGrid object.
conf.int: Logical indicating whether or not to include a confidence interval in the tidied output. Defaults to FALSE.
conf.level: The confidence level to use for the confidence interval if conf.int = TRUE. Must be strictly greater than 0 and less than 1. Defaults to 0.95, which corresponds to a 95 percent confidence interval.
...: Additional arguments passed to emmeans::summary.emmGrid() or lsmeans::summary.ref.grid(). Cautionary note: misspecified arguments may be silently ignored!
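As a minimal sketch of how conf.int and conf.level interact, the following fits the oranges model used in the examples further down and requests 90 percent intervals. It assumes broom and emmeans are installed; the object names fit, emm, td, td90, and td95 are illustrative only.

```r
library(broom)
library(emmeans)

# Same model as in the examples section of this page
fit <- lm(sales1 ~ price1 + price2 + day + store, data = oranges)
emm <- emmeans(fit, "day")

# Without conf.int, no interval columns are returned
td <- tidy(emm)

# conf.int = TRUE adds conf.low and conf.high at the requested level
td90 <- tidy(emm, conf.int = TRUE, conf.level = 0.90)
td95 <- tidy(emm, conf.int = TRUE) # default conf.level = 0.95

names(td90)
```

Lowering conf.level narrows the interval around each estimate, so the 90 percent bounds should sit inside the 95 percent bounds.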
Returns a data frame with one row for each estimated marginal mean, that is, one row per combination of levels in the grid, and one column for each grid variable. When the input is a contrast, each row contains one estimated contrast.
A large number of arguments can be passed on to emmeans::summary.emmGrid() or lsmeans::summary.ref.grid().
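For instance, a multiplicity adjustment can be requested through the dots. The sketch below assumes that adjust is forwarded unchanged to emmeans::summary.emmGrid(), as the argument description above indicates; the object names are illustrative only.

```r
library(broom)
library(emmeans)

fit <- lm(sales1 ~ price1 + price2 + day + store, data = oranges)
pairs_day <- contrast(emmeans(fit, "day"), method = "pairwise")

# Default adjustment for pairwise contrasts (Tukey in emmeans)
td_default <- tidy(pairs_day)

# adjust is passed through ... to summary.emmGrid()
td_bonf <- tidy(pairs_day, adjust = "bonferroni")

# Estimates are unchanged; only the p-values differ
all.equal(td_default$estimate, td_bonf$estimate)
```

Because misspecified arguments may be silently ignored, it is worth comparing the adjusted and unadjusted output once to confirm the argument took effect.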
See also: tidy(), emmeans::ref_grid(), emmeans::emmeans(), emmeans::contrast()
Other emmeans tidiers: tidy.lsmobj(), tidy.ref.grid(), tidy.summary_emm()
A tibble::tibble() with columns:
conf.high: Upper bound on the confidence interval for the estimate.
conf.low: Lower bound on the confidence interval for the estimate.
df: Degrees of freedom used by this term in the model.
p.value: The two-sided p-value associated with the observed statistic.
std.error: The standard error of the estimate.
estimate: Estimated marginal mean.
statistic: T-ratio statistic.
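The relationship between these columns can be checked directly. The sketch below assumes the default test of each marginal mean against zero with no multiplicity adjustment, in which case statistic is estimate divided by std.error, and p.value is the two-sided tail probability of a t distribution with df degrees of freedom. The object names are illustrative only.

```r
library(broom)
library(emmeans)

fit <- lm(sales1 ~ price1 + price2 + day + store, data = oranges)
td <- tidy(emmeans(fit, "day"))

# t-ratio: estimate divided by its standard error
t_ratio <- td$estimate / td$std.error

# Two-sided p-value from the t distribution with df degrees of freedom
p_two_sided <- 2 * pt(-abs(t_ratio), df = td$df)

all.equal(td$statistic, t_ratio)
all.equal(td$p.value, p_two_sided)
```

For contrasts, the reported column is adj.p.value instead, and it generally will not match this unadjusted calculation.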
# load libraries for models and data
library(broom)
library(emmeans)
#> Welcome to emmeans.
#> Caution: You lose important information if you filter this package's results.
#> See '? untidy'
# linear model for sales of oranges per day
oranges_lm1 <- lm(sales1 ~ price1 + price2 + day + store, data = oranges)
# reference grid; see vignette("basics", package = "emmeans")
oranges_rg1 <- ref_grid(oranges_lm1)
td <- tidy(oranges_rg1)
#> Warning: Negative variance estimate obtained!
td
#> # A tibble: 36 × 9
#> price1 price2 day store estimate std.error df statistic p.value
#> <dbl> <dbl> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 51.2 48.6 1 1 2.92 2.72 23 1.07 0.294
#> 2 51.2 48.6 2 1 3.85 2.70 23 1.42 0.168
#> 3 51.2 48.6 3 1 11.0 2.53 23 4.35 0.000237
#> 4 51.2 48.6 4 1 6.10 2.65 23 2.30 0.0309
#> 5 51.2 48.6 5 1 12.8 3.51 23 3.64 0.00135
#> 6 51.2 48.6 6 1 8.75 3.59 23 2.44 0.0229
#> 7 51.2 48.6 1 2 4.96 3.12 23 1.59 0.125
#> 8 51.2 48.6 2 2 5.89 2.76 23 2.13 0.0438
#> 9 51.2 48.6 3 2 13.1 2.74 23 4.77 0.0000823
#> 10 51.2 48.6 4 2 8.14 2.74 23 2.97 0.00692
#> # ℹ 26 more rows
# marginal averages
marginal <- emmeans(oranges_rg1, "day")
tidy(marginal)
#> # A tibble: 6 × 6
#> day estimate std.error df statistic p.value
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1 5.56 1.88 23 2.95 0.00715
#> 2 2 6.49 1.59 23 4.08 0.000460
#> 3 3 13.7 1.58 23 8.64 0.0000000111
#> 4 4 8.74 1.14 23 7.64 0.0000000931
#> 5 5 15.4 3.16 23 4.89 0.0000612
#> 6 6 11.4 2.94 23 3.87 0.000772
# contrasts
tidy(contrast(marginal))
#> # A tibble: 6 × 8
#> term contrast null.value estimate std.error df statistic adj.p.value
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 day day1 effect 0 -4.65 1.31 23 -3.54 0.0104
#> 2 day day2 effect 0 -3.72 1.79 23 -2.08 0.0982
#> 3 day day3 effect 0 3.45 1.79 23 1.92 0.101
#> 4 day day4 effect 0 -1.47 1.97 23 -0.749 0.554
#> 5 day day5 effect 0 5.22 2.01 23 2.60 0.0475
#> 6 day day6 effect 0 1.18 2.22 23 0.530 0.601
tidy(contrast(marginal, method = "pairwise"))
#> # A tibble: 15 × 8
#> term contrast null.value estimate std.error df statistic adj.p.value
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 day day1 - day2 0 -0.930 2.47 23 -0.377 0.999
#> 2 day day1 - day3 0 -8.10 2.47 23 -3.29 0.0337
#> 3 day day1 - day4 0 -3.18 2.51 23 -1.27 0.799
#> 4 day day1 - day5 0 -9.88 2.56 23 -3.86 0.00913
#> 5 day day1 - day6 0 -5.83 2.52 23 -2.31 0.229
#> 6 day day2 - day3 0 -7.17 2.48 23 -2.89 0.0777
#> 7 day day2 - day4 0 -2.25 2.44 23 -0.920 0.937
#> 8 day day2 - day5 0 -8.95 3.08 23 -2.90 0.0756
#> 9 day day2 - day6 0 -4.90 3.54 23 -1.38 0.737
#> 10 day day3 - day4 0 4.92 2.49 23 1.98 0.385
#> 11 day day3 - day5 0 -1.78 3.08 23 -0.578 0.992
#> 12 day day3 - day6 0 2.27 3.52 23 0.644 0.986
#> 13 day day4 - day5 0 -6.70 3.62 23 -1.85 0.455
#> 14 day day4 - day6 0 -2.65 3.57 23 -0.744 0.974
#> 15 day day5 - day6 0 4.05 2.56 23 1.58 0.617
# plot confidence intervals
library(ggplot2)
ggplot(tidy(marginal, conf.int = TRUE), aes(day, estimate)) +
  geom_point() +
  geom_errorbar(aes(ymin = conf.low, ymax = conf.high))
# by multiple prices
by_price <- emmeans(oranges_lm1, "day",
  by = "price2",
  at = list(
    price1 = 50, price2 = c(40, 60, 80),
    day = c("2", "3", "4")
  )
)
by_price
#> price2 = 40:
#> day emmean SE df lower.CL upper.CL
#> 2 6.24 1.74 23 2.63 9.84
#> 3 13.41 1.96 23 9.35 17.46
#> 4 8.48 1.31 23 5.78 11.19
#>
#> price2 = 60:
#> day emmean SE df lower.CL upper.CL
#> 2 9.21 1.98 23 5.11 13.32
#> 3 16.38 1.73 23 12.80 19.97
#> 4 11.46 1.73 23 7.88 15.04
#>
#> price2 = 80:
#> day emmean SE df lower.CL upper.CL
#> 2 12.19 3.58 23 4.79 19.59
#> 3 19.36 3.18 23 12.78 25.94
#> 4 14.44 3.50 23 7.20 21.67
#>
#> Results are averaged over the levels of: store
#> Confidence level used: 0.95
tidy(by_price)
#> # A tibble: 9 × 7
#> day price2 estimate std.error df statistic p.value
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 2 40 6.24 1.74 23 3.58 0.00158
#> 2 3 40 13.4 1.96 23 6.83 0.000000574
#> 3 4 40 8.48 1.31 23 6.48 0.00000129
#> 4 2 60 9.21 1.98 23 4.64 0.000113
#> 5 3 60 16.4 1.73 23 9.45 0.00000000220
#> 6 4 60 11.5 1.73 23 6.63 0.000000923
#> 7 2 80 12.2 3.58 23 3.41 0.00242
#> 8 3 80 19.4 3.18 23 6.09 0.00000330
#> 9 4 80 14.4 3.50 23 4.13 0.000408
ggplot(tidy(by_price, conf.int = TRUE), aes(price2, estimate, color = day)) +
  geom_line() +
  geom_errorbar(aes(ymin = conf.low, ymax = conf.high))
# joint_tests
tidy(joint_tests(oranges_lm1))
#> # A tibble: 4 × 5
#> term num.df den.df statistic p.value
#> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 price1 1 23 30.3 0.0000134
#> 2 price2 1 23 2.23 0.149
#> 3 day 5 23 5.10 0.00273
#> 4 store 5 23 2.52 0.0583