Group and ungroup — group_by.dtplyr

These are methods for dplyr's dplyr::group_by() and dplyr::ungroup() generics. Grouping is translated to the either keyby and by argument of [.data.table depending on the value of the arrange argument.

# S3 method for class 'dtplyr_step'
group_by(.data, ..., .add = FALSE, arrange = TRUE)

# S3 method for class 'dtplyr_step'
ungroup(x, ...)

Arguments

.data

A lazy_dt()

...

In group_by(), variables or computations to group by. Computations are always done on the ungrouped data frame. To perform computations on the grouped data, you need to use a separate mutate() step before the group_by(). Computations are not allowed in nest_by(). In ungroup(), variables to remove from the grouping.

.add, add

When FALSE, the default, group_by() will override existing groups. To add to the existing groups, use .add = TRUE.

This argument was previously called add, but that prevented creating a new grouping variable called add, and conflicts with our naming conventions.

arrange

If TRUE, will automatically arrange the output of subsequent grouped operations by group. If FALSE, output order will be left unchanged. In the generated data.table code this switches between using the keyby (TRUE) and by (FALSE) arguments.

x

A tbl()

Examples

library(dplyr, warn.conflicts = FALSE)
dt <- lazy_dt(mtcars)

# group_by() is usually translated to `keyby` so that the groups
# are ordered in the output
dt %>%
 group_by(cyl) %>%
 summarise(mpg = mean(mpg))
#> Source: local data table [3 x 2]
#> Call:   `_DT15`[, .(mpg = mean(mpg)), keyby = .(cyl)]
#> 
#>     cyl   mpg
#>   <dbl> <dbl>
#> 1     4  26.7
#> 2     6  19.7
#> 3     8  15.1
#> 
#> # Use as.data.table()/as.data.frame()/as_tibble() to access results

# use `arrange = FALSE` to instead use `by` so the original order
# or groups is preserved
dt %>%
 group_by(cyl, arrange = FALSE) %>%
 summarise(mpg = mean(mpg))
#> Source: local data table [3 x 2]
#> Call:   `_DT15`[, .(mpg = mean(mpg)), by = .(cyl)]
#> 
#>     cyl   mpg
#>   <dbl> <dbl>
#> 1     6  19.7
#> 2     4  26.7
#> 3     8  15.1
#> 
#> # Use as.data.table()/as.data.frame()/as_tibble() to access results