compress_rows.data.frame "compresses" a data frame, returning unique rows and a tally of the number of times each row is repeated, as well as a permutation vector that can reconstruct the original data frame. decompress_rows.compressed_rows_df reconstructs the original data frame.

# S3 method for class 'data.frame'
compress_rows(x, ...)

# S3 method for class 'compressed_rows_df'
decompress_rows(x, ...)

Arguments

x

For compress_rows.data.frame, a data.frame to be compressed. For decompress_rows.compressed_rows_df, a list as returned by compress_rows.data.frame.

...

Additional arguments, currently unused.

Value

For compress_rows.data.frame, a list with four elements:

rows

Unique rows of x

frequencies

A vector with one entry per unique row in rows, giving the number of times the corresponding row occurs in x

ordering

A vector such that, if c is the compressed data frame, c$rows[c$ordering, , drop=FALSE] equals the original data frame, except for row names

rownames

Row names of x

For decompress_rows.compressed_rows_df, the original data frame.
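To make the relationship between rows, frequencies, and ordering concrete, here is a minimal base R sketch that builds a list of the same shape. The helper sketch_compress is hypothetical and is not part of the package; it assumes unique rows are stored in lexicographic order of the columns (as the example output on this page suggests), and it may differ from the actual implementation in details such as class attributes and row-name handling.

```r
# Hypothetical helper (an illustration, not the package's code): builds a
# list with the four documented elements using base R only.
sketch_compress <- function(x) {
  # Permutation that sorts the rows lexicographically by all columns.
  o <- do.call(order, unname(as.list(x)))
  sorted <- x[o, , drop = FALSE]
  # TRUE at the first row of each run of identical rows.
  first <- !duplicated(sorted)
  # For each sorted row, the index of its unique row.
  group <- cumsum(first)
  # Map each original row position to the index of its unique row.
  ordering <- integer(nrow(x))
  ordering[o] <- group
  list(rows = sorted[first, , drop = FALSE],
       frequencies = tabulate(group, nbins = sum(first)),
       ordering = ordering,
       rownames = rownames(x))
}

x <- data.frame(a = c(2, 1, 2, 1, 1), b = c(1, 2, 1, 2, 3))
s <- sketch_compress(x)

# Indexing the unique rows by ordering recovers the original values.
rebuilt <- s$rows[s$ordering, , drop = FALSE]
rownames(rebuilt) <- s$rownames
stopifnot(all(rebuilt == x))
```

In this sketch the frequencies always sum to nrow(x), and ordering has one entry per original row, each pointing at a row of rows.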

Examples


(x <- data.frame(V1=sample.int(3,30,replace=TRUE),
                 V2=sample.int(2,30,replace=TRUE),
                 V3=sample.int(4,30,replace=TRUE)))
#>    V1 V2 V3
#> 1   2  1  1
#> 2   2  1  1
#> 3   3  1  1
#> 4   3  2  1
#> 5   1  2  3
#> 6   3  1  4
#> 7   2  2  2
#> 8   2  1  2
#> 9   1  2  3
#> 10  2  2  2
#> 11  2  2  1
#> 12  1  1  1
#> 13  2  2  1
#> 14  2  1  2
#> 15  2  2  1
#> 16  1  1  3
#> 17  1  2  2
#> 18  1  1  1
#> 19  1  1  4
#> 20  1  2  1
#> 21  1  1  3
#> 22  2  2  4
#> 23  3  1  3
#> 24  3  2  3
#> 25  1  2  4
#> 26  1  2  4
#> 27  1  2  1
#> 28  2  1  3
#> 29  1  1  2
#> 30  2  2  2

(c <- compress_rows(x))
#>    V1 V2 V3
#> 12  1  1  1
#> 29  1  1  2
#> 16  1  1  3
#> 19  1  1  4
#> 20  1  2  1
#> 17  1  2  2
#> 5   1  2  3
#> 25  1  2  4
#> 1   2  1  1
#> 8   2  1  2
#> 28  2  1  3
#> 11  2  2  1
#> 7   2  2  2
#> 22  2  2  4
#> 3   3  1  1
#> 23  3  1  3
#> 6   3  1  4
#> 4   3  2  1
#> 24  3  2  3

stopifnot(all(decompress_rows(c)==x))