add_columns() combines two or more data frames, but unlike
cbind() or dplyr::bind_cols(), this function
binds data as the last columns of a data frame (i.e., after the columns
specified in ...). This can be useful in a "pipe" workflow, where
a data frame returned by a previous function should be appended
at the end of another data frame that is processed in
add_columns().

replace_columns() replaces all columns in data with
identically named columns in ..., and adds the remaining (non-duplicated)
columns from ... to data.

add_id() simply adds an ID column to the data frame, with values
from 1 to nrow(data) or, for grouped data frames, values
from 1 to the group size within each group. See 'Examples'.
add_columns(data, ..., replace = TRUE)
replace_columns(data, ..., add.unique = TRUE)
add_id(data, var = "ID")

data: A data frame. For add_columns(), data will be bound after the data
frames specified in .... For replace_columns(), duplicated
columns in data will be replaced by columns in ....
...: More data frames to combine or, for replace_columns(), more data frames
with columns that should replace columns in data.
replace: Logical, if TRUE (default), columns in ... with
identical names in data will replace the columns in data.
The order of columns after replacing is preserved.
add.unique: Logical, if TRUE (default), remaining columns in
... that did not replace any column in data are appended
as new columns to data.
var: Name of the new ID variable.
For add_columns(), a data frame, where columns of data
are appended after columns of ....
For replace_columns(), a data frame where columns in data
will be replaced by identically named columns in ..., and remaining
columns from ... will be appended to data (if
add.unique = TRUE).
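The replacement rule described for replace_columns() can be sketched in base R. This is only an illustration of the documented behaviour, not the package's implementation: the helper replace_cols_sketch() and its arguments new and add_unique are hypothetical names, and the sketch handles a single additional data frame.

```r
# Hypothetical helper (not part of sjmisc): mimics the documented rule
# for one additional data frame `new`.
replace_cols_sketch <- function(data, new, add_unique = TRUE) {
  # columns present in both data frames keep their position in `data`
  shared <- intersect(names(new), names(data))
  # columns that exist only in `new`
  extra <- setdiff(names(new), names(data))
  for (nm in shared) data[[nm]] <- new[[nm]]
  if (add_unique) {
    # remaining columns are appended at the end
    for (nm in extra) data[[nm]] <- new[[nm]]
  }
  data
}

d1 <- data.frame(a = 1:3, b = 4:6, c = 7:9)
d2 <- data.frame(
  b = c("x", "y", "z"),
  d = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# "b" is replaced in place, "d" is appended
names(replace_cols_sketch(d1, d2))
# "b" is replaced in place, "d" is omitted
names(replace_cols_sketch(d1, d2, add_unique = FALSE))
```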
For add_id(), the data frame with a new column of ID numbers. This column is always
the first column in the returned data frame.
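The numbering that add_id() is described as producing, including the restart within each group, can be approximated in base R. The sketch below is an illustration under assumptions rather than the package's code: add_id_sketch() and its group argument are hypothetical names, and grouping is given as a column name instead of a grouped data frame.

```r
# Hypothetical helper (not part of sjmisc): puts an ID column first,
# numbering rows 1..n overall or 1..(group size) within each group.
add_id_sketch <- function(data, var = "ID", group = NULL) {
  idx <- seq_len(nrow(data))
  id <- if (is.null(group)) {
    idx
  } else {
    # ave() applies seq_along() separately to the rows of each group
    ave(idx, data[[group]], FUN = seq_along)
  }
  # bind the ID vector in front of the original columns
  out <- cbind(id, data)
  names(out)[1] <- var
  out
}

data(mtcars)
head(add_id_sketch(mtcars))
head(add_id_sketch(mtcars, group = "gear"))
```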
For add_columns(), by default, columns in data that have the same
names as columns in one of the data frames in ...
will be dropped (i.e. variables with identical names in ... will
replace existing variables in data). Use replace = FALSE to
keep all columns. Identically named columns will then be renamed to ensure
unique column names (this is what dplyr::bind_cols() does by
default). When replacing columns, replaced columns
are not added to the end of the data frame; instead, the original order of
columns is preserved.
data(efc)
d1 <- efc[, 1:3]
d2 <- efc[, 4:6]
if (require("dplyr") && require("sjlabelled")) {
head(bind_cols(d1, d2))
add_columns(d1, d2) %>% head()
d1 <- efc[, 1:3]
d2 <- efc[, 2:6]
add_columns(d1, d2, replace = TRUE) %>% head()
add_columns(d1, d2, replace = FALSE) %>% head()
# use case: we take the original data frame, select specific
# variables and do some transformations or recodings
# (standardization in this example) and add the new, transformed
# variables *to the end* of the original data frame
efc %>%
select(e17age, c160age) %>%
std() %>%
add_columns(efc) %>%
head()
# new variables with same name will overwrite old variables
# in "efc". order of columns is not changed.
efc %>%
select(e16sex, e42dep) %>%
to_factor() %>%
add_columns(efc) %>%
head()
# keep both old and new variables, automatically
# rename variables with identical name
efc %>%
select(e16sex, e42dep) %>%
to_factor() %>%
add_columns(efc, replace = FALSE) %>%
head()
# create sample data frames
d1 <- efc[, 1:10]
d2 <- efc[, 2:3]
d3 <- efc[, 7:8]
d4 <- efc[, 10:12]
# show original
head(d1)
library(sjlabelled)
# slightly change variables, to see effect
d2 <- as_label(d2)
d3 <- as_label(d3)
# replace duplicated columns, append remaining
replace_columns(d1, d2, d3, d4) %>% head()
# replace duplicated columns, omit remaining
replace_columns(d1, d2, d3, d4, add.unique = FALSE) %>% head()
# add ID to dataset
library(dplyr)
data(mtcars)
add_id(mtcars)
mtcars %>%
group_by(gear) %>%
add_id() %>%
arrange(gear, ID) %>%
print(n = 100)
}
#> New names:
#> • `e15relat` -> `e15relat...1`
#> • `e16sex` -> `e16sex...2`
#> • `e15relat` -> `e15relat...7`
#> • `e16sex` -> `e16sex...8`
#> New names:
#> • `e16sex` -> `e16sex...3`
#> • `e42dep` -> `e42dep...5`
#> • `e16sex` -> `e16sex...27`
#> • `e42dep` -> `e42dep...28`
#> # A tibble: 32 × 12
#> ID mpg cyl disp hp drat wt qsec vs am gear carb
#> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1 21.4 6 258 110 3.08 3.22 19.4 1 0 3 1
#> 2 2 18.7 8 360 175 3.15 3.44 17.0 0 0 3 2
#> 3 3 18.1 6 225 105 2.76 3.46 20.2 1 0 3 1
#> 4 4 14.3 8 360 245 3.21 3.57 15.8 0 0 3 4
#> 5 5 16.4 8 276. 180 3.07 4.07 17.4 0 0 3 3
#> 6 6 17.3 8 276. 180 3.07 3.73 17.6 0 0 3 3
#> 7 7 15.2 8 276. 180 3.07 3.78 18 0 0 3 3
#> 8 8 10.4 8 472 205 2.93 5.25 18.0 0 0 3 4
#> 9 9 10.4 8 460 215 3 5.42 17.8 0 0 3 4
#> 10 10 14.7 8 440 230 3.23 5.34 17.4 0 0 3 4
#> 11 11 21.5 4 120. 97 3.7 2.46 20.0 1 0 3 1
#> 12 12 15.5 8 318 150 2.76 3.52 16.9 0 0 3 2
#> 13 13 15.2 8 304 150 3.15 3.44 17.3 0 0 3 2
#> 14 14 13.3 8 350 245 3.73 3.84 15.4 0 0 3 4
#> 15 15 19.2 8 400 175 3.08 3.84 17.0 0 0 3 2
#> 16 1 21 6 160 110 3.9 2.62 16.5 0 1 4 4
#> 17 2 21 6 160 110 3.9 2.88 17.0 0 1 4 4
#> 18 3 22.8 4 108 93 3.85 2.32 18.6 1 1 4 1
#> 19 4 24.4 4 147. 62 3.69 3.19 20 1 0 4 2
#> 20 5 22.8 4 141. 95 3.92 3.15 22.9 1 0 4 2
#> 21 6 19.2 6 168. 123 3.92 3.44 18.3 1 0 4 4
#> 22 7 17.8 6 168. 123 3.92 3.44 18.9 1 0 4 4
#> 23 8 32.4 4 78.7 66 4.08 2.2 19.5 1 1 4 1
#> 24 9 30.4 4 75.7 52 4.93 1.62 18.5 1 1 4 2
#> 25 10 33.9 4 71.1 65 4.22 1.84 19.9 1 1 4 1
#> 26 11 27.3 4 79 66 4.08 1.94 18.9 1 1 4 1
#> 27 12 21.4 4 121 109 4.11 2.78 18.6 1 1 4 2
#> 28 1 26 4 120. 91 4.43 2.14 16.7 0 1 5 2
#> 29 2 30.4 4 95.1 113 3.77 1.51 16.9 1 1 5 2
#> 30 3 15.8 8 351 264 4.22 3.17 14.5 0 1 5 4
#> 31 4 19.7 6 145 175 3.62 2.77 15.5 0 1 5 6
#> 32 5 15 8 301 335 3.54 3.57 14.6 0 1 5 8