Adds a flag variable to the input dataset which indicates if there exists at least one observation in another dataset fulfilling a certain condition.
Note: This is a wrapper function for the more generic derive_vars_merged()
.
derive_var_merged_exist_flag(
dataset,
dataset_add,
by_vars,
new_var,
condition,
true_value = "Y",
false_value = NA_character_,
missing_value = NA_character_,
filter_add = NULL
)
Input dataset
The variables specified by the by_vars
argument are expected to be in the dataset.
a dataset, i.e., a data.frame
or tibble
none
Additional dataset
The variables specified by the by_vars
argument are expected.
a dataset, i.e., a data.frame
or tibble
none
Grouping variables
list of variables created by exprs()
, e.g., exprs(USUBJID, VISIT)
none
New variable
The specified variable is added to the input dataset.
an unquoted symbol, e.g., AVAL
none
Condition
The condition is evaluated at the additional dataset (dataset_add
). For
all by groups where it evaluates as TRUE
at least once the new variable
is set to the true value (true_value
). For all by groups where it
evaluates as FALSE
or NA
for all observations the new variable is set
to the false value (false_value
). The new variable is set to the missing
value (missing_value
) for by groups not present in the additional
dataset.
an unquoted condition, e.g., AVISIT == "BASELINE"
none
True value
a character scalar, i.e., a character vector of length one
"Y"
False value
a character scalar, i.e., a character vector of length one
NA_character_
Value used for missing information
The new variable is set to the specified value for all by groups without observations in the additional dataset.
a character scalar, i.e., a character vector of length one
NA_character_
Filter for additional data
Only observations fulfilling the specified condition are taken into account for flagging. If the argument is not specified, all observations are considered.
an unquoted condition, e.g., AVISIT == "BASELINE"
NULL
The output dataset contains all observations and variables of the
input dataset and additionally the variable specified for new_var
derived
from the additional dataset (dataset_add
).
The additional dataset is restricted to the observations matching the
filter_add
condition.
The new variable is added to the input dataset and set to the true value
(true_value
) if for the by group at least one observation exists in the
(restricted) additional dataset where the condition evaluates to TRUE
. It
is set to the false value (false_value
) if for the by group at least one
observation exists and for all observations the condition evaluates to
FALSE
or NA
. Otherwise, it is set to the missing value
(missing_value
).
General Derivation Functions for all ADaMs that returns variable appended to dataset:
derive_var_extreme_flag()
,
derive_var_joined_exist_flag()
,
derive_var_merged_ef_msrc()
,
derive_var_merged_summary()
,
derive_var_obs_number()
,
derive_var_relative_flag()
,
derive_vars_cat()
,
derive_vars_computed()
,
derive_vars_joined()
,
derive_vars_joined_summary()
,
derive_vars_merged()
,
derive_vars_merged_lookup()
,
derive_vars_transposed()
library(dplyr, warn.conflicts = FALSE)
dm <- tribble(
~STUDYID, ~DOMAIN, ~USUBJID, ~AGE, ~AGEU,
"PILOT01", "DM", "01-1028", 71, "YEARS",
"PILOT01", "DM", "04-1127", 84, "YEARS",
"PILOT01", "DM", "06-1049", 60, "YEARS"
)
ae <- tribble(
~STUDYID, ~DOMAIN, ~USUBJID, ~AETERM, ~AEREL,
"PILOT01", "AE", "01-1028", "ERYTHEMA", "POSSIBLE",
"PILOT01", "AE", "01-1028", "PRURITUS", "PROBABLE",
"PILOT01", "AE", "06-1049", "SYNCOPE", "POSSIBLE",
"PILOT01", "AE", "06-1049", "SYNCOPE", "PROBABLE"
)
derive_var_merged_exist_flag(
dm,
dataset_add = ae,
by_vars = exprs(STUDYID, USUBJID),
new_var = AERELFL,
condition = AEREL == "PROBABLE"
) %>%
select(STUDYID, USUBJID, AGE, AGEU, AERELFL)
#> # A tibble: 3 × 5
#> STUDYID USUBJID AGE AGEU AERELFL
#> <chr> <chr> <dbl> <chr> <chr>
#> 1 PILOT01 01-1028 71 YEARS Y
#> 2 PILOT01 04-1127 84 YEARS NA
#> 3 PILOT01 06-1049 60 YEARS Y
vs <- tribble(
~STUDYID, ~DOMAIN, ~USUBJID, ~VISIT, ~VSTESTCD, ~VSSTRESN, ~VSBLFL,
"PILOT01", "VS", "01-1028", "SCREENING", "HEIGHT", 177.8, NA,
"PILOT01", "VS", "01-1028", "SCREENING", "WEIGHT", 98.88, NA,
"PILOT01", "VS", "01-1028", "BASELINE", "WEIGHT", 99.34, "Y",
"PILOT01", "VS", "01-1028", "WEEK 4", "WEIGHT", 98.88, NA,
"PILOT01", "VS", "04-1127", "SCREENING", "HEIGHT", 165.1, NA,
"PILOT01", "VS", "04-1127", "SCREENING", "WEIGHT", 42.87, NA,
"PILOT01", "VS", "04-1127", "BASELINE", "WEIGHT", 41.05, "Y",
"PILOT01", "VS", "04-1127", "WEEK 4", "WEIGHT", 41.73, NA,
"PILOT01", "VS", "06-1049", "SCREENING", "HEIGHT", 167.64, NA,
"PILOT01", "VS", "06-1049", "SCREENING", "WEIGHT", 57.61, NA,
"PILOT01", "VS", "06-1049", "BASELINE", "WEIGHT", 57.83, "Y",
"PILOT01", "VS", "06-1049", "WEEK 4", "WEIGHT", 58.97, NA
)
derive_var_merged_exist_flag(
dm,
dataset_add = vs,
by_vars = exprs(STUDYID, USUBJID),
filter_add = VSTESTCD == "WEIGHT" & VSBLFL == "Y",
new_var = WTBLHIFL,
condition = VSSTRESN > 90,
false_value = "N",
missing_value = "M"
) %>%
select(STUDYID, USUBJID, AGE, AGEU, WTBLHIFL)
#> # A tibble: 3 × 5
#> STUDYID USUBJID AGE AGEU WTBLHIFL
#> <chr> <chr> <dbl> <chr> <chr>
#> 1 PILOT01 01-1028 71 YEARS Y
#> 2 PILOT01 04-1127 84 YEARS N
#> 3 PILOT01 06-1049 60 YEARS N