AVALCATy
and AVALCAyN
R/derive_vars_cat.R
derive_vars_cat.Rd
Derive Categorization Variables Like AVALCATy
and AVALCAyN
derive_vars_cat(dataset, definition, by_vars = NULL)
Input dataset
The variables specified by the by_vars
and definition
arguments are expected to be in the dataset.
none
List of expressions created by exprs()
.
Must be in rectangular format and specified using the same syntax as when creating
a tibble
using the tribble()
function.
The definition
object will be converted to a tibble
using tribble()
inside this function.
Must contain:
the column condition
which will be converted to a logical expression and
will be used on the dataset
input.
at least one additional column with the new column name and the category value(s) used by the logical expression.
the column specified in by_vars
(if by_vars
is specified)
e.g. if by_vars
is not specified:
exprs(~condition, ~AVALCAT1, ~AVALCA1N,
AVAL >= 140, ">=140 cm", 1,
AVAL < 140, "<140 cm", 2)
e.g. if by_vars
is specified as exprs(VSTEST)
:
exprs(~VSTEST, ~condition, ~AVALCAT1, ~AVALCA1N,
"Height", AVAL >= 140, ">=140 cm", 1,
"Height", AVAL < 140, "<140 cm", 2)
none
list of expressions with one element. NULL
by default.
Allows for specifying by groups, e.g. exprs(PARAMCD)
.
Variable must be present in both dataset
and definition
.
The conditions in definition
are applied only to those records that match by_vars
.
The categorization variables are set to NA
for records
not matching any of the by groups in definition
.
NULL
The input dataset with the new variables defined in definition
added
If conditions are overlapping, the row order of definitions
must be carefully considered.
The first match will determine the category.
i.e. if
AVAL = 155
and the definition
is:
definition <- exprs(
~VSTEST, ~condition, ~AVALCAT1, ~AVALCA1N,
"Height", AVAL > 170, ">170 cm", 1,
"Height", AVAL <= 170, "<=170 cm", 2,
"Height", AVAL <= 160, "<=160 cm", 3
)
then AVALCAT1
will be "<=170 cm"
, as this is the first match for AVAL
.
If you specify:
definition <- exprs(
~VSTEST, ~condition, ~AVALCAT1, ~AVALCA1N,
"Height", AVAL <= 160, "<=160 cm", 3,
"Height", AVAL <= 170, "<=170 cm", 2,
"Height", AVAL > 170, ">170 cm", 1
)
Then AVAL <= 160
will lead to AVALCAT1 == "<=160 cm"
,
AVAL
in-between 160
and 170
will lead to AVALCAT1 == "<=170 cm"
,
and AVAL <= 170
will lead to AVALCAT1 == ">170 cm"
.
However, we suggest to be more explicit when defining the condition
, to avoid overlap.
In this case, the middle condition should be:
AVAL <= 170 & AVAL > 160
General Derivation Functions for all ADaMs that returns variable appended to dataset:
derive_var_extreme_flag()
,
derive_var_joined_exist_flag()
,
derive_var_merged_ef_msrc()
,
derive_var_merged_exist_flag()
,
derive_var_merged_summary()
,
derive_var_obs_number()
,
derive_var_relative_flag()
,
derive_vars_computed()
,
derive_vars_joined()
,
derive_vars_joined_summary()
,
derive_vars_merged()
,
derive_vars_merged_lookup()
,
derive_vars_transposed()
library(dplyr)
library(tibble)
advs <- tibble::tribble(
~USUBJID, ~VSTEST, ~AVAL,
"01-701-1015", "Height", 147.32,
"01-701-1015", "Weight", 53.98,
"01-701-1023", "Height", 162.56,
"01-701-1023", "Weight", NA,
"01-701-1028", "Height", NA,
"01-701-1028", "Weight", NA,
"01-701-1033", "Height", 175.26,
"01-701-1033", "Weight", 88.45
)
definition <- exprs(
~condition, ~AVALCAT1, ~AVALCA1N, ~NEWCOL,
VSTEST == "Height" & AVAL > 160, ">160 cm", 1, "extra1",
VSTEST == "Height" & AVAL <= 160, "<=160 cm", 2, "extra2"
)
derive_vars_cat(
dataset = advs,
definition = definition
)
#> # A tibble: 8 × 6
#> USUBJID VSTEST AVAL AVALCAT1 AVALCA1N NEWCOL
#> <chr> <chr> <dbl> <chr> <dbl> <chr>
#> 1 01-701-1015 Height 147. <=160 cm 2 extra2
#> 2 01-701-1015 Weight 54.0 NA NA NA
#> 3 01-701-1023 Height 163. >160 cm 1 extra1
#> 4 01-701-1023 Weight NA NA NA NA
#> 5 01-701-1028 Height NA NA NA NA
#> 6 01-701-1028 Weight NA NA NA NA
#> 7 01-701-1033 Height 175. >160 cm 1 extra1
#> 8 01-701-1033 Weight 88.4 NA NA NA
# Using by_vars:
definition2 <- exprs(
~VSTEST, ~condition, ~AVALCAT1, ~AVALCA1N,
"Height", AVAL > 160, ">160 cm", 1,
"Height", AVAL <= 160, "<=160 cm", 2,
"Weight", AVAL > 70, ">70 kg", 1,
"Weight", AVAL <= 70, "<=70 kg", 2
)
derive_vars_cat(
dataset = advs,
definition = definition2,
by_vars = exprs(VSTEST)
)
#> # A tibble: 8 × 5
#> USUBJID VSTEST AVAL AVALCAT1 AVALCA1N
#> <chr> <chr> <dbl> <chr> <dbl>
#> 1 01-701-1015 Height 147. <=160 cm 2
#> 2 01-701-1015 Weight 54.0 <=70 kg 2
#> 3 01-701-1023 Height 163. >160 cm 1
#> 4 01-701-1023 Weight NA NA NA
#> 5 01-701-1028 Height NA NA NA
#> 6 01-701-1028 Weight NA NA NA
#> 7 01-701-1033 Height 175. >160 cm 1
#> 8 01-701-1033 Weight 88.4 >70 kg 1
# With three conditions:
definition3 <- exprs(
~VSTEST, ~condition, ~AVALCAT1, ~AVALCA1N,
"Height", AVAL > 170, ">170 cm", 1,
"Height", AVAL <= 170 & AVAL > 160, "<=170 cm", 2,
"Height", AVAL <= 160, "<=160 cm", 3
)
derive_vars_cat(
dataset = advs,
definition = definition3,
by_vars = exprs(VSTEST)
)
#> # A tibble: 8 × 5
#> USUBJID VSTEST AVAL AVALCAT1 AVALCA1N
#> <chr> <chr> <dbl> <chr> <dbl>
#> 1 01-701-1015 Height 147. <=160 cm 3
#> 2 01-701-1015 Weight 54.0 NA NA
#> 3 01-701-1023 Height 163. <=170 cm 2
#> 4 01-701-1023 Weight NA NA NA
#> 5 01-701-1028 Height NA NA NA
#> 6 01-701-1028 Weight NA NA NA
#> 7 01-701-1033 Height 175. >170 cm 1
#> 8 01-701-1033 Weight 88.4 NA NA
# Let's derive both the MCRITyML and the MCRITyMN variables
adlb <- tibble::tribble(
~USUBJID, ~PARAM, ~AVAL, ~AVALU, ~ANRHI,
"01-701-1015", "ALT", 150, "U/L", 40,
"01-701-1023", "ALT", 70, "U/L", 40,
"01-701-1036", "ALT", 130, "U/L", 40,
"01-701-1048", "ALT", 30, "U/L", 40,
"01-701-1015", "AST", 50, "U/L", 35
)
definition_mcrit <- exprs(
~PARAM, ~condition, ~MCRIT1ML, ~MCRIT1MN,
"ALT", AVAL <= ANRHI, "<=ANRHI", 1,
"ALT", ANRHI < AVAL & AVAL <= 3 * ANRHI, ">1-3*ANRHI", 2,
"ALT", 3 * ANRHI < AVAL, ">3*ANRHI", 3
)
adlb %>%
derive_vars_cat(
definition = definition_mcrit,
by_vars = exprs(PARAM)
)
#> # A tibble: 5 × 7
#> USUBJID PARAM AVAL AVALU ANRHI MCRIT1ML MCRIT1MN
#> <chr> <chr> <dbl> <chr> <dbl> <chr> <dbl>
#> 1 01-701-1015 ALT 150 U/L 40 >3*ANRHI 3
#> 2 01-701-1023 ALT 70 U/L 40 >1-3*ANRHI 2
#> 3 01-701-1036 ALT 130 U/L 40 >3*ANRHI 3
#> 4 01-701-1048 ALT 30 U/L 40 <=ANRHI 1
#> 5 01-701-1015 AST 50 U/L 35 NA NA