formula
R/validation.R
validate_no_formula_duplication.Rd
validate - asserts the following:
formula must not have duplicate terms on the left- and right-hand
sides of the formula.
check - returns the following:
ok A logical. Does the check pass?
duplicates A character vector. The duplicate terms.
validate_no_formula_duplication(formula, original = FALSE)
check_no_formula_duplication(formula, original = FALSE)
validate_no_formula_duplication() returns formula invisibly.
check_no_formula_duplication() returns a named list of two components,
ok and duplicates.
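The two return values described above can be seen side by side in a minimal sketch. It assumes the hardhat package is installed and attached; the variable names `f` and `result` are illustrative only.

```r
library(hardhat)

# validate_*() returns its input invisibly, so nothing is printed here,
# but the formula can still be captured and reused.
f <- validate_no_formula_duplication(y ~ x)
inherits(f, "formula")

# check_*() returns a named list that can be inspected directly.
result <- check_no_formula_duplication(y ~ x)
result$ok
result$duplicates
```

Because the formula comes back invisibly, a passing validation call can sit at the start of a longer preprocessing step without cluttering the console.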
hardhat provides validation functions at two levels.
check_*(): check a condition, and return a list. The list
always contains at least one element, ok, a logical that specifies if the
check passed. Each check also has check-specific elements in the returned
list that can be used to construct meaningful error messages.
validate_*(): check a condition, and error if it does not pass. These
functions call their corresponding check function, and
then provide a default error message. If you, as a developer, want a
different error message, then call the check_*() function yourself,
and provide your own validation function.
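The pattern described above, calling the check function yourself and supplying your own error message, might look like the following sketch. The function `my_validate_no_formula_duplication()` and its message wording are hypothetical and not part of hardhat; only `check_no_formula_duplication()` and its `ok` and `duplicates` elements come from the package.

```r
library(hardhat)

# Hypothetical custom validator (not part of hardhat): runs the same check
# but raises a developer-supplied error message.
my_validate_no_formula_duplication <- function(formula, original = FALSE) {
  result <- check_no_formula_duplication(formula, original = original)

  if (!result$ok) {
    stop(
      "My model does not allow a term on both sides of the formula. ",
      "Offending term(s): ",
      paste(result$duplicates, collapse = ", "),
      call. = FALSE
    )
  }

  # Mirror the validate_*() convention of returning the input invisibly.
  invisible(formula)
}

# Passes silently.
my_validate_no_formula_duplication(y ~ x)

# Errors with the custom message.
try(my_validate_no_formula_duplication(y ~ y))
```

The check-specific `duplicates` element is what makes the custom message informative; a developer who needs different wording only has to change the `stop()` call.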
Other validation functions:
validate_column_names(),
validate_outcomes_are_binary(),
validate_outcomes_are_factors(),
validate_outcomes_are_numeric(),
validate_outcomes_are_univariate(),
validate_prediction_size(),
validate_predictors_are_numeric()
# All good
check_no_formula_duplication(y ~ x)
#> $ok
#> [1] TRUE
#>
#> $duplicates
#> character(0)
#>
# Not good!
check_no_formula_duplication(y ~ y)
#> $ok
#> [1] FALSE
#>
#> $duplicates
#> [1] "y"
#>
# This is generally okay
check_no_formula_duplication(y ~ log(y))
#> $ok
#> [1] TRUE
#>
#> $duplicates
#> character(0)
#>
# But you can be more strict
check_no_formula_duplication(y ~ log(y), original = TRUE)
#> $ok
#> [1] FALSE
#>
#> $duplicates
#> [1] "y"
#>
# This would throw an error
try(validate_no_formula_duplication(log(y) ~ log(y)))
#> Error in validate_no_formula_duplication(log(y) ~ log(y)) :
#> Terms must not be duplicated on the left- and right-hand side of the
#> `formula`.
#> ℹ The following duplicated term was found: "log(y)"