Transforms a vector of any type, or a combination of several vectors, into an integer vector ranging from 1 to the number of unique values. In effect, this creates a unique identifier vector.
to_integer(
...,
sorted = FALSE,
add_items = FALSE,
items.list = FALSE,
multi.df = FALSE,
multi.join = "_",
internal = FALSE
)
...
Vectors of any type, to be transformed into integers.
sorted
Logical, default is FALSE. Whether the integer values should follow the
sorted order of the original values.
add_items
Logical, default is FALSE. Whether to add the unique values of the
original vector(s). If requested, an attribute named items is created containing the
values (alternatively, they can be returned in a list if items.list=TRUE).
items.list
Logical, default is FALSE. Only used if add_items=TRUE. If TRUE,
then a list of length 2 is returned with x the integer vector and items the vector of items.
multi.df
Logical, default is FALSE. If TRUE, the unique elements are returned
in the form of a data.frame. Ignored if add_items = FALSE.
multi.join
Character scalar used to join the items of multiple vectors.
The default is "_". Ignored if add_items = FALSE.
internal
Logical, default is FALSE. For programming only. If this function
is used within another function, setting internal = TRUE is needed to make the
evaluation of ... valid. End users of to_integer can ignore this argument.
Returns an integer vector of the same length as the input vectors.
If add_items=TRUE and items.list=TRUE, a list of two elements is returned: x
being the integer vector and items being the unique values to which the values
in x make reference.
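The relationship between x and items can be illustrated with a small base-R sketch. It is not how to_integer is implemented: the helper name to_integer_sketch is hypothetical, and it numbers unsorted values by order of first appearance, which is an assumption that need not match the ordering to_integer uses when sorted = FALSE.

```r
# Hypothetical helper, for illustration only (not part of the package).
to_integer_sketch <- function(x, sorted = FALSE) {
  items <- unique(x)                 # distinct values, in order of first appearance
  if (sorted) items <- sort(items)   # optionally number the values in sorted order
  list(x = match(x, items), items = items)
}

v <- c("b", "a", "b", "c", "a")
res <- to_integer_sketch(v, sorted = TRUE)
res$x      # 2 1 2 3 1
res$items  # "a" "b" "c"

# Indexing items by x recovers the original vector
all(res$items[res$x] == v)
```

Whatever the numbering, the property that matters is that items[x] reproduces the input, which is what makes x usable as an index.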
x1 = iris$Species
x2 = as.integer(iris$Sepal.Length)
# transforms the species vector into integers
to_integer(x1)
#> [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#> [38] 1 1 1 1 1 1 1 1 1 1 1 1 1 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
#> [75] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 2 2 2 2 2 2 2 2 2 2 2
#> [112] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
#> [149] 2 2
# To obtain the "items":
to_integer(x1, add_items = TRUE)
#> [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#> [38] 1 1 1 1 1 1 1 1 1 1 1 1 1 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
#> [75] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 2 2 2 2 2 2 2 2 2 2 2
#> [112] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
#> [149] 2 2
#> attr(,"items")
#> [1] "setosa" "virginica" "versicolor"
# same but in list form
to_integer(x1, add_items = TRUE, items.list = TRUE)
#> $x
#> [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#> [38] 1 1 1 1 1 1 1 1 1 1 1 1 1 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
#> [75] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 2 2 2 2 2 2 2 2 2 2 2
#> [112] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
#> [149] 2 2
#>
#> $items
#> [1] "setosa" "virginica" "versicolor"
#>
# transforms x2 into an integer vector from 1 to 4
to_integer(x2, add_items = TRUE)
#> [1] 1 2 2 2 1 1 2 1 2 2 1 2 2 2 1 1 1 1 1 1 1 1 2 1 2 1 1 1 1 2 2 1 1 1 2 1 1
#> [38] 2 2 1 1 2 2 1 1 2 1 2 1 1 3 4 4 1 4 1 4 2 4 1 1 1 4 4 1 4 1 1 4 1 1 4 4 4
#> [75] 4 4 4 4 4 1 1 1 1 4 1 4 4 4 1 1 1 4 1 1 1 1 1 4 1 1 4 1 3 4 4 3 2 3 4 3 4
#> [112] 4 4 1 1 4 4 3 3 4 4 1 3 4 4 3 4 4 4 3 3 3 4 4 4 3 4 4 4 4 4 4 1 4 4 4 4 4
#> [149] 4 1
#> attr(,"items")
#> [1] 5 4 7 6
# To have the sorted items:
to_integer(x2, add_items = TRUE, sorted = TRUE)
#> [1] 2 1 1 1 2 2 1 2 1 1 2 1 1 1 2 2 2 2 2 2 2 2 1 2 1 2 2 2 2 1 1 2 2 2 1 2 2
#> [38] 1 1 2 2 1 1 2 2 1 2 1 2 2 4 3 3 2 3 2 3 1 3 2 2 2 3 3 2 3 2 2 3 2 2 3 3 3
#> [75] 3 3 3 3 3 2 2 2 2 3 2 3 3 3 2 2 2 3 2 2 2 2 2 3 2 2 3 2 4 3 3 4 1 4 3 4 3
#> [112] 3 3 2 2 3 3 4 4 3 3 2 4 3 3 4 3 3 3 4 4 4 3 3 3 4 3 3 3 3 3 3 2 3 3 3 3 3
#> [149] 3 2
#> attr(,"items")
#> [1] 4 5 6 7
# The result can safely be used as an index
res = to_integer(x2, add_items = TRUE, sorted = TRUE, items.list = TRUE)
all(res$items[res$x] == x2)
#> [1] TRUE
#
# Multiple vectors
#
to_integer(x1, x2, add_items = TRUE)
#> [1] 2 1 1 1 2 2 1 2 1 1 2 1 1 1 2 2 2 2 2 2 2 2 1 2 1
#> [26] 2 2 2 2 1 1 2 2 2 1 2 2 1 1 2 2 1 1 2 2 1 2 1 2 2
#> [51] 6 5 5 4 5 4 5 3 5 4 4 4 5 5 4 5 4 4 5 4 4 5 5 5 5
#> [76] 5 5 5 5 4 4 4 4 5 4 5 5 5 4 4 4 5 4 4 4 4 4 5 4 4
#> [101] 9 8 10 9 9 10 7 10 9 10 9 9 9 8 8 9 9 10 10 9 9 8 10 9 9
#> [126] 10 9 9 9 10 10 10 9 9 9 10 9 9 9 9 9 9 8 9 9 9 9 9 9 8
#> attr(,"items")
#> [1] "setosa_4" "setosa_5" "versicolor_4" "versicolor_5" "versicolor_6"
#> [6] "versicolor_7" "virginica_4" "virginica_5" "virginica_6" "virginica_7"
# You can use multi.join to handle the join of the items:
to_integer(x1, x2, add_items = TRUE, multi.join = "; ")
#> [1] 2 1 1 1 2 2 1 2 1 1 2 1 1 1 2 2 2 2 2 2 2 2 1 2 1
#> [26] 2 2 2 2 1 1 2 2 2 1 2 2 1 1 2 2 1 1 2 2 1 2 1 2 2
#> [51] 6 5 5 4 5 4 5 3 5 4 4 4 5 5 4 5 4 4 5 4 4 5 5 5 5
#> [76] 5 5 5 5 4 4 4 4 5 4 5 5 5 4 4 4 5 4 4 4 4 4 5 4 4
#> [101] 9 8 10 9 9 10 7 10 9 10 9 9 9 8 8 9 9 10 10 9 9 8 10 9 9
#> [126] 10 9 9 9 10 10 10 9 9 9 10 9 9 9 9 9 9 8 9 9 9 9 9 9 8
#> attr(,"items")
#> [1] "setosa; 4" "setosa; 5" "versicolor; 4" "versicolor; 5"
#> [5] "versicolor; 6" "versicolor; 7" "virginica; 4" "virginica; 5"
#> [9] "virginica; 6" "virginica; 7"
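With several vectors, the identifier can be thought of as numbering the distinct combinations of values. The following base-R sketch illustrates that idea under stated assumptions: the helper combine_sketch is hypothetical, it handles exactly two vectors, and it numbers combinations by first appearance, which may differ from the ordering to_integer produces. The last lines show how the distinct combinations could be listed as a data.frame, in the spirit of multi.df.

```r
# Hypothetical helper, for illustration only (not part of the package).
combine_sketch <- function(a, b, join = "_") {
  key <- paste(a, b, sep = join)   # one joined label per observation
  items <- unique(key)             # distinct combinations, by first appearance
  list(x = match(key, items), items = items)
}

a <- c("u", "u", "v", "v", "u")
b <- c(1L, 2L, 1L, 1L, 1L)
res <- combine_sketch(a, b, join = "; ")
res$x      # 1 2 3 3 1
res$items  # "u; 1" "u; 2" "v; 1"

# The distinct combinations as a data.frame, one column per input vector
combos <- unique(data.frame(a = a, b = b))
nrow(combos) == length(res$items)
```

Joining labels with a separator is only safe for building identifiers when the separator cannot create ambiguous labels; keeping the combinations in a data.frame avoids that concern.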