
Recode and Replace Values in Matrix-Like Objects
A small suite of functions to efficiently perform common recoding and replacing tasks in matrix-like objects.
Usage
recode_num(X, ..., default = NULL, missing = NULL, set = FALSE)
recode_char(X, ..., default = NULL, missing = NULL, regex = FALSE,
            ignore.case = FALSE, fixed = FALSE, set = FALSE)
replace_na(X, value = 0, cols = NULL, set = FALSE, type = "const")
replace_inf(X, value = NA, replace.nan = FALSE, set = FALSE)
replace_outliers(X, limits, value = NA,
                 single.limit = c("sd", "mad", "min", "max"),
                 ignore.groups = FALSE, set = FALSE)
Arguments
- X
a vector, matrix, array, data frame or list of atomic objects. replace_outliers has internal methods for grouped and indexed data.
- ...
comma-separated recode arguments of the form: value = replacement, `2` = 0, Secondary = "SEC" etc. recode_char with regex = TRUE also supports regular expressions i.e. `^S|D$` = "STD" etc.
- default
optional argument to specify a scalar value to replace non-matched elements with.
- missing
optional argument to specify a scalar value to replace missing elements with. Note that, for efficiency, this is done before the rest of the recoding, i.e. the recoding is performed on data in which missing values have already been filled.
- set
logical. TRUE does replacements by reference (i.e. in-place modification of the data) and returns the result invisibly.
- type
character. One of "const", "locf" (last non-missing observation carried forward) or "focb" (first non-missing observation carried back). The latter two ignore value.
- regex
logical. If TRUE, all recode-argument names are (sequentially) passed to grepl as a pattern to search X. All matches are replaced. Note that NA's are also matched as strings by grepl.
- value
a single (scalar) value to replace matching elements with. In replace_outliers setting value = "clip" will replace outliers with the corresponding threshold values. See Examples.
- cols
select columns to replace missing values in using a function, column names, indices or a logical vector.
- replace.nan
logical. TRUE replaces NaN/Inf/-Inf. FALSE (default) replaces only Inf/-Inf.
- limits
either a vector of two numeric values c(minval, maxval) constituting a two-sided outlier threshold, or a single numeric value:
- single.limit
character, controls the behavior if length(limits) == 1:
  - "sd"/"mad": limits will be interpreted as a (two-sided) outlier threshold in terms of (column) standard deviations/median absolute deviations. For the standard deviation this is equivalent to X[abs(fscale(X)) > limits] <- value. Since fscale is an S3 generic with methods for 'grouped_df', 'pseries' and 'pdata.frame', the standardizing will be grouped if such objects are passed (i.e. the outlier threshold is then measured in within-group standard deviations) unless ignore.groups = TRUE. The same holds for median absolute deviations.
  - "min"/"max": limits will be interpreted as a (one-sided) minimum/maximum threshold. The underlying code is equivalent to X[X </> limits] <- value.
- ignore.groups
logical. If length(limits) == 1 and single.limit %in% c("sd", "mad") and X is a 'grouped_df', 'pseries' or 'pdata.frame', TRUE will ignore the grouped nature of the data and calculate outlier thresholds on the entire dataset rather than within each group.
- ignore.case, fixed
logical. Passed to grepl and only applicable if regex = TRUE.
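The threshold rules described for limits, single.limit and value = "clip" can be illustrated with a short base R sketch. It only mirrors the equivalences stated above on a toy vector and is not the package's internal code; the vector x and the cut-offs 2, 50 and 1.5 are made up for illustration, and a plain z-score stands in for fscale.

```r
# Base R sketch of the documented threshold rules (illustrative values only).
x <- c(1, 2, 3, 4, 100)

# Two-sided limits c(minval, maxval): values strictly outside [2, 50] become NA
two_sided <- x
two_sided[two_sided < 2 | two_sided > 50] <- NA   # NA 2 3 4 NA

# single.limit = "min" / "max": one-sided thresholds, X[X </> limits] <- value
min_only <- x
min_only[min_only < 2] <- NA                      # NA 2 3 4 100
max_only <- x
max_only[max_only > 50] <- NA                     # 1 2 3 4 NA

# single.limit = "sd": threshold measured in standard deviations from the mean
z <- (x - mean(x)) / sd(x)
sd_rule <- x
sd_rule[abs(z) > 1.5] <- NA                       # 1 2 3 4 NA

# value = "clip": outliers are replaced by the threshold they crossed
clipped <- pmin(pmax(x, 2), 50)                   # 2 2 3 4 50
```

Values exactly equal to a threshold are kept, which matches the strict comparisons in the documented equivalence.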
Details
recode_num and recode_char can be used to efficiently recode multiple numeric or character values, respectively. The syntax is inspired by dplyr::recode, but the functionality is enhanced in the following respects: (1) when passed a data frame / list, all appropriately typed columns will be recoded. (2) They preserve the attributes of the data object and of columns in a data frame / list, and (3) recode_char also supports regular expression matching using grepl.
replace_na efficiently replaces NA/NaN with a value (default is 0). Data can be multi-typed, in which case appropriate columns can be selected through the cols argument. For numeric data a more versatile alternative is provided by data.table::nafill and data.table::setnafill.
replace_inf replaces Inf/-Inf (or optionally NaN/Inf/-Inf) with a value (default is NA). It skips non-numeric columns in a data frame.
replace_outliers replaces values falling outside a 1- or 2-sided numeric threshold or outside a certain number of standard deviations or median absolute deviations with a value (default is NA). It skips non-numeric columns in a data frame.
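To see why recoding several values at once differs from a chain of in-place replacements, the following base R sketch may help. The helper recode_sketch is hypothetical and written only for this illustration (it is not how recode_char is implemented); it matches every rule against the original values, which is the cross-replacing behaviour visible in the first entry of the Examples.

```r
# Hypothetical helper: every rule is matched against the ORIGINAL values,
# so a -> b followed by b -> c does not turn "a" into "c".
recode_sketch <- function(x, ...) {
  rules <- list(...)
  out <- x
  for (old in names(rules)) {
    out[which(x == old)] <- rules[[old]]  # index computed on x, not on out
  }
  out
}

recode_sketch(c("a", "b", "c"), a = "b", b = "c")
# "b" "c" "c"

# Naive sequential replacement on the same object, for contrast:
sequential <- c("a", "b", "c")
sequential[sequential == "a"] <- "b"
sequential[sequential == "b"] <- "c"
sequential
# "c" "c" "c"
```

Matching against an untouched copy is what makes cross-replacing safe, at the cost of the extra copy.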
Note
These functions are not generic and do not offer support for factors or date(-time) objects. See dplyr::recode_factor, forcats and other appropriate packages for dealing with these classes.
Simple replacing tasks on a vector can also effectively be handled by setv / copyv. Fast vectorized switches are offered by package kit (functions iif, nif, vswitch, nswitch) as well as data.table::fcase and data.table::fifelse. Using switches is more efficient than recode_*, as recode_* creates an internal copy of the object to enable cross-replacing.
Function TRA, and the associated TRA ('transform') argument to Fast Statistical Functions, also have the option "replace_na" to replace missing values with a statistic computed on the non-missing observations, e.g. fmedian(airquality, TRA = "replace_na") does median imputation.
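As a rough illustration of what median imputation means here, the base R sketch below fills each column's missing values with that column's median over the non-missing observations. It is only an assumed equivalent written for clarity, not the package's code; details such as the storage type of the result may differ from what fmedian(airquality, TRA = "replace_na") returns.

```r
# Base R sketch of column-wise median imputation on the built-in airquality data.
aq <- airquality
for (j in names(aq)) {
  miss <- is.na(aq[[j]])
  if (any(miss)) {
    # statistic computed on the non-missing observations only
    aq[[j]][miss] <- median(aq[[j]], na.rm = TRUE)
  }
}

anyNA(airquality)  # TRUE: the original data contain missing values
anyNA(aq)          # FALSE: all gaps have been filled
```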
Examples
recode_char(c("a","b","c"), a = "b", b = "c")
#> [1] "b" "c" "c"
recode_char(month.name, ber = NA, regex = TRUE)
#> [1] "January" "February" "March" "April" "May" "June"
#> [7] "July" "August" NA NA NA NA
mtcr <- recode_num(mtcars, `0` = 2, `4` = Inf, `1` = NaN)
replace_inf(mtcr)
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 2 NaN NA NA
#> Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 2 NaN NA NA
#> Datsun 710 22.8 NA 108.0 93 3.85 2.320 18.61 NaN NaN NA NaN
#> Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 NaN 2 3 NaN
#> Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 2 2 3 2
#> Valiant 18.1 6 225.0 105 2.76 3.460 20.22 NaN 2 3 NaN
#> Duster 360 14.3 8 360.0 245 3.21 3.570 15.84 2 2 3 NA
#> Merc 240D 24.4 NA 146.7 62 3.69 3.190 20.00 NaN 2 NA 2
#> Merc 230 22.8 NA 140.8 95 3.92 3.150 22.90 NaN 2 NA 2
#> Merc 280 19.2 6 167.6 123 3.92 3.440 18.30 NaN 2 NA NA
#> Merc 280C 17.8 6 167.6 123 3.92 3.440 18.90 NaN 2 NA NA
#> Merc 450SE 16.4 8 275.8 180 3.07 4.070 17.40 2 2 3 3
#> Merc 450SL 17.3 8 275.8 180 3.07 3.730 17.60 2 2 3 3
#> Merc 450SLC 15.2 8 275.8 180 3.07 3.780 18.00 2 2 3 3
#> Cadillac Fleetwood 10.4 8 472.0 205 2.93 5.250 17.98 2 2 3 NA
#> Lincoln Continental 10.4 8 460.0 215 3.00 5.424 17.82 2 2 3 NA
#> Chrysler Imperial 14.7 8 440.0 230 3.23 5.345 17.42 2 2 3 NA
#> Fiat 128 32.4 NA 78.7 66 4.08 2.200 19.47 NaN NaN NA NaN
#> Honda Civic 30.4 NA 75.7 52 4.93 1.615 18.52 NaN NaN NA 2
#> Toyota Corolla 33.9 NA 71.1 65 4.22 1.835 19.90 NaN NaN NA NaN
#> Toyota Corona 21.5 NA 120.1 97 3.70 2.465 20.01 NaN 2 3 NaN
#> Dodge Challenger 15.5 8 318.0 150 2.76 3.520 16.87 2 2 3 2
#> AMC Javelin 15.2 8 304.0 150 3.15 3.435 17.30 2 2 3 2
#> Camaro Z28 13.3 8 350.0 245 3.73 3.840 15.41 2 2 3 NA
#> Pontiac Firebird 19.2 8 400.0 175 3.08 3.845 17.05 2 2 3 2
#> Fiat X1-9 27.3 NA 79.0 66 4.08 1.935 18.90 NaN NaN NA NaN
#> Porsche 914-2 26.0 NA 120.3 91 4.43 2.140 16.70 2 NaN 5 2
#> Lotus Europa 30.4 NA 95.1 113 3.77 1.513 16.90 NaN NaN 5 2
#> Ford Pantera L 15.8 8 351.0 264 4.22 3.170 14.50 2 NaN 5 NA
#> Ferrari Dino 19.7 6 145.0 175 3.62 2.770 15.50 2 NaN 5 6
#> Maserati Bora 15.0 8 301.0 335 3.54 3.570 14.60 2 NaN 5 8
#> Volvo 142E 21.4 NA 121.0 109 4.11 2.780 18.60 NaN NaN NA 2
replace_inf(mtcr, replace.nan = TRUE)
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 2 NA NA NA
#> Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 2 NA NA NA
#> Datsun 710 22.8 NA 108.0 93 3.85 2.320 18.61 NA NA NA NA
#> Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 NA 2 3 NA
#> Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 2 2 3 2
#> Valiant 18.1 6 225.0 105 2.76 3.460 20.22 NA 2 3 NA
#> Duster 360 14.3 8 360.0 245 3.21 3.570 15.84 2 2 3 NA
#> Merc 240D 24.4 NA 146.7 62 3.69 3.190 20.00 NA 2 NA 2
#> Merc 230 22.8 NA 140.8 95 3.92 3.150 22.90 NA 2 NA 2
#> Merc 280 19.2 6 167.6 123 3.92 3.440 18.30 NA 2 NA NA
#> Merc 280C 17.8 6 167.6 123 3.92 3.440 18.90 NA 2 NA NA
#> Merc 450SE 16.4 8 275.8 180 3.07 4.070 17.40 2 2 3 3
#> Merc 450SL 17.3 8 275.8 180 3.07 3.730 17.60 2 2 3 3
#> Merc 450SLC 15.2 8 275.8 180 3.07 3.780 18.00 2 2 3 3
#> Cadillac Fleetwood 10.4 8 472.0 205 2.93 5.250 17.98 2 2 3 NA
#> Lincoln Continental 10.4 8 460.0 215 3.00 5.424 17.82 2 2 3 NA
#> Chrysler Imperial 14.7 8 440.0 230 3.23 5.345 17.42 2 2 3 NA
#> Fiat 128 32.4 NA 78.7 66 4.08 2.200 19.47 NA NA NA NA
#> Honda Civic 30.4 NA 75.7 52 4.93 1.615 18.52 NA NA NA 2
#> Toyota Corolla 33.9 NA 71.1 65 4.22 1.835 19.90 NA NA NA NA
#> Toyota Corona 21.5 NA 120.1 97 3.70 2.465 20.01 NA 2 3 NA
#> Dodge Challenger 15.5 8 318.0 150 2.76 3.520 16.87 2 2 3 2
#> AMC Javelin 15.2 8 304.0 150 3.15 3.435 17.30 2 2 3 2
#> Camaro Z28 13.3 8 350.0 245 3.73 3.840 15.41 2 2 3 NA
#> Pontiac Firebird 19.2 8 400.0 175 3.08 3.845 17.05 2 2 3 2
#> Fiat X1-9 27.3 NA 79.0 66 4.08 1.935 18.90 NA NA NA NA
#> Porsche 914-2 26.0 NA 120.3 91 4.43 2.140 16.70 2 NA 5 2
#> Lotus Europa 30.4 NA 95.1 113 3.77 1.513 16.90 NA NA 5 2
#> Ford Pantera L 15.8 8 351.0 264 4.22 3.170 14.50 2 NA 5 NA
#> Ferrari Dino 19.7 6 145.0 175 3.62 2.770 15.50 2 NA 5 6
#> Maserati Bora 15.0 8 301.0 335 3.54 3.570 14.60 2 NA 5 8
#> Volvo 142E 21.4 NA 121.0 109 4.11 2.780 18.60 NA NA NA 2
replace_outliers(mtcars, c(2, 100)) # Replace all values below 2 and above 100 w. NA
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Mazda RX4 21.0 6 NA NA 3.90 2.620 16.46 NA NA 4 4
#> Mazda RX4 Wag 21.0 6 NA NA 3.90 2.875 17.02 NA NA 4 4
#> Datsun 710 22.8 4 NA 93 3.85 2.320 18.61 NA NA 4 NA
#> Hornet 4 Drive 21.4 6 NA NA 3.08 3.215 19.44 NA NA 3 NA
#> Hornet Sportabout 18.7 8 NA NA 3.15 3.440 17.02 NA NA 3 2
#> Valiant 18.1 6 NA NA 2.76 3.460 20.22 NA NA 3 NA
#> Duster 360 14.3 8 NA NA 3.21 3.570 15.84 NA NA 3 4
#> Merc 240D 24.4 4 NA 62 3.69 3.190 20.00 NA NA 4 2
#> Merc 230 22.8 4 NA 95 3.92 3.150 22.90 NA NA 4 2
#> Merc 280 19.2 6 NA NA 3.92 3.440 18.30 NA NA 4 4
#> Merc 280C 17.8 6 NA NA 3.92 3.440 18.90 NA NA 4 4
#> Merc 450SE 16.4 8 NA NA 3.07 4.070 17.40 NA NA 3 3
#> Merc 450SL 17.3 8 NA NA 3.07 3.730 17.60 NA NA 3 3
#> Merc 450SLC 15.2 8 NA NA 3.07 3.780 18.00 NA NA 3 3
#> Cadillac Fleetwood 10.4 8 NA NA 2.93 5.250 17.98 NA NA 3 4
#> Lincoln Continental 10.4 8 NA NA 3.00 5.424 17.82 NA NA 3 4
#> Chrysler Imperial 14.7 8 NA NA 3.23 5.345 17.42 NA NA 3 4
#> Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 NA NA 4 NA
#> Honda Civic 30.4 4 75.7 52 4.93 NA 18.52 NA NA 4 2
#> Toyota Corolla 33.9 4 71.1 65 4.22 NA 19.90 NA NA 4 NA
#> Toyota Corona 21.5 4 NA 97 3.70 2.465 20.01 NA NA 3 NA
#> Dodge Challenger 15.5 8 NA NA 2.76 3.520 16.87 NA NA 3 2
#> AMC Javelin 15.2 8 NA NA 3.15 3.435 17.30 NA NA 3 2
#> Camaro Z28 13.3 8 NA NA 3.73 3.840 15.41 NA NA 3 4
#> Pontiac Firebird 19.2 8 NA NA 3.08 3.845 17.05 NA NA 3 2
#> Fiat X1-9 27.3 4 79.0 66 4.08 NA 18.90 NA NA 4 NA
#> Porsche 914-2 26.0 4 NA 91 4.43 2.140 16.70 NA NA 5 2
#> Lotus Europa 30.4 4 95.1 NA 3.77 NA 16.90 NA NA 5 2
#> Ford Pantera L 15.8 8 NA NA 4.22 3.170 14.50 NA NA 5 4
#> Ferrari Dino 19.7 6 NA NA 3.62 2.770 15.50 NA NA 5 6
#> Maserati Bora 15.0 8 NA NA 3.54 3.570 14.60 NA NA 5 8
#> Volvo 142E 21.4 4 NA NA 4.11 2.780 18.60 NA NA 4 2
replace_outliers(mtcars, c(2, 100), value = "clip") # Clipping outliers to the thresholds
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Mazda RX4 21.0 6 100.0 100 3.90 2.620 16.46 2 2 4 4
#> Mazda RX4 Wag 21.0 6 100.0 100 3.90 2.875 17.02 2 2 4 4
#> Datsun 710 22.8 4 100.0 93 3.85 2.320 18.61 2 2 4 2
#> Hornet 4 Drive 21.4 6 100.0 100 3.08 3.215 19.44 2 2 3 2
#> Hornet Sportabout 18.7 8 100.0 100 3.15 3.440 17.02 2 2 3 2
#> Valiant 18.1 6 100.0 100 2.76 3.460 20.22 2 2 3 2
#> Duster 360 14.3 8 100.0 100 3.21 3.570 15.84 2 2 3 4
#> Merc 240D 24.4 4 100.0 62 3.69 3.190 20.00 2 2 4 2
#> Merc 230 22.8 4 100.0 95 3.92 3.150 22.90 2 2 4 2
#> Merc 280 19.2 6 100.0 100 3.92 3.440 18.30 2 2 4 4
#> Merc 280C 17.8 6 100.0 100 3.92 3.440 18.90 2 2 4 4
#> Merc 450SE 16.4 8 100.0 100 3.07 4.070 17.40 2 2 3 3
#> Merc 450SL 17.3 8 100.0 100 3.07 3.730 17.60 2 2 3 3
#> Merc 450SLC 15.2 8 100.0 100 3.07 3.780 18.00 2 2 3 3
#> Cadillac Fleetwood 10.4 8 100.0 100 2.93 5.250 17.98 2 2 3 4
#> Lincoln Continental 10.4 8 100.0 100 3.00 5.424 17.82 2 2 3 4
#> Chrysler Imperial 14.7 8 100.0 100 3.23 5.345 17.42 2 2 3 4
#> Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 2 2 4 2
#> Honda Civic 30.4 4 75.7 52 4.93 2.000 18.52 2 2 4 2
#> Toyota Corolla 33.9 4 71.1 65 4.22 2.000 19.90 2 2 4 2
#> Toyota Corona 21.5 4 100.0 97 3.70 2.465 20.01 2 2 3 2
#> Dodge Challenger 15.5 8 100.0 100 2.76 3.520 16.87 2 2 3 2
#> AMC Javelin 15.2 8 100.0 100 3.15 3.435 17.30 2 2 3 2
#> Camaro Z28 13.3 8 100.0 100 3.73 3.840 15.41 2 2 3 4
#> Pontiac Firebird 19.2 8 100.0 100 3.08 3.845 17.05 2 2 3 2
#> Fiat X1-9 27.3 4 79.0 66 4.08 2.000 18.90 2 2 4 2
#> Porsche 914-2 26.0 4 100.0 91 4.43 2.140 16.70 2 2 5 2
#> Lotus Europa 30.4 4 95.1 100 3.77 2.000 16.90 2 2 5 2
#> Ford Pantera L 15.8 8 100.0 100 4.22 3.170 14.50 2 2 5 4
#> Ferrari Dino 19.7 6 100.0 100 3.62 2.770 15.50 2 2 5 6
#> Maserati Bora 15.0 8 100.0 100 3.54 3.570 14.60 2 2 5 8
#> Volvo 142E 21.4 4 100.0 100 4.11 2.780 18.60 2 2 4 2
replace_outliers(mtcars, 2, single.limit = "min") # Replace all values smaller than 2 with NA
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 NA NA 4 4
#> Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 NA NA 4 4
#> Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 NA NA 4 NA
#> Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 NA NA 3 NA
#> Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 NA NA 3 2
#> Valiant 18.1 6 225.0 105 2.76 3.460 20.22 NA NA 3 NA
#> Duster 360 14.3 8 360.0 245 3.21 3.570 15.84 NA NA 3 4
#> Merc 240D 24.4 4 146.7 62 3.69 3.190 20.00 NA NA 4 2
#> Merc 230 22.8 4 140.8 95 3.92 3.150 22.90 NA NA 4 2
#> Merc 280 19.2 6 167.6 123 3.92 3.440 18.30 NA NA 4 4
#> Merc 280C 17.8 6 167.6 123 3.92 3.440 18.90 NA NA 4 4
#> Merc 450SE 16.4 8 275.8 180 3.07 4.070 17.40 NA NA 3 3
#> Merc 450SL 17.3 8 275.8 180 3.07 3.730 17.60 NA NA 3 3
#> Merc 450SLC 15.2 8 275.8 180 3.07 3.780 18.00 NA NA 3 3
#> Cadillac Fleetwood 10.4 8 472.0 205 2.93 5.250 17.98 NA NA 3 4
#> Lincoln Continental 10.4 8 460.0 215 3.00 5.424 17.82 NA NA 3 4
#> Chrysler Imperial 14.7 8 440.0 230 3.23 5.345 17.42 NA NA 3 4
#> Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 NA NA 4 NA
#> Honda Civic 30.4 4 75.7 52 4.93 NA 18.52 NA NA 4 2
#> Toyota Corolla 33.9 4 71.1 65 4.22 NA 19.90 NA NA 4 NA
#> Toyota Corona 21.5 4 120.1 97 3.70 2.465 20.01 NA NA 3 NA
#> Dodge Challenger 15.5 8 318.0 150 2.76 3.520 16.87 NA NA 3 2
#> AMC Javelin 15.2 8 304.0 150 3.15 3.435 17.30 NA NA 3 2
#> Camaro Z28 13.3 8 350.0 245 3.73 3.840 15.41 NA NA 3 4
#> Pontiac Firebird 19.2 8 400.0 175 3.08 3.845 17.05 NA NA 3 2
#> Fiat X1-9 27.3 4 79.0 66 4.08 NA 18.90 NA NA 4 NA
#> Porsche 914-2 26.0 4 120.3 91 4.43 2.140 16.70 NA NA 5 2
#> Lotus Europa 30.4 4 95.1 113 3.77 NA 16.90 NA NA 5 2
#> Ford Pantera L 15.8 8 351.0 264 4.22 3.170 14.50 NA NA 5 4
#> Ferrari Dino 19.7 6 145.0 175 3.62 2.770 15.50 NA NA 5 6
#> Maserati Bora 15.0 8 301.0 335 3.54 3.570 14.60 NA NA 5 8
#> Volvo 142E 21.4 4 121.0 109 4.11 2.780 18.60 NA NA 4 2
replace_outliers(mtcars, 100, single.limit = "max") # Replace all values larger than 100 with NA
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Mazda RX4 21.0 6 NA NA 3.90 2.620 16.46 0 1 4 4
#> Mazda RX4 Wag 21.0 6 NA NA 3.90 2.875 17.02 0 1 4 4
#> Datsun 710 22.8 4 NA 93 3.85 2.320 18.61 1 1 4 1
#> Hornet 4 Drive 21.4 6 NA NA 3.08 3.215 19.44 1 0 3 1
#> Hornet Sportabout 18.7 8 NA NA 3.15 3.440 17.02 0 0 3 2
#> Valiant 18.1 6 NA NA 2.76 3.460 20.22 1 0 3 1
#> Duster 360 14.3 8 NA NA 3.21 3.570 15.84 0 0 3 4
#> Merc 240D 24.4 4 NA 62 3.69 3.190 20.00 1 0 4 2
#> Merc 230 22.8 4 NA 95 3.92 3.150 22.90 1 0 4 2
#> Merc 280 19.2 6 NA NA 3.92 3.440 18.30 1 0 4 4
#> Merc 280C 17.8 6 NA NA 3.92 3.440 18.90 1 0 4 4
#> Merc 450SE 16.4 8 NA NA 3.07 4.070 17.40 0 0 3 3
#> Merc 450SL 17.3 8 NA NA 3.07 3.730 17.60 0 0 3 3
#> Merc 450SLC 15.2 8 NA NA 3.07 3.780 18.00 0 0 3 3
#> Cadillac Fleetwood 10.4 8 NA NA 2.93 5.250 17.98 0 0 3 4
#> Lincoln Continental 10.4 8 NA NA 3.00 5.424 17.82 0 0 3 4
#> Chrysler Imperial 14.7 8 NA NA 3.23 5.345 17.42 0 0 3 4
#> Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
#> Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
#> Toyota Corolla 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
#> Toyota Corona 21.5 4 NA 97 3.70 2.465 20.01 1 0 3 1
#> Dodge Challenger 15.5 8 NA NA 2.76 3.520 16.87 0 0 3 2
#> AMC Javelin 15.2 8 NA NA 3.15 3.435 17.30 0 0 3 2
#> Camaro Z28 13.3 8 NA NA 3.73 3.840 15.41 0 0 3 4
#> Pontiac Firebird 19.2 8 NA NA 3.08 3.845 17.05 0 0 3 2
#> Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
#> Porsche 914-2 26.0 4 NA 91 4.43 2.140 16.70 0 1 5 2
#> Lotus Europa 30.4 4 95.1 NA 3.77 1.513 16.90 1 1 5 2
#> Ford Pantera L 15.8 8 NA NA 4.22 3.170 14.50 0 1 5 4
#> Ferrari Dino 19.7 6 NA NA 3.62 2.770 15.50 0 1 5 6
#> Maserati Bora 15.0 8 NA NA 3.54 3.570 14.60 0 1 5 8
#> Volvo 142E 21.4 4 NA NA 4.11 2.780 18.60 1 1 4 2
replace_outliers(mtcars, 2) # Replace all values above or below 2 column-standard-deviations from the column-mean w. NA
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
#> Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
#> Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
#> Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
#> Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
#> Valiant 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1
#> Duster 360 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4
#> Merc 240D 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
#> Merc 230 22.8 4 140.8 95 3.92 3.150 NA 1 0 4 2
#> Merc 280 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4
#> Merc 280C 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4
#> Merc 450SE 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3
#> Merc 450SL 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3
#> Merc 450SLC 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3
#> Cadillac Fleetwood 10.4 8 472.0 205 2.93 NA 17.98 0 0 3 4
#> Lincoln Continental 10.4 8 460.0 215 3.00 NA 17.82 0 0 3 4
#> Chrysler Imperial 14.7 8 440.0 230 3.23 NA 17.42 0 0 3 4
#> Fiat 128 NA 4 78.7 66 4.08 2.200 19.47 1 1 4 1
#> Honda Civic 30.4 4 75.7 52 NA 1.615 18.52 1 1 4 2
#> Toyota Corolla NA 4 71.1 65 4.22 1.835 19.90 1 1 4 1
#> Toyota Corona 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1
#> Dodge Challenger 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2
#> AMC Javelin 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2
#> Camaro Z28 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4
#> Pontiac Firebird 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2
#> Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
#> Porsche 914-2 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2
#> Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
#> Ford Pantera L 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4
#> Ferrari Dino 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6
#> Maserati Bora 15.0 8 301.0 NA 3.54 3.570 14.60 0 1 5 NA
#> Volvo 142E 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2
replace_outliers(fgroup_by(iris, Species), 2) # Passing a grouped_df, pseries or pdata.frame
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1 5.1 3.5 1.4 0.2 setosa
#> 2 4.9 3.0 1.4 0.2 setosa
#> 3 4.7 3.2 1.3 0.2 setosa
#> 4 4.6 3.1 1.5 0.2 setosa
#> 5 5.0 3.6 1.4 0.2 setosa
#> 6 5.4 3.9 1.7 0.4 setosa
#> 7 4.6 3.4 1.4 0.3 setosa
#> 8 5.0 3.4 1.5 0.2 setosa
#> 9 4.4 2.9 1.4 0.2 setosa
#> 10 4.9 3.1 1.5 0.1 setosa
#> 11 5.4 3.7 1.5 0.2 setosa
#> 12 4.8 3.4 1.6 0.2 setosa
#> 13 4.8 3.0 1.4 0.1 setosa
#> 14 NA 3.0 NA 0.1 setosa
#> 15 NA 4.0 1.2 0.2 setosa
#> 16 5.7 NA 1.5 0.4 setosa
#> 17 5.4 3.9 1.3 0.4 setosa
#> 18 5.1 3.5 1.4 0.3 setosa
#> 19 5.7 3.8 1.7 0.3 setosa
#> 20 5.1 3.8 1.5 0.3 setosa
#> 21 5.4 3.4 1.7 0.2 setosa
#> 22 5.1 3.7 1.5 0.4 setosa
#> 23 4.6 3.6 NA 0.2 setosa
#> 24 5.1 3.3 1.7 NA setosa
#> 25 4.8 3.4 NA 0.2 setosa
#> 26 5.0 3.0 1.6 0.2 setosa
#> 27 5.0 3.4 1.6 0.4 setosa
#> 28 5.2 3.5 1.5 0.2 setosa
#> 29 5.2 3.4 1.4 0.2 setosa
#> 30 4.7 3.2 1.6 0.2 setosa
#> 31 4.8 3.1 1.6 0.2 setosa
#> 32 5.4 3.4 1.5 0.4 setosa
#> 33 5.2 4.1 1.5 0.1 setosa
#> 34 5.5 NA 1.4 0.2 setosa
#> 35 4.9 3.1 1.5 0.2 setosa
#> 36 5.0 3.2 1.2 0.2 setosa
#> 37 5.5 3.5 1.3 0.2 setosa
#> 38 4.9 3.6 1.4 0.1 setosa
#> 39 4.4 3.0 1.3 0.2 setosa
#> 40 5.1 3.4 1.5 0.2 setosa
#> 41 5.0 3.5 1.3 0.3 setosa
#> 42 4.5 NA 1.3 0.3 setosa
#> 43 4.4 3.2 1.3 0.2 setosa
#> 44 5.0 3.5 1.6 NA setosa
#> 45 5.1 3.8 NA 0.4 setosa
#> 46 4.8 3.0 1.4 0.3 setosa
#> 47 5.1 3.8 1.6 0.2 setosa
#> 48 4.6 3.2 1.4 0.2 setosa
#> 49 5.3 3.7 1.5 0.2 setosa
#> 50 5.0 3.3 1.4 0.2 setosa
#> 51 NA 3.2 4.7 1.4 versicolor
#> 52 6.4 3.2 4.5 1.5 versicolor
#> 53 6.9 3.1 4.9 1.5 versicolor
#> 54 5.5 2.3 4.0 1.3 versicolor
#> 55 6.5 2.8 4.6 1.5 versicolor
#> 56 5.7 2.8 4.5 1.3 versicolor
#> 57 6.3 3.3 4.7 1.6 versicolor
#> 58 NA 2.4 NA 1.0 versicolor
#> 59 6.6 2.9 4.6 1.3 versicolor
#> 60 5.2 2.7 3.9 1.4 versicolor
#> 61 5.0 NA 3.5 1.0 versicolor
#> 62 5.9 3.0 4.2 1.5 versicolor
#> 63 6.0 2.2 4.0 1.0 versicolor
#> 64 6.1 2.9 4.7 1.4 versicolor
#> 65 5.6 2.9 3.6 1.3 versicolor
#> 66 6.7 3.1 4.4 1.4 versicolor
#> 67 5.6 3.0 4.5 1.5 versicolor
#> 68 5.8 2.7 4.1 1.0 versicolor
#> 69 6.2 2.2 4.5 1.5 versicolor
#> 70 5.6 2.5 3.9 1.1 versicolor
#> 71 5.9 3.2 4.8 NA versicolor
#> 72 6.1 2.8 4.0 1.3 versicolor
#> 73 6.3 2.5 4.9 1.5 versicolor
#> 74 6.1 2.8 4.7 1.2 versicolor
#> 75 6.4 2.9 4.3 1.3 versicolor
#> 76 6.6 3.0 4.4 1.4 versicolor
#> 77 6.8 2.8 4.8 1.4 versicolor
#> 78 6.7 3.0 5.0 1.7 versicolor
#> 79 6.0 2.9 4.5 1.5 versicolor
#> 80 5.7 2.6 3.5 1.0 versicolor
#> 81 5.5 2.4 3.8 1.1 versicolor
#> 82 5.5 2.4 3.7 1.0 versicolor
#> 83 5.8 2.7 3.9 1.2 versicolor
#> 84 6.0 2.7 5.1 1.6 versicolor
#> 85 5.4 3.0 4.5 1.5 versicolor
#> 86 6.0 NA 4.5 1.6 versicolor
#> 87 6.7 3.1 4.7 1.5 versicolor
#> 88 6.3 2.3 4.4 1.3 versicolor
#> 89 5.6 3.0 4.1 1.3 versicolor
#> 90 5.5 2.5 4.0 1.3 versicolor
#> 91 5.5 2.6 4.4 1.2 versicolor
#> 92 6.1 3.0 4.6 1.4 versicolor
#> 93 5.8 2.6 4.0 1.2 versicolor
#> 94 5.0 2.3 NA 1.0 versicolor
#> 95 5.6 2.7 4.2 1.3 versicolor
#> 96 5.7 3.0 4.2 1.2 versicolor
#> 97 5.7 2.9 4.2 1.3 versicolor
#> 98 6.2 2.9 4.3 1.3 versicolor
#> 99 5.1 2.5 NA 1.1 versicolor
#> 100 5.7 2.8 4.1 1.3 versicolor
#> 101 6.3 3.3 6.0 2.5 virginica
#> 102 5.8 2.7 5.1 1.9 virginica
#> 103 7.1 3.0 5.9 2.1 virginica
#> 104 6.3 2.9 5.6 1.8 virginica
#> 105 6.5 3.0 5.8 2.2 virginica
#> 106 7.6 3.0 6.6 2.1 virginica
#> 107 NA 2.5 4.5 1.7 virginica
#> 108 7.3 2.9 6.3 1.8 virginica
#> 109 6.7 2.5 5.8 1.8 virginica
#> 110 7.2 3.6 6.1 2.5 virginica
#> 111 6.5 3.2 5.1 2.0 virginica
#> 112 6.4 2.7 5.3 1.9 virginica
#> 113 6.8 3.0 5.5 2.1 virginica
#> 114 5.7 2.5 5.0 2.0 virginica
#> 115 5.8 2.8 5.1 2.4 virginica
#> 116 6.4 3.2 5.3 2.3 virginica
#> 117 6.5 3.0 5.5 1.8 virginica
#> 118 7.7 NA NA 2.2 virginica
#> 119 7.7 2.6 NA 2.3 virginica
#> 120 6.0 NA 5.0 1.5 virginica
#> 121 6.9 3.2 5.7 2.3 virginica
#> 122 5.6 2.8 4.9 2.0 virginica
#> 123 7.7 2.8 NA 2.0 virginica
#> 124 6.3 2.7 4.9 1.8 virginica
#> 125 6.7 3.3 5.7 2.1 virginica
#> 126 7.2 3.2 6.0 1.8 virginica
#> 127 6.2 2.8 4.8 1.8 virginica
#> 128 6.1 3.0 4.9 1.8 virginica
#> 129 6.4 2.8 5.6 2.1 virginica
#> 130 7.2 3.0 5.8 1.6 virginica
#> 131 7.4 2.8 6.1 1.9 virginica
#> 132 NA NA 6.4 2.0 virginica
#> 133 6.4 2.8 5.6 2.2 virginica
#> 134 6.3 2.8 5.1 1.5 virginica
#> 135 6.1 2.6 5.6 NA virginica
#> 136 7.7 3.0 6.1 2.3 virginica
#> 137 6.3 3.4 5.6 2.4 virginica
#> 138 6.4 3.1 5.5 1.8 virginica
#> 139 6.0 3.0 4.8 1.8 virginica
#> 140 6.9 3.1 5.4 2.1 virginica
#> 141 6.7 3.1 5.6 2.4 virginica
#> 142 6.9 3.1 5.1 2.3 virginica
#> 143 5.8 2.7 5.1 1.9 virginica
#> 144 6.8 3.2 5.9 2.3 virginica
#> 145 6.7 3.3 5.7 2.5 virginica
#> 146 6.7 3.0 5.2 2.3 virginica
#> 147 6.3 2.5 5.0 1.9 virginica
#> 148 6.5 3.0 5.2 2.0 virginica
#> 149 6.2 3.4 5.4 2.3 virginica
#> 150 5.9 3.0 5.1 1.8 virginica
#>
#> Grouped by: Species [3 | 50 (0)]
# Grouped or indexed input allows removing outliers according to
# within-group standard deviations. See ?fscale
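The Examples do not exercise the type argument of replace_na, so here is a base R sketch of what the three documented options mean: constant fill, last observation carried forward and first observation carried back. The helpers locf_sketch and focb_sketch are hypothetical names written for this illustration and are not part of the package; in particular, how the package treats leading or trailing gaps is not stated above, and the sketch simply leaves them as NA.

```r
# Toy vector with leading, interior and trailing gaps (illustrative only).
x <- c(NA, 1, NA, NA, 4, NA)

# type = "const": fill every gap with one scalar value
const_fill <- x
const_fill[is.na(const_fill)] <- 0          # 0 1 0 0 4 0

# type = "locf": last non-missing observation carried forward
locf_sketch <- function(v) {
  idx <- cummax(seq_along(v) * !is.na(v))   # position of the latest non-NA so far
  idx[idx == 0] <- NA                       # nothing observed yet: stays NA
  v[idx]
}

# type = "focb": first non-missing observation carried back
focb_sketch <- function(v) rev(locf_sketch(rev(v)))

locf_sketch(x)   # NA 1 1 1 4 4
focb_sketch(x)   # 1 1 4 4 4 NA
```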