Directional Quantile Classification

This function is used to classify multivariate observations by means of directional quantiles.

dqc(formula, data, df.test, subset, weights, na.action, control = list(),
  fit = TRUE)
dqc.fit(x, z, y, control)

Arguments

formula: an object of class formula: a two-sided formula of the form y ~ x1 + ... + xn where y represents the groups (i.e., labels) for the observations and x1, ..., xn are the variables used for classification.
data: an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables for classification (training). If not found in data, the variables are taken from environment(formula), typically the environment from which dqc is called.
df.test: a required data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables for prediction.
subset: an optional vector specifying a subset of observations to be used in the fitting process.
weights: an optional vector of weights to be used in the fitting process.
na.action: a function which indicates what should happen when the data contain NAs.
control: list of control parameters of the fitting process. See dqcControl.
fit: logical flag. If FALSE the function returns a list of arguments for fitting.
x: design matrix of dimension \(nx \times p\) for training.
z: design matrix of dimension \(nz \times p\) for prediction.
y: vector of labels of length \(nx\).
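
As a rough illustration of the shapes described for x, z and y, the following sketch builds the three objects from the iris data using base R only. This page does not say whether dqc.fit expects an intercept column, so the sketch assumes the matrices hold just the classification variables. The index vectors and the choice of variables are illustrative.

```r
# Illustrative split of iris into training and prediction rows
train_idx <- c(1:40, 51:90, 101:140)
test_idx  <- c(41:50, 91:100, 141:150)
vars <- c("Sepal.Length", "Petal.Length")

# x: nx-by-p training matrix; z: nz-by-p prediction matrix
# (assumption: no intercept column is included)
x <- as.matrix(iris[train_idx, vars])
z <- as.matrix(iris[test_idx, vars])

# y: labels for the training rows, length nx
y <- iris$Species[train_idx]

dim(x)     # nx and p
dim(z)     # nz and p
length(y)  # nx

# Consistency checks implied by the argument descriptions
stopifnot(ncol(x) == ncol(z), nrow(x) == length(y))
```

With the package attached, these objects would presumably be passed on as dqc.fit(x, z, y, control = dqcControl(nt = 10, ndir = 5000, seed = 123)), mirroring the control settings used in the Examples.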

Details

Directional quantile classification is described in the article by Viroli et al. (2020).
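
This page does not spell out the algorithm, so the following is only a toy sketch of the general idea of classifying with quantiles of projected data. It is not the method implemented in dqc. It assumes a single quantile level, random unit directions, and the check (pinball) loss as the distance between a projected observation and a class quantile. The function name toy_dqc and its arguments tau, ndir and seed are hypothetical.

```r
# Toy sketch only: NOT the algorithm used by dqc().
# toy_dqc, tau, ndir and seed are hypothetical names for illustration.
toy_dqc <- function(x, y, z, tau = 0.5, ndir = 50, seed = 1) {
  set.seed(seed)  # note: this resets the global RNG state
  y <- factor(y)
  p <- ncol(x)

  # Random directions on the unit sphere (one per column of u)
  u <- matrix(rnorm(p * ndir), nrow = p)
  u <- sweep(u, 2, sqrt(colSums(u^2)), "/")

  # Project training and prediction data onto every direction
  px <- x %*% u  # nx-by-ndir
  pz <- z %*% u  # nz-by-ndir

  # Check (pinball) loss of a residual r at level tau
  check_loss <- function(r) r * (tau - (r < 0))

  # One column per class: loss of each prediction row relative to that
  # class's directional quantiles, summed over all directions
  score <- do.call(cbind, lapply(levels(y), function(g) {
    q <- apply(px[y == g, , drop = FALSE], 2, quantile, probs = tau)
    rowSums(check_loss(sweep(pz, 2, q, "-")))
  }))

  # Assign each row to the class with the smallest total loss
  best <- max.col(-score, ties.method = "first")
  factor(levels(y)[best], levels = levels(y))
}

# Usage on iris with an illustrative fixed split
train_idx <- c(1:40, 51:90, 101:140)
test_idx  <- c(41:50, 91:100, 141:150)
x <- as.matrix(iris[train_idx, 1:4])
z <- as.matrix(iris[test_idx, 1:4])
pred <- toy_dqc(x, iris$Species[train_idx], z)
table(predicted = pred, observed = iris$Species[test_idx])
```

The sketch fixes one quantile level and draws directions at random for simplicity. How dqc chooses quantile levels and directions is governed by dqcControl and the cited article, not by this example.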

Value

A list of class dqc containing the following components:

call: the matched call.
ans: a data frame with predictions.
nx: number of observations in the training dataset.
nz: number of observations in the prediction dataset.
p: number of variables.
control: control parameters used for fitting.

References

Viroli C, Farcomeni A, Geraci M (2020). Directional quantile-based classifiers (in preparation).

Author

Marco Geraci with contributions from Cinzia Viroli

Examples


if (FALSE) { # \dontrun{
# Iris data
data(iris)

# Create training and prediction datasets

n <- nrow(iris)
ng <- length(unique(iris$Species))
df1 <- iris[c(1:40, 51:90, 101:140),]
df2 <- iris[c(41:50, 91:100, 141:150),]

# Classify
ctrl <- dqcControl(nt = 10, ndir = 5000, seed = 123)
fit <- dqc(Species ~ Sepal.Length + Petal.Length,
  data = df1, df.test = df2, control = ctrl)

# Data frame with predictions
fit$ans

# Confusion matrix
print(cm <- xtabs( ~ fit$ans$groups + df2$Species))

# Misclassification rate
1 - sum(diag(cm)) / nrow(df2)
} # }
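
The example above splits the data with fixed row indices. As a possible alternative, the following base R sketch draws a stratified random split so that each species contributes the same share of rows to the training set. The 80% training fraction and the seed are arbitrary choices for illustration.

```r
set.seed(123)

# Sample 80% of the row indices within each species
rows_by_species <- split(seq_len(nrow(iris)), iris$Species)
train_idx <- unlist(lapply(rows_by_species, function(idx) {
  sample(idx, size = round(0.8 * length(idx)))
}))

# Training and prediction datasets
df1 <- iris[train_idx, ]
df2 <- iris[-train_idx, ]

# Class counts in each part
table(df1$Species)
table(df2$Species)
```

The resulting df1 and df2 can be used in place of the fixed-index data frames as the data and df.test arguments.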