dqc.Rd
This function classifies multivariate observations by means of directional quantiles.
dqc(formula, data, df.test, subset, weights, na.action, control = list(),
    fit = TRUE)
dqc.fit(x, z, y, control)
an object of class formula: a two-sided formula of the form y ~ x1 + ... + xn, where y represents the groups (i.e., labels) for the observations and x1, ..., xn are the variables used for classification.
an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables for classification (training). If not found in data, the variables are taken from environment(formula), typically the environment from which dqc is called.
a required data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables for prediction.
an optional vector specifying a subset of observations to be used in the fitting process.
an optional vector of weights to be used in the fitting process.
a function which indicates what should happen when the data contain NAs.
list of control parameters of the fitting process. See dqcControl.
logical flag. If FALSE, the function returns a list of arguments for fitting.
design matrix of dimension \(nx \times p\) for training.
design matrix of dimension \(nz \times p\) for prediction.
vector of labels of length \(nx\).
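To illustrate the shapes of x, z and y described above, the following base R sketch builds a training matrix, a prediction matrix and a label vector from the iris data. Dropping the intercept column is an assumption made for illustration only; this page does not document how dqc constructs these objects internally.

```r
# Split iris into a training part and a prediction part
train <- iris[c(1:40, 51:90, 101:140), ]
test  <- iris[c(41:50, 91:100, 141:150), ]

# One-sided formula for the classification variables; "- 1" drops the
# intercept (assumption: only the raw variables are wanted as columns)
rhs <- ~ Sepal.Length + Petal.Length - 1

x <- model.matrix(rhs, data = train)  # nx x p training matrix
z <- model.matrix(rhs, data = test)   # nz x p prediction matrix
y <- train$Species                    # labels, one per training row

dim(x)
dim(z)
length(y)
```

The same one-sided formula is applied to both data frames so that x and z have identical columns in the same order.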
Directional quantile classification is described in the article by Viroli et al. (2020).
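For orientation only: a directional quantile is commonly defined as the ordinary quantile of the data after projecting them onto a unit-length direction. The base R sketch below computes such quantiles along a few random directions. It is not the classifier of Viroli et al. (2020); the helper name directional_quantile and the choice of five directions are hypothetical.

```r
set.seed(1)

# Hypothetical helper: tau-th quantile of the data projected onto direction u
directional_quantile <- function(X, u, tau) {
  u <- u / sqrt(sum(u^2))        # normalise the direction to unit length
  proj <- as.vector(X %*% u)     # project each observation onto u
  quantile(proj, probs = tau, names = FALSE)
}

X <- as.matrix(iris[, c("Sepal.Length", "Petal.Length")])

# Random directions: Gaussian draws, normalised inside the helper
ndir <- 5
U <- matrix(rnorm(ndir * ncol(X)), nrow = ndir)

# Median of the projections along each direction
dq <- apply(U, 1, function(u) directional_quantile(X, u, tau = 0.5))
dq
```

Along a coordinate axis, for example u = c(1, 0), the directional median reduces to the usual median of that variable.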
a list of class dqc containing the following components:
the matched call.
a data frame with predictions.
number of observations in the training dataset.
number of observations in the prediction dataset.
number of variables.
control parameters used for fitting.
Viroli C, Farcomeni A, Geraci M (2020). Directional quantile-based classifiers (in preparation).
## Not run:
# Iris data
data(iris)
# Create training and prediction datasets
n <- nrow(iris)
ng <- length(unique(iris$Species))
df1 <- iris[c(1:40, 51:90, 101:140), ]
df2 <- iris[c(41:50, 91:100, 141:150), ]
# Classify
ctrl <- dqcControl(nt = 10, ndir = 5000, seed = 123)
fit <- dqc(Species ~ Sepal.Length + Petal.Length,
           data = df1, df.test = df2, control = ctrl)
# Data frame with predictions
fit$ans
# Confusion matrix
print(cm <- xtabs(~ fit$ans$groups + df2$Species))
# Misclassification rate
1 - sum(diag(cm)) / nrow(df2)
## End(Not run)
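The confusion matrix and misclassification rate computed at the end of the example can be tried without fitting a model. The sketch below uses made-up labels (not output from dqc) and also adds a per-class error rate, which the example does not compute.

```r
# Toy true and predicted labels, made up for illustration
truth <- factor(c("a", "a", "b", "b", "c", "c"))
pred  <- factor(c("a", "b", "b", "b", "c", "a"), levels = levels(truth))

# Confusion matrix: rows are predictions, columns are true labels
cm <- table(predicted = pred, actual = truth)
cm

# Overall misclassification rate
err <- 1 - sum(diag(cm)) / sum(cm)

# Per-class error: share of each true class that was misclassified
class_err <- 1 - diag(cm) / colSums(cm)
```

Giving pred the same factor levels as truth keeps the table square, so diag(cm) lines up with the classes even if some class is never predicted.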