csi.Rd

The csi function in kernlab is an implementation of an incomplete
Cholesky decomposition algorithm that exploits side information
(e.g., classification labels, regression responses) to compute a
low-rank decomposition of a kernel matrix from the data.
# S4 method for class 'matrix'
csi(x, y, kernel = "rbfdot", kpar = list(sigma = 0.1), rank,
    centering = TRUE, kappa = 0.99, delta = 40, tol = 1e-5)

x: the data matrix indexed by row

y: the classification labels or regression responses. In classification, y is an \(m \times n\) matrix, where \(m\) is the number of data points and \(n\) the number of classes; entry \(y_{ij}\) is 1 if the \(i\)-th data point belongs to class \(j\).
kernel: the kernel function used in training and predicting.
This parameter can be set to any function, of class kernel,
which computes the inner product in feature space between two
vector arguments. kernlab provides the most popular kernel functions
which can be used by setting the kernel parameter to the following
strings:
rbfdot Radial Basis kernel function "Gaussian"
polydot Polynomial kernel function
vanilladot Linear kernel function
tanhdot Hyperbolic tangent kernel function
laplacedot Laplacian kernel function
besseldot Bessel kernel function
anovadot ANOVA RBF kernel function
splinedot Spline kernel
stringdot String kernel
The kernel parameter can also be set to a user-defined function of class kernel by passing the function name as an argument.
kpar: the list of hyper-parameters (kernel parameters). This is a list which contains the parameters to be used with the kernel function. Valid parameters for existing kernels are:
sigma inverse kernel width for the Radial Basis
kernel function "rbfdot" and the Laplacian kernel "laplacedot".
degree, scale, offset for the Polynomial kernel "polydot"
scale, offset for the Hyperbolic tangent kernel
function "tanhdot"
sigma, order, degree for the Bessel kernel "besseldot".
sigma, degree for the ANOVA kernel "anovadot".
Hyper-parameters for user defined kernels can be passed through the kpar parameter as well.
rank: maximal rank of the computed kernel matrix

centering: if TRUE centering is performed (default: TRUE)

kappa: trade-off between approximation of K and prediction of Y (default: 0.99)

delta: number of columns of Cholesky performed in advance (default: 40)

tol: minimum gain at each iteration (default: 1e-5)
An incomplete Cholesky decomposition calculates \(Z\) where \(K = ZZ'\), \(K\) being the kernel matrix. Since the rank of a kernel matrix is usually low, \(Z\) tends to
be much smaller than the complete kernel matrix. The decomposed matrix can be
used to create memory-efficient kernel-based algorithms without the
need to compute and store a complete kernel matrix in memory. csi uses the class labels or regression responses to compute a
more appropriate approximation for the problem at hand, taking into account the
additional information from the response variable.
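The core idea of a low-rank factorisation \(K \approx ZZ'\) can be sketched with base R alone, using a pivoted Cholesky decomposition truncated after r pivots (csi additionally steers the pivot choice using the responses y; all names below are illustrative, not kernlab's):

```r
# Build a small Gaussian kernel matrix by hand (base R only).
set.seed(1)
X <- matrix(rnorm(40), 20, 2)
K <- exp(-0.1 * as.matrix(dist(X))^2)   # 20 x 20 kernel matrix, rank decays fast

# Pivoted Cholesky: t(R) %*% R == K[piv, piv] up to floating point.
R <- chol(K, pivot = TRUE)
piv <- attr(R, "pivot")

# Keep only the first r pivots and undo the permutation to get an
# n x r factor Z with K approximately equal to Z %*% t(Z).
r <- 10
Z <- t(R[seq_len(r), order(piv), drop = FALSE])

max(abs(K - tcrossprod(Z)))   # approximation error; shrinks as r grows
```

Storing the 20 x 10 factor Z in place of the full 20 x 20 matrix is exactly the memory saving the details above describe, and kernel algorithms can work with Z directly.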
An S4 object of class "csi" which is an extension of the class "matrix". The object is the decomposed kernel matrix along with the slots:

pivots: indices on which pivots were done

diagresidues: residuals left on the diagonal

maxresiduals: residuals picked for pivoting

predgain: predicted gain before adding each column

truegain: actual gain after adding each column

Q: the Q matrix of the QR decomposition of the kernel matrix

R: the R matrix of the QR decomposition of the kernel matrix
Slots can be accessed either by object@slot or by accessor functions with the same name (e.g., pivots(object)).
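This slot/accessor duality is the standard S4 pattern; a minimal self-contained sketch of it, using a toy class and accessor that are purely illustrative (not kernlab's "csi" class):

```r
library(methods)

# Toy S4 class with a "pivots" slot plus an accessor generic of the
# same name, mirroring how "csi" objects expose their slots.
setClass("toy", slots = c(pivots = "numeric"))
setGeneric("pivots", function(object) standardGeneric("pivots"))
setMethod("pivots", "toy", function(object) object@pivots)

obj <- new("toy", pivots = c(3, 1, 2))
obj@pivots    # direct slot access
pivots(obj)   # accessor function, same result
```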
Francis R. Bach and Michael I. Jordan,
"Predictive low-rank decomposition for kernel methods",
Proceedings of the Twenty-second International Conference on Machine Learning (ICML), 2005.
http://www.di.ens.fr/~fbach/bach_jordan_csi.pdf
data(iris)
## create multidimensional y matrix
yind <- t(matrix(1:3,3,150))
ymat <- matrix(0, 150, 3)
ymat[yind==as.integer(iris[,5])] <- 1
datamatrix <- as.matrix(iris[,-5])
# initialize kernel function
rbf <- rbfdot(sigma=0.1)
rbf
#> new("rbfkernel", .Data = function (x, y = NULL)
#> {
#> if (!is(x, "vector"))
#> stop("x must be a vector")
#> if (!is(y, "vector") && !is.null(y))
#> stop("y must a vector")
#> if (is(x, "vector") && is.null(y)) {
#> return(1)
#> }
#> if (is(x, "vector") && is(y, "vector")) {
#> if (!length(x) == length(y))
#> stop("number of dimension must be the same on both data points")
#> return(exp(sigma * (2 * crossprod(x, y) - crossprod(x) -
#> crossprod(y))))
#> }
#> }, kpar = list(sigma = 0.1))
#> <bytecode: 0x564204d155f8>
#> <environment: 0x564205099d70>
#> attr(,"kpar")
#> attr(,"kpar")$sigma
#> [1] 0.1
#>
#> attr(,"class")
#> [1] "rbfkernel"
#> attr(,"class")attr(,"package")
#> [1] "kernlab"
Z <- csi(datamatrix, ymat, kernel = rbf, rank = 30)
dim(Z)
#> [1] 150 30
pivots(Z)
#> [1] 1 101 132 119 106 80 66 135 57 74 122 111 58 11 39 33 54 86
#> [19] 44 136 62 85 35 142 27 91 115 109 77 138 99 53 46 67 123 146
#> [37] 61 149 120 108 10 17 130 9 52 70 107 118 114 141 19 127 104 82
#> [55] 100 110 131 51 140 5 21 126 40 105 12 64 22 28 34 6 71 72
#> [73] 73 56 75 76 48 78 79 2 81 14 83 84 49 8 87 88 89 90
#> [91] 43 92 93 94 95 96 97 98 23 15 18 102 103 20 24 13 3 37
#> [109] 45 42 25 112 113 59 38 116 117 41 4 30 121 69 50 124 125 68
#> [127] 60 128 129 55 63 31 133 134 32 7 137 16 139 65 29 36 143 144
#> [145] 145 26 147 148 47 150
# calculate kernel matrix
K <- crossprod(t(Z))
# difference between approximated and real kernel matrix
(K - kernelMatrix(kernel=rbf, datamatrix))[6,]
#> [1] 0.000000e+00 3.294318e-03 1.644680e-03 1.608380e-03 -1.220268e-03
#> [6] -6.078501e-03 9.589439e-05 8.384188e-04 9.656466e-04 2.836263e-03
#> [11] -2.930943e-03 4.098783e-04 2.980274e-03 -2.220446e-16 -8.965374e-03
#> [16] -1.653778e-02 -5.867396e-03 1.008349e-05 -5.410934e-03 -3.900733e-03
#> [21] 6.643020e-04 -2.409574e-03 -1.998123e-03 1.475035e-03 0.000000e+00
#> [26] 3.350540e-03 7.407855e-04 -8.013015e-05 1.165805e-03 1.477897e-03
#> [31] 2.283301e-03 9.742738e-04 -9.253187e-03 -1.203781e-02 2.748481e-03
#> [36] 2.719135e-03 8.271673e-05 -1.322824e-03 8.292507e-04 9.523782e-04
#> [41] 1.221245e-15 0.000000e+00 3.044012e-04 1.443290e-15 -4.048128e-03
#> [46] 2.787433e-03 -4.075572e-03 1.232110e-03 -2.788301e-03 1.824274e-03
#> [51] 0.000000e+00 5.920832e-05 3.994131e-05 -7.013161e-05 5.777698e-06
#> [56] 8.783444e-05 6.449669e-05 -3.406535e-05 2.778523e-06 7.216450e-16
#> [61] 5.551115e-16 1.994334e-05 -2.210526e-04 4.522136e-05 0.000000e+00
#> [66] 4.046290e-05 -1.665335e-16 -5.551115e-17 -3.510309e-04 6.106227e-16
#> [71] -7.302072e-05 -4.572073e-05 -1.375647e-04 6.587981e-05 -6.804974e-06
#> [76] 5.241305e-05 -3.462390e-05 3.640250e-05 1.404683e-05 1.110223e-16
#> [81] -2.034306e-05 -1.110223e-16 -5.068559e-05 -4.275164e-05 -1.662684e-05
#> [86] -5.551115e-17 7.944239e-05 -2.804399e-04 -9.909627e-06 -2.937080e-05
#> [91] 2.234118e-04 4.748586e-05 -5.556823e-05 -1.193880e-04 3.728456e-05
#> [96] -2.273134e-05 1.539013e-05 -5.502633e-06 2.220446e-16 1.110223e-16
#> [101] -4.163336e-17 -3.465849e-04 3.059177e-05 -2.498002e-16 5.551115e-17
#> [106] -6.938894e-18 -5.551115e-17 -3.659936e-05 -8.564319e-05 9.177555e-05
#> [111] 2.220446e-16 -1.817270e-04 -2.694895e-05 -5.026887e-04 -8.584143e-04
#> [116] -1.605942e-04 -1.421009e-05 -2.210191e-04 1.040834e-17 -5.551115e-17
#> [121] 5.023864e-05 -5.314147e-04 -2.145284e-04 -1.636829e-04 -8.908144e-06
#> [126] -7.560976e-05 -1.240224e-04 -8.208677e-05 -1.398680e-04 -1.672290e-04
#> [131] -1.861385e-04 -6.403198e-04 -1.693464e-04 3.200999e-06 2.846070e-04
#> [136] -1.526557e-16 -2.339729e-04 2.775558e-17 -9.652743e-05 1.424119e-05
#> [141] -1.665335e-16 1.770770e-04 -3.465849e-04 1.062464e-04 2.549706e-05
#> [146] -2.220446e-16 -3.128903e-04 -7.610019e-05 -2.763565e-04 -1.687860e-04