Rank of a Matrix

Compute ‘the’ matrix rank, a well-defined functional in theory(*), somewhat ambiguous in practice. We provide several methods, the default corresponding to Matlab's definition.

(*) The rank of a $n \times m$ matrix $A$, $rk(A)$, is the maximal number of linearly independent columns (or rows); hence $rk(A) \le min(n,m)$.

rankMatrix(x, tol = NULL,
           method = c("tolNorm2", "qr.R", "qrLINPACK", "qr",
                      "useGrad", "maybeGrad"),
           sval = svd(x, 0, 0)$d, warn.t = TRUE, warn.qr = TRUE)

qr2rankMatrix(qr, tol = NULL, isBqr = is.qr(qr), do.warn = TRUE)

Arguments

x

numeric matrix, of dimension $n \times m$, say.

tol

nonnegative number specifying a (relative, “scalefree”) tolerance for testing of “practically zero” with specific meaning depending on method; by default, max(dim(x)) * .Machine$double.eps is according to Matlab's default (for its only method which is our method="tolNorm2").

method

a character string specifying the computational method for the rank, can be abbreviated:

"tolNorm2":

the number of singular values >= tol * max(sval);

"qrLINPACK":

for a dense matrix, this is the rank of qr(x, tol, LAPACK=FALSE) (which is qr(...)$rank);
This ("qr*", dense) version used to be the recommended way to compute a matrix rank for a while in the past.

For sparse x, this is equivalent to "qr.R".

"qr.R":

this is the rank of triangular matrix $R$, where qr() uses LAPACK or a "sparseQR" method (see qr-methods) to compute the decomposition $QR$. The rank of $R$ is then defined as the number of “non-zero” diagonal entries $d_i$ of $R$, and “non-zero”s fulfill $|d_i| \ge \mathrm{tol}\cdot\max(|d_i|)$.

"qr":

is for back compatibility; for dense x, it corresponds to "qrLINPACK", whereas for sparse x, it uses "qr.R".

For all the "qr*" methods, singular values sval are not used, which may be crucially important for a large sparse matrix x, as in that case, when sval is not specified, the default, computing svd() currently coerces x to a dense matrix.

"useGrad":

considering the “gradient” of the (decreasing) singular values, the index of the smallest gap.

"maybeGrad":

choosing method "useGrad" only when that seems reasonable; otherwise using "tolNorm2".

sval

numeric vector of non-increasing singular values of x; typically unspecified and computed from x when needed, i.e., unless method = "qr".

warn.t

logical indicating if rankMatrix() should warn when it needs t(x) instead of x. Currently, for method = "qr" only, gives a warning by default because the caller often could have passed t(x) directly, more efficiently.

warn.qr

in the $QR$ cases (i.e., if method starts with "qr"), rankMatrix() calls qr2rankMarix(.., do.warn = warn.qr), see below.

qr: an R object resulting from qr(x,..), i.e., typically inheriting from class "qr" or "sparseQR".
isBqr: logical indicating if qr is resulting from base qr(). (Otherwise, it is typically from Matrix package sparse qr.)
do.warn: logical; if true, warn about non-finite diagonal entries in the $R$ matrix of the $QR$ decomposition. Do not change lightly!

Details

qr2rankMatrix() is typically called from rankMatrix() for the "qr"* methods, but can be used directly - much more efficiently in case the qr-decomposition is available anyway.

Note

For large sparse matrices x, unless you can specify sval yourself, currently method = "qr" may be the only feasible one, as the others need sval and call svd() which currently coerces x to a denseMatrix which may be very slow or impossible, depending on the matrix dimensions.

Note that in the case of sparse x, method = "qr", all non-strictly zero diagonal entries $d_i$ where counted, up to including Matrix version 1.1-0, i.e., that method implicitly used tol = 0, see also the set.seed(42) example below.

Value

If x is a matrix of all 0 (or of zero dimension), the rank is zero; otherwise, typically a positive integer in 1:min(dim(x)) with attributes detailing the method used.

There are rare cases where the sparse $QR$ decomposition “fails” in so far as the diagonal entries of $R$, the $d_i$ (see above), end with non-finite, typically NaN entries. Then, a warning is signalled (unless warn.qr / do.warn is not true) and NA (specifically, NA_integer_) is returned.

Author

Martin Maechler; for the "*Grad" methods building on suggestions by Ravi Varadhan.

Examples