General Sparse Matrix Construction from Nonzero Entries

User-friendly construction of sparse matrices (inheriting from virtual class CsparseMatrix, RsparseMatrix, or TsparseMatrix) from the positions and values of their nonzero entries.

This interface is recommended over direct construction via calls such as new("..[CRT]Matrix", ...).

sparseMatrix(i, j, p, x, dims, dimnames,
             symmetric = FALSE, triangular = FALSE, index1 = TRUE,
             repr = c("C", "R", "T"), giveCsparse,
             check = TRUE, use.last.ij = FALSE)

Arguments

i,j: integer vectors of equal length specifying the positions (row and column indices) of the nonzero (or non-TRUE) entries of the matrix. Note that, when x is non-missing, the \(x_k\) corresponding to repeated pairs \((i_k,j_k)\) are added, for consistency with the definition of class TsparseMatrix, unless use.last.ij is TRUE, in which case only the last such \(x_k\) is used.
p: integer vector of pointers, one for each column (or row), to the initial (zero-based) index of elements in the column (or row). Exactly one of i, j, and p must be missing.
x: optional, typically nonzero values for the matrix entries. If specified, then the length must equal that of i (or j) or equal 1, in which case x is recycled as necessary. If missing, then the result is a nonzero pattern matrix, i.e., inheriting from class nsparseMatrix.
dims: optional length-2 integer vector of matrix dimensions. If missing, then !index1+c(max(i),max(j)) is used.
dimnames: optional list of dimnames; if missing, then NULL ones are used.
symmetric: logical indicating if the resulting matrix should be symmetric. In that case, \((i,j,p)\) should specify only one triangle (upper or lower).
triangular: logical indicating if the resulting matrix should be triangular. In that case, \((i,j,p)\) should specify only one triangle (upper or lower).
index1: logical. If TRUE (the default), then i and j are interpreted as 1-based indices, following the R convention. That is, counting of rows and columns starts at 1. If FALSE, then they are interpreted as 0-based indices.
repr: character string, one of "C", "R", and "T", specifying the representation of the sparse matrix result, i.e., specifying one of the virtual classes CsparseMatrix, RsparseMatrix, and TsparseMatrix.
giveCsparse: (deprecated, replaced by repr) logical indicating if the result should inherit from CsparseMatrix or TsparseMatrix. Note that operations involving CsparseMatrix are very often (but not always) more efficient.
check: logical indicating whether to check that the result is formally valid before returning. Do not set to FALSE unless you know what you are doing!
use.last.ij: logical indicating if, in the case of repeated (duplicated) pairs \((i_k,j_k)\), only the last pair should be used. FALSE (the default) is consistent with the definiton of class TsparseMatrix.

Value

A sparse matrix, by default in compressed sparse column format and (formally) without symmetric or triangular structure, i.e., by default inheriting from both CsparseMatrix and generalMatrix.

Details

Exactly one of the arguments i, j and p must be missing.

In typical usage, p is missing, i and j are vectors of positive integers and x is a numeric vector. These three vectors, which must have the same length, form the triplet representation of the sparse matrix.

If i or j is missing then p must be a non-decreasing integer vector whose first element is zero. It provides the compressed, or “pointer” representation of the row or column indices, whichever is missing. The expanded form of p, rep(seq_along(dp),dp) where dp <- diff(p), is used as the (1-based) row or column indices.

You cannot set both singular and triangular to true; rather use Diagonal() (or its alternatives, see there).

The values of i, j, p and index1 are used to create 1-based index vectors i and j from which a TsparseMatrix is constructed, with numerical values given by x, if non-missing. Note that in that case, when some pairs \((i_k,j_k)\) are repeated (aka “duplicated”), the corresponding \(x_k\) are added, in consistency with the definition of the TsparseMatrix class, unless use.last.ij is set to true.

By default, when repr = "C", the CsparseMatrix derived from this triplet form is returned, where repr = "R" now allows to directly get an RsparseMatrix and repr = "T" leaves the result as TsparseMatrix.

The reason for returning a CsparseMatrix object instead of the triplet format by default is that the compressed column form is easier to work with when performing matrix operations. In particular, if there are no zeros in x then a CsparseMatrix is a unique representation of the sparse matrix.

Note

You do need to use index1 = FALSE (or add + 1 to i and j) if you want use the 0-based i (and j) slots from existing sparse matrices.

Examples

## simple example
i <- c(1,3:8); j <- c(2,9,6:10); x <- 7 * (1:7)
(A <- sparseMatrix(i, j, x = x))                    ##  8 x 10 "dgCMatrix"
#> 8 x 10 sparse Matrix of class "dgCMatrix"
#>                              
#> [1,] . 7 . . .  .  .  .  .  .
#> [2,] . . . . .  .  .  .  .  .
#> [3,] . . . . .  .  .  . 14  .
#> [4,] . . . . . 21  .  .  .  .
#> [5,] . . . . .  . 28  .  .  .
#> [6,] . . . . .  .  . 35  .  .
#> [7,] . . . . .  .  .  . 42  .
#> [8,] . . . . .  .  .  .  . 49
summary(A)
#> 8 x 10 sparse Matrix of class "dgCMatrix", with 7 entries
#>   i  j  x
#> 1 1  2  7
#> 2 4  6 21
#> 3 5  7 28
#> 4 6  8 35
#> 5 3  9 14
#> 6 7  9 42
#> 7 8 10 49
str(A) # note that *internally* 0-based row indices are used
#> Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
#>   ..@ i       : int [1:7] 0 3 4 5 2 6 7
#>   ..@ p       : int [1:11] 0 0 1 1 1 1 2 3 4 6 ...
#>   ..@ Dim     : int [1:2] 8 10
#>   ..@ Dimnames:List of 2
#>   .. ..$ : NULL
#>   .. ..$ : NULL
#>   ..@ x       : num [1:7] 7 21 28 35 14 42 49
#>   ..@ factors : list()

(sA <- sparseMatrix(i, j, x = x, symmetric = TRUE)) ## 10 x 10 "dsCMatrix"
#> 10 x 10 sparse Matrix of class "dsCMatrix"
#>                                  
#>  [1,] . 7  .  .  .  .  .  .  .  .
#>  [2,] 7 .  .  .  .  .  .  .  .  .
#>  [3,] . .  .  .  .  .  .  . 14  .
#>  [4,] . .  .  .  . 21  .  .  .  .
#>  [5,] . .  .  .  .  . 28  .  .  .
#>  [6,] . .  . 21  .  .  . 35  .  .
#>  [7,] . .  .  . 28  .  .  . 42  .
#>  [8,] . .  .  .  . 35  .  .  . 49
#>  [9,] . . 14  .  .  . 42  .  .  .
#> [10,] . .  .  .  .  .  . 49  .  .
(tA <- sparseMatrix(i, j, x = x, triangular= TRUE)) ## 10 x 10 "dtCMatrix"
#> 10 x 10 sparse Matrix of class "dtCMatrix"
#>                               
#>  [1,] . 7 . . .  .  .  .  .  .
#>  [2,] . . . . .  .  .  .  .  .
#>  [3,] . . . . .  .  .  . 14  .
#>  [4,] . . . . . 21  .  .  .  .
#>  [5,] . . . . .  . 28  .  .  .
#>  [6,] . . . . .  .  . 35  .  .
#>  [7,] . . . . .  .  .  . 42  .
#>  [8,] . . . . .  .  .  .  . 49
#>  [9,] . . . . .  .  .  .  .  .
#> [10,] . . . . .  .  .  .  .  .
stopifnot( all(sA == tA + t(tA)) ,
           identical(sA, as(tA + t(tA), "symmetricMatrix")))

## dims can be larger than the maximum row or column indices
(AA <- sparseMatrix(c(1,3:8), c(2,9,6:10), x = 7 * (1:7), dims = c(10,20)))
#> 10 x 20 sparse Matrix of class "dgCMatrix"
#>                                                   
#>  [1,] . 7 . . .  .  .  .  .  . . . . . . . . . . .
#>  [2,] . . . . .  .  .  .  .  . . . . . . . . . . .
#>  [3,] . . . . .  .  .  . 14  . . . . . . . . . . .
#>  [4,] . . . . . 21  .  .  .  . . . . . . . . . . .
#>  [5,] . . . . .  . 28  .  .  . . . . . . . . . . .
#>  [6,] . . . . .  .  . 35  .  . . . . . . . . . . .
#>  [7,] . . . . .  .  .  . 42  . . . . . . . . . . .
#>  [8,] . . . . .  .  .  .  . 49 . . . . . . . . . .
#>  [9,] . . . . .  .  .  .  .  . . . . . . . . . . .
#> [10,] . . . . .  .  .  .  .  . . . . . . . . . . .
summary(AA)
#> 10 x 20 sparse Matrix of class "dgCMatrix", with 7 entries
#>   i  j  x
#> 1 1  2  7
#> 2 4  6 21
#> 3 5  7 28
#> 4 6  8 35
#> 5 3  9 14
#> 6 7  9 42
#> 7 8 10 49

## i, j and x can be in an arbitrary order, as long as they are consistent
set.seed(1); (perm <- sample(1:7))
#> [1] 1 4 7 2 5 3 6
(A1 <- sparseMatrix(i[perm], j[perm], x = x[perm]))
#> 8 x 10 sparse Matrix of class "dgCMatrix"
#>                              
#> [1,] . 7 . . .  .  .  .  .  .
#> [2,] . . . . .  .  .  .  .  .
#> [3,] . . . . .  .  .  . 14  .
#> [4,] . . . . . 21  .  .  .  .
#> [5,] . . . . .  . 28  .  .  .
#> [6,] . . . . .  .  . 35  .  .
#> [7,] . . . . .  .  .  . 42  .
#> [8,] . . . . .  .  .  .  . 49
stopifnot(identical(A, A1))

## The slots are 0-index based, so
try( sparseMatrix(i=A@i, p=A@p, x= seq_along(A@x)) )
#> Error in sparseMatrix(i = A@i, p = A@p, x = seq_along(A@x)) : 
#>   'i' and 'j' must be positive
## fails and you should say so: 1-indexing is FALSE:
     sparseMatrix(i=A@i, p=A@p, x= seq_along(A@x), index1 = FALSE)
#> 8 x 10 sparse Matrix of class "dgCMatrix"
#>                         
#> [1,] . 1 . . . . . . . .
#> [2,] . . . . . . . . . .
#> [3,] . . . . . . . . 5 .
#> [4,] . . . . . 2 . . . .
#> [5,] . . . . . . 3 . . .
#> [6,] . . . . . . . 4 . .
#> [7,] . . . . . . . . 6 .
#> [8,] . . . . . . . . . 7

## the (i,j) pairs can be repeated, in which case the x's are summed
(args <- data.frame(i = c(i, 1), j = c(j, 2), x = c(x, 2)))
#>   i  j  x
#> 1 1  2  7
#> 2 3  9 14
#> 3 4  6 21
#> 4 5  7 28
#> 5 6  8 35
#> 6 7  9 42
#> 7 8 10 49
#> 8 1  2  2
(Aa <- do.call(sparseMatrix, args))
#> 8 x 10 sparse Matrix of class "dgCMatrix"
#>                              
#> [1,] . 9 . . .  .  .  .  .  .
#> [2,] . . . . .  .  .  .  .  .
#> [3,] . . . . .  .  .  . 14  .
#> [4,] . . . . . 21  .  .  .  .
#> [5,] . . . . .  . 28  .  .  .
#> [6,] . . . . .  .  . 35  .  .
#> [7,] . . . . .  .  .  . 42  .
#> [8,] . . . . .  .  .  .  . 49
## explicitly ask for elimination of such duplicates, so
## that the last one is used:
(A. <- do.call(sparseMatrix, c(args, list(use.last.ij = TRUE))))
#> 8 x 10 sparse Matrix of class "dgCMatrix"
#>                              
#> [1,] . 2 . . .  .  .  .  .  .
#> [2,] . . . . .  .  .  .  .  .
#> [3,] . . . . .  .  .  . 14  .
#> [4,] . . . . . 21  .  .  .  .
#> [5,] . . . . .  . 28  .  .  .
#> [6,] . . . . .  .  . 35  .  .
#> [7,] . . . . .  .  .  . 42  .
#> [8,] . . . . .  .  .  .  . 49
stopifnot(Aa[1,2] == 9, # 2+7 == 9
          A.[1,2] == 2) # 2 was *after* 7

## for a pattern matrix, of course there is no "summing":
(nA <- do.call(sparseMatrix, args[c("i","j")]))
#> 8 x 10 sparse Matrix of class "ngCMatrix"
#>                         
#> [1,] . | . . . . . . . .
#> [2,] . . . . . . . . . .
#> [3,] . . . . . . . . | .
#> [4,] . . . . . | . . . .
#> [5,] . . . . . . | . . .
#> [6,] . . . . . . . | . .
#> [7,] . . . . . . . . | .
#> [8,] . . . . . . . . . |

dn <- list(LETTERS[1:3], letters[1:5])
## pointer vectors can be used, and the (i,x) slots are sorted if necessary:
m <- sparseMatrix(i = c(3,1, 3:2, 2:1), p= c(0:2, 4,4,6), x = 1:6, dimnames = dn)
m
#> 3 x 5 sparse Matrix of class "dgCMatrix"
#>   a b c d e
#> A . 2 . . 6
#> B . . 4 . 5
#> C 1 . 3 . .
str(m)
#> Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
#>   ..@ i       : int [1:6] 2 0 1 2 0 1
#>   ..@ p       : int [1:6] 0 1 2 4 4 6
#>   ..@ Dim     : int [1:2] 3 5
#>   ..@ Dimnames:List of 2
#>   .. ..$ : chr [1:3] "A" "B" "C"
#>   .. ..$ : chr [1:5] "a" "b" "c" "d" ...
#>   ..@ x       : num [1:6] 1 2 4 3 6 5
#>   ..@ factors : list()
stopifnot(identical(dimnames(m), dn))

sparseMatrix(x = 2.72, i=1:3, j=2:4) # recycling x
#> 3 x 4 sparse Matrix of class "dgCMatrix"
#>                      
#> [1,] . 2.72 .    .   
#> [2,] . .    2.72 .   
#> [3,] . .    .    2.72
sparseMatrix(x = TRUE, i=1:3, j=2:4) # recycling x, |--> "lgCMatrix"
#> 3 x 4 sparse Matrix of class "lgCMatrix"
#>             
#> [1,] . | . .
#> [2,] . . | .
#> [3,] . . . |

## no 'x' --> patter*n* matrix:
(n <- sparseMatrix(i=1:6, j=rev(2:7)))# -> ngCMatrix
#> 6 x 7 sparse Matrix of class "ngCMatrix"
#>                   
#> [1,] . . . . . . |
#> [2,] . . . . . | .
#> [3,] . . . . | . .
#> [4,] . . . | . . .
#> [5,] . . | . . . .
#> [6,] . | . . . . .

## an empty sparse matrix:
(e <- sparseMatrix(dims = c(4,6), i={}, j={}))
#> 4 x 6 sparse Matrix of class "ngCMatrix"
#>                 
#> [1,] . . . . . .
#> [2,] . . . . . .
#> [3,] . . . . . .
#> [4,] . . . . . .

## a symmetric one:
(sy <- sparseMatrix(i= c(2,4,3:5), j= c(4,7:5,5), x = 1:5,
                    dims = c(7,7), symmetric=TRUE))
#> 7 x 7 sparse Matrix of class "dsCMatrix"
#>                   
#> [1,] . . . . . . .
#> [2,] . . . 1 . . .
#> [3,] . . . . . 3 .
#> [4,] . 1 . . 4 . 2
#> [5,] . . . 4 5 . .
#> [6,] . . 3 . . . .
#> [7,] . . . 2 . . .
stopifnot(isSymmetric(sy),
          identical(sy, ## switch i <-> j {and transpose }
    t( sparseMatrix(j= c(2,4,3:5), i= c(4,7:5,5), x = 1:5,
                    dims = c(7,7), symmetric=TRUE))))

## rsparsematrix() calls sparseMatrix() :
M1 <- rsparsematrix(1000, 20, nnz = 200)
summary(M1)
#> 1000 x 20 sparse Matrix of class "dgCMatrix", with 200 entries
#>       i  j       x
#> 1    29  1 -0.3900
#> 2   219  1  0.7400
#> 3   455  1  0.9200
#> 4   501  1  0.1400
#> 5   664  1 -0.3800
#> 6   719  1  1.7000
#> 7   733  1 -0.0053
#> 8   858  1  0.5700
#> 9   878  1  1.5000
#> 10  101  2 -0.3900
#> 11  218  2 -1.4000
#> 12  222  2 -2.3000
#> 13  530  2  0.4100
#> 14  572  2  1.3000
#> 15  706  2  1.0000
#> 16  790  2 -0.3500
#> 17  822  2  0.0044
#> 18  942  2  0.8600
#> 19  948  2  1.4000
#> 20  330  3 -2.0000
#> 21  762  3 -0.3900
#> 22   45  4  0.0830
#> 23  123  4 -0.4100
#> 24  144  4 -0.5300
#> 25  306  4  1.5000
#> 26  476  4 -1.0000
#> 27  598  4  0.0990
#> 28  747  4  0.9500
#> 29  863  4 -1.7000
#> 30  992  4 -0.1700
#> 31   50  5 -0.0350
#> 32   78  5  0.6800
#> 33  115  5  1.8000
#> 34  182  5 -0.9400
#> 35  225  5  0.4600
#> 36  486  5  1.6000
#> 37  633  5  1.6000
#> 38  907  5 -0.8800
#> 39  934  5  1.7000
#> 40   51  6  1.2000
#> 41   57  6  0.6800
#> 42  136  6 -0.4500
#> 43  388  6 -0.5700
#> 44  447  6 -0.2000
#> 45  522  6 -0.1600
#> 46  936  6 -1.0000
#> 47  217  7  0.3800
#> 48  273  7  0.2200
#> 49  429  7 -0.2600
#> 50  473  7 -0.9900
#> 51  500  7  0.8000
#> 52  519  7 -0.3300
#> 53  526  7 -0.7700
#> 54  563  7 -0.0510
#> 55  611  7 -0.7000
#> 56  655  7 -0.0700
#> 57  763  7 -0.0980
#> 58  789  7 -0.1200
#> 59  877  7 -1.2000
#> 60  881  7  0.3700
#> 61  902  7  0.1600
#> 62   75  8  0.4000
#> 63  131  8 -1.3000
#> 64  146  8 -0.3300
#> 65  233  8 -0.6700
#> 66  514  8 -0.4300
#> 67  793  8  1.9000
#> 68  892  8  0.1700
#> 69  976  8 -0.3100
#> 70  174  9 -2.6000
#> 71  229  9  0.5200
#> 72  462  9 -0.3300
#> 73  465  9  0.1300
#> 74  479  9  0.2600
#> 75  547  9  0.3900
#> 76  613  9 -1.0000
#> 77  615  9  0.5700
#> 78  666  9  0.2500
#> 79  696  9  1.1000
#> 80  749  9 -1.6000
#> 81   86 10  0.7400
#> 82  182 10  0.5600
#> 83  251 10  0.6000
#> 84  326 10  0.0021
#> 85  346 10 -0.6600
#> 86  359 10 -0.5400
#> 87  392 10  0.6400
#> 88  458 10 -0.8500
#> 89  638 10  0.8300
#> 90  714 10 -0.6400
#> 91  803 10  0.8900
#> 92  888 10  0.5400
#> 93  941 10 -1.5000
#> 94  140 11  1.8000
#> 95  148 11 -1.1000
#> 96  271 11  0.5700
#> 97  618 11  2.5000
#> 98  789 11  1.3000
#> 99  816 11  0.4800
#> 100 844 11  2.0000
#> 101  41 12 -0.9300
#> 102  96 12 -1.6000
#> 103 114 12 -0.8600
#> 104 260 12 -0.0690
#> 105 367 12 -1.2000
#> 106 385 12  0.6800
#> 107 478 12 -2.4000
#> 108 546 12  0.2800
#> 109 558 12 -0.1900
#> 110 564 12 -1.4000
#> 111 571 12  2.1000
#> 112 854 12 -0.2800
#> 113 874 12  0.2400
#> 114 878 12  0.7300
#> 115 892 12 -0.1900
#> 116 980 12  0.4100
#> 117 204 13 -0.4300
#> 118 257 13  1.0000
#> 119 399 13 -1.3000
#> 120 414 13  1.0000
#> 121 428 13  0.3200
#> 122 535 13 -2.3000
#> 123 597 13  1.4000
#> 124 631 13  0.4100
#> 125 669 13 -1.2000
#> 126 269 14 -1.9000
#> 127 284 14  0.9900
#> 128 499 14  0.7900
#> 129 602 14  0.7100
#> 130 824 14 -0.0360
#> 131 837 14  1.5000
#> 132 877 14  1.3000
#> 133 903 14  0.2200
#> 134 928 14 -0.4000
#> 135 940 14 -0.4200
#> 136 973 14  0.2000
#> 137 128 15 -0.0600
#> 138 262 15 -1.2000
#> 139 283 15  0.9600
#> 140 330 15  1.0000
#> 141 413 15 -0.1600
#> 142 504 15 -0.2900
#> 143 662 15 -0.7300
#> 144 663 15 -0.2600
#> 145 713 15  0.4000
#> 146 837 15 -2.3000
#> 147 986 15  1.0000
#> 148  29 16  1.1000
#> 149 191 16 -2.9000
#> 150 225 16 -0.3800
#> 151 655 16 -0.2500
#> 152 705 16 -0.0450
#> 153 800 16  2.6000
#> 154  44 17  1.7000
#> 155  89 17 -1.3000
#> 156 168 17  0.5700
#> 157 236 17  0.4900
#> 158 693 17 -0.3500
#> 159 700 17  1.1000
#> 160 802 17  0.7700
#> 161 849 17 -1.4000
#> 162 854 17 -0.3300
#> 163 884 17  1.5000
#> 164 910 17 -0.0130
#> 165 916 17  0.9200
#> 166 920 17  0.7000
#> 167 169 18 -0.4800
#> 168 181 18 -0.1700
#> 169 202 18 -1.0000
#> 170 251 18 -0.3100
#> 171 453 18  0.5100
#> 172 513 18 -0.1600
#> 173 532 18  0.4300
#> 174 556 18 -0.4300
#> 175 626 18  0.3800
#> 176 649 18 -0.8200
#> 177 685 18  1.2000
#> 178 712 18 -0.3300
#> 179 744 18  0.9500
#> 180 175 19  0.4200
#> 181 183 19  0.9800
#> 182 204 19  0.2700
#> 183 217 19  1.5000
#> 184 279 19 -0.6300
#> 185 350 19 -0.3300
#> 186 494 19 -0.6400
#> 187 537 19  0.9500
#> 188 543 19 -1.2000
#> 189 562 19 -0.5800
#> 190 665 19 -1.2000
#> 191 845 19  0.8100
#> 192 964 19 -1.0000
#> 193 242 20  0.6700
#> 194 250 20  1.1000
#> 195 284 20 -0.1800
#> 196 485 20 -0.3400
#> 197 573 20 -0.3700
#> 198 866 20 -1.4000
#> 199 968 20  0.6100
#> 200 989 20 -1.9000

## pointers example in converting from other sparse matrix representations.
if(requireNamespace("SparseM") &&
   packageVersion("SparseM") >= "0.87" &&
   nzchar(dfil <- system.file("extdata", "rua_32_ax.rua", package = "SparseM"))) {
  X <- SparseM::model.matrix(SparseM::read.matrix.hb(dfil))
  XX <- sparseMatrix(j = X@ja, p = X@ia - 1L, x = X@ra, dims = X@dimension)
  validObject(XX)

  ## Alternatively, and even more user friendly :
  X. <- as(X, "Matrix")  # or also
  X2 <- as(X, "sparseMatrix")
  stopifnot(identical(XX, X.), identical(X., X2))
}
#> Loading required namespace: SparseM