3.1 Matrix

Having mastered atomic vectors, which are one-dimensional objects containing elements of the same type, we now introduce another object type, the matrix: a two-dimensional rectangular array whose elements all have the same type and are arranged in rows and columns.

3.1.1 Create a matrix from a vector

One of the most common ways to create a matrix from a vector is to use the function matrix(). In matrix(), the first argument is the data vector, and the arguments nrow and ncol give the desired numbers of rows and columns of the matrix.

matrix(data = 1:12, nrow = 3, ncol = 4)
#>      [,1] [,2] [,3] [,4]
#> [1,]    1    4    7   10
#> [2,]    2    5    8   11
#> [3,]    3    6    9   12

Note that when we supply the arguments in the default order of an R function, we can omit the argument names. As a result, the following expression generates the same matrix.

matrix(1:12, 3, 4)

Typically, the length of the supplied vector equals the number of rows multiplied by the number of columns. Otherwise, R applies the recycling rule we learned in Section 2.1.2 to the vector to fill in the matrix. Recycling is particularly useful for creating a matrix whose elements all have the same value.

matrix(6, 3, 3)
#>      [,1] [,2] [,3]
#> [1,]    6    6    6
#> [2,]    6    6    6
#> [3,]    6    6    6
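
Recycling also applies when the vector has more than one element but is shorter than the matrix. The following is a minimal sketch; the name recycled is used only for illustration and does not appear elsewhere in this section.

```r
# A length-3 vector is reused, column by column, to fill a 3-by-4 matrix
recycled <- matrix(1:3, nrow = 3, ncol = 4)
recycled
#>      [,1] [,2] [,3] [,4]
#> [1,]    1    1    1    1
#> [2,]    2    2    2    2
#> [3,]    3    3    3    3
```

When the number of matrix elements is not a multiple of the vector length, R still fills the matrix but typically issues a warning, so it is safest to rely on recycling only when the lengths divide evenly.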

As mentioned before, some arguments of a function can be omitted for simpler and cleaner code. Here, you can specify only nrow or only ncol, because R infers the other one from the length of the data, and still get exactly the same matrix, as shown below.

matrix(1:12, nrow = 3)
matrix(1:12, ncol = 4)

Looking at the resulting matrix, you may notice that the matrix is created by filling the columns sequentially with the elements from the input vector. That is, it first fills the first column, then the second column, and so on. If you want to fill by rows instead, you can add the argument byrow = TRUE.

matrix(1:12, nrow = 4, byrow = TRUE)
#>      [,1] [,2] [,3]
#> [1,]    1    2    3
#> [2,]    4    5    6
#> [3,]    7    8    9
#> [4,]   10   11   12
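
One way to see the relation between the two fill orders is to compare a row-filled matrix with the transpose of a column-filled one. This is only a sketch; by_row and by_col are illustrative names.

```r
by_row <- matrix(1:12, nrow = 4, byrow = TRUE)  # fill row by row (4-by-3)
by_col <- matrix(1:12, nrow = 3)                # default: fill column by column (3-by-4)
all(by_row == t(by_col))                        # transposing by_col gives by_row
#> [1] TRUE
```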

After defining a matrix, we can apply various functions to it.

x <- matrix(1:12, nrow = 4)
dim(x)  #the dimension
#> [1] 4 3
nrow(x)  #the number of rows
#> [1] 4
ncol(x)  #the number of columns
#> [1] 3
t(x)  #the transpose
#>      [,1] [,2] [,3] [,4]
#> [1,]    1    2    3    4
#> [2,]    5    6    7    8
#> [3,]    9   10   11   12

Now, let’s check its class using class(), internal storage type using typeof(), structure using str(), and attributes using attributes().

class(x)
#> [1] "matrix" "array"
typeof(x)
#> [1] "integer"
str(x)
#>  int [1:4, 1:3] 1 2 3 4 5 6 7 8 9 10 ...
attributes(x)
#> $dim
#> [1] 4 3

We can see that x is of class matrix and that its storage type is integer. The storage type is integer because the : operator returns an integer vector when asked for a sequence of whole numbers such as 1:12. x has one attribute, named dim, with value 4 3.

Attributes play a critical role in determining the class and the structure of an R object. You can completely change the class and structure of an R object by setting or changing its attributes.

v <- rep(1:3, 2)
class(v)
str(v)
attr(v, "dim") <- c(2, 3)
v
class(v)
str(v)

In this example, we convert a length-6 vector into a 2-by-3 matrix by setting its "dim" attribute.
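
The conversion also works in the other direction: removing the "dim" attribute turns the matrix back into a plain vector. A small sketch, using the illustrative name v_demo so that v above is left unchanged:

```r
v_demo <- rep(1:3, 2)
dim(v_demo) <- c(2, 3)   # setting dim makes it a 2-by-3 matrix
is.matrix(v_demo)
#> [1] TRUE
dim(v_demo) <- NULL      # removing dim makes it a vector again
is.matrix(v_demo)
#> [1] FALSE
v_demo
#> [1] 1 2 3 1 2 3
```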

To set the names of a matrix, you can use rownames() and colnames() to set the row names and column names, respectively.

rownames(x) <- c("a", "b", "c", "d")  #row names
colnames(x) <- c("x", "y", "z")  #column names

In addition, you can use the same functions rownames() and colnames() to extract the row and column names.

rownames(x)
colnames(x)
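
Names can also be supplied at creation time through the dimnames argument of matrix(), which takes a list holding the row names and then the column names. The sketch below uses an illustrative matrix called named rather than x.

```r
named <- matrix(1:6, nrow = 2,
                dimnames = list(c("r1", "r2"), c("A", "B", "C")))
named
#>    A B C
#> r1 1 3 5
#> r2 2 4 6
rownames(named)
#> [1] "r1" "r2"
```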

We can also convert a matrix to a vector, which will arrange the elements of the matrix column by column.

as.vector(x)  #convert matrix to a vector
#>  [1]  1  2  3  4  5  6  7  8  9 10 11 12

In addition to numeric matrices, you can also create character matrices from a character vector.

char_mat <- matrix(letters[1:6], 2, 3)
char_mat
#>      [,1] [,2] [,3]
#> [1,] "a"  "c"  "e" 
#> [2,] "b"  "d"  "f"

Another way to create a matrix from a vector is to assign the desired dimensions (a length-2 integer vector) to the vector using the dim() function.

my_vec <- 1:6
my_vec
#> [1] 1 2 3 4 5 6
dim(my_vec) <- c(2, 3)
my_vec
#>      [,1] [,2] [,3]
#> [1,]    1    3    5
#> [2,]    2    4    6

As you can see, my_vec becomes a matrix after we set its dimensions.

3.1.2 Combine vectors or matrices into a matrix

To combine two vectors into a matrix, you can use the rbind() or cbind() function to stack the vectors together by row or by column, respectively.

z <- 1:4
w <- 5:8
rbind(z, w)
#>   [,1] [,2] [,3] [,4]
#> z    1    2    3    4
#> w    5    6    7    8
cbind(z, w)
#>      z w
#> [1,] 1 5
#> [2,] 2 6
#> [3,] 3 7
#> [4,] 4 8

In addition to combining two vectors, you can also use rbind() and cbind() to combine two matrices.

m1 <- matrix(1:6, 2, 3)
m2 <- matrix(5:10, 2, 3)
rbind(m1, m2)
#>      [,1] [,2] [,3]
#> [1,]    1    3    5
#> [2,]    2    4    6
#> [3,]    5    7    9
#> [4,]    6    8   10
cbind(m1, m2)
#>      [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,]    1    3    5    5    7    9
#> [2,]    2    4    6    6    8   10
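
A matrix and a vector can be combined in the same way, for example to append a row of column totals. This is a sketch with the illustrative names scores and with_total; naming the argument (total = ...) gives the new row a name.

```r
scores <- matrix(1:6, 2, 3)
with_total <- rbind(scores, total = colSums(scores))
with_total["total", ]   # the appended row holds the column sums 3, 7 and 11
dim(with_total)
#> [1] 3 3
```

For rbind() the number of columns must match, and for cbind() the number of rows must match; otherwise R reports an error for matrices or recycles shorter vectors.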

3.1.3 Matrix subsetting

Like the vector subsetting introduced in Section 2.5.2, we can do matrix subsetting as well.

a. using indices to do matrix subsetting

The first method for matrix subsetting is to specify the desired row indices and column indices, separated by ,. For example, we can extract the (1, 1) and (2, 3) elements of x using the following code.

x[1, 1]  #the element on the first row and first column
#> [1] 1
x[2, 3]  #the element on the second row and third column
#> [1] 10

To get a submatrix with multiple rows and columns, you just need to supply the row and column indices separated by ,.

x[1:2, 2:3]  #the elements on the first & second row and second & third column
#>   y  z
#> a 5  9
#> b 6 10

To keep all the rows or columns, you can leave the corresponding index location empty.

x[2, ]  #all elements on the second row
#>  x  y  z 
#>  2  6 10
x[, 3]  #all elements on the third column
#>  a  b  c  d 
#>  9 10 11 12
x[, c(2, 3)]  #all elements on the second and third columns
#>   y  z
#> a 5  9
#> b 6 10
#> c 7 11
#> d 8 12

As you can see from the first two results, when the result has only one row or one column, R drops that dimension and returns a vector instead of a matrix. If you need to keep the result as a matrix, you can add the argument drop = FALSE to the subsetting operation.

x[2, , drop = FALSE]
#>   x y  z
#> b 2 6 10
x[, 3, drop = FALSE]
#>    z
#> a  9
#> b 10
#> c 11
#> d 12
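
The difference is easy to check with dim(), which returns NULL for a plain vector. A minimal sketch on an illustrative matrix x_demo with the same values as x:

```r
x_demo <- matrix(1:12, nrow = 4)
dim(x_demo[2, ])                 # the dimension is dropped: a plain vector
#> NULL
dim(x_demo[2, , drop = FALSE])   # still a matrix with one row and three columns
#> [1] 1 3
```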

Similar to vectors, you can use negative indices to get all the rows or columns except the specified one(s).

x[-2, 3]  #all rows except the 2nd row, the 3rd column
#>  a  c  d 
#>  9 11 12
x[-1, -c(2, 3)]  #all rows except the 1st row, except the 2nd and the 3rd column
#> b c d 
#> 2 3 4

b. using row and column names to do matrix subsetting

Just like vector subsetting for named vectors (Section 2.5.2), we can extract a submatrix using the row and column names.

x["a", "z"]
x[c("a", "c"), c("x", "y")]
x["b", ]

c. using logical vectors to do matrix subsetting

Similar to vector subsetting, you can also use logical vectors to do matrix subsetting. Note that, unlike vector subsetting, you can supply two logical vectors, one for the rows and another for the columns. Let’s see some examples.

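x[c(TRUE, FALSE, TRUE, FALSE), c(FALSE, TRUE, TRUE)]  #rows 1 & 3, columns 2 & 3
x[c(FALSE, TRUE, FALSE, TRUE), ]  #rows 2 & 4, all columns
x[, c(TRUE, FALSE, FALSE)]  #all rows, column 1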

In addition to using the logical values directly, you can also create a logical vector and use it on the fly to do matrix subsetting. Let’s say we want to keep the rows whose value in the y column is greater than 6. To do that, you can create a logical vector x[, "y"] > 6, then use it to subset the corresponding rows.

x[, "y"] > 6  #logical vector for the rows such that the `y` column > 6
#>     a     b     c     d 
#> FALSE FALSE  TRUE  TRUE
x[x[, "y"] > 6, ]  #extract the corresponding rows
#>   x y  z
#> c 3 7 11
#> d 4 8 12

Similarly, suppose we want to keep the columns whose value in the b row is less than 7. To do that, you can create a logical vector x["b", ] < 7, then use it to subset the corresponding columns.

x["b", ] < 7  #logical vector for the columns such that the `b` row < 7
#>     x     y     z 
#>  TRUE  TRUE FALSE
x[, x["b", ] < 7]  #extract the corresponding columns
#>   x y
#> a 1 5
#> b 2 6
#> c 3 7
#> d 4 8

Of course, you can combine the two requirements, namely, keep the rows with the value on the y column greater than 6 and the columns with the value on the b row less than 7.

x[x[, "y"] > 6, x["b", ] < 7]
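
To make the combined condition easier to read, the two logical vectors can be stored first and then used together. The sketch below rebuilds the same named matrix under the illustrative name x_demo; keep_rows and keep_cols are also illustrative.

```r
x_demo <- matrix(1:12, nrow = 4,
                 dimnames = list(c("a", "b", "c", "d"), c("x", "y", "z")))
keep_rows <- x_demo[, "y"] > 6   # TRUE for rows c and d
keep_cols <- x_demo["b", ] < 7   # TRUE for columns x and y
x_demo[keep_rows, keep_cols]
#>   x y
#> c 3 7
#> d 4 8
```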

3.1.4 Modify values in sub-matrices

Just as we modified values in sub-vectors (Section 2.5), we can use the same technique to modify values in sub-matrices. To avoid changing the original matrix, we will modify the values in a copy of the original matrix.

x_copy <- x
x_copy[1, 2] <- 10  #update the (1, 2) element
x_copy
#>   x  y  z
#> a 1 10  9
#> b 2  6 10
#> c 3  7 11
#> d 4  8 12
x_copy[1:2, 3] <- c(20, 30)  #update the (1:2, 3) submatrix
x_copy
#>   x  y  z
#> a 1 10 20
#> b 2  6 30
#> c 3  7 11
#> d 4  8 12
x_copy[, 2] <- 40  #update the 2nd column to 40
x_copy
#>   x  y  z
#> a 1 40 20
#> b 2 40 30
#> c 3 40 11
#> d 4 40 12
x_copy[c(2, 4), c(1, 3)] <- 50
x_copy
#>    x  y  z
#> a  1 40 20
#> b 50 40 50
#> c  3 40 11
#> d 50 40 50
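
Assignment also works with a logical condition on the whole matrix, which replaces every element that satisfies the condition. A minimal sketch on an illustrative matrix x_demo:

```r
x_demo <- matrix(1:12, nrow = 4)
x_demo[x_demo > 10] <- 0L   # set every element larger than 10 to zero
x_demo
#>      [,1] [,2] [,3]
#> [1,]    1    5    9
#> [2,]    2    6   10
#> [3,]    3    7    0
#> [4,]    4    8    0
```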

3.1.5 Operators and functions on matrices

Now, let’s introduce some commonly used operators and functions on matrices. First of all, if you use arithmetic operators between two matrices, the specified operation will be performed elementwise, similar to the operation between two vectors.

m1 <- matrix(c(2, 1, 1, 2), 2, 2)
m2 <- matrix(c(1, 2, 2, 1), 2, 2)
m1 + m2
#>      [,1] [,2]
#> [1,]    3    3
#> [2,]    3    3
m1 * m2
#>      [,1] [,2]
#> [1,]    2    2
#> [2,]    2    2
m1/m2
#>      [,1] [,2]
#> [1,]  2.0  0.5
#> [2,]  0.5  2.0

You can also apply operations between a matrix and a number (a vector of length 1), where the recycling rule introduced in Section 2.1.2 will apply.

m1 * 2
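
Recycling also applies when the second operand is a longer vector: its elements are matched against the matrix elements in column-by-column order. The sketch below uses an illustrative matrix m_demo so that m1 is left unchanged.

```r
m_demo <- matrix(1:6, nrow = 2)
m_demo * 2
#>      [,1] [,2] [,3]
#> [1,]    2    6   10
#> [2,]    4    8   12
m_demo + c(10, 20)   # 10 is added to row 1 and 20 to row 2
#>      [,1] [,2] [,3]
#> [1,]   11   13   15
#> [2,]   22   24   26
```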

To perform the actual matrix multiplication, you can use the operator %*% between two matrices.

m1 %*% m2
#>      [,1] [,2]
#> [1,]    4    5
#> [2,]    5    4
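
Each entry of a matrix product is the sum of the elementwise products of one row of the left matrix and one column of the right matrix. The following sketch checks this for the (1, 2) entry, using copies of the two matrices under the illustrative names a_demo and b_demo.

```r
a_demo <- matrix(c(2, 1, 1, 2), 2, 2)
b_demo <- matrix(c(1, 2, 2, 1), 2, 2)
prod_demo <- a_demo %*% b_demo
sum(a_demo[1, ] * b_demo[, 2])   # row 1 of a_demo with column 2 of b_demo
#> [1] 5
prod_demo[1, 2]
#> [1] 5
```

For %*% to work, the number of columns of the left matrix must equal the number of rows of the right matrix.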

We will next introduce some functions for creating special matrices.

To create a diagonal matrix, you can use the diag() function on a vector.

diag(1:5)
#>      [,1] [,2] [,3] [,4] [,5]
#> [1,]    1    0    0    0    0
#> [2,]    0    2    0    0    0
#> [3,]    0    0    3    0    0
#> [4,]    0    0    0    4    0
#> [5,]    0    0    0    0    5

The diag() function can also be used to extract the diagonal elements of a square matrix.

diag(m1)
#> [1] 2 2

For a square matrix \(A_{n\times n} = [A_{ij}]\), you can also calculate its determinant \(\det(A)\) using det(A). If \(A\) is invertible, you can compute its inverse matrix \(A^{-1}\) using solve(A).

det(m1)
#> [1] 3
solve(m1)
#>            [,1]       [,2]
#> [1,]  0.6666667 -0.3333333
#> [2,] -0.3333333  0.6666667
m1 %*% solve(m1)  ##verify we are getting the inverse of m1
#>      [,1] [,2]
#> [1,]    1    0
#> [2,]    0    1
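
Because the inverse is computed in floating-point arithmetic, the product may differ from the identity matrix by tiny rounding errors. One hedged way to verify it is all.equal(), which compares numbers up to a small tolerance; m_demo below is an illustrative copy of m1.

```r
m_demo <- matrix(c(2, 1, 1, 2), 2, 2)
all.equal(m_demo %*% solve(m_demo), diag(2))   # diag(2) is the 2-by-2 identity
#> [1] TRUE
```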

To apply a function to all elements of a matrix, you can directly use the function on the matrix object as if it were a vector. The result is equivalent to first converting the matrix into a vector using as.vector() and then applying the function to the vector.

sum(x)
#> [1] 78
mean(x)
#> [1] 6.5
quantile(x, c(0.25, 0.5, 0.75))
#>  25%  50%  75% 
#> 3.75 6.50 9.25
cumsum(x)
#>  [1]  1  3  6 10 15 21 28 36 45 55 66 78

3.1.6 Apply functions on each row or each column

In many applications, we may want to apply a certain function to each row or column. To do this, you can use the apply() function, whose first three arguments are as follows.

The first argument is the matrix.
The second argument is the dimension(s) to apply the function on.
The third argument is the function to be applied.

For example, if you want to calculate the mean and sum of each row for x, you can use

apply(x, 1, mean)  #calculate the mean of each row
#> a b c d 
#> 5 6 7 8
rowMeans(x)  #calculate the mean of each row
#> a b c d 
#> 5 6 7 8
apply(x, 1, sum)  #calculate the sum of each row
#>  a  b  c  d 
#> 15 18 21 24
rowSums(x)  #calculate the sum of each row 
#>  a  b  c  d 
#> 15 18 21 24

Here, 1 means the first dimension, i.e. the row. You can see that the mean for the row a is (1 + 5 + 9)/3 = 5, and the sum is 1 + 5 + 9 = 15. To get the mean and sum of each row, you can also use rowMeans() and rowSums(). To calculate the mean and sum of each column for x, you can use

apply(x, 2, mean)  #calculate the mean of each column
#>    x    y    z 
#>  2.5  6.5 10.5
colMeans(x)
#>    x    y    z 
#>  2.5  6.5 10.5
apply(x, 2, sum)
#>  x  y  z 
#> 10 26 42
colSums(x)
#>  x  y  z 
#> 10 26 42

Here, 2 means the second dimension, i.e. the column. You can see that the mean for the column y is (5 + 6 + 7 + 8)/4 = 6.5, and the sum is 5 + 6 + 7 + 8 = 26.

In addition to the mean and sum functions, you can use any function defined on a vector. The following are some other examples.

apply(x, 2, sd)  #calculate the standard deviation of each column
#>        x        y        z 
#> 1.290994 1.290994 1.290994
apply(x, 1, max)  #calculate the max of each row
#>  a  b  c  d 
#>  9 10 11 12

In addition to the first three arguments of the apply() function, you can add more arguments, which will be passed on to the function being applied, i.e. the original third argument. For example, to calculate the first quartile (the 0.25 quantile) of each column,

apply(x, 2, quantile, 0.25)
#>    x    y    z 
#> 1.75 5.75 9.75
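
The function passed to apply() does not have to be an existing one; you can also write it on the spot. As a sketch, the following computes the difference between the largest and smallest value in each row of an illustrative matrix x_demo; the argument name r is arbitrary.

```r
x_demo <- matrix(1:12, nrow = 4)
apply(x_demo, 1, function(r) max(r) - min(r))   # max minus min of each row
#> [1] 8 8 8 8
```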

If the function you apply returns a vector with more than one element, the apply() function will combine the results into a matrix (or a higher-dimensional array). Let’s see an example of calculating the cumulative sum of each row.

apply(x, 1, cumsum)  #calculate the cumulative sum of each row
#>    a  b  c  d
#> x  1  2  3  4
#> y  6  8 10 12
#> z 15 18 21 24

Here, cumsum() is applied to each row of x, and each resulting vector becomes one column of the output matrix. The following reproduces the same values by calling cbind() on the cumulative sums of the individual rows.

cbind(cumsum(x[1, ]), cumsum(x[2, ]), cumsum(x[3, ]), cumsum(x[4, ]))
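
Since each row's result is stored as a column, the output has the rows and columns swapped relative to the input. If you prefer the result in the same orientation as the original matrix, one option is to transpose it. A sketch on an illustrative matrix x_demo:

```r
x_demo <- matrix(1:12, nrow = 4)
t(apply(x_demo, 1, cumsum))   # one row of cumulative sums per row of x_demo
#>      [,1] [,2] [,3]
#> [1,]    1    6   15
#> [2,]    2    8   18
#> [3,]    3   10   21
#> [4,]    4   12   24
```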

As another example, you can use the following code to calculate the (0.25, 0.5, 0.75) quantiles for each column of x.

apply(x, 2, quantile, c(0.25, 0.5, 0.75))
#>        x    y     z
#> 25% 1.75 5.75  9.75
#> 50% 2.50 6.50 10.50
#> 75% 3.25 7.25 11.25

Finally, let’s see an example of the apply() function on a character matrix. We want to combine the strings in each column of the matrix with separator _.

apply(char_mat, 2, paste, collapse = "_")
#> [1] "a_b" "c_d" "e_f"

3.1.7 Exercises

Use R to create the following matrix

#>      [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,]    2    1    1    1    1    1
#> [2,]    1    2    1    1    1    1
#> [3,]    1    1    2    1    1    1
#> [4,]    1    1    1    2    1    1
#> [5,]    1    1    1    1    2    1
#> [6,]    1    1    1    1    1    2

For the matrix X <- matrix(1:16, 4, 4) + diag(4), answer the following questions using R.

Compute the column means of X.
Create a matrix that contains the 0.4 and 0.7 quantiles for each row of X.
Compute the cumulative sum of each row of X. What type of object is the result? Explain the result in the first column.
If b <- 1:4, solve for a such that \(Xa = b\).