3.5 List
In the last two sections, we learned about data frames and tibbles, whose columns can contain variables of different types. However, they are still rectangular: all variables must have the same number of observations. Now, let’s learn the most flexible object type in R, the list, which contains a sequence of elements that may differ both in type and in the number of observations they hold.
3.5.1 Create a list
To create a list, you can use the list() function with the elements separated by commas.
dig_num <- 1:6
ani_char <- c("sheep", "pig", "monkey", "pig", "monkey", "pig")
x_mat <- matrix(1:12, nrow = 3, ncol = 4)
conditions <- c("Excellent", "Good", "Average", "Fair", "Good", "Excellent")
cond_fac <- factor(conditions, ordered = TRUE, levels = c("Fair", "Average", "Good",
"Excellent"))
my_list <- list(dig_num, ani_char, x_mat, cond_fac)
my_list
#> [[1]]
#> [1] 1 2 3 4 5 6
#>
#> [[2]]
#> [1] "sheep" "pig" "monkey" "pig" "monkey" "pig"
#>
#> [[3]]
#> [,1] [,2] [,3] [,4]
#> [1,] 1 4 7 10
#> [2,] 2 5 8 11
#> [3,] 3 6 9 12
#>
#> [[4]]
#> [1] Excellent Good Average Fair Good Excellent
#> Levels: Fair < Average < Good < Excellent
Here, we created a list named my_list, which contains four elements of different types. In the output, each element is preceded by its index surrounded by [[ and ]]. Usually, you want to assign a name to each element so that the elements can be accessed by name later on.
my_list <- list(number = dig_num, character = ani_char, matrix = x_mat, factor = cond_fac)
my_list
#> $number
#> [1] 1 2 3 4 5 6
#>
#> $character
#> [1] "sheep" "pig" "monkey" "pig" "monkey" "pig"
#>
#> $matrix
#> [,1] [,2] [,3] [,4]
#> [1,] 1 4 7 10
#> [2,] 2 5 8 11
#> [3,] 3 6 9 12
#>
#> $factor
#> [1] Excellent Good Average Fair Good Excellent
#> Levels: Fair < Average < Good < Excellent
After the elements are named, the output shows $ followed by the name before each element. Let’s examine the class, structure, and internal storage type of my_list.
class(my_list)
#> [1] "list"
str(my_list)
#> List of 4
#> $ number : int [1:6] 1 2 3 4 5 6
#> $ character: chr [1:6] "sheep" "pig" "monkey" "pig" ...
#> $ matrix : int [1:3, 1:4] 1 2 3 4 5 6 7 8 9 10 ...
#> $ factor : Ord.factor w/ 4 levels "Fair"<"Average"<..: 4 3 2 1 3 4
typeof(my_list)
#> [1] "list"
From this example, it is clear that lists are more general than all the object types we have learned so far. A list can contain vectors, matrices, arrays, data frames/tibbles, and even other lists.
In R, data frames are actually stored as lists. Indeed, they are special lists in which each element is a vector (possibly of a different type) and all elements have the same length. You can confirm this by applying typeof() to a data frame.
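As a minimal sketch of this check, the code below builds a small data frame and inspects how it is stored. The name animal_df and its values are made up for illustration.

```r
# A small illustrative data frame (hypothetical values)
animal_df <- data.frame(
  animal = c("sheep", "pig", "monkey"),
  weight = c(110, 120, 140)
)
typeof(animal_df)   # internal storage type
#> [1] "list"
class(animal_df)    # the class attribute is what makes it a data frame
#> [1] "data.frame"
is.list(animal_df)
#> [1] TRUE
length(animal_df)   # as a list, its length is the number of columns
#> [1] 2
```

Because the storage type is a list, list operations such as $ and [[ ]] extract the columns of a data frame.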
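For this reason, data frames and matrices are handled quite differently in R even though both look rectangular: a matrix is a vector with a "dim" attribute, whereas a data frame is a list of columns. Most functions that work on lists can be used directly on data frames.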
Sometimes, you may want to create a list skeleton and fill in the values at a later time. In this case, you can use the vector(mode, length) function with mode = "list".
Note that each element of the skeleton is initialized to NULL, which will be explained in detail in the next section.
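A minimal sketch of creating and filling such a skeleton is shown below. The name empty_list and the values assigned to it are chosen only for illustration.

```r
# Create a list skeleton with three empty slots
empty_list <- vector(mode = "list", length = 3)
empty_list
#> [[1]]
#> NULL
#>
#> [[2]]
#> NULL
#>
#> [[3]]
#> NULL

# Fill in some of the slots later; the third one stays NULL
empty_list[[1]] <- 1:6
empty_list[[2]] <- c("sheep", "pig")
str(empty_list)
#> List of 3
#>  $ : int [1:6] 1 2 3 4 5 6
#>  $ : chr [1:2] "sheep" "pig"
#>  $ : NULL
```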
3.5.2 Extract a list element and list subsetting
a. Extract a list element
To extract a single element from a list, you can use $ followed by the element name if the element is named, or use the index surrounded by [[ and ]]. Once an element is extracted, you can apply functions to it directly without first assigning it to another name.
my_list$number #a vector
#> [1] 1 2 3 4 5 6
# my_list[[1]] #the first element, same as my_list$number
my_list[[3]] #the third element, same as my_list$matrix
#> [,1] [,2] [,3] [,4]
#> [1,] 1 4 7 10
#> [2,] 2 5 8 11
#> [3,] 3 6 9 12
mean(my_list$number)
#> [1] 3.5
rowMeans(my_list$matrix)
#> [1] 5.5 6.5 7.5
If you want to extract multiple elements at once (list subsetting), you can use methods similar to those for subsetting a vector, introduced in Section 2.5.2.
b. Use indices to do list subsetting
The first method is to use indices to do list subsetting. To get a sublist containing the 3rd element of the original list my_list, you can use my_list[3] as below.
my_list[3] #the third element of my_list
#> $matrix
#> [,1] [,2] [,3] [,4]
#> [1,] 1 4 7 10
#> [2,] 2 5 8 11
#> [3,] 3 6 9 12
It is worth comparing the results of my_list[[3]] and my_list[3]. The former is the third element of my_list, which is a matrix, while the latter is a list containing a single matrix element. Let’s confirm this by looking at their structures.
str(my_list[[3]])
#> int [1:3, 1:4] 1 2 3 4 5 6 7 8 9 10 ...
str(my_list[3])
#> List of 1
#> $ matrix: int [1:3, 1:4] 1 2 3 4 5 6 7 8 9 10 ...
Of course, you can create a sublist containing multiple elements, just as you create subvectors using indices.
my_list[c(1, 3)] #the first and third elements of my_list
#> $number
#> [1] 1 2 3 4 5 6
#>
#> $matrix
#> [,1] [,2] [,3] [,4]
#> [1,] 1 4 7 10
#> [2,] 2 5 8 11
#> [3,] 3 6 9 12
# my_list[-3] #all elements except the third one
c. Use names to do list subsetting
When the list elements are named, you can also put names inside [ and ] to do list subsetting, using either a single name or a character vector of element names.
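The sketch below illustrates name-based subsetting. It rebuilds my_list as defined earlier so that the snippet runs on its own; the helper name wanted is introduced only for illustration.

```r
# Rebuild my_list as in Section 3.5.1
my_list <- list(
  number = 1:6,
  character = c("sheep", "pig", "monkey", "pig", "monkey", "pig"),
  matrix = matrix(1:12, nrow = 3, ncol = 4),
  factor = factor(c("Excellent", "Good", "Average", "Fair", "Good", "Excellent"),
                  ordered = TRUE, levels = c("Fair", "Average", "Good", "Excellent"))
)

# A single name inside [ ] gives a sublist of length 1
sub_one <- my_list["number"]
sub_one
#> $number
#> [1] 1 2 3 4 5 6

# A character vector of names gives a sublist with those elements, in that order
wanted <- c("matrix", "number")
sub_two <- my_list[wanted]
names(sub_two)
length(sub_two)
#> [1] 2
```

As with indices, single brackets always return a list; use my_list[["number"]] or my_list$number to get the element itself.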
3.5.3 List inside a list
One interesting aspect of the list type is that you can have lists inside a list.
my_big_list <- list(small_list = my_list, number = dig_num)
my_big_list
#> $small_list
#> $small_list$number
#> [1] 1 2 3 4 5 6
#>
#> $small_list$character
#> [1] "sheep" "pig" "monkey" "pig" "monkey" "pig"
#>
#> $small_list$matrix
#> [,1] [,2] [,3] [,4]
#> [1,] 1 4 7 10
#> [2,] 2 5 8 11
#> [3,] 3 6 9 12
#>
#> $small_list$factor
#> [1] Excellent Good Average Fair Good Excellent
#> Levels: Fair < Average < Good < Excellent
#>
#>
#> $number
#> [1] 1 2 3 4 5 6
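If you want to extract ani_char from my_big_list, you need to use two layers of extraction operations. Note that inside small_list it is stored under the name character, not ani_char; asking for a name that does not exist returns NULL.
my_big_list$small_list$character
#> [1] "sheep" "pig" "monkey" "pig" "monkey" "pig"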
my_big_list[[1]][[2]]
#> [1] "sheep" "pig" "monkey" "pig" "monkey" "pig"
You can further put my_big_list in another list.
my_big_big_list <- list(big_list = my_big_list, character = ani_char)
my_big_big_list
#> $big_list
#> $big_list$small_list
#> $big_list$small_list$number
#> [1] 1 2 3 4 5 6
#>
#> $big_list$small_list$character
#> [1] "sheep" "pig" "monkey" "pig" "monkey" "pig"
#>
#> $big_list$small_list$matrix
#> [,1] [,2] [,3] [,4]
#> [1,] 1 4 7 10
#> [2,] 2 5 8 11
#> [3,] 3 6 9 12
#>
#> $big_list$small_list$factor
#> [1] Excellent Good Average Fair Good Excellent
#> Levels: Fair < Average < Good < Excellent
#>
#>
#> $big_list$number
#> [1] 1 2 3 4 5 6
#>
#>
#> $character
#> [1] "sheep" "pig" "monkey" "pig" "monkey" "pig"
To extract dig_num from my_big_big_list, you now need three layers of extraction operations.
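The three layers can be sketched as follows. To keep the snippet short and runnable on its own, my_list is rebuilt here as a shortened stand-in with only its first two elements; the extraction paths are the same as for the full list.

```r
dig_num <- 1:6
ani_char <- c("sheep", "pig", "monkey", "pig", "monkey", "pig")
# Shortened stand-in for my_list (matrix and factor elements omitted)
my_list <- list(number = dig_num, character = ani_char)
my_big_list <- list(small_list = my_list, number = dig_num)
my_big_big_list <- list(big_list = my_big_list, character = ani_char)

# Three layers by name
by_name <- my_big_big_list$big_list$small_list$number
by_name
#> [1] 1 2 3 4 5 6

# Three layers by index
by_index <- my_big_big_list[[1]][[1]][[1]]
by_index
#> [1] 1 2 3 4 5 6

# An index vector inside [[ ]] is applied recursively, one level per entry
by_path <- my_big_big_list[[c(1, 1, 1)]]
```

A copy of dig_num is also stored one level higher, as my_big_big_list$big_list$number, which needs only two layers.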
3.5.4 Unlist a list
Sometimes, you may want to unlist a list into a vector.
unlist(my_list)
#> number1 number2 number3 number4 number5 number6 character1
#> "1" "2" "3" "4" "5" "6" "sheep"
#> character2 character3 character4 character5 character6 matrix1 matrix2
#> "pig" "monkey" "pig" "monkey" "pig" "1" "2"
#> matrix3 matrix4 matrix5 matrix6 matrix7 matrix8 matrix9
#> "3" "4" "5" "6" "7" "8" "9"
#> matrix10 matrix11 matrix12 factor1 factor2 factor3 factor4
#> "10" "11" "12" "4" "3" "2" "1"
#> factor5 factor6
#> "3" "4"
c(my_list[[1]], my_list[[2]], my_list[[3]], my_list[[4]]) #reproduce the result
#> [1] "1" "2" "3" "4" "5" "6" "sheep" "pig"
#> [9] "monkey" "pig" "monkey" "pig" "1" "2" "3" "4"
#> [17] "5" "6" "7" "8" "9" "10" "11" "12"
#> [25] "4" "3" "2" "1" "3" "4"
As you can tell from the result, the unlist() function works in the following steps.
- It visits each element of the list sequentially according to the indices, one at a time. Each element is flattened into a plain vector, much as as.vector() does for a matrix; a factor contributes its integer codes, as the output above shows.
- The flattened vectors are then combined into one longer vector using the c() function. During the c() operation, coercion rules apply.
- The names of the final vector are the name of each element followed by the index of the corresponding value inside that element.
Here, my_list doesn’t contain an element that is also a list. When a list contains another list as one of its elements, the unlist() function will by default also unlist that element. In fact, the unlist operation is applied recursively until none of the elements is a list. You can verify this by calling unlist() on a nested list such as my_big_list.
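A minimal sketch of the recursive behavior is given below. The list nested is a small hypothetical stand-in for my_big_list, kept short so that the result is easy to read.

```r
# A small nested list (hypothetical stand-in for my_big_list)
nested <- list(
  small_list = list(number = 1:3, character = c("sheep", "pig")),
  number = 4:5
)

flat <- unlist(nested)
is.list(flat)       # the inner list has been flattened as well
#> [1] FALSE
typeof(flat)        # integers were coerced to character by c()
#> [1] "character"
length(flat)
#> [1] 7
names(flat)[1]      # names from both levels are joined with a dot
#> [1] "small_list.number1"
```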
If you just want to unlist a list at the first level, you can add the argument recursive = FALSE to the unlist() function. Note that the result will still be a list if the original list contains a list. You can check this by calling unlist(my_big_list, recursive = FALSE).
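The sketch below shows the effect of recursive = FALSE on a small hypothetical nested list named nested, used here in place of my_big_list to keep the output short.

```r
# A small nested list (hypothetical stand-in for my_big_list)
nested <- list(
  small_list = list(number = 1:3, character = c("sheep", "pig")),
  number = 4:5
)

flat_once <- unlist(nested, recursive = FALSE)
is.list(flat_once)              # only one level was removed
#> [1] TRUE
flat_once$small_list.number     # elements of the inner list keep their types
#> [1] 1 2 3
names(flat_once)
```

Since the result is a list, no coercion to a common type takes place, in contrast to the fully recursive call.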
3.5.5 Apply functions for each element of a list
It is often useful to apply a function to each element of a list. To do that, you can use the lapply() function (short for list apply). Let’s look at an example where we want to get the length of each element of my_list.
lapply(my_list, length)
#> $number
#> [1] 6
#>
#> $character
#> [1] 6
#>
#> $matrix
#> [1] 12
#>
#> $factor
#> [1] 6
You can see that the output of lapply() is another list, holding the result of the function applied to each element of the original list. In this application, it might be better to use a vector to represent the result. To simplify this process, you can use the sapply() function, a user-friendly wrapper of lapply() that returns a vector or a matrix when appropriate.
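As a short sketch, the same length computation with sapply() looks as follows. The list my_list is rebuilt as defined earlier so that the snippet runs on its own; the name lens is introduced only for illustration.

```r
# Rebuild my_list as in Section 3.5.1
my_list <- list(
  number = 1:6,
  character = c("sheep", "pig", "monkey", "pig", "monkey", "pig"),
  matrix = matrix(1:12, nrow = 3, ncol = 4),
  factor = factor(c("Excellent", "Good", "Average", "Fair", "Good", "Excellent"),
                  ordered = TRUE, levels = c("Fair", "Average", "Good", "Excellent"))
)

lens <- sapply(my_list, length)
lens
#>    number character    matrix    factor
#>         6         6        12         6
is.list(lens)   # the result is a named integer vector, not a list
#> [1] FALSE
```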
Let’s look at another example where we want to compute the quantiles of each element of a list.
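One possible version of such an example is sketched below. Since quantile() needs numeric input, it uses a hypothetical list num_list of two numeric vectors rather than my_list.

```r
# A list of numeric vectors (hypothetical values)
num_list <- list(small = 1:10, large = c(10, 20, 30, 40, 50))

# lapply() returns a list with one vector of five quantiles per element
q_list <- lapply(num_list, quantile)
names(q_list$small)

# sapply() arranges the same results as a matrix:
# one row per quantile, one column per list element
q_mat <- sapply(num_list, quantile)
dim(q_mat)
#> [1] 5 2
q_mat["50%", "small"]   # the median of 1:10
#> [1] 5.5
```

Each call to quantile() returns a vector of the same length, which is why sapply() can simplify the result to a matrix here.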
3.5.6 List Matrix
In Section 3.1, we learned that matrices are 2-dimensional objects containing elements of the same type. A natural extension of the matrix to the case where different elements can be of different types is called a list matrix.
Let’s create a list matrix from my_list. Since my_list has 4 elements, you can use the matrix() function to create a \(2\times 2\) matrix of elements.
my_list_mat <- matrix(my_list, 2, 2)
my_list_mat
#> [,1] [,2]
#> [1,] integer,6 integer,12
#> [2,] character,6 ordered,6
To extract an element from a list matrix, you can use the pair of [[ and ]] with the row and column indices inside. Similar to list subsetting, using the pair of [ and ] will result in a list of length 1.
my_list_mat[[1, 2]]
#> [,1] [,2] [,3] [,4]
#> [1,] 1 4 7 10
#> [2,] 2 5 8 11
#> [3,] 3 6 9 12
my_list_mat[1, 2]
#> [[1]]
#> [,1] [,2] [,3] [,4]
#> [1,] 1 4 7 10
#> [2,] 2 5 8 11
#> [3,] 3 6 9 12
Similar to combining vectors into matrices in Section 3.1.2, you can also combine lists into a list matrix using the rbind() or cbind() functions.
l1 <- list(num = 1:4, mat = matrix(1:6, 2, 3))
l2 <- list(char = letters[1:5], my_list = list(num = 9:1, char = LETTERS[1:7]))
list_mat_r <- rbind(l1, l2)
list_mat_c <- cbind(l1, l2)
str(list_mat_r)
#> List of 4
#> $ : int [1:4] 1 2 3 4
#> $ : chr [1:5] "a" "b" "c" "d" ...
#> $ : int [1:2, 1:3] 1 2 3 4 5 6
#> $ :List of 2
#> ..$ num : int [1:9] 9 8 7 6 5 4 3 2 1
#> ..$ char: chr [1:7] "A" "B" "C" "D" ...
#> - attr(*, "dim")= int [1:2] 2 2
#> - attr(*, "dimnames")=List of 2
#> ..$ : chr [1:2] "l1" "l2"
#> ..$ : chr [1:2] "num" "mat"
typeof(list_mat_r)
#> [1] "list"
You can see from the result that list_mat_r is still stored as a list, but with a "dim" attribute (and, here, a "dimnames" attribute as well). This relationship is very similar to that between a vector and the matrix created from it.
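To illustrate this relationship, the sketch below turns a plain list into a list matrix simply by assigning a "dim" attribute. The name plain_list and its contents are made up for illustration.

```r
# A plain list with four elements of different types (hypothetical values)
plain_list <- list(1:4, letters[1:5], matrix(1:6, 2, 3), TRUE)
is.matrix(plain_list)
#> [1] FALSE

# Adding a "dim" attribute makes it a 2 x 2 list matrix, filled by column
dim(plain_list) <- c(2, 2)
is.matrix(plain_list)
#> [1] TRUE
typeof(plain_list)       # the storage type is unchanged
#> [1] "list"
plain_list[[2, 1]]       # row 2, column 1 holds the second element
#> [1] "a" "b" "c" "d" "e"
```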
3.5.7 Exercises
Consider the following list,
dig_num <- 1:6
ani_char <- c("sheep", "pig", "monkey", "pig", "monkey")
x_mat <- matrix(1:12, nrow = 3, ncol = 4)
my_list <- list(num = dig_num, char = ani_char, mat = x_mat)
my_list
#> $num
#> [1] 1 2 3 4 5 6
#>
#> $char
#> [1] "sheep" "pig" "monkey" "pig" "monkey"
#>
#> $mat
#> [,1] [,2] [,3] [,4]
#> [1,] 1 4 7 10
#> [2,] 2 5 8 11
#> [3,] 3 6 9 12
- What’s the difference between my_list[2] and my_list[3]?
- What’s the difference between my_list[2:3] and my_list[[2:3]]?
- Use the sapply() function to get the length of each element in my_list.
- Multiply the 3rd row of mat in my_list by 10. Then, check the new value of my_list.