3.5 List

In the last two sections, we learned about data frames and tibbles, whose columns can contain variables of different types. However, they are still rectangular: all variables must have the same number of observations. Now, let’s learn the most complex object type in R, namely, the list, which contains a sequence of elements that may differ in both length and type.

3.5.1 Create a list

To create a list, you can use the list() function with the elements separated by commas.

dig_num <- 1:6
ani_char <- c("sheep", "pig", "monkey", "pig", "monkey", "pig")
x_mat <- matrix(1:12, nrow = 3, ncol = 4)
conditions <- c("Excellent", "Good", "Average", "Fair", "Good", "Excellent")
cond_fac <- factor(conditions, ordered = TRUE, levels = c("Fair", "Average", "Good",
    "Excellent"))
my_list <- list(dig_num, ani_char, x_mat, cond_fac)
my_list
#> [[1]]
#> [1] 1 2 3 4 5 6
#> 
#> [[2]]
#> [1] "sheep"  "pig"    "monkey" "pig"    "monkey" "pig"   
#> 
#> [[3]]
#>      [,1] [,2] [,3] [,4]
#> [1,]    1    4    7   10
#> [2,]    2    5    8   11
#> [3,]    3    6    9   12
#> 
#> [[4]]
#> [1] Excellent Good      Average   Fair      Good      Excellent
#> Levels: Fair < Average < Good < Excellent

Here, we created a list named my_list, which contains four elements of different types. In the output, each element is preceded by its index enclosed in [[ and ]]. Usually, you will want to assign a name to each element so that the elements can be accessed by name later on.

my_list <- list(number = dig_num, character = ani_char, matrix = x_mat, factor = cond_fac)
my_list
#> $number
#> [1] 1 2 3 4 5 6
#> 
#> $character
#> [1] "sheep"  "pig"    "monkey" "pig"    "monkey" "pig"   
#> 
#> $matrix
#>      [,1] [,2] [,3] [,4]
#> [1,]    1    4    7   10
#> [2,]    2    5    8   11
#> [3,]    3    6    9   12
#> 
#> $factor
#> [1] Excellent Good      Average   Fair      Good      Excellent
#> Levels: Fair < Average < Good < Excellent

After the elements are named, the output shows the $ followed by the name before each element. Let’s examine the class, structure, and internal storage type of my_list.

class(my_list)
#> [1] "list"
str(my_list)
#> List of 4
#>  $ number   : int [1:6] 1 2 3 4 5 6
#>  $ character: chr [1:6] "sheep" "pig" "monkey" "pig" ...
#>  $ matrix   : int [1:3, 1:4] 1 2 3 4 5 6 7 8 9 10 ...
#>  $ factor   : Ord.factor w/ 4 levels "Fair"<"Average"<..: 4 3 2 1 3 4
typeof(my_list)
#> [1] "list"

From this example, it is clear that lists are more general than all the object types we have learned so far. A list can contain vectors, matrices, arrays, data frames/tibbles, and even other lists.

In R, data frames are actually stored as lists. Indeed, they are special lists in which each element is a vector (the vectors may be of different types) and all elements have the same length. Let’s create a data frame and look at its typeof().

df_ex <- data.frame(dig_num, ani_char)
typeof(df_ex)
#> [1] "list"
length(df_ex)
#> [1] 2

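For this reason, R treats data frames and matrices quite differently, even though both look rectangular: a matrix is a single vector with a "dim" attribute, while a data frame is a list of columns. Most functions that work on lists can be used directly on data frames.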
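
Because a data frame is a list of columns, list tools operate on it column by column. The following is a minimal sketch of this idea; the data frame df_demo and its column names are made up for illustration.

```r
# A small data frame; the names here are illustrative only
df_demo <- data.frame(id = 1:3, animal = c("sheep", "pig", "monkey"))

is.list(df_demo)      # a data frame passes the list check
is.matrix(df_demo)    # but it is not a matrix
names(df_demo)        # column names are the list element names
df_demo[["animal"]]   # [[ ]] extracts a single column

# lengths() reports the length of each list element, i.e. each column
col_lengths <- lengths(df_demo)
col_lengths
```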

Sometimes, you may want to create a list skeleton and fill in the values at a later time. In this case, you can use the vector(mode, length) function.

vector("list", length = 3)
#> [[1]]
#> NULL
#> 
#> [[2]]
#> NULL
#> 
#> [[3]]
#> NULL

Note that the default value of each element is NULL, which will be explained in detail in the next section.
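
As a minimal sketch of filling such a skeleton later, the loop below assigns a value to each slot with [[ and ]]. The name squares and the values stored are assumptions chosen only for illustration.

```r
# Pre-allocate a list with three empty (NULL) slots
squares <- vector("list", length = 3)

# Fill each slot; slot i receives the first i square numbers
for (i in seq_along(squares)) {
  squares[[i]] <- (1:i)^2
}

str(squares)
```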

3.5.2 Extract a list element and list subsetting

a. Extract a list element

To extract a single element from a list, you can use $ followed by the element name if the element is named, or use the index surrounded by [[ and ]]. Once an element is extracted, you can apply functions to it directly, without first assigning it to another name.

my_list$number  #a vector
#> [1] 1 2 3 4 5 6
my_list[[1]]  #the first element
#> [1] 1 2 3 4 5 6
my_list$matrix  #a matrix
#>      [,1] [,2] [,3] [,4]
#> [1,]    1    4    7   10
#> [2,]    2    5    8   11
#> [3,]    3    6    9   12
my_list[[3]]  #the third element
#>      [,1] [,2] [,3] [,4]
#> [1,]    1    4    7   10
#> [2,]    2    5    8   11
#> [3,]    3    6    9   12
mean(my_list$number)
#> [1] 3.5
rowMeans(my_list$matrix)
#> [1] 5.5 6.5 7.5

If you want to do list subsetting by extracting multiple elements, you can use methods similar to those for subsetting a vector, introduced in Section 2.5.2.

b. Use indices to do list subsetting

The first method is to use indices to do list subsetting. To get a sublist of the 3rd element of the original list my_list, you can use my_list[3] as below.

my_list[3]  #the third element of my_list
#> $matrix
#>      [,1] [,2] [,3] [,4]
#> [1,]    1    4    7   10
#> [2,]    2    5    8   11
#> [3,]    3    6    9   12

It is worth comparing the results of my_list[[3]] and my_list[3]. The former is the third element of my_list, which is a matrix, while the latter is a list containing a single matrix element. Let’s confirm this by looking at their structures.

str(my_list[[3]])
#>  int [1:3, 1:4] 1 2 3 4 5 6 7 8 9 10 ...
str(my_list[3])
#> List of 1
#>  $ matrix: int [1:3, 1:4] 1 2 3 4 5 6 7 8 9 10 ...
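
The distinction matters in practice because functions that expect a matrix need the element itself, not a one-element list. Here is a small sketch, assuming a shortened two-element list called demo_list (a hypothetical name).

```r
demo_list <- list(number = 1:6, matrix = matrix(1:12, nrow = 3, ncol = 4))

elem <- demo_list[[2]]   # the matrix itself
sub <- demo_list[2]      # a list of length 1 that holds the matrix

is.matrix(elem)   # TRUE
is.list(sub)      # TRUE
dim(elem)         # 3 rows and 4 columns
length(sub)       # 1

rowMeans(elem)       # works because elem is a matrix
rowMeans(sub[[1]])   # for the sublist, extract the element first
```

Calling rowMeans(sub) directly would fail, since sub is a list rather than a matrix.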

Of course, you can create a sublist containing multiple elements, just like when creating subvectors using indices.

my_list[c(1, 3)]  #the first and third elements of my_list
#> $number
#> [1] 1 2 3 4 5 6
#> 
#> $matrix
#>      [,1] [,2] [,3] [,4]
#> [1,]    1    4    7   10
#> [2,]    2    5    8   11
#> [3,]    3    6    9   12
my_list[-3]  #all elements except the third one
#> $number
#> [1] 1 2 3 4 5 6
#> 
#> $character
#> [1] "sheep"  "pig"    "monkey" "pig"    "monkey" "pig"   
#> 
#> $factor
#> [1] Excellent Good      Average   Fair      Good      Excellent
#> Levels: Fair < Average < Good < Excellent

c. Use names to do list subsetting

When the list elements are named, you can also use names inside [ and ] to do list subsetting. You can also use a character vector containing the element names.

my_list["number"]
#> $number
#> [1] 1 2 3 4 5 6
my_list[c("number", "matrix")]
#> $number
#> [1] 1 2 3 4 5 6
#> 
#> $matrix
#>      [,1] [,2] [,3] [,4]
#> [1,]    1    4    7   10
#> [2,]    2    5    8   11
#> [3,]    3    6    9   12
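
To check which names are available, you can call names() on the list. The sketch below also shows that a name stored in a variable works with [ ] and [[ ]] but not with $. The list demo_list and the variable wanted are hypothetical names used only for this illustration.

```r
demo_list <- list(number = 1:6, character = c("sheep", "pig"))

names(demo_list)        # the names available for subsetting

wanted <- "character"   # an element name stored in a variable
demo_list[wanted]       # a sublist (still a list)
demo_list[[wanted]]     # the element itself

# $ looks for an element literally called "wanted", which does not exist
demo_list$wanted
```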

3.5.3 List inside a list

One interesting aspect of the list type is that you can have lists inside a list.

my_big_list <- list(small_list = my_list, number = dig_num)
my_big_list
#> $small_list
#> $small_list$number
#> [1] 1 2 3 4 5 6
#> 
#> $small_list$character
#> [1] "sheep"  "pig"    "monkey" "pig"    "monkey" "pig"   
#> 
#> $small_list$matrix
#>      [,1] [,2] [,3] [,4]
#> [1,]    1    4    7   10
#> [2,]    2    5    8   11
#> [3,]    3    6    9   12
#> 
#> $small_list$factor
#> [1] Excellent Good      Average   Fair      Good      Excellent
#> Levels: Fair < Average < Good < Excellent
#> 
#> 
#> $number
#> [1] 1 2 3 4 5 6

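If you want to extract ani_char from my_big_list, you need to use two layers of extraction operations. Note that $ must be followed by the element name (character), not by the name of the variable that was used to build the list (ani_char).

my_big_list$small_list$character
#> [1] "sheep"  "pig"    "monkey" "pig"    "monkey" "pig"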
my_big_list[[1]][[2]]
#> [1] "sheep"  "pig"    "monkey" "pig"    "monkey" "pig"
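
Extraction by name depends on the element names, not on the names of the variables used to build the list. The sketch below illustrates this with two small hypothetical lists, inner and outer; asking for a name that no element has returns NULL rather than an error.

```r
inner <- list(number = 1:3, character = c("sheep", "pig"))
outer <- list(small_list = inner, number = 4:6)

by_name <- outer[["small_list"]][["character"]]   # names at both levels
by_dollar <- outer$small_list$character           # same element via $
not_found <- outer$small_list$ani_char            # no element has this name

identical(by_name, by_dollar)
is.null(not_found)
```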

You can further put my_big_list inside another list.

my_big_big_list <- list(big_list = my_big_list, character = ani_char)
my_big_big_list
#> $big_list
#> $big_list$small_list
#> $big_list$small_list$number
#> [1] 1 2 3 4 5 6
#> 
#> $big_list$small_list$character
#> [1] "sheep"  "pig"    "monkey" "pig"    "monkey" "pig"   
#> 
#> $big_list$small_list$matrix
#>      [,1] [,2] [,3] [,4]
#> [1,]    1    4    7   10
#> [2,]    2    5    8   11
#> [3,]    3    6    9   12
#> 
#> $big_list$small_list$factor
#> [1] Excellent Good      Average   Fair      Good      Excellent
#> Levels: Fair < Average < Good < Excellent
#> 
#> 
#> $big_list$number
#> [1] 1 2 3 4 5 6
#> 
#> 
#> $character
#> [1] "sheep"  "pig"    "monkey" "pig"    "monkey" "pig"

To extract dig_num from my_big_big_list, you now need three layers of extraction operations.

my_big_big_list$big_list$small_list[[1]]
#> [1] 1 2 3 4 5 6
my_big_big_list[[1]][[1]][[1]]
#> [1] 1 2 3 4 5 6
my_big_big_list[[1]]$small_list[[1]]  #mix the two ways for value extraction
#> [1] 1 2 3 4 5 6

3.5.4 Unlist a list

Sometimes, you may want to unlist a list into a vector.

unlist(my_list)
#>    number1    number2    number3    number4    number5    number6 character1 
#>        "1"        "2"        "3"        "4"        "5"        "6"    "sheep" 
#> character2 character3 character4 character5 character6    matrix1    matrix2 
#>      "pig"   "monkey"      "pig"   "monkey"      "pig"        "1"        "2" 
#>    matrix3    matrix4    matrix5    matrix6    matrix7    matrix8    matrix9 
#>        "3"        "4"        "5"        "6"        "7"        "8"        "9" 
#>   matrix10   matrix11   matrix12    factor1    factor2    factor3    factor4 
#>       "10"       "11"       "12"        "4"        "3"        "2"        "1" 
#>    factor5    factor6 
#>        "3"        "4"
c(my_list[[1]], my_list[[2]], my_list[[3]], my_list[[4]])  #reproduce the result
#>  [1] "1"      "2"      "3"      "4"      "5"      "6"      "sheep"  "pig"   
#>  [9] "monkey" "pig"    "monkey" "pig"    "1"      "2"      "3"      "4"     
#> [17] "5"      "6"      "7"      "8"      "9"      "10"     "11"     "12"    
#> [25] "4"      "3"      "2"      "1"      "3"      "4"

As you can see from the result, the unlist() function operates in the following steps:

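It visits each element of the list sequentially, following the indices. Each element is flattened into a plain vector: attributes such as the dimensions of a matrix are dropped, and, as the output above shows, a factor is reduced to its integer codes.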
The resulting vectors are then combined into a single, longer vector using the c() function. During this step, coercion rules apply.
The names of the final vector are formed by combining the name of each list element with the index of the corresponding value within that element.
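
The steps above can be traced by hand on a small list. This is only a sketch of the observable behavior, not of how unlist() is implemented internally; the list small and the vector manual are hypothetical names.

```r
small <- list(num = 1:2, chr = c("a", "b"))

# Steps 1 and 2: take the elements in order and combine them with c();
# the integers are coerced to character
manual <- c(small[[1]], small[[2]])

# Step 3: build each name from the element name plus the position
# of the value within that element
names(manual) <- c(paste0("num", 1:2), paste0("chr", 1:2))

manual
identical(manual, unlist(small))
```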

Here, my_list doesn’t contain an element that is also a list. When a list contains another list as one of its elements, the unlist() function will also apply the unlist operation on the element by default. In fact, the unlist operation will be applied recursively until none of the elements is a list. Please try to run the following example.

unlist(my_big_big_list)

If you just want to unlist a list at the first level, you can add the argument recursive = FALSE to the unlist() function. Note that the result will still be a list if the original list contains a list. You can check the result of the following code.

unlist(my_big_big_list, recursive = FALSE)
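
For a more compact comparison of the two modes, here is a sketch on a small nested list. The list nested and its element names are assumptions made for illustration.

```r
nested <- list(a = 1:2, b = list(c = "x", d = list(e = TRUE)))

flat_all <- unlist(nested)                      # recursive by default
flat_one <- unlist(nested, recursive = FALSE)   # flatten one level only

is.list(flat_all)   # FALSE: everything ends up in one atomic vector
is.list(flat_one)   # TRUE: the innermost list d survives as an element
str(flat_one)
```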

3.5.5 Apply functions for each element of a list

It is often useful to apply a function to each element of a list. To do that, you can use the lapply() function (short for list apply). Let’s look at an example where we want to get the length of each element of my_list.

lapply(my_list, length)
#> $number
#> [1] 6
#> 
#> $character
#> [1] 6
#> 
#> $matrix
#> [1] 12
#> 
#> $factor
#> [1] 6

You can see that the default output of lapply() is another list, which holds the result of the function applied to each element of the original list. In this application, it might be better to use a vector to represent the result. To simplify this process, you can use the sapply() function, a user-friendly wrapper of lapply() that returns a vector or a matrix when appropriate.

sapply(my_list, length)
#>    number character    matrix    factor 
#>         6         6        12         6

Let’s look at another example where we want to compute the quantiles of each element of a list.

my_num_list <- list(a = 1:20, b = rep(c(TRUE, FALSE), c(6, 3)))
sapply(my_num_list, quantile)
#>          a b
#> 0%    1.00 0
#> 25%   5.75 0
#> 50%  10.50 1
#> 75%  15.25 1
#> 100% 20.00 1
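
What counts as "appropriate" depends on the lengths of the individual results. The following sketch, using a hypothetical list num_list, contrasts a case where sapply() simplifies to a matrix with one where it falls back to returning a list.

```r
num_list <- list(a = 1:3, b = 1:5)

# Every result has length 2, so the results are bound into a matrix
same_len <- sapply(num_list, range)

# The results have lengths 1 and 3, so no simplification is possible
diff_len <- sapply(num_list, function(x) x[x > 2])

is.matrix(same_len)
is.list(diff_len)
```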

3.5.6 List matrix

In Section 3.1, we learned that matrices are 2-dimensional objects containing elements of the same type. A natural extension of the matrix to the case where different elements can be of different types is called a list matrix.

Let’s create a list matrix from my_list. Since my_list has 4 elements, you can use the matrix() function to create a $2\times 2$ matrix of elements.

my_list_mat <- matrix(my_list, 2, 2)
my_list_mat
#>      [,1]        [,2]      
#> [1,] integer,6   integer,12
#> [2,] character,6 ordered,6

To extract an element from a list matrix, you can use the pair of [[ and ]] with the indices inside. As with list subsetting, using the pair of [ and ] will result in a list of length 1.

my_list_mat[[1, 2]]
#>      [,1] [,2] [,3] [,4]
#> [1,]    1    4    7   10
#> [2,]    2    5    8   11
#> [3,]    3    6    9   12
my_list_mat[1, 2]
#> [[1]]
#>      [,1] [,2] [,3] [,4]
#> [1,]    1    4    7   10
#> [2,]    2    5    8   11
#> [3,]    3    6    9   12

Similar to combining vectors into matrices in Section 3.1.2, you can also combine lists into a list matrix using the rbind() or cbind() functions.

l1 <- list(num = 1:4, mat = matrix(1:6, 2, 3))
l2 <- list(char = letters[1:5], my_list = list(num = 9:1, char = LETTERS[1:7]))
list_mat_r <- rbind(l1, l2)
list_mat_c <- cbind(l1, l2)
str(list_mat_r)
#> List of 4
#>  $ : int [1:4] 1 2 3 4
#>  $ : chr [1:5] "a" "b" "c" "d" ...
#>  $ : int [1:2, 1:3] 1 2 3 4 5 6
#>  $ :List of 2
#>   ..$ num : int [1:9] 9 8 7 6 5 4 3 2 1
#>   ..$ char: chr [1:7] "A" "B" "C" "D" ...
#>  - attr(*, "dim")= int [1:2] 2 2
#>  - attr(*, "dimnames")=List of 2
#>   ..$ : chr [1:2] "l1" "l2"
#>   ..$ : chr [1:2] "num" "mat"
typeof(list_mat_r)
#> [1] "list"

You can see from the result that list_mat_r is still stored as a list, but with an additional "dim" attribute (and, here, a "dimnames" attribute). This relationship is very similar to that between a vector and the matrix created from it.
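
To make the analogy with vectors concrete, the sketch below turns an ordinary list into a list matrix simply by attaching a "dim" attribute. The name plain_list and its contents are assumptions chosen for illustration.

```r
plain_list <- list(1:3, "a", TRUE, 2.5)

# Attach dimensions; the elements are arranged column by column
dim(plain_list) <- c(2, 2)

is.list(plain_list)     # still a list underneath
is.matrix(plain_list)   # and now also a matrix
plain_list[[2, 1]]      # the second element of the original list
plain_list[[1, 2]]      # the third element of the original list
```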

3.5.7 Exercises

Consider the following list.

dig_num <- 1:6
ani_char <- c("sheep", "pig", "monkey", "pig", "monkey")
x_mat <- matrix(1:12, nrow = 3, ncol = 4)
my_list <- list(num = dig_num, char = ani_char, mat = x_mat)
my_list
#> $num
#> [1] 1 2 3 4 5 6
#> 
#> $char
#> [1] "sheep"  "pig"    "monkey" "pig"    "monkey"
#> 
#> $mat
#>      [,1] [,2] [,3] [,4]
#> [1,]    1    4    7   10
#> [2,]    2    5    8   11
#> [3,]    3    6    9   12

What’s the difference between my_list[2] and my_list[3]?
What’s the difference between my_list[2:3] and my_list[[2:3]]?
Use the sapply() function to get the length of each element in my_list.
Multiply the 3rd row of mat in my_list by 10. Then, check the new value of my_list.