2.8 Set Operations between two vectors

In Section 2.7, we introduced logical operators, which operate on two logical vectors. In this section, we discuss set operations between two vectors of the same type. While logical operators only make sense for logical vectors, set operations work for all three kinds of vectors: numeric, character, and logical. We introduce the operations one by one.

2.8.1 Numeric vectors

Let’s start with numeric vectors. First, create two numeric vectors x and y.

x <- c(1, 2, 1, 3, 1)
y <- c(1, 1, 3, 4, 4, 5)

a. Intersection
To get values in both x and y, you can use the intersect() function.

intersect(x, y)
#> [1] 1 3

Note that although x contains the value 1 three times and y contains it twice, their intersection contains 1 only once: only the unique values are retained in the output.
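The deduplication described above can be reproduced by hand. The following is a minimal sketch, not the internal implementation of intersect(); the name manual_intersect is introduced here only for illustration.

```r
x <- c(1, 2, 1, 3, 1)
y <- c(1, 1, 3, 4, 4, 5)

# Keep the elements of x that also appear in y, then drop repeated values
manual_intersect <- unique(x[x %in% y])
manual_intersect
#> [1] 1 3

# The hand-built result matches intersect()
identical(manual_intersect, intersect(x, y))
#> [1] TRUE
```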

b. Union
To get values in either x or y, you can use the union() function.

union(x, y)
#> [1] 1 2 3 4 5

Again, only one copy of each value is retained in the output.
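One way to picture union() is as combining the two vectors and then removing repeated values. The sketch below illustrates that idea under this assumption; manual_union is a hypothetical name used only here.

```r
x <- c(1, 2, 1, 3, 1)
y <- c(1, 1, 3, 4, 4, 5)

# Combine both vectors, then keep one copy of each value
manual_union <- unique(c(x, y))
manual_union
#> [1] 1 2 3 4 5

identical(manual_union, union(x, y))
#> [1] TRUE
```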

c. Set difference
To get values in x but not in y, you can use the setdiff() function. Notice that x is the first argument and y is the second.

setdiff(x, y)
#> [1] 2

The result is 2. You may wonder why the result is not 1 2, since x contains one more 1 than y does. The reason is that setdiff() treats x and y as sets: it works with the unique values of x and y, and returns the unique values of x that do not appear in y.

Similarly, if you want the values in y but not in x, y should be the first argument and x the second.

setdiff(y, x)
#> [1] 4 5
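To make the role of the unique values concrete, the two set differences above can be rebuilt with unique() and %in%. This is an illustrative sketch rather than the way setdiff() is actually written; ux and uy are names introduced here.

```r
x <- c(1, 2, 1, 3, 1)
y <- c(1, 1, 3, 4, 4, 5)

ux <- unique(x)  # 1 2 3
uy <- unique(y)  # 1 3 4 5

# Unique values of x that are not in y
ux[!(ux %in% uy)]
#> [1] 2

# Unique values of y that are not in x
uy[!(uy %in% ux)]
#> [1] 4 5
```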

d. Set equality
To check whether the two vectors x and y contain the same set of values, you can use the setequal() function.

setequal(x, y)
#> [1] FALSE

The result is FALSE because x contains the value 2, which y does not have.

Similar to setdiff(), setequal() compares the unique values of the two vectors and ignores how many times each value appears. For example, the following call returns TRUE:

setequal(c(1, 1, 2), c(1, 2))
#> [1] TRUE
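Set equality can also be expressed with membership tests: two vectors hold the same set of values when every element of each one appears in the other. The helper same_values() below is a hypothetical function written for this sketch, not part of R.

```r
# Hypothetical helper: TRUE when u and v contain the same set of values
same_values <- function(u, v) {
  all(u %in% v) && all(v %in% u)
}

same_values(c(1, 1, 2), c(1, 2))
#> [1] TRUE

same_values(c(1, 2, 1, 3, 1), c(1, 1, 3, 4, 4, 5))
#> [1] FALSE
```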

e. Membership determination
To check whether each element of x is inside y, you can use the is.element() function or the %in% operator.

is.element(x, y)
#> [1]  TRUE FALSE  TRUE  TRUE  TRUE
x %in% y
#> [1]  TRUE FALSE  TRUE  TRUE  TRUE

In this example, the result is a logical vector of length 5, the same length as x. The first element of x is 1, and y also contains the value 1, so the first element of the result is TRUE. The second element of x is 2, but y does not contain 2, so the second element is FALSE. You can verify the remaining elements yourself.

The order of the arguments matters for membership determination: if you put y before x, you check whether each element of y is in x.

is.element(y, x)
#> [1]  TRUE  TRUE  TRUE FALSE FALSE FALSE
y %in% x
#> [1]  TRUE  TRUE  TRUE FALSE FALSE FALSE
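Because %in% returns one logical value per element, its result can be used to subset a vector. Unlike intersect() and setdiff(), this keeps repeated values. A short sketch of the contrast:

```r
x <- c(1, 2, 1, 3, 1)
y <- c(1, 1, 3, 4, 4, 5)

# Elements of x that appear in y; the repeated 1s are kept
x[x %in% y]
#> [1] 1 1 3 1

# Elements of y that do not appear in x; both 4s are kept
y[!(y %in% x)]
#> [1] 4 4 5

# setdiff() returns each value only once
setdiff(y, x)
#> [1] 4 5
```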

The following table summarizes the set operations between x and y.

Operation                   Code
Intersection                intersect(x, y)
Union                       union(x, y)
Set Difference              setdiff(x, y)
Set Equality                setequal(x, y)
Membership Determination    is.element(x, y) or x %in% y

2.8.2 Character vectors

By now you should be familiar with all five operations. As with numeric vectors, you can apply them to character vectors. Here is some code that you can run yourself.

a <- c("sheep", "monkey", "sheep", "chicken", "dragon")
b <- c("sheep", "pig", "pig")
intersect(a, b)
union(a, b)
setdiff(a, b)
setequal(a, b)
a %in% b
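After running the code, you may want to check your results. The sketch below does so with stopifnot(), which stays silent when every condition holds and stops with an error otherwise.

```r
a <- c("sheep", "monkey", "sheep", "chicken", "dragon")
b <- c("sheep", "pig", "pig")

# Each check passes silently if the stated result is correct
stopifnot(
  identical(intersect(a, b), "sheep"),
  identical(union(a, b), c("sheep", "monkey", "chicken", "dragon", "pig")),
  identical(setdiff(a, b), c("monkey", "chicken", "dragon")),
  identical(setequal(a, b), FALSE),
  identical(a %in% b, c(TRUE, FALSE, TRUE, FALSE, FALSE))
)
```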

2.8.3 Logical vectors

Of course you can also apply set operations on logical vectors.

# Logical values are written without quotes; "T" would be a character string
p <- c(TRUE, FALSE, FALSE, TRUE)
q <- c(TRUE, TRUE, TRUE)
intersect(p, q)
union(p, q)
setdiff(p, q)
setequal(p, q)
p %in% q
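Keep in mind that logical values are written as TRUE and FALSE without quotation marks, whereas a quoted "T" is a character string. The sketch below contrasts the two; the names lgl and chr are introduced only for this illustration.

```r
lgl <- c(TRUE, FALSE, FALSE, TRUE)  # logical vector
chr <- c("T", "F", "F", "T")        # character vector

class(lgl)
#> [1] "logical"
class(chr)
#> [1] "character"

# Set operations return a result of the same type as their inputs
union(lgl, c(TRUE, TRUE, TRUE))
#> [1]  TRUE FALSE
union(chr, c("T", "T", "T"))
#> [1] "T" "F"
```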

2.8.4 Exercises

Consider the vector s1 <- seq(from = 1, to = 100, length.out = 7).

  1. Compare s1 to 50 to see whether the values of s1 are greater than 50, then assign the result to s2. Compare s1 to 80 to see whether the values of s1 are less than or equal to 80, then assign the result to s3.

  2. Use two methods (logical operators and set operations) to find the subvector of s1 with values greater than 50 and less than or equal to 80.

  3. For x <- 1:200, use two methods (logical operators and set operations) to find the subvector of x that is divisible by 7, but not divisible by 2.