2.8 Set Operations between two vectors
In Section 2.7, we introduced logical operators, which act on two logical vectors. In this section, we discuss set operations between two vectors of the same type. While the logical operators only make sense for logical vectors, the operations introduced here work for all three kinds of vectors: numeric, character, and logical. We will go through the operations one by one.
2.8.1 Numeric vectors
Let’s start with numeric vectors. First, let’s create two numeric vectors x and y.
x <- c(1, 2, 1, 3, 1)
y <- c(1, 1, 3, 4, 4, 5)
a. Intersection
To get the values that appear in both x and y, you can use the intersect() function.
intersect(x, y)
#> [1] 1 3
Note that although x has three elements equal to 1 and y has two, their intersection contains only one 1, showing that only the unique elements are retained in the output.
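The effect of dropping duplicates can be checked by hand. The sketch below assumes the same x and y as above and uses the base R function unique(), which is not part of the original walkthrough. It illustrates the behaviour and is not meant to describe how intersect() is implemented internally.

```r
x <- c(1, 2, 1, 3, 1)
y <- c(1, 1, 3, 4, 4, 5)

# The distinct values of each vector, in order of first appearance
unique(x)
#> [1] 1 2 3
unique(y)
#> [1] 1 3 4 5

# Intersecting the de-duplicated vectors gives the same values as intersect(x, y)
intersect(unique(x), unique(y))
#> [1] 1 3
```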
b. Union
To get the values that appear in either x or y, you can use the union() function.
union(x, y)
#> [1] 1 2 3 4 5
Again, only one copy of each value is retained in the output.
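For plain vectors like these, the result matches what you get by concatenating the two vectors and dropping duplicates. The following is a minimal sketch, assuming the same x and y as above; unique() is a base R function that the text has not used so far.

```r
x <- c(1, 2, 1, 3, 1)
y <- c(1, 1, 3, 4, 4, 5)

# Concatenate, then drop duplicates, keeping each value's first appearance
unique(c(x, y))
#> [1] 1 2 3 4 5

# Swapping the arguments changes the order of the result, not its values
union(y, x)
#> [1] 1 3 4 5 2
```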
c. Set difference
To get the values that are in x but not in y, you can use the setdiff() function. Notice that x is the first argument and y is the second in this situation.
setdiff(x, y)
#> [1] 2
The result is 2. You may wonder why the result is not 1 2, since x contains one more 1 than y. That’s because setdiff() first reduces x and y to their unique elements and then finds the set difference between them.
Similarly, if you want the values that are in y but not in x, y should be the first argument and x the second.
setdiff(y, x)
#> [1] 4 5
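The two set differences and the intersection fit together: every distinct value lies either in both vectors, only in x, or only in y. The sketch below assumes the same x and y as above and uses the base R functions unique() and sort() for illustration.

```r
x <- c(1, 2, 1, 3, 1)
y <- c(1, 1, 3, 4, 4, 5)

# Removing duplicates first does not change the set difference
setdiff(unique(x), unique(y))
#> [1] 2

# Shared values plus the two one-sided differences recover the union
sort(c(intersect(x, y), setdiff(x, y), setdiff(y, x)))
#> [1] 1 2 3 4 5
```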
d. Set equality
To check whether the two vectors x and y contain the same set of values, you can use the setequal() function.
setequal(x, y)
#> [1] FALSE
As expected, you get FALSE, since x contains the value 2, which y does not.
Similar to the setdiff() function, the setequal() function works by checking whether the two vectors have the same set of unique values. For example, you will get TRUE in the following example.
setequal(c(1, 1, 2), c(1, 2))
#> [1] TRUE
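Because only the set of unique values matters, neither the order of the elements nor the number of copies affects the answer. The sketch below contrasts this with the stricter base R function identical(), which is brought in here purely for comparison and is not part of the original text.

```r
# Order and duplicates do not matter to setequal()
setequal(c(3, 1, 2), c(1, 2, 3, 3))
#> [1] TRUE

# identical() is stricter: length, order, and duplicates all matter
identical(c(1, 1, 2), c(1, 2))
#> [1] FALSE
identical(c(1, 2), c(1, 2))
#> [1] TRUE
```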
e. Membership determination
To check whether each element of x is inside y, you can use the is.element() function or the %in% operator.
is.element(x, y)
#> [1]  TRUE FALSE  TRUE  TRUE  TRUE
x %in% y
#> [1]  TRUE FALSE  TRUE  TRUE  TRUE
In this example, it returns a logical vector of length 5, the same length as x. The first element of x is 1, and y also has elements with value 1, so the first element of the logical vector is TRUE. The second element of x is 2, but y doesn’t have any elements with value 2, hence the result is FALSE. You can verify the other elements by yourself.
The order of the vectors matters for membership determination: if you put y before x, you check whether each element of y is inside x.
is.element(y, x)
#> [1]  TRUE  TRUE  TRUE FALSE FALSE FALSE
y %in% x
#> [1]  TRUE  TRUE  TRUE FALSE FALSE FALSE
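Since the result is a logical vector with one entry per element of the first vector, it can be used directly for subsetting or counting. The following is a minimal sketch, assuming the same x and y as above. Unlike intersect() and setdiff(), subsetting this way keeps duplicated values.

```r
x <- c(1, 2, 1, 3, 1)
y <- c(1, 1, 3, 4, 4, 5)

# Elements of x that appear in y; duplicates in x are kept
x[x %in% y]
#> [1] 1 1 3 1

# Negate the logical vector to keep the elements of x that are not in y
x[!(x %in% y)]
#> [1] 2

# Count how many elements of x appear in y
sum(x %in% y)
#> [1] 4
```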
The following table summarizes the set operations between x and y.
Operation | Code |
---|---|
Intersection | intersect(x, y) |
Union | union(x, y) |
Set Difference | setdiff(x, y) |
Set Equality | setequal(x, y) |
Membership Determination | is.element(x, y) |
2.8.2 Character vectors
By now you should be familiar with all five operations. As with numeric vectors, you can apply them to character vectors. Here is some code that you can run by yourself.
a <- c("sheep", "monkey", "sheep", "chicken", "dragon")
b <- c("sheep", "pig", "pig")
intersect(a, b)
union(a, b)
setdiff(a, b)
setequal(a, b)
a %in% b
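For reference, the sketch below reruns the five operations with the output you should expect, using the same values for a and b. The rules are the same as for numeric vectors: duplicates are dropped by the set operations, while the membership check returns one value per element of a.

```r
a <- c("sheep", "monkey", "sheep", "chicken", "dragon")
b <- c("sheep", "pig", "pig")

intersect(a, b)
#> [1] "sheep"
union(a, b)
#> [1] "sheep"   "monkey"  "chicken" "dragon"  "pig"
setdiff(a, b)
#> [1] "monkey"  "chicken" "dragon"
setequal(a, b)
#> [1] FALSE
a %in% b
#> [1]  TRUE FALSE  TRUE FALSE FALSE
```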
2.8.3 Logical vectors
Of course you can also apply set operations on logical vectors.
# Calls to c() still work after a vector is named c
c <- c(TRUE, FALSE, FALSE, TRUE)
d <- c(TRUE, TRUE, TRUE)
intersect(c, d)
union(c, d)
setdiff(c, d)
setequal(c, d)
c %in% d
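For reference, here is what the five operations return on two logical vectors built from the constants TRUE and FALSE. The sketch uses the hypothetical names p and q, which do not appear in the original, so that nothing shares a name with the base function c().

```r
p <- c(TRUE, FALSE, FALSE, TRUE)
q <- c(TRUE, TRUE, TRUE)

intersect(p, q)
#> [1] TRUE
union(p, q)
#> [1]  TRUE FALSE
setdiff(p, q)
#> [1] FALSE
setequal(p, q)
#> [1] FALSE
p %in% q
#> [1]  TRUE FALSE FALSE  TRUE
```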
2.8.4 Exercises
1. Consider the vector s1 <- seq(from = 1, to = 100, length.out = 7).
a. Compare s1 to 50 to see whether the values of s1 are bigger than 50, then assign the result to the name s2.
b. Compare s1 to 80 to see whether the values of s1 are less than or equal to 80, then assign the result to the name s3.
c. Use two methods (logical operators and set operations) to find the subvector of s1 with values bigger than 50 and less than or equal to 80.
2. For x <- 1:200, use two methods (logical operators and set operations) to find the subvector of x that is divisible by 7, but not divisible by 2.