2.6 Numeric Vectors: Creating numeric vectors with patterns

In Section 2.1, you learned how to create numeric vectors of length greater than 1 with the c() function. When the number of elements becomes large, typing every element by hand is tedious and error-prone. Luckily, as long as the elements follow a pattern, R lets you create the desired numeric vector more efficiently.

2.6.1 The colon operator : – Equally-spaced numeric vectors with increment of 1 or -1

An equally-spaced numeric vector is a numeric vector with the same distance between adjacent elements. Suppose you want to assign consecutive integers from 1 to 5 to the name pattern1. Coming from Section 2.1, you may do the following:

pattern1 <- c(1, 2, 3, 4, 5)

However, when the number of elements grows from 5 consecutive integers to 100, typing them in manually is time-consuming and inefficient. In R, you can use the colon operator : to create equally-spaced numeric vectors. Note that you don’t need to use : together with c().

pattern2 <- 1:5

You can interpret 1:5 as “a numeric vector starting from 1 and ending at 5, with an increment of 1 between adjacent elements”. The colon operator is convenient because it can create a run of consecutive integers of any length.
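As a small illustration of the point above, the following sketch builds the 100 consecutive integers mentioned earlier. The name long_pattern is only an illustrative choice, and head() and tail() are used here simply to peek at both ends of the vector.

```r
# Consecutive integers from 1 to 100 without typing them out
long_pattern <- 1:100

# Number of elements in the vector
length(long_pattern)
#> [1] 100

# First six elements
head(long_pattern)
#> [1] 1 2 3 4 5 6

# Last three elements
tail(long_pattern, 3)
#> [1]  98  99 100
```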

If you compare pattern1 and pattern2, you will see that both are numeric vectors containing the consecutive integers from 1 to 5. Let’s experiment with : a bit more.

pattern3 <- 5:1
pattern3
#> [1] 5 4 3 2 1
pattern4 <- 1.2:5.2
pattern4
#> [1] 1.2 2.2 3.2 4.2 5.2

Here, pattern3 is a numeric vector containing the consecutive integers from 5 down to 1, and pattern4 is a numeric vector containing the decimal values from 1.2 to 5.2 in steps of 1.

However, as you can see below, the colon operator produces a vector of class integer when the start value is a whole number, and a vector of class numeric (stored as type double) when the start value is a decimal.

class(pattern3)
#> [1] "integer"
class(pattern4)
#> [1] "numeric"
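To make the storage difference more concrete, here is a minimal sketch that compares pattern1 and pattern2 from above. It uses typeof(), which reports the internal storage type, and identical(), which also takes the type into account; neither function has been introduced in this section, so treat them as an aside.

```r
pattern1 <- c(1, 2, 3, 4, 5)
pattern2 <- 1:5

# The values are equal element by element
pattern1 == pattern2
#> [1] TRUE TRUE TRUE TRUE TRUE

# But c() with plain numbers stores doubles, while : stores integers
typeof(pattern1)
#> [1] "double"
typeof(pattern2)
#> [1] "integer"

# identical() is stricter than == and notices the type difference
identical(pattern1, pattern2)
#> [1] FALSE

# A decimal start value gives a double vector
typeof(1.2:5.2)
#> [1] "double"
```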

2.6.2 The seq() function – Equally-spaced numeric vectors with any increment

As powerful as the colon operator : is, it is restricted to creating equally-spaced numeric vectors with an increment of 1 or -1. By contrast, the seq() function has no such restriction, and it comes into play when you want to create equally-spaced numeric vectors with other increments.

a. Create sequences with the by argument

The seq() function accepts the arguments from, to, and by. The from and to arguments specify the start value and the end limit, respectively, while the by argument specifies the increment of the sequence.

seq(from = 1, to = 5, by = 1)
#> [1] 1 2 3 4 5
seq(to = 5)
#> [1] 1 2 3 4 5

You can interpret the above example as “an equally-spaced numeric vector starting at 1 and ending at 5, with the increment between adjacent elements of 1”. If you don’t specify the optional from and by arguments, seq() will use the default value 1 for both arguments.

You now have four ways to create a vector of consecutive integers.

c(1, 2, 3, 4, 5, 6)  # write all numbers down
1:6  # use the colon operator
seq(from = 1, to = 6, by = 1)  # use seq() with from, to, and by
seq(to = 6)  # use seq() with the defaults for from and by

Next, let’s assign a different value to the by argument in the seq() function. You will get a numeric vector with 1 3 5 as its values.

seq(from = 1, to = 5, by = 2)
#> [1] 1 3 5
seq(from = 1, to = 6, by = 2)
#> [1] 1 3 5

Note that the last value of the sequence doesn’t always match the value assigned to the to argument. If you change the end limit from 5 to 6, you still get the same sequence, because the next value in the sequence would be 7, which exceeds the limit of 6. This is why to specifies a limit that the sequence cannot pass, not necessarily its last value.
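The following sketch illustrates this behavior by checking the largest value actually produced and by raising the limit to 7. The name s6 is only an illustrative choice.

```r
# With a limit of 6 and a step of 2, the sequence stops at 5
s6 <- seq(from = 1, to = 6, by = 2)
max(s6)
#> [1] 5

# Raising the limit to 7 lets the sequence include one more value
seq(from = 1, to = 7, by = 2)
#> [1] 1 3 5 7
```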

Of course, you can assign decimal numbers to the arguments in the seq() function. Below, you will first get a sequence that starts at 1 and increases by 0.5 each time until it reaches 6. Subsequently, you will get a numeric vector with 1.1 1.8 2.5 3.2 as its values.

seq(from = 1, to = 6, by = 0.5)
#>  [1] 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0
seq(from = 1.1, to = 3.2, by = 0.7)
#> [1] 1.1 1.8 2.5 3.2

Besides ascending sequences, you can also create descending sequences via seq() by using a negative value in the by argument.

seq(from = 1.5, to = -1, by = -0.5)
#> [1]  1.5  1.0  0.5  0.0 -0.5 -1.0
seq(from = -3.5, to = -3, by = 0.1)
#> [1] -3.5 -3.4 -3.3 -3.2 -3.1 -3.0
seq(from = 1.5, to = -1, by = 0.5)
#> Error in seq.default(from = 1.5, to = -1, by = 0.5): wrong sign in 'by' argument

It’s worth noting that the arguments in the seq() function should make algebraic sense. In the last example above, the start value is bigger than the end limit, which implies a decreasing sequence, so a positive value in the by argument cannot work. In such cases, you will get an error message telling you that the sign in the by argument is wrong.
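One possible way to avoid this error, sketched below under the assumption that you know the size of the step but not the direction, is to derive the sign of by from the two end points with sign(). The names from_value, to_value, and step_size are hypothetical and only used for illustration.

```r
from_value <- 1.5
to_value <- -1
step_size <- 0.5  # size of the increment, kept positive here

# sign(to_value - from_value) is -1 for a decreasing sequence
# and 1 for an increasing one
seq(from = from_value, to = to_value,
    by = sign(to_value - from_value) * step_size)
#> [1]  1.5  1.0  0.5  0.0 -0.5 -1.0
```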

b. Create sequences with the length.out argument

In seq(), instead of setting the increment via the by argument, you can specify the length.out argument, which creates an equally-spaced sequence with the given length (that is, the number of elements in the numeric vector). R will automatically calculate the difference between two neighboring elements from the values of the from, to, and length.out arguments.

seq(from = 1, to = 5, length.out = 9)
#> [1] 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0

Here, you will get an equally-spaced sequence of length 9 from 1 to 5. You can also create a decreasing sequence by using the length.out argument.

seq(from = 5, to = -5, length.out = 9)
#> [1]  5.00  3.75  2.50  1.25  0.00 -1.25 -2.50 -3.75 -5.00
seq(from = 5, to = -5, length.out = 8.9)
#> [1]  5.00  3.75  2.50  1.25  0.00 -1.25 -2.50 -3.75 -5.00

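Unlike sequences created with the by argument, a sequence created with the length.out argument starts and ends exactly at the values given in from and to. Also note that a fractional length.out is rounded up, which is why length.out = 8.9 gives the same sequence of length 9 in the second example.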
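The increment that R works out can be checked by hand: it is the distance between the end points divided by the number of gaps, (to - from) / (length.out - 1). The sketch below verifies this for the first example, using diff() to show the differences between neighboring elements; the name by_implied is only illustrative.

```r
# Increment implied by from = 1, to = 5, length.out = 9
by_implied <- (5 - 1) / (9 - 1)
by_implied
#> [1] 0.5

# Differences between neighboring elements of the sequence
diff(seq(from = 1, to = 5, length.out = 9))
#> [1] 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5

# A fractional length.out still yields a whole number of elements
length(seq(from = 5, to = -5, length.out = 8.9))
#> [1] 9
```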

c. Create sequences with both the by and length.out arguments

Lastly, if you provide both the by and length.out arguments, only one of the from and to arguments is necessary. With one value (the start value or the end value) fixed, seq() will create a vector with the specified increment and length.

If you only have the from argument, you will get a sequence starting from the value you set, with the increment in the by argument, until you get a sequence with a length specified in the length.out argument.

If you only have the to argument, you will get a sequence ending with the value you set, with the increment in the by argument, until you get a sequence with a length specified in the length.out argument.

seq(from = 1, by = 2, length.out = 5)
#> [1] 1 3 5 7 9
seq(to = 1, by = 2, length.out = 5)
#> [1] -7 -5 -3 -1  1
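The missing end point in each case can be worked out by hand: the sequence covers length.out - 1 steps of size by. The sketch below checks this arithmetic against the two examples above; the names from_fixed and to_fixed are only illustrative.

```r
# from is fixed: the last element is from + by * (length.out - 1)
from_fixed <- seq(from = 1, by = 2, length.out = 5)
from_fixed[length(from_fixed)]
#> [1] 9
1 + 2 * (5 - 1)
#> [1] 9

# to is fixed: the first element is to - by * (length.out - 1)
to_fixed <- seq(to = 1, by = 2, length.out = 5)
to_fixed[1]
#> [1] -7
1 - 2 * (5 - 1)
#> [1] -7
```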

Before leaving the seq() function, note that you should provide at most three of the four arguments from, to, by, and length.out. For example, you will see an error when running the following example, since all four arguments are specified.

seq(from = 1, to = 3, by = 1, length.out = 3)
#> Error in seq.default(from = 1, to = 3, by = 1, length.out = 3): too many arguments

2.6.3 The sequence generator seq_along() – Matching numeric vectors

Next, let’s look at seq_along(), a function closely related to seq(). First, create a numeric vector named extend.

extend <- seq(from = 2, to = 8, length.out = 7)
extend
#> [1] 2 3 4 5 6 7 8

By the value assigned to the length.out argument, you know that the length of extend is 7. You also know that 2 3 4 5 6 7 8 are the elements of extend. Next, let’s put extend in seq_along().

seq_along(extend)
#> [1] 1 2 3 4 5 6 7

As you can tell from the result, seq_along() takes a vector as its argument and generates the consecutive integers from 1 to the length of that vector. The seq_along() function is commonly used when writing loops, which will be covered later.

You can also use 1:length(extend) to get the same result as seq_along(extend). However, seq_along() is more efficient, and it is safer when the input vector is empty. If extend were an empty vector, 1:length(extend) would evaluate to 1:0, which is a numeric vector with the values 1 0. In contrast, seq_along() correctly returns a vector with no elements.

1:length(extend)
#> [1] 1 2 3 4 5 6 7
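The empty-vector case described above can be reproduced with a short sketch. Here the name empty is only illustrative, and numeric(0) is used as one way to create a numeric vector of length 0.

```r
# A numeric vector with no elements
empty <- numeric(0)
length(empty)
#> [1] 0

# 1:length(empty) becomes 1:0, which counts down from 1 to 0
1:length(empty)
#> [1] 1 0

# seq_along() returns an empty integer vector instead
seq_along(empty)
#> integer(0)
```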

2.6.4 The rep() function – Numeric vectors with repeated values

The rep() function is designed for another type of patterned numeric vector: one whose elements are repeated values. It works by repeating the first argument the number of times indicated in the second argument. Suppose you want to assign four 2s to the name num1. You can use the rep() function here:

num1 <- rep(2, 4)
num1
#> [1] 2 2 2 2

You can interpret the function call as “creating a numeric vector in which 2 is repeated four times”. The resulting vector is a length-4 numeric vector in which every element has the value 2. Besides a single value, the first argument can also be a numeric vector with several values.

num2 <- rep(c(1, 4, 2), 3)
num2
#> [1] 1 4 2 1 4 2 1 4 2

Here, rep() repeats the whole vector c(1, 4, 2) three times. Note that the vector is repeated as a whole, not element by element. You may be wondering what happens if the second argument also has several elements. Let’s try it together.

num3 <- rep(c(1, 5, 7), c(3, 2, 1))
num3
#> [1] 1 1 1 5 5 7

In rep(), when the second argument is also a numeric vector, R repeats each element of the first argument the number of times given at the corresponding position of the second argument, and then combines the repeated values into a single numeric vector. In this example, 1 is repeated 3 times, 5 is repeated twice, and 7 is repeated once. It is equivalent to

c(rep(1, 3), rep(5, 2), rep(7, 1))
#> [1] 1 1 1 5 5 7

When you supply numeric vectors with several elements as both arguments in rep(), the two vectors must have the same length. Otherwise, you will see an error message telling you that the second argument (the times argument) is invalid.

num4 <- rep(c(1, 5, 7), c(3, 2))
#> Error in rep(c(1, 5, 7), c(3, 2)): invalid 'times' argument

Last but not least, you can also use the each and times arguments together in rep(). The former specifies how many times each element of the vector is repeated within one round, while the latter determines how many rounds are performed.

rep(c(1, 2), each = 3, times = 3)
#>  [1] 1 1 1 2 2 2 1 1 1 2 2 2 1 1 1 2 2 2
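To see what each argument contributes on its own, the sketch below applies each and times separately and then checks that nesting the two calls reproduces the combined call above. The names inner and combined are only illustrative.

```r
# each alone: every element is repeated 3 times in place
rep(c(1, 2), each = 3)
#> [1] 1 1 1 2 2 2

# times alone: the whole vector is repeated 3 times
rep(c(1, 2), times = 3)
#> [1] 1 2 1 2 1 2

# each is applied first, then the result is repeated times times
inner <- rep(c(1, 2), each = 3)
combined <- rep(c(1, 2), each = 3, times = 3)
identical(rep(inner, times = 3), combined)
#> [1] TRUE
```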

2.6.5 The unique() and table() functions – Getting unique elements and their frequencies

So far, you have learned how to create vectors with specific patterns. Sometimes, you may want to inspect the unique elements (elements with different values) in a numeric vector and their corresponding frequencies. Let’s use num3 as an example.

num3
#> [1] 1 1 1 5 5 7

You can use unique() to show all unique elements in a vector.

unique(num3)
#> [1] 1 5 7

From the result, you know the unique elements in num3 are 1, 5, and 7. To get the frequency of each element, you can use the table() function.

table(num3)
#> num3
#> 1 5 7 
#> 3 2 1

Here, the first row is the numeric vector’s name. The second row shows all unique elements, and the third row is their corresponding frequencies. In num3, there are three 1s, two 5s and one 7.
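If you want to work with these pieces separately rather than read them off the printout, one possible approach is sketched below. It recreates num3 so the block runs on its own; the name freq is only illustrative, and names(), as.vector(), and the [[ ]] lookup are used as an aside rather than as part of this section.

```r
num3 <- rep(c(1, 5, 7), c(3, 2, 1))
freq <- table(num3)

# The unique elements, stored as character labels
names(freq)
#> [1] "1" "5" "7"

# The frequencies as a plain vector
as.vector(freq)
#> [1] 3 2 1

# The frequency of one particular value, looked up by its label
freq[["5"]]
#> [1] 2

# The number of unique elements
length(unique(num3))
#> [1] 3
```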

2.6.6 Exercises

  1. Use five different ways to create an equally-spaced sequence with 2 4 6 8 10 as result.

  2. Use two different ways to create the patterned sequence with 2 2 6 6 8 8 2 2 6 6 8 8 as result.

  3. Use two different ways to create a numeric vector with 1 2 3 1 2 3 4 5 1 2 3 4 5 6 7 as result. Show the unique elements and their corresponding frequencies.