This chapter starts by showing how to export data to a file.
First, we introduce an important concept: the working directory. Before exporting or importing data, it is a good idea to set the working directory, because in R we usually interact with files on the computer through paths relative to it. To set the working directory in RStudio, click Session in the menu bar and then click Set Working Directory.
There are three options under this menu.
- To Source File Location: this is the same directory as the current R script.
- To Files Pane Location: this is the same directory as shown in the Files pane on the bottom right of RStudio.
- Choose Directory…: this will open up a window from which you can choose any desired directory.
After you choose one of these options, you will see a setwd() call executed in the console. Indeed, the menu operation is equivalent to calling the setwd() function with the full or relative path of the desired directory as its argument. A related function is getwd(), which returns the absolute path of the current working directory.
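As a minimal sketch of how these two functions work together, the snippet below saves the current working directory, switches to another one, and then switches back. Using tempdir() as the target is only an assumption made so that the example runs anywhere; in practice you would pass the path of your own project folder.

```r
# Remember the current working directory so it can be restored later.
old_wd <- getwd()

# Switch to a directory that is guaranteed to exist.
# tempdir() is a stand-in for your own project folder.
setwd(tempdir())

# getwd() now reports the new working directory as an absolute path.
new_wd <- getwd()
print(new_wd)

# Switch back to where we started.
setwd(old_wd)
```

Saving the old path before calling setwd() makes it easy to undo the change once the file operations are done.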
In most applications, you will use a specific file type called a delimited file. In a delimited file, each row represents a single observation, and the values in a row are separated by a delimiter. In principle, any character (a letter, a digit, or a symbol) can be used as a delimiter, with the most commonly used ones being the following.
|Delimiter|Symbol|Common File Extension|
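To make the idea of a delimiter concrete, here is a small sketch in base R that joins the values of one observation into a single delimited line and then splits the line back into its values. The vector name record and its contents are chosen purely for illustration.

```r
# The values of one observation (illustrative data).
record <- c("7", "sheep", "Excellent")

# Join the values with a comma as the delimiter.
line <- paste(record, collapse = ",")
print(line)   # "7,sheep,Excellent"

# Split the line at each comma to recover the individual values.
fields <- strsplit(line, ",", fixed = TRUE)[[1]]
print(fields)
```

A delimited file is simply many such lines, one per observation, stored one after another.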
First, let’s work with one popular kind of delimited file called a comma-separated values file, usually with the file extension .csv. In a .csv file, the delimiter is a comma (,).
Let’s first create a data frame.
dig_num <- 7:1
ani_char <- c("sheep", "pig", "monkey", "pig", "monkey", NA, "pig")
conditions <- c("Excellent", "Good", "N", "Fair", "Good", "Good", "Excellent")
my_animals <- data.frame(dig_num, ani_char, conditions)
my_animals
Now, let’s write the data frame my_animals into a file called “my_animals.csv” in the current working directory. To write an object into a .csv file, you can use the write_csv() function in the readr package. Since readr is part of the tidyverse, you can load it directly if the tidyverse package is installed.
library(readr)
write_csv(my_animals, "my_animals.csv")
You can verify that the .csv file has indeed been created, and open it with RStudio or any text editor to check its contents.
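The same check can also be done from R. The sketch below rebuilds the data frame so that it runs on its own, writes it to a temporary folder (an assumption made here to avoid cluttering your working directory), and then reads the raw lines of the file back with base R.

```r
library(readr)

# Rebuild the example data frame so this snippet is self-contained.
my_animals <- data.frame(
  dig_num = 7:1,
  ani_char = c("sheep", "pig", "monkey", "pig", "monkey", NA, "pig"),
  conditions = c("Excellent", "Good", "N", "Fair", "Good", "Good", "Excellent")
)

# Write to a temporary folder instead of the working directory.
csv_path <- file.path(tempdir(), "my_animals.csv")
write_csv(my_animals, csv_path)

# Check that the file exists and look at its raw text, line by line.
print(file.exists(csv_path))
csv_lines <- readLines(csv_path)
print(csv_lines)
```

The first element of csv_lines holds the column names, and each of the remaining elements holds one row of the data frame.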
We can see that all the information has been written to the .csv file, with commas separating the values on each line. In particular, the first row of the file contains the column names. If you don’t want to include the column names, you can set the argument col_names = FALSE.
write_csv(my_animals, "my_animals_no_colname.csv", col_names = FALSE)
By default, write_csv() writes the data into a file in which NA represents every missing value, just like in the data frame. If you want to use another string to represent the missing values in the file, you can set the argument na to that string.
write_csv(my_animals, "my_animals_missing.csv", na = "This value is missing!")
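To see the effect of these two arguments without opening the file by hand, the following sketch writes a tiny data frame and prints the raw lines of the result. The data frame small, the file name, and the marker "missing" are all illustrative choices, and the file is written to a temporary folder.

```r
library(readr)

# A tiny illustrative data frame with one missing value.
small <- data.frame(id = 1:3, animal = c("sheep", NA, "pig"))

# Drop the header row and write "missing" wherever a value is NA.
out_path <- file.path(tempdir(), "small_missing.csv")
write_csv(small, out_path, na = "missing", col_names = FALSE)

# Inspect the raw text of the file.
out_lines <- readLines(out_path)
print(out_lines)
```

The file has three lines and no header, and the second line reads 2,missing rather than 2,NA.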
As introduced at the beginning, there are different types of delimited files depending on the specific delimiter. The function write_delim() enables us to write an object into a delimited file with any chosen delimiter. The usage of write_delim() is almost identical to that of write_csv(), except that it has an additional argument delim, which specifies the delimiter to be used. Let’s see the following example with * as the delimiter.
write_delim(my_animals, "my_animals_star.csv", delim = "*")
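As a quick check of what the delimiter looks like in the output, this sketch writes a tiny data frame with * as the delimiter and prints the raw lines. The data frame small and the file name are illustrative, and the file goes to a temporary folder.

```r
library(readr)

# A tiny illustrative data frame with one missing value.
small <- data.frame(id = 1:3, animal = c("sheep", NA, "pig"))

# Write the data with "*" between the values on each line.
star_path <- file.path(tempdir(), "small_star.txt")
write_delim(small, star_path, delim = "*")

# Inspect the raw text of the file.
star_lines <- readLines(star_path)
print(star_lines)
```

Each line now has * where a .csv file would have a comma, and the missing value is still written as NA because the na argument was left at its default.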
Use R to create the following data frame and assign it to the name my_data.
#>   word number letter
#> 1  one      1      a
#> 2  two     NA      b
#> 3 <NA>      3      c
#> 4 four      4      d
#> 5 five      5      e
Write R code to set the working directory to the location of your .R or .Rmd file, and export my_data into a .csv file named “my_data_no_name.csv” without column names.
Write R code to export my_data into a delimited file called “my_data_na.csv” with # as the delimiter and use 999 as the indicator for missing values.