5.3 Exporting and Importing Data from Excel Files

Now, you know how to export and import data from delimited files. In this section, you will learn how to export and import data from Excel files with extensions .xls and .xlxs.

5.3.1 Export data into Excel files

To export data into an Excel file, you can use the writexl package. Let’s first install the package.

install.packages("writexl")

Now, we can load the writexl package and use the write.xlsx() function to write data into an Excel file with extension .xlsx. Let’s first load the data from “my_animals.csv,” and write it to “my_animals.xlsx.”

library(writexl)
library(readr)
my_animals <- read_csv("my_animals.csv")
write_xlsx(my_animals, "my_animals.xlsx")

By default, the column names of the data frame/tibble will be written to the first row of the Excel file. To skip the column names, you can set col_names = FALSE in the write_xlsx() function.

In addition to writing a single data frame to an Excel file, write.xlsx() can also write multiple data frames into a single Excel file, with each sheet containing each data frame. To do that, you need to supply a list of data frames as the first argument in write_xlsx(). Let’s take a look at the following example which write both my_animals and sahp (a tibble in the r02pro package) into an Excel file named “two_data.xlsx.”

library(r02pro)
two_data <- list(my_animals = my_animals, sahp = sahp)
write_xlsx(two_data, "two_data.xlsx")

You can open this file with Excel and verify its contents.

5.3.2 Import Excel Files (.xls and .xlsx ) using read_excel()

Having learned how to export data into an Excel file, let’s see how to read an existing Excel file into R. We can use the read_excel() function in the readxl package to import Excel files. Here, readxl is another subpackage in the tidyverse package. Thus we can directly load the package if the tidyverse package is already installed.

Let’s import the Excel file “two_data.xlsx” we just created into R.

library(readxl)
my_df <- read_excel("two_data.xlsx")
head(my_df)
#> # A tibble: 6 × 3
#>   dig_num ani_char conditions
#>     <dbl> <chr>    <chr>     
#> 1       7 sheep    Excellent 
#> 2       6 pig      Good      
#> 3       5 monkey   N         
#> 4       4 pig      Fair      
#> 5       3 monkey   Good      
#> 6       2 <NA>     Good

We can see from the result that although the file contains two sheets, the function will import the first sheet by default. To import the second sheet, you can set the sheet argument to the sheet name (sahp) or the position of the sheet (2).

shap_1 <- read_excel("two_data.xlsx", sheet = "sahp")
#shap_1 <- read_excel("two_data.xlsx", sheet = 2)     #same result as the previous line
head(shap_1)
#> # A tibble: 6 × 13
#>   dt_sold             bedroom bathroom gar_car oa_qual liv_area lot_area
#>   <dttm>                <dbl>    <dbl>   <dbl>   <dbl>    <dbl>    <dbl>
#> 1 2010-03-25 00:00:00       3      2.5       2       6     1479    13517
#> 2 2009-04-10 00:00:00       4      3.5       2       7     2122    11492
#> 3 2010-01-15 00:00:00       3      2         1       5     1057     7922
#> 4 2010-04-19 00:00:00       3      2.5       2       5     1444     9802
#> 5 2010-03-22 00:00:00       3      2         2       6     1445    14235
#> 6 2010-06-06 00:00:00       2      2.5       2       6     1888    16492
#> # … with 6 more variables: house_style <chr>, kit_qual <chr>, heat_qual <chr>,
#> #   central_air <chr>, sale_price <dbl>, good_qual <lgl>

If we only want to import a portion of the data, say the first 5 rows and the first 4 columns, then we can set the argument range = "A1:D5", just like the range in an excel file.

shap_2 <- read_excel("two_data.xlsx", sheet = "sahp", range = "A1:D5")
shap_2
#> # A tibble: 4 × 4
#>   dt_sold             bedroom bathroom gar_car
#>   <dttm>                <dbl>    <dbl>   <dbl>
#> 1 2010-03-25 00:00:00       3      2.5       2
#> 2 2009-04-10 00:00:00       4      3.5       2
#> 3 2010-01-15 00:00:00       3      2         1
#> 4 2010-04-19 00:00:00       3      2.5       2

Note that read_excel() can read both .xls and xlsx file types.

5.3.3 Import Excel file using the menu

Besides using read_excel() to import Excel files, you can again use the interactive menu we introduced in Section 5.2.

As introduced in Table 5.1, to import Excel files, you can select From Excel after choosing the Import Dataset option. As before, you can click Browse… and select the data file. Let’s select the “two_data.xlsx” file we just created.

Similar to importing the delimited files, we can see the first several rows in the Data Preview windows. The first row shows the column names and their associate types in parentheses. For each column, you can click the dropdown menu after the type to change its type. Now, let’s discuss several options in the Import Options section and their corresponding arguments in the read_excel() function.

Table 5.3: Menu Options and its Corresponding Arguments in read_excel() and Meanings
Option Argument Meaning
Name
The object name you would like to assign to.
Sheet sheet The Sheet you want to import from.
Range range The data range you want to import.
Max Rows n_max The maximum number of rows to import.
Skip skip The number of rows to skip at the beginning of the file.
NA na The way NA is represented in the data file.
First Row as Names col_names Whether you want to use the first row as column names. TRUE or FALSE.
Code Preview
The R code to be executed for importing the data

Note that similar as importing delimited files, when you change these options, the code in the Code Preview window will change accordingly, which is a great way to learn on how they work.

5.3.4 Exercises

Use R to create the following data frame and assign it to the name df1.

#>   word1 number1
#> 1   one       1
#> 2   two      NA
#> 3  <NA>       3

Then, use R to create the following data frame and assign it to the name df2.

#>   word2 number2
#> 1 three       3
#> 2  <NA>       4
#> 3  five       5

Then create a list named my_list with df1 as the first element and df2 as the second element.

  1. Write R code to set working directory to the desktop, then export my_list into an excel file named list.xlsx. How many sheets are there in the excel file? What’s in each sheet?

  2. Write R code to import the first two rows and the first column of the second sheet from the excel file you just created.