4.5 Multiple geoms and Aesthetics

So far, you have learned to create scatterplots using geom_point() and smoothline fits using geom_smooth(). It is sometimes useful to combine multiple geoms in the same plot.

Let’s first review the scatterplot and smoothline fit between liv_area and sale_price.

library(r02pro)
library(tidyverse)
#> ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
#> ✓ tidyr   1.2.0     ✓ stringr 1.4.0
#> ✓ readr   2.1.2     ✓ forcats 0.5.1
#> ✓ purrr   0.3.4
#> ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
#> x dplyr::filter() masks stats::filter()
#> x dplyr::lag()    masks stats::lag()
ggplot(data = sahp) + 
  geom_point(mapping = aes(x = liv_area, y = sale_price))

ggplot(data = sahp) + 
  geom_smooth(mapping = aes(x = liv_area, y = sale_price))
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'

To combine multiple geoms, you can simply use + to add them.

ggplot(data = sahp) + 
  geom_point(mapping = aes(x = liv_area, y = sale_price)) + 
  geom_smooth(mapping = aes(x = liv_area, y = sale_price))
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'

As expected, the plot shows all the points together with the smoothline fit, so you can see both the individual observations and the overall trend at once.
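The combined plot above writes the same x and y mappings twice. As a side note, here is a minimal sketch of an equivalent form. It relies on standard ggplot2 behavior: mappings passed to ggplot() are inherited by every layer. The object name p_shared is only an illustration and is not used elsewhere in this chapter.

```r
library(r02pro)   # provides the sahp dataset used throughout this section
library(ggplot2)

# Mappings given to ggplot() are inherited by every layer added with +,
# so x and y only need to be written once.
p_shared <- ggplot(data = sahp,
                   mapping = aes(x = liv_area, y = sale_price)) +
  geom_point() +
  geom_smooth()

# Printing the object draws the plot.
p_shared
```

Writing the mapping inside each geom, as in the rest of this section, is still the form to use when the geoms need different aesthetics.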

As usual, you can add aesthetics to both geoms.

Let’s first map good_qual (defined as oa_qual > 5 in Section 4.4) to the color aesthetic for geom_smooth().

ggplot(data = sahp) + 
  geom_point(mapping = aes(x = liv_area, 
                           y = sale_price)) + 
  geom_smooth(mapping = aes(x = liv_area, 
                            y = sale_price, 
                            color = good_qual))
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'

To verify that the two smoothline fits are indeed fitted from the data points in the two groups, you can map good_qual to the color aesthetic for geom_point() as well. In the code below, se = FALSE hides the confidence bands around the smoothline fits.

ggplot(data = sahp) + 
  geom_point(mapping = aes(x = liv_area, 
                           y = sale_price, 
                           color = good_qual)) +
  geom_smooth(mapping = aes(x = liv_area, 
                            y = sale_price, 
                            color = good_qual), 
              se = FALSE)
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'

The plot confirms that the two smoothline fits indeed correspond to the data points in the two groups defined by good_qual. Note that the legend also shows an NA category, since there is a missing value in the variable good_qual.
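To see where the NA category in the legend comes from, you could count the observations in each group. The sketch below recomputes good_qual from oa_qual using the Section 4.4 definition, so it does not depend on the column already being present. The names sahp_qual and sahp_complete are hypothetical helpers introduced only for this illustration. Whether dropping the missing rows is appropriate depends on your analysis.

```r
library(r02pro)   # provides the sahp dataset
library(dplyr)
library(ggplot2)

# Recompute good_qual from oa_qual (the Section 4.4 definition).
sahp_qual <- mutate(sahp, good_qual = oa_qual > 5)

# Count the observations in each category, including NA.
count(sahp_qual, good_qual)

# Optionally keep only the rows where good_qual is not missing,
# which removes the NA entry from the legend.
sahp_complete <- filter(sahp_qual, !is.na(good_qual))

ggplot(data = sahp_complete) +
  geom_point(mapping = aes(x = liv_area,
                           y = sale_price,
                           color = good_qual)) +
  geom_smooth(mapping = aes(x = liv_area,
                            y = sale_price,
                            color = good_qual),
              se = FALSE)
```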

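In addition to mapping variables to aesthetics, you can also use global aesthetics, that is, set an aesthetic to a fixed value outside aes(). In the plot below, linetype = 2 draws both smoothline fits as dashed lines, while the shape of each point is additionally mapped to whether house_style is 2Story.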

ggplot(data = sahp) + 
  geom_point(mapping = aes(x = liv_area, 
                           y = sale_price, 
                           color = good_qual, 
                           shape = house_style == "2Story")) +
  geom_smooth(mapping = aes(x = liv_area, 
                            y = sale_price, 
                            color = good_qual), 
              linetype = 2, 
              se = FALSE)
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'
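The difference between a fixed aesthetic and a mapped one can be made explicit with a small side-by-side sketch. It uses the same sahp variables as above. The object names p_fixed and p_mapped are hypothetical and serve only this illustration.

```r
library(r02pro)   # provides the sahp dataset
library(ggplot2)

# Fixed value: linetype is set OUTSIDE aes(), so the single smoothline
# is dashed and no linetype legend is drawn.
p_fixed <- ggplot(data = sahp) +
  geom_smooth(mapping = aes(x = liv_area, y = sale_price),
              linetype = 2,
              se = FALSE)

# Mapped value: linetype is set INSIDE aes(), so ggplot2 fits one
# smoothline per group and adds a legend explaining the linetypes.
p_mapped <- ggplot(data = sahp) +
  geom_smooth(mapping = aes(x = liv_area,
                            y = sale_price,
                            linetype = oa_qual > 5),
              se = FALSE)

# Print each object to draw the corresponding plot.
p_fixed
p_mapped
```

A value placed outside aes() applies to the whole layer, whereas an expression inside aes() is evaluated for each row of the data and splits the layer into groups.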

4.5.1 Exercises

Using the sahp dataset with the ggplot2 package, answer the following questions.

  1. With lot_area on the x-axis and sale_price on the y-axis, create a plot that contains both the scatterplot and smoothline fits, where we use different colors in the scatterplot to distinguish whether heat_qual is excellent and different linetypes for the smoothline fits depending on whether house_style is 2Story.