4.5 Multiple geoms and Aesthetics
So far, you have learned to create scatterplots using geom_point() and smoothline fits using geom_smooth(). It is sometimes useful to combine multiple geoms in the same plot.
Let’s first review the scatterplot and smoothline fit between liv_area and sale_price.
library(r02pro)
library(tidyverse)
#> ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
#> ✓ tidyr 1.2.0 ✓ stringr 1.4.0
#> ✓ readr 2.1.2 ✓ forcats 0.5.1
#> ✓ purrr 0.3.4
#> ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
#> x dplyr::filter() masks stats::filter()
#> x dplyr::lag() masks stats::lag()
ggplot(data = sahp) +
  geom_point(mapping = aes(x = liv_area, y = sale_price))
ggplot(data = sahp) +
  geom_smooth(mapping = aes(x = liv_area, y = sale_price))
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'
To combine multiple geoms, you can simply use + to add them.
ggplot(data = sahp) +
  geom_point(mapping = aes(x = liv_area, y = sale_price)) +
  geom_smooth(mapping = aes(x = liv_area, y = sale_price))
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'
As expected, the points and the smoothline fit appear on the same plot, so you can see both the individual observations and the overall trend at once.
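Because both geoms above repeat the same x and y mapping, one way to shorten the code is to declare the mapping once in ggplot(), from which every layer inherits it by default. The following is a minimal sketch of that idea, not the approach used in this section; it uses the built-in mtcars data (with wt and mpg) as a stand-in for sahp so that it runs without the r02pro package.

```r
library(ggplot2)

# mtcars is a built-in stand-in for sahp (assumption: r02pro may not be installed)
p <- ggplot(data = mtcars, mapping = aes(x = wt, y = mpg)) +
  geom_point() +   # inherits x and y from ggplot()
  geom_smooth()    # inherits the same mapping

# Each geom added with + becomes one layer of the plot
length(p$layers)

# Run print(p) to draw the plot
```

Per-geom mappings, as used throughout this section, are still needed whenever the geoms should use different aesthetics.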
As usual, you can add aesthetics to both geoms.
Let’s first map good_qual (defined as oa_qual > 5 in Section 4.4) to the color aesthetic for geom_smooth().
ggplot(data = sahp) +
  geom_point(mapping = aes(x = liv_area,
                           y = sale_price)) +
  geom_smooth(mapping = aes(x = liv_area,
                            y = sale_price,
                            color = good_qual))
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'
To verify that the two smoothline fits are indeed fitted to the data points in the two groups, you can map good_qual to the color aesthetic for geom_point() as well. Setting se = FALSE in geom_smooth() hides the confidence bands so the points are easier to see.
ggplot(data = sahp) +
  geom_point(mapping = aes(x = liv_area,
                           y = sale_price,
                           color = good_qual)) +
  geom_smooth(mapping = aes(x = liv_area,
                            y = sale_price,
                            color = good_qual),
              se = FALSE)
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'
The plot confirms that the two smoothline fits correspond to the data points in the two groups defined by good_qual. Note that the legend also shows an NA category since there is a missing value in the variable good_qual.
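To see where such an NA category can come from, here is a small sketch with made-up data. It assumes good_qual is created as oa_qual > 5, as stated above, and that the missing value originates in oa_qual; the toy data frame and its values are hypothetical and only stand in for sahp.

```r
# Toy data standing in for sahp; the values are made up for illustration
toy <- data.frame(
  liv_area   = c(900, 1200, 1500, 1800, 2100),
  sale_price = c(110, 150, 180, 230, 260),
  oa_qual    = c(4, 6, NA, 7, 5)
)

# A comparison involving NA yields NA, not TRUE or FALSE
toy$good_qual <- toy$oa_qual > 5
toy$good_qual

# Count the missing values behind the NA legend entry
sum(is.na(toy$good_qual))

# Optionally keep only the rows with a known good_qual before plotting
toy_complete <- toy[!is.na(toy$good_qual), ]
nrow(toy_complete)
```

Whether to drop such rows or keep the NA category visible depends on the analysis; keeping it makes the missing data explicit in the plot.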
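In addition to mapping variables to aesthetics, you can also set global aesthetics, that is, constant values specified outside aes(). In the following plot, linetype = 2 draws both smoothline fits as dashed lines, while the shape of each point is mapped to whether house_style is 2Story.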
ggplot(data = sahp) +
  geom_point(mapping = aes(x = liv_area,
                           y = sale_price,
                           color = good_qual,
                           shape = house_style == "2Story")) +
  geom_smooth(mapping = aes(x = liv_area,
                            y = sale_price,
                            color = good_qual),
              linetype = 2,
              se = FALSE)
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'
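The difference between a constant aesthetic and a mapped one can be checked by inspecting the layer objects. The sketch below is illustrative only; it uses the built-in mtcars data (wt, mpg, am) as a stand-in for sahp, and the names p_const and p_mapped are hypothetical.

```r
library(ggplot2)

# mtcars stands in for sahp (assumption, so the sketch runs without r02pro)

# Constant aesthetic: set outside aes(), applies to the whole layer, no legend
p_const <- ggplot(data = mtcars) +
  geom_smooth(mapping = aes(x = wt, y = mpg),
              linetype = 2,
              se = FALSE)

# Mapped aesthetic: set inside aes(), varies with the data, adds a legend
p_mapped <- ggplot(data = mtcars) +
  geom_smooth(mapping = aes(x = wt, y = mpg, linetype = am == 1),
              se = FALSE)

# The constant is stored as a layer parameter rather than in the mapping
p_const$layers[[1]]$aes_params$linetype
names(p_mapped$layers[[1]]$mapping)

# Run print(p_const) or print(p_mapped) to draw the plots
```

A constant placed inside aes() by mistake is treated as data to be mapped, which is a common source of unexpected legends.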
4.5.1 Exercises
Using the sahp dataset with the ggplot2 package, answer the following questions.
- With lot_area on the x-axis and sale_price on the y-axis, create a plot that contains both the scatterplot and smoothline fits, where we use different colors in the scatterplot to distinguish whether heat_qual is excellent and different linetypes for the smoothline fits depending on whether house_style is 2Story.