## 4.4 Smoothline Fits

You now know how to create scatterplots and customize them by specifying different aesthetics. Besides scatterplots, another useful type of plot for capturing the trend in a pairwise relationship is the **smoothline fit**.

### 4.4.1 Creating Smoothline Fits using `geom_smooth()`

To create a smoothline fit, you can use the `geom_smooth()` function in the **ggplot2** package. Let’s say you want to find the trend between the sale price and the living area of a house.

```
library(ggplot2)
library(r02pro)
ggplot(data = sahp) +
  geom_smooth(mapping = aes(x = liv_area, y = sale_price))
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'
```

It may help to review the code for generating a scatterplot between `liv_area` and `sale_price`.

```
ggplot(data = sahp) +
  geom_point(mapping = aes(x = liv_area, y = sale_price))
```

We can see that the only difference is the geom used. `geom_smooth()` fits a smooth line through the points of the given variable pair. For a dataset of this size (fewer than 1,000 observations), it uses the **loess** method (locally estimated scatterplot smoothing) by default, which is a popular nonparametric regression technique. In addition to the smoothline, it also draws a shaded area representing the confidence interval (95% by default) around the fitted smoothline. To hide this shaded area, you can add the argument `se = FALSE` to `geom_smooth()`. Note that `se` is an argument of the geom rather than an aesthetic, so it goes outside `aes()`.

```
ggplot(data = sahp) +
  geom_smooth(mapping = aes(x = liv_area, y = sale_price),
              se = FALSE)
```
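To get a feel for what the smoothline and its shaded band represent, the following sketch fits a loess curve directly with `loess()` from base R and computes an approximate 95% band from the standard errors. It is only an illustration of the idea, not ggplot2's exact internal code; the object names (`d`, `fit`, `grid`, `band`) and the grid size of 80 points are choices made for this example.

```r
library(r02pro)

# Keep only the rows where both variables are observed.
d <- na.omit(sahp[, c("liv_area", "sale_price")])

# Fit a loess curve of sale_price on liv_area with the default settings.
fit <- loess(sale_price ~ liv_area, data = d)

# Evaluate the fit on an evenly spaced grid over the observed range.
grid <- data.frame(
  liv_area = seq(min(d$liv_area), max(d$liv_area), length.out = 80)
)
pred <- predict(fit, newdata = grid, se = TRUE)

# Approximate 95% band: fitted value +/- t-quantile * standard error.
crit <- qt(0.975, pred$df)
band <- data.frame(
  liv_area = grid$liv_area,
  fit      = pred$fit,
  lower    = pred$fit - crit * pred$se.fit,
  upper    = pred$fit + crit * pred$se.fit
)
head(band)
```

The `fit` column plays the role of the smoothline, while `lower` and `upper` play the role of the edges of the shaded area.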

In addition to the default loess method, `geom_smooth()` also provides other smoothing methods. For example, we can set `method = "lm"` to fit a straight line by linear regression.

```
ggplot(data = sahp) +
  geom_smooth(mapping = aes(x = liv_area, y = sale_price),
              method = "lm")
#> `geom_smooth()` using formula 'y ~ x'
```
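The straight line drawn with `method = "lm"` corresponds to a simple linear regression of `sale_price` on `liv_area`. As a rough sketch, you can fit the same kind of model yourself with `lm()` to look at the intercept and slope. The object names `fit_lm`, `new_houses`, and `ci_lm`, as well as the three living areas used for prediction, are arbitrary choices for this illustration.

```r
library(r02pro)

# Simple linear regression of sale_price on liv_area
# (the same formula, y ~ x, that geom_smooth() reports).
fit_lm <- lm(sale_price ~ liv_area, data = sahp)

# Intercept and slope of the fitted straight line.
coef(fit_lm)

# Fitted values with a 95% confidence interval at a few
# illustrative living areas (arbitrary values for this sketch).
new_houses <- data.frame(liv_area = c(1000, 1500, 2000))
ci_lm <- predict(fit_lm, newdata = new_houses,
                 interval = "confidence", level = 0.95)
ci_lm
```

The columns `lwr` and `upr` of `ci_lm` are the kind of bounds that the shaded area around the line displays.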

### 4.4.2 Aesthetics in Smoothline Fits

As in scatterplots, you can set global aesthetics as well as map variables to aesthetics in smoothline fits. Let’s begin with mapping variables to aesthetics. We first add a new logical variable `good_qual` to `sahp`, which is `TRUE` when `oa_qual > 5`.

`sahp$good_qual <- sahp$oa_qual > 5`

*a. Group*

When we map a variable to the `group` aesthetic, `geom_smooth()` first divides the data points into different *groups* according to the value of the variable, and then fits a separate smoothline for each group.

```
ggplot(data = sahp) +
  geom_smooth(mapping = aes(x = liv_area,
                            y = sale_price,
                            group = good_qual))
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'
```

You can see that two smoothlines are generated. However, it is not clear from the plot which group each smoothline corresponds to. To tell the two smoothlines apart, you can map the variable to other aesthetics.
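If you would like to check how many separate fits were computed, one option is to inspect the data that ggplot2 builds for the layer using `layer_data()`. The sketch below is just one way to do this; the names `p_group` and `smooth_data` are made up for the example.

```r
library(ggplot2)
library(r02pro)

# Recreate the logical variable used for grouping.
sahp$good_qual <- sahp$oa_qual > 5

p_group <- ggplot(data = sahp) +
  geom_smooth(mapping = aes(x = liv_area,
                            y = sale_price,
                            group = good_qual))

# Extract the computed smoothline data for the first (and only) layer.
smooth_data <- layer_data(p_group)

# Number of fitted points per group; each group is one smoothline.
table(smooth_data$group)
```

Each distinct value in the `group` column corresponds to one fitted smoothline.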

*b. Color*

As in `geom_point()`, we can map the variable to the `color` aesthetic.

```
ggplot(data = sahp) +
  geom_smooth(mapping = aes(x = liv_area,
                            y = sale_price,
                            color = good_qual))
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'
```

This plot is more informative than the one using the `group` aesthetic, since the two smoothlines now have different colors according to the value of `good_qual`.

*c. Line Type*

Another useful aesthetic, which was not applicable in `geom_point()`, is `linetype`, which controls the line type of each smoothline.

```
ggplot(data = sahp) +
  geom_smooth(mapping = aes(x = liv_area,
                            y = sale_price,
                            linetype = good_qual))
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'
```

The plot shows a dashed line for the smoothline corresponding to `good_qual == TRUE`, and a solid line for the smoothline corresponding to `good_qual == FALSE`.

*d. Size*

You can also map `good_qual` to the `size` aesthetic, which controls the width of each smoothline fit.

```
ggplot(data = sahp) +
  geom_smooth(mapping = aes(x = liv_area,
                            y = sale_price,
                            size = good_qual))
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'
```
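In ggplot2 3.4.0 and later, the width of lines is controlled by the `linewidth` aesthetic, and using `size` for lines triggers a deprecation warning. Assuming you have such a version installed, a sketch of the same plot with `linewidth` could look as follows (the name `p_width` is only for illustration). ggplot2 may also warn that using line width for a discrete variable is not advised.

```r
library(ggplot2)
library(r02pro)

# Recreate the logical variable used for the mapping.
sahp$good_qual <- sahp$oa_qual > 5

# Map good_qual to linewidth instead of size (requires ggplot2 >= 3.4.0).
p_width <- ggplot(data = sahp) +
  geom_smooth(mapping = aes(x = liv_area,
                            y = sale_price,
                            linewidth = good_qual))
p_width
```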

It is worth mentioning that `shape` is not a valid aesthetic for `geom_smooth()`, as it doesn’t make sense to talk about the shape of a line.

```
ggplot(data = sahp) +
  geom_smooth(mapping = aes(x = liv_area,
                            y = sale_price,
                            shape = good_qual))
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'
```

When you try to map a variable to the `shape` aesthetic, `geom_smooth()` shows the warning message “Warning: Ignoring unknown aesthetics: shape” and treats the variable as if it were mapped to the `group` aesthetic instead.

Naturally, you can also set global aesthetics, and it is straightforward to combine multiple aesthetics in the same plot.

```
ggplot(data = sahp) +
  geom_smooth(mapping = aes(x = liv_area,
                            y = sale_price,
                            color = good_qual),
              linetype = 2)
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'
```
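You can also map the same variable to more than one aesthetic at once, which can help when a plot is printed in black and white. As a small sketch (the name `p_combo` is only for illustration), the code below maps `good_qual` to both `color` and `linetype`, and hides the confidence bands with `se = FALSE`.

```r
library(ggplot2)
library(r02pro)

# Recreate the logical variable used for the mapping.
sahp$good_qual <- sahp$oa_qual > 5

# Map good_qual to two aesthetics; se = FALSE is set outside aes().
p_combo <- ggplot(data = sahp) +
  geom_smooth(mapping = aes(x = liv_area,
                            y = sale_price,
                            color = good_qual,
                            linetype = good_qual),
              se = FALSE)
p_combo
```

Because both aesthetics are mapped to the same variable, ggplot2 merges them into a single legend.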

### 4.4.3 Exercises

Using the `sahp` dataset with the **ggplot2** package, answer the following questions.

- Create a smoothline fit to visualize the relationship between `lot_area` (on the x-axis) and `sale_price` (on the y-axis).
- Create several smoothlines with different colors corresponding to the value of `kit_qual` to visualize the relationship between `lot_area` (on the x-axis) and `sale_price` (on the y-axis).
- Create smoothlines without confidence intervals and with different linetypes, distinguishing whether the house has more than 2 bedrooms, to visualize the relationship between `lot_area` (on the x-axis) and `sale_price` (on the y-axis).