7.3 Reorder Observations
Now, let’s look at the third task: find the 10 countries with the highest life expectancy in the year 2008. To order observations, you can use the arrange() function in the dplyr package.
First, let’s create a new dataset called gm_2008 that contains only the observations from 2008 and the variables country, life_expectancy, and GDP_per_capita.
library(r02pro)
library(dplyr)
library(tibble)
gm_2008 <- gm %>%
  filter(year == 2008) %>%
  select(country, life_expectancy, GDP_per_capita)
To arrange the observations in ascending order of life expectancy (life_expectancy), pass life_expectancy as an argument to the arrange() function. To arrange them in descending order instead, wrap the variable in desc().
gm_2008 %>%
  arrange(life_expectancy) # arrange in ascending order of life_expectancy
#> # A tibble: 236 × 3
#>    country                  life_expectancy GDP_per_capita
#>    <chr>                              <dbl>          <dbl>
#>  1 Eswatini                            46.4          3.21
#>  2 Central African Republic            47.4          0.514
#>  3 Lesotho                             47.4          0.948
#>  4 Zimbabwe                            50.2          0.941
#>  5 Mozambique                          53.5          0.463
#>  6 Somalia                             54.6         NA
#>  7 Malawi                              55            0.346
#>  8 Sierra Leone                        55.3          0.52
#>  9 Guinea-Bissau                       55.5          0.574
#> 10 South Africa                        55.7          5.48
#> # … with 226 more rows
gm_2008 %>%
  arrange(desc(life_expectancy)) # arrange in descending order of life_expectancy
#> # A tibble: 236 × 3
#>    country          life_expectancy GDP_per_capita
#>    <chr>                      <dbl>          <dbl>
#>  1 Japan                       83.3           31.7
#>  2 Hong Kong, China            82.7           35.9
#>  3 Switzerland                 82.5           80.3
#>  4 Singapore                   82.5           43.3
#>  5 Iceland                     82.4           49.5
#>  6 Australia                   81.9           53.3
#>  7 Andorra                     81.8           35.4
#>  8 Spain                       81.8           25.8
#>  9 Italy                       81.8           31.6
#> 10 San Marino                  81.8           56.7
#> # … with 226 more rows
In these results, several countries have the same life_expectancy value, which leads to ties. To break the ties, you can supply additional variables to the arrange() function. Observations within a tie are then ordered by the additional variables, in the order they are supplied.
gm_2008 %>%
  arrange(life_expectancy, GDP_per_capita)
#> # A tibble: 236 × 3
#>    country                  life_expectancy GDP_per_capita
#>    <chr>                              <dbl>          <dbl>
#>  1 Eswatini                            46.4          3.21
#>  2 Central African Republic            47.4          0.514
#>  3 Lesotho                             47.4          0.948
#>  4 Zimbabwe                            50.2          0.941
#>  5 Mozambique                          53.5          0.463
#>  6 Somalia                             54.6         NA
#>  7 Malawi                              55            0.346
#>  8 Sierra Leone                        55.3          0.52
#>  9 Guinea-Bissau                       55.5          0.574
#> 10 Zambia                              55.7          1.13
#> # … with 226 more rows
Here, the observations are arranged in ascending order of life_expectancy, and ties are broken in ascending order of GDP_per_capita. Note that observations with an NA value in a sorting variable are always placed at the end of the output, whether that variable is sorted in ascending or descending order.
If you want to break the ties in descending order of GDP_per_capita, you can wrap that variable in desc().
gm_2008 %>%
  arrange(life_expectancy, desc(GDP_per_capita))
#> # A tibble: 236 × 3
#>    country                  life_expectancy GDP_per_capita
#>    <chr>                              <dbl>          <dbl>
#>  1 Eswatini                            46.4          3.21
#>  2 Lesotho                             47.4          0.948
#>  3 Central African Republic            47.4          0.514
#>  4 Zimbabwe                            50.2          0.941
#>  5 Mozambique                          53.5          0.463
#>  6 Somalia                             54.6         NA
#>  7 Malawi                              55            0.346
#>  8 Sierra Leone                        55.3          0.52
#>  9 Guinea-Bissau                       55.5          0.574
#> 10 South Africa                        55.7          5.48
#> # … with 226 more rows
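To see the two tie-breaking rules side by side without scrolling through 236 rows, here is a small sketch on a made-up tibble. The names toy, id, group, and value are hypothetical and chosen only for illustration.

```r
library(dplyr)
library(tibble)

# Hypothetical toy data: `group` contains ties, `value` is used to break them
toy <- tibble(
  id    = c("p", "q", "r", "s"),
  group = c(2, 1, 2, 1),
  value = c(10, 30, 20, 40)
)

# Sort by group, then break ties by ascending value
by_asc <- toy %>% arrange(group, value)

# Sort by group, then break ties by descending value
by_desc <- toy %>% arrange(group, desc(value))

by_asc$id
by_desc$id
```

The order of the groups is the same in both results. Only the order of the rows inside each group changes, which mirrors how Lesotho and Central African Republic swap places in the output above.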
Now, we are ready to present the 10 countries with the highest life expectancy in 2008.
gm_2008 %>%
  arrange(desc(life_expectancy)) %>%
  head(10)
#> # A tibble: 10 × 3
#>    country          life_expectancy GDP_per_capita
#>    <chr>                      <dbl>          <dbl>
#>  1 Japan                       83.3           31.7
#>  2 Hong Kong, China            82.7           35.9
#>  3 Switzerland                 82.5           80.3
#>  4 Singapore                   82.5           43.3
#>  5 Iceland                     82.4           49.5
#>  6 Australia                   81.9           53.3
#>  7 Andorra                     81.8           35.4
#>  8 Spain                       81.8           25.8
#>  9 Italy                       81.8           31.6
#> 10 San Marino                  81.8           56.7
Here, head(10) returns the first 10 observations of the arranged dataset.
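One caveat: head() always keeps exactly the requested number of rows, so if several observations are tied at the cutoff, some of them are dropped. The sketch below shows this on a made-up tibble (toy, id, and score are hypothetical) and contrasts it with slice_max(), assuming dplyr 1.0.0 or later, where that function is available. It is offered as a possible alternative, not as the approach used in this section.

```r
library(dplyr)
library(tibble)

# Hypothetical toy data: b, c, and d are tied for second place
toy <- tibble(
  id    = c("a", "b", "c", "d", "e"),
  score = c(9, 8, 8, 8, 5)
)

# head(2) keeps exactly two rows, so two of the tied rows are dropped
top_head <- toy %>%
  arrange(desc(score)) %>%
  head(2)

# slice_max() with with_ties = TRUE keeps every row tied at the cutoff
top_ties <- toy %>%
  slice_max(score, n = 2, with_ties = TRUE)

nrow(top_head)
nrow(top_ties)
```

Which behavior is preferable depends on the question being asked: head() gives a fixed-size result, while slice_max() with with_ties = TRUE may return more rows than requested.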
7.3.1 Exercises
Using the ahp dataset,
1. Find all houses built in 2008 with house_style as "2Story", then arrange the observations in the ascending order of remodel year, and break the ties in the descending order of sale_price.
2. Find all houses sold in 2009 with house_style as "1Story", and arrange the observations in the descending order of sale_price.