Chapter 7 Tidy Data

In the past several chapters, you have learned a lot about data visualization, data import and export, and data manipulation. All the data you have seen so far share a very attractive property: they are tidy. So what is tidy data? Following the definition in Wickham and Grolemund (2016), tidy data has the following three interrelated properties.

  1. Each variable must have its own column.
  2. Each observation must have its own row.
  3. Each value must have its own cell.

These properties of tidy data enable us to conduct efficient data manipulation and visualization, since every tool can rely on the same layout: variables in columns and observations in rows. Note that in practice, much collected data is untidy. Although an untidy layout can be useful for reporting and may be more intuitive to read, we recommend tidying the data before applying the tools covered in this book.
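To make the three properties concrete, here is a minimal sketch. The data frame `scores_wide`, its column names, and its values are made up for illustration. It uses `pivot_longer()` from the tidyr package as one possible way to tidy a table in which the exam name is stored in the column headers rather than in a column of its own.

```r
library(tidyr)

# Hypothetical untidy table: one row per student, one column per exam.
# The headers "midterm" and "final" are really values of an "exam" variable,
# so each row holds two observations.
scores_wide <- data.frame(
  student = c("Ann", "Bob"),
  midterm = c(85, 72),
  final   = c(90, 78)
)

# Reshape so that each row is one observation (one student on one exam)
# and each variable (student, exam, score) has its own column.
scores_tidy <- pivot_longer(
  scores_wide,
  cols      = c(midterm, final),
  names_to  = "exam",
  values_to = "score"
)

print(scores_tidy)
```

In the tidy version, every row describes a single student-exam pair, so grouping, filtering, or plotting by `exam` needs no special handling of column names.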

References

Wickham, Hadley, and Garrett Grolemund. 2016. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. O’Reilly Media, Inc.