Tidy Data

14 June 2024

Tidy data refers to a structured and standardised way of formatting and organising datasets that adheres to the principles of simplicity, consistency, and usability. Many popular software packages for analysis (including Python, R, and MATLAB) work best when data is arranged in a tidy way.

Tidy Data Principles

  1. Each variable has its own column.
  2. Each observation has its own row.
  3. Each value has its own cell.

The aim is to make sure that each cell contains a single piece of information. This aligns with relational database principles and with tools commonly used in data and statistical analysis, so that data can be more easily manipulated, analysed, and visualised.
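As an illustration of the three principles, the sketch below reshapes a small "wide" table into tidy form using only the Python standard library. The table contents, the column names (`site`, `year`, `count`), and the helper `to_tidy` are hypothetical and chosen only for this example.

```python
# Hypothetical "wide" table: the variable "year" is hidden in the column
# headers, so one row holds several observations.
wide_rows = [
    {"site": "A", "2022": 10, "2023": 12},
    {"site": "B", "2022": 7, "2023": 9},
]


def to_tidy(rows, id_column, var_name, value_name):
    """Turn every non-identifier column into its own row.

    After reshaping, each variable has its own column, each observation
    (one site in one year) has its own row, and each cell holds one value.
    """
    tidy = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue  # the identifier is repeated on every output row
            tidy.append(
                {id_column: row[id_column], var_name: column, value_name: value}
            )
    return tidy


tidy_rows = to_tidy(wide_rows, "site", "year", "count")
for record in tidy_rows:
    print(record)
```

Libraries such as pandas offer the same reshaping directly (for example `DataFrame.melt`), which is usually more convenient for larger datasets.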

Tips for Setting Up Data Files

  • Don’t combine multiple pieces of information in one cell. Something may look like a single item, but consider whether that’s the only way you’ll want to use or sort the data, e.g. use FirstName and LastName rather than ‘Name’.
  • Always keep a copy of the ‘raw’ data separate from your working files.
  • Avoid using formatting to convey information, e.g. bolding words, colour coding, adding comments to cells.
  • Avoid merged cells.
  • Export the cleaned data to a text-based format like CSV. This ensures that anyone can use the data, and is the format required by most data repositories.
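A minimal sketch of two of these tips, assuming a hypothetical dataset with a combined ‘Name’ cell: the raw records are left untouched, a cleaned copy splits the name into FirstName and LastName, and the result is written out as CSV text. The sample records and the helper `split_name` are invented for illustration, and splitting on the first space is a simplification that will not suit every name.

```python
import csv
import io

# Hypothetical raw records with two pieces of information in one "Name" cell.
raw = [
    {"Name": "Ada Lovelace", "Score": "91"},
    {"Name": "Alan Turing", "Score": "88"},
]


def split_name(record):
    """Return a new record with separate name columns; the raw record is not modified."""
    # Naive assumption: the first space separates first name from last name.
    first, _, last = record["Name"].partition(" ")
    return {"FirstName": first, "LastName": last, "Score": record["Score"]}


# Work on a cleaned copy so the raw data stays available unchanged.
cleaned = [split_name(r) for r in raw]

# Write the cleaned data as CSV text. To write a file instead, replace the
# buffer with open("cleaned.csv", "w", newline="").
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["FirstName", "LastName", "Score"])
writer.writeheader()
writer.writerows(cleaned)
csv_text = buffer.getvalue()
print(csv_text)
```

Keeping the split in a separate function makes it easy to rerun the cleaning step from the raw data if the rule for separating names needs to change.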

Other Help with Tidy Data

The Library Carpentry project provides lessons on tidy data. The UC Library also runs a workshop that includes Tidy Data and OpenRefine, which you can attend in person.

For support in this area, please contact the eResearch team.
