Untidy Datasets

The following two datasets are examples of data that are not tidy. We will clean both of them up below.

statclass
##     name stat1 stat2  r spss
## 1   momo    12     5  6    9
## 2    kim    14    10 13   15
## 3 sascha     7     4  4    1

This dataset is in the so-called wide format: each exam has its own column, so a new exam would simply mean adding another column. However, this can be problematic for processing with R, as we often need a long format, with one row per student and exam.
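To make the difference between wide and long concrete, here is a minimal sketch that rebuilds statclass from the printout above and reshapes it with `pivot_longer()` from the tidyr package. This is only one possible way to do it, not necessarily the approach used in the rest of this text. The column name `exam` is borrowed from the second dataset; `score` and the object name `statclass_long` are assumptions made for illustration.

```r
library(tidyr)

# Rebuild the wide dataset from the printed output above
statclass <- data.frame(
  name  = c("momo", "kim", "sascha"),
  stat1 = c(12, 14, 7),
  stat2 = c(5, 10, 4),
  r     = c(6, 13, 4),
  spss  = c(9, 15, 1)
)

# Wide to long: every column except `name` is an exam.
# The former column names go into `exam`, the cell values into `score`
# (`score` is a name chosen for this sketch).
statclass_long <- pivot_longer(
  statclass,
  cols      = -name,
  names_to  = "exam",
  values_to = "score"
)

# 3 students x 4 exams = 12 rows, 3 columns (name, exam, score)
dim(statclass_long)
```

In the long version, a new exam adds rows rather than a column, so the set of columns stays the same no matter how many exams there are.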

statclass2

In this case, we have several issues. First, not all columns are variables: momo, kim, and sascha are observations (the students), not variables. Second, the exam column contains values that are really variable names.

Let’s start with statclass.