The following two datasets show data that are not tidy. We will clean these up below.
statclass
##     name stat1 stat2  r spss
## 1   momo    12     5  6    9
## 2    kim    14    10 13   15
## 3 sascha     7     4  4    1
This dataset is in the so-called wide format: every exam has its own column, so a new exam would simply add another column. This can be problematic for processing in R, where we often need a long format with one row per student and exam.
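To make the difference between wide and long concrete, here is a minimal sketch. It rebuilds statclass from the values printed above and reshapes it with `pivot_longer()` from the tidyr package. The use of tidyr and the column names `exam` and `score` are assumptions made for illustration, not necessarily the approach taken later in this text.

```r
# Rebuild the wide table from the printed output above.
statclass <- data.frame(
  name  = c("momo", "kim", "sascha"),
  stat1 = c(12, 14, 7),
  stat2 = c(5, 10, 4),
  r     = c(6, 13, 4),
  spss  = c(9, 15, 1)
)

library(tidyr)  # assumption: tidyr (>= 1.0.0) is installed

# Long format: one row per student-exam pair.
# "exam" and "score" are illustrative column names.
statclass_long <- pivot_longer(
  statclass,
  cols      = -name,    # reshape every column except the student name
  names_to  = "exam",   # former column names go here
  values_to = "score"   # former cell values go here
)

statclass_long
```

With three students and four exams, the long table has 12 rows and three columns. A new exam would add rows rather than a new column.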
statclass2
In this case, we have several issues. Firstly, not all columns contain variables: some of them are observations (momo, kim, sascha). Secondly, the exam column contains what are actually variable names.
Let’s start with statclass.