Tables

Before calculating measures of association, you will first create tables. Tables can also serve as a basis for graphical representation. Most often, they display the frequencies of a variable's values. For a simple frequency table, we call the table() function:

table(pss$stfdem)
## 
##   0   1   2   3   4   5   6   7   8   9  10 
## 226 268 436 618 754 850 631 522 338 179  83

The first row contains the code values, and the second row contains the frequencies. Only the valid cases are listed, i.e., every value that is not NA.

How many valid cases do we have, and how many NAs?

To check this, we use the sum(), is.na(), and length() functions:

# valid cases from the table
sum(
  table(
    pss$stfdem
  )
) 
## [1] 4905
# number of NAs
sum(
  is.na(
    pss$stfdem
  )
)
## [1] 95
# total length: valid cases + NAs
length(pss$stfdem)
## [1] 5000
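
The same check can be tried on a small made-up vector. The sketch below assumes a hypothetical vector x (not part of the pss data) and illustrates that valid cases plus NAs add up to the vector length:

```r
# Hypothetical toy vector; not part of the pss data
x <- c(3, 5, NA, 5, 7, NA, 3, 5)

n_valid <- sum(!is.na(x))   # valid cases; equals sum(table(x))
n_na    <- sum(is.na(x))    # missing cases
n_total <- length(x)        # all cases, valid or not

# TRUE if the two parts add up to the whole
n_valid + n_na == n_total
```

Counting with sum(!is.na(x)) avoids building the table first, which can be convenient when only the number of valid cases is needed.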

Alternatively, you can pass the argument useNA = "ifany" to the table() function so that the NAs are counted in the same table:

table(
  pss$stfdem,
  useNA = "ifany"
)
## 
##    0    1    2    3    4    5    6    7    8    9   10 <NA> 
##  226  268  436  618  754  850  631  522  338  179   83   95
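
As a brief illustration of how the useNA argument behaves, the following sketch compares "ifany" with "always". The vectors a and b are hypothetical examples, not taken from the pss data:

```r
# Hypothetical toy vectors; not part of the pss data
a <- c(1, 2, 2, 3)    # contains no NA
b <- c(1, 2, NA, 2)   # contains one NA

t_a_ifany  <- table(a, useNA = "ifany")    # no <NA> column, a has no NA
t_a_always <- table(a, useNA = "always")   # <NA> column shown with count 0
t_b_ifany  <- table(b, useNA = "ifany")    # <NA> column shown with count 1

t_a_ifany
t_a_always
t_b_ifany
```

With "ifany", the <NA> column only appears when the variable actually contains missing values, whereas "always" shows it in every case.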

Table with library summarytools

To obtain a more structured output, you can use the summarytools package, which produces a view similar to SPSS. The package must be installed once and then loaded in each session:

install.packages("summarytools")
library("summarytools")

Then the freq() function is used:

freq(pss$stfdem)
## Frequencies  
## pss$stfdem  
## Type: Numeric  
## 
##               Freq   % Valid   % Valid Cum.   % Total   % Total Cum.
## ----------- ------ --------- -------------- --------- --------------
##           0    226      4.61           4.61      4.52           4.52
##           1    268      5.46          10.07      5.36           9.88
##           2    436      8.89          18.96      8.72          18.60
##           3    618     12.60          31.56     12.36          30.96
##           4    754     15.37          46.93     15.08          46.04
##           5    850     17.33          64.26     17.00          63.04
##           6    631     12.86          77.13     12.62          75.66
##           7    522     10.64          87.77     10.44          86.10
##           8    338      6.89          94.66      6.76          92.86
##           9    179      3.65          98.31      3.58          96.44
##          10     83      1.69         100.00      1.66          98.10
##        <NA>     95                               1.90         100.00
##       Total   5000    100.00         100.00    100.00         100.00

The first column contains the code values. The second column shows the frequencies, the third column shows the percentages (of valid cases), and the fourth column shows the cumulative percentages (of valid cases). Columns 5 and 6 show the percentages and cumulative percentages of all cases (including NAs).
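
To see where these percentages come from, they can be recomputed in base R. The following is a minimal sketch, not the internal method of freq(): it enters the frequencies from the output above by hand, and the object and column names (counts, result, pct_valid, and so on) are chosen here for illustration only:

```r
# Frequencies copied from the table() output above (codes 0 to 10)
counts <- c(226, 268, 436, 618, 754, 850, 631, 522, 338, 179, 83)
names(counts) <- 0:10
n_na <- 95   # number of NAs reported above

pct_valid <- 100 * counts / sum(counts)            # share of valid cases
pct_total <- 100 * counts / (sum(counts) + n_na)   # share of all cases

# Illustrative column names; rows are labelled with the code values
result <- data.frame(
  Freq          = counts,
  pct_valid     = round(pct_valid, 2),
  pct_valid_cum = round(cumsum(pct_valid), 2),
  pct_total     = round(pct_total, 2),
  pct_total_cum = round(cumsum(pct_total), 2)
)
result
```

The valid percentages divide by the 4905 valid cases, while the total percentages divide by all 5000 cases, which is why the cumulative total percentage for the valid codes ends below 100.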

In the next step, you will learn about cross-tabulations!