One-Sample T-Test

The one-sample t-test checks whether a sample mean deviates significantly from a known population value \(\mu\). You therefore need to know \(\mu\) in advance.
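
For reference, the test statistic behind this comparison is usually written as shown here. The symbols \(\bar{x}\) (sample mean), \(s\) (sample standard deviation), and \(n\) (number of non-missing observations) are introduced only for this illustration:

\[
t = \frac{\bar{x} - \mu}{s / \sqrt{n}}, \qquad df = n - 1
\]

The further \(\bar{x}\) lies from \(\mu\) relative to the standard error \(s / \sqrt{n}\), the larger the absolute t-value becomes.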

Assume that the statistics office states the official average age as \(36.8\) years.

What value does the arithmetic mean of age in our dataset take? The variable for age is agea.

Calculate the arithmetic mean of age from the dataset PSS!

mean(
  pss$agea,
  na.rm = TRUE
)

In the dataset, the average age is \(42.83006\) years. So, the sample mean deviates from the official value of \(36.8\) years.
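
The argument na.rm = TRUE matters whenever a variable contains missing values. A minimal sketch with a made-up toy vector (age_demo is hypothetical and not part of the PSS data) shows the difference:

```r
# Hypothetical toy vector with one missing value (not the PSS data)
age_demo <- c(23, 35, 47, NA, 61)

# Without na.rm, a single NA makes the whole result NA
mean_with_na <- mean(age_demo)

# With na.rm = TRUE, the NA is dropped before the mean is calculated
mean_without_na <- mean(age_demo, na.rm = TRUE)

mean_with_na
mean_without_na
```

If agea contains missing values, the same logic applies to the call above.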

Take a moment to think about why the value could deviate in the dataset!

Now you want to test whether this deviation is statistically significant. There are two testing situations for each mean comparison:

  • two-sided

  • one-sided (greater or smaller)

At first, you only want to know whether the value deviates significantly. There is no assumption about the direction. Therefore, you conduct a two-sided test.

You can do this with the function t.test():

t.test(
  pss$agea, 
  mu = 36.8,
  alternative = "two.sided"
)  
## 
## 	One Sample t-test
## 
## data:  pss$agea
## t = 31.273, df = 4842, p-value < 2.2e-16
## alternative hypothesis: true mean is not equal to 36.8
## 95 percent confidence interval:
##  42.45205 43.20808
## sample estimates:
## mean of x 
##  42.83006

The output shows the t-value, the degrees of freedom, the p-value, the confidence interval, and the sample mean.
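
If you want to reuse these numbers instead of reading them off the printout, you can store the result in an object. The following sketch uses simulated ages (age_demo and res are hypothetical names, and the data are not the PSS data), so the numbers will differ from the output above:

```r
# Simulated ages for illustration only (not the PSS data)
set.seed(1)
age_demo <- rnorm(200, mean = 43, sd = 13)

# Store the test result instead of only printing it
res <- t.test(
  age_demo,
  mu = 36.8,
  alternative = "two.sided"
)

res$statistic  # t-value
res$parameter  # degrees of freedom
res$p.value    # p-value as a plain number
res$conf.int   # lower and upper bound of the confidence interval
res$estimate   # sample mean
```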

The p-value is less than \(0.05\), giving you a significant test result. In the output, you see p-value < 2.2e-16. The notation 2.2e-16 is nothing but \(2.2 \times 10^{-16}\), meaning you need to move the decimal point 16 places to the left. So, the p-value is very close to \(0\). You can conclude that the mean age in the sample significantly deviates from \(\mu\). You have already thought about possible reasons for this above!
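
You can check the scientific notation directly in R. The sketch below uses the hypothetical name p_shown for the printed threshold. It also prints .Machine$double.eps, the machine precision, which is about 2.2e-16 and is the point below which R prints a p-value only as "< 2.2e-16" rather than as an exact number:

```r
# The threshold printed in the test output
p_shown <- 2.2e-16

# Written out without scientific notation
format(p_shown, scientific = FALSE)

# Same number as 2.2 * 10^-16 (compared with a tolerance, as usual for decimals)
same_value <- isTRUE(all.equal(p_shown, 2.2 * 10^-16))
same_value

# Machine precision used as the cut-off when printing p-values
.Machine$double.eps

# The threshold is far below the usual significance level
p_shown < 0.05
```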

You can also calculate the difference using R:

diff_age <- mean(pss$agea, na.rm = TRUE) - 36.8

diff_age
## [1] 6.030064

The difference is \(6.030064\). Since \(\mu\) refers to all individuals in Panem while the sample only includes individuals aged \(16\) and above, this difference is easily explained.
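
The t-value is this difference divided by the estimated standard error of the mean. As a rough sketch, you can reproduce the calculation by hand and compare it with t.test(). The data are simulated and all object names (age_demo, mu0, t_manual, and so on) are hypothetical, so the numbers do not match the PSS output:

```r
# Simulated ages with one missing value (not the PSS data)
set.seed(1)
age_demo <- c(rnorm(200, mean = 43, sd = 13), NA)
mu0 <- 36.8

# Drop missing values by hand
x <- age_demo[!is.na(age_demo)]
n <- length(x)

# Difference between sample mean and mu, and its standard error
diff_demo <- mean(x) - mu0
se_demo <- sd(x) / sqrt(n)

# t-value by hand
t_manual <- diff_demo / se_demo

# t-value from t.test(), which removes missing values itself
t_builtin <- unname(t.test(age_demo, mu = mu0)$statistic)

all.equal(t_manual, t_builtin)
```

Because t.test() removes missing values on its own, it needs no na.rm argument, unlike mean().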

Test Alternatives

Alternatively, you can also test one-sided:

  • if we assume the value is greater than \(\mu\), use alternative = "greater"

  • if we assume the value is smaller than \(\mu\), use alternative = "less"

# one-sided, greater
t.test(
  pss$agea, 
  mu = 36.8, 
  alternative = "greater"
)
## 
## 	One Sample t-test
## 
## data:  pss$agea
## t = 31.273, df = 4842, p-value < 2.2e-16
## alternative hypothesis: true mean is greater than 36.8
## 95 percent confidence interval:
##  42.51284      Inf
## sample estimates:
## mean of x 
##  42.83006

# one-sided, lower
t.test(
  pss$agea,
  mu = 36.8, 
  alternative = "less"
)
## 
## 	One Sample t-test
## 
## data:  pss$agea
## t = 31.273, df = 4842, p-value = 1
## alternative hypothesis: true mean is less than 36.8
## 95 percent confidence interval:
##      -Inf 43.14729
## sample estimates:
## mean of x 
##  42.83006
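
The three p-values all come from the same t-value and differ only in which tail of the t-distribution is used. The following sketch recomputes them with pt() from the t-value and degrees of freedom printed in the outputs above (the object names are hypothetical):

```r
# t-value and degrees of freedom taken from the outputs above
t_val <- 31.273
df_val <- 4842

# alternative = "greater": probability of a t-value at least this large
p_greater <- pt(t_val, df_val, lower.tail = FALSE)

# alternative = "less": probability of a t-value at most this large
p_less <- pt(t_val, df_val, lower.tail = TRUE)

# alternative = "two.sided": both tails together
p_two_sided <- 2 * pt(abs(t_val), df_val, lower.tail = FALSE)

p_greater
p_less
p_two_sided
```

Since the sample mean lies above \(\mu\), the test with "greater" is significant, while the test with "less" yields a p-value of practically \(1\).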

Let’s move on to the t-test for two samples!