ggcorr()

Correlation plots can also be created using ggplot2. For this, you will again use the package GGally.

library("GGally")

Now, you will use the function ggcorr(). Non-metric variables are automatically excluded, and a warning tells you which columns were ignored.

ggcorr(pss)
## Warning in ggcorr(pss): data in column(s) 'district', 'gndr', 'edu', 'income'
## are not numeric and were ignored
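
If you would rather see in advance which columns will be dropped, you can check the column types yourself. The following is a minimal base R sketch; the data frame toy and its columns are made up for illustration and are not the pss data.

```r
# Hypothetical toy data (not the pss data set): one factor and two numeric columns
toy <- data.frame(
  district = factor(c("north", "south", "north")),
  age = c(34, 51, 27),
  trust = c(4, 7, 5)
)

# TRUE for every column that is numeric
is_num <- vapply(toy, is.numeric, logical(1))

# Keep only the numeric columns
toy_num <- toy[, is_num, drop = FALSE]
names(toy_num)
```

Passing a data frame that contains only numeric columns, such as toy_num, to ggcorr() should avoid the warning.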

The additional argument method takes two values: the first specifies how to handle NAs (for example, pairwise or complete.obs), and the second specifies which type of correlation to calculate (pearson, spearman, kendall):

ggcorr(
  pss,
  method = c(
    "pairwise",
    "pearson"
  )
)
## Warning in ggcorr(pss, method = c("pairwise", "pearson")): data in column(s)
## 'district', 'gndr', 'edu', 'income' are not numeric and were ignored
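
The two values in method appear to correspond to the use and method arguments of base R's cor(), so the effect of the NA handling can be tried out there. This is a small sketch under that assumption; toy is a made-up data frame, not the pss data.

```r
# Hypothetical toy data (not the pss data set): three numeric columns with some NAs
toy <- data.frame(
  a = c(1, 2, 3, 4, NA),
  b = c(2, 4, 6, NA, 10),
  c = c(5, 3, 4, 1, 2)
)

# Pairwise deletion: each coefficient uses all rows where both variables are observed
r_pairwise <- cor(toy, use = "pairwise.complete.obs", method = "pearson")

# Listwise deletion: only rows without any NA are used for every coefficient
r_complete <- cor(toy, use = "complete.obs", method = "pearson")

# Rank-based alternative with the same NA handling
r_spearman <- cor(toy, use = "pairwise.complete.obs", method = "spearman")

round(r_pairwise, 2)
round(r_complete, 2)
```

The coefficient for a and c differs between the two matrices because pairwise deletion can use row 4, while listwise deletion drops every row that contains an NA anywhere.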

Additionally, you can display the correlation coefficients using label = TRUE:

ggcorr(
  pss,
  method = c(
    "pairwise",
    "pearson"
  ),
  label = TRUE
)
## Warning in ggcorr(pss, method = c("pairwise", "pearson"), label = TRUE): data
## in column(s) 'district', 'gndr', 'edu', 'income' are not numeric and were
## ignored

You can set the number of decimal places with the argument label_round:

ggcorr(
  pss,
  method = c(
    "pairwise",
    "pearson"
  ),
  label = TRUE,
  label_round = 2
)
## Warning in ggcorr(pss, method = c("pairwise", "pearson"), label = TRUE, : data
## in column(s) 'district', 'gndr', 'edu', 'income' are not numeric and were
## ignored

In the geom argument, you can choose between tile (the default), circle, text, or blank.

ggcorr(
  pss,
  method = c(
    "pairwise",
    "pearson"
  ),
  label = TRUE,
  label_round = 2,
  geom = "circle"
)
## Warning in ggcorr(pss, method = c("pairwise", "pearson"), label = TRUE, : data
## in column(s) 'district', 'gndr', 'edu', 'income' are not numeric and were
## ignored

Lastly, you can build the color scale from three colors, one each for the correlations \(-1\), \(0\), and \(1\), using the arguments low, mid, and high. I’m using beyonce again here! Important: You are not passing an entire palette here, but a single color from the respective palette, hence the additional number in [ ] brackets!

ggcorr(
  pss,
  method = c(
    "pairwise",
    "pearson"
  ),
  label = TRUE,
  label_round = 2,
  geom = "circle",
  low = beyonce_palette(72)[1],
  mid = "white", 
  high = beyonce_palette(72)[2]
)
## Warning in ggcorr(pss, method = c("pairwise", "pearson"), label = TRUE, : data
## in column(s) 'district', 'gndr', 'edu', 'income' are not numeric and were
## ignored

You can also change the limits of the color scale, which by default run from -1 to 1. This can be helpful if your correlations are weak and therefore all very faintly colored. With limits = FALSE, the endpoints are set automatically (according to the data!).

ggcorr(
  pss,
  method = c(
    "pairwise",
    "pearson"
  ),
  label = TRUE,
  label_round = 2,
  geom = "circle",
  low = beyonce_palette(72)[1],
  mid = "white", 
  high = beyonce_palette(72)[2],
  limits = FALSE
)
## Warning in ggcorr(pss, method = c("pairwise", "pearson"), label = TRUE, : data
## in column(s) 'district', 'gndr', 'edu', 'income' are not numeric and were
## ignored
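
To judge whether adjusting the limits is worthwhile, you can look at the range of the coefficients that are actually plotted. A minimal sketch with the built-in mtcars data (used here only as a stand-in, since pss is not reproduced in this section):

```r
# Stand-in data: three numeric columns from the built-in mtcars data set
r <- cor(mtcars[, c("mpg", "qsec", "drat")], use = "pairwise.complete.obs")

# Only the coefficients below the diagonal are of interest (the diagonal is always 1)
r_offdiag <- r[lower.tri(r)]

# Smallest and largest coefficient in the data
range(r_offdiag)
```

If this range is much narrower than -1 to 1, the default limits leave most cells pale, which is the situation where limits = FALSE helps.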

Important: In ggcorr(), non-significant values cannot be hidden, as the author of these functions (rightly) opposes a focus on the significance level.

You can find more information on the functionality of ggcorr() here.

Now let’s move on to the representation of mean comparisons.