[SOLVED] Correlation Matrix Between Variables in R

I have been trying to determine the correlation between variables in panel data. My data has the following form (the full data set has more dates, and some values of PM10 are NA):

structure(list(NetC = c("Cosenza Provincia", "Cosenza Provincia", 
"Cosenza Provincia", "Cosenza Provincia", "Cosenza Provincia", 
"Cosenza Provincia", "Cosenza Provincia", "Cosenza Provincia", 
"Cosenza Provincia", "Reti Private", "Reti Private", "Reti Private", 
"Reti Private", "Reti Private", "Reti Private"), ID = c("IT1938A", 
"IT1938A", "IT1938A", "IT2086A", "IT2086A", "IT2086A", "IT2110A", 
"IT2110A", "IT2110A", "IT1766A", "IT1766A", "IT1766A", "IT2090A", 
"IT2090A", "IT2090A"), Stat = c("Citta dei Ragazzi", "Citta dei Ragazzi", 
"Citta dei Ragazzi", "Rende", "Rende", "Rende", "Acri", "Acri", 
"Acri", "Firmo", "Firmo", "Firmo", "Schiavonea", "Schiavonea", 
"Schiavonea"), Data = c("1/1/2022", "1/2/2022", "1/3/2022", "1/1/2022", 
"1/2/2022", "1/3/2022", "1/1/2022", "1/2/2022", "1/3/2022", "1/1/2022", 
"1/2/2022", "1/3/2022", "1/1/2022", "1/2/2022", "1/3/2022"), 
    PM10 = c(13.29, 11.14, 9.08, 16.62, 12.98, 10.4, 16.2, 19.4, 
    15.7, 10.82, 12.29, 9.54, 24.54, 22.88, 27.33)), class = "data.frame", row.names = c(NA, 
-15L))

I have tried using plm::cortab, but it doesn't calculate the correlation.

library(plm)
cortab(data$PM10, grouping = Stat, groupnames = c("Citta dei Ragazzi", "Rende", 
                                                  "Acri", "Firmo", "Schiavonea"))

The output should look like:

	Citta dei Ragazzi	Rende	Acri
Citta dei Ragazzi	1
Rende	x	1
Acri	x	x	1

Solution

This has essentially been asked before (see "How can I complete a correlation in R of one variable across it's factor levels, matching by date"), but for convenience I have adapted that answer to your data:

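# packages used below (%>% and select() from dplyr, pivot_wider() from tidyr):
library(dplyr)
library(tidyr)

# simple correlation matrix:
data.wider <- data %>% 
  select(-ID, -NetC) %>% # remove unnecessary vars 
  pivot_wider(names_from = 'Stat', values_from = 'PM10') # one column per station

# drop the date column; pairwise deletion copes with the NA values in PM10
cor(data.wider[,-1], use = 'pairwise.complete.obs')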

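# more lines required to set up correlation testing:
library(broom) # provides tidy() for cor.test() results

pw <- combn(unique(data$Stat),2) # make pairwise sets (one column per pair)
pw

# run cor.test() on each pair of station columns
pairwise_c <- apply(pw,2,function(i){
  tidy(cor.test(data.wider[[i[1]]],data.wider[[i[2]]]))
})

# one row per pair: station names followed by the test results
results <- cbind(data.frame(t(pw)),bind_rows(pairwise_c))

results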
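
If you want the matrix printed in the lower-triangular layout shown in the question, here is a minimal base R sketch. It uses a three-station subset of the posted data so that it runs on its own; the object names (stat, wide, cmat) are only illustrative, and it assumes there is at most one PM10 value per station and date.

```r
# Toy subset of the question's data (three stations, three dates).
data <- data.frame(
  Stat = rep(c("Citta dei Ragazzi", "Rende", "Acri"), each = 3),
  Data = rep(c("1/1/2022", "1/2/2022", "1/3/2022"), times = 3),
  PM10 = c(13.29, 11.14, 9.08, 16.62, 12.98, 10.4, 16.2, 19.4, 15.7)
)

# Keep the stations in order of appearance rather than alphabetical order.
stat <- factor(data$Stat, levels = unique(data$Stat))

# Date-by-station matrix; assumes one value per cell, missing cells become NA.
wide <- tapply(data$PM10, list(data$Data, stat), function(x) x[1])

# Pairwise deletion so that NA values only affect the pairs they belong to.
cmat <- cor(wide, use = "pairwise.complete.obs")

# Blank out the upper triangle to get the layout from the question.
cmat[upper.tri(cmat)] <- NA
print(round(cmat, 2), na.print = "")
```

With only three dates per station the coefficients are not very informative, so treat this as a way to check the layout before running it on the full data.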