I have been trying to determine the correlation between variables in panel data. My data is in the form below (the real data has more dates, and some values of PM10 are NA):
structure(list(NetC = c("Cosenza Provincia", "Cosenza Provincia",
"Cosenza Provincia", "Cosenza Provincia", "Cosenza Provincia",
"Cosenza Provincia", "Cosenza Provincia", "Cosenza Provincia",
"Cosenza Provincia", "Reti Private", "Reti Private", "Reti Private",
"Reti Private", "Reti Private", "Reti Private"), ID = c("IT1938A",
"IT1938A", "IT1938A", "IT2086A", "IT2086A", "IT2086A", "IT2110A",
"IT2110A", "IT2110A", "IT1766A", "IT1766A", "IT1766A", "IT2090A",
"IT2090A", "IT2090A"), Stat = c("Citta dei Ragazzi", "Citta dei Ragazzi",
"Citta dei Ragazzi", "Rende", "Rende", "Rende", "Acri", "Acri",
"Acri", "Firmo", "Firmo", "Firmo", "Schiavonea", "Schiavonea",
"Schiavonea"), Data = c("1/1/2022", "1/2/2022", "1/3/2022", "1/1/2022",
"1/2/2022", "1/3/2022", "1/1/2022", "1/2/2022", "1/3/2022", "1/1/2022",
"1/2/2022", "1/3/2022", "1/1/2022", "1/2/2022", "1/3/2022"),
PM10 = c(13.29, 11.14, 9.08, 16.62, 12.98, 10.4, 16.2, 19.4,
15.7, 10.82, 12.29, 9.54, 24.54, 22.88, 27.33)), class = "data.frame", row.names = c(NA,
-15L))
I have tried using plm::cortab, but it doesn't calculate the correlation.
library(plm)
cortab(data$PM10, grouping = Stat, groupnames = c("Citta dei Ragazzi", "Rende",
"Acri", "Firmo", "Schiavonea"))
The output should look like:
| | Citta dei Ragazzi | Rende | Acri |
|---|---|---|---|
| Citta dei Ragazzi | 1 | | |
| Rende | x | 1 | |
| Acri | x | x | 1 |
This has pretty much already been asked (see "How can I complete a correlation in R of one variable across it's factor levels, matching by date"), but for ease I have adapted that answer here for your use:
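library(dplyr)  # select(), bind_rows(), %>%
library(tidyr)  # pivot_wider()
library(broom)  # tidy()

# simple correlation matrix:
data.wider <- data %>%
  select(-ID, -NetC) %>% # remove unnecessary vars
  pivot_wider(names_from = 'Stat', values_from = 'PM10') # one row per date, one column per station
# drop the Data column; for each pair, use only dates where both stations have a value
cor(data.wider[,-1], use = 'pairwise.complete.obs')

# more lines required to set up correlation testing:
pw <- combn(unique(data$Stat), 2) # make pairwise sets
pw
pairwise_c <- apply(pw, 2, function(i) {
  tidy(cor.test(data.wider[[i[1]]], data.wider[[i[2]]])) # one tidy row per station pair
})
results <- cbind(data.frame(t(pw)), bind_rows(pairwise_c))
results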
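If the lower-triangular layout from the question is what you are after, a base R sketch along the following lines may be enough. It assumes the column names Stat, Data and PM10 from the example data. The object names wide, cmat and lower are my own, and the data frame is rebuilt inline only so that the snippet runs by itself.

```r
# Example data from the question (only the columns needed here)
data <- data.frame(
  Stat = rep(c("Citta dei Ragazzi", "Rende", "Acri", "Firmo", "Schiavonea"), each = 3),
  Data = rep(c("1/1/2022", "1/2/2022", "1/3/2022"), times = 5),
  PM10 = c(13.29, 11.14, 9.08, 16.62, 12.98, 10.4, 16.2, 19.4,
           15.7, 10.82, 12.29, 9.54, 24.54, 22.88, 27.33)
)

# Keep stations in their order of appearance rather than alphabetical order
stat <- factor(data$Stat, levels = unique(data$Stat))

# One row per date, one column per station (NA where a reading is missing)
wide <- tapply(data$PM10, list(data$Data, stat), mean)

# Pairwise correlations, using only dates observed at both stations
cmat <- cor(wide, use = "pairwise.complete.obs")

# Keep the lower triangle (including the diagonal) and blank out the rest
lower <- round(cmat, 2)
lower[upper.tri(lower)] <- NA
print(lower, na.print = "")
```

With only three dates per station, as in the example, the coefficients are not very informative; the real data with more dates should behave better.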