This is a follow up question to what was asked here... Quadrant plot based on ccf value outputs in R
The original question and this one use the same data...
df <- structure(list(Date = structure(c(16222, 16617, 14518, 15156,
15918, 17075, 15522, 16679, 16010, 15187, 15461, 16283, 17379,
15553, 17410, 15553, 16191, 16314, 14549, 15979), class = "Date"),
Commonname = c("Black Sea Bass", "Pinfish", "Pigfish", "Pinfish",
"Silver Perch", "Black Sea Bass", "Pigfish", "Pinfish", "Pigfish",
"Silver Perch", "Silver Perch", "Black Sea Bass", "Pinfish",
"Pinfish", "Silver Perch", "Pigfish", "Black Sea Bass", "Silver Perch",
"Silver Perch", "Black Sea Bass"), CPUE = c(1.25513090974505,
9.41478783154444, 1.63667465565289, 3.13779141143018, 4.26313144106683,
2.32564938844104, 2.70394855189782, 8.49969670589948, 1.7329255861366,
2.0845409179642, 0.269832703723692, 1.21288437532366, 11.8739506505966,
8.55504246458105, 2.21256794002004, 4.51336797979511, 1.47695928524315,
1.10425042966867, 0.632732705722451, 1.59167844861806), Discharge = c(14.8521616,
5.23042759111111, 1.42663083211115, 0.184551018105263, 48.9156538971429,
2.29765846588235, 33.25524992, 4.06629248, 1.610659584, 0.21166808,
0.0607489749333333, 2.22029454545455, 12.90821328, 31.9696672,
8.05754544, 32.7267690105263, 43.493472128, 6.77337856, 1.10646621744,
4.37803470545454)), row.names = c(NA, -20L), class = c("tbl_df",
"tbl", "data.frame"))
Now using the following code and filtering for positive lags only...
library(tidyverse)
fccf = function(data) ccf(data$CPUE,data$Discharge, lag.max = 5, plot = FALSE)
facf = function(acf) tibble(aacf = acf$acf[,,1], lag = acf$lag[,,1])
test.df <- df %>% group_by(Commonname) %>%
nest() %>% #Step 1
mutate(ccf = map(data, ~fccf(.x))) %>% #Step 2
mutate(acf = map(ccf, ~facf(.x))) %>% #Step 3
unnest(acf)
#removes the nested columns
test.df <- test.df[,c("Commonname", "aacf", "lag")]
#removes negative lags
test.df <- subset(test.df, lag > 0)
test.df
I get different results from what the ccf for each individual shows.
For example the plot for Black Sea Bass is completely different that what these custom functions pull out for values....
df <- df %>%
filter(Commonname == "Black Sea Bass")
ccf(df$Discharge, df$CPUE, lag.max = 5)
The ccf plot shows a value of almost 1.0 at the 2nd positive lag step, but the custom functions pull a value of about 0.01
If someone can help me figure out what's causing the differences, or how to get these custom functions to pull the correlations seen on the plot i would be most grateful. Thanks.
Oh, everything is fine! The difference, however, is that in your first query to the ccf
function, you entered data$CPUE
as x
and data$Discharge
as y
. So I did the fccf
function too. Now, however, you have swapped arguments! Note that you are calling ccf(df$Discharge, df$CPUE, lag.max = 5)
. Hence the difference.
fccf = function(data) ccf(data$Discharge, data$CPUE, lag.max = 5, plot = FALSE)
facf = function(acf) tibble(lag = acf$lag[,,1], acf = acf$acf[,,1])
df %>% nest_by(Commonname) %>%
mutate(ccf = list(fccf(data)),
acf = list(facf(ccf))) %>%
select(c(-ccf, -data)) %>%
unnest(acf) %>%
ggplot(aes(lag, acf)) +
geom_segment(aes(x=lag, xend=lag, y=0, yend=acf)) +
geom_point(size=2, color="red", fill=alpha("orange", 0.3), alpha=0.7, shape=21, stroke=2)+
facet_wrap(vars(Commonname), 2, 2)
Compare the chart (facet for Black Sea Bass) above with what you get.
df2 = df %>% filter(Commonname == "Black Sea Bass")
acc1 = ccf(df2$Discharge, df2$CPUE, lag.max = 5)