r finance parallel.foreach economics

Parallel Computing - Cointegration


I am trying to find cointegrating pairs in the S&P 500 using daily price data. My original "for" loop found the pairs correctly, but it took about 1500 seconds, so I tried parallel computing to reduce the run time.

But when I run the code with a "foreach" loop, the final matrix (Jotru), which should record whether a cointegrating relationship exists ("Yes" or "No"), comes back unchanged: it is still the original matrix filled with zeros.

The "for" loop that works is as follows:

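library(urca)  # provides ca.jo()

for (a in 1:359) {
  Bstock <- colnames(Useries)[a]
  stockleft <- 360 - a
  for (i in 1:stockleft) {
    teststock <- a + i
    tstock <- colnames(Useries)[teststock]
    Stocknames <- c(Bstock, tstock)
    # Johansen trace test on the pair (Bstock, tstock)
    Jotr <- ca.jo(Useries[, Stocknames], type = "trace", ecdet = "none", K = 10)
    tvalue <- as.data.frame(Jotr@teststat)
    cval <- as.data.frame(Jotr@cval)
    # Row j equals teststock - 1 because Jotru's row names start at the second series
    j <- a + (i - 1)
    # Compare the first test statistic with column 2 of the critical-value table
    Jotru[j, a] <- ifelse(tvalue[1, 1] < cval[1, 2], "No", "Yes")
  }
}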

I also tried the code below, with and without try({}), and neither worked. In both cases the code runs without an error; the only problem is that the final matrix is not filled. I'm not sure where I went wrong, so any help would be appreciated.

CPU <- makeCluster(cores[1]-2)
registerDoParallel(CPU)
foreach(a = 1:359, .packages = c("urca"), .combine = rbind) %dopar% {
  Bstock <- colnames(Useries)[a]
  stockleft <- 360 - a
  for (i in 1:stockleft) {
    teststock <- a + i
    tstock <- colnames(Useries)[teststock]
    Stocknames <- c(Bstock, tstock)
    Jotr <- ca.jo(Useries[, Stocknames], type = "trace", ecdet = "none", K = 10)
    tvalue <- Jotr@teststat
    tvalue <- as.data.frame(tvalue)
    cval <- Jotr@cval
    cval <- as.data.frame(cval)
    j <- a + (i - 1)
    Jotru[j, a] <- ifelse(tvalue[1, 1] < cval[1, 2], "No", "Yes")
  }
}
stopCluster(CPU)
toc()

EDIT - Packages: the "foreach" loop uses the "parallel", "foreach" and "doParallel" packages. urca is the only package used inside the loop, and xts is used to create the series that the loop reads. Everything else is base R.

Edit 2 - Useries - Data file: the file is about 13 MB.

https://github.com/AvisR/AvisR/blob/main/Useries.csv

https://drive.google.com/file/d/1r3pLwvYxHdnxq1i9hP2Jso8g4qzpq4ds/view?usp=sharing

Jotru is a 359 x 359 matrix that holds the results. Its lower triangle (including the diagonal) is filled with "Yes" or "No" when the "for" loop runs:

Jotru <- matrix(0, 359, 359)
rownames(Jotru) <- colnames(Useries)[-1]
colnames(Jotru) <- colnames(Useries)[-360]
Jotru <- as.data.frame(Jotru)

Solution

  • In case anyone is having the same issue, I figured out how to make the "foreach" loop work.

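    The key difference is that a "for" loop runs in the calling environment, so every object it creates or modifies (including Jotru) is kept there. With "foreach" and %dopar%, each iteration runs on a worker process that has its own copy of the data, so assigning to Jotru inside the loop never changes the Jotru in the main session. Only the value returned by each iteration is sent back.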
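
    The behaviour described above can be seen in a minimal sketch. It assumes a two-worker cluster and a toy 3 x 3 matrix m, both made up for illustration and unrelated to the real data: the assignment inside %dopar% changes only the worker's copy, and only the value of the last expression comes back.

    ```r
    library(doParallel)  # also attaches foreach and parallel

    cl <- makeCluster(2)       # assumption: two local workers
    registerDoParallel(cl)

    m <- matrix(0, 3, 3)       # toy stand-in for the results matrix

    res <- foreach(a = 1:3, .combine = c) %dopar% {
      m[a, a] <- 1   # modifies the worker's private copy of m only
      m[a, a]        # last expression: the value sent back to the caller
    }

    stopCluster(cl)

    all(m == 0)      # TRUE: the caller's m was never touched
    res              # the returned values, one per iteration
    ```

    The returned values are the only channel back to the main session, which is why the result of the loop has to be assigned to an object.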

    To address this, I did three things:

    1. Result - A "foreach" loop only returns a result, so I assigned the entire "foreach" loop to an object (Jotruraw), which stores the combined results.

    2. Return value - A "foreach" loop doesn't keep objects assigned inside it, so the last expression of the loop body returns everything I wanted to extract: return(c(j,a,Jotru[j,a])).

    3. Reshaping - I then reformatted the result into the structure I wanted (Jotru). Note that c() coerces mixed values to a common type, so even the numbers in the result are stored as characters and I had to convert them back.

    The following code gives me the same result as the "for" loop above. (My data is slightly different now, which changed the code a bit.)
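
    The three steps above can be sketched on toy data. This is an illustration under assumptions, not the code run on the real series: n = 5 and the even/odd rule standing in for the ca.jo() decision are hypothetical. Returning a small data frame per iteration, rather than c(j, a, flag), keeps the indices numeric, so no conversion is needed afterwards.

    ```r
    library(doParallel)  # also attaches foreach and parallel

    n <- 5                     # hypothetical number of series

    cl <- makeCluster(2)       # assumption: two local workers
    registerDoParallel(cl)

    # Steps 1 and 2: assign the loop's result and return one row per pair
    raw <- foreach(a = seq_len(n - 1), .combine = rbind) %dopar% {
      b <- (a + 1):n           # partners of series a
      data.frame(
        row  = b - 1,          # same as j = a + (i - 1) with b = a + i
        col  = a,
        flag = ifelse((a + b) %% 2 == 0, "Yes", "No"),  # stand-in rule
        stringsAsFactors = FALSE
      )
    }

    stopCluster(cl)

    # Step 3: reshape into the lower-triangular table
    out <- matrix("", n - 1, n - 1)
    out[cbind(raw$row, raw$col)] <- raw$flag
    ```

    Here out is a plain character matrix, which allows all cells to be filled in one indexed assignment instead of a loop.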

    CPU <- makeCluster(cores[1]-2)
    registerDoParallel(CPU)
    # Assign the loop: each iteration's return value is row-bound into Jotruraw
    Jotruraw <- foreach(a = 1:439, .packages = c("urca", "parallel", "foreach", "doParallel"), .combine = rbind) %dopar% {
      Bstock <- colnames(Useries)[a]
      stockleft <- 440 - a
      # No parallel backend is registered on the workers, so this inner
      # %dopar% falls back to sequential execution within each worker
      foreach(i = 1:stockleft, .packages = c("urca"), .combine = rbind) %dopar% {
        teststock <- a + i
        tstock <- colnames(Useries)[teststock]
        Stocknames <- c(Bstock, tstock)
        Jotr <- ca.jo(Useries[, Stocknames], type = "trace", ecdet = "none", K = 12)
        tvalue <- as.data.frame(Jotr@teststat)
        cval <- as.data.frame(Jotr@cval)
        j <- a + (i - 1)
        # This assignment only changes the worker's copy of Jotru
        Jotru[j, a] <- ifelse(tvalue[1, 1] < cval[1, 2], "No", "Yes")
        # Return row index, column index and flag; c() coerces all three to character
        return(c(j, a, Jotru[j, a]))
      }
    }
    stopCluster(CPU)
    
    Jotruraw <- as.data.frame(Jotruraw)
    Jotruraw$V1 <- as.numeric(Jotruraw$V1)
    Jotruraw$V2 <- as.numeric(Jotruraw$V2)
    
    # Copy each (row, column, flag) triple into the results table
    for (i in seq_len(nrow(Jotruraw))) {
      a <- Jotruraw$V1[i]
      b <- Jotruraw$V2[i]
      flag <- Jotruraw$V3[i]
      Jotru[a, b] <- flag
    }
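
    A possible alternative to the nested loops, sketched under assumptions: enumerate every pair once with combn() and run a single "foreach" over the rows, so that each pair is its own task. The values n = 5 and the even/odd rule are hypothetical stand-ins for the real series and the ca.jo() decision.

    ```r
    library(doParallel)  # also attaches foreach and parallel

    n <- 5                     # hypothetical number of series
    pairs <- t(combn(n, 2))    # one row per pair, column 1 < column 2

    cl <- makeCluster(2)       # assumption: two local workers
    registerDoParallel(cl)

    flags <- foreach(k = seq_len(nrow(pairs)), .combine = c) %dopar% {
      a <- pairs[k, 1]
      b <- pairs[k, 2]
      # Stand-in rule; the real body would run ca.jo() on columns a and b
      if ((a * b) %% 2 == 0) "Yes" else "No"
    }

    stopCluster(cl)

    # Row b - 1, column a matches the layout of the results table
    out <- matrix("", n - 1, n - 1)
    out[cbind(pairs[, 2] - 1, pairs[, 1])] <- flags
    ```

    Results come back in iteration order by default, so flags lines up with the rows of pairs. With many very cheap tasks the scheduling overhead may outweigh the gain; this sketch has not been timed on the real data.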