r, correlation, pearson

Not getting expected correlation values - R cor()


When computing correlations with R's cor(), I got NA for every entry except the diagonal, even though I asked for NAs to be dropped pairwise. Only after explicitly removing the rows containing NAs did I get the expected result. Have I misunderstood the arguments?

I tried

> c <- Result_table[,.SD,.SDcols=c("organic_account_countsession", "organic_account_countsession")]
> b <- cor(c, use="pairwise.complete.obs")

                             organic_account_countsession organic_account_countsession
organic_account_countsession                            1                           NA
organic_account_countsession                           NA                            1

Also tried this

> b <- cor(c, na.rm=TRUE) 

Still got the same result.

Only when I do

c <- c[complete.cases(c)]
b <- cor(c)

                             organic_account_countsession organic_account_countsession
organic_account_countsession                            1                            1
organic_account_countsession                            1                            1

I get all 1s, which is what I expect, since I am computing the correlation of a variable with itself.

(Note: the variable has non-zero variance, so the NAs are not caused by zero variance.)


Solution

  • This turned out to be a different error altogether on my part.

    I had loaded the h2o package in addition to the stats package. h2o exports its own cor() function, which masks stats::cor() and behaves differently. Rebinding the name to the stats version:

    cor <- stats::cor
    

    solved the problem.
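
To illustrate the fix above, here is a minimal sketch of calling the stats version explicitly and checking which package a bare cor resolves to. The vector x and matrix m are made-up stand-ins for the two identical columns of Result_table, not the original data.

```r
# Hypothetical data: one variable with a missing value, duplicated into
# two columns (stand-in for the two identical columns in the question).
x <- c(1, 2, NA, 4, 5)
m <- cbind(a = x, b = x)

# A namespace-qualified call always uses the stats implementation,
# regardless of which other packages have been attached.
r <- stats::cor(m, use = "pairwise.complete.obs")
print(r)

# Report the namespace that the unqualified name `cor` currently resolves to.
# If another attached package masks stats::cor, this shows that package's name.
print(environmentName(environment(cor)))
```

With only the default packages attached, the second print should report "stats", and r should be a 2x2 matrix of 1s up to floating-point error. Writing stats::cor(...) at each call site avoids the masking problem without rebinding cor in the global environment.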