I need to calculate the sales correlation of 5000 products, which will result in a 5000 by 5000 correlation matrix. I am trying to do this in pandas using df.corr(), but it is causing memory issues. Are there more efficient ways to achieve this?
Use np.corrcoef on the underlying NumPy array instead of df.corr(). I was able to process the matrix in under a minute using this.
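For reference, here is a minimal sketch of how that might look. It assumes each column of the DataFrame is one product and each row is one time period; the `sales` frame, its size, and the `product_i` column names are made up for illustration. One caveat: np.corrcoef does not skip missing values the way df.corr() does, so any NaN in a column will propagate into that product's correlations.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data: 200 time periods (rows) x 50 products (columns).
# Replace with your real sales DataFrame.
rng = np.random.default_rng(0)
sales = pd.DataFrame(
    rng.normal(size=(200, 50)),
    columns=[f"product_{i}" for i in range(50)],
)

# rowvar=False tells NumPy that each COLUMN is a variable (a product),
# which matches the layout df.corr() expects.
corr_values = np.corrcoef(sales.to_numpy(dtype=np.float64), rowvar=False)

# Optional: wrap the result back into a labeled DataFrame.
corr = pd.DataFrame(corr_values, index=sales.columns, columns=sales.columns)
```

With 5000 products the result holds 25 million float64 values, which is about 200 MB for the matrix itself, so the output alone should fit comfortably on most machines.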