python matlab machine-learning artificial-intelligence

How to use kmeans in Python with Pearson correlation as the distance measure, similar to MATLAB?


In MATLAB, the kmeans function has a 'Distance' parameter that can be set to 'correlation'. I want similar functionality in Python. How do I do that? Are there any built-in functions that do this? Or are there any custom-built solutions available? If not, how can one go about writing such code?


Solution

  • In Python, you would normally use the scikit-learn or SciPy library for this kind of computation.

    However, their k-means implementations (sklearn.cluster.KMeans and scipy.cluster.vq.kmeans) only support the Euclidean distance metric. The SciPy code seems easier to edit, so you could consider extending it and opening a pull request to add that feature to the library.

    Otherwise, someone already asked a related question on Stack Overflow a while back.
    It is not about k-means clustering specifically, but about clustering based on correlation. Maybe you will find it helpful.
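
If you want to write it yourself, here is a minimal NumPy sketch of one possible approach; it is not MATLAB's implementation and not part of scikit-learn or SciPy. It relies on the fact that once each row is centered to zero mean and scaled to unit length, the Pearson correlation between two rows is just their dot product, and the squared Euclidean distance between them equals 2 * (1 - r). The names standardize_rows, correlation_kmeans and init_idx are made up for this example, and the choice to re-normalize the centroids after each update is an assumption on my part.

```python
import numpy as np


def standardize_rows(X):
    """Center each row to zero mean and scale it to unit L2 norm."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(Xc, axis=1, keepdims=True)
    if np.any(norms == 0):
        raise ValueError("constant rows have undefined correlation")
    return Xc / norms


def correlation_kmeans(X, k, n_iter=100, seed=0, init_idx=None):
    """Sketch of k-means using 1 - Pearson correlation as the distance.

    init_idx (hypothetical parameter): optional row indices to use as the
    initial centroids; if omitted, k distinct rows are picked at random.
    """
    Z = standardize_rows(X)
    if init_idx is None:
        rng = np.random.default_rng(seed)
        init_idx = rng.choice(len(Z), size=k, replace=False)
    centroids = Z[np.asarray(init_idx)]  # fancy indexing returns a copy
    labels = np.full(len(Z), -1)

    for _ in range(n_iter):
        # Rows and centroids are zero-mean and unit-norm, so the dot
        # product is the Pearson correlation. Maximizing it is the same
        # as minimizing the correlation distance 1 - r.
        corr = Z @ centroids.T
        new_labels = np.argmax(corr, axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels

        for j in range(k):
            members = Z[labels == j]
            if len(members) == 0:
                continue  # empty cluster: keep the previous centroid
            c = members.mean(axis=0)  # still zero-mean
            norm = np.linalg.norm(c)
            if norm > 0:
                centroids[j] = c / norm  # assumption: re-normalize
    return labels, centroids
```

You would call it as labels, centroids = correlation_kmeans(X, k), where X has one observation per row. A quicker but rougher alternative is to pass standardize_rows(X) to an ordinary Euclidean k-means; that gives the same distances between data points, but the centroids are then not re-normalized, so the results can differ.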