machine-learning libsvm scikit-learn pca

Using libsvm format in scikit


I'm very new to all these tools. I have been using libsvm and want to use scikit-learn, but all of my inputs are in libsvm format, something like this:

 +1 1:1 36:1
 +1 1:1 11:1 25:1 36:1

I used the load_svmlight_files function to load the data. After loading, my training data looks like this:

 (1, 0) 1.0
 (1, 35) 1.0
 (2, 0) 1.0
 (2, 10) 1.0
 (2, 24) 1.0
 (2, 35) 1.0

But when I try to use the pylab scatter function, it raises:

   ValueError: setting an array element with a sequence.

How can I convert my data into a two-dimensional array that scatter can use?


Solution

  • sklearn.datasets.load_svmlight_file loads the data as a scipy.sparse CSR matrix, while matplotlib's scatter plot expects a NumPy array. If you think the dense version of your sparse data will fit in memory, you can call the .toarray() method on it to materialize it as a NumPy array.

    Also, a scatter plot only makes sense for 2D data, that is, two coordinates per point.
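
The conversion described above can be sketched as follows. This is a minimal example, not a definitive recipe: it reads the two sample rows from the question out of an in-memory buffer rather than a file, passes `zero_based=False` on the assumption that the feature indices start at 1, and picks two columns (features 1 and 11) purely for illustration.

```python
from io import BytesIO

from sklearn.datasets import load_svmlight_file

# The two sample rows from the question, in libsvm format.
# A real script would pass a file path instead of this buffer.
raw = b"+1 1:1 36:1\n+1 1:1 11:1 25:1 36:1\n"

# X is a scipy.sparse CSR matrix, y is a NumPy array of labels.
# zero_based=False assumes the indices in the file start at 1.
X, y = load_svmlight_file(BytesIO(raw), zero_based=False)

# Materialize the sparse matrix as a dense NumPy array.
# Only do this if the full matrix fits in memory.
X_dense = X.toarray()

# A scatter plot needs two 1-D coordinate arrays, so pick two columns.
# Columns 0 and 10 (features 1 and 11) are an arbitrary choice here.
x0 = X_dense[:, 0]
x1 = X_dense[:, 10]

print(X_dense.shape)
```

The arrays `x0` and `x1` can then be passed to the plotting call, for example `pylab.scatter(x0, x1, c=y)`. With more than two informative features, which two columns (or which 2D projection) to plot is a choice you have to make yourself.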