python arff

Error while loading an .arff file with scikit-learn


I would like to use an Attribute-Relation File Format (.arff) file with scikit-learn for a classification problem. The code runs fine on a Windows 10 machine, but when I run the same code on my other machine with Ubuntu 18.04.1 it throws a confusing error. Here is the code for loading the .arff file:

import arff, numpy as np
dataset = arff.load(open('mydataset.arff'))
mydata = np.array(dataset['data'])

And the error I am getting is this:

Traceback (most recent call last):
  File "/home/user/Desktop/ml_classification.py", line 14, in <module>
    mydata = np.array(dataset['data'])
TypeError: 'generator' object is not subscriptable

What could be the reason for this error and why does it only occur on one machine and not the other?


Solution

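  • I'm assuming you are using an old or unsupported ARFF library on the Ubuntu machine. To see the details of the ARFF package you have installed, run pip show arff. On my first attempt it showed the URL of a Google Code site (now defunct). That package and liac-arff are both imported as arff, so the same import statement can resolve to different libraries on the two machines, and their load functions do not return the same kind of object. Remove the current package with pip uninstall arff and install the one at https://pypi.org/project/liac-arff/ with pip install liac-arff. Your code should then work, because arff.load in liac-arff returns a dictionary with a 'data' key.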
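To check this from Python rather than from the shell, here is a minimal sketch. The helper names installed_arff_distributions and load_arff_data are made up for illustration, and the sketch assumes the two distributions in question are published on PyPI as arff and liac-arff. The first helper reports which of them is installed. The second loads a file and fails with a clearer message if arff.load did not return the dictionary that liac-arff produces.

```python
from importlib import metadata


def installed_arff_distributions(candidates=("liac-arff", "arff")):
    """Return {distribution name: version} for each candidate that is installed."""
    found = {}
    for name in candidates:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass  # this distribution is not installed in the current environment
    return found


def load_arff_data(path):
    """Load an ARFF file and return its rows as a NumPy array (hypothetical helper)."""
    # Imported here so the check above still works when neither package is present.
    import arff
    import numpy as np

    with open(path) as fp:
        dataset = arff.load(fp)

    # liac-arff returns a dict; anything else means a different module named
    # "arff" was imported.
    if not isinstance(dataset, dict):
        raise TypeError(
            "arff.load returned %s, not a dict; installed distributions: %r"
            % (type(dataset).__name__, installed_arff_distributions())
        )
    return np.array(dataset["data"])


# Show which ARFF distributions this interpreter can see.
print(installed_arff_distributions())
```

Run the script on both machines and compare the printed dictionaries. If only arff appears on the Ubuntu machine, or both appear, that points to the mismatch described above.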