I am trying to export a pandas DataFrame to an .arff file so that I can use it in Weka. I have seen that the liac-arff module can be used for that purpose. Going by the documentation here, it seems I have to use
arff.dump(obj,fp)
However, I am struggling with obj (a dictionary); I'm guessing I have to create it myself. How would you suggest doing that properly for a big dataset (3,000,000 rows and 95 columns)? Is there an example you can provide of exporting a pandas DataFrame to an .arff file using Python (v2.7)?
First install the package:
$ pip install arff
Then use it in Python:
import arff
arff.dump('filename.arff',
          df.values,
          relation='relation name',
          names=df.columns)
where df is of type pandas.DataFrame. Voilà.
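Note that `pip install arff` installs a different package from liac-arff, although both are imported as `arff`. For the liac-arff route the question asks about, here is a minimal sketch of how the obj dictionary could be built. It assumes the layout I understand liac-arff to expect (the keys `description`, `relation`, `attributes` and `data`). The helper name `build_arff_obj` and the numeric-or-string type guess are my own illustration, not part of either library.

```python
import numbers


def build_arff_obj(relation, names, rows):
    """Hypothetical helper: build a dict in the layout liac-arff's dump() is assumed to accept."""
    rows = [list(r) for r in rows]
    attributes = []
    for i, name in enumerate(names):
        column = [r[i] for r in rows]
        # Assumption: a column is NUMERIC only if every value is a non-bool number;
        # anything else is written as STRING.
        is_numeric = all(
            isinstance(v, numbers.Number) and not isinstance(v, bool)
            for v in column
        )
        attributes.append((str(name), 'NUMERIC' if is_numeric else 'STRING'))
    return {
        'description': '',
        'relation': relation,
        'attributes': attributes,  # list of (name, type) pairs
        'data': rows,              # list of rows, each a list of values
    }


# Tiny made-up example data, only to show the resulting shape.
obj = build_arff_obj('weather', ['temp', 'outlook'],
                     [[21.5, 'sunny'], [18.0, 'rainy']])
```

With a DataFrame, `names` could be `df.columns` and `rows` could be `df.values.tolist()`; the result would then be passed, together with an open file object, to liac-arff's `arff.dump(obj, fp)`. Keep in mind that holding 3,000,000 rows as Python lists may need a lot of memory, so writing in chunks might be worth considering.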