I am pretty new to Python. I am currently using it to read an arff file:
import arff
for row in arff.load('cpu.arff'):
    x = row
    print(x)
Part of the sample output looks like this:
<Row(125.0,256.0,6000.0,256.0,16.0,128.0,198.0)>
<Row(29.0,8000.0,32000.0,32.0,8.0,32.0,269.0)>
<Row(29.0,8000.0,32000.0,32.0,8.0,32.0,220.0)>
<Row(29.0,8000.0,32000.0,32.0,8.0,32.0,172.0)>
<Row(29.0,8000.0,16000.0,32.0,8.0,16.0,132.0)>
<Row(26.0,8000.0,32000.0,64.0,8.0,32.0,318.0)>
<Row(23.0,16000.0,32000.0,64.0,16.0,32.0,367.0)>
Only the last column is the label; the rest of the columns are the attributes. How can I store them in arrays? I want to assign the last column to y and the first six columns to x, and then run cross-validation on the data from the arff file.
Or is there an approach that separates the attributes and the label from an arff file automatically?
Row objects from the arff module support typical Python slicing, so you can easily separate the data from the labels:
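import arff
X = []  # attribute values, one entry per row
y = []  # labels, one per row
for row in arff.load('cpu.arff'):
    X.append(row[:-1])  # every column except the last
    y.append(row[-1])   # the last column is the label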
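Since you mention cross-validation, here is a minimal sketch of how the X and y lists could then be split into folds. It is only an illustration under a few assumptions: the rows are plain tuples copied from your sample output (stand-ins for the Row objects, so it runs without cpu.arff), and `k_fold_indices` is a hypothetical helper written for this example, not part of the arff module.

```python
# Stand-in rows copied from the sample output in the question. Plain tuples
# are used instead of arff Row objects so the sketch runs without the file.
rows = [
    (125.0, 256.0, 6000.0, 256.0, 16.0, 128.0, 198.0),
    (29.0, 8000.0, 32000.0, 32.0, 8.0, 32.0, 269.0),
    (29.0, 8000.0, 32000.0, 32.0, 8.0, 32.0, 220.0),
    (29.0, 8000.0, 32000.0, 32.0, 8.0, 32.0, 172.0),
    (29.0, 8000.0, 16000.0, 32.0, 8.0, 16.0, 132.0),
    (26.0, 8000.0, 32000.0, 64.0, 8.0, 32.0, 318.0),
]

# Same slicing as above: all values but the last are attributes.
X = [list(row[:-1]) for row in rows]
y = [row[-1] for row in rows]


def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds.

    Hypothetical helper for illustration; it does not shuffle the data.
    """
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop


for train_idx, test_idx in k_fold_indices(len(X), k=3):
    X_train = [X[i] for i in train_idx]
    y_train = [y[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_test = [y[i] for i in test_idx]
    # Fit on the training part and evaluate on the test part here.
    print(len(X_train), len(X_test), y_test)
```

The folds here are contiguous and unshuffled, which keeps the sketch short; for real data you would probably want to shuffle the indices first, or use a library routine that does the splitting for you.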