I'm trying to count how many instances in an .arff file have a specific value for a given attribute.
However, I can only seem to select the values declared for the attribute, not the values that actually appear in the data.
In this case, I'm trying to count how many times Win appears in the data, but my code only returns the label Win
from the attribute declaration.
Here's what I'm using:
// Create the class to load the data
package weka;

import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class DatasetLoading {

    public static Instances loadData(String location) {
        try {
            return DataSource.read(location);
        }
        catch (Exception e) {
            System.err.println("Failed to load data from: " + location);
            e.printStackTrace();
            return null;
        }
    }

    public static void main(String[] args) {
        String dataLocation = "C:/Users/Emil/Downloads/Week 1/Arsenal_TRAIN1.arff";
        Instances train = loadData(dataLocation);
        System.out.println(train);
    }
}
// Select the counts of values that appear in the data (separate file, same package)
package weka;

import weka.core.Instances;

public class test_learning {

    public static void main(String[] args) throws Exception {
        String arff = "C:/Users/Emil/Downloads/Week 1/Arsenal_TRAIN1.arff";
        Instances data = DatasetLoading.loadData(arff);
        System.out.print("Num of Wins = " + data.attribute(2).value(2));
    }
}
Actual output: Num of Wins = Win
Expected output: Num of Wins = 12
Example of the data file:
@relation Arsenal-weka.filters.unsupervised.attribute.Remove-R3
@attribute Leno {0,1}
@attribute Tierney {0,1}
@attribute class {Loss,Draw,Win}
@data
1,0,Loss
1,0,Loss
0,1,Draw
1,0,Draw
0,0,Win
0,1,Win
1,1,Win
0,1,Win
1,1,Win
1,0,Win
1,1,Loss
0,1,Draw
1,1,Draw
1,1,Draw
0,0,Win
1,0,Win
0,1,Win
1,1,Win
1,1,Win
1,1,Win
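data.attribute(2).value(2) retrieves the third nominal label declared for the third attribute (Javadoc), which is why it prints Win rather than a count.
Instead, you could use the Instances.attributeStats(int) method (Javadoc) to get the statistics for the third attribute; the nominalCounts array of the returned AttributeStats (Javadoc) holds one count per label, in the order the labels are declared.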
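A minimal sketch of that approach, assuming a nominal attribute. The class name CountClassValues and the helper countLabel are made up for illustration and are not part of Weka. The Weka calls used are Attribute.indexOfValue(String), Instances.attributeStats(int) and the AttributeStats.nominalCounts field.

```java
import weka.core.AttributeStats;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

// Hypothetical helper class, not part of Weka.
public class CountClassValues {

    // Returns how many instances have the given nominal label
    // for the attribute at attrIndex (0-based).
    public static int countLabel(Instances data, int attrIndex, String label) {
        // Look up the position of the label in the attribute declaration;
        // indexOfValue returns -1 if the label is not declared.
        int labelIndex = data.attribute(attrIndex).indexOfValue(label);
        if (labelIndex < 0) {
            throw new IllegalArgumentException("Label not declared: " + label);
        }
        // nominalCounts has one entry per declared label
        // (it is only populated for nominal attributes).
        AttributeStats stats = data.attributeStats(attrIndex);
        return stats.nominalCounts[labelIndex];
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("Usage: CountClassValues <path-to-arff>");
            return;
        }
        Instances data = DataSource.read(args[0]);
        // Attribute 2 is "class" in the example file from the question.
        System.out.println("Num of Wins = " + countLabel(data, 2, "Win"));
    }
}
```

If the file contains exactly the 20 rows shown in the question, this should report 12 for Win. Looking the label up by name with indexOfValue avoids hard-coding its position, so the count stays correct if the order of labels in the @attribute line changes.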