Consider the following usage of liblinear (http://liblinear.bwaldvogel.de/):
double C = 1.0; // cost of constraints violation
double eps = 0.01; // stopping criteria
Parameter param = new Parameter(SolverType.L2R_L2LOSS_SVC, C, eps);
Problem problem = new Problem();
double[] GROUPS_ARRAY = {1, 0, 0, 0};
problem.y = GROUPS_ARRAY;
int NUM_OF_TS_EXAMPLES = 4;
problem.l = NUM_OF_TS_EXAMPLES;
problem.n = 2;
FeatureNode[] instance1 = { new FeatureNode(1, 1), new FeatureNode(2, 1) };
FeatureNode[] instance2 = { new FeatureNode(1, -1), new FeatureNode(2, 1) };
FeatureNode[] instance3 = { new FeatureNode(1, -1), new FeatureNode(2, -1) };
FeatureNode[] instance4 = { new FeatureNode(1, 1), new FeatureNode(2, -1) };
FeatureNode[] instance5 = { new FeatureNode(1, 1), new FeatureNode(2, -0.1) };
FeatureNode[] instance6 = { new FeatureNode(1, -0.1), new FeatureNode(2, 1) };
FeatureNode[] instance7 = { new FeatureNode(1, -0.1), new FeatureNode(2, -0.1) };
FeatureNode[][] testSetWithUnknown = {
    instance5,
    instance6,
    instance7
};
FeatureNode[][] trainingSetWithUnknown = {
    instance1,
    instance2,
    instance3,
    instance4
};
problem.x = trainingSetWithUnknown;
Model m = Linear.train(problem, param);
for (int i = 0; i < trainingSetWithUnknown.length; i++) {
    System.out.println(" Train.instance = " + i + " => " + Linear.predict(m, trainingSetWithUnknown[i]));
}
System.out.println("---------------------");
for (int i = 0; i < testSetWithUnknown.length; i++) {
    System.out.println(" Test.instance = " + i + " => " + Linear.predict(m, testSetWithUnknown[i]));
}
Here is the output:
iter 1 act 1.778e+00 pre 1.778e+00 delta 6.285e-01 f 4.000e+00 |g| 5.657e+00 CG 1
Train.instance = 0 => 1.0
Train.instance = 1 => 0.0
Train.instance = 2 => 0.0
Train.instance = 3 => 0.0
---------------------
Test.instance = 0 => 1.0
Test.instance = 1 => 1.0
Test.instance = 2 => 0.0
Instead of integer (hard) predictions, I need probabilistic predictions. The command-line tool has a -b option for this, but I couldn't find an equivalent when calling the library directly from code. I also looked at the source (https://github.com/bwaldvogel/liblinear-java/blob/master/src/main/java/de/bwaldvogel/liblinear/Predict.java); apparently there is no option for probabilistic prediction when the functions are called from code. Is that correct?
UPDATE: I ended up using the liblinear code from https://github.com/bwaldvogel/liblinear-java . In the file Predict.java I changed
private static boolean flag_predict_probability = false;
to
private static boolean flag_predict_probability = true;
and used
SolverType.L2R_LR
But I am still getting integer class labels. Any ideas?
To get probabilities, you need to change the library code. The class label is chosen inside the
public static double predictValues(Model model, Feature[] x, double[] dec_values) {
method in Linear.java. This part:
if (model.nr_class == 2) {
    System.out.println("Two classes ");
    if (model.solverType.isSupportVectorRegression()) {
        System.out.println("Support vector");
        return dec_values[0];
    }
    else {
        System.out.println("Not Support vector");
        return (dec_values[0] > 0) ? model.label[0] : model.label[1];
    }
}
needs to be changed to
if (model.nr_class == 2) {
    System.out.println("Two classes ");
    if (model.solverType.isSupportVectorRegression()) {
        System.out.println("Support vector");
        return dec_values[0];
    }
    else {
        System.out.println("Not Support vector");
        // Return the raw decision value instead of the class label.
        return dec_values[0];
    }
}
Note that the output is still not a probability; it is the raw decision value, a linear combination of the weights and feature values. Passing it through the logistic (sigmoid) function 1 / (1 + exp(-x)), which is the two-class case of softmax, maps it into the range (0, 1).
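As a rough illustration of that last step, here is a minimal, library-free sketch of turning a decision value into a two-class probability with the logistic (sigmoid) function, the two-class case of softmax. The class name DecisionToProbability, the helper methods, and the weights are all made up for the example; they are not part of liblinear, and the weights are not taken from a trained model. It also assumes that a positive decision value corresponds to model.label[0], as in the ternary expression above. It may also be worth checking whether your copy of Linear.java already has a predictProbability method, which would avoid patching the library.

```java
public class DecisionToProbability {

    /** Logistic (sigmoid) function: maps any real value into (0, 1). */
    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    /** Decision value w . x for a dense feature vector (no bias term). */
    static double decisionValue(double[] w, double[] x) {
        double sum = 0.0;
        for (int i = 0; i < w.length; i++) {
            sum += w[i] * x[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        // Hypothetical weights, chosen only for illustration.
        double[] w = {0.5, 0.5};
        // Same feature values as instance5 in the question.
        double[] x = {1.0, -0.1};

        double dec = decisionValue(w, x);
        // Assumption: a positive decision value favors model.label[0].
        double pFirst = sigmoid(dec);

        System.out.println("decision value    = " + dec);
        System.out.println("P(model.label[0]) = " + pFirst);
        System.out.println("P(model.label[1]) = " + (1.0 - pFirst));
    }
}
```

With the patched predictValues, you would pass the value returned for each instance to sigmoid in the same way.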
Also, make sure to choose a logistic regression solver, since only then can the transformed decision value be read as a probability:
SolverType.L2R_LR