Tags: machine-learning, svm, vlfeat

Tuning VLFeat SVM


I have 6 samples of 1-dimensional data as an example, and I'm trying to train VLFeat's SVM on them:

data:
    [188.00000000;
      168.00000000;
      191.00000000;
      150.00000000;
      154.00000000;
      124.00000000]

The first 3 samples are positive and the last 3 samples are negative.

I get the following weights (including the bias):

w: -0.6220197226 -0.0002974511

The problem is that all samples get predicted as negative, even though they are clearly linearly separable. (With both the weight and the bias negative, w·x + bias is negative for every positive sample, so everything falls on the negative side.)

For learning I use the solver type VlSvmSolverSgd with lambda = 0.01.

I'm using the C API, if it matters.

Minimal working example:

#include <iostream>

extern "C" {
#include <vl/svm.h>
}

using std::cout;
using std::endl;

void vlfeat_svm_test()
{
    vl_size const numData = 6;
    vl_size const dimension = 1;

    // Input scaled to [0, 1]; the raw values are kept for reference.
    //double x[dimension * numData] = {188.0, 168.0, 191.0, 150.0, 154.0, 124.0};
    double x[dimension * numData] = {188.0/255, 168.0/255, 191.0/255, 150.0/255, 154.0/255, 124.0/255};
    double y[numData] = {1, 1, 1, -1, -1, -1};

    double lambda = 0.01;

    VlSvm *svm = vl_svm_new(VlSvmSolverSgd, x, dimension, numData, y, lambda);
    vl_svm_train(svm);

    double const *w = vl_svm_get_model(svm);
    double bias = vl_svm_get_bias(svm);

    // Classify each training sample with the learned linear model.
    for (vl_size k = 0; k < numData; ++k)
    {
        double res = 0.0;
        for (vl_size i = 0; i < dimension; ++i)
        {
            res += x[k * dimension + i] * w[i];
        }
        int pred = ((res + bias) > 0) ? 1 : -1;

        cout << pred << endl;
    }

    cout << "w: ";
    for (vl_size i = 0; i < dimension; ++i)
        cout << w[i] << " ";
    cout << bias << endl;

    vl_svm_delete(svm);
}

Update:

I also tried scaling the input data by dividing by 255; it has no effect.

Update 2:

An extremely low lambda = 0.000001 seems to solve the problem.


Solution

  • This happens because the SVM solvers in VLFeat do not estimate the model and bias directly; instead, they use the workaround of appending a constant component to the data (as described at http://www.vlfeat.org/api/svm-fundamentals.html) and return the corresponding model weight as the bias.

    The bias term is thus part of the regularizer, and models with a higher bias are "penalized" in terms of energy. This effect is especially strong in your case, since your data are extremely low-dimensional :) Therefore you need to choose a small value of the regularization parameter LAMBDA to lower the importance of the regularizer; see the sketch below.
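
    A minimal sketch of this fix, reusing the data from the question: it trains with the tiny lambda from Update 2 and additionally raises the bias multiplier (the constant component B that VLFeat appends to each sample) via vl_svm_set_bias_multiplier(). The multiplier value of 100 is an illustrative choice, not a tuned or documented recommendation.

#include <iostream>

extern "C" {
#include <vl/svm.h>
}

int main()
{
    double x[6] = {188.0/255, 168.0/255, 191.0/255,
                   150.0/255, 154.0/255, 124.0/255};
    double y[6] = {1, 1, 1, -1, -1, -1};

    // Remedy 1: a much weaker regularizer, so a large bias weight is cheap.
    double lambda = 0.000001;

    VlSvm *svm = vl_svm_new(VlSvmSolverSgd, x, 1, 6, y, lambda);

    // Remedy 2 (illustrative value): enlarge the constant component B.
    // The returned bias equals B times its model weight, so a large B
    // yields a large effective bias from a small, lightly penalized weight.
    vl_svm_set_bias_multiplier(svm, 100.0);

    vl_svm_train(svm);

    std::cout << "w: " << vl_svm_get_model(svm)[0]
              << " bias: " << vl_svm_get_bias(svm) << std::endl;

    vl_svm_delete(svm);
    return 0;
}

    Update 2 already confirms that the tiny lambda alone separates the samples; the bias multiplier is a second, independent knob worth trying, since it lowers the bias penalty without weakening the regularization of the rest of the model.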