matlabmachine-learningsvmquadprog

How to implement a soft-margin SVM model using Matlab's quadprog?


Suppose we are given a training dataset {yᵢ, xᵢ}, for i = 1, ..., n, where yᵢ can either be -1 or 1 and xᵢ can be e.g. a 2D or 3D point.

In general, when the input points are linearly separable, the SVM model can be defined as follows

min 1/2*||w||²
w,b

subject to the constraints (for i = 1, ..., n)

yᵢ*(w*xᵢ - b) >= 1

This is often called the hard-margin SVM model, which is thus a constrained minimization problem, where the unknowns are w and b. We can also omit 1/2 in the function to be minimized, given it's just a constant.

Now, the documentation about Matlab's quadprog states

x = quadprog(H, f, A, b) minimizes 1/2*x'*H*x + f'*x subject to the restrictions A*x ≤ b. A is a matrix of doubles, and b is a vector of doubles.

We can implement the hard-margin SVM model using quadprog function, to get the weight vector w, as follows


Now, I am trying to implement a soft-margin SVM model. The minimization equation here is

min (1/2)*||w||² + C*(∑ ζᵢ)
w,b

subject to the constraints (for i = 1, ..., n)

yᵢ*(w*xᵢ - b) >= 1 - ζᵢ

such that ζᵢ >= 0, where is the summation symbol, ζᵢ = max(0, 1 - yᵢ*(w*xᵢ - b)) and C is a hyper-parameter.

How can this optimization problem be solved using the Matlab's quadprog function? It's not clear to me how the equation should be mapped to the parameters of the quadprog function.

The "primal" form of the soft-margin SVM model (i.e. the definition above) can be converted to a "dual" form. I did that, and I am able to get the Lagrange variable values (in the dual form). However, I would like to know if I can use quadprog to solve directly the primal form without needing to convert it to the dual form.


Solution

  • I don't see how it can be a problem. Let z be our vector of (2n + 1) variables:

    z = (w, eps, b)
    

    Then, H becomes diagonal matrix with first n values on the diagonal equal to 1 and the last n + 1 set to zero:

    H = diag([ones(1, n), zeros(1, n + 1)])
    

    Vector f can be expressed as:

    f = [zeros(1, n), C * ones(1, n), 0]'
    

    First set of constrains becomes:

    Aineq = [A1, eye(n), zeros(n, 1)]
    bineq = ones(n, 1)
    

    where A1 is a the same matrix as in primal form.

    Second set of constraints becomes lower bounds:

    lb = (inf(n, 1), zeros(n, 1), inf(n, 1))
    

    Then you can call MATLAB:

    z = quadprog(H, f, Aineq, bineq, [], [], lb);
    

    P.S. I can be mistaken in some small details, but the general idea is right.