matlab machine-learning svm

What is the box constraint in the output of a support vector machine in MATLAB?


In MATLAB, the function fitcsvm trains a support vector machine. Its output includes a property called BoxConstraints. I have read this post and understand what a box constraint means.

But in the answer to that post, the box constraint C is a scalar, while in MATLAB's output it is a vector whose length equals the number of samples. All of its entries are 1, and I used the default box constraint of 1 as input.

I do not understand what this output represents.

(I have read the SVM sections of The Elements of Statistical Learning and did not find this term, so I suspect it goes by a different name there. If you would like to answer, you can skip explaining the basic concepts of SVMs.)


Solution

  • As described in this answer (to the question you already linked), the scalar box constraint C defines a penalty on the sum of the slack variables. In the dual optimization problem, this corresponds to an upper bound on the Lagrange multipliers, hence the name "box constraint". As you mention, this is a scalar, not a vector.

    The BoxConstraints property, however, is an n-by-1 vector, where n is the number of training observations. Note that when calling fitcsvm you can only set C to a scalar, not a vector, so the BoxConstraints property is calculated internally.

    The Algorithms section of the fitcsvm documentation describes what actually happens under the hood (emphasis mine):

    For two-class learning, fitcsvm assigns a box constraint to each observation in the training data. The formula for the box constraint of observation j is

    $$C_j = n C_0 w_j^*$$

    where $C_0$ is the initial box constraint, and $w_j^*$ is the observation weight adjusted by Cost and Prior for observation $j$.

    The concepts of Cost, Prior, and Weights are further documented here. When calling fitcsvm, you can specify any of Cost, Prior, and Weights as name-value arguments.

    From these values, MATLAB calculates an adjusted weight wj* for each observation, which fitcsvm then uses as shown above to scale the box constraint individually. The result of this calculation is stored in the BoxConstraints property of the ClassificationSVM object. Hence, BoxConstraints holds, for every observation j, a weighted version of the scalar box constraint that you specified.

    tl;dr: fitcsvm adjusts the scalar box constraint C to account for any prior probabilities, misclassification costs, and/or observation weights that you specify. If you leave these at their defaults, BoxConstraints will simply equal C for every observation, which is exactly what you see.
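
The scaling described above can be sketched outside MATLAB to make the arithmetic concrete. The following is a minimal plain-Python illustration of the quoted formula only, not MATLAB's implementation. The function name `box_constraints` is hypothetical, and the sketch assumes the adjusted weights wj* are simply the raw observation weights normalized to sum to 1; the additional adjustment by Cost and Prior is not reproduced here.

```python
def box_constraints(c0, weights):
    """Per-observation box constraints C_j = n * C0 * w_j*.

    Assumption: w_j* is approximated by normalizing the raw weights so that
    they sum to 1. The Cost/Prior adjustment that MATLAB applies is omitted.
    """
    n = len(weights)
    total = sum(weights)
    adjusted = [w / total for w in weights]  # stand-in for w_j*
    return [n * c0 * w for w in adjusted]


# Uniform weights: each w_j* is 1/n, so the factor n cancels and C_j == C0.
uniform = box_constraints(1.0, [1.0] * 5)

# Uneven weights: the third observation counts twice as much as the others,
# so it receives a larger box constraint.
uneven = box_constraints(1.0, [1.0, 1.0, 2.0])

print(uniform)
print(uneven)
```

With uniform weights, every entry comes out equal to C0, which matches the vector of ones in the question. With uneven weights the entries differ, but under the normalization assumed here they still average to C0.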