I'm trying to compute a numerical subgradient of a convex function. My test subject is the Wolfe function. It doesn't need to be super accurate, so I tried a normal finite differential in both directions: (f(x-h)-f(x+h))/2h. In code:
delta = 1e-10;
subgradient = zeros(length(xToEvaluate),1);
for i = 1 : length(xToEvaluate)
deltaX = xToEvaluate;
deltaX(i) = xToEvaluate(i) + delta;
f1 = funct( deltaX );
deltaX(i) = xToEvaluate(i) - delta;
f2 = funct( deltaX );
subgradient(i,1) = (f1 - f2) / (2 * delta);
end
At the exact minimum of the function, at (-1 ,0), I get some things at the magnitude 1e-7
, so perfectly fine. As I move to something like (-1, 0.1) or (-1, 1e-6), I get a subgradient with second component of about 16
.
I'm aware that low deltas might introduce rounding errors, but it doesn't get better as I increase delta.
My second try was a one-dimensional five-point stencil, but even with deltas of around 1e-3
the weird 16
keeps popping up...
delta = 1e-3;
subgradient = zeros(length(xToEvaluate),1);
for i = 1 : length(xToEvaluate)
xPlusTwo = xToEvaluate;
xPlusOne = xToEvaluate;
xMinusTwo = xToEvaluate;
xMinusOne = xToEvaluate;
xPlusTwo(i) = xToEvaluate(i) + 2*delta;
xPlusOne(i) = xToEvaluate(i) + delta;
xMinusTwo(i) = xToEvaluate(i) - 2*delta;
xMinusOne(i) = xToEvaluate(i) - delta;
subgradient(i,1) = (-funct(xPlusTwo) + 8*funct(xPlusOne) - 8*funct(xMinusOne) + funct(xMinusTwo)) / (12*delta);
end
Anyone got an idea what this is all about?
If you work out the gradient of that Wolfe function, you come up with:
if x<=0;
dfx = 9 - 81*x.^8;
dfy = 16*sign(y);
elseif x>=abs(y);
dfx = 5*0.5./sqrt(9*x.^2 + 16*y.^2)*9*2.*x;
dfy = 5*0.5./sqrt(9*x.^2 + 16*y.^2)*16*2.*y;
else
dfx = 9;
dfy = 16*sign(y);
end
So as you can see, the second component of the gradient for x<=0
is 16*sign(y)
, thus it is zero when y==0
, +-16
otherwise.
BTW, it doesn't look like the exact minimum lies at [-1 0]
, but rather at [-0.7598 0]
= [-(1/9)^(1/8) 0]