I am attempting to write a gradient descent function in R that uses backtracking line search to determine the step size. Ultimately, I want to find the minimizer of a function (let's say f). Mathematically, the procedure should begin with setting i = 1, xi = xi-1 - backtrack(x, f, alpha, epsilon)(f'(xi-1). If the absolute value of f(xi) - f(xi-1) is less than epsilon, then the function should return xi. Else, it should increment i by 1 and repeat the previous step. The function backtrack then can calculate a step size (t) by setting t = 1 and if f(x - tf'(x)) is less than f(x) - alpha*tf'(x)^2, then set the step size to beta(t). Else, the function should stop and return the value of t.
backtrack_desc <- function(fn, deriv, start, alpha, beta, epsilon) {
x = start
while(TRUE) {
step_size = backtrack(fn, deriv, x, alpha, beta)
new_x = fn(x) - step_size * deriv(x)
if(abs(deriv(new_x)) <= epsilon) {
break
}
x = new_x
}
return(x)
}
backtrack <- function(fn, deriv, x, alpha, beta) {
t = 1
while(fn(x - t * deriv(x)) > (fn(x) - alpha * t * deriv(x)^2)) {
t = beta * t
}
}
# This should return something close to zero
backtrack_desc(function(x) x^2, function(x) 2 * x, start = 10,
alpha = .03, beta = .8, epsilon = 1e-10)
backtrack_desc(function(x) x^2, function(x) 2 * x, start = 1,
alpha = .03, beta = .8, epsilon = 1e-10)
The function should return a small number close to zero, but when I run it, I receive this error message: Error in if (abs(deriv(new_x)) <= epsilon) { : argument is of length zero.
You have two problems. First, backtrack
doesn't return anything, so step_size
is always NULL
. Secondly, I think new_x
should be x - step_size * deriv(x)
rather than f(x) - step_size * deriv(x)
.
Fixing these, we have:
backtrack_desc <- function(fn, deriv, start, alpha, beta, epsilon) {
x = start
while(TRUE) {
step_size = backtrack(fn, deriv, x, alpha, beta)
new_x = x - step_size * deriv(x)
if(abs(deriv(new_x)) <= epsilon) {
break
}
x = new_x
}
return(x)
}
backtrack <- function(fn, deriv, x, alpha, beta) {
t = 1
while(fn(x - t * deriv(x)) > (fn(x) - alpha * t * deriv(x)^2)) {
t = beta * t
}
return(t)
}
Which results in
# This should return something close to zero
backtrack_desc(function(x) x^2, function(x) 2 * x, start = 10,
alpha = .03, beta = .8, epsilon = 1e-10)
#> [1] 8.082813e-11
# This should return something close to 1
backtrack_desc(function(x) (x - 1)^2, function(x) 2 * x - 2, start = 4,
alpha = .03, beta = .8, epsilon = 1e-10)
#> [1] 1
# This should return something close to pi
backtrack_desc(function(x) cos(x), function(x) -sin(x), start = 1,
alpha = .03, beta = .8, epsilon = 1e-10)
#> [1] 3.141593