rbacktrackinggradient-descent

Error in Gradient Descent Function with backtracking line search


I am attempting to write a gradient descent function in R that uses backtracking line search to determine the step size. Ultimately, I want to find the minimizer of a function (let's say f). Mathematically, the procedure should begin with setting i = 1, xi = xi-1 - backtrack(x, f, alpha, epsilon)(f'(xi-1). If the absolute value of f(xi) - f(xi-1) is less than epsilon, then the function should return xi. Else, it should increment i by 1 and repeat the previous step. The function backtrack then can calculate a step size (t) by setting t = 1 and if f(x - tf'(x)) is less than f(x) - alpha*tf'(x)^2, then set the step size to beta(t). Else, the function should stop and return the value of t.

backtrack_desc <- function(fn, deriv, start, alpha, beta, epsilon) {
    x = start
    while(TRUE) {
        step_size = backtrack(fn, deriv, x, alpha, beta)
        new_x = fn(x) - step_size * deriv(x)
        if(abs(deriv(new_x)) <= epsilon) {
            break
        }
        x = new_x
    }
    return(x)
}

backtrack <- function(fn, deriv, x, alpha, beta) {
    t = 1
    while(fn(x - t * deriv(x)) > (fn(x) - alpha * t * deriv(x)^2)) {
        t = beta * t
    }
}

# This should return something close to zero
backtrack_desc(function(x) x^2, function(x) 2 * x, start = 10,
               alpha = .03, beta = .8, epsilon = 1e-10)
backtrack_desc(function(x) x^2, function(x) 2 * x, start = 1,
               alpha = .03, beta = .8, epsilon = 1e-10)

The function should return a small number close to zero, but when I run it, I receive this error message: Error in if (abs(deriv(new_x)) <= epsilon) { : argument is of length zero.


Solution

  • You have two problems. First, backtrack doesn't return anything, so step_size is always NULL. Secondly, I think new_x should be x - step_size * deriv(x) rather than f(x) - step_size * deriv(x).

    Fixing these, we have:

    backtrack_desc <- function(fn, deriv, start, alpha, beta, epsilon) {
      x = start
      while(TRUE) {
        step_size = backtrack(fn, deriv, x, alpha, beta)
        new_x = x - step_size * deriv(x)
        if(abs(deriv(new_x)) <= epsilon) {
          break
        }
        x = new_x
      }
      return(x)
    }
    
    backtrack <- function(fn, deriv, x, alpha, beta) {
      t = 1
      while(fn(x - t * deriv(x)) > (fn(x) - alpha * t * deriv(x)^2)) {
        t = beta * t
      }
      return(t)
    }
    

    Which results in

    # This should return something close to zero
    backtrack_desc(function(x) x^2, function(x) 2 * x, start = 10,
                   alpha = .03, beta = .8, epsilon = 1e-10)
    #> [1] 8.082813e-11
    
    # This should return something close to 1
    backtrack_desc(function(x) (x - 1)^2, function(x) 2 * x - 2, start = 4,
                   alpha = .03, beta = .8, epsilon = 1e-10)
    #> [1] 1
    
    # This should return something close to pi
    backtrack_desc(function(x) cos(x), function(x) -sin(x), start = 1,
                   alpha = .03, beta = .8, epsilon = 1e-10)
    #> [1] 3.141593