
Question on the nloptr package in R for optimisation


I am trying to optimize a function using the nloptr package in R, as shown below:

Some_Dataframe_Fixed <- data.frame(x = 1:2, y = 4:5)

eval_f <- function(x) {
  Some_Arg_Fixed_Mat <- as.matrix(Some_Dataframe_Fixed)
  100 * (x[2] - x[1] * x[1]) ^ 2 + (1 - x[1]) ^ 2
}

In the above function, I am converting an external data frame, Some_Dataframe_Fixed, to a matrix inside the function.

I would like to know from the experts here whether such an implementation is efficient. Will the data frame Some_Dataframe_Fixed be converted afresh every time nloptr calls that function during optimization? Or should I always convert the data frame outside of the function, since the conversion does not depend on x?

Thanks for any pointers.


Solution

  • Narrowly speaking, yes: the body of the objective function runs in full on every call, so as.matrix() is repeated each time. You should do as little computation as possible within the objective function, since it will typically be called many times during the optimization process.

    df_fixed <- data.frame(x = 1:2, y = 4:5)
    # Version that converts the data frame on every call
    f1_conv <- function(x) {
      fixed_mat <- as.matrix(df_fixed)
      100 * (x[2] - x[1] * x[1]) ^ 2 + (1 - x[1]) ^ 2
    }
    # Version that skips the conversion
    f1_noconv <- function(x) {
      100 * (x[2] - x[1] * x[1]) ^ 2 + (1 - x[1]) ^ 2
    }
    microbenchmark::microbenchmark(f1_conv(1:2), f1_noconv(1:2))
    

    You can see that the version that doesn't convert the data frame is faster ...

    Unit: nanoseconds
               expr   min      lq      mean  median      uq      max neval cld
       f1_conv(1:2) 27752 29395.5 185535.83 30166.5 31169.0 15456848   100   a
     f1_noconv(1:2)   601   636.5  36988.98   721.0   751.5  3618146   100   a
    

    ... but note that the units here are nanoseconds: the median call of the slower function takes about 30 microseconds, and even its slowest execution takes only about 0.015 seconds. Whether this level of optimization is worth worrying about depends on how many function evaluations a single optimization needs, how many times you will run the optimization (i.e. the relative contribution of the optimization to your overall time budget), and the cost of the conversion relative to the rest of the computation in your objective function (see Amdahl's Law).
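
If you do decide to hoist the conversion, one way to do it without relying on a global variable is a closure. The following is a minimal sketch, not the only approach: make_objective is a hypothetical helper name introduced here for illustration, and the starting point, algorithm, and tolerances passed to nloptr are assumptions chosen only to make the example complete.

```r
# Hypothetical helper (not part of nloptr): converts the data frame once
# and returns an objective function that captures the resulting matrix.
make_objective <- function(df) {
  fixed_mat <- as.matrix(df)  # runs once, when the objective is built
  function(x) {
    # fixed_mat is visible here without being recomputed on each call;
    # the expression from the question happens not to use it.
    100 * (x[2] - x[1] * x[1]) ^ 2 + (1 - x[1]) ^ 2
  }
}

df_fixed <- data.frame(x = 1:2, y = 4:5)
eval_f <- make_objective(df_fixed)

# Run the optimization only if nloptr is installed.
# x0, the algorithm, and the stopping criteria are illustrative assumptions.
if (requireNamespace("nloptr", quietly = TRUE)) {
  res <- nloptr::nloptr(
    x0 = c(-1.2, 1),
    eval_f = eval_f,
    opts = list(algorithm = "NLOPT_LN_NELDERMEAD",
                xtol_rel = 1e-8,
                maxeval = 5000)
  )
  print(res$solution)
}
```

The conversion now happens exactly once, when make_objective() is called, however many times the optimizer evaluates eval_f.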