roptimizationconfidence-intervalsample-size

Optimizing for global minimum


I am attempting to use optimize() to find the minimum value of n for the following function (Clopper-Pearson lower bound):

f <- function (n, p=0.5) 
 (1 + (n - p*n + 1) / 
    (p*n*qf(p= .025, df1= 2*p, df2= 2*(n - p + 1))))^-1

And here is how I attempted to optimize it:

n_clop <- optimize(f.1, c(300,400), maximum = FALSE, p=0.5)
n_clop

I did this over the interval [300,400] because I suspect the value to be between within it but ultimately I would like to do the optimization between 0 and infinity. It seems that this command is producing a local minimum because no matter the interval it produces the lower bound of that interval as the minimum - which is not what I suspect from clopper-pearson. So, my two questions are how to properly find a global minimum in R and how to so over any interval?


Solution

  • I've very briefly looked over the Wikipedia page you linked and don't see any obvious typos in your formula (although I feel like it should be 0.975=1-alpha/2 rather than 0.025=alpha/2?). However, evaluating the function you've coded over a very broad scale suggests that there are no local minima that are messing you up. My strong guess would be that either your logic is wrong (i.e., n->0 is really the right answer) or that you haven't coded what you think you're coding, due to a typo (possibly in the Wikipedia article, although that seems unlikely) or a thinko.

    f <- function (n, p=0.5) 
     (1 + (n - p*n + 1) / 
        (p*n*qf(p= .025, df1= 2*p, df2= 2*(n - p + 1))))^-1
    

    Confirm that you're getting the right answer for the interval you chose:

    curve(f(x),c(300,400)) 
    

    Evaluating over a broad range (n=0.00001 to 1000000):

    curve(f(10^x),c(-5,7))
    

    enter image description here

    As @MrFlick suggests, global optimization is hard. You could start with optim(...method="SANN") but the best answer is definitely case-specific.