I am attempting to use optimize()
to find the minimum value of n for the following function (Clopper-Pearson lower bound):
f <- function (n, p=0.5)
(1 + (n - p*n + 1) /
(p*n*qf(p= .025, df1= 2*p, df2= 2*(n - p + 1))))^-1
And here is how I attempted to optimize it:
n_clop <- optimize(f.1, c(300,400), maximum = FALSE, p=0.5)
n_clop
I did this over the interval [300,400] because I suspect the value to be between within it but ultimately I would like to do the optimization between 0 and infinity. It seems that this command is producing a local minimum because no matter the interval it produces the lower bound of that interval as the minimum - which is not what I suspect from clopper-pearson. So, my two questions are how to properly find a global minimum in R and how to so over any interval?
I've very briefly looked over the Wikipedia page you linked and don't see any obvious typos in your formula (although I feel like it should be 0.975=1-alpha/2 rather than 0.025=alpha/2?). However, evaluating the function you've coded over a very broad scale suggests that there are no local minima that are messing you up. My strong guess would be that either your logic is wrong (i.e., n->0 is really the right answer) or that you haven't coded what you think you're coding, due to a typo (possibly in the Wikipedia article, although that seems unlikely) or a thinko.
f <- function (n, p=0.5)
(1 + (n - p*n + 1) /
(p*n*qf(p= .025, df1= 2*p, df2= 2*(n - p + 1))))^-1
Confirm that you're getting the right answer for the interval you chose:
curve(f(x),c(300,400))
Evaluating over a broad range (n=0.00001 to 1000000):
curve(f(10^x),c(-5,7))
As @MrFlick suggests, global optimization is hard. You could start with optim(...method="SANN")
but the best answer is definitely case-specific.