phpperformanceloopsoptimizationapproximation

Successive approximation of unkown value


I have a url with a page system.

For instance https://myURL?p=50

But I want a script to find the last page available, for instance, let's say p=187

I have a function checkEmpty() that tells me whether the page is empty or not. So for instance:


$myUrl = new URL(50); //https://myURL?p=50
$myUrl->checkEmpty();
//This evaluates to false -> the page exists

$myUrl = new URL(188); //https://myURL?p=188
$myUrl->checkEmpty();
//This evaluates to true -> the page does NOT exist

$myUrl = new URL(187); //https://myURL?p=187
$myUrl->checkEmpty();
//This evaluates to false -> the page exists

I did a naive algorithm, that you might guess it, performs too much requests.

My question is: What would be the algorithm to find the last page with the minimal amount of requests?

EDIT As requested by people in the comment here is the checkEmpty() implementation

<?php
public function checkEmpty() : bool
{
    $criteria = "Aucun contenu disponible";
    if(strstr( $this->replace_carriage_return(" ", $this->getHtml()), $criteria) !== false)
    {
        return true;
    }
    else
    {
        return false;
    }
}

Solution

  • Since the upper bound is not known, exponentially increase the page no by 2 starting from 1. The moment you hit a non-existent page, you can do a binary search from previous existing page + 1 till this new upper bound where the page doesn't exist.

    This way, you can get your answer in O(log(n)) attempts asymptotically where n is the no. of existing pages here as the sample space.

    <?php
    
    $lowerBound = 1;
    $upperBound = 1;
    
    while(true){
        $myUrl = new URL($upperBound);
        if($myUrl->checkEmpty()){
            break;
        }
        $lowerBound = $upperBound + 1;
        $upperBound <<= 1;
    }
    
    $ans = $lowerBound;
    
    while($lowerBound <= $upperBound){
        $mid = $lowerBound + (($upperBound - $lowerBound) >> 1);
        $myUrl = new URL($mid);
        if($myUrl->checkEmpty()){
            $upperBound = $mid - 1;
        }else{
            $lowerBound = $mid + 1;
            $ans = $lowerBound;
        }
    }
    
    echo $ans;