php preg-replace

How to use PHP preg_replace on a formatted number based on search query while ignoring comma and decimal?


I want to highlight my search query on a formatted number.

For example:

$search_query = '1234'; // or $search_query = '7800';
$formatted_numeber = '12,345,678.00';

Currently my code is:

preg_replace('/(' . $search_query . ')/i', "<span style='background: yellow'>$1</span>", $formatted_numeber);

But this code fails to highlight search terms like '1234' or '7800' because of the commas (,) and the decimal point (.) in the formatted number.

I also want this to work when the search query itself contains commas or a decimal point.

For example:

$search_query = '3,456';
$formatted_numeber = '12,345,678.00';

I want to highlight the part "345,6" on my formatted number.

Or:

$search_query = '7.80';
$formatted_numeber = '12,345,678.00';

I want to highlight the part "78.0" on my formatted number.


Solution

  • Given how long it takes to get an answer, I think we can conclude that this isn't a trivial problem.

    I assume, although you don't explicitly say so, that you actually only want to highlight formatted numbers, and that these numbers aren't embedded in a bigger text. I also assume you only want to highlight one part of the number, not multiple parts. So in 12,341,234.00 only the first 1234 will be highlighted.

    My solution is a real algorithm, and not based on complex built-in functions or regular expressions. Using those would be possible of course, but I avoided them.

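    function highlight(string $search, string $number): string
    {
        $searchBegin = null; // index in $number where the current partial match starts
        $searchIndex = 0;    // number of characters of $search matched so far
        // Compare digits only: drop the separators from the query.
        $search      = str_replace([',', '.'], '', $search);
        foreach (str_split($number) as $numberIndex => $character) {
            if ($character == $search[$searchIndex]) {
                if (is_null($searchBegin)) {
                    $searchBegin = $numberIndex;
                }
                $searchIndex++;
                if ($searchIndex == strlen($search)) {
                    // Full match: wrap it, separators included, in the highlight span.
                    return substr($number, 0, $searchBegin) .
                           '<span style="background: yellow">' .
                           substr($number, $searchBegin, $numberIndex - $searchBegin + 1) .
                           '</span>' .
                           substr($number, $numberIndex + 1);
                }
            } elseif (!in_array($character, [',', '.'])) {
                // Mismatch on a non-separator: discard the partial match.
                // Scanning continues with the next character, so a query such as
                // '1234' is not found in '11,234'.
                $searchBegin = null;
                $searchIndex = 0;
            }
        }
        return $number; // no match: return the input unchanged
    }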
    

    Live demo: https://3v4l.org/StFn7

    I think the code is quite self-explanatory. Just follow what it does and you'll understand it. There's nothing mysterious about it.

    This code isn't optimized and doesn't check for errors. It is intended as a starting point for further work. str_split() is handy but not very fast; you can easily make this algorithm faster by using a for loop and a string index. Being able to highlight multiple sections is the first thing I would add. You could do this recursively on the part of the $number after the match, like so: https://3v4l.org/aRZtN
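
    As a rough illustration of both suggestions, here is a sketch that uses a for loop with string indexing and recurses on the remainder after each match. It is not the code behind the link above; the name highlightAll is hypothetical, and the guard against an empty query is an added assumption. Because it retries from every possible start position, it also finds a query such as '1234' in '11,234.00', which the simpler loop misses.

```php
<?php
// Hypothetical helper (not from the original answer): highlights every
// non-overlapping occurrence of $search in $number, ignoring ',' and '.'.
function highlightAll(string $search, string $number): string
{
    $digits = str_replace([',', '.'], '', $search);
    if ($digits === '') {
        return $number; // assumption: an empty query highlights nothing
    }

    $length = strlen($number);
    for ($start = 0; $start < $length; $start++) {
        // A match can only begin on the first character of the query.
        if ($number[$start] !== $digits[0]) {
            continue;
        }
        $matched = 0;
        for ($i = $start; $i < $length; $i++) {
            $character = $number[$i];
            if ($character === ',' || $character === '.') {
                continue; // separators inside a match are skipped
            }
            if ($character !== $digits[$matched]) {
                break; // mismatch: try the next start position
            }
            $matched++;
            if ($matched === strlen($digits)) {
                // Wrap this match, then recurse on the rest of the number.
                return substr($number, 0, $start)
                    . '<span style="background: yellow">'
                    . substr($number, $start, $i - $start + 1)
                    . '</span>'
                    . highlightAll($search, substr($number, $i + 1));
            }
        }
    }
    return $number; // no (further) match
}
```

    Calling highlightAll('1234', '12,341,234.00') wraps both "12,34" and "1,234" in their own span.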

    This code might be longer than you expect, but that doesn't make it bad or slow code. Sometimes a good old-fashioned algorithm wins the day.
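
    For comparison, the regular-expression route that I said was possible could look roughly like the sketch below. It strips the separators from the query and then allows optional commas or dots between the remaining characters. The function name highlightWithRegex is made up for this example, and like the function above it highlights only the first match.

```php
<?php
// Hypothetical helper (not part of the original answer).
function highlightWithRegex(string $search, string $number): string
{
    $digits = str_replace([',', '.'], '', $search);
    if ($digits === '') {
        return $number; // assumption: an empty query highlights nothing
    }

    // Quote every character so the query cannot inject regex syntax.
    $parts = array_map(
        static function (string $character): string {
            return preg_quote($character, '/');
        },
        str_split($digits)
    );

    // Allow any run of ',' or '.' between consecutive query characters.
    $pattern = '/' . implode('[,.]*', $parts) . '/';

    // Limit 1: replace only the first match. $0 is the whole matched text.
    $result = preg_replace(
        $pattern,
        '<span style="background: yellow">$0</span>',
        $number,
        1
    );

    return $result ?? $number; // preg_replace returns null on error
}
```

    With the inputs from the question, highlightWithRegex('7.80', '12,345,678.00') wraps "78.0" and highlightWithRegex('3,456', '12,345,678.00') wraps "345,6".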