The solution was to look into lookaheads and lookbehinds. The concept of lookarounds in regex solved my issue: the replacements were eating into each other, because each match consumed characters that the next match needed.
We've been working for a while on transitioning some of our older projects (with perhaps bad or old coding habits) and making them PHP 7-ready. In the process I have made some adjustments to the project's .php files.
The problem at hand is that I'm facing issues with Danish characters in PHP string functions (strlen, substr, etc.) and would like the code to use the mbstring (mb_*) functions instead. From what I can read on the internet, using the mbstring "overload" feature is not the way to go, so I've decided to do a file-based search and replace.
My search-and-replace function looks like this right now (updated thanks to @SeanBright):
$testfile = file_get_contents($file);
$array = array ( 'strlen'=>'mb_strlen',
'strpos'=>'mb_strpos',
'substr'=>'mb_substr',
'strtolower'=>'mb_strtolower',
'strtoupper'=>'mb_strtoupper',
'substr_count'=>'mb_substr_count',
'split'=>'mb_split',
'mail'=>'mb_send_mail',
'ereg'=>'mb_ereg',
'eregi'=>'mb_eregi',
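'strrchr' => 'mb_strrchr',
// 'strichr' => 'mb_strichr', // disabled: PHP has neither strichr() nor mb_strichr()
'strchr' => 'mb_strstr', // strchr() is an alias of strstr(); mbstring has no mb_strchr()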
'strrpos' => 'mb_strrpos',
'strripos' => 'mb_strripos',
'stripos' => 'mb_stripos',
'stristr' => 'mb_stristr'
);
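foreach ($array as $function_name => $mb_function_name) {
    // Group 1 captures the delimiter in front of the name; the lookahead checks for "(" without consuming it.
    $search_string = '/(^|[\s\[{;(:!\=\><?.,\*\/\-\+])(?<!->)(?<!new )' . $function_name . '(?=\s?\()/i';
    // Only group 1 exists, so only ${1} is put back; keep working on $testfile across iterations.
    $testfile = preg_replace($search_string, '${1}' . $mb_function_name, $testfile, -1, $count);
}
print "<pre>";
print $testfile;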
The $file has this content:
<?php
print strtoupper('test');
print strtolower'test');
print substr('tester',0,1);
print astrtoupper('test');
print bstrtolower('test');
print csubstr(('tester',0,1);
print [substr('tester',0,1)];
print {substr('tester',0,1)};
substr('test',0,1);
substr('test',0,1);
(substr('test',0,1));
!substr();
if(substr()==substr()=>substr()<substr()){
?substr('test');
}
"test".substr('test');
'asd'.substr('asd');
'asd'.substr('asd');
substr( substr('asdsadsadasd',0,-1),strlen("1"),strlen("100"));
substr (substr ('Asdsadsadasd',0,-1), strlen("1"), strlen("100"));
substr(substr(substr('Asdsadsadasd',0,-1),0,-1), strlen("1"), strlen("100"));
mailafsendelse(substr('asdsadsadasd',0,-1), strlen("1"), strlen("100"));
mail(test);
substr ( tester );
substr ( tester );
mail mail mail mail ( tester );
$mail->mail ();
$mail -> mail ();
new Mail();
new mail ();
strlen ( tester )*strlen ( tester )+strlen ( tester )/strlen ( tester )-strlen ( tester )
;
The point here is that the PHP code does not have to be valid syntax; I just wanted the replacement to work in different scenarios.
My regex problem is that I cannot figure out why this line:
substr(substr(substr('Asdsadsadasd',0,-1),0,-1), strlen("1"), strlen("100"));
is not working. The 1st and 3rd substr are replaced correctly, but the 2nd is left untouched, so the result looks like this:
mb_substr(substr(mb_substr('Asdsadsadasd',0,-1),0,-1), mb_strlen("1"), mb_strlen("100"));
As a note, my search string is built to allow all sorts of characters in front of the function name and to require that the function name is followed by a "(".
In a perfect world I would also like to exclude string functions that are methods of classes, for example $order->mail(), which would send an email. I would NOT want this converted to $order->mb_send_mail().
As far as I understand, the mb_* functions take the same parameters as their counterparts, so swapping the names should not be a problem.
The complete script can be found here: https://github.com/welrachid/phpStringToMBString
The problem is that some of the characters you are using to delimit your function call checks are being consumed by matching. If you switch the last group to be a positive lookahead, this will fix the problem:
$search_string = '/([ \[{\n\t\r;(:!=><?\.,])'.($function_name).'([\ |\t]{0,1})(?=[(]{1})/i';
(The addition is the ?= at the start of the final group.)
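To see the difference, here is a minimal sketch that runs a consuming pattern and a lookahead pattern over the nested substr line from the question. The two patterns are simplified stand-ins rather than the exact expressions above, and the variable names are only for illustration.

```php
<?php
// Input line taken from the test file in the question.
$line = "\nsubstr(substr(substr('Asdsadsadasd',0,-1),0,-1), strlen(\"1\"), strlen(\"100\"));";

// Consuming version (simplified): the "(" after the name is part of the match,
// so it can no longer serve as the leading delimiter of the nested call.
$consuming = '/([\s\[{;(:!=><?.,])substr(\s?)(\()/i';
$a = preg_replace($consuming, '${1}mb_substr${2}${3}', $line);

// Lookahead version: "(" is only asserted, never consumed, so the next
// match can still use it as its leading delimiter.
$lookahead = '/(^|[\s\[{;(:!=><?.,])substr(?=\s?\()/i';
$b = preg_replace($lookahead, '${1}mb_substr', $line);

echo $a, "\n"; // the second substr is skipped
echo $b, "\n"; // all three substr calls are replaced
```

With the consuming pattern, the middle call is missed exactly as described in the question; with the lookahead, every nested call is rewritten.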
Your current expression also won't match function calls at the beginning of the line. The following handles that and also simplifies things a bit:
$search_string = '/(^|[\s\[{;(:!=><?.,])' . $function_name . '(?=\s?\()/i';
I've set up an example on regex101.com.
You might even be able to get away with:
$search_string = '/(^|\W)' . $function_name . '(?=\s?\()/i';
where \W matches any non-word character.
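As a rough illustration of the \W variant, the sketch below uses a made-up input string. It shows that operators such as * and / count as delimiters, while names that merely end in strlen are left alone because letters and underscores are word characters.

```php
<?php
// Made-up input: five real strlen calls plus two look-alike names.
$code = 'strlen(a)*strlen(b)+strlen(c)/strlen(d)-strlen(e); astrlen(f); my_strlen(g);';

// \W accepts any non-word character in front of the name.
$pattern = '/(^|\W)strlen(?=\s?\()/i';
$out = preg_replace($pattern, '${1}mb_strlen', $code, -1, $count);

echo $out, "\n";
echo $count, "\n"; // number of replacements made
```

One thing to keep in mind: ">" is also a non-word character, so this shorter pattern on its own still matches method calls such as $order->mail().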
Update
To prevent matching method calls, you can add a negative lookbehind to your pattern:
$search_string = '/(^|[\s\[{;(:!=><?.,])(?<!->)' . $function_name . '(?=\s?\()/i';
(The addition is the (?<!->) just before the function name.)
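A small sketch of the lookbehind in action, using invented sample strings. It also shows a limitation: the lookbehind only inspects the two characters directly in front of the name, so an arrow followed by a space is not caught.

```php
<?php
$pattern = '/(^|[\s\[{;(:!=><?.,])(?<!->)mail(?=\s?\()/i';

// Invented sample: one method call and two plain function calls.
$code = '$order->mail($to); mail($to); print(mail($to));';
$out = preg_replace($pattern, '${1}mb_send_mail', $code);

// Invented sample: method call written with spaces around the arrow.
$spaced = '$mail -> mail ();';
$outSpaced = preg_replace($pattern, '${1}mb_send_mail', $spaced);

echo $out, "\n";       // the method call is left alone
echo $outSpaced, "\n"; // still rewritten, because "-> " is not "->"
```

If calls written like "$mail -> mail ()" occur in the codebase, the pattern would need a further check for that spacing; this sketch does not attempt one.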