php regex string url split

Split text into an array of URL and non-URL strings


I need to split a PHP string into an array containing text and URLs. For instance, assuming

$string = "Hello, my name is http://www.audio.com/1234.mp3/. Today is https://radio.org/weather.wav";

The expected output should be something like:

$a[0] = "Hello, my name is";
$a[1] = "http://www.audio.com/1234.mp3/";
$a[2] = ". Today is";
$a[3] = "https://radio.org/weather.wav";

Solution

  • You cannot split it easily. But a workaround would be to match it in pairs using something like:

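    // Each match is a pair: the text before a URL, then the URL itself.
    // \K discards the overall match, so element 0 of every set is empty.
    preg_match_all('#(.*?)(https?://\S+(?<![,.]))\K#s', $string, $m,
                   PREG_SET_ORDER);
    $list = call_user_func_array("array_merge", $m);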
    

    The call_user_func_array call is another workaround, used here to avoid flattening the array manually. This method does leave empty entries in between, however:

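    Array
    (
        [0] => 
        [1] => Hello, my name is 
        [2] => http://www.audio.com/1234.mp3/
        [3] => 
        [4] => . Today is 
        [5] => https://radio.org/weather.wav
    )
    

    Also note that the URL regex is simplistic: \S+ accepts any non-whitespace character, and the lookbehind only keeps a trailing comma or period out of the URL, so the period after the first URL ends up in the following text entry. For stricter matching, use an explicit character class instead of the lookbehind.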
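
If the empty entries are a nuisance, a different technique from the pair-matching above may be worth trying: preg_split with a capturing group around the URL pattern. This is only a minimal sketch. It reuses the same loose URL pattern as an assumption and relies on the standard PREG_SPLIT_DELIM_CAPTURE and PREG_SPLIT_NO_EMPTY flags.

```php
<?php
// Sample input taken from the question.
$string = "Hello, my name is http://www.audio.com/1234.mp3/. Today is https://radio.org/weather.wav";

// The URL pattern is the delimiter. Wrapping it in a capture group and
// passing PREG_SPLIT_DELIM_CAPTURE keeps the URLs in the result;
// PREG_SPLIT_NO_EMPTY drops empty pieces (e.g. when the string ends in a URL).
$parts = preg_split(
    '#(https?://\S+(?<![,.]))#',
    $string,
    -1,
    PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
);

print_r($parts);
```

On the sample string this should give four entries, alternating text and URL, with the surrounding spaces and the period left in the text pieces. Run the result through array_map('trim', $parts) if you want the whitespace removed. Unlike the pair-matching version, any text after the last URL is kept.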