php regex arguments kodi

PHP regex pattern: Kodi season poster extractor


In the Kodi database, table tvshow, column C06 holds this kind of data:

<thumb aspect="poster">http://image.tmdb.org/t/p/original/xjm6uVktPuKXNILwjLXwVG5d5BU.jpg</thumb>
<thumb aspect="poster" type="season" season="6">http://image.tmdb.org/t/p/original/5msClP3ba8iOHvpuZjU6NyzwEB7.jpg</thumb>
<thumb aspect="poster" type="season" season="3">http://image.tmdb.org/t/p/original/xG6kJnvmGme2ZgLZASFrI1PFUnY.jpg</thumb>

I would like to use a regex pattern to extract the http:// link in two cases:

1st case -> aspect="poster" => the general poster of the TV show
2nd case -> season="X" => where X is the number of the season whose poster I want to get

I can't find an answer to this problem. The regexes I found just extract every link, so I can't filter the way I need to. For example:

preg_match_all('#\bhttps?://[^,\s()<>]+(?:\([\w\d]+\)|([^,[:punct:]\s]|/))#', $TVShowPosterString, $match);

Best regards,

S.


Solution

  • It looks as though the content is a document fragment (i.e. there isn't a single root element), so you could wrap a root element around the current data and then load that (I have used <data> here, but it could be anything you want)...

    $data = '<thumb aspect="poster">http://image.tmdb.org/t/p/original/xjm6uVktPuKXNILwjLXwVG5d5BU.jpg</thumb>
    <thumb aspect="poster" type="season" season="6">http://image.tmdb.org/t/p/original/5msClP3ba8iOHvpuZjU6NyzwEB7.jpg</thumb>
    <thumb aspect="poster" type="season" season="3">http://image.tmdb.org/t/p/original/xG6kJnvmGme2ZgLZASFrI1PFUnY.jpg</thumb>';
    
    $xml = simplexml_load_string("<data>{$data}</data>");
    foreach ( $xml->thumb as $thumb )   {
        echo (string)$thumb.PHP_EOL;
    }
    

    gives the links...

    http://image.tmdb.org/t/p/original/xjm6uVktPuKXNILwjLXwVG5d5BU.jpg
    http://image.tmdb.org/t/p/original/5msClP3ba8iOHvpuZjU6NyzwEB7.jpg
    http://image.tmdb.org/t/p/original/xG6kJnvmGme2ZgLZASFrI1PFUnY.jpg
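
The loop above prints every link. To pick out only the general poster or the poster for one season, one option is to filter with XPath on the same wrapped fragment. This is only a sketch: `findPoster` is a hypothetical helper name, it assumes PHP 7.1 or later, it assumes the column content is well-formed XML once wrapped, and it treats a `<thumb>` without a `season` attribute as the general poster.

```php
<?php
// Hypothetical helper (not from the original answer): returns the poster URL
// for a given season, or the general show poster when $season is null.
function findPoster(string $fragment, ?int $season = null): ?string
{
    // Wrap the fragment in a root element so it parses as a single document.
    $xml = simplexml_load_string("<data>{$fragment}</data>");
    if ($xml === false) {
        return null; // the fragment was not well-formed XML
    }

    // Assumption: the general poster is the <thumb> with no season attribute.
    $query = $season === null
        ? 'thumb[@aspect="poster" and not(@season)]'
        : sprintf('thumb[@aspect="poster" and @season="%d"]', $season);

    $matches = $xml->xpath($query);

    // Take the first match, if any, and return its text content.
    return $matches ? trim((string) $matches[0]) : null;
}

$data = '<thumb aspect="poster">http://image.tmdb.org/t/p/original/xjm6uVktPuKXNILwjLXwVG5d5BU.jpg</thumb>
<thumb aspect="poster" type="season" season="6">http://image.tmdb.org/t/p/original/5msClP3ba8iOHvpuZjU6NyzwEB7.jpg</thumb>
<thumb aspect="poster" type="season" season="3">http://image.tmdb.org/t/p/original/xG6kJnvmGme2ZgLZASFrI1PFUnY.jpg</thumb>';

echo findPoster($data), PHP_EOL;     // general poster (first sample link)
echo findPoster($data, 3), PHP_EOL;  // season 3 poster (third sample link)
```

A season that has no poster in the data (for example `findPoster($data, 4)`) returns `null`, so the caller can fall back to the general poster if that is the behaviour you want.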