php web-scraping goutte domcrawler

Goutte crawler: getting a style attribute


I am using the Goutte crawler. For the past few hours I have been trying to read the style attribute of a div on a search results page for a specific query; that style contains a background-image. First I made a GET request to the URL with

   $crawler = $client->request('GET',"https://www.esheeq.net/search/مسلسل+علي+رضا");

and then queried the result with

$crawler->filter(".imgBg")->attr("style")

and printed the value, which worked. The problem is that when I change the search query, for example to https://www.esheeq.net/search/مسلسل+الغرفة+الحمراء, it throws an error:

Fatal error: Uncaught InvalidArgumentException: The current node list is empty. in C:\xampp\htdocs\esheeqAPI\vendor\symfony\dom-crawler\Crawler.php:550 Stack trace: #0 C:\xampp\htdocs\esheeqAPI\api\functions.php(8): Symfony\Component\DomCrawler\Crawler->attr('style') #1 C:\xampp\htdocs\esheeqAPI\api\tests.php(4): InsertMultipleSeries() #2 {main} thrown in C:\xampp\htdocs\esheeqAPI\vendor\symfony\dom-crawler\Crawler.php on line 550

But when I open the URL I requested, the page shows a div with class imgBg that has a style attribute. Why am I getting this error, and how can I solve it?


Solution

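  • Maybe try this instead:

    use Symfony\Component\DomCrawler\Crawler;

    // Percent-encode the Arabic words yourself; PHP doesn't do that automatically.
    // Encode each word separately so the "+" separators stay literal
    // (urlencode() on the whole string would turn them into %2B).
    $words = array_map('rawurlencode', explode('+', "مسلسل+الغرفة+الحمراء"));
    $url = "https://www.esheeq.net/search/" . implode('+', $words);

    $html_content = file_get_contents($url);
    if ($html_content === false) {
        exit("Could not fetch $url");
    }

    // Then build the crawler from the downloaded HTML.
    $crawler = new Crawler($html_content);

    // attr() throws "The current node list is empty" when nothing matches,
    // so check the match count first.
    $node = $crawler->filter(".imgBg");
    $style = $node->count() > 0 ? $node->attr("style") : null;

    Let me know if it doesn't work.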
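
If you need this for more than one query, the encoding step could be wrapped in a small helper. The following is only a sketch using PHP built-ins: `buildSearchUrl` is a hypothetical name (not part of Goutte or DomCrawler), and it assumes the site expects the words joined by a literal "+", as in the URLs from the question.

```php
<?php
declare(strict_types=1);

/**
 * Build a search URL whose path contains only ASCII characters.
 * Hypothetical helper; the "+" word separator is an assumption
 * taken from the URLs shown in the question.
 */
function buildSearchUrl(string $baseUrl, string $query): string
{
    // Split on whitespace or "+" so "a b" and "a+b" give the same result.
    $words = preg_split('/[\s+]+/u', $query, -1, PREG_SPLIT_NO_EMPTY);
    if ($words === false) {
        // preg_split() returns false on failure, e.g. invalid UTF-8 input.
        throw new InvalidArgumentException('Query is not valid UTF-8.');
    }

    // rawurlencode() percent-encodes the UTF-8 bytes of each word.
    $encoded = array_map('rawurlencode', $words);

    // Re-join with a literal "+" and append to the base URL.
    return rtrim($baseUrl, '/') . '/' . implode('+', $encoded);
}

$url = buildSearchUrl('https://www.esheeq.net/search/', 'مسلسل الغرفة الحمراء');
echo $url, PHP_EOL;
```

The resulting `$url` can then be passed to `file_get_contents()` or to `$client->request('GET', $url)`. Whether encoding alone fixes the empty node list depends on what HTML the server actually returns for that query, so it is still worth checking `count()` before calling `attr()`.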