php goutte

Get div with class with Goutte


I'm trying to get data from this URL with Goutte, but when I try to filter only the divs with class "empresa", I get the entire page. How can I filter only the divs with a specific class?

This is my code:

<html>

<body>
        <?php

        require __DIR__ . '/vendor/autoload.php';
        use Goutte\Client;

        $client = new Client();
        $crawler = $client->request('GET', 'http://sp.cadastrosindustriais.com.br/?consulta=cal%C3%A7ados');

        $crawler->filter('div[id="empresa"]')->each(function ($node) {
            print $node->text()."\n";
        });


        ?>

</body>


</html>

Solution

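  • You're close. The problem is your selector: div[id="empresa"] matches on the id attribute, not on the class. The crawler accepts CSS (jQuery-style) selectors, so a class is selected with .empresa.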

    Here's a working example of your code. I put the results inside an array just in case you wanted to do more than just dump the results.

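    require __DIR__ . '/vendor/autoload.php';

    $client = new Goutte\Client();
    $crawler = $client->request('GET', 'http://sp.cadastrosindustriais.com.br/?consulta=cal%C3%A7ados');

    // Select by class; each() returns an array of the callback's return values.
    $elements = $crawler->filter('.empresa')->each(function ($node) {
        return $node->text();
    });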
    

    To traverse the results afterwards, loop over the array with foreach ($elements as $e).
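
To see the difference between the two selectors without hitting the network, here is a minimal sketch that runs the crawler against an inline HTML string. It assumes symfony/dom-crawler and symfony/css-selector are installed (Goutte pulls both in as dependencies). The markup and the names in it are hypothetical stand-ins, not the real page.

```php
<?php
// Assumption: the Composer autoloader is in ./vendor, as in the question.
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

// Hypothetical markup standing in for the real page.
$html = <<<'HTML'
<html><body>
  <div class="empresa">Empresa A</div>
  <div class="empresa destaque">Empresa B</div>
  <div id="rodape">Footer</div>
</body></html>
HTML;

$crawler = new Crawler($html);

// Attribute selector on id: no div here has id="empresa", so nothing matches.
$byId = $crawler->filter('div[id="empresa"]')->count();

// Class selector: matches every div whose class list contains "empresa".
$names = $crawler->filter('div.empresa')->each(function (Crawler $node) {
    return trim($node->text());
});

// Walk the collected values, as suggested in the answer.
foreach ($names as $name) {
    echo $name, PHP_EOL;
}
```

On this sample markup, $byId is 0 and $names holds the text of the two "empresa" divs. The same filter('div.empresa') call works on the crawler returned by $client->request().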