I'm trying to scrape a webpage using Laravel, Goutte, and Guzzle. I'm trying to pass an instance of Guzzle into Goutte, but my web server keeps trying to use Symfony\Contracts\HttpClient\HttpClientInterface. Here's the exact error I'm getting:
Argument 1 passed to Symfony\Component\BrowserKit\HttpBrowser::__construct() must be an instance of Symfony\Contracts\HttpClient\HttpClientInterface or null, instance of GuzzleHttp\Client given, called in /opt/bitnami/apache/htdocs/app/Http/Controllers/ScrapeController.php on line 52
Line 52 is this line: $goutteClient = new Client($guzzleClient);
Changing the line to this: $goutteClient = new \Goutte\Client($guzzleClient);
does not fix it. How can I force it to use Goutte instead of Symfony? Here's my class:
<?php
namespace App\Http\Controllers;

use Illuminate\Http\Request;
use Goutte\Client;
use GuzzleHttp\Cookie;
use GuzzleHttp\Client as GuzzleClient;

class ScrapeController extends Controller
{
    public function index()
    {
        return view('index');
    }

    public function scrape() {
        $url = 'www.domain.com';
        $domain = 'www.domain.com';
        $cookieJar = new \GuzzleHttp\Cookie\CookieJar(true);
        // get the cookie from www.domain.com
        $cookieJar->setCookie(new \GuzzleHttp\Cookie\SetCookie([
            'Domain' => 'www.domain.com',
            'Name' => '_name_session',
            'Value' => 'value',
            'Discard' => true
        ]));
        $guzzleClient = new \GuzzleHttp\Client([
            'timeout' => 900,
            'verify' => false,
            'cookies' => $cookieJar
        ]);
        $goutteClient = new Client($guzzleClient);
        $crawler = $goutteClient->request('GET', $url);
        $crawler->filter('table')->filter('tr')->each(function ($node) {
            dump($node->text());
        });
    }
}
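Here's a fun little observation: Goutte\Client is now simply a thin extension of Symfony\Component\BrowserKit\HttpBrowser, which is why its constructor insists on a Symfony\Contracts\HttpClient\HttpClientInterface rather than a Guzzle client. Based on that, you can modify your scrape function to be something like: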
use Symfony\Component\BrowserKit\Cookie;
use Symfony\Component\BrowserKit\CookieJar;
use Symfony\Component\BrowserKit\HttpBrowser;
use Symfony\Component\HttpClient\HttpClient;
...
public function scrape() {
    $url = 'http://www.example.com/';
    $domain = 'www.example.com';
    $jar = new CookieJar();
    $jar->set(new Cookie('_name_session', 'value', null, null, $domain));
    $client = HttpClient::create([
        'timeout' => 900,
        'verify_peer' => false
    ]);
    $browser = new HttpBrowser($client, null, $jar);
    $crawler = $browser->request('GET', $url);
    $crawler->filter('div')->filter('h1')->each(function ($node) {
        dump($node->text());
    });
}
In your composer.json you'll need requirements similar to the following:
"symfony/browser-kit": "^5.3",
"symfony/css-selector": "^5.3",
"symfony/http-client": "^5.3"
However, fabpot/goutte already requires all of them, so no libraries will be downloaded beyond what you already have.
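If you would rather keep the Goutte\Client class name, the same idea should also work without switching to HttpBrowser directly. This is only a minimal sketch, assuming a Goutte version in which Goutte\Client extends HttpBrowser (as the error message in the question suggests) and therefore accepts a Symfony HTTP client as its first constructor argument. The autoload path and the example.com URL are placeholders, not values from the question.

```php
<?php

// Assumption: run from a project root with Composer dependencies installed.
// Inside a Laravel controller the autoloader is already loaded, so this line is not needed there.
require __DIR__ . '/vendor/autoload.php';

use Goutte\Client;
use Symfony\Component\HttpClient\HttpClient;

// Build a Symfony HTTP client in place of the Guzzle client.
$httpClient = HttpClient::create([
    'timeout' => 900,
    // Skips peer certificate verification; avoid this outside of local testing.
    'verify_peer' => false,
]);

// Assumption: Goutte\Client inherits HttpBrowser's constructor,
// so it takes an HttpClientInterface instead of a GuzzleHttp\Client.
$goutteClient = new Client($httpClient);

// Placeholder URL for illustration only.
$crawler = $goutteClient->request('GET', 'http://www.example.com/');

// Print the text of every <h1> element found on the page.
$crawler->filter('h1')->each(function ($node) {
    echo $node->text(), PHP_EOL;
});
```

Either way, the key change is the same: the object handed to the constructor has to implement Symfony's HttpClientInterface, so the Guzzle-specific options ('verify', 'cookies') need to be replaced with their Symfony counterparts.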