I am trying to scrape some data from Yahoo, but my XPath query returns a node list of length 0 when I var_dump
it. Here is the relevant portion of my scraping code.
error_reporting(0);
function curl($url) {
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, 0);
    curl_setopt($curl, CURLOPT_USERAGENT,'Mozilla/5.0 (Windows; U; MSIE 7.0; Windows NT 6.0; en-US)');
    curl_setopt($curl, CURLOPT_HEADER, true);
    curl_setopt($curl, CURLOPT_AUTOREFERER, false);
    curl_setopt($curl, CURLOPT_FRESH_CONNECT, true);
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 200);
    return curl_exec($curl);
}
$page = curl('https://www.yahoo.com');
$dom = new DOMDocument();
$dom->loadHTML($page);
$xpath = new DOMXPath($dom);
$link = $xpath->query('//li[@style="background-color:#fafaff;"]/div/div/div/h3/a');
foreach ($link as $links) {
    $get_title[] = $links->nodeValue;
    $get_link[] = $links->getAttribute('href');
}
The code has no syntax errors, so I assume there is a logic error somewhere.
Your code works as written. The problem is that the HTML Yahoo.com returns to cURL simply doesn't contain any li elements whose style attribute is exactly "background-color:#fafaff;", so the query has nothing to match. You can confirm this by looking at the contents of $page.
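One way to check this is to run a few XPath expressions against the fetched markup and compare the counts. The snippet below is only a sketch: countMatches() is a hypothetical helper, and the $page string is a small stand-in for the real response, not Yahoo's actual HTML. It uses libxml_use_internal_errors() to keep parser warnings quiet instead of error_reporting(0), which would also hide unrelated errors.

```php
<?php
// Hypothetical helper: how many nodes does an XPath expression match in an HTML string?
function countMatches(string $html, string $expression): int
{
    // Collect libxml parse warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    // Discard the collected warnings and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $nodes = $xpath->query($expression);

    // query() returns false when the expression itself is invalid.
    return $nodes === false ? 0 : $nodes->length;
}

// Stand-in markup (an assumption for illustration): an li with no inline style.
$page = '<html><body><ul><li class="item"><h3><a href="/a">A</a></h3></li></ul></body></html>';

$selector = '//li[@style="background-color:#fafaff;"]/div/div/div/h3/a';

echo countMatches($page, '//li'), "\n";       // li elements exist
echo countMatches($page, $selector), "\n";    // but none match the style predicate
echo countMatches($page, '//li//h3/a'), "\n"; // a looser expression does match
```

If '//li' finds nodes while your full selector finds none, the predicate or the path after it is too strict for the markup you actually received. It may also help to write $page to a file and search it for "fafaff". Note that with CURLOPT_HEADER set to true, curl_exec() returns the response headers in front of the body, so they are part of the string you hand to loadHTML().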