I am using PHP Simple HTML DOM to parse a webpage.
Problem: The scraped HTML seems to differ from what I get when I open the same page in my web browser. What could cause the difference, and how can I get the same content with Simple HTML DOM that the browser displays?
PHP
public function action_asos() {
    include_once('/home/mysite/public_html/application/libraries/simple_html_dom.php');
    $category_url = 'http://www.asos.com/Men/T-Shirts-Vests/Cat/pgecategory.aspx?cid=7616#parentID=-1&pge=0&pgeSize=100&sort=1';
    $html = file_get_html($category_url);
    foreach($html->find('html') as $content) {
        echo $content;
    }
}
Actual page:
http://www.asos.com/Men/T-Shirts-Vests/Cat/pgecategory.aspx?cid=7616#parentID=-1&pge=0&pgeSize=100&sort=1
Retrieved using Simple HTML DOM
You need to send a User-Agent header with the request. For whatever reason, the server chokes on requests that lack one.
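One way to do that is sketched below. It assumes your copy of simple_html_dom.php accepts a stream context as the third argument of file_get_html(), which the commonly distributed versions do. The helper name build_browser_context and the User-Agent string are made up for illustration; any browser-like string should serve.

```php
<?php
// Hypothetical helper (not part of Simple HTML DOM): builds a stream
// context whose HTTP requests carry the given User-Agent header.
function build_browser_context(string $userAgent)
{
    return stream_context_create([
        'http' => [
            'method' => 'GET',
            'header' => "User-Agent: {$userAgent}\r\n",
        ],
    ]);
}

// Illustrative User-Agent value; this exact string is an assumption.
$context = build_browser_context('Mozilla/5.0 (compatible; ExampleScraper/1.0)');

// Only runs once simple_html_dom.php has been included.
if (function_exists('file_get_html')) {
    $category_url = 'http://www.asos.com/Men/T-Shirts-Vests/Cat/pgecategory.aspx?cid=7616';
    // Assumed signature: file_get_html($url, $use_include_path, $context, ...)
    $html = file_get_html($category_url, false, $context);
    if ($html) {
        echo $html;
    }
}
```

If you would rather not pass a context around, calling ini_set('user_agent', '...') before file_get_html() sets the header for all of PHP's HTTP stream requests in the script.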