php html simpledom

Parsing an image using simple DOM


I want to parse an image from an HTML file using Simple HTML DOM. I was using regex until now, but everyone told me that is a really bad idea, so I wanted to try DOM.

<?php
include('simple_html_dom.php');
$html = file_get_html('192.0.0.1/test.html');
var_dump($html);
foreach ($html->find('img') as $image) {
    echo $images->src;
}
?>

TEST.html

<html>
<head>
</head>
<body>
    <p>test</p>
    <img src="test.jpg"/>
    <p>test1</p>
</body>
</html>

I'm getting a blank page. I checked the error logs, but there is nothing in them. I followed the tutorials about DOM; did I make a mistake?

Also, can I parse the img from a variable that holds the HTML code? What I mean:

$string='<p>sdadasd</p> <img src="test.jpg"/> <p>asdasda</p>';
$html=file_get_html($string);

Solution

  • You could use something like this (I don't know where you got file_get_html from, so I don't know what methods that object provides):

    $document = new DOMDocument();
    $document->loadHTMLFile("http://127.0.0.1/index.html"); // URLs should work as long as allow_url_fopen is enabled
    
    $images = $document->getElementsByTagName("img");
    
    foreach ($images as $image) {
        // Use the image, e.g. print its src attribute
        echo $image->getAttribute("src"), PHP_EOL;
    }
    

    Or, if you need complex queries (e.g. img tags with a certain attribute), you could do:

    $xpath = new DOMXPath($document);
    $images = $xpath->query("//img");
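
As for parsing from a variable instead of a file: DOMDocument also has loadHTML(), which takes a string. Below is a minimal sketch combining that with an XPath attribute filter. The sample markup, the alt attribute and the $sources array are made up for illustration, and the libxml_use_internal_errors() call is only there to keep parser warnings about imperfect HTML off the page.

```php
<?php
// Hypothetical sample markup, loosely based on the test page in the question.
$string = '<p>test</p> <img src="test.jpg" alt="demo"/> <img src="other.png"/> <p>test1</p>';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$document = new DOMDocument();
$document->loadHTML($string); // loadHTML() parses a string, loadHTMLFile() a file or URL
libxml_clear_errors();

$xpath = new DOMXPath($document);

// Keep only the <img> elements that have an alt attribute.
$sources = [];
foreach ($xpath->query('//img[@alt]') as $image) {
    $sources[] = $image->getAttribute('src');
}

print_r($sources);
```

Changing the expression back to "//img" should return every image, so the filter is the only part that differs from the plain query above.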