
PHP: scraping a meta itemprop tag


I'm practicing web scraping by trying to output values from various kinds of tags. I've run into one I'm not sure how to handle: a value stored in a meta itemprop tag.

The page source looks like this. I'm trying to scrape and output -0.205, the content of the meta tag whose itemprop is "priceChangePercent" (the last line below). Note that the value changes throughout the day.

<meta itemprop="name" content="USDJPY Spot Exchange Rate" />
<meta itemprop="url" content="https://www.bloomberg.com/quote/USDJPY:CUR" />
<meta itemprop="tickerSymbol" content="USDJPY" />
<meta itemprop="exchange" content="" />
<meta itemprop="price" content="111.95" />
<meta itemprop="priceChange" content="-0.23" />
<meta itemprop="priceChangePercent" content="-0.205" />

In code that already works, I select content like this:

$query = "//div[@class='change-container']";

Which works just fine.

What would I have to modify exactly? I've tried

$query = "//meta[@itemprop='priceChangePercent']";

No luck. The page I'm scraping is https://www.bloomberg.com/quote/USDJPY:CUR

Thanks so much.

My full code:

<?php

// Load the USDJPY quote page into a DOM document
$doc = new DOMDocument;

// We don't want to bother with white spaces
$doc->preserveWhiteSpace = false;

// Be lenient with malformed HTML
$doc->strictErrorChecking = false;
$doc->recover = true;

$doc->loadHTMLFile('https://www.bloomberg.com/quote/USDJPY:CUR');

$xpath = new DOMXPath($doc);

// Select the meta element that carries the percentage change
$query = "//meta[@itemprop='priceChangePercent']";

$entries = $xpath->query($query);
foreach ($entries as $entry) {
    echo trim($entry->textContent);  // use `trim` to eliminate spaces
}

?>

Solution

  • Taking your code sample:

    echo trim($entry->textContent);
    

    I've modified it to become:

    echo trim($entry->getAttribute('content'));
    

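    Your XPath query is fine. The problem is that your code reads the text content of the meta tag, and a meta element has no text content of its own. The value lives in its content attribute, so it has to be read with getAttribute('content').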
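
To try the fix without depending on the live page, here is a minimal sketch that parses a fragment of the markup quoted in the question. The `$html` string and the `$values` and `$direct` variable names are illustrative assumptions, not part of the original code, and the live page may serve different markup than what is shown here. The second approach, asking XPath for the attribute directly through `DOMXPath::evaluate`, is an alternative to `getAttribute`, not something the original answer used.

```php
<?php
// Sample markup taken from the question; the real page may differ.
$html = <<<'HTML'
<html><head>
<meta itemprop="price" content="111.95" />
<meta itemprop="priceChange" content="-0.23" />
<meta itemprop="priceChangePercent" content="-0.205" />
</head><body></body></html>
HTML;

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument;
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Approach 1: select the element, then read its content attribute.
$values = [];
foreach ($xpath->query("//meta[@itemprop='priceChangePercent']") as $meta) {
    $values[] = trim($meta->getAttribute('content'));
}

// Approach 2: let XPath return the attribute value as a string.
$direct = $xpath->evaluate(
    "string(//meta[@itemprop='priceChangePercent']/@content)"
);

echo implode("\n", $values), "\n";
echo $direct, "\n";
```

If the loop prints nothing when run against the real URL, it is worth checking whether the HTML that `loadHTMLFile` actually received contains the meta tag at all, since the server may return different markup to a script than to a browser.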