I'm practicing web scraping and trying to output values from various kinds of tags. I've run into something I'm unsure how to handle: a value stored in a meta itemprop tag.
The source looks like this. I'm trying to scrape and output -0.205, the content of the meta tag whose itemprop is "priceChangePercent" (the last line below). The value changes throughout the day.
<meta itemprop="name" content="USDJPY Spot Exchange Rate" />
<meta itemprop="url" content="https://www.bloomberg.com/quote/USDJPY:CUR" />
<meta itemprop="tickerSymbol" content="USDJPY" />
<meta itemprop="exchange" content="" />
<meta itemprop="price" content="111.95" />
<meta itemprop="priceChange" content="-0.23" />
<meta itemprop="priceChangePercent" content="-0.205" />
In the version of my code that works, I select content like this:
$query = "//div[@class='change-container']";
Which works just fine.
What would I have to modify exactly? I've tried
$query = "//meta[@itemprop='priceChangePercent']";
No luck. The page I'm scraping is https://www.bloomberg.com/quote/USDJPY:CUR
Thanks so much.
My full code:
<?php
// Load the USDJPY quote page
$doc = new DOMDocument;
// We don't want to bother with white spaces
$doc->preserveWhiteSpace = false;
$doc->strictErrorChecking = false;
$doc->recover = true;
$doc->loadHTMLFile('https://www.bloomberg.com/quote/USDJPY:CUR');
$xpath = new DOMXPath($doc);
$query = "//meta[@itemprop='priceChangePercent']";
$entries = $xpath->query($query);
foreach ($entries as $entry) {
    echo trim($entry->textContent); // use `trim` to eliminate spaces
}
?>
Taking your code sample:
echo trim($entry->textContent);
I've modified it to become:
echo trim($entry->getAttribute('content'));
Your code reads the text content of the meta tag, but a meta tag has no text content of its own. Its value is stored in the content attribute, so you have to read that attribute instead.
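For reference, here is a minimal self-contained sketch of the same fix. It parses the markup from the question as a string, so it runs without a network request. The helper name metaItempropContent is hypothetical and not part of your code, and the sample HTML is copied from the question rather than fetched from the live page.

```php
<?php
// Hypothetical helper (not from the original post): returns the content
// attribute of the first <meta itemprop="..."> element, or null if none exists.
function metaItempropContent(string $html, string $itemprop): ?string
{
    $doc = new DOMDocument;
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // Assumption: $itemprop contains no quote characters, so it is
    // interpolated into the XPath expression without escaping.
    $entries = $xpath->query("//meta[@itemprop='" . $itemprop . "']");
    if ($entries === false || $entries->length === 0) {
        return null;
    }

    // Read the attribute, not textContent: meta elements have no text nodes.
    return trim($entries->item(0)->getAttribute('content'));
}

// Sample markup copied from the question; the live page may differ.
$html = '<html><head>'
    . '<meta itemprop="price" content="111.95" />'
    . '<meta itemprop="priceChangePercent" content="-0.205" />'
    . '</head><body></body></html>';

echo metaItempropContent($html, 'priceChangePercent'), PHP_EOL;
```

With the sample markup above this prints -0.205. If the same query returns nothing against the live URL, it may be that the HTML the server sends to your script does not contain that meta tag, which is worth checking by dumping the fetched source.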