
Scraping a webpage with dynamic content - cheerio


I'm trying to scrape this HTML with Node.js and cheerio to get the value 72 from the span tag. However, my selector returns nothing (an empty string).

My code and selector are below.

First, the website's HTML:


<li id="TonerSupplies" data-node="TonerSupplies" class="child-row">
  <div class="supplyStatusContainer" data-init="initSupplyStatusContainer(this)">
    <div class="contentRow" role="gridcell">
      <div class="contentHeader" role="heading">
        <span class="translated" data-textid="67527" tabindex="-1">
          Black Cartridge
        </span>
        <br>
      </div>
      <div class="contentBody" role="presentation">
        <div class="progress" role="presentation" tabindex="" data-deviceid="8-1">
          <div class="progress-inner BlackGauge" role="img" title="72%" aria-labelledby="72%">
            <div class="progress-slider" style="width: 166px; overflow: hidden;">
              <span class="dataText">72</span>
            </div>
          </div>
        </div>
      </div>
    </div>
  </div>
</li>

I looked through the cheerio and jQuery selector documentation before posting this question, but I couldn't find anything that helped.

Here is my code:

const cheerio = require('cheerio');
const request = require('request');

request({
    method: 'GET',
    url: `http://exmaple.com/`
}, (err, res, body) => {
    if (err) return console.error(err);

    // Parse the raw HTML returned by the server (no page scripts are executed).
    const $ = cheerio.load(body);

    // Select the span that should hold the toner level.
    const TonerLevel = $('#TonerSupplies > div > div > div.contentBody > div > div > div > span');

    console.log(TonerLevel.text());
});


I'd appreciate your help and suggestions on how to get that "72".

Also, if I could get the 'title' attribute, that would give me the toner level too, but TonerLevel.attr('title') returns undefined with this selector.

The same approach works for some other elements.
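
One quick way to check whether the value is present in the HTML that request receives, as opposed to what the browser shows after scripts have run, is to search the raw body for a non-empty dataText span. The sketch below is only a diagnostic under stated assumptions: hasStaticTonerValue is a hypothetical helper, the regular expression is a crude check rather than a real HTML parser, and the "unrendered" sample is a guess at what such a page might return before its scripts run.

```javascript
// Hypothetical helper: returns true when the raw HTML already contains
// a dataText span with some text in it.
function hasStaticTonerValue(html) {
  const match = /<span class="dataText">\s*([^<]*?)\s*<\/span>/.exec(html);
  return match !== null && match[1].length > 0;
}

// Sample taken from the HTML shown in the question (as seen in the browser).
const rendered = '<div class="progress-slider"><span class="dataText">72</span></div>';

// Assumption: what the server might send before a script fills in the value.
const unrendered = '<div class="progress-slider"><span class="dataText"></span></div>';

console.log(hasStaticTonerValue(rendered));   // true
console.log(hasStaticTonerValue(unrendered)); // false
```

Inside the request callback, hasStaticTonerValue(body) returning false would suggest that the value is filled in later by a script, in which case no cheerio selector can find it.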


Solution

  • OK, so I found the problem myself. The toner level is loaded by a script, so it isn't in the HTML that request returns. I used Puppeteer to wait for the page to finish loading and then read the number.

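    // Assumes `browser` was created earlier with puppeteer.launch().
    const page = await browser.newPage();
    await page.goto(`http://ip-address`);
    const selector = '#TonerSupplies > div > div > div.contentBody > div > div > div > span';
    await page.waitForSelector(selector);
    const TonerLevel = await page.$eval(selector, el => el.textContent.trim());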