I want to use Puppeteer to respond to page updates. The page shows items, and while the page stays open new items can appear over time, e.g. a new item is added every 10 seconds.
I can use the following to wait for an item on the initial load of the page:
await page.waitForSelector(".item");
console.log("the initial items have been loaded");
How can I wait for / catch future items? I would like to achieve something like this (pseudo code):
await page.goto('http://mysite');
await page.waitForSelector(".item");
// check items (=these initial items)
// event when receiving new items:
// check item(s) (= the additional [or all] items)
You can use page.exposeFunction to expose a Node.js function to the page, so that code running in the browser can call it:
await page.exposeFunction('getItem', function (a) {
  console.log(a);
});
Then you can use page.evaluate to create a MutationObserver in the page and listen for new nodes added inside a parent node.
This example (a rough sketch, not finished code) watches the Python chat room on Stack Overflow and prints new items as they are created in that chat.
const puppeteer = require('puppeteer');

(async () => {
  const baseurl = 'https://chat.stackoverflow.com/rooms/6/python';
  const browser = await puppeteer.launch({ headless: false });
  const page = await browser.newPage();
  await page.goto(baseurl);

  // Expose a Node.js function to the page as window.getItem.
  await page.exposeFunction('getItem', function (a) {
    console.log(a);
  });

  await page.evaluate(() => {
    // Runs in the browser: report the text of every node added under #chat.
    const observer = new MutationObserver((mutations) => {
      for (const mutation of mutations) {
        for (const node of mutation.addedNodes) {
          // Text nodes have no innerText, so fall back to textContent.
          getItem(node.innerText || node.textContent);
        }
      }
    });
    observer.observe(document.getElementById('chat'), { attributes: false, childList: true, subtree: true });
  });
})();
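For the `.item` case from the question, the same idea could be wrapped in a small helper. This is only a sketch under some assumptions: `watchNewItems` and `reportItem` are hypothetical names, the observer watches all of `document.body` because the question does not say which parent holds the items, and only elements matching the selector are forwarded.

```javascript
// Hypothetical helper: calls onItem(text) for every element matching
// `selector` that is added to the page after this function has run.
// `page` is expected to be a Puppeteer Page.
async function watchNewItems(page, selector, onItem) {
  // Make the Node.js callback callable from the page as window.reportItem.
  // (The name reportItem is an assumption; exposeFunction throws if the
  // name is already taken on window.)
  await page.exposeFunction('reportItem', onItem);

  // Everything inside this callback runs in the browser, not in Node.js.
  await page.evaluate((sel) => {
    const observer = new MutationObserver((mutations) => {
      for (const mutation of mutations) {
        for (const node of mutation.addedNodes) {
          // Skip text and comment nodes.
          if (node.nodeType !== Node.ELEMENT_NODE) continue;
          // The added node may be an item itself or a wrapper around items.
          const items = node.matches(sel)
            ? [node]
            : [...node.querySelectorAll(sel)];
          for (const item of items) {
            window.reportItem(item.innerText);
          }
        }
      }
    });
    // Assumption: the parent of the items is unknown, so watch the whole body.
    observer.observe(document.body, { childList: true, subtree: true });
  }, selector);
}
```

It would be called once after the initial items have loaded, e.g. `await watchNewItems(page, '.item', (text) => console.log('new item:', text));`. Items that were already on the page at that point are not reported, so they still need to be checked separately.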