Tags: python, web-scraping, chromium, python-requests-html, pyppeteer

Web scraping with requests_html fails, saying a Chromium file is missing


I am trying to scrape a web page using requests-html, but it raises an error saying a file is missing, even though I ran pip install requests-html and it reported that all requirements were already satisfied. How do I get around this?

from requests_html import HTMLSession
import time

url = 'https://soundcloud.com/jujubucks'

s = HTMLSession()
r = s.get(url)

r.html.render()

songs = r.html.xpath('//*[@id="content"]/div/div[4]/div[1]/div/div[2]/div/div[2]', first=True)

print(songs)

This produces an OSError that refers to sxstrace:

OSError: [WinError 14001] The application has failed to start because its side-by-side 
configuration is incorrect. Please see the application event log or use the command-line 
sxstrace.exe tool for more detail

According to the event log, this is apparently the missing file, but I don't know where to get it:

Activation context generation failed for "C:\Users\houst\AppData\Local\pyppeteer\pyppeteer\local-chromium\588429\chrome-win32\chrome.exe". Dependent Assembly 71.0.3542.0,language="*",type="win32",version="71.0.3542.0" could not be found. Please use sxstrace.exe for detailed diagnosis.


Solution

  • requests_html depends on pyppeteer, but it seems your pyppeteer has not installed Chromium completely. Try installing Chromium manually: activate the environment that contains pyppeteer and run pyppeteer-install.exe.
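If running the installer alone does not help, the same reinstall can be sketched from Python. This is a minimal sketch, not a confirmed fix. It assumes the installed pyppeteer version exposes DOWNLOADS_FOLDER, download_chromium() and chromium_executable() in its chromium_downloader module. The helper names remove_partial_download and reinstall_chromium are made up for illustration. The idea is to delete the possibly incomplete local-chromium folder first, so that pyppeteer fetches a fresh copy instead of reusing a broken chrome.exe.

```python
import shutil
from pathlib import Path


def remove_partial_download(downloads_folder: Path) -> bool:
    """Delete a possibly incomplete Chromium download folder.

    Returns True if a folder was removed, False if there was nothing to remove.
    """
    if downloads_folder.exists():
        shutil.rmtree(downloads_folder)
        return True
    return False


def reinstall_chromium() -> Path:
    """Remove pyppeteer's local Chromium and download it again (hypothetical helper)."""
    # Imported here so the helper above works even without pyppeteer installed.
    # Assumption: these names exist in the installed pyppeteer version.
    from pyppeteer import chromium_downloader as cd

    remove_partial_download(Path(cd.DOWNLOADS_FOLDER))
    cd.download_chromium()  # fetches and extracts Chromium into DOWNLOADS_FOLDER
    return Path(cd.chromium_executable())
```

Calling reinstall_chromium() once from the same environment should leave a fresh chrome.exe under the pyppeteer folder shown in the event log. After that, r.html.render() can be tried again.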