Is Selenium the only library that can read the data from a table on a web page, specifically the one here?
When I try to parse these sites using bs4, the tables contain only the headers and no data. It works locally using Selenium, but the issue is that I don't have Chrome, or any browser for that matter, on the box I'm working on. Is there another way?
The page you linked to fills its table from another resource that it loads using AJAX after the initial HTML arrives, which is why bs4 only sees the headers. You can see the request in the Network tab of your browser's developer tools:
https://httpd.sslmate.com/ocspwatch/problems
It's plain JSON, you don't even have to scrape it:
    import requests

    # The table on the page is populated from this JSON endpoint.
    certificates = requests.get("https://httpd.sslmate.com/ocspwatch/problems").json()
    for cert in certificates:
        print(cert["problem_time"], ":", cert["problem"], "(", cert["operator_name"], ")")
Output:
2023-03-10T00:11:32+00:00 : error parsing OCSP response: ocsp: error from server: unauthorized ( GoDaddy )
2023-03-10T00:14:58+00:00 : error parsing OCSP response: OCSP response contains bad number of responses ( eMudhra Technologies Limited )
2023-03-10T00:14:57+00:00 : OCSP responder does not know this certificate ( Netlock )
...
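
If installing third-party packages is also a problem on that box, the same idea can probably be done with the standard library alone. This is only a rough sketch: the endpoint and the three field names come from the snippet above, while `fetch_problems`, `format_problem`, and the `timeout` default are names and values I made up for illustration, and I'm assuming the response is still a JSON list of objects.

```python
import json
import urllib.request

# Endpoint taken from the answer above; the response shape (a JSON list of
# objects with these keys) is assumed from the sample output.
PROBLEMS_URL = "https://httpd.sslmate.com/ocspwatch/problems"


def fetch_problems(url=PROBLEMS_URL, timeout=10):
    """Download and decode the JSON document at `url` (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def format_problem(cert):
    """Render one record as 'time : problem ( operator )'.

    Missing keys are shown as '?' instead of raising KeyError.
    """
    return "{} : {} ( {} )".format(
        cert.get("problem_time", "?"),
        cert.get("problem", "?"),
        cert.get("operator_name", "?"),
    )
```

Looping over `fetch_problems()` and printing `format_problem(cert)` for each record should give lines in the same format as the output above. Using `.get()` is just a precaution in case a record lacks one of the fields.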