go, web-scraping, go-colly

Is it possible to crawl CSR websites using gocolly?


Is it possible to crawl CSR (client-side rendered, JavaScript-driven) websites using gocolly? I need to crawl many websites, and for each one I have a titleXpath stored in the database, which I use as follows:

c.OnXML(titleXpath, func(e *colly.XMLElement) {
   data = append(data, e.Text)
   fmt.Println("title", e.Text)
})

Can gocolly do this, or do I need another package?


Solution

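  • No, it is not possible to crawl client-side rendered (CSR/JS) websites using gocolly alone. gocolly is a scraping library for Go that operates at the HTTP level and can parse static HTML documents, but it does not execute JavaScript. Callbacks such as OnXML therefore only see the HTML the server returns, not the content that scripts add later in the browser.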

    To scrape CSR websites, you need a headless browser or a web scraping tool that supports JavaScript rendering. Some popular options for scraping CSR websites include: