Is it possible to scrape the products from an e-commerce site using the anemone and nokogiri libs in Ruby?
I understand how to pull the data I need from each product page using nokogiri but I can't figure out how to make anemone/nokogiri crawl the site and grab all the product pages.
A push in the right direction would be much appreciated.
I figured out my issues. The first was that anemone didn't seem to be crawling all the pages. This was because the pages I wanted were under a subdomain, which I had to tell anemone to crawl separately from the main domain. The second was that I needed a way to determine which pages were actually product pages (and thus needed to be parsed). I did this by extracting one of the fields I wanted (the SKU number) from each page and then testing whether it looked like a SKU with a regex.
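For anyone who lands here later, this is a rough sketch of the approach described above, not my exact code. The SKU format, the CSS selector and the example.com URLs are all made up for illustration, so swap in whatever your site uses. It starts one anemone crawl per start URL (main domain and subdomain separately) and only treats a page as a product page when the SKU field matches the regex.

```ruby
# Hypothetical SKU format: 2-4 capital letters, a hyphen, 4-8 digits (e.g. "AB-12345").
SKU_PATTERN = /\A[A-Z]{2,4}-\d{4,8}\z/

# True when the text, ignoring surrounding whitespace, looks like a SKU.
def sku?(text)
  SKU_PATTERN.match?(text.to_s.strip)
end

# Crawls each start URL separately and yields (url, sku) for every page
# whose SKU field matches SKU_PATTERN. sku_selector is a CSS selector
# for the element holding the SKU (an assumption about the page markup).
def each_product(start_urls, sku_selector)
  # Required here so sku? can be used without the gems installed.
  require 'anemone'

  start_urls.each do |start_url|
    Anemone.crawl(start_url) do |anemone|
      anemone.on_every_page do |page|
        doc = page.doc # Nokogiri document, nil for non-HTML responses
        next unless doc

        node = doc.at_css(sku_selector)
        yield page.url, node.text.strip if node && sku?(node.text)
      end
    end
  end
end

# Example call (placeholder URLs and selector):
# each_product(['http://www.example.com/', 'http://shop.example.com/'], 'span.sku') do |url, sku|
#   puts "#{url}: #{sku}"
# end
```

Inside the block passed to each_product you can pull the rest of the product fields with nokogiri as usual. Pages that have no SKU element, or whose SKU text does not match the pattern, are simply skipped.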