r, web-scraping, yelp, rvest

Using 'rvest' to extract links


I am trying to scrape data from Yelp. One step is to extract the link for each restaurant. For example, I search for restaurants in NYC and get a list of results. I then want to extract the links of all 10 restaurants that Yelp recommends on page 1. Here is what I have tried:

library(rvest)
page <- read_html("http://www.yelp.com/search?find_loc=New+York,+NY,+USA")
page %>% html_nodes(".biz-name span") %>% html_attr('href')

But the code always returns 'NA'. Can anyone help me with that? Thanks!


Solution

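  • Select the .biz-name element itself rather than the span inside it:

    library(rvest)
    page <- read_html("http://www.yelp.com/search?find_loc=New+York,+NY,+USA")
    page %>% html_nodes(".biz-name") %>% html_attr('href')

    The href attribute belongs to the link element that carries the biz-name class, not to the span nested inside it, which is why the original selector returns NA.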
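
To see why dropping `span` from the selector matters, here is a minimal sketch that runs against an inline HTML snippet instead of the live site. The markup is a hypothetical, simplified stand-in for one Yelp search result, not the real page source, and the `/biz/example-restaurant` path is made up for illustration.

```r
library(rvest)

# Hypothetical, simplified markup for a single search result (an assumption,
# not Yelp's actual HTML): the link carries the class and the href,
# and the visible name sits in a nested <span>.
html <- '<html><body>
  <a class="biz-name" href="/biz/example-restaurant"><span>Example Restaurant</span></a>
</body></html>'
page <- read_html(html)

# Selecting the nested <span>: it has no href attribute, so html_attr() gives NA.
span_href <- page %>% html_nodes(".biz-name span") %>% html_attr("href")

# Selecting the <a> element itself: this is where the href lives.
link_href <- page %>% html_nodes(".biz-name") %>% html_attr("href")

# The extracted path is relative; xml2::url_absolute() resolves it against a base URL.
full_url <- xml2::url_absolute(link_href, "http://www.yelp.com/")

print(span_href)
print(link_href)
print(full_url)
```

The same pattern should carry over to the real page as long as the class name sits on the link element. If the site's markup differs or changes, the selector would need to be adjusted after inspecting the page source.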