javascript xpath imacros

iMacros: Scrape URLs using XPath


Here is my code, which scrapes Google search:

SET !LOOP 1
TAG XPATH=".//*[@id='rso']/div/div[{{!LOOP}}]/div/h3/a" EXTRACT=HREF
TAG XPATH=".//*[@id='rso']/div/div[{{!LOOP}}]/div/h3/a" EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=google_search.txt

How do I fix it? Maybe the whole XPath is wrong.


Solution

  • On many Google search result pages, more than one element matches this XPath:

    .//*[@id='rso']/div/div[1]/div/h3/a
    

    or

    .//*[@id='rso']/div/div[2]/div/h3/a
    

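    But iMacros doesn't support this kind of XPath. I suggest a different approach: extract the title of each result by position, then use that title text to find the matching link and extract its URL. Please try the following code.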

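    ' Don't wait for missing elements, continue on errors, suppress the extraction popup
    SET !TIMEOUT_STEP 0
    SET !ERRORIGNORE YES
    SET !EXTRACT_TEST_POPUP NO
    ' Start the loop counter at the first result
    SET !LOOP 1
    
    ' Extract the title of the result at the current loop position
    TAG POS={{!LOOP}} TYPE=H3 ATTR=CLASS:r EXTRACT=TXT
    ' Find the link whose text is that title and extract its URL
    TAG POS=1 TYPE=A ATTR=TXT:{{!EXTRACT}} EXTRACT=HREF
    
    ' Append the extracted title and URL to a CSV file in the default folder
    SAVEAS TYPE=EXTRACT FOLDER=* FILE=google_search.csv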
    

    This code worked fine for me.
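
The same two-step lookup can also be driven from an iMacros JavaScript (.js) file rather than the loop button. This is only a rough sketch, not a tested solution: `buildMacro` and `MAX_RESULTS` are hypothetical names, it assumes the iMacros scripting function `iimPlay` is available and accepts a macro passed inline with the `CODE:` prefix, and the `CLASS:r` selector is copied from the macro above, so it may not match current Google markup.

```javascript
// Hypothetical helper: builds the macro text for one search result position.
function buildMacro(position) {
  return [
    "SET !TIMEOUT_STEP 0",
    "SET !ERRORIGNORE YES",
    "SET !EXTRACT_TEST_POPUP NO",
    // Extract the title of the result at the given position
    "TAG POS=" + position + " TYPE=H3 ATTR=CLASS:r EXTRACT=TXT",
    // Find the link with that title text and extract its URL
    "TAG POS=1 TYPE=A ATTR=TXT:{{!EXTRACT}} EXTRACT=HREF",
    "SAVEAS TYPE=EXTRACT FOLDER=* FILE=google_search.csv"
  ].join("\n");
}

var MAX_RESULTS = 10; // assumed number of results on one page

// iimPlay only exists inside iMacros; the guard lets the file load elsewhere.
if (typeof iimPlay === "function") {
  for (var i = 1; i <= MAX_RESULTS; i++) {
    iimPlay("CODE:" + buildMacro(i));
  }
}
```

Each call plays a fresh macro, so `{{!EXTRACT}}` holds only the title of the current result when the second TAG line runs.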