htmlexcelvbainternet-explorerfirefox

Return URL From First Search Result


I have an Excel workbook of around 25,000 company keywords from which I'd like to get the company website URL.

I am looking to run a VBA script which can run these keywords as a Google search, and pull the URL of the first result into a spreadsheet.

I found a similar thread.
The results of this to be hit-and-miss; some keywords return the URL in the next column, others remain blank.
It also seemed to pull the URL of Google's optimised sub-links in the first search result rather than the main website URL: Google Search Result example

I then found the below code here which I ran on a sample list of 1,000 keywords. The author of this blog stipulates that this code works for Mozilla Firefox.

I tested IE code that he has also written but this did not achieve the same results (it was adding hyperlinks consisting of descriptive text from the search results rather than the raw URL).

The Firefox code worked until the 714th row, then returned a error message

"Run time error 91: object variable or with block variable not set"

Spreadsheet layout showing successful results and row at which macro stopped
enter image description here

Sub GoogleURL ()

    Dim url As String, lastRow As Long
    Dim XMLHTTP As Object
    Dim html As Object
    Dim objResultDiv As Object
    Dim objH As Object

    lastRow = Range(“A” & Rows.Count).End(xlUp).Row

    For i = 2 To lastRow

        url = “https://www.google.co.uk/search?q=” & Cells(i, 1) & “&rnd=” & WorksheetFunction.RandBetween(1, 10000)

        Set XMLHTTP = CreateObject(“MSXML2.serverXMLHTTP”)

        XMLHTTP.Open “GET”, url, False

        XMLHTTP.setRequestHeader “Content-Type”, “text/xml”

        XMLHTTP.setRequestHeader “User-Agent”, “Mozilla/5.0 (Windows NT 6.1; rv:25.0) Gecko/20100101 Firefox/25.0”

        XMLHTTP.send

        Set html = CreateObject(“htmlfile”)

        html.body.innerHTML = XMLHTTP.ResponseText

        Set objResultDiv = html.getelementbyid(“rso”)

        Set objH = objResultDiv.getelementsbytagname(“h3”)(0)

        Cells(i, 2).Value = objH.innerText

        Set html = CreateObject(“htmlfile”)

        html.body.innerHTML = XMLHTTP.ResponseText

        Set objResultDiv = html.getelementbyid(“rso”)

        Set objH = objResultDiv.getelementsbytagname(“cite”)(0)

        Cells(i, 3).Value = objH.innerText

        DoEvents

    Next

End Sub

Solution

  • As Firefox is a third party browser for the support scope of Microsoft, I can help you to check the VBA code for the IE browser.

    You said that the VBA code given in this link for the IE browser generates the description with the link and your requirement is to store description and link in a separate column.

    I tried to modify that sample code as per your requirement.

    Here is the modified code from that sample.

    Option Explicit
    Const TargetItemsQty = 1 ' results for each keyword
    
    Sub GWebSearchIECtl()
    
        Dim objSheet As Worksheet
        Dim objIE As Object
        Dim x As Long
        Dim y As Long
        Dim strSearch As String
        Dim lngFound As Long
        Dim st As String
        Dim colGItems As Object
        Dim varGItem As Variant
        Dim strHLink As String
        Dim strDescr As String
        Dim strNextURL As String
    
        Set objSheet = Sheets("Sheet1")
        Set objIE = CreateObject("InternetExplorer.Application")
        objIE.Visible = True ' for debug or captcha request cases
        y = 1 ' start searching for the keyword in the first row
        With objSheet
            .Select
            .Range(.Columns("B:B"), .Columns("B:B").End(xlToRight)).Delete ' clear previous results
            .Range(.Columns("C:C"), .Columns("C:C").End(xlToRight)).Delete ' clear previous results
            .Range("A1").Select
            Do Until .Cells(y, 1) = ""
                x = 2 ' start writing results from column B
                .Cells(y, 1).Select
                strSearch = .Cells(y, 1) ' current keyword
                With objIE
                    lngFound = 0
                    .navigate "https://www.google.com/search?q=" & EncodeUriComponent(strSearch) ' go to first search results page
                    Do
                        Do While .Busy Or Not .READYSTATE = 4: DoEvents: Loop ' wait IE
                        Do Until .document.READYSTATE = "complete": DoEvents: Loop ' wait document
                        Do While TypeName(.document.getelementbyid("res")) = "Null": DoEvents: Loop ' wait [#res] element
                        Set colGItems = .document.getelementbyid("res").getElementsByClassName("g") ' collection of search result [.g] items
                        For Each varGItem In colGItems ' process each item in collection
                            If varGItem.getelementsbytagname("a").Length > 0 And varGItem.getElementsByClassName("st").Length > 0 Then ' must have hyperlink and description
                                strHLink = varGItem.getelementsbytagname("a")(0).href ' get first hyperlink [a] found in current item
                                strDescr = GetInnerText(varGItem.getElementsByClassName("st")(0).innerHTML) ' get first description [span.st] found in current item
                                lngFound = lngFound + 1
                                'Debug.Print (strHLink)
                                'Debug.Print (strDescr)
                                With objSheet ' put result into cell
                                     .Cells(y, x).Value = strDescr
                                     .Hyperlinks.Add .Cells(y, x + 1), strHLink
                                    .Cells(y, x).WrapText = True
                                    x = x + 1 ' next column
                                End With
                                If lngFound = TargetItemsQty Then Exit Do ' continue with next keyword - necessary quantity of the results for current keyword found
                            End If
                            DoEvents
                        Next
                        If TypeName(.document.getelementbyid("pnnext")) = "Null" Then Exit Do ' continue with next keyword - no [a#pnnext.pn] next page button exists
                        strNextURL = .document.getelementbyid("pnnext").href ' get next page url
                        .navigate strNextURL ' go to next search results page
                    Loop
                End With
                y = y + 1 ' next row
            Loop
        End With
        objIE.Quit
    
        ' google web search page contains the elements:
        ' [div#res] - main search results block
        ' [div.g] - each result item block within [div#res]
        ' [a] - hyperlink ancor(s) within each [div.g]
        ' [span.st] - description(s) within each [div.g]
        ' [a#pnnext.pn] - hyperlink ancor to the next search results page
    
    End Sub
    
    Function EncodeUriComponent(strText As String) As String
        Static objHtmlfile As Object
    
        If objHtmlfile Is Nothing Then
            Set objHtmlfile = CreateObject("htmlfile")
            objHtmlfile.parentWindow.execScript "function encode(s) {return encodeURIComponent(s)}", "jscript"
        End If
        EncodeUriComponent = objHtmlfile.parentWindow.encode(strText)
    End Function
    
    Function GetInnerText(strText As String) As String
        Static objHtmlfile As Object
    
        If objHtmlfile Is Nothing Then
            Set objHtmlfile = CreateObject("htmlfile")
            objHtmlfile.Open
            objHtmlfile.Write "<body></body>"
        End If
        objHtmlfile.body.innerHTML = strText
        GetInnerText = objHtmlfile.body.innerText
    End Function
    

    Output in IE 11 browser:

    enter image description here

    You can try to run it on your side to see the results with large amount of data.

    If you meet with any performance issue then I suggest you try it with a smaller amount of data.