vba, web-scraping, serverxmlhttp, queryselector

Unable to let my script run through to the end


I've written a script in VBA that uses ServerXMLHTTP requests so that I can set a proxy and a timeout. When I run it, it appears to work, but it gets stuck after trying the first proxy. I want it to keep running until there are no proxies left to try. I added the line While .readyState < 4: DoEvents: Wend only to keep the script from freezing. Whether the proxies work or not, the script should move on, right?

This is what I've tried:

Sub MakeProxiedRequests()
    Dim Http As New ServerXMLHTTP60, Html As New HTMLDocument
    Dim elem As Object, proxyList As Variant, oProxy As Variant

    proxyList = Array( _
        "191.96.42.184:3129", _
        "138.197.108.5:3128", _
        "35.245.145.147:8080", _
        "173.46.67.172:58517", _
        "191.96.42.82:3129", _
        "157.55.201.224:8080", _
        "67.205.172.239:3128", _
        "191.96.42.106:3129" _
    )

    For Each oProxy In proxyList

        Debug.Print "trying with: " & oProxy

        With Http
            .Open "GET", "https://stackoverflow.com/questions/tagged/web-scraping", True
            .setRequestHeader "User-Agent", "Mozilla/5.0"
            .setProxy 2, oProxy
            .setTimeouts 600000, 600000, 15000, 15000 'I don't know the ideal timeout parameters
            On Error Resume Next
            .send
            While .readyState < 4: DoEvents: Wend 'to let not freeze the script

            Html.body.innerHTML = .responseText
            Set elem = Html.querySelectorAll(".summary .question-hyperlink")
            On Error GoTo 0
        End With

        If elem.Length > 0 Then
            Debug.Print elem(0).innerText
        Else:
            Debug.Print "failed with: " & oProxy
        End If

    Next oProxy
End Sub

How can I let my script run until all the proxies have been exhausted?


Solution

  • One possible approach is to track the overall elapsed time of each request and abandon the request once a limit (15 seconds here) is exceeded. Run-time errors raised while polling are treated as a failure as well.

    Sub MakeProxiedRequests()
    
        Const Timeout = "0:00:15"
    
        Dim oHttp As New ServerXMLHTTP60
        Dim oHtml As New HTMLDocument
        Dim oElem As Object
        Dim aProxyList
        Dim sProxy
        Dim t As Date
        Dim bFailed As Boolean
    
        aProxyList = Array( _
            "191.96.42.184:3129", _
            "138.197.108.5:3128", _
            "35.245.145.147:8080", _
            "173.46.67.172:58517", _
            "191.96.42.82:3129", _
            "157.55.201.224:8080", _
            "67.205.172.239:3128", _
            "191.96.42.106:3129" _
        )
        For Each sProxy In aProxyList
            Debug.Print "Trying with: " & sProxy
            With oHttp
                .Open "GET", "https://stackoverflow.com/questions/tagged/web-scraping", True
                .setRequestHeader "User-Agent", "Mozilla/5.0"
                .setProxy 2, sProxy
                .setTimeouts 60000, 60000, 60000, 60000
                .send
                t = Now() + TimeValue(Timeout)
                bFailed = False
                On Error Resume Next
                Do
                    If .readyState = 4 Then Exit Do
                    bFailed = (Now() > t) Or (Err.Number <> 0)
                    If bFailed Then Exit Do
                    DoEvents
                Loop
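                If Not bFailed Then
                    ' A completed request may still have failed (e.g. dead proxy),
                    ' in which case reading .responseText can raise a run-time error
                    oHtml.body.innerHTML = .responseText
                    bFailed = (Err.Number <> 0)
                End If
                On Error GoTo 0
                If Not bFailed Then
                    Set oElem = oHtml.querySelectorAll(".summary .question-hyperlink")
                    bFailed = oElem.Length = 0
                End If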
            End With
            If Not bFailed Then
                Debug.Print oElem(0).innerText
            Else
                Debug.Print "Failed with: " & sProxy
            End If
        Next
    
    End Sub
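
The elapsed-time check above could also be factored into a small helper. The following is only a sketch, not part of the original answer: WaitForRequest is a hypothetical name, and it assumes the waitForResponse method of ServerXMLHTTP (which takes a timeout in seconds and returns True once the response has arrived) in place of polling readyState. The wait is split into one-second slices so that DoEvents can run in between.

```vba
' Hypothetical helper (the name and signature are illustrative only).
' Waits up to nSeconds for an asynchronous ServerXMLHTTP request to finish.
' Returns True if the response arrived in time, False on timeout or error.
Function WaitForRequest(ByVal oHttp As Object, ByVal nSeconds As Long) As Boolean
    Dim i As Long
    On Error GoTo Failed
    For i = 1 To nSeconds
        ' Each call blocks for at most one second
        If oHttp.waitForResponse(1) Then
            WaitForRequest = True
            Exit Function
        End If
        DoEvents ' let the host application process pending events
    Next i
Failed:
    ' Timeout or run-time error: the function returns its default value, False
End Function
```

Inside the loop over proxies it would be called right after .send, for example bFailed = Not WaitForRequest(oHttp, 15), with the response parsed only when bFailed is False. Whether a dead proxy surfaces as a timeout or as a run-time error may vary, which is why the helper treats both as a failure.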