excelvbaweb-scraping

Extract data with VBA from weather website


I am trying to copy specific data from a weather forecast table on https://www.wunderground.com. To be more exact, I am trying to get the Time and the Cloud Cover in Excel in a tabular format (but for now any format will do) from a link like https://www.wunderground.com/hourly/ro/mizil/45.00,26.44/date/2020-04-15. So far I have tried many ways to get that specific data but I am falling short (I am new to web scraping with VBA). The concepts and commands are quite clear to me and they have worked for other sites, but for this site I am at wits' end. Currently, I am using:

Sub WeatherScrap()

Range("A1").Select

Dim mainlink As String Dim http As New XMLHTTP60, html As New HTMLDocument Dim CloudCover As Object

mainlink = "https://www.wunderground.com/hourly/ro/mizil/45.00,26.44/date/2020-04-15"

    With http
        .Open "GET", mainlink, False
        .send
        html.body.innerHTML = .responseText
    End With

    For Each CloudCover In html.getElementsByClassName("wu-value wu-value-to")
        ActiveCell.Value = CloudCover.innerText
        ActiveCell.Offset(1, 0).Select
    Next CloudCover

End Sub

I am obviously not referencing the right classes, tags or IDs on the html (I have tried with many so far, but none retrieve the desired data). The html element on the site is:

<lib-display-unit _ngcontent-app-root-c213="" _nghost-app-root-c122="" class="ng-star-inserted"><span _ngcontent-app-root-c122="" class="test- wu-unit wu-unit-chance ng-star-inserted"><!----><!----><!----><span _ngcontent-app-root-c122="" class="wu-value wu-value-to">100</span>&nbsp;<span _ngcontent-app-root-c122="" class="wu-label"><span _ngcontent-app-root-c122="" class="ng-star-inserted">%</span>

For now, just understanding how to fetch the Cloud Cover percentage from the table would suffice. Can anyone help? Thanks a lot!


Solution

  • I didn't read your whole question, but I'm guessing you want something like this (which is a pretty common way of interacting with a web-based table).

    Sub Web_Table()
        Dim HTMLDoc As New HTMLDocument
        Dim objTable As Object
        Dim lRow As Long
        Dim lngTable As Long
        Dim lngRow As Long
        Dim lngCol As Long
        Dim ActRw As Long
        Dim objIE As InternetExplorer
        Set objIE = New InternetExplorer
        objIE.Navigate "https://www.wunderground.com/hourly/ro/mizil/45.00,26.44/date/2020-04-15"
    
        Do Until objIE.ReadyState = 4 And Not objIE.Busy
            DoEvents
        Loop
        Application.Wait (Now + TimeValue("0:00:03")) 'wait for java script to load
        HTMLDoc.body.innerHTML = objIE.Document.body.innerHTML
        With HTMLDoc.body
            Set objTable = .getElementsByTagName("table")
            For lngTable = 0 To objTable.Length - 1
                For lngRow = 0 To objTable(lngTable).Rows.Length - 1
                    For lngCol = 0 To objTable(lngTable).Rows(lngRow).Cells.Length - 1
                        ThisWorkbook.Sheets("Sheet1").Cells(ActRw + lngRow + 1, lngCol + 1) = objTable(lngTable).Rows(lngRow).Cells(lngCol).innerText
                    Next lngCol
                Next lngRow
                ActRw = ActRw + objTable(lngTable).Rows.Length + 1
            Next lngTable
        End With
        objIE.Quit
    End Sub
    

    Result:

    enter image description here