parsing web-scraping gambas

Web scraping with Gambas, is it possible?


Using Gambas, is it possible to download a webpage into a string and then parse that string? I know how to parse the data once I have it; what I'm struggling with is getting the data from the webpage into a string.


Solution

  • You can use the HttpClient class from the gb.net.curl component.

    The documentation for that class also includes an example of how to read the data either synchronously or asynchronously.

    To get the data from a web page into a string, you could write the following function (synchronous in this case):

    ' Requires the gb.net.curl component to be enabled in the project
    Public Function GetTextFromUrl(url As String) As String
        Dim client As New HttpClient As "client"
    
        client.URL = url
        client.Async = False   ' block until the request has completed
        client.Get()
    
        ' an error occurred
        If client.Status < 0 Then
            Return ""
        Endif
    
        ' no data available
        If Not Lof(client) Then
            Return ""
        Endif
    
        ' Read the data received from the server and return it as a String
        Return Read #client, Lof(client)
    
    End
    

    And you could call the function like this:

    Public Sub Main()
        Print GetTextFromUrl("http://stackoverflow.com")
    End
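
    For comparison, the asynchronous route could look roughly like the sketch below. It is only a sketch under a few assumptions: the code lives in a module or class with gb.net.curl enabled, the program keeps running its event loop after the request is started, and the names StartDownload, hClient and "Downloader" are made up for illustration. It relies on the Finished and Error events of HttpClient instead of blocking on Get().

    ' Hypothetical names: hClient, StartDownload and "Downloader" are illustrative only
    Private hClient As HttpClient

    Public Sub StartDownload(url As String)
        ' "Downloader" is the event name used as prefix for the handlers below
        hClient = New HttpClient As "Downloader"
        hClient.URL = url
        hClient.Async = True   ' Get() returns immediately; the result arrives through events
        hClient.Get()
    End

    ' Raised once the whole response has been received
    Public Sub Downloader_Finished()
        Dim html As String

        ' Only read if the server actually sent data
        If Lof(hClient) Then
            html = Read #hClient, Lof(hClient)
        Endif

        Print html
    End

    ' Raised if the transfer fails
    Public Sub Downloader_Error()
        Print "Download failed, status: "; hClient.Status
    End

    The synchronous function is simpler when the result is needed straight away, for example in a command-line tool. The event-based form is more likely to be useful in a GUI program, where a blocking request would freeze the interface until the page has loaded.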