html, .net, google-chrome, opera

How can I programmatically get the HTML code that is shown in a browser?


Using .NET, how can I programmatically get the HTML that is shown in a browser (the markup that browsers such as Chrome or Opera save through their Save As commands)?

Neither HtmlDocument.Load() nor wget helps: neither returns the HTML that the browser displays.

See also the discussion here.

EDIT

Unfortunately, the .NET WebClient class (or rather the newer System.Net.Http.HttpClient) did not help (see the answer by bdcoder). I got the same result as with HtmlDocument.Load() or wget, not the HTML that the browsers save.

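// Download the raw response body for the page (no JavaScript is executed).
let myHtml =
    async {
        // 'use' disposes the client when the async block finishes.
        use client = new System.Net.Http.HttpClient()
        let! responseBody =
            client.GetStringAsync("https://www.kodis.cz/lines/region?tab=232-293")
            |> Async.AwaitTask
        return responseBody
    }
    |> Async.RunSynchronously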

Solution

  • If you look in the Network panel of the browser's developer tools, you can see the endpoint that the page's JavaScript calls to get the PDF data. You can use HttpClient to request the same data, then parse the JSON to get the PDF links.

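The approach above could be sketched in F# roughly as follows. This is a minimal sketch under stated assumptions, not the exact code for this site: the `endpoint` value is a hypothetical placeholder (use the URL you actually see in the Network panel), and because the shape of the returned JSON is not given here, the helper simply walks the whole document and collects every string value ending in ".pdf". It assumes System.Text.Json is available (built into .NET Core 3.0 and later; a NuGet package on .NET Framework). The names `collectPdfLinks`, `pdfLinksFromJson` and `fetchPdfLinks` are made up for illustration.

```fsharp
open System
open System.Net.Http
open System.Text.Json

// Hypothetical placeholder: replace with the endpoint shown in the Network panel.
let endpoint = "https://example.com/api/lines"

// Recursively collect every string value that ends in ".pdf".
// Walking the whole tree avoids assuming a particular JSON layout.
let rec collectPdfLinks (element: JsonElement) : string list =
    match element.ValueKind with
    | JsonValueKind.String ->
        let s = element.GetString()
        if s.EndsWith(".pdf", StringComparison.OrdinalIgnoreCase) then [ s ] else []
    | JsonValueKind.Array ->
        [ for item in element.EnumerateArray() do
            yield! collectPdfLinks item ]
    | JsonValueKind.Object ->
        [ for prop in element.EnumerateObject() do
            yield! collectPdfLinks prop.Value ]
    | _ -> []

// Parse a JSON string and return the PDF links found in it.
let pdfLinksFromJson (json: string) : string list =
    use doc = JsonDocument.Parse(json)
    collectPdfLinks doc.RootElement

// Request the JSON from the given URL and extract the PDF links.
let fetchPdfLinks (url: string) : string list =
    async {
        use client = new HttpClient()
        let! body = client.GetStringAsync(url) |> Async.AwaitTask
        return pdfLinksFromJson body
    }
    |> Async.RunSynchronously
```

Calling `fetchPdfLinks endpoint` would then return the links once `endpoint` points at the real URL. If the real JSON stores relative paths or uses a dedicated field for the link, the filter in `collectPdfLinks` would need adjusting to match.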