Using .NET, how can I programmatically get the HTML that is shown in a browser (and that can be saved from browsers such as Chrome or Opera through the Save As command)?
Using HtmlDocument.Load() or wget is to no avail; neither gives me what I want.
See also the discussion here.
EDIT
Unfortunately, the .NET WebClient class (or rather the newer System.Net.Http.HttpClient class) did not help (see the answer by bdcoder). I got the same result as with HtmlDocument.Load() or wget, not the HTML that the browsers save.
let myHtml =
    async {
        let client = new System.Net.Http.HttpClient()
        let! responseBody =
            client.GetStringAsync("https://www.kodis.cz/lines/region?tab=232-293")
            |> Async.AwaitTask
        return responseBody
    }
    |> Async.RunSynchronously
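If you look in the network panel of the browser's dev tools, you can see the endpoint that the page's JavaScript calls to get the PDF data. That data is loaded after the initial page load, which is why HttpClient and wget return only the initial HTML. You can use HttpClient to request the same data and then parse the JSON to get the PDF links.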
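As a rough sketch of that approach (not tested against the actual site): the code below requests a JSON endpoint with HttpClient and walks the response with System.Text.Json, collecting every string value that ends in ".pdf". The function names `collectPdfLinks`, `extractPdfLinks` and `fetchPdfLinks` are made up for illustration. The endpoint URL is whatever you find in the network panel, and the assumption that the links are plain strings ending in ".pdf" may not hold for the real response, so adjust the filter to the JSON you actually get.

```fsharp
open System
open System.Net.Http
open System.Text.Json

// Recursively collect every JSON string value that ends in ".pdf".
// Assumption: the PDF links appear as plain string values somewhere in the JSON.
let rec collectPdfLinks (element: JsonElement) : string list =
    match element.ValueKind with
    | JsonValueKind.String ->
        let value = element.GetString()
        if value.EndsWith(".pdf", StringComparison.OrdinalIgnoreCase) then [ value ] else []
    | JsonValueKind.Array ->
        [ for item in element.EnumerateArray() do
            yield! collectPdfLinks item ]
    | JsonValueKind.Object ->
        [ for prop in element.EnumerateObject() do
            yield! collectPdfLinks prop.Value ]
    | _ -> []

// Parse a JSON string and return the PDF links found in it, in document order.
let extractPdfLinks (json: string) : string list =
    use document = JsonDocument.Parse(json)
    collectPdfLinks document.RootElement

// Download the JSON from the endpoint seen in the network panel and extract the links.
// endpointUrl is a placeholder parameter; the real URL is not known here.
let fetchPdfLinks (endpointUrl: string) : string list =
    async {
        use client = new HttpClient()
        let! body = client.GetStringAsync(endpointUrl) |> Async.AwaitTask
        return extractPdfLinks body
    }
    |> Async.RunSynchronously
```

You would call `fetchPdfLinks` with the URL copied from the network panel. If the endpoint expects particular headers or a POST body, those would have to be copied from the browser request as well.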