Using HtmlProvider to access a web-based table sometimes returns a fraction as a string (correct) and, at other times, returns a DateTime (incorrect). What am I missing?
module Test =
    open System.IO
    open FSharp.Data

    let [<Literal>] url = "https://www.example.com/fractions"
    type profile = HtmlProvider<url>
    let profile = profile.Load(url)

    let [<Literal>] resultFile = @"C:\temp\data\Profile.csv"

    // Write every row of the first table to a semicolon-separated file.
    let CsvResult =
        do
            use writer = new StreamWriter(resultFile, false)
            writer.WriteLine "\"Date\";\"Fraction\""
            for row in profile.Tables.Table1.Rows do
                "\"" + row.``Date``.ToString() + "\"" + ";" |> writer.Write
                "\"" + row.``Fraction``.ToString() + "\"" + ";" |> writer.WriteLine
            // No explicit Close needed: `use` disposes the writer at the end of the scope.

    let csvResult = CsvResult
Without seeing sample data I can't be 100% certain, but I'm guessing that it's parsing fractions as dates whenever the numbers involved would form a valid date in the culture you're using. For example, 1/4 would be a valid date in any culture that uses "/" as a date separator, and would be treated as either April 1st or January 4th, depending on which parsing culture is in effect.
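To make the ambiguity concrete, here is a minimal sketch that asks .NET whether a string such as 1/4 parses as a date under a given culture. It assumes the provider's type inference behaves roughly like DateTime.TryParse, which is an assumption rather than something confirmed from the FSharp.Data source. The helper name tryAsDate and the culture list are made up for illustration.

```fsharp
open System
open System.Globalization

// Hypothetical helper: Some date if `text` parses as a DateTime under the
// named culture, None otherwise. An empty name means the invariant culture.
let tryAsDate (cultureName: string) (text: string) : DateTime option =
    let culture = CultureInfo.GetCultureInfo cultureName
    match DateTime.TryParse(text, culture, DateTimeStyles.None) with
    | true, date -> Some date
    | _ -> None

// Compare how the same fraction is read under a few cultures.
for name in [ ""; "en-US"; "en-GB" ] do
    match tryAsDate name "1/4" with
    | Some date -> printfn "%-6s -> date, month %d, day %d" name date.Month date.Day
    | None -> printfn "%-6s -> not a date" name
```

If a fraction from your table shows up as a date here, it is a plausible candidate for being inferred as DateTime by the provider as well.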
Other type providers in FSharp.Data (such as the CSV type provider) allow you to configure how each column will be parsed, but that's not an option the HTML type provider gives you (which is a bit of a missing feature). However, the HTML type provider does allow you to specify the culture used for date and number parsing, so one way you might be able to work around this is to specify a culture that does not use "/" as a date separator but still uses "." as the decimal point. The decimal point matters because, in a culture that uses "," as the decimal separator, a number written as 1,000 for one thousand could be interpreted as 1. One such culture is en-IN ("English (India)"), where the date separator is "-" and the decimal point is ".".
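Culture data comes from the operating system or the ICU library bundled with .NET, so the separators a culture reports can differ between platforms and runtime versions. As a quick check before relying on a particular culture, the following sketch prints the separators your machine reports. The helper name separators and the list of cultures are only illustrative.

```fsharp
open System.Globalization

// Hypothetical helper: (date separator, decimal separator) for a culture name.
// An empty name means the invariant culture.
let separators (cultureName: string) : string * string =
    let culture = CultureInfo.GetCultureInfo cultureName
    culture.DateTimeFormat.DateSeparator, culture.NumberFormat.NumberDecimalSeparator

// Print what this machine reports for a few candidate cultures.
for name in [ "en-IN"; "en-US"; "de-DE" ] do
    let dateSep, decimalSep = separators name
    printfn "%s: date separator %A, decimal separator %A" name dateSep decimalSep
```

A culture is a suitable candidate for the workaround only if the first value is not "/" and the second is ".".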
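So try passing Culture="en-IN" in your HtmlProvider options, i.e. HtmlProvider<url, Culture="en-IN">, and see if that helps it stop treating fractions as dates. (Static parameters of a type provider must be compile-time constants, so the culture is given by name as a string rather than as a System.Globalization.CultureInfo object.)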
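Applied to the code in the question, the change could look like the sketch below. It reuses the URL, table name, and column names from the question. Whether the Fraction column is then inferred as a string depends on the actual page contents, so treat this as something to try rather than a guaranteed fix.

```fsharp
#r "nuget: FSharp.Data"
open FSharp.Data

let [<Literal>] url = "https://www.example.com/fractions"

// Culture is a static parameter, so it is passed as a string literal.
type Profile = HtmlProvider<url, Culture = "en-IN">

let profile = Profile.Load(url)

// Print each row; with the culture applied, 1/4 should no longer look like a date
// as long as en-IN reports a date separator other than "/" on this machine.
for row in profile.Tables.Table1.Rows do
    printfn "%s;%s" (string row.Date) (string row.Fraction)
```

Hovering over row.Fraction in the editor shows the inferred type, which is the quickest way to confirm whether the culture setting had the intended effect.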