vb.net, utf-8, character-encoding, linq-to-xml, xml-encoding

XDocument fails loading on ÅÄÖ chars (VB.net)


I have this code for looping over the reports in an XML file:

Dim xmlr As XDocument = XDocument.Load("Myfile.xml")
For Each report As XElement In xmlr.Descendants("Report")
    'Do stuff with report values
Next

This works, but I get an error if the file contains characters like ÅÄÖ. The XML document declares its encoding as UTF-8:

<?xml version="1.0" encoding="utf-8"?>

I found this post here and tried this code instead, but it does not help:

Dim xmlr As XDocument

Using oReader As StreamReader = New StreamReader("Myfile.xml", Encoding.GetEncoding("UTF-8"))
    xmlr = XDocument.Load(oReader)
End Using

Any suggestions?


Solution

  • Based on your comments, the input document you are trying to process is not well-formed XML, because it has unescaped ampersands (&) in element or attribute content. In XML syntax the ampersand starts a character reference (e.g. &#160;) or an entity reference (e.g. &lt;), so it has to be escaped as &amp; if it should appear literally in content (e.g. <foo>a &amp; b</foo>). An alternative is a CDATA section: <foo><![CDATA[a & b]]></foo>.

    So the .NET framework's XML parser is doing the right thing by telling you that the input is not well-formed XML and where the error is. That is all the APIs in System.Xml and below can do: they read and write well-formed XML. There is no API that tries to correct errors.
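To make the point above concrete, here is a minimal sketch that feeds a few small strings to XDocument.Parse and reports what the parser says. The module name AmpersandDemo, the helper DescribeParseError and the sample strings are made up for illustration and are not taken from the question's file. The sketch assumes the failure surfaces as an XmlException, whose LineNumber and LinePosition properties say where the parser stopped.

```vbnet
Imports System
Imports System.Xml
Imports System.Xml.Linq

Module AmpersandDemo
    ' Hypothetical helper: returns Nothing when the text is well-formed XML,
    ' otherwise a description of where the parser stopped and why.
    Function DescribeParseError(xml As String) As String
        Try
            XDocument.Parse(xml)
            Return Nothing
        Catch ex As XmlException
            Return $"Line {ex.LineNumber}, position {ex.LinePosition}: {ex.Message}"
        End Try
    End Function

    Sub Main()
        ' Made-up sample inputs; only the first one is not well-formed.
        Dim samples As String() = {
            "<Report>ÅÄÖ a & b</Report>",
            "<Report>ÅÄÖ a &amp; b</Report>",
            "<Report><![CDATA[ÅÄÖ a & b]]></Report>",
            "<Report>ÅÄÖ</Report>"
        }

        For Each sample As String In samples
            Dim problem As String = DescribeParseError(sample)
            If problem Is Nothing Then
                Console.WriteLine("OK:    " & sample)
            Else
                Console.WriteLine("ERROR: " & sample & " -> " & problem)
            End If
        Next
    End Sub
End Module
```

Only the sample with the bare ampersand should be rejected. The ÅÄÖ characters on their own parse without complaint, which suggests the encoding is not the culprit. Wrapping the original XDocument.Load call in the same Try/Catch should point at the offending line and position in Myfile.xml, so the file (or whatever produces it) can be fixed at the source.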