.net  xml  vb.net  linq-to-xml

In VB 2010, using XDocument, how do I delete an XML node based on the value (innertext) of one of its descendants?


I need to delete all nodes with a specific attribute value. Using XDocument.Descendants with a Where clause and an inline function, it's a snap to delete nodes based on a specific attribute value.

That works great.

I also need to delete all nodes that have a specific descendant value. It made sense to me to use the same method: check a descendant node of each element, and when that descendant's value (innertext) matches the value I'm looking for, delete the element.

It almost works.

My code:

Dim Xmlstring As String =
    "<?xml version=" & Chr(34) & "1.0" & Chr(34) & " encoding=" & Chr(34) & "UTF-8" & Chr(34) & "?>" & vbNewLine &
    "<domain:DOGs " & vbNewLine &
    "      page=" & Chr(34) & "1" & Chr(34) & vbNewLine &
    "      ofPages=" & Chr(34) & "1" & Chr(34) & vbNewLine &
    "      xmlns:domain=" & Chr(34) & "urn:cat:np.domain.2111-14-42" & Chr(34) & vbNewLine &
    "      xmlns:vs=" & Chr(34) & "urn:cat:vs.2111-15-31" & Chr(34) & vbNewLine &
    "      xmlns:base=" & Chr(34) & "urn:cat:base.2111-15-31" & Chr(34) & ">" & vbNewLine &
    "   <domain:DOG name=" & Chr(34) & "Fido" & Chr(34) & ">" & vbNewLine &
    "      <description>Good</description>" & vbNewLine &
    "   </domain:DOG>" & vbNewLine &
    "   <domain:DOG name=" & Chr(34) & "Abby" & Chr(34) & ">" & vbNewLine &
    "      <description>Bad</description>" & vbNewLine &
    "   </domain:DOG>" & vbNewLine &
    "   <domain:DOG name=" & Chr(34) & Chr(34) & ">" & vbNewLine &
    "      <description>Ugly</description>" & vbNewLine &
    "   </domain:DOG>" & vbNewLine &
    "   <domain:DOG name=" & Chr(34) & "Bruno" & Chr(34) & ">" & vbNewLine &
    "      <description>Sweaty</description>" & vbNewLine &
    "   </domain:DOG>" & vbNewLine &
    "   <domain:DOG name=" & Chr(34) & "Shep" & Chr(34) & ">" & vbNewLine &
    "      <description>Good</description>" & vbNewLine &
    "   </domain:DOG>" & vbNewLine &
    "</domain:DOGs>"

Dim Xdoc As XDocument = XDocument.Parse(Xmlstring)
Dim iFoundCount As Integer = 0

'This works:
With Xdoc.Descendants().Descendants()
    iFoundCount = .Where(Function(e) e.Attributes("name").Any(Function(a) a = "")).Count
    .Where(Function(e) e.Attributes("name").Any(Function(a) a = "")).Remove()
End With
Dim sResultFile As String = "c:\0\1st_result_" & iFoundCount.ToString & ".xml"
Xdoc.Save(sResultFile)
'The result:
'<?xml version="1.0" encoding="utf-8"?>
'<domain:DOGs page="1" ofPages="1" xmlns:domain="urn:cat:np.domain.2111-14-42" xmlns:vs="urn:cat:vs.2111-15-31" xmlns:base="urn:cat:base.2111-15-31">
'  <domain:DOG name="Fido">
'    <description>Good</description>
'  </domain:DOG>
'  <domain:DOG name="Abby">
'    <description>Bad</description>
'  </domain:DOG>
'  <domain:DOG name="Bruno">
'    <description>Sweaty</description>
'  </domain:DOG>
'  <domain:DOG name="Shep">
'    <description>Good</description>
'  </domain:DOG>
'</domain:DOGs>


'This looks like it would work as there aren't any errors in the code
'and it almost works.
Xdoc = Nothing
Xdoc = XDocument.Parse(My.Computer.FileSystem.ReadAllText(sResultFile))
iFoundCount = 0
With Xdoc.Descendants().Descendants()
    iFoundCount = .Where(Function(e) e.Descendants("description").Value.Any(Function(a) a = "Sweaty")).Count '<-- error occurs here
    .Where(Function(e) e.Descendants("description").Value.Any(Function(a) a = "Sweaty")).Remove()
End With

sResultFile = "c:\0\2nd_result_" & iFoundCount.ToString & ".xml"
Xdoc.Save(sResultFile)

I get a popup pointing to the 'iFoundCount =' line that reads

    ArgumentNullException was unhandled
    Value cannot be null.
    Parameter name: source

What am I missing?

What do I do?


Solution

  • I usually find it easier to use XPath queries instead of LINQ syntax, but maybe that's just me. Because the XML uses namespace prefixes, XPath needs an XmlNamespaceManager, but that's easy to set up, as you can see below.

    Please make sure to set Option Strict to On. It makes the compiler flag implicit narrowing conversions and late binding, so type mismatches show up when you compile rather than when you run. The part that you commented as "This works" does not actually compile with Option Strict On, so I have added code which does.

    Here is what I ended up with:

    Option Infer On
    Option Strict On
    
    Imports System.Linq
    Imports System.Xml
    Imports System.Xml.Linq
    Imports System.Xml.XPath
    
    Module Module1
    
        Sub Main()
            Dim xmlstring = "<?xml version=""1.0"" encoding=""utf-8""?>
    <domain:DOGs page=""1"" ofPages=""1"" 
        xmlns:domain=""urn:cat:np.domain.2111-14-42"" 
        xmlns:vs=""urn:cat:vs.2111-15-31"" 
        xmlns:base=""urn:cat:base.2111-15-31""
    >
      <domain:DOG name=""Fido"">
        <description>Good</description>
      </domain:DOG>
      <domain:DOG name=""Abby"">
        <description>Bad</description>
      </domain:DOG>
      <domain:DOG name="""">
        <description>Ugly</description>
      </domain:DOG>
      <domain:DOG name=""Bruno"">
        <description>Sweaty</description>
      </domain:DOG>
      <domain:DOG name=""Shep"">
        <description>Good</description>
      </domain:DOG>
    </domain:DOGs>"
    
            Dim xdoc = XDocument.Parse(xmlstring)
    
            Dim nsm = New XmlNamespaceManager(New NameTable())
            nsm.AddNamespace("domain", "urn:cat:np.domain.2111-14-42")
            nsm.AddNamespace("vs", "urn:cat:vs.2111-15-31")
            nsm.AddNamespace("base", "urn:cat:base.2111-15-31")
    
            Dim nFound = 0
    
            ' Select for an empty-or-whitespace name:
            Dim namelessDogs = xdoc.XPathSelectElements("//domain:DOG[normalize-space(@name)='']", nsm)
    
            nFound = namelessDogs.Count
            namelessDogs.Remove()
    
            Dim resultFile = String.Format("C:\Temp\1st_result_{0}.xml", nFound)
    
            xdoc.Save(resultFile)
    
            ' ###############
    
            xdoc = XDocument.Load(resultFile)
    
            Dim excludeDescription = "Good"
    
            ' Select <domain:DOG> elements which contain a <description> element with a specified value:
            Dim excludedDogs = xdoc.XPathSelectElements(String.Format("//description[text() = '{0}']/parent::domain:DOG", excludeDescription), nsm)
    
            nFound = excludedDogs.Count
            excludedDogs.Remove()
    
            resultFile = String.Format("C:\Temp\2nd_result_{0}.xml", nFound)
    
            xdoc.Save(resultFile)
    
            Console.ReadLine()
    
        End Sub
    
    End Module
    

    I chose to test it with a value which appears more than once ("Good"). Sometimes you can get code working with a value that occurs only once and then discover it handles just the first occurrence when the value is repeated.
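
    For reference, the ArgumentNullException in the question most likely comes from e.Descendants("description").Value. On a collection of elements, VB's XML Value extension property returns the value of the first element, or Nothing when the collection is empty (as it is for each <description> element itself). Calling .Any on that Nothing string then fails with "Value cannot be null. Parameter name: source". If you would rather stay with LINQ to XML than switch to XPath, a minimal sketch that should compile with Option Strict On could look like this. The module and function names (DogFilter, RemoveNamelessDogs, RemoveDogsByDescription) are made up for illustration:

    ```vbnet
    Option Strict On
    Option Infer On

    Imports System.Collections.Generic
    Imports System.Linq
    Imports System.Xml.Linq

    Module DogFilter

        ' Namespace URI bound to the "domain" prefix in the sample XML.
        Private ReadOnly DomainNs As XNamespace = "urn:cat:np.domain.2111-14-42"

        ' Removes every <domain:DOG> whose name attribute is exactly "".
        ' Returns the number of elements removed.
        Function RemoveNamelessDogs(doc As XDocument) As Integer
            ' ToList() takes a snapshot, so counting and removing see the same set.
            Dim matches = doc.Descendants(DomainNs.GetName("DOG")).
                Where(Function(dog) dog.Attributes("name").Any(Function(a) a.Value = "")).
                ToList()
            matches.Remove()
            Return matches.Count
        End Function

        ' Removes every <domain:DOG> that has a <description> descendant whose
        ' text equals descriptionToRemove. Returns the number of elements removed.
        Function RemoveDogsByDescription(doc As XDocument, descriptionToRemove As String) As Integer
            Dim matches = doc.Descendants(DomainNs.GetName("DOG")).
                Where(Function(dog) dog.Descendants("description").Any(Function(d) d.Value = descriptionToRemove)).
                ToList()
            matches.Remove()
            Return matches.Count
        End Function

    End Module
    ```

    Called on the sample document, RemoveNamelessDogs(xdoc) should return 1 and a following RemoveDogsByDescription(xdoc, "Good") should return 2 (Fido and Shep). Because the value is compared in code rather than formatted into an XPath string, a description containing an apostrophe does not break the query.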

    If a name made up only of spaces should count as a name, so that only a truly empty name is removed, drop the normalize-space function:

        Dim namelessDogs = xdoc.XPathSelectElements("//domain:DOG[@name='']", nsm)
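
    To see how the two predicates differ, here is a small sketch that counts the matches for each on a made-up sample. NameMatchDemo, CountDogs and BuildSample are hypothetical names, and the sample XML is not from the question:

    ```vbnet
    Option Strict On
    Option Infer On

    Imports System.Linq
    Imports System.Xml
    Imports System.Xml.Linq
    Imports System.Xml.XPath

    Module NameMatchDemo

        ' Counts the <domain:DOG> elements selected by the given XPath predicate.
        Function CountDogs(doc As XDocument, predicate As String) As Integer
            Dim nsm = New XmlNamespaceManager(New NameTable())
            nsm.AddNamespace("domain", "urn:cat:np.domain.2111-14-42")
            Return doc.XPathSelectElements("//domain:DOG[" & predicate & "]", nsm).Count()
        End Function

        ' Four dogs: empty name, spaces-only name, no name attribute, real name.
        Function BuildSample() As XDocument
            Return XDocument.Parse(
                "<domain:DOGs xmlns:domain=""urn:cat:np.domain.2111-14-42"">" &
                "<domain:DOG name=""""/>" &
                "<domain:DOG name=""   ""/>" &
                "<domain:DOG/>" &
                "<domain:DOG name=""Fido""/>" &
                "</domain:DOGs>")
        End Function

    End Module
    ```

    With this sample, CountDogs(BuildSample(), "normalize-space(@name)=''") should give 3 and CountDogs(BuildSample(), "@name=''") should give 1. Note that the normalize-space form also selects an element with no name attribute at all, since XPath converts a missing attribute to an empty string there.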

    Also, later versions of Visual Studio are available for free, so there's no reason from that aspect to stay on VS2010. The multi-line string literal used in the code above, for one, needs Visual Studio 2015 or later.