
How to save an XmlDocument to a .csproj "formatted like Visual Studio" where empty elements are saved on separate lines?


Currently I've created a PowerShell script to go through hundreds (yes, hundreds) of Visual Studio projects and update the References for consistency, which fixes a number of subtle bugs. It round-trips 99% of the XML, so that subsequently editing the project files in Visual Studio does not introduce another change.

However, with XmlDocument.Save(), empty elements are saved as a pair on the same line:

<StartupObject></StartupObject>

instead of the way the Visual Studio project editor saves the empty element, split across two lines:

<StartupObject>
</StartupObject>

This leads to needless noise on the initial commit that will be reset as programmers update the projects from within Visual Studio.

How can a [modified] XmlDocument be saved using the same formatting rules Visual Studio [at least as of 2017] uses?

Currently the save is done using an explicit XmlWriter and XmlWriterSettings using the following configuration:

$writerSettings = New-Object System.Xml.XmlWriterSettings
$writerSettings.Indent = $true
$writerSettings.IndentChars = $indent # value stolen previously

$writer = [System.Xml.XmlWriter]::Create($path, $writerSettings)
$xml.Save($writer)

If using the XmlDocument.PreserveWhitespace setting, unmodified nodes are not affected. However, in this case "improperly formatted" entries are not fixed and new nodes do not have correct indenting/formatting applied.

Preferably this can be handled with a simple modification to the save call and/or its settings instead of a custom writer (since the script is in PowerShell) or some post-save text manipulation (which feels like a bit of a kluge).


Solution

  • Although in general I think you should not tamper with XML using regex, I also could not find any setting in the XmlWriter class that changes the style in which empty elements are written.

    The only way I got it formatted the way you want is by having the XmlWriter write to a memory stream and use regex -replace on that:

    $writerSettings             = New-Object System.Xml.XmlWriterSettings
    $writerSettings.Indent      = $true
    $writerSettings.IndentChars = '  '
    $writerSettings.Encoding    = New-Object System.Text.UTF8Encoding $false  # set to UTF-8 No BOM
    
    # create a stream object to write to
    $stream = New-Object System.IO.MemoryStream
    
    $writer = [System.Xml.XmlWriter]::Create($stream, $writerSettings)
    $writer.WriteStartDocument()
    $writer.WriteStartElement("Root")
    $writer.WriteElementString("StartupObject", [string]::Empty)
    $writer.WriteElementString("AnotherEmptyElement", [string]::Empty)
    $writer.WriteEndElement()
    $writer.WriteEndDocument()
    $writer.Flush()
    $writer.Dispose()
    
    # get whatever is written to the stream in a string variable
    # (decode as UTF-8, the same encoding the writer was configured with)
    $xml = [System.Text.Encoding]::UTF8.GetString($stream.ToArray())
    
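    # replace self-closing elements: <StartupObject />
    # replace empty elements in one line: <StartupObject></StartupObject>
    # to the format where opening and closing tags are on separate lines
    # (only elements without attributes are matched; otherwise the attributes
    #  would be copied into the closing tag and the XML would become invalid)
    $format = '$1<$2>{0}$1</$2>' -f [Environment]::NewLine
    $xml = $xml -replace '(?m)^([ \t]*)<([^\s<>/]+)\s*/>', $format -replace '(?m)^([ \t]*)<([^\s<>/]+)></\2>', $format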
    
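    # save to disk as UTF-8 without BOM, matching the encoding in the XML declaration
    [System.IO.File]::WriteAllText('D:\test.xml', $xml, (New-Object System.Text.UTF8Encoding $false))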
    

    Result:

    <?xml version="1.0" encoding="utf-8"?>
    <Root>
      <StartupObject>
      </StartupObject>
      <AnotherEmptyElement>
      </AnotherEmptyElement>
    </Root>
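
The demo above builds the XML by hand, whereas the question starts from an already loaded XmlDocument. As a rough sketch of how the same memory-stream-plus-regex idea might be wrapped for that case: the function name Save-XmlWithSplitEmptyElements and its parameters are made up for illustration, and it assumes that only empty elements without attributes should be split, so self-closing elements that carry attributes are left as the writer emitted them. Whether this matches every formatting rule Visual Studio applies has not been verified.

```powershell
# Hypothetical helper (name and parameters are illustrative assumptions):
# saves an XmlDocument and splits attribute-less empty elements over two lines.
function Save-XmlWithSplitEmptyElements {
    param(
        [Parameter(Mandatory = $true)] [System.Xml.XmlDocument] $Document,
        [Parameter(Mandatory = $true)] [string] $Path,   # use a full path
        [string] $IndentChars = '  '
    )

    $utf8NoBom = New-Object System.Text.UTF8Encoding $false

    $settings             = New-Object System.Xml.XmlWriterSettings
    $settings.Indent      = $true
    $settings.IndentChars = $IndentChars
    $settings.Encoding    = $utf8NoBom

    # Serialize into memory first so the text can be post-processed.
    $stream = New-Object System.IO.MemoryStream
    $writer = [System.Xml.XmlWriter]::Create($stream, $settings)
    try {
        $Document.Save($writer)
    }
    finally {
        $writer.Dispose()   # flushes pending output into the stream
    }
    $text = $utf8NoBom.GetString($stream.ToArray())

    # Rewrite <Name /> and <Name></Name> (no attributes) so the closing tag
    # sits on its own line, repeating the element's indentation.
    $format = '$1<$2>{0}$1</$2>' -f [Environment]::NewLine
    $text = $text -replace '(?m)^([ \t]*)<([^\s<>/]+)\s*/>', $format
    $text = $text -replace '(?m)^([ \t]*)<([^\s<>/]+)></\2>', $format

    [System.IO.File]::WriteAllText($Path, $text, $utf8NoBom)
}
```

It could then be called as Save-XmlWithSplitEmptyElements -Document $xml -Path $path -IndentChars $indent, with $xml, $path and $indent being the variables from the question's script. Since the patterns are anchored to the start of a line, they rely on the writer's Indent setting placing each element on its own line.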