
Why does PowerShell's xml.Save() create LF-only newlines in multi-line comments on Windows?


I found that xml.Save() ends the lines of multi-line comments with LF instead of CR LF on Windows. Why is that? Here is my code:

    [xml]$myXml= @"
    <myTag>
            <!-- multi line comment
            each line ends with CR LF
            in my code, but with LF
            after .save gets called
            end of my multi line comment -->
    </myTag>
    "@

    $myXml.save("C:\temp\my.xml")

Notepad++ screenshots (not reproduced here) showed that my.ps1 does not contain any LF-only newlines, whereas the comment lines in the saved my.xml end with LF only.

One way to fix this, which I found, is using XmlWriterSettings and XmlTextWriter:

    $settings = New-Object System.Xml.XmlWriterSettings
    $settings.NewLineChars = "`r`n"
    $settings.Indent = $true
    $writer = [System.Xml.XmlTextWriter]::Create($dependencyXmlPath, $settings)
    $myXml.Save($writer)
    $writer.Close()

Is this the simplest solution?


Solution

  • Instruct the [xml] (System.Xml.XmlDocument) instance to preserve insignificant whitespace before loading, by setting its PreserveWhitespace property to $true, which preserves the input newline format as well as the specific intra-line whitespace[1] (note how the indentation changed in your output file).

    # Create an [xml] instance explicitly, so that its
    # .PreserveWhitespace property can be set *before* loading content.
    ($myXml = [xml]::new()).PreserveWhitespace = $true
    # Now load the XML text (parse it into a DOM).
    $myXml.LoadXml(
    @"
    <myTag>
            <!-- multi line comment
            each line ends with CR LF
            in my code, but with LF
            after .save gets called
            end of my multi line comment -->
    </myTag>
    "@
    )
    
    $myXml.Save("C:\temp\my.xml")
    

    However, apart from preserving the indentation, the above only consistently results in Windows-format CRLF newlines if your .ps1 file uses them (which it does).
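
To illustrate why .PreserveWhitespace has to be set before the content is loaded, here is a minimal sketch that parses the same text with and without the property and compares the resulting DOMs. The sample text and the variable names ($xmlText, $default, $preserving) are made up for this illustration.

```powershell
# Hypothetical sample: one element, with whitespace-only text around it.
$xmlText = "<myTag>`n  <child />`n</myTag>"

# Default behavior: insignificant whitespace is discarded while parsing.
$default = [xml]::new()
$default.LoadXml($xmlText)

# Property set *before* loading: whitespace-only nodes are kept in the DOM.
$preserving = [xml]::new()
$preserving.PreserveWhitespace = $true
$preserving.LoadXml($xmlText)

$defaultCount    = $default.DocumentElement.ChildNodes.Count     # <child /> only
$preservingCount = $preserving.DocumentElement.ChildNodes.Count  # whitespace, <child />, whitespace
"default: $defaultCount, preserving: $preservingCount"
```

Once the whitespace nodes have been dropped during parsing, setting the property afterwards cannot bring them back, which is why the [xml] cast is avoided in the solution above.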


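    Generally - unless .PreserveWhitespace = $true is in effect - the .Save() method pretty-prints the document with its own indentation but, as your output file shows, does not convert the newlines inside a multi-line comment to the platform-native format.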


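    A workaround that works without .PreserveWhitespace = $true and / or ensures a consistent output newline format:

    This covers two use cases: pretty-printing the output without preserving the input whitespace, and enforcing a single newline format throughout the output file.

    You can control the output newline format by explicitly creating an XmlWriter instance with an XmlWriterSettings instance whose Indent property is $true and whose NewLineHandling property is 'Replace':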

    # Using an [xml] cast means that insignificant whitespace
    # is *not* preserved.
    [xml] $myXml= @"
    <myTag>
            <!-- multi line comment
            each line ends with CR LF
            in my code, but with LF
            after .save gets called
            end of my multi line comment -->
    </myTag>
    "@
    
    # Create an XML writer explicitly, with settings
    # that pretty-print the output and normalize its newlines.
    $writer = [System.Xml.XmlWriter]::Create(
      "C:\temp\my.xml", 
      [System.Xml.XmlWriterSettings] @{ 
        # Pretty-print, using the value of .IndentChars
        # per indentation level; default is *two spaces*.
        Indent = $true
        # Replace all newlines in the DOM with the character(s) 
        # specified in the .NewLineChars property,
        # which defaults to the platform-native format.
        NewLineHandling = 'Replace'
       }
      )
    
    # Save to the target file via the writer.
    $myXml.Save($writer); $writer.Dispose()
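
If the output must use CRLF regardless of the platform the script runs on, .NewLineChars can be set explicitly instead of relying on its platform-native default. The following is only a sketch under assumptions: the sample document, the temporary output file and the variable names ($doc, $outPath, $hasLfOnly) are invented for illustration.

```powershell
# Hypothetical sample document whose comment deliberately contains LF-only newlines.
$doc = [xml] "<myTag><!-- line 1`nline 2`nline 3 --></myTag>"

# Hypothetical output location: a temporary file (a full path, so that .NET
# does not resolve it against a different working directory).
$outPath = [System.IO.Path]::GetTempFileName()

$settings = [System.Xml.XmlWriterSettings] @{
  Indent          = $true
  # Replace every newline the writer emits ...
  NewLineHandling = 'Replace'
  # ... with CRLF, independent of the platform default.
  NewLineChars    = "`r`n"
}

$writer = [System.Xml.XmlWriter]::Create($outPath, $settings)
try { $doc.Save($writer) } finally { $writer.Dispose() }

# Same check as in the test further down: is any LF-only newline left?
$hasLfOnly = (Get-Content -Raw $outPath) -match '(?<!\r)\n'
$hasLfOnly
```

The try / finally block makes sure the writer is disposed, and the file therefore flushed and closed, even if .Save() throws.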
    

    Testing a given file for the presence of LF-only newlines:

    # Returns $true if at least one LF-only newline is present.
    (Get-Content -Raw $myXmlPath) -match '(?<!\r)\n' 
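
When a plain true / false answer is not enough, the same regex can be used to count both kinds of newline. This is a small sketch; the function name Get-NewlineSummary is hypothetical and not part of PowerShell.

```powershell
# Hypothetical helper: counts CRLF and LF-only newlines in a string.
function Get-NewlineSummary {
  param([Parameter(Mandatory)] [string] $Text)
  [pscustomobject] @{
    CrLf   = [regex]::Matches($Text, '\r\n').Count
    # An LF that is *not* preceded by a CR.
    LfOnly = [regex]::Matches($Text, '(?<!\r)\n').Count
  }
}

# Sample input with one CRLF and two LF-only newlines.
$summary = Get-NewlineSummary -Text "a`r`nb`nc`n"
$summary   # CrLf = 1, LfOnly = 2
```

For a file, pass its raw content, e.g. Get-NewlineSummary -Text (Get-Content -Raw $myXmlPath).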
    

    [1] There is one exception: intra-tag whitespace is not preserved, i.e. the specific whitespace - including any newlines - that separates the element name from the first attribute as well as the whitespace between attributes isn't preserved - see this answer.