powershell io-redirection

Why does PowerShell redirection >> change the formatting of the text content?


I want to use the redirect append >> or write > to write to a txt file, but when I do, I receive a weird format "\x00a\x00p...".

Set-Content and Add-Content work as expected. Why don't the >> and > redirection operators?

Below is the output shown with PowerShell's cat as well as a simple Python print.

rocket_brain> new-item test.txt
rocket_brain> "appended using add-content" | add-content test.txt
rocket_brain> cat test.txt

 appended using add-content

but then if I use redirect append >>

rocket_brain> "appended using redirect" >> test.txt
rocket_brain> cat test.txt

 appended using add-content
 a p p e n d e d   u s i n g   r e d i r e c t

Simple Python script: read_test.py

with open("test.txt", "r") as file:   # open test.txt in read mode
    data = file.readlines()           # read all lines into the list data
    print(data)                       # output the list, one item per input line

Using read_test.py I see a difference in formatting

rocket_brain> python read_test.py
 ['appended using add-content\n', 'a\x00p\x00p\x00e\x00n\x00d\x00e\x00d\x00 \x00u\x00s\x00i\x00n\x00g\x00 \x00r\x00e\x00d\x00i\x00r\x00e\x00c\x00t\x00\r\x00\n', '\x00']

NOTE: If I use only the redirect append >> (or write >) without first using Add-Content, the cat output looks normal (instead of spaced out), but the Python script then shows the \x00 format for every line (including lines added by any Add-Content command after starting with the > operators). Opening the file in Notepad (or VS etc.), the text always looks as expected. Using >> or > in cmd (instead of PS) also stores the text in the expected ASCII format.

Related links: cmd redirection operators, PS redirection operators


Solution

  • Note: The problem is ultimately that in Windows PowerShell different cmdlets / operators use different default encodings. This problem has been resolved in PowerShell (Core) 7+, where BOM-less UTF-8 is consistently used.



    Aside from different default encodings (in Windows PowerShell), it is important to note that Set-Content / Add-Content on the one hand and > / >> / Out-File [-Append] on the other behave fundamentally differently with non-string input:

    In short: the former apply simple .ToString()-formatting to the input objects, whereas the latter perform the same output formatting you would see in the console - see this answer for details.
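To make the encoding difference from the note above concrete, here is a minimal Python sketch of the bytes each default would produce for the same string. It is an illustration only: `cp1252` is an assumption standing in for whatever ANSI code page is active on the system, and the variable names are made up for this example.

```python
import codecs

# The same text each command would write, with a Windows line ending.
text = "appended using redirect\r\n"

# Set-Content / Add-Content in Windows PowerShell: ANSI encoding.
# Assumption: cp1252 stands in for the system's active ANSI code page.
ansi_bytes = text.encode("cp1252")

# > / >> / Out-File in Windows PowerShell when creating a new file:
# UTF-16LE, preceded by a byte-order mark (BOM).
utf16_bytes = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

# PowerShell (Core) 7+: UTF-8 without a BOM, for all of these commands.
utf8_bytes = text.encode("utf-8")

for label, data in [
    ("ANSI (cp1252)", ansi_bytes),
    ("UTF-16LE with BOM", utf16_bytes),
    ("UTF-8 without BOM", utf8_bytes),
]:
    # Show the size and the first few bytes of each encoding.
    print(f"{label:<20} {len(data):>3} bytes  {data[:8]!r}")
```

For pure ASCII text the ANSI and UTF-8 bytes are identical, which is why the mismatch only shows up with the UTF-16 output. On the Python side, a file created entirely with > in Windows PowerShell should be readable with `open("test.txt", encoding="utf-16")`, since that codec uses the BOM to pick the byte order.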


    [1] Due to the initial content set by Add-Content, Windows PowerShell interprets the file as ANSI-encoded (the default in the absence of a BOM), where each byte is its own character. The UTF-16 content appended later is therefore also interpreted as if it were ANSI, so the 0x0 bytes are treated like characters in their own right, which print to the console like spaces.
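The effect described in this footnote can be reproduced without PowerShell. The following Python sketch builds the mixed-encoding bytes in memory and decodes them as ANSI. It assumes `cp1252` as the ANSI code page and assumes, in line with the output shown in the question, that >> appended UTF-16LE bytes without a BOM because the file already existed.

```python
# Assumption: cp1252 stands in for the system's active ANSI code page.
ansi_part = "appended using add-content\r\n".encode("cp1252")   # written by Add-Content
utf16_part = "appended using redirect\r\n".encode("utf-16-le")  # appended by >>, no BOM

# The file on disk is simply both byte sequences back to back.
raw = ansi_part + utf16_part

# Interpreting the whole file as ANSI turns every 0x00 byte into its own character.
as_ansi = raw.decode("cp1252")
print(repr(as_ansi))

# The console renders NUL characters much like spaces; emulate that here.
console_like = as_ansi.replace("\x00", " ")
print(console_like)

# Decoding each part with the encoding it was written in recovers the text.
recovered = raw[:len(ansi_part)].decode("cp1252") + raw[len(ansi_part):].decode("utf-16-le")
print(repr(recovered))
```

The second print shows the spaced-out `a p p e n d e d ...` line from the question, and the first shows the `\x00` characters that the Python script reported.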