json powershell special-characters readfile writetofile

JSON file to PowerShell and back to JSON file


I am trying to manipulate JSON file data in PowerShell and write it back to the file. Even before any manipulation, when I just read the file, convert it to a JSON object in PowerShell, and write it back to the file, some characters are replaced by escape codes. Here is my code:

$jsonFileData = Get-Content $jsonFileLocation

$jsonObject = $jsonFileData | ConvertFrom-Json

# ... (modify $jsonObject) - commented out so that the same object is written back

$jsonFileDataToWrite = $jsonObject | ConvertTo-Json

$jsonFileDataToWrite | Out-File $jsonFileLocation

Some characters are being replaced by their Unicode escape codes. E.g.:

< is replaced by \u003c
> is replaced by \u003e
' is replaced by \u0027

Sample input:

{
    "$schema": "https://source.com/template.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "accountName": {
            "type": "string",
            "defaultValue": "<sampleAccountName>"
        },
        "accountType": {
            "type": "string",
            "defaultValue": "<sampleAccountType>"
        }
    },
    "variables": {
        "location": "sampleLocation",
        "account": "[parameters('accountName')]",
        "type": "[parameters('accountType')]"
    }
}

Output:

{
    "$schema": "https://source.com/template.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "accountName": {
            "type": "string",
            "defaultValue": "\u003csampleAccountName\u003e"
        },
        "accountType": {
            "type": "string",
            "defaultValue": "\u003csampleAccountType\u003e"
        }
    },
    "variables": {
        "location": "sampleLocation",
        "account": "[parameters(\u0027accountName\u0027)]",
        "type": "[parameters(\u0027accountType\u0027)]"
    }
}

Why is this happening, and what can I do to stop the characters from being replaced, so that they are written back unchanged?


Solution

  • Since ConvertTo-Json in Windows PowerShell uses the .NET JavaScriptSerializer under the hood, the question is more or less already answered here.

    Here's some shameless copypaste:

    The characters are being encoded "properly"! Use a working JSON library to correctly access the JSON data - it is a valid JSON encoding.

    Escaping these characters prevents HTML injection via JSON - and makes the JSON XML-friendly. That is, even if the JSON is emitted directly into JavaScript (as is done fairly often, since JSON is a valid subset of JavaScript), it cannot be used to terminate the enclosing element early, because the relevant characters (e.g. <, >) are encoded within the JSON itself.
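
    To illustrate the point that the escaped form is still valid JSON, here is a minimal sketch that parses both spellings of the same value and compares the results. The variable names are made up for this example; the property name is taken from the sample in the question.

```powershell
# Two spellings of the same JSON document: one escaped, one not.
$escaped   = '{"defaultValue":"\u003csampleAccountName\u003e"}'
$unescaped = '{"defaultValue":"<sampleAccountName>"}'

# Any conforming JSON parser decodes \u003c and \u003e back to < and >.
$fromEscaped   = ($escaped   | ConvertFrom-Json).defaultValue
$fromUnescaped = ($unescaped | ConvertFrom-Json).defaultValue

# Case-sensitive comparison of the two decoded strings.
$fromEscaped -ceq $fromUnescaped   # True
```

    In other words, the escaping only matters if something reads the file as raw text (a human, or a diff tool) rather than through a JSON parser.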


    If you really need to turn character codes back to unescaped characters, the easiest way is probably to do a regex replace for each character code. Example:

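    # Keys are regex patterns: "\\u003c" matches a literal backslash followed by "u003c".
    # (PowerShell itself does not treat the backslash as an escape character in strings.)
    $dReplacements = @{
        "\\u003c" = "<"
        "\\u003e" = ">"
        "\\u0027" = "'"
    }

    $sInFile = "infile.json"
    $sOutFile = "outfile.json"

    # -Raw reads the whole file as one string instead of an array of lines
    $sRawJson = Get-Content -Path $sInFile -Raw
    foreach ($oEnumerator in $dReplacements.GetEnumerator()) {
        $sRawJson = $sRawJson -replace $oEnumerator.Key, $oEnumerator.Value
    }

    # Note: Out-File in Windows PowerShell writes UTF-16 by default; add -Encoding UTF8 if needed
    $sRawJson | Out-File -FilePath $sOutFile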
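
    One edge case with the replacement loop above: a JSON string that contains an escaped backslash followed by the text u003c (written \\u003c in the file) would also be rewritten, which changes the data. A slightly more careful variant, sketched below, skips such cases by checking that the escape is not preceded by an odd backslash. The function name Convert-HtmlSafeJsonEscape is hypothetical, and the list of codes is just the three from the question; it deliberately does not touch escapes such as \u0022 or \u005c, because decoding those would break the JSON.

```powershell
# Hypothetical helper (not part of the original answer): decodes only the
# \u003c, \u003e and \u0027 escapes and leaves everything else untouched.
function Convert-HtmlSafeJsonEscape {
    param([Parameter(Mandatory)][string]$Json)

    # (?i)         : hex digits may be upper or lower case
    # (?<!\\)      : do not start in the middle of a run of backslashes
    # ((?:\\\\)*)  : group 1 = any escaped backslashes (pairs), kept as-is
    # \\u(...)     : the escape itself; group 2 = the hex code
    $pattern = '(?i)(?<!\\)((?:\\\\)*)\\u(003c|003e|0027)'

    $evaluator = [System.Text.RegularExpressions.MatchEvaluator]{
        param($m)
        # Convert the hex code to the character it stands for
        $char = [char][Convert]::ToInt32($m.Groups[2].Value, 16)
        # Keep the preceding escaped backslashes, append the decoded character
        $m.Groups[1].Value + $char
    }

    [regex]::Replace($Json, $pattern, $evaluator)
}

# Usage: "b" holds an escaped backslash followed by plain text, so it is left alone.
$sample = '{"a":"\u003cx\u003e","b":"\\u003c","c":"[parameters(\u0027n\u0027)]"}'
Convert-HtmlSafeJsonEscape -Json $sample
# {"a":"<x>","b":"\\u003c","c":"[parameters('n')]"}
```

    The function takes and returns a string, so it can be dropped in between ConvertTo-Json and Out-File in the code from the question.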