powershell batch-file encoding utf-8 byte-order-mark

Remove BOM from UTF 8 using cmd


I need to remove the BOM from MyFile.txt using a cmd file. The file is located here:

| Out-File -encoding utf8 '%CD%\MyFile.txt'

I need it to be removed using only one cmd file, that is, in the next lines. It must be backwards compatible with Windows 7. If I needed it only for myself I would just use -encoding default, but it's not backwards compatible even with Windows 10. It's just one file. There have been many questions about BOMs in different situations; my issue is that I need to use one .cmd file: I already have UTF-8 with BOM and I need it without BOM. Please help me.

I was trying to use PowerShell, but the issue is that PowerShell syntax is not really compatible with cmd? It just reports unrecognised syntax every time I try anything from very popular threads like this one: Using PowerShell to write a file in UTF-8 without the BOM.


Solution

  • PowerShell is indeed your best bet. While you cannot use PowerShell commands directly in cmd.exe or a batch file, you can pass them to powershell.exe, the Windows PowerShell CLI, which is also available on the no longer supported Windows 7, as requested [1].

    Preface:

    Here's a sample batch file that demonstrates a solution, i.e. it converts a UTF-8 file with BOM to a BOM-less one in-place (best to make a backup copy first):

    @echo off & setlocal
    
    :: Specify the input file, assumed to be a UTF-8 file *with-BOM*.
    set "targetFile=%CD%\MyFile.txt"
    
    :: Call Windows PowerShell in order to 
    :: convert the file to a *BOM-less* UTF-8 file.
    powershell -noprofile -c $null = New-Item $env:targetFile -Force -Value (Get-Content -Raw -LiteralPath $env:targetFile)
    

    Note: The -c (-Command) and -noprofile CLI parameters aren't strictly necessary, but are included for conceptual clarity and to avoid unnecessary processing of profile files.

    Alternatively, using the [IO.File]::WriteAllText() .NET API:

    :: ...
    
    powershell -noprofile -c [IO.File]::WriteAllText((Convert-Path -LiteralPath $env:targetFile), (Get-Content -Raw -LiteralPath $env:targetFile))
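
    If only a stock Windows PowerShell v2 is available, a variant that avoids cmdlet parameters introduced in later versions may help. The following is a minimal sketch, not part of the original answer, that assumes the same targetFile variable as above. It relies on [IO.File]::ReadAllText() detecting and dropping the BOM when reading, and on [IO.File]::WriteAllText() writing UTF-8 without a BOM when no encoding is passed:

    ```batch
    @echo off & setlocal

    :: Same assumption as above: a UTF-8 file *with* BOM in the current directory.
    set "targetFile=%CD%\MyFile.txt"

    :: Resolve the full path first, because .NET methods do not necessarily
    :: share PowerShell's current directory. Then read the text (the BOM is
    :: consumed during decoding) and write it back as BOM-less UTF-8.
    powershell -noprofile -c "$p = Convert-Path -LiteralPath $env:targetFile; [IO.File]::WriteAllText($p, [IO.File]::ReadAllText($p))"
    ```

    Enclosing the whole command in "..." keeps cmd.exe from interpreting characters such as | or > itself. Like the commands above, this reads the entire file into memory and rewrites it in place, so a backup copy is still advisable.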
    

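    Note: Both commands above read the entire file into memory and overwrite it in place, so large files are costly to process and an error partway through can leave the file damaged.


    The following solution addresses these problems by streaming the file line by line into a temporary file, which then replaces the original: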

    @echo off & setlocal
    
    set "targetFile=%CD%\MyFile.txt"
    
    powershell -noprofile -c $ErrorActionPreference='Stop'; $tempFile=New-TemporaryFile; $inFile=Convert-Path -LiteralPath $env:targetFile; [IO.File]::WriteAllLines($tempFile, [IO.File]::ReadLines($inFile)); $tempFile ^| Move-Item -Force -Destination $inFile
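
    To confirm that a conversion had the intended effect, the first three bytes of the file can be compared against the UTF-8 BOM (0xEF 0xBB 0xBF). This is a hedged sketch rather than part of the original answer; the output strings are arbitrary, and targetFile is assumed to be set as in the batch files above:

    ```batch
    @echo off & setlocal

    set "targetFile=%CD%\MyFile.txt"

    :: Read the file's bytes and test whether it starts with EF BB BF.
    :: ReadAllBytes loads the whole file, which is fine for small files.
    powershell -noprofile -c "$b = [IO.File]::ReadAllBytes((Convert-Path -LiteralPath $env:targetFile)); if ($b.Length -ge 3 -and $b[0] -eq 0xEF -and $b[1] -eq 0xBB -and $b[2] -eq 0xBF) { 'BOM present' } else { 'No BOM' }"
    ```

    Running this before and after one of the conversion commands shows whether the BOM was actually removed.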
    

    Note:


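    [1] Windows 7 originally shipped with v2 of Windows PowerShell, though upgrades to later versions were possible. Note that Get-Content -Raw requires v3 or later and New-TemporaryFile requires v5 or later, so the commands as written assume an upgraded installation.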
    Calling Windows PowerShell from a batch file (via powershell.exe) is the only way to solve your problem with built-in features; cmd.exe's built-in features are far less powerful and offer no solution, and Windows doesn't ship with a file-transcoding utility (whereas Unix-like platforms come with the iconv utility).

    [2] Note that in PowerShell (Core) 7 you wouldn't need to resort to .NET APIs, because the -Encoding parameter there accepts any [Text.Encoding] instance, either directly, by name (e.g., -Encoding Windows-1251), or by code-page number (e.g., -Encoding 1251). BOM-less UTF-8 is available as -Encoding utf8NoBOM, which is also the default there; applied to your scenario:
    pwsh -noprofile -c Set-Content -Encoding utf8NoBOM -NoNewLine -LiteralPath $env:targetFile -Value (Get-Content -Raw -LiteralPath $env:targetFile)