I have a problem with MySQL and/or DBforge.
I would like to import a schema using an .sql file that DBforge generated for me by comparing 2 databases.
The file DBforge created for me contains accented characters (éèêàâ...); nothing abnormal so far: I'm French, and the stored procedures I need to import are written in French.
However, when I execute this .sql file against my database in PowerShell:
Get-Content $SQLfile -Encoding UTF8 |
mysql --password=$($target.password) -h $($target.host) -P $($target.port) -u $($target.user) --default-character-set=utf8mb4 -f -D $db 2> $tempErrorFile
all special characters are replaced by ?, even though my .sql file is UTF-8 encoded and contains accents.
Here's my call to dbforge to generate the .sql file for me (although I doubt that's where the problem is):
&$($DBFORGE_PATH + 'dbforgemysql.com') /schemacompare /source connection:$SOURCE_LINE /target connection:$TARGET_LINE /ExcludeComments:Yes /ExcludeDependencies:Yes /IncludeUseDatabase:Yes /filter:$selectedFilter /sync:$SQL_DBFORGE /log:$LOG_DBFORGE_FILE
Here is my call to mysql with the file generated by dbforge:
&cmd /c "mysql --password=$($target.password) -h $($target.host) -P $($target.port) -u $($target.user) --default-character-set=utf8mb4 -f -D $db < $SQLfile 2> $tempErrorFile"
I tried another way:
Get-Content $SQLfile -Encoding UTF8 |
mysql --password=$($target.password) -h $($target.host) -P $($target.port) -u $($target.user) --default-character-set=utf8mb4 -f -D $db 2> $tempErrorFile
tl;dr
Execute $OutputEncoding = [System.Text.UTF8Encoding]::new() before your command to ensure that Windows PowerShell sends UTF-8 to mysql:
$OutputEncoding = [System.Text.UTF8Encoding]::new()
Get-Content $SQLfile -Encoding UTF8 |
mysql --password=$($target.password) -h $target.host -P $target.port -u $target.user --default-character-set=utf8mb4 -f -D $db 2> $tempErrorFile
The reason that your cmd solution worked is that its < and > redirection operators function as raw byte conduits. Therefore, your input file's content was (correctly) passed as-is to mysql, whereas in PowerShell Get-Content first decoded the file content into .NET strings and, on sending the result to mysql via the pipeline, PowerShell re-encoded it based on $OutputEncoding (incorrectly, in Windows PowerShell).
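In other words, the corruption happens at the re-encoding step, not when reading the file. A quick check (a sketch, assuming the file really is UTF-8 and contains é) makes this visible:
# The strings Get-Content produces still contain the accents; they are only
# mangled when re-encoded for the external program via $OutputEncoding.
(Get-Content $SQLfile -Encoding UTF8) -match 'é' | Select-Object -First 1   # shows an intact line
$OutputEncoding    # Windows PowerShell default: System.Text.ASCIIEncoding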
A similar approach lets you save an external program's output to a file via >, and you can even send raw bytes from PowerShell to external programs by piping [byte[]] data - see this answer for details; it also explains PowerShell's decoding and re-encoding behavior in more depth.
Your problem isn't specific to the CLI you're using (mysql); it is rooted in a fundamental PowerShell behavior:
When PowerShell sends data to an external (native) program via the pipeline, it uses the character encoding stored in the $OutputEncoding
preference variable.
Unfortunately, Windows PowerShell uses ASCII(!) encoding by default, meaning that non-ASCII characters (characters outside the 7-bit ASCII range, such as é) aren't supported and are replaced with literal ? characters, which is what you saw.
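A minimal way to see this default in action (a sketch; findstr here merely echoes the lines it receives on stdin):
# Windows PowerShell: $OutputEncoding defaults to ASCII, so the 'é' is already
# replaced with '?' before it reaches the external program.
$OutputEncoding.EncodingName   # -> US-ASCII in Windows PowerShell
'café' | findstr "."           # -> caf?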
Therefore, you must (temporarily) assign a (BOM-less) UTF-8 encoding to $OutputEncoding
before calling your command:
# Instruct PowerShell to send UTF-8 to external programs via the pipeline.
$OutputEncoding = [System.Text.UTF8Encoding]::new()
# ... now make your Get-Content ... | mysql call.
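If you want to keep the change local to the import, a minimal sketch (reusing the variables from your question) could look like this:
# Temporarily switch $OutputEncoding for the import, then restore it.
$prevEncoding = $OutputEncoding
try {
    $OutputEncoding = [System.Text.UTF8Encoding]::new()
    Get-Content $SQLfile -Encoding UTF8 |
        mysql --password=$($target.password) -h $($target.host) -P $($target.port) -u $($target.user) --default-character-set=utf8mb4 -f -D $db 2> $tempErrorFile
}
finally {
    $OutputEncoding = $prevEncoding
}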
Note that PowerShell (Core) 7+ already defaults $OutputEncoding to (BOM-less) UTF-8, so this step isn't necessary there.
When PowerShell receives data from external programs (via their stdout and stderr streams), PowerShell uses the encoding stored in [Console]::OutputEncoding
to decode the data (into .NET strings).
Unfortunately, in both PowerShell editions [Console]::OutputEncoding defaults to the legacy OEM code page determined by the active legacy system locale (aka the language for non-Unicode programs), e.g. 437 on US-English systems.
While many CLIs honor this code page when producing their output, modern CLIs increasingly do not, given that such (fixed, single-byte) code pages limit you to 256 characters. node, the Node.js CLI, is a prominent example: it uses UTF-8 unconditionally. Others, such as mysql, allow you to request UTF-8 on demand.
Thus, extra effort is needed[1] in order to both send UTF-8-encoded text to external programs and receive it from them:
# Instruct PowerShell to send UTF-8 to external programs via the pipeline
# and to interpret their output as UTF-8.
$OutputEncoding = [Console]::OutputEncoding = [System.Text.UTF8Encoding]::new()
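For example, with both variables set, accented text survives a full round trip through mysql (a sketch; assumes the same connection variables as in the question):
# Send a UTF-8-encoded query to mysql and decode its UTF-8 output correctly.
$OutputEncoding = [Console]::OutputEncoding = [System.Text.UTF8Encoding]::new()
"SELECT 'éèêàâ' AS accents;" |
    mysql --password=$($target.password) -h $($target.host) -P $($target.port) -u $($target.user) --default-character-set=utf8mb4 -D $db
# The output now shows the accented characters instead of '?' or mojibake.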
[1] Except if you've configured your system to use UTF-8 system-wide, but note that doing so has far-reaching consequences - see this answer.