I am trying to convert a Java program that I wrote from running on the "normal" JDK (Java v24) to natively compiled code using GraalVM (v25). The "classic" version runs fine. The native version starts up OK, but then it has an issue with program arguments read from the command line, which are apparently in a different character set than the one used by the original Java application (or the filenames are being dealt with differently).
The issue is triggered by passing a filename argument to the application (via drag-and-drop) that contains umlauts. Both program variants echo the passed-in filename (actually the full file path) as:
U:\Documents\Kontoausz�ge\...\Umsatz�bersicht_20250401-20250831.csv
i.e. in the classic version, too, the filename echoed on the console seems "garbled" (the � is actually an ü in the file system), but the classic Java version finds, opens and processes that file without any issue. The GraalVM-compiled native executable echoes the very same "garbled" filename but then throws a FileNotFoundException:
java.io.FileNotFoundException: U:\Documents\Kontoausz�ge\...\Umsatz�bersicht_20250401-20250831_AT221600000100586916.csv (The system cannot find the path specified)
(Note: I left out irrelevant parts of the path here for privacy reasons and replaced them with ...).
Why is this name/path handling behavior different? Is there an option to align the command-line character set used by the GraalVM-generated native code with the character set used by the JDK-based "vanilla" Java code?
Later addition - answering misc. questions:
This is on Windows 10 (all current fixes/patches/updates installed).
No "VM" other than the Java VM is involved here. The actual filename/path is U:\Documents\Kontoauszüge\...\Umsatzübersicht_20250401-20250831.csv
(the ...-part of the path does not contain any Umlauts or other special characters).
C:\>chcp
Active code page: 437
My Windows system region settings are "English (Switzerland)" (yes - that exists). The system's display language is "English (US)".
The drag-and-drop I mentioned is via Windows Explorer; the "GUI" that some respondents wanted to know about is the Windows desktop. I drag the file that I want to be processed onto a command file, which then calls the Java application, passing the argument via %1 on the command line that calls the Java application (or the .exe in the GraalVM case).
This is a known issue on Windows: [GR-52826] Non-ASCII characters in command line arguments are replaced by U+FFFD in Windows (native-image) #8593. In short, the problem is that the command-line arguments are not being decoded correctly, which means your program is not receiving the correct file path. Hence a file-not-found exception. Unfortunately, this issue doesn't seem to be a priority. See this comment (March 2024):
Ok, so I have discussed this internally: The JDK seems to convert arguments in their app launchers: https://github.com/openjdk/jdk/blob/700d2b91defd421a2818f53830c24f70d11ba4f6/src/jdk.jpackage/windows/native/common/WinSysInfo.cpp#L137
Instead of doing this, we can avoid the additional overhead (and potential for errors) by switching to wmain on Windows. This will also allow us to provide other features on Windows such as a javaw.exe like entry point that allows running an app without a command prompt.
We currently have no ETA for this but we will update this ticket when we do [emphasis added].
And this comment (April 2024):
I would be surprised if this bug gets a fix this year. It just can't be a priority: it's only in Windows while the majority of Java apps run on Linux + it's in command line parsing and the majority of Java apps don't do much of it.
The last comment (June 2024) on the issue, as of this answer, says:
Apparently Microsoft does a U-turn with their encodings zoo and now promotes using UTF-8 for new applications using a dedicated manifest property. The introduced activeCodePage manifest property was introduced in Windows 10 Version 1903.
Until recently, Windows has emphasized "Unicode" -W variants over -A APIs. However, recent releases have used the ANSI code page and -A APIs as a means to introduce UTF-8 support to apps. If the ANSI code page is configured for UTF-8, then -A APIs typically operate in UTF-8. This model has the benefit of supporting existing code built with -A APIs without any code changes.
The articles goes so far as to call "Win32 API [that] might only understand WCHAR" legacy.
It's also possible to slap this manifest onto an existing exe. I'll try it in the meantime.
See also a blog post from Raymond Chen about this feature.
The end of the comment mentions manifests and suggests you can modify an existing executable's manifest. This would allow you to set the activeCodePage to UTF-8. Modifying manifests can be done with the mt.exe tool, which is obtained by installing the Windows SDK.
From some testing, GraalVM's native-image does not add a manifest to the executable, at least for simple applications. So you should be able to simply insert your own without worrying about overriding anything. As I understand it, the mt.exe tool also provides a way to merge multiple manifests together, but I don't know how well that works as I've never tried it. Regardless, once you have your manifest you can insert it with the following command:
mt -manifest <manifest-file> -outputresource:<exe-file>;#1
If an executable already has a manifest then you can extract it with:
mt -inputresource:<exe-file>#1 -out:original.manifest
From some trial and error, the following manifest seems sufficient to set the active code page of the executable to UTF-8:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<assembly xmlns="urn:schemas-microsoft-com:asm.v1" manifestVersion="1.0" xmlns:asmv3="urn:schemas-microsoft-com:asm.v3">
  <assemblyIdentity type="win32" name="Organization.Division.Name" version="1.0.0.0"/>
  <asmv3:application>
    <asmv3:windowsSettings xmlns="http://schemas.microsoft.com/SMI/2019/WindowsSettings">
      <activeCodePage>UTF-8</activeCodePage>
    </asmv3:windowsSettings>
  </asmv3:application>
</assembly>
There may be a more appropriate manifest for this, such as one that includes other elements like compatibility and security information. But I have very little experience with these manifests, so I don't know what the best practices are. Note you will probably want to put your own values for the name and version attributes. For example, the jpackage tool sets the name attribute to jpackageapplauncher.exe and the version attribute to the version of the JDK (e.g., 25.0.0.0 for JDK 25).
The mt.exe tool also provides a way to validate manifests:
mt -manifest <manifest-file> -validate_manifest
That is likely easier than inserting the manifest, getting a so-called "side-by-side configuration" error, and then using sxstrace to debug the problem; I didn't notice the -validate_manifest option until after getting it to work 🤦.
Here's an example using the manifest solution. Note you'll need mt.exe on the path for the run-tests.ps1 PowerShell script to work.
The example includes a test of a jpackage application image to show that it works without needing the patch (as suggested by the first comment I quoted from the issue linked earlier).
D:\GRAALVM-ENCODING-TESTS
|   run-tests.ps1
|   test.manifest
|   Umsatzübersicht.txt
|
\---src
    |   module-info.java
    |
    \---com
        \---example
                Main.java
module-info.java:
module test {}
Main.java:
package com.example;

import java.nio.file.Files;
import java.nio.file.Path;

public class Main {

  public static void main(String[] args) {
    if (args.length != 1) {
      System.err.println("Expected a single file argument.");
      System.err.flush();
      System.exit(1);
    }

    System.out.println();
    printProperty("stdout.encoding");
    printProperty("stderr.encoding");
    printProperty("file.encoding");
    System.out.println();

    boolean success;
    try {
      var path = Path.of(args[0]).toAbsolutePath().normalize();
      System.out.printf("Path: %s%n%n", path);

      var contents = Files.readString(path);
      System.out.printf("Contents:%n%s%n", contents.indent(2));
      success = true;
    } catch (Exception ex) {
      ex.printStackTrace();
      success = false;
    }

    System.out.flush();
    System.err.flush();
    System.exit(success ? 0 : 2);
  }

  static void printProperty(String key) {
    System.out.printf("%-15s = %s%n", key, System.getProperty(key));
  }
}
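If it is unclear whether only the console rendering is garbled or whether the argument itself arrived damaged, dumping the code points of each argument can help tell the two apart. The following is only a sketch of that idea: the class name ArgInspector and its two helper methods are hypothetical and are not part of the test project above.

```java
import java.util.stream.Collectors;

// Hypothetical diagnostic helper, not part of the test project above.
public class ArgInspector {

  /** Renders each code point as U+XXXX so console rendering cannot hide what was received. */
  static String toCodePoints(String s) {
    return s.codePoints()
        .mapToObj(cp -> String.format("U+%04X", cp))
        .collect(Collectors.joining(" "));
  }

  /** Returns true if the string contains U+FFFD, the Unicode replacement character. */
  static boolean containsReplacementChar(String s) {
    return s.indexOf('\uFFFD') >= 0;
  }

  public static void main(String[] args) {
    for (String arg : args) {
      System.out.println(toCodePoints(arg));
      if (containsReplacementChar(arg)) {
        // The damage happened before main() was called; no later re-encoding can undo it.
        System.err.println("Argument already contains U+FFFD.");
      }
    }
  }
}
```

If an argument shows U+00FC where the ü should be, the argument is intact and any garbling is limited to the console. If it shows U+FFFD, the original character was lost before the program received it.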
Umsatzübersicht.txt
This is a test file whose name contains Unicode characters. If you're seeing
this text, then that means the program successfully found and read this file.
Hurray!
test.manifest
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<assembly xmlns="urn:schemas-microsoft-com:asm.v1" manifestVersion="1.0" xmlns:asmv3="urn:schemas-microsoft-com:asm.v3">
  <assemblyIdentity type="win32" name="Organization.Division.Name" version="1.0.0.0"/>
  <asmv3:application>
    <asmv3:windowsSettings xmlns="http://schemas.microsoft.com/SMI/2019/WindowsSettings">
      <activeCodePage>UTF-8</activeCodePage>
    </asmv3:windowsSettings>
  </asmv3:application>
</assembly>
run-tests.ps1
<#
.SYNOPSIS
Builds the project, runs tests on each executable, then reports the results.
.DESCRIPTION
Runs tests of various Java-based executables to see if they correctly interpret command-line
arguments that contain Unicode characters. This script is designed to only work on Windows.
This script requires a GraalVM JDK to work. Which GraalVM JDK to use can be specified via the
-GraalVMHome parameter or the 'GRAALVM_HOME' environment variable. The parameter takes
precedence. If neither the parameter nor the environment variable is set to a valid GraalVM JDK
installation then this script will fail.
The Windows manifest tool ('mt.exe') must be on the path for this script to work. That tool can
be obtained by installing the Windows SDK.
When building the project, five things happen in the following order:
1) The Java source files under 'src\' are compiled and output to 'out\modules\test'.
2) The Java class files are packaged into a JAR file at 'out\test-1.0.jar'.
3) An application image is created via 'jpackage' and output to 'out\test'.
4) A native image is created and output to 'out\test-1.0.exe'
5) A copy of the native image is created and output to 'out\test-patched-1.0.exe'. The 'mt.exe'
tool is then used to insert 'test.manifest' into the copy.
The build step will be skipped if the 'out' directory already exists. To force a build, either
delete the 'out' directory or pass -Rebuild.
If the build succeeds, or the project was already built, then four tests are executed:
1) JAR file:
java -p out -m test .\Umsatzübersicht.txt
2) Application image:
.\out\test\test.exe .\Umsatzübersicht.txt
3) Native image:
.\out\test-1.0.exe .\Umsatzübersicht.txt
4) Patched native image:
.\out\test-patched-1.0.exe .\Umsatzübersicht.txt
The result of each test will be reported at the end.
.PARAMETER GraalVMHome
Path to the GraalVM JDK to use. If this parameter is not set then the 'GRAALVM_HOME' environment
variable is used. This parameter and/or the aforementioned environment variable must be set to a
valid GraalVM JDK installation.
.PARAMETER Rebuild
Forces a build of the project even if previous build output still exists.
.INPUTS
None
.OUTPUTS
None
#>
param(
    [Parameter(Position=0)]
    [string]$GraalVMHome,

    [switch]$Rebuild
)
###############################################################################
# #
# Setup #
# #
###############################################################################
$ErrorActionPreference = 'Stop'
$ProgressPreference = 'Ignore'
# Ensure operating system is Windows.
if (!$IsWindows) {
    Write-Host 'Operating system is not Windows.' -ForegroundColor Red
    Exit
}

# Ensure GraalVM home has been configured, either via parameter or environment variable.
if (!$GraalVMHome) {
    $GraalVMHome = $env:GRAALVM_HOME
    if (!$GraalVMHome) {
        Write-Host 'GraalVM location not specified.' -ForegroundColor Red
        Exit
    }
}
$GraalVMHome = Resolve-Path $GraalVMHome
Write-Host "GraalVM home: " -ForegroundColor Yellow -NoNewline
Write-Host $GraalVMHome -ForegroundColor Cyan

# Resolve tools. Will fail script if they can't be resolved.
$java = Resolve-Path (Join-Path $GraalVMHome 'bin\java.exe')
$javac = Resolve-Path (Join-Path $GraalVMHome 'bin\javac.exe')
$jar = Resolve-Path (Join-Path $GraalVMHome 'bin\jar.exe')
$jpackage = Resolve-Path (Join-Path $GraalVMHome 'bin\jpackage.exe')
$nativeImage = Resolve-Path (Join-Path $GraalVMHome 'bin\native-image.cmd')
<#
Ensure console output is UTF-8.
WARNING: This setting will be permanently changed for the current session. If needed, the session
will need to be restarted or the encoding will need to be manually set to the original
encoding to undo the change.
#>
Write-Host
Write-Host "Setting console output to UTF-8. This will not be undone!" -ForegroundColor Yellow -BackgroundColor DarkRed -NoNewline
Write-Host
& chcp 65001 > $null
[System.Console]::OutputEncoding = [System.Text.Encoding]::UTF8
###############################################################################
# #
# Helper Functions #
# #
###############################################################################
$HEADER_FOOTER_WIDTH = 80 # Assumed to be even
$HEADER_FOOTER_COLOR = 'DarkYellow'
$RESULT_TEST_NAME_WIDTH = 50
function _PrintHeader {
    param([string]$Label)

    $pad = ($HEADER_FOOTER_WIDTH / 2) - ($Label.Length / 2) - 1

    Write-Host
    Write-Host ('#' * $HEADER_FOOTER_WIDTH) -ForegroundColor $HEADER_FOOTER_COLOR
    Write-Host ('#' + (' ' * ($HEADER_FOOTER_WIDTH - 2)) + '#') -ForegroundColor $HEADER_FOOTER_COLOR
    Write-Host ('#' + (' ' * $pad) + $Label + (' ' * $pad) + '#') -ForegroundColor $HEADER_FOOTER_COLOR
    Write-Host ('#' + (' ' * ($HEADER_FOOTER_WIDTH - 2)) + '#') -ForegroundColor $HEADER_FOOTER_COLOR
    Write-Host ('#' * $HEADER_FOOTER_WIDTH) -ForegroundColor $HEADER_FOOTER_COLOR
    Write-Host
}

function _PrintFooter {
    Write-Host
    Write-Host ('*' * $HEADER_FOOTER_WIDTH) -ForegroundColor $HEADER_FOOTER_COLOR
    Write-Host
}

function _PrintResult {
    param([string]$TestName, [bool]$Successful)

    $width = $RESULT_TEST_NAME_WIDTH
    Write-Host $TestName -ForegroundColor Cyan -NoNewline
    Write-Host ('.' * ($width - $TestName.Length)) -NoNewline
    if ($Successful) {
        Write-Host "SUCCESS" -ForegroundColor Green
    } else {
        Write-Host "FAILURE" -ForegroundColor Red
    }
}

function _RunCommand {
    param([string]$Executable, [string[]]$Arguments, [switch]$NoExitOnFailure, [switch]$DiscardOutput)

    # log command and arguments
    Write-Host "Running command: " -ForegroundColor Yellow -NoNewline
    if ($Executable.StartsWith($GraalVMHome)) {
        Write-Host "$(Split-Path $Executable -Leaf) " -ForegroundColor Blue -NoNewline
    } else {
        Write-Host "$Executable " -ForegroundColor Blue -NoNewline
    }
    foreach ($Argument in $Arguments) {
        $FgColor = $Argument.StartsWith('-') ? 'Magenta' : 'Cyan'
        Write-host "$Argument " -ForegroundColor $FgColor -NoNewline
    }
    Write-Host

    # execute command
    if ($DiscardOutput) {
        $proc = Start-Process $Executable $Arguments -NoNewWindow -Wait -PassThru -RedirectStandardOutput 'NUL'
    } else {
        $proc = Start-Process $Executable $Arguments -NoNewWindow -Wait -PassThru
    }

    # handle command result
    if ($proc.ExitCode -ne 0) {
        if ($NoExitOnFailure) {
            return $false
        } else {
            Write-Host "Command failed!" -ForegroundColor Red
            Exit
        }
    } elseif ($NoExitOnFailure) {
        return $true
    }
}
###############################################################################
# #
# Build & Test Implementation #
# #
###############################################################################
# Build project if requested or previous output does not exist.
$outputExists = Test-Path 'out' -PathType Container
if ($Rebuild -or !$outputExists) {
    _PrintHeader 'Building Project'

    # Delete previous output if it exists.
    if ($outputExists) {
        Write-Host 'Deleting previous output.' -ForegroundColor Yellow
        Remove-Item 'out' -Recurse
    } else {
        Write-Host "No previous output." -ForegroundColor Yellow
    }

    # Compile Java source files.
    _RunCommand $javac @('--module-source-path', 'test=src', '-m', 'test', '--module-version', '1.0', '-d', 'out\modules')

    # Package Java class files in JAR file.
    _RunCommand $jar @('-c', '-f', 'out\test-1.0.jar', '-e', 'com.example.Main', '-C', 'out\modules\test', '.')

    # Create application image.
    _RunCommand $jpackage @(
        '-t', 'app-image',
        '-n', 'test',
        '--app-version', '1.0',
        '-p', 'out',
        '-m', 'test',
        '-d', 'out',
        '--jlink-options', '"--no-header-files --no-man-pages --strip-native-commands --compress zip-0"',
        '--win-console')
    Set-ItemProperty -Path 'out\test\test.exe' -Name 'IsReadOnly' -Value $false

    # Create GraalVM native image.
    _RunCommand $nativeImage @('-p', 'out', '-m', 'test', '-Ob', '-o', 'out\test-1.0', '--silent')

    # Copy GraalVM native image then patch the copy with the new manifest file.
    Copy-Item 'out\test-1.0.exe' 'out\test-patched-1.0.exe'
    _RunCommand 'mt.exe' @('-manifest', 'test.manifest', '-outputresource:out\test-patched-1.0.exe;#1') -DiscardOutput

    Write-Host "Build successful!" -ForegroundColor Green
    _PrintFooter
}
# Build done. Now run tests.
_PrintHeader 'Testing JAR File'
$jarTest = _RunCommand $java @('-p', 'out', '-m', 'test', '.\Umsatzübersicht.txt') -NoExitOnFailure
_PrintFooter
_PrintHeader 'Testing Application Image (jpackage)'
$appTest = _RunCommand '.\out\test\test.exe' @('.\Umsatzübersicht.txt') -NoExitOnFailure
_PrintFooter
_PrintHeader 'Testing Native Image'
$nativeTest = _RunCommand '.\out\test-1.0.exe' @('.\Umsatzübersicht.txt') -NoExitOnFailure
_PrintFooter
_PrintHeader 'Testing Native Image (Patched)'
$patchedNativeTest = _RunCommand '.\out\test-patched-1.0.exe' @('.\Umsatzübersicht.txt') -NoExitOnFailure
_PrintFooter
Write-Host "Results:" -ForegroundColor Yellow
_PrintResult 'JAR File' $jarTest
_PrintResult 'Application Image (jpackage)' $appTest
_PrintResult 'Native Image' $nativeTest
_PrintResult 'Patched Native Image' $patchedNativeTest
Write-Host
Write-Host "TESTING COMPLETE" -ForegroundColor Green
PS D:\graalvm-encoding-tests> ./run-tests
GraalVM home: D:\Program Files\GraalVM\graalvm-jdk-25+37.1
Setting console output to UTF-8. This will not be undone!
################################################################################
# #
# Building Project #
# #
################################################################################
No previous output.
Running command: javac.exe --module-source-path test=src -m test --module-version 1.0 -d out\modules
Running command: jar.exe -c -f out\test-1.0.jar -e com.example.Main -C out\modules\test .
Running command: jpackage.exe -t app-image -n test --app-version 1.0 -p out -m test -d out --jlink-options "--no-header-files --no-man-pages --strip-native-commands --compress zip-0" --win-console
Running command: native-image.cmd -p out -m test -Ob -o out\test-1.0 --silent
Running command: mt.exe -manifest test.manifest -outputresource:out\test-patched-1.0.exe;#1
Build successful!
********************************************************************************
################################################################################
# #
# Testing JAR File #
# #
################################################################################
Running command: java.exe -p out -m test .\Umsatzübersicht.txt
stdout.encoding = UTF-8
stderr.encoding = UTF-8
file.encoding = UTF-8
Path: D:\graalvm-encoding-tests\Umsatzübersicht.txt
Contents:
This is a test file whose name contains Unicode characters. If you're seeing
this text, then that means the program successfully found and read this file.
Hurray!
********************************************************************************
################################################################################
# #
# Testing Application Image (jpackage) #
# #
################################################################################
Running command: .\out\test\test.exe .\Umsatzübersicht.txt
stdout.encoding = UTF-8
stderr.encoding = UTF-8
file.encoding = UTF-8
Path: D:\graalvm-encoding-tests\Umsatzübersicht.txt
Contents:
This is a test file whose name contains Unicode characters. If you're seeing
this text, then that means the program successfully found and read this file.
Hurray!
********************************************************************************
################################################################################
# #
# Testing Native Image #
# #
################################################################################
Running command: .\out\test-1.0.exe .\Umsatzübersicht.txt
stdout.encoding = UTF-8
stderr.encoding = UTF-8
file.encoding = UTF-8
Path: D:\graalvm-encoding-tests\Umsatz�bersicht.txt
java.nio.file.NoSuchFileException: D:\graalvm-encoding-tests\Umsatz�bersicht.txt
at java.base@25/sun.nio.fs.WindowsFileSystemProvider.newByteChannel(WindowsFileSystemProvider.java:231)
at java.base@25/java.nio.file.Files.newByteChannel(Files.java:357)
at java.base@25/java.nio.file.Files.newByteChannel(Files.java:399)
at java.base@25/java.nio.file.Files.readAllBytes(Files.java:2973)
at java.base@25/java.nio.file.Files.readString(Files.java:3043)
at java.base@25/java.nio.file.Files.readString(Files.java:3006)
at test@1.0/com.example.Main.main(Main.java:26)
at java.base@25/java.lang.invoke.LambdaForm$DMH/sa346b79c.invokeStaticInit(LambdaForm$DMH)
********************************************************************************
################################################################################
# #
# Testing Native Image (Patched) #
# #
################################################################################
Running command: .\out\test-patched-1.0.exe .\Umsatzübersicht.txt
stdout.encoding = UTF-8
stderr.encoding = UTF-8
file.encoding = UTF-8
Path: D:\graalvm-encoding-tests\Umsatzübersicht.txt
Contents:
This is a test file whose name contains Unicode characters. If you're seeing
this text, then that means the program successfully found and read this file.
Hurray!
********************************************************************************
Results:
JAR File..........................................SUCCESS
Application Image (jpackage)......................SUCCESS
Native Image......................................FAILURE
Patched Native Image..............................SUCCESS
TESTING COMPLETE
| Test | Executable | Successful |
|---|---|---|
| JAR File | out/test-1.0.jar | ✔️ |
| App Image (jpackage) | out/test/test.exe | ✔️ |
| Native Image | out/test-1.0.exe | ❌ |
| Patched Native Image | out/test-patched-1.0.exe | ✔️ |
Note stdout.encoding, stderr.encoding, and file.encoding are UTF-8 in all cases.
Also note out/test-1.0.exe (the unpatched native image) prints ...\Umsatz�bersicht.txt. That contains a replacement character even though the output is UTF-8. This, along with the "patched" executable working, strongly indicates argument decoding was the primary issue.
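To illustrate how a decoding mismatch can produce exactly this kind of replacement character, here is a minimal Java sketch. It is not a claim about what native-image does internally. It assumes ISO-8859-1 as a stand-in for a legacy single-byte code page, and the class name MisdecodeDemo and its method are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; it does not reproduce native-image's actual startup code.
public class MisdecodeDemo {

  /** Encodes text in a single-byte charset, then decodes those bytes as if they were UTF-8. */
  static String encodeLatin1DecodeUtf8(String text) {
    // In ISO-8859-1 (assumed stand-in for a legacy code page) the character ü is the single byte 0xFC.
    byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
    // 0xFC is not a valid byte in UTF-8, so the decoder substitutes U+FFFD for it.
    return new String(bytes, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    String garbled = encodeLatin1DecodeUtf8("Umsatz\u00FCbersicht.txt");
    // Prints the position of the replacement character, or -1 if there is none.
    System.out.println(garbled.indexOf('\uFFFD'));
  }
}
```

Once the ü has been replaced by U+FFFD, the original byte is gone, so the resulting string can no longer name the file on disk no matter how the output is encoded afterwards.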