We read an important parameter, a path to a file, as a VM argument. Users are now passing paths that contain Korean characters (the folders are named in Korean), and the program has started to break because the Korean characters are read as question marks. The experiment below shows the situation.
I debugged the program in Eclipse and, in "Debug Configurations" on the "Arguments" tab under "VM arguments", entered the following:
-Dfilepath=D:\XXXX\카운터
But when I read it from the program like this
String filepath = System.getProperty("filepath");
I get the output with question marks like below.
D:\XXXX\???
I understand that the Eclipse debug GUI uses the right encoding (?) to display the characters, but when the value is read in the program a different encoding is used, one that cannot represent the characters properly.
What default encoding does Java use to read the VM arguments supplied to it?
How can I change the encoding in Eclipse so that the program reads the characters properly?
My conclusion is that the conversion depends on the default encoding (the Windows setting "Language for non-Unicode programs"). Here is the program I used for testing:
package test;

import java.io.FileOutputStream;

public class Test {
    public static void main(String[] args) throws Exception {
        StringBuilder sb = new StringBuilder();
        // The literal in brackets is the expected text; the system property follows it.
        sb.append("[카운터] sysprop=[").append(System.getProperty("cenv"));
        if (args.length > 0) {
            sb.append("], cmd args=[").append(args[0]);
        }
        sb.append("], file.encoding=").append(System.getProperty("file.encoding"));
        // Write the result to a UTF-8 file instead of System.out,
        // so the console encoding cannot distort what we see.
        FileOutputStream fout = new FileOutputStream("/testout");
        fout.write(sb.toString().getBytes("UTF-8"));
        fout.close();
        // Thread.sleep(10000); // For checking arguments using Process Explorer
    }
}
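To see which encodings the JVM reports while running the tests, a small helper along these lines can be used next to the program above. It is only a sketch: the class name EncodingInfo is made up for illustration, and sun.jnu.encoding is an implementation-specific property of Sun/Oracle/OpenJDK JVMs that may be null elsewhere. That it is the encoding involved in the argument conversion is my assumption, not something the tests here prove.

```java
import java.nio.charset.Charset;

public class EncodingInfo {

    // Builds a one-line summary of the encodings this JVM reports.
    // sun.jnu.encoding is implementation-specific and may be null.
    static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", sun.jnu.encoding=" + System.getProperty("sun.jnu.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        // The values depend on the machine and on any -Dfile.encoding option.
        System.out.println(describe());
    }
}
```

Running it once with and once without -Dfile.encoding=UTF-8 shows which of the reported values that option actually changes on a given machine.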
Execute in command prompt: java -Dcenv=카운터 test.Test 카운터
(Korean chars are correct when I verify the arguments using Process Explorer)
Result:
[카운터] sysprop=[카운터], cmd args=[카운터], file.encoding=MS949
Execute in command prompt (pasted from the clipboard): java -Dcenv=카운터 test.Test 카운터
(I cannot see the Korean chars in the command window. However, the Korean chars are correct when I verify the arguments using Process Explorer)
Result:
[카운터] sysprop=[???], cmd args=[???], file.encoding=MS950
Launch from Eclipse, setting Program arguments and VM arguments. (The command line in Process Explorer is C:\pg\jdk160\bin\javaw.exe -agentlib:jdwp=transport=dt_socket,suspend=y,address=localhost:50672 -Dcenv=카운터 -Dfile.encoding=UTF-8 -classpath S:\ws\wtest\bin test.Test 카운터
This is the same as what the Properties dialog of the Eclipse Debug view shows.)
Result:
[카운터] sysprop=[???], cmd args=[bin], file.encoding=UTF-8
[碁石] sysprop=[碁石], cmd args=[碁石], file.encoding=MS949
[碁石] sysprop=[碁石], cmd args=[碁石], file.encoding=MS950
[碁石] sysprop=[碁石], cmd args=[碁石], file.encoding=UTF-8
[鈥焢] sysprop=[??], cmd args=[??], file.encoding=MS949
[鈥焢] sysprop=[鈥焢], cmd args=[鈥焢], file.encoding=MS950
[鈥焢] sysprop=[鈥焢], cmd args=[鈥焢], file.encoding=UTF-8
[宽广] sysprop=[??], cmd args=[??], file.encoding=MS949
[宽广] sysprop=[??], cmd args=[??], file.encoding=MS950
[宽广] sysprop=[??], cmd args=[??], file.encoding=UTF-8
Execute in command prompt: java -Dcenv=宽广 test.Test 宽广
Result:
[宽广] sysprop=[宽广], cmd args=[宽广], file.encoding=GBK
During testing, I always checked the command line via Process Explorer and made sure all chars were correct.
However, the command-line argument chars are converted using the default encoding before main(String[] args) of the Java class is invoked. If a char does not exist in the charset of the default encoding, the program gets an unexpected argument.
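The effect of such a conversion can be reproduced inside Java without touching the command line at all. The sketch below is not what java.exe does internally (I have not checked that); it only shows what happens to a string when it is pushed through a charset that lacks its characters. The class name LossyRoundTrip is made up, and US-ASCII stands in for "a default encoding without Korean" because every JVM is guaranteed to have it.

```java
import java.nio.charset.Charset;

public class LossyRoundTrip {

    // Encodes text to bytes in the named charset and decodes it back.
    // String.getBytes(Charset) replaces unmappable characters with the
    // charset's replacement bytes, which is '?' for US-ASCII.
    static String roundTrip(String text, String charsetName) {
        Charset cs = Charset.forName(charsetName);
        byte[] bytes = text.getBytes(cs);
        return new String(bytes, cs);
    }

    public static void main(String[] args) {
        // The Korean word used in the tests, written with Unicode escapes
        // so that the encoding of this source file does not matter.
        String korean = "\uCE74\uC6B4\uD130";

        // Charset without Hangul: each of the three chars becomes '?'.
        System.out.println(roundTrip(korean, "US-ASCII")); // prints ???

        // Charset that covers Hangul: the text survives unchanged.
        System.out.println(roundTrip(korean, "UTF-8").equals(korean)); // prints true
    }
}
```

Once the characters have been replaced like this, nothing done later in the program (including setting file.encoding) can bring them back, which matches the ??? results above.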
I'm not sure whether the problem is caused by java.exe/javaw.exe or by Windows, but passing non-ASCII parameters via command-line arguments is not a good idea.
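One possible workaround, if you control how the argument is written, is to keep the command line pure ASCII and decode the value inside the program. The sketch below assumes the path is passed percent-encoded as UTF-8, for example -Dfilepath.encoded=D:\XXXX\%EC%B9%B4%EC%9A%B4%ED%84%B0. The property name filepath.encoded and the class name EncodedProperty are made up for illustration; this is not a convention of Java or Eclipse.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class EncodedProperty {

    // Decodes a percent-encoded (UTF-8) value back into the real string.
    // A literal '+' is protected first, because URLDecoder would otherwise
    // turn it into a space. A literal '%' in the path must be passed as %25.
    static String decode(String raw) {
        if (raw == null) {
            return null;
        }
        try {
            return URLDecoder.decode(raw.replace("+", "%2B"), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // "filepath.encoded" is a hypothetical property name for this sketch.
        String raw = System.getProperty("filepath.encoded",
                "D:\\XXXX\\%EC%B9%B4%EC%9A%B4%ED%84%B0");
        String path = decode(raw);
        // Compare against the expected text instead of printing it, since
        // the console encoding may not be able to display the result.
        System.out.println(path.equals("D:\\XXXX\\" + "\uCE74\uC6B4\uD130"));
    }
}
```

Because only ASCII characters cross the command line, the result no longer depends on the "Language for non-Unicode programs" setting. The cost is that users cannot type the path directly. Reading the path from a UTF-8 encoded file would be another option along the same lines.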
BTW, I also tried executing the command via a .bat file (file encoding is UTF-8), in case anyone is interested.
The command line in Process Explorer is java -Dcenv=移댁슫?? test.Test 移댁슫??
(The Korean chars are garbled)
Result:
[카운터] sysprop=[移댁슫??], cmd args=[移댁슫??], file.encoding=MS949
Add another VM argument. The command line in Process Explorer is java -Dfile.encoding=UTF-8 -Dcenv=移댁슫?? test.Test 移댁슫??
(The Korean chars are garbled)
Result:
[카운터] sysprop=[移댁슫??], cmd args=[移댁슫??], file.encoding=UTF-8
The command line in Process Explorer is java -cp s:\ws\wtest\bin -Dcenv=儦渥?? test.Test 儦渥??
(The Korean chars are garbled)
Result:
[카운터] sysprop=[儦渥??], cmd args=[儦渥??], file.encoding=MS950