java eclipse character-encoding windows-console

How to read accented letters from terminal in Java?


I have the following Java snippet:

System.out.print("What is the first name of the Hungarian poet Petőfi? ");
String correctAnswer = "Sándor";
Scanner sc = new Scanner(System.in);
String answer = sc.next();
sc.close();
if (correctAnswer.equals(answer)) {
    System.out.println("Correct!");
} else {
    System.out.println("The answer (" + answer + ") is incorrect, the correct answer is " + correctAnswer);
}

This works fine in Eclipse but not in the Windows terminal: even though I enter the correct answer, Sándor, the comparison fails. This is what it looks like in Eclipse:

What is the first name of the Hungarian poet Petőfi? Sándor
Correct!

The same from the command line:

What is the first name of the Hungarian poet Petőfi? Sándor
The answer (S?ndor) is incorrect, the correct answer is Sándor

Here is what I have tried so far:

I double-checked: the encoding of the Java source file is UTF-8.

When converting to bytes (Arrays.toString(input.getBytes())) I experience the following:

So to narrow down to the letter á we have the following:
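A minimal sketch of this kind of byte-level check for the letter á. The class name AccentBytes and the choice of charsets (UTF-8, windows-1250, IBM852) are assumptions for illustration, not taken from the output above. Note that getBytes() without an argument uses the JVM default charset, which is UTF-8 by default since Java 18, so passing the charset explicitly makes the comparison clearer.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class AccentBytes {

    // Encodes the text with the given charset and renders the signed byte values.
    static String bytesOf(String text, Charset charset) {
        return Arrays.toString(text.getBytes(charset));
    }

    public static void main(String[] args) {
        String letter = "á";
        System.out.println("UTF-8: " + bytesOf(letter, StandardCharsets.UTF_8));

        // Assumption: these are the code pages of interest on a Hungarian Windows setup.
        for (String name : new String[] {"windows-1250", "IBM852"}) {
            if (Charset.isSupported(name)) {
                System.out.println(name + ": " + bytesOf(letter, Charset.forName(name)));
            } else {
                System.out.println(name + ": not supported by this JVM");
            }
        }
    }
}
```

If the bytes that arrive from System.in match one of the single-byte code pages while the program decodes them as UTF-8, the decoder cannot map them and the letter comes out as a replacement character.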

It works in Git Bash, but the letter ő (and all the other accented characters, not just this one) is incorrectly displayed in that terminal:

What is the first name of the Hungarian poet Pet▒fi? Sándor
Correct!

Strangely, the comparison works and the accented characters look fine as I type them, but printing them back does not work:

What is the first name of the Hungarian poet Pet▒fi? Péter
The answer (P▒ter) is incorrect, the correct answer is S▒ndor

The following helped in the Windows terminal:

Console console = System.console();
String answer = console.readLine();

But this does not work in Eclipse:

What is the first name of the Hungarian poet Petőfi? Sándor
The answer (Sándor) is incorrect, the correct answer is Sándor

UPDATE: it seems to depend on the system settings. I have two laptops, one with Hungarian settings and the other with English settings.

My Java version is 22.0.2, but the problem does not seem version-specific.
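To compare the two machines, it may help to print the encodings the JVM has picked up. The following is only a diagnostic sketch (the class name EncodingReport is made up for illustration); it assumes Java 18 or later, since PrintStream.charset() is not available before that, and the stdout.encoding property may print as null on older releases.

```java
import java.io.Console;
import java.nio.charset.Charset;

final class EncodingReport {

    // Collects the charsets that can influence console input and output.
    static String describe() {
        StringBuilder sb = new StringBuilder();
        sb.append("default charset : ").append(Charset.defaultCharset()).append('\n');
        sb.append("native.encoding : ").append(System.getProperty("native.encoding")).append('\n');
        sb.append("stdout.encoding : ").append(System.getProperty("stdout.encoding")).append('\n');
        sb.append("System.out      : ").append(System.out.charset()).append('\n');

        // System.console() can be null when no interactive console is attached.
        Console console = System.console();
        sb.append("Console         : ")
          .append(console == null ? "no console attached" : console.charset())
          .append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```

Running it once from Eclipse and once from the Windows terminal on each laptop should show where the values differ.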

As a cross-check I tried the same in Python, and it works fine both in the Windows terminal and in the IDE:

answer = input('What is the first name of the Hungarian poet Petőfi? ')
correct_answer = 'Sándor'
if answer == correct_answer:
    print('Correct')
else:
    print('The answer (' + answer + ') is incorrect, the correct answer is ' + correct_answer)

So my question is: how can I make it work? Is there a universal solution that works in both the Windows terminal and Eclipse?


Solution

  • Scanner scanner = new Scanner(System.in, System.out.charset());
    

    This solution requires Java 18 or later, because PrintStream.charset() was added in that release. It works both in Eclipse with default settings and in the Windows command prompt with code page 852. To check the current code page:

    chcp
    

    To change it to 852:

    chcp 852
    

    Thanks to everyone who helped reach the solution!
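Putting the solution together with the snippet from the question, a complete program could look roughly like this. It is a sketch rather than a definitive version: the class name PoetQuiz and the helper readAnswer are made up for illustration, and it assumes the console uses the same charset for input as for output (which is what passing System.out.charset() to the Scanner relies on).

```java
import java.io.InputStream;
import java.nio.charset.Charset;
import java.util.Scanner;

final class PoetQuiz {

    static final String CORRECT_ANSWER = "Sándor";

    // Reads the first whitespace-delimited token from the stream,
    // decoding the incoming bytes with the given charset.
    static String readAnswer(InputStream in, Charset charset) {
        // Deliberately not closed: closing the Scanner would also close System.in.
        Scanner scanner = new Scanner(in, charset);
        return scanner.hasNext() ? scanner.next() : "";
    }

    public static void main(String[] args) {
        System.out.print("What is the first name of the Hungarian poet Petőfi? ");

        // Assumption: stdin and stdout share one console charset (needs Java 18+).
        String answer = readAnswer(System.in, System.out.charset());

        if (CORRECT_ANSWER.equals(answer)) {
            System.out.println("Correct!");
        } else {
            System.out.println("The answer (" + answer
                    + ") is incorrect, the correct answer is " + CORRECT_ANSWER);
        }
    }
}
```

Keeping the decoding in readAnswer makes it possible to feed the method a byte array in a known charset and check the result without a real console.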