java character-encoding windows-1255

Java subtract value of char code in string


I am trying to convert a string to Hebrew encoding (windows-1255), so I need to subtract 1264 from the value of each char and put the result in a new string.

This is the JavaScript code that I am trying to convert:

strText = strText.replace(/[א-ת]/ig, function(a,b,c) {
        return escape(String.fromCharCode(a.charCodeAt(0)-1264));
    });

And this is what I wrote in Java, but I am not getting the expected value:

String test = "שלום";
byte[] testBytes = test.getBytes();
String testResult = "";
for (int i = 0;i < testBytes.length;i++)
     {
        testResult += (char)((int)testBytes[i]-1264);
     }

What am I doing wrong?


Solution

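  • A Java byte is signed, so it can only hold values from -128 to 127. On top of that, getBytes() returns the string encoded in the platform's default charset (in UTF-8, for example, each Hebrew letter becomes two bytes), not its Unicode code points, so subtracting 1264 from those bytes gives meaningless values. What you need is a char array, which holds the UTF-16 code units of the string; every Hebrew letter fits in a single char.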

    So, change this

    byte[] testBytes = test.getBytes();
    

    to this

    char[] testBytes = test.toCharArray();
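
For reference, here is a minimal sketch of the whole loop with that change applied. It assumes, as the JavaScript regex does, that only the letters א to ת (U+05D0 to U+05EA) should be shifted and everything else left alone. The class name HebrewShift and the helper shiftHebrew are made up for this example, and the percent formatting in main is only a rough stand-in for what escape() does in the original snippet.

```java
class HebrewShift {
    // Distance between the Unicode Hebrew letters (U+05D0..U+05EA) and
    // their windows-1255 byte values (0xE0..0xFA): 0x05D0 - 0xE0 = 1264.
    static final int OFFSET = 1264;

    // Hypothetical helper: shifts only the letters alef..tav and copies
    // every other char unchanged, like the [א-ת] range in the JS regex.
    static String shiftHebrew(String text) {
        StringBuilder result = new StringBuilder(text.length());
        for (char c : text.toCharArray()) {
            if (c >= '\u05D0' && c <= '\u05EA') {
                result.append((char) (c - OFFSET));
            } else {
                result.append(c);
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        // "שלום" written with Unicode escapes so the source file encoding does not matter
        String shifted = shiftHebrew("\u05E9\u05DC\u05D5\u05DD");
        // Print each shifted char as %XX, roughly what escape() produces
        for (char c : shifted.toCharArray()) {
            System.out.printf("%%%02X", (int) c);
        }
        System.out.println(); // prints %F9%EC%E5%ED
    }
}
```

If your JDK ships the windows-1255 charset, test.getBytes("windows-1255") should give you the same byte values directly without any manual arithmetic, but I have not relied on that here.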