I am calling the MessageDigest.digest() method to get the hash of a password. If the password contains a Norwegian character, e.g. 'ø', this method returns the same hash for strings that differ only in their last character. "Høstname1" and "Høstname2" have the same hash, but "Hostnøme1" has a different hash because the 'ø' is in a different position. This happens with "utf-8" encoding. With "iso-8859-1" encoding, I do not see the issue. Is this a known problem, or am I missing something here?
This is my code:
import java.security.MessageDigest;
String password = "Høstname1";
String salt = "6";
MessageDigest messageDigest = MessageDigest.getInstance("SHA-256");
byte[] hash = new byte[40];
messageDigest.update(salt.getBytes("utf-8"), 0, salt.length());
messageDigest.update(password.getBytes("utf-8"), 0, password.length());
hash = messageDigest.digest();
You shouldn't pass the length of the string to messageDigest.update:
messageDigest.update(password.getBytes("utf-8"), 0, password.length());
but the length of the byte array, since a UTF-8 encoded string has more bytes than the string has characters whenever it contains non-ASCII characters such as 'ø':
byte[] pwd = password.getBytes("utf-8");
messageDigest.update(pwd, 0, pwd.length);
or, even shorter (thanks @Matt):
messageDigest.update(password.getBytes("utf-8"));
The same applies to salt.
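Therefore your code was only hashing the beginning of the password. 'ø' takes two bytes in UTF-8, so "Høstname1" encodes to 10 bytes while password.length() is 9, and the final byte (the '1' or '2') was never fed to the digest. In ISO-8859-1 every character is a single byte, which is why the problem did not show up with that encoding.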
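To check this, here is a minimal self-contained sketch that compares the character count with the UTF-8 byte count and hashes the full byte arrays. The class name SaltedHashDemo and the helper hash are made up for illustration. It assumes Java 7 or later for StandardCharsets, and it writes 'ø' as the escape \u00F8 so the result does not depend on the source file encoding.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

public class SaltedHashDemo {

    // Hypothetical helper: feeds the complete UTF-8 encoding of the salt
    // and of the password to SHA-256, with no explicit offset or length.
    static byte[] hash(String salt, String password) throws NoSuchAlgorithmException {
        MessageDigest messageDigest = MessageDigest.getInstance("SHA-256");
        messageDigest.update(salt.getBytes(StandardCharsets.UTF_8));
        messageDigest.update(password.getBytes(StandardCharsets.UTF_8));
        return messageDigest.digest();
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        String password = "H\u00F8stname1"; // "Høstname1"

        // Number of chars in the string vs. number of bytes after encoding.
        System.out.println(password.length());                                // 9
        System.out.println(password.getBytes(StandardCharsets.UTF_8).length); // 10

        byte[] hash1 = hash("6", "H\u00F8stname1");
        byte[] hash2 = hash("6", "H\u00F8stname2");

        // With the whole byte array hashed, the two passwords no longer collide.
        System.out.println(Arrays.equals(hash1, hash2)); // false
    }
}
```

Using StandardCharsets.UTF_8 instead of the string "utf-8" also avoids the checked UnsupportedEncodingException that getBytes(String) declares.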