java hash message-digest password-policy

MessageDigest.digest() returns the same hash for different strings containing a Norwegian character


I am calling the MessageDigest.digest() method to get the hash of a password. If the password contains a Norwegian character, e.g. 'ø', the method returns the same hash for strings that differ only in their last character: "Høstname1" and "Høstname2" have the same hash, but "Hostnøme1" has a different hash because the 'ø' is in a different position. This happens with "utf-8" encoding; with "iso-8859-1" encoding I do not see the issue. Is this a known problem, or am I missing something?

This is my code:

    import java.security.MessageDigest;

    String password = "Høstname1";
    String salt = "6";

    MessageDigest messageDigest = MessageDigest.getInstance("SHA-256");
    byte[] hash = new byte[40];
    messageDigest.update(salt.getBytes("utf-8"), 0, salt.length());
    messageDigest.update(password.getBytes("utf-8"), 0, password.length());
    hash = messageDigest.digest();

Solution

  • You shouldn't pass the length of the string to messageDigest.update

    messageDigest.update(password.getBytes("utf-8"), 0, password.length());
    

    but the length of the byte array, because a UTF-8 encoded string has more bytes than characters whenever it contains a non-ASCII character ('ø' takes two bytes):

    byte[] pwd = password.getBytes("utf-8");
    messageDigest.update(pwd, 0, pwd.length);
    

    or, even shorter (thanks @Matt):

    messageDigest.update(password.getBytes("utf-8"));
    

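    The same applies to the salt.

    As a result, your code was only hashing the beginning of the password: "Høstname1" is 9 characters long but 10 bytes in UTF-8, so the final byte (the digit) was never passed to the digest. With ISO-8859-1 every character is exactly one byte, so the two lengths happen to match and the problem does not show up.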
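To see the truncation in action, here is a minimal sketch that hashes the two passwords from the question both ways. The class name `SaltedHashDemo` and the helper names `hash` and `truncatedHash` are made up for illustration; `StandardCharsets.UTF_8` is used instead of the `"utf-8"` string only to avoid the checked `UnsupportedEncodingException`.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

public class SaltedHashDemo {

    // Hypothetical helper: obtains a SHA-256 digest without a checked exception.
    private static MessageDigest sha256() {
        try {
            return MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Fixed version: passes the complete byte arrays to update().
    static byte[] hash(String salt, String password) {
        MessageDigest md = sha256();
        md.update(salt.getBytes(StandardCharsets.UTF_8));
        md.update(password.getBytes(StandardCharsets.UTF_8));
        return md.digest();
    }

    // Reproduces the question's bug: the length argument is a character
    // count, not a byte count, so trailing bytes are silently dropped.
    static byte[] truncatedHash(String salt, String password) {
        MessageDigest md = sha256();
        md.update(salt.getBytes(StandardCharsets.UTF_8), 0, salt.length());
        md.update(password.getBytes(StandardCharsets.UTF_8), 0, password.length());
        return md.digest();
    }

    public static void main(String[] args) {
        // \u00f8 is 'ø'; the escape keeps the source independent of file encoding.
        String a = "H\u00f8stname1";
        String b = "H\u00f8stname2";

        System.out.println(a.length());                                // 9 characters
        System.out.println(a.getBytes(StandardCharsets.UTF_8).length); // 10 bytes

        // true: the differing last byte never reaches the digest
        System.out.println(Arrays.equals(truncatedHash("6", a), truncatedHash("6", b)));
        // false: the full byte arrays differ, so the hashes differ
        System.out.println(Arrays.equals(hash("6", a), hash("6", b)));
    }
}
```

Note that hashes already stored with the truncated variant will not match hashes produced by the fixed one for passwords containing non-ASCII characters, so existing stored values may need to be regenerated.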