I'm writing a Java implementation of Bitcoin's BIP39. So far, my code produces a randomly generated mnemonic phrase correctly. However, when I convert the 12-word mnemonic phrase into a 512-bit seed, the resulting value does not match the result from Ian Coleman's BIP39 Tool.
To start, a SecureRandom object generates a random 128-bit entropy value (ENT). ENT is hashed with SHA-256 to compute the checksum (CS), which is the first 4 bits of the hash. The checksum is appended to ENT to give ENT_CS, 132 bits in total. ENT_CS is split into 12 groups of 11 bits each, and the integer value of each group is used as the index of a word in the word list. This generates my 12-word mnemonic phrase. All steps up to this point match the expected results from the aforementioned BIP39 Tool.
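As a quick check on the sizes involved: BIP39 defines the checksum length as ENT / 32 bits and the word count as (ENT + CS) / 11, so 128 bits of entropy gives a 4-bit checksum and 12 words. The sketch below only illustrates that arithmetic; the class and method names are hypothetical and not part of my program.

```java
// Hypothetical helper for illustration only; not part of the question's code.
final class Bip39Sizes {

    /** Checksum length in bits: one checksum bit per 32 bits of entropy. */
    static int checksumBits(int entropyBits) {
        if (entropyBits < 128 || entropyBits > 256 || entropyBits % 32 != 0) {
            throw new IllegalArgumentException("ENT must be 128..256 bits and a multiple of 32");
        }
        return entropyBits / 32;
    }

    /** Number of mnemonic words: (ENT + CS) / 11, since each word encodes 11 bits. */
    static int wordCount(int entropyBits) {
        return (entropyBits + checksumBits(entropyBits)) / 11;
    }

    public static void main(String[] args) {
        // Walk through every entropy size BIP39 allows.
        for (int ent = 128; ent <= 256; ent += 32) {
            System.out.println(ent + " entropy bits, " + checksumBits(ent)
                    + " checksum bits, " + wordCount(ent) + " words");
        }
    }
}
```

For the 16-byte entropy used here, `Bip39Sizes.wordCount(128)` evaluates to 12, which is why the loop in generateMnemonic() runs 12 times.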
To create the seed, I am using PBKDF2 with HmacSHA512, with the iteration count set to 2048 and the key size set to 512 bits (64 bytes). I have tested my implementation of PBKDF2 against these Test Vectors, Google's "crypto" package implementation, and NovaCrypto's Java BIP39 implementation. The mnemonic words, excluding separators, are used as the input, along with a salt of "mnemonic"+password, as per the BIP39 specification.
public static byte[] PBKDF2(String mnemonic, String salt) {
    try {
        byte[] fixedSalt = ("mnemonic" + salt).getBytes(StandardCharsets.UTF_8);
        KeySpec spec = new PBEKeySpec(mnemonic.toCharArray(), fixedSalt, 2048, 512);
        SecretKeyFactory f = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA512");
        return f.generateSecret(spec).getEncoded();
    } catch (NoSuchAlgorithmException | InvalidKeySpecException ex) {
        throw new RuntimeException(ex);
    }
}
public static String[] generateMnemonic() {
    // Generate 128-bit Random Number for Entropy
    byte[] ENT = getEntropy();
    // Hash the Entropy value
    byte[] HASH = SHA256(ENT);
    // Copy first 4 bits of Hash as Checksum
    boolean[] CS = Arrays.copyOfRange(bytesToBits(HASH), 0, 4);
    // Add Checksum to the end of Entropy bits
    boolean[] entBits = bytesToBits(ENT);
    boolean[] ENT_CS = Arrays.copyOf(entBits, entBits.length + CS.length);
    System.arraycopy(CS, 0, ENT_CS, entBits.length, CS.length);
    // Split ENT_CS into groups of 11 bits and create String array for mnemonicWords
    String[] mnemonicWords = new String[12];
    for (int i = 0; i < 12; i++) {
        boolean[] numBits = Arrays.copyOfRange(ENT_CS, i * 11, i * 11 + 11);
        mnemonicWords[i] = wordList.get(bitsToInt(numBits));
    }
    return mnemonicWords;
}
// Returns randomly generated, 16-byte number
public static byte[] getEntropy() {
    byte[] ent = new byte[16];
    sr.nextBytes(ent);
    return ent;
}
// Returns bit representation of byte array
public static boolean[] bytesToBits(byte[] data) {
    boolean[] bits = new boolean[data.length * 8];
    for (int i = 0; i < data.length; ++i)
        for (int j = 0; j < 8; ++j)
            bits[(i * 8) + j] = (data[i] & (1 << (7 - j))) != 0;
    return bits;
}
private static final char[] hexArray = "0123456789ABCDEF".toCharArray();
// Returns hex string from byte array
public static String bytesToHex(byte[] bytes) {
    char[] hexChars = new char[bytes.length * 2];
    for (int j = 0; j < bytes.length; j++) {
        int v = bytes[j] & 0xFF;
        hexChars[j * 2] = hexArray[v >>> 4];
        hexChars[j * 2 + 1] = hexArray[v & 0x0F];
    }
    return new String(hexChars);
}
// Returns SHA256 hash of input data
public static byte[] SHA256(byte[] data) {
    try {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        System.out.println(Arrays.toString(data));
        return digest.digest(data);
    } catch (NoSuchAlgorithmException ex) {
        throw new RuntimeException(ex);
    }
}
// Returns int value of a bit array
public static int bitsToInt(boolean[] bits) {
    int n = 0, l = bits.length;
    for (int i = 0; i < l; ++i) {
        n = (n << 1) + (bits[i] ? 1 : 0);
    }
    return n;
}
// Generate Mnemonic Words, Mnemonic Phrase, and Seed
String[] mnemonicWords = generateMnemonic();
String mnemonicPhrase = "";
for (String word : mnemonicWords)
    mnemonicPhrase += word;
byte[] seed = PBKDF2(mnemonicPhrase, "");
System.out.println("Seed: " + bytesToHex(seed));
My Program Trial
Entropy (hex): 3CCB62D9AF76F1E8DB113E66B2D84656
Checksum bits: 1100
Raw Binary: 00111100110 01011011000 10110110011 01011110111 01101111000 11110100011 01101100010 00100111110 01100110101 10010110110 00010001100 1010110
Mnemonic: devote force reopen galaxy humor virtual hobby chief grit nothing bag pulse
Seed: 013FFA714C57AA26C59DC215880D9C2398A8B38D10D7E41A882CF98C35976F0BF26BCC08B0B196945DE8778C7FD561FB0F20A8B9BAD46B12196C963A85E3B40E
Expected Results (Derived from same Entropy)
Entropy (hex): 3CCB62D9AF76F1E8DB113E66B2D84656
Checksum bits: 1100
Raw Binary: 00111100110 01011011000 10110110011 01011110111 01101111000 11110100011 01101100010 00100111110 01100110101 10010110110 00010001100 1010110
Mnemonic: devote force reopen galaxy humor virtual hobby chief grit nothing bag pulse
Seed: 0c3c5f9ae724a2a3ed70aeb24919c10506e4962223a5375f70164be8b897d615ec9bf9f3e64a889cff03318cc5d0b3c8378ba0264d198e307c609632016ddd01
Looks like I was able to answer my own question. In my program, I was concatenating the seed words without spaces using:
String mnemonicPhrase = "";
for (String word : mnemonicWords)
    mnemonicPhrase += word;
But this is not the correct format: BIP39 uses the mnemonic sentence as the PBKDF2 password, with the words separated by single spaces. Changing this code block to add spaces:
String mnemonicPhrase = "";
for (int i = 0; i < mnemonicWords.length; i++) {
    mnemonicPhrase += mnemonicWords[i];
    if (i < mnemonicWords.length - 1) mnemonicPhrase += " ";
}
yields the expected Test Vector results published here, which use a password of "TREZOR".
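For anyone who wants a more compact version of the same fix, here is a minimal sketch that joins the words with `String.join` and then runs the same PBKDF2 call as in the question. The class name `Bip39SeedSketch` and its method names are hypothetical. The NFKD normalization step is an addition that is not in my original code: BIP39 asks for both the mnemonic sentence and the salt to be NFKD-normalized UTF-8, and for the English word list it should be a no-op.

```java
import java.nio.charset.StandardCharsets;
import java.security.NoSuchAlgorithmException;
import java.security.spec.InvalidKeySpecException;
import java.text.Normalizer;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

// Hypothetical class for illustration; names are not from the original program.
final class Bip39SeedSketch {

    /** Derives the 64-byte seed from the mnemonic words and an optional passphrase. */
    static byte[] toSeed(String[] words, String passphrase) {
        // Words are joined with single spaces; this was the missing step.
        String sentence = Normalizer.normalize(String.join(" ", words), Normalizer.Form.NFKD);
        String salt = Normalizer.normalize("mnemonic" + passphrase, Normalizer.Form.NFKD);
        PBEKeySpec spec = new PBEKeySpec(
                sentence.toCharArray(), salt.getBytes(StandardCharsets.UTF_8), 2048, 512);
        try {
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA512")
                    .generateSecret(spec).getEncoded();
        } catch (NoSuchAlgorithmException | InvalidKeySpecException ex) {
            throw new IllegalStateException(ex);
        }
    }

    /** Lower-case hex encoding of a byte array. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Mnemonic and expected seed are copied from the trial in the question.
        String[] words = ("devote force reopen galaxy humor virtual "
                + "hobby chief grit nothing bag pulse").split(" ");
        String expected = "0c3c5f9ae724a2a3ed70aeb24919c10506e4962223a5375f70164be8b897d615"
                + "ec9bf9f3e64a889cff03318cc5d0b3c8378ba0264d198e307c609632016ddd01";
        String actual = toHex(toSeed(words, ""));
        System.out.println("Seed: " + actual);
        System.out.println("Matches expected seed from the question: " + actual.equals(expected));
    }
}
```

Passing the words as an array and joining them inside `toSeed` makes it harder to reintroduce the original bug, since the caller never builds the sentence by hand.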