I am using crypto-js to calculate the MD5 checksum of my file before uploading. Below is my code.
import CryptoJS from "crypto-js";

const getMd5 = async (fileObject) => {
  let md5 = "";
  try {
    const fileObjectUrl = URL.createObjectURL(fileObject);
    const blobText = await fetch(fileObjectUrl)
      .then((res) => res.blob())
      .then((res) => new Response(res).text());
    const hash = CryptoJS.MD5(CryptoJS.enc.Latin1.parse(blobText));
    md5 = hash.toString(CryptoJS.enc.Hex);
  } catch (err) {
    console.log("Error occurred in getMd5:", err);
  }
  return md5;
};
The above code works fine for text files, but for non-text files such as images and videos the checksum is calculated incorrectly.
Any help/input is appreciated. Thanks!
Response.text() reads the response stream and converts it to a string using UTF-8 decoding. Arbitrary binary data that is not valid UTF-8 (e.g. images, videos) is corrupted in this process; see also the other answer.
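To make the corruption concrete, here is a minimal sketch that decodes a few bytes as UTF-8 with TextDecoder, which I am assuming behaves like the decoding step inside Response.text(). The variable names are only illustrative.

```javascript
// Bytes that are partly invalid as UTF-8 (0x80 and above, standing alone).
const original = new Uint8Array([0x01, 0x02, 0x03, 0x7f, 0x80, 0x81, 0xfd, 0xfe, 0xff]);

// Assumption: this mirrors the UTF-8 decoding that Response.text() performs.
const decoded = new TextDecoder("utf-8").decode(original);

// Every invalid byte is replaced by U+FFFD, so its original value is lost.
const replacementCount = [...decoded].filter((ch) => ch === "\uFFFD").length;

// Encoding the string again does not give back the original bytes.
const roundTripped = new TextEncoder().encode(decoded);

console.log(original.length, roundTripped.length, replacementCount);
```

Once the bytes have been replaced, no later parsing step (such as CryptoJS.enc.Latin1.parse) can recover them, so the hash is computed over different data.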
This is prevented by using Response.arrayBuffer() instead, which simply stores the data unchanged in an ArrayBuffer.
Since CryptoJS works internally with WordArrays, a further conversion of the ArrayBuffer into a WordArray is necessary.
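As a rough illustration of what that conversion involves, the sketch below packs the bytes into big-endian 32-bit words and records the byte count. bytesToWords is a hypothetical helper written for this explanation, not part of CryptoJS; I am assuming it matches the { words, sigBytes } layout of a WordArray. In real code, CryptoJS.lib.WordArray.create() does this for you.

```javascript
// Hypothetical helper (not a CryptoJS API): packs raw bytes into
// big-endian 32-bit words, in the shape { words, sigBytes }.
function bytesToWords(arrayBuffer) {
  const bytes = new Uint8Array(arrayBuffer);
  const words = new Array(Math.ceil(bytes.length / 4)).fill(0);
  for (let i = 0; i < bytes.length; i++) {
    // Byte i lands in word (i >>> 2), most significant byte first.
    words[i >>> 2] |= bytes[i] << (24 - (i % 4) * 8);
  }
  return { words, sigBytes: bytes.length };
}

const sample = new Uint8Array([0x01, 0x02, 0x03, 0x7f, 0x80]).buffer;
const packed = bytesToWords(sample);
console.log(packed.words.length, packed.sigBytes);
```

The important point is that every byte is carried over unchanged, including values of 0x80 and above. sigBytes records how many bytes of the last word are significant.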
The following fix works on my machine:
(async () => {
  const getMd5 = async (fileObject) => {
    let md5 = "";
    try {
      // Use the parameter, not the outer test blob
      const fileObjectUrl = URL.createObjectURL(fileObject);
      const arrayBuffer = await fetch(fileObjectUrl)
        .then((res) => res.blob())
        .then((res) => new Response(res).arrayBuffer()); // Convert to ArrayBuffer
      const hash = CryptoJS.MD5(CryptoJS.lib.WordArray.create(arrayBuffer)); // Import as WordArray
      md5 = hash.toString(CryptoJS.enc.Hex);
    } catch (err) {
      console.log("Error occurred in getMd5:", err);
    }
    return md5;
  };

  // Test data containing bytes that are not valid UTF-8
  const blob = new Blob([new Uint8Array([0x01, 0x02, 0x03, 0x7f, 0x80, 0x81, 0xfd, 0xfe, 0xff])]);
  console.log(await getMd5(blob));
})();
<script src="https://cdnjs.cloudflare.com/ajax/libs/crypto-js/4.0.0/crypto-js.min.js"></script>
For simplicity, I did not use a File object for the test, but a Blob with data that is not valid UTF-8. The generated hash is correct and can be verified online, e.g. here.