I'm trying to convert a specific implementation of a CRC32 generator made in C# into JavaScript for a static website.
It combines two strings into one, converts that into a byte array, generates a CRC from it, and converts the CRC into hexadecimal output.
Problem is, the output of the JavaScript version (CF2F4221) does not match the C# version (67CDF980).
C# Code:
using System;
using System.Text;
namespace CRC32.Generator
{
internal class CRC
{
public static readonly uint[] Table;
private uint _value = 0xFFFFFFFF;
static CRC()
{
Table = new uint[256];
const uint kPoly = 0xEDB88320;
for (uint i = 0; i < 256; i++)
{
uint r = i;
for (int j = 0; j < 8; j++)
if ((r & 1) != 0)
r = (r >> 1) ^ kPoly;
else
r >>= 1;
Table[i] = r;
}
}
public void Init()
{
_value = 0xFFFFFFFF;
}
public void UpdateByte(byte b)
{
_value = Table[(((byte) (_value)) ^ b)] ^ (_value >> 8);
}
public void Update(byte[] data, uint offset, uint size)
{
for (uint i = 0; i < size; i++)
_value = Table[(((byte) (_value)) ^ data[offset + i])] ^ (_value >> 8);
}
public uint GetDigest()
{
return _value ^ 0xFFFFFFFF;
}
}
class Program
{
static void Main(string[] args)
{
string string1 = "TestString1";
string string2 = "TestString2";
string stringData = string1 + string2;
byte[] data = Encoding.Unicode.GetBytes(stringData);
CRC crc = new CRC();
crc.Update(data, 0, (uint)data.Length);
uint crc32Value = crc.GetDigest();
byte[] bytes = BitConverter.GetBytes(crc32Value);
Array.Reverse(bytes);
uint reversedCrc32Value = BitConverter.ToUInt32(bytes, 0);
Console.WriteLine(reversedCrc32Value.ToString("X"));
}
}
}
JavaScript Code:
// CRC32 Table
const crc32Table = new Array(256);
for (let i = 0; i < 256; i++) {
let crc = i;
for (let j = 0; j < 8; j++) {
crc = (crc & 1) ? (crc >>> 1) ^ 0xEDB88320 : crc >>> 1;
}
crc32Table[i] = crc;
}
// CRC32 Function
function crc32(data) {
let crc = 0xFFFFFFFF;
for (let i = 0; i < data.length; i++) {
crc = (crc >>> 8) ^ crc32Table[(crc ^ data[i]) & 0xFF];
}
return (crc ^ 0xFFFFFFFF) >>> 0; // Unsigned right shift
}
// Generate CRC
function generateCRC() {
let string1 = document.getElementById('string1').value;
let string2 = document.getElementById('string2').value;
let stringData = string1 + string2;
// Convert the string to a UTF-16LE byte array
let encoder = new TextEncoder('utf-16le');
let data = encoder.encode(stringData);
let crc32Value = crc32(new Uint8Array(data.buffer));
document.getElementById('output').textContent = crc32Value.toString(16).toUpperCase(); // Convert to hexadecimal
}
From the research that I've done, I understand that the discrepancies I'm seeing between the C# and JavaScript outputs are likely due to differences in how the two languages handle bitwise operations, endianness, and possibly character encoding.
I'm hoping that someone who has more knowledge about those things can point me in the right direction, as admittedly this is outside my comprehension.
The C# code is computing the CRC-32 on the little-endian UTF-16 representation of the string, which is each ASCII byte of the string followed by a zero byte, 44 bytes in total. That CRC is 0x80f9cd67. The C# code then reverses those bytes to show the result 0x67CDF980.
The JavaScript code is correctly computing the CRC-32, but doing so directly on the 22 ASCII or UTF-8 bytes, which gives 0xcf2f4221 (note: not byte-reversed).
I can find no evidence that TextEncoder() supports the 'utf-16le' argument, or any arguments at all. As far as I can tell, it only supports encoding to UTF-8, which is what it's doing in this case. Most unfortunately, TextEncoder() does not report an error, no matter what its argument is.
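A quick way to see this for yourself is the sketch below. It assumes a runtime that provides the standard TextEncoder global (current browsers or a recent Node.js); the variable names are only for illustration.

```javascript
// Assumption: the standard TextEncoder global is available.
// The 'utf-16le' argument is accepted without complaint but has no effect.
const encoder = new TextEncoder('utf-16le');

// The encoder still identifies itself as UTF-8 ...
const encodingName = encoder.encoding;
console.log(encodingName); // 'utf-8'

// ... and produces one byte per ASCII character, not two.
const encoded = encoder.encode('Te');
console.log(encoded.length); // 2 (UTF-16LE would give 4)
```

So the byte array handed to crc32() in the question is half the length of the one the C# code uses.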
You could instead use charCodeAt() to get each UTF-16 value directly from the string, and place them manually in little-endian order in a byte array, two bytes per character, in order to replicate what the C# code is providing to the CRC computation.
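As a rough sketch of that byte-array approach: the names crcTable, utf16leBytes and crc32OfBytes are made up for this illustration, and the table is built the same way as in the question.

```javascript
// Reflected CRC-32 table (polynomial 0xEDB88320), built as in the question.
const crcTable = new Uint32Array(256);
for (let i = 0; i < 256; i++) {
  let r = i;
  for (let j = 0; j < 8; j++) {
    r = (r & 1) ? (r >>> 1) ^ 0xEDB88320 : r >>> 1;
  }
  crcTable[i] = r;
}

// Hypothetical helper: the UTF-16LE bytes of a string, low byte first,
// which is what Encoding.Unicode.GetBytes() produces in the C# code.
function utf16leBytes(str) {
  const bytes = new Uint8Array(str.length * 2);
  for (let i = 0; i < str.length; i++) {
    const unit = str.charCodeAt(i); // one 16-bit code unit
    bytes[2 * i] = unit & 0xFF;     // low byte
    bytes[2 * i + 1] = unit >>> 8;  // high byte
  }
  return bytes;
}

// Same byte-wise CRC-32 as the question's crc32(), on a byte array.
function crc32OfBytes(bytes) {
  let crc = 0xFFFFFFFF;
  for (let i = 0; i < bytes.length; i++) {
    crc = (crc >>> 8) ^ crcTable[(crc ^ bytes[i]) & 0xFF];
  }
  return (crc ^ 0xFFFFFFFF) >>> 0; // force an unsigned 32-bit result
}

const utf16Data = utf16leBytes("TestString1TestString2");
console.log(utf16Data.length); // 44
console.log(crc32OfBytes(utf16Data).toString(16));
```

With the 44-byte input this should print the same CRC the C# code computes before its byte reversal, 80f9cd67.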
This will compute the CRC directly on the string, giving the result of the C# code:
// CRC32 Function
function crc32(string) {
let crc = 0xFFFFFFFF;
for (let i = 0; i < string.length; i++) {
crc ^= string.charCodeAt(i);
crc = (crc >>> 8) ^ crc32Table[crc & 0xFF];
crc = (crc >>> 8) ^ crc32Table[crc & 0xFF];
}
return (crc ^ 0xFFFFFFFF) >>> 0; // Unsigned right shift
}
That returns the integer 2163854695 or 0x80f9cd67 for the string "TestString1TestString2". You can then convert the number to hexadecimal and reverse the bytes as needed.
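For that last step, one possible sketch is below. The name toReversedHex is invented here, and it is meant to mirror the Array.Reverse() plus ToString("X") combination in the C# code, which does not pad with leading zeros.

```javascript
// Hypothetical helper: swap the four bytes of an unsigned 32-bit value
// and format the result as uppercase hexadecimal.
function toReversedHex(crc) {
  const reversed = (
    ((crc & 0xFF) << 24) |    // lowest byte becomes the highest
    ((crc & 0xFF00) << 8) |
    ((crc >>> 8) & 0xFF00) |
    (crc >>> 24)              // highest byte becomes the lowest
  ) >>> 0;                    // back to an unsigned value
  return reversed.toString(16).toUpperCase();
}

console.log(toReversedHex(0x80f9cd67)); // "67CDF980"
```

If you want a fixed width of eight digits, add .padStart(8, '0') to the result; the C# code as written does not do that either.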