I came across a JPEG parsing function in Node.js that I'm attempting to adapt for use in a browser environment. The original code can be found here.
The original code uses Node.js's Buffer class. Since I would like to use it in a browser environment, we have to use DataView.getUint16(0, false /* big endian */) instead of buffer.readUInt16BE(0) /* BE = big endian */.
Interestingly, DataView is also available in Node.js, so the result could work in both environments.
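For illustration, a minimal sketch of the two reads side by side, assuming the input is a Uint8Array (a Node.js Buffer is a Uint8Array subclass, so it qualifies too). The helper name readUint16BE is made up for this example. Passing byteOffset and byteLength to the DataView matters because a typed array can be a window onto a larger ArrayBuffer:

```javascript
// Hypothetical helper: big-endian 16-bit read that works on any Uint8Array,
// including a Node.js Buffer.
function readUint16BE(bytes, offset) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  return view.getUint16(offset, false /* big endian */);
}

const whole = new Uint8Array([0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10]);
const tail = whole.subarray(4); // shares memory with `whole`; byteOffset is 4

console.log(readUint16BE(whole, 4)); // 16
console.log(readUint16BE(tail, 0));  // 16
// Ignoring byteOffset reads from the start of the shared ArrayBuffer instead:
console.log(new DataView(tail.buffer).getUint16(0, false)); // 65496 (0xFFD8)
```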
Here is what I have found so far:
Starting j from 4 helps get the correct offset for the first iteration, as the first 4 bytes of the buffer are sliced off:
let j=4 // match the buffer slicing above
j+=2; // match the buffer slicing below ( i + 2 )
buffer = buffer.slice(i + 2); // Buffer is sliced of two bytes, 0 offset is now 2 bytes further ?
Here is the function with logging added:
function calculate (buffer) {
  // Skip 4 chars, they are for signature
  buffer = buffer.slice(4);
  let j=4 // match the buffer slicing above
  let aDataView=new DataView(buffer.buffer);
  var i, next;
  while (buffer.length) {
    // read length of the next block
    i = buffer.readUInt16BE(0);
    console.log("i="+i,"read="+aDataView.getUint16(j,false));
    j+=2; // match the buffer slicing below ( i + 2 )
    // ensure correct format
    validateBuffer(buffer, i);
    // 0xFFC0 is baseline standard(SOF)
    // 0xFFC1 is baseline optimized(SOF)
    // 0xFFC2 is progressive(SOF2)
    next = buffer[i + 1];
    if (next === 0xC0 || next === 0xC1 || next === 0xC2) {
      return extractSize(buffer, i + 5);
    }
    // move to the next block
    buffer = buffer.slice(i + 2);
  }
  throw new TypeError('Invalid JPG, no size found');
}
Actual result on this image:
node .\start.js
i=16 read=16 # Seems to be the correct offset
i=91 read=19014 # Wrong offset
i=132 read=18758
My debugging steps so far:
Installed buffer-image-size from npm:
npm install buffer-image-size --save
Wrote start.js as follows:
var sizeOf = require('buffer-image-size');
const fs = require('fs');
const fileBuffer = fs.readFileSync("flowers.jpg");
var dimensions = sizeOf(fileBuffer);
console.log(dimensions.width, dimensions.height);
Edited "node_modules\buffer-image-size\lib\types\jpg.js", adding the lines mentioned above and the logging.
Do you have any hint as to why j does not help to get the correct offset? I appreciate any insights or guidance on resolving this issue. Thank you!
Yeah, avoid both advancing offsets and re-slicing the buffer; it only gets confusing. I would write:
function calculate(typedArray) {
  const view = new DataView(typedArray.buffer, typedArray.byteOffset, typedArray.byteLength);
  let i = 0;
  // Skip 4 chars, they are for signature
  i += 4;
  while (i < view.byteLength) {
    // read length of the next block
    const blockLen = view.getUint16(i, false /* big endian */);
    // ensure correct format
    // index should be within buffer limits
    if (i + blockLen > view.byteLength) {
      throw new TypeError('Corrupt JPG, exceeded buffer limits');
    }
    // Every JPEG block must begin with a 0xFF
    if (view.getUint8(i + blockLen) !== 0xFF) {
      throw new TypeError('Invalid JPG, marker table corrupted');
    }
    // 0xFFC0 is baseline standard(SOF)
    // 0xFFC1 is baseline optimized(SOF)
    // 0xFFC2 is progressive(SOF2)
    const next = view.getUint8(i + blockLen + 1);
    if (next === 0xC0 || next === 0xC1 || next === 0xC2) {
      return extractSize(view, i + blockLen + 5);
    }
    // move to the next block
    i += blockLen + 2;
  }
  throw new TypeError('Invalid JPG, no size found');
}
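The extractSize helper is not shown here. Assuming it mirrors the helper in the original module (a 2-byte height at the given offset followed by a 2-byte width, both big endian), a DataView version might look roughly like this:

```javascript
// Assumed layout at `offset`: 2-byte height, then 2-byte width, big endian.
function extractSize(view, offset) {
  return {
    height: view.getUint16(offset, false /* big endian */),
    width: view.getUint16(offset + 2, false /* big endian */),
  };
}

// SOF payload after the marker and length: precision (1 byte), height, width
const sofView = new DataView(new Uint8Array([0x08, 0x01, 0xE0, 0x02, 0x80]).buffer);
console.log(extractSize(sofView, 1)); // { height: 480, width: 640 }
```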
Notice that the calculate function above, which is a straightforward translation of the source, is slightly confusing and buggy:
- i does not point to the start of the segment, but rather two bytes into the segment (after the marker)
- it can throw RangeError exceptions from accessing bytes beyond the end of the buffer, as it only checks for the past block to be within limits
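One way to address both points, sketched under the assumption that every segment before the SOF marker carries a two-byte length (true for APPn, DQT, DHT and similar segments, but not for standalone markers or fill bytes, which this sketch does not handle). The name jpegSize is made up for illustration. Here pos always points at the 0xFF that starts a segment, and every read is bounds-checked first:

```javascript
// Hypothetical variant, not part of buffer-image-size.
function jpegSize(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  // A JPEG file starts with the SOI marker 0xFFD8
  if (view.byteLength < 2 || view.getUint16(0, false) !== 0xFFD8) {
    throw new TypeError('Invalid JPG, missing SOI marker');
  }
  let pos = 2; // offset of the 0xFF that starts the current segment
  // Each iteration needs 4 readable bytes: marker (2) + segment length (2)
  while (pos + 4 <= view.byteLength) {
    if (view.getUint8(pos) !== 0xFF) {
      throw new TypeError('Invalid JPG, marker table corrupted');
    }
    const marker = view.getUint8(pos + 1);
    const segLen = view.getUint16(pos + 2, false); // includes the 2 length bytes
    if (marker === 0xC0 || marker === 0xC1 || marker === 0xC2) {
      // SOF payload: precision (1 byte), height (2 bytes), width (2 bytes)
      if (pos + 9 > view.byteLength) {
        throw new TypeError('Corrupt JPG, exceeded buffer limits');
      }
      return {
        height: view.getUint16(pos + 5, false),
        width: view.getUint16(pos + 7, false),
      };
    }
    pos += 2 + segLen; // skip the marker plus the whole segment
  }
  throw new TypeError('Invalid JPG, no size found');
}

// Synthetic input: SOI, a 4-byte APP0 segment, then the start of an SOF0 segment
const sample = new Uint8Array([
  0xFF, 0xD8,
  0xFF, 0xE0, 0x00, 0x04, 0x00, 0x00,
  0xFF, 0xC0, 0x00, 0x0B, 0x08, 0x01, 0x02, 0x03, 0x04,
]);
console.log(jpegSize(sample)); // { height: 258, width: 772 }
```

Truncated input ends in a TypeError from the explicit checks rather than a RangeError from the DataView.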