cbit-manipulationhuffman-codebinary-operators

Bit bufferer doesn't behave as expected in C


I have to write individual Bits to a file (for huffman code). To do so I send bits to a function which buffers them until a byte is filled and then returns the byte. I can´t figure out why it doesn't work, the function outputs wrong bytes (but no errors). Here is the bitsBuffer function:

// Buffers bits until one byte is filled, the returns it and return code turns positive
int bitsBuffer(char inByte, char nBits, char* outByte, char finish)
{
    int i;
    static short unsigned buffer = 0;
    static char unsigned offset = 0;

    // Very readable way to add nBits Bits to buffer
    buffer |= (inByte & ~(~0 << nBits)) << offset;
    offset += nBits;
    if (offset >= 8) {
        *outByte = (char)buffer;
        buffer >>= 8;
        offset -= 8;
        return 1;
    }
    if (finish) {
        buffer = 0;
        if (offset) {
            *outByte = (char)buffer;
            offset = 0;
            return 1;
        }
        offset = 0;
        return 0;
    }
    return 0;
}

I use this program to test the bitbuffer an then I pipe the output to xxd -b to view the bits:

#include "bitsHandler.h"
#include <stdio.h>

int main()
{
    char a[] = { 0b0110, 0b1, 0b100010, 0b111, 0b100110, 0b0 };
    char b[] = { 4, 1, 6, 3, 6, 1 };
    char c[100];
    int counter = 0;
    for (int i = 0; i < 6; i++) {
        if (bitsBuffer(a[i], b[i], &c[counter], 0)) {
            counter++;
        }
    }
    if (bitsBuffer(0, 0, &c[counter], 1)) {
        counter++;
    }
    fwrite(c, sizeof(char), counter, stdout);
}

I replicated the function on paper (went through every step by hand) and I can't find my mistake. Help is appreciated.

EDIT: Expected output (piped through xxd, should be identical to a[] array):

00000000: 01101100 01011110 01100000                             V..

Actual output:

00000000: 01010110 10111100 00001001                             V..

Solution

  • Apart from the fact that your bit order is gong to be a bit weird because you add new bits to the top of the buffer and take them off of the bottom, you have a bug in the finish path.

        if (finish) { 
            buffer = 0; // <<<==== This is a bug I think
            if (offset) {
                *outByte = (char)buffer;
                offset = 0;
                return 1;
            }
            offset = 0;
            return 0;
        }
    

    When I dry run your test data, when your code hits the line marked "this is a bug I think" that zaps buffer there are still bits in it.Namely 0b01001 and offset is 5. Those last five bits are never getting written to the output.

    As far as the bit order is concerned, I think it could be fixed by shifting buffer left by nBits, adding the new bits into the bottom and, if you have eight or more bits taking the top eight bits for your next output byte.

    This is probably about right. It gives your expected output

    int bitsBuffer2(char inByte, char nBits, char* outByte, char finish)
    {
        int i;
        static short unsigned buffer = 0;
        static char unsigned totalBits = 0;
    
        // Note: nBits is assumed to b <= 8 !
        buffer <<= nBits; // Make space for the bits
        buffer |= inByte & ~(~0u << nBits); // Put the bits in the buffer
        totalBits += nBits;
        if (totalBits >= 8) {
            // Slice the top 8 bits off the buffer
            totalBits -= 8;
            *outByte = (char)(buffer >> totalBits); // Note: Shift before cast
            buffer &= ~(~0u << totalBits);
            return 1;
        }
        if (finish) {
            if (totalBits > 0) { // Test if anything is in the buffer
                // Pack the remaining bits at the top of the byte
                *outByte = (char)(buffer << (8 - totalBits));
                // Zap everything
                totalBits = 0;
                buffer = 0;
                return 1;
            }
            return 0;
        }
        return 0;
    }
    

    It might not be 100% correct if finish == 1 and a lot of bits in the buffer