.netvb.netbit-manipulation

Bug discovered in function for reading and summing bits


I have written a function in VB.Net (.NET Framework 4.8) that reads a specific number of bits from a byte array starting from a given position and sums the result. Example:
NextBits({255, 255, 255}, 0, 24) returns 765 because 255 + 255 + 255.

If a few bits are read and these cross a byte boundary, then a new byte is returned with the specified bits as the most significant bits. Example:
NextBits({0, 15, 200}, 12, 5) returns 248 because 5 bits are read from 0000 0000 0000 1111 1000 0000, and these become 1111 1000 (248).

I originally copied the function from a Stackoverflow answer that was written in C. In that version, a result array was passed by reference to the function – however, I need the sum of the result bytes. The function seemed to have several bugs, so I made two improvements:

  1. If startposition + numBits is still within a single byte, I call another function. This works, but I don't find it very elegant.
  2. An intermediate result was different in VB.net compared to C, so I added Math.Min(). In C, tmp |= tmp2 >> (8 - offset); resulted in 0 in certain cases, but in VB.net it was 1, so I added Math.Min(). (C behaves differently and I had to make a few adjustments).

The problem now is that I discovered a bug yesterday: In the test case, NextBits({48, 2, 250}, 0, 24), it should return 300 because all 3 bytes are read (48 + 2 + 250), but the result is 254. This is because the Math.Min() that I added is causing an issue. All previous test cases worked correctly.

Edit October 8: I've just determined that the issue is actually in the line If CUInt(tmp) << CInt(8UI - CUInt(offset)) > 255UI Then. In the last test case, the code enters the If block 3 times, whereas in the other test cases it goes into the Else once and then into the If block twice.

I made a test project:

Option Strict On
Module Module1

    Sub Main()
        ' 1111 1111  1111 1111  1111 1111
        '--------------------------------
        ' 255           255         255

        Dim nextBitsResult1 As UInteger = NextBits({255, 255, 255}, 0, 23)
        Debug.WriteLine("NextBits 764 = " & nextBitsResult1.ToString())


        ' 0000 0000  0000 0001  0000 0010
        '--------------------------------
        '         0          1          2

        Dim nextBitsResult2 As UInteger = NextBits({0, 1, 2}, 0, 24)
        Debug.WriteLine("NextBits 3 = " & nextBitsResult2.ToString())


        ' 0000 0000  0000 1111  1100 1000
        '                 ____  _
        '--------------------------------
        '         0         15        200

        Dim nextBitsResult3 As UInteger = NextBits({0, 15, 200}, 12, 5)
        Debug.WriteLine("NextBits 248 = " & nextBitsResult3.ToString())


        ' 0000 0000  0001 1111  0000 0000
        '               _ ____
        '--------------------------------
        '         0         31          0

        'Dim nextBitsResult4 As UInteger = NextBits({0, 31, 0}, 11, 5)
        'Debug.WriteLine("NextBits 31 = " & nextBitsResult4.ToString())


        ' 0011 0000  0000 0010  1111 1010
        '--------------------------------
        '        48          2        250

        Dim nextBitsResult5 As UInteger = NextBits({48, 2, 250}, 0, 24)
        Debug.WriteLine("NextBits 300 = " & nextBitsResult5.ToString())
    End Sub

    ''' <summary>
    ''' Reads the next <b>n</b> bits from a byte array starting at a specified bit position and sums their values.<br></br>
    ''' If n bits are to be read across bytes, a new byte is returned, such that the bits of the last byte are considered the leftmost
    ''' (most significant) bits of the result byte.<br></br>
    ''' Serves as a look ahead.
    ''' </summary>
    ''' <param name="byteArray"></param>
    ''' <param name="startPosition">The start position (0-based) in bits from which to begin reading.</param>
    ''' <param name="numBits">The number of bits to read from the byte array.</param>
    ''' <param name="dest">To specify a position in a temporary output array. This parameter is not relevant for end users.</param>
    ''' <returns></returns>
    Friend Function NextBits(byteArray() As Byte, startPosition As Integer, numBits As Integer, Optional dest As Integer = 0) As UInt32
        ' https://stackoverflow.com/a/50899946

        If IsWithinOneByte(startPosition, numBits) Then
            Dim idx As Integer = startPosition \ 8
            'Return AnotherFunction(byteArray(idx), startPosition, numBits)
        End If

        Dim bitmask As Integer = -128
        Dim len As Integer = numBits
        Dim b As Byte() = New Byte(byteArray.Length - 1) {}

        While len > 0
            Dim idx As Integer = startPosition \ 8
            Dim offset As Integer = startPosition Mod 8
            Dim tmp As Byte = byteArray(idx) << offset
            Dim next_bits As Integer = offset + len - 8

            If len > 8 Then
                next_bits += 1
            End If

            If next_bits < 0 Then
                ' Don't even need all of the current byte -> remove trailing bits
                tmp = CByte(tmp And (bitmask >> (len - 1)))
            ElseIf next_bits > 0 Then
                ' Need to include part of next byte
                Dim tmp2 As Byte = CByte(byteArray(idx + 1) And (bitmask >> (next_bits - 1)))
                If offset <> 0 Then
                    tmp = tmp Or (tmp2 >> (8 - offset))
                Else
                    tmp = Math.Min(tmp, tmp2 >> (8 - offset))
                End If
            End If

            ' Determine byte index and offset in output byte array
            idx = dest \ 8
            offset = dest Mod 8
            b(idx) = b(idx) Or tmp << (8 - offset)
            If CUInt(tmp) << CInt(8UI - CUInt(offset)) > 255UI Then ' In the original C code, this check is redundant because if the value exceeds 255, the assignment b(idx + 1) = 0 effectively handles the overflow. (In C, there are no exceptions)
                If idx + 1 < b.Length Then ' The original C code did write an array index too far.
                    b(idx + 1) = 0
                End If
            Else
                If idx + 1 < b.Length Then
                    b(idx + 1) = b(idx + 1) Or (tmp << (8 - offset))
                End If
            End If

            ' Update start position and length for next pass
            If len > 8 Then
                len -= 8
                dest += 8
                startPosition += 8
            Else
                len -= len
            End If

        End While

        Dim ret As UInt32 = CUInt(b.Sum(Function(x As UInt32) x))
        Return ret
    End Function

    Private Function IsWithinOneByte(startPosition As Integer, numBits As Integer) As Boolean
        Dim startByte As Integer = startPosition \ 8
        Dim endByte As Integer = (startPosition + numBits - 1) \ 8
        Return startByte = endByte
    End Function
End Module

Outputs:

NextBits 764 = 764
NextBits 3 = 3
NextBits 248 = 248
NextBits 300 = 254


Solution

  • Solution 1

    I translated the C code from answer into VB:

    Function NextBits(byteArray() As Byte,
                  startPosition As Integer, numBits As Integer)
    
        Const bitmask = &HFFFFFF80
        Dim len = numBits
        Dim dest = 0
        Dim sum = 0
    
        While len > 0
            'Determine the byte index and offset in the input byte array
            Dim idx = startPosition \ 8
            Dim offset = startPosition Mod 8
            Dim tmp As Integer = byteArray(idx) << offset
            Dim next_bits = offset + len - 8
    
            If next_bits < 0 Then
                'Don't even need all of current byte => remove trailing bits ...
                tmp = tmp And (bitmask >> (len - 1))
            ElseIf next_bits > 0 Then
                'Need to include part of next byte ...
                Dim tmp2 = byteArray(idx + 1) And (bitmask >> (next_bits - 1))
                tmp = tmp Or (tmp2 >> (8 - offset))
            End If
    
            'Determine byte index and offset in output byte array
            idx = dest \ 8
            offset = dest Mod 8
            sum += (tmp >> offset)
            sum += ((tmp << (8 - offset)) And &HFF)
    
            'Update start position and length for next pass ...
            If len > 8 Then
                len -= 8
                dest += 8
                startPosition += 8
            Else
                Exit While
            End If
    
        End While
    
        Return sum
    
    End Function
    

    I didn't find any errors there. But there are several nuances related to the translation:

    1. I removed b array from the calculations, since it isn't required to find the sum. (The bit ranges don't overlap, therefore they can be summed separately.) If you still need b array, then keep in mind that this algorithm uses one more element, i.e. you will need to create it like this:
    Dim b As Byte() = New Byte(byteArray.Length) {}
    
    '...
    idx = dest \ 8
    offset = dest Mod 8
    b(idx) = b(idx) Or (tmp >> offset)
    b(idx + 1) = b(idx + 1) Or ((tmp << (8 - offset)) And &HFF)
    '...
    
    Return b.Sum(Function(x) x)
    
    1. I converted all calculations to Integer using "And &HFF" mask in one place.
    2. In the C code, arithmetical shift of the signed number to the right was used to create the mask: "bitmask >> ...". Therefore, bitmask constant had to be supplemented with 1s on the left: &HFFFFFF80 (-128). (By the way, if you set the mask to &HFFFFFF00, the expressions will be simplified: tmp = tmp And (bitmask >> len) and tmp2 = byteArray(idx + 1) And (bitmask >> next_bits).)
    3. The left shift "tmp << (8 - offset)" worked on its own. The fact is that VB uses only the 3 lowest bits of the operand to shift the byte, so the operation "ByteVal << 8" doesn't change the result.

    The compilation differences between C and VB are as follows:

    C

    unsigned char ByteVal = 255;
    unsigned char u7 = 7;
    unsigned char u8 = 8;
    unsigned char u9 = 9;
    
    // integral promotion:
    int Shift7 = ByteVal << u7;     //  32640
    int Shift8 = ByteVal << u8;     //  65280
    int Shift9 = ByteVal << u9;     // 130560
    

    VB

    Dim ByteVal As Byte = 255
    Dim u7 As Byte = 7
    Dim u8 As Byte = 8
    Dim u9 As Byte = 9
    
    'using the 3 lowest bits:
    Dim Shift7 As Integer = ByteVal << u7   '128
    Dim Shift8 As Integer = ByteVal << u8   '255
    Dim Shift9 As Integer = ByteVal << u9   '254
    

    Solution 2

    This is not exactly the OP's code. The fact is that the question was asked in such detail that by the time I finished reading it, my code had already been invented. I didn't notice that there is an implementation of NextBits method in the scrollbox at the end. Therefore, I decided to present and finalize my implementation, because it is fully operational and ready for use:

    Function NextBits(byteArray() As Byte,
                      startPosition As Integer, numBits As Integer)
    
        Dim totalNumBits = byteArray.Length * 8
        Dim endPosition = startPosition + numBits
    
    #If NET8_0_OR_GREATER Then
        ArgumentOutOfRangeException.ThrowIfNegative(startPosition)
        ArgumentOutOfRangeException.ThrowIfGreaterThan(startPosition, totalNumBits)
        ArgumentOutOfRangeException.ThrowIfNegative(numBits)
        ArgumentOutOfRangeException.ThrowIfGreaterThan(endPosition, totalNumBits)
    #Else
        If startPosition < 0 Then Throw New ArgumentOutOfRangeException
        If startPosition > totalNumBits Then Throw New ArgumentOutOfRangeException
        If numBits < 0 Then Throw New ArgumentOutOfRangeException
        If endPosition > totalNumBits Then Throw New ArgumentOutOfRangeException
    #End If
    
        Dim s_offs = startPosition \ 8
        Dim s_mod = startPosition Mod 8
        Dim s_mod_inv = 8 - s_mod
    
        Dim e_offs = endPosition \ 8
        Dim e_mod = endPosition Mod 8
    
        Dim GetByte = Function(i) _
            ((byteArray(i) << s_mod) _
            Or (byteArray(i + 1) >> s_mod_inv)) _
            And &HFF
    
        Dim GetLastByte =
            Function(i)
                Dim b = byteArray(i) << s_mod
                Dim i2 = i + 1
                If i2 < byteArray.Length Then
                    b = b Or (byteArray(i2) >> s_mod_inv)
                End If
                Return b And &HFF
            End Function
    
        Dim sum
    
        If s_mod = 0 Then
            sum = byteArray.Skip(s_offs).Take(e_offs - s_offs).Sum(Function(b) b)
            If e_mod <> 0 Then sum += byteArray(e_offs) And (&HFFFFFF00 >> e_mod)
        ElseIf s_mod >= e_mod Then
            sum = Enumerable.Range(s_offs, e_offs - s_offs - 1).Sum(GetByte)
            sum += GetLastByte(e_offs - 1) And (&HFF << (s_mod - e_mod))
        Else
            sum = Enumerable.Range(s_offs, e_offs - s_offs).Sum(GetByte)
            sum += GetLastByte(e_offs) And (&HFFFFFF00 >> (e_mod - s_mod))
        End If
    
        Return sum
    
    End Function
    

    Both solutions work within a single byte:

    ' 0101 0110
    '   __ __ 
    '----------
    '        86
    Console.WriteLine("NextBits 80 = " & NextBits({86}, 2, 4))