java bytebuffer

Weird behaviour in Java 8 with ByteBuffer and BitSet


I'm new to Java and have started to implement a UDP sender with BitSet and ByteBuffer. For some reason I get behaviour that I did not expect.

import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.BitSet;

public class Main
{
    public static void main(String[] args) {
        
        ByteBuffer out = ByteBuffer.allocate(2);
        BitSet byt = new BitSet(8);
        byt.set(0, true);
        byt.set(1, false);
        out.put(byt.toByteArray());
        byt.set(0, true);
        byt.set(1, false);
        byt.set(2, true);
        out.put(byt.toByteArray());
        
        System.out.println("First byte is " + out.array()[0]+ " second is " + out.array()[1]);
    }
}

where I get the output

First byte is 1 second is 5

which I think is not okay since the endianness is wrong

When I try to run this:

public class Main
{
    public static void main(String[] args) {
        
        ByteBuffer out = ByteBuffer.allocate(2);
        BitSet byt = new BitSet(8);
        byt.set(0, false);
        byt.set(1, false);
        out.put(byt.toByteArray());
        byt.set(0, true);
        byt.set(1, false);
        byt.set(2, true);
        out.put(byt.toByteArray());
        
        System.out.println("First byte is " + out.array()[0]+ " second is " + out.array()[1]);
    }
}

The output changes to

First byte is 5 second is 0

Which I think is the right answer.

Notice that the only change is on line 7 (byt.set(0, false) instead of byt.set(0, true)), yet the order of the bytes also changes.

A quick test is here and here

I'm fairly new to Java. So it could all be a big misunderstanding. Thanks anyway!


Solution

  • BitSet.toByteArray() creates a byte array of the minimum length necessary to represent the BitSet. From the docs:

    https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/BitSet.html#toByteArray()

    byte[] bytes = s.toByteArray();
    then bytes.length == (s.length()+7)/8 and
    s.get(n) == ((bytes[n/8] & (1<<(n%8))) != 0)
    for all n < 8 * bytes.length.
    

    and s.length() is documented as:

    Returns the "logical size" of this BitSet: the index of the highest set bit in the BitSet plus one. Returns zero if the BitSet contains no set bits.

    So, essentially, if your BitSet contains only false bits, length() is 0 and toByteArray() returns an empty array.
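A minimal sketch of that rule, assuming a plain Java 8+ runtime. The class name BitSetLengthDemo and the helper expectedByteCount are made up for illustration; the helper just applies the (s.length()+7)/8 formula quoted above.

```java
import java.util.BitSet;

class BitSetLengthDemo {
    // Hypothetical helper: the array length predicted by the documented formula.
    static int expectedByteCount(BitSet s) {
        return (s.length() + 7) / 8;
    }

    public static void main(String[] args) {
        BitSet bits = new BitSet(8);   // 8 is only an initial sizing hint, not a fixed width
        bits.set(0, false);
        bits.set(1, false);
        System.out.println(bits.length());               // 0
        System.out.println(bits.toByteArray().length);   // 0

        bits.set(2, true);
        System.out.println(bits.length());               // 3
        System.out.println(bits.toByteArray().length);   // 1
        System.out.println(bits.toByteArray()[0]);       // 4
    }
}
```

The constructor argument does not make the BitSet "8 bits wide"; only the highest set bit determines how many bytes come back.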

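    ByteBuffer fills its bytes as you put them in, starting at position 0 and advancing by the number of bytes written. So the order in which the bytes are stored is exactly the order in which you call the .put method, and putting an empty array writes nothing.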
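The sketch below illustrates how relative put calls advance the buffer position, and that putting an empty array leaves the position unchanged. PutOrderDemo and putAll are hypothetical names used only for this example.

```java
import java.nio.ByteBuffer;

class PutOrderDemo {
    // Hypothetical helper: puts each chunk in turn and returns the final position.
    static int putAll(ByteBuffer buffer, byte[]... chunks) {
        for (byte[] chunk : chunks) {
            buffer.put(chunk);   // relative put: writes at the position, then advances by chunk.length
        }
        return buffer.position();
    }

    public static void main(String[] args) {
        ByteBuffer out = ByteBuffer.allocate(2);
        // An empty array followed by a one-byte array, as in the second example.
        int position = putAll(out, new byte[0], new byte[] {5});
        System.out.println(position);         // 1
        System.out.println(out.array()[0]);   // 5
        System.out.println(out.array()[1]);   // 0 (never written, still the initial zero)
    }
}
```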

    Now let's follow your first code example:

    Your ByteBuffer of size 2, initially:

    | x | x |
      0   1
    

    x means the slot has not been written yet (ByteBuffer.allocate initializes it to 0)

    Then you create a BitSet: [true, false, false, false, false, false, false, false]. toByteArray() turns it into the array [1], which you then put into your ByteBuffer:

    | 1 | x |
      0   1
    

    The first entry now holds 1.

    Then you create the second BitSet: [true, false, true, false, false, false, false, false], which is [5] (2^0 + 2^2 = 1 + 4 = 5). Then you put it into the ByteBuffer:

    | 1 | 5 |
      0   1
    

    So your output is 'First byte is 1 second is 5'. And that is correct.
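To check the 2^0 + 2^2 = 5 arithmetic, the sketch below reads individual bits back out of the byte array with the formula quoted from the docs. BitLayoutDemo and bitAt are illustrative names, not part of the JDK.

```java
import java.util.BitSet;

class BitLayoutDemo {
    // Hypothetical helper: reads bit n using the formula from the BitSet docs.
    static boolean bitAt(byte[] bytes, int n) {
        return (bytes[n / 8] & (1 << (n % 8))) != 0;
    }

    public static void main(String[] args) {
        BitSet byt = new BitSet(8);
        byt.set(0, true);
        byt.set(1, false);
        byt.set(2, true);

        byte[] bytes = byt.toByteArray();
        System.out.println(bytes[0]);          // 5
        System.out.println(bitAt(bytes, 0));   // true  (contributes 2^0 = 1)
        System.out.println(bitAt(bytes, 1));   // false
        System.out.println(bitAt(bytes, 2));   // true  (contributes 2^2 = 4)
    }
}
```

Bit index 0 is the least significant bit of the first byte, which is why bits 0 and 2 give 5.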

    which I think is not okay since the endianness is wrong

    I don't think your problem has anything to do with endianness. If you expected to see 'First byte is 5 second is 1', then switch the order in which you add the bytes to the ByteBuffer.
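As a rough illustration of why endianness does not come into play here: a ByteBuffer's byte order only affects multi-byte values such as putShort, not single bytes or byte arrays. The class ByteOrderDemo and its two helpers are assumptions made for this sketch.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

class ByteOrderDemo {
    // Hypothetical helper: writes the bytes 1 and 5 one at a time.
    static byte[] putBytes(ByteOrder order) {
        ByteBuffer b = ByteBuffer.allocate(2).order(order);
        b.put((byte) 1).put((byte) 5);
        return b.array();
    }

    // Hypothetical helper: writes the same two bytes as one 16-bit value.
    static byte[] putShort(ByteOrder order) {
        ByteBuffer b = ByteBuffer.allocate(2).order(order);
        b.putShort((short) 0x0105);
        return b.array();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(putBytes(ByteOrder.BIG_ENDIAN)));     // [1, 5]
        System.out.println(Arrays.toString(putBytes(ByteOrder.LITTLE_ENDIAN)));  // [1, 5]
        System.out.println(Arrays.toString(putShort(ByteOrder.BIG_ENDIAN)));     // [1, 5]
        System.out.println(Arrays.toString(putShort(ByteOrder.LITTLE_ENDIAN)));  // [5, 1]
    }
}
```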

    Now, let's follow the second example:

    Your ByteBuffer of size 2, initially:

    | x | x |
      0   1
    

    Then you create a BitSet with [false, false, false, false, false, false, false, false], which is 0 in binary. But BitSet.toByteArray() will produce an empty array because the BitSet has no set bits.

    So there is no change to the ByteBuffer; it's still empty:

    | x | x |
      0   1
    

    The second BitSet is [true, false, true, false, false, false, false, false], which is [5]. After putting it into the ByteBuffer:

    | 5 | x |
      0   1
    

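    Here, because the first BitSet added nothing, the byte from the second BitSet lands in slot 0. Slot 1 is never written and still holds its initial value 0, which gives 'First byte is 5 second is 0'.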

    So your problem originates from how BitSet.toByteArray() sizes the array it returns. You might consider using some other class instead of BitSet; see this SO question for some hints: Java BitSet and byte[] usage

    [edit] A fix for the second example is to account for the empty array that BitSet.toByteArray() can return and put a single 0 byte yourself in that case. Of course, the same check can be applied to the second byte.

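        ByteBuffer out = ByteBuffer.allocate(2);
        BitSet byt = new BitSet(8);
        byt.set(0, false);
        byt.set(1, false);

        // No bits are set, so toByteArray() returns an empty array here.
        byte[] byteArr = byt.toByteArray();
        if(byteArr.length == 0) {
            // Write an explicit zero byte so the buffer position still advances.
            out.put((byte)0);
        } else {
            out.put(byteArr);
        }

        byt.set(0, true);
        byt.set(1, false);
        byt.set(2, true);

        // Bits 0 and 2 are set, so this is the one-element array [5].
        byteArr = byt.toByteArray();
        out.put(byteArr);

        // Prints: First byte is 0 second is 5
        System.out.println("First byte is " + out.array()[0]+ " second is " + out.array()[1]);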
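The same idea can be generalized with a small helper that always returns a fixed number of bytes. This is only a sketch: FixedWidthBits and toFixedLengthBytes are hypothetical names, and the approach relies on Arrays.copyOf padding with zeros (or truncating) to the requested length.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.BitSet;

class FixedWidthBits {
    // Hypothetical helper: pads toByteArray() with zero bytes up to `width`.
    // Note: if the BitSet needs more than `width` bytes, the extra bytes are dropped.
    static byte[] toFixedLengthBytes(BitSet bits, int width) {
        return Arrays.copyOf(bits.toByteArray(), width);
    }

    public static void main(String[] args) {
        ByteBuffer out = ByteBuffer.allocate(2);
        BitSet byt = new BitSet(8);

        out.put(toFixedLengthBytes(byt, 1));   // no bits set: writes a single 0 byte

        byt.set(0, true);
        byt.set(2, true);
        out.put(toFixedLengthBytes(byt, 1));   // bits 0 and 2 set: writes 5

        // Prints: First byte is 0 second is 5
        System.out.println("First byte is " + out.array()[0] + " second is " + out.array()[1]);
    }
}
```

With this helper every put writes exactly one byte, so the position in the ByteBuffer no longer depends on which bits happen to be set.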