Currently working on a Steganography project where, given a message in bytes and the number of bits to modify per byte, hide a message in an arbitrary byte array.
In the first decoded byte of the resulting message, the value has it's first (leftmost) bit set to '1' instead of '0'. For example, when using message "Foo".getBytes()
and maxBits = 1
the result is "Æoo", not "Foo" (0b01000110 gets changed to 0b11000110). With message "Æoo".getBytes()
and maxBits = 1
result is "Æoo", meaning the bit is not getting flipped as far as I can tell.
Only certain values of maxBits
for certain message bytes cause this error, for example "Foo"
encounters this problem at maxBits
equal to 1, 5, and 6, whereas "Test"
encounters this problem at maxBits
equal to 1, 3, and 5. Only the resulting first character ends up with its first bit set, and this problem only occurs at the specified values of this.maxBits
related to the initial data.
maxBits
, is the first bit of the
resulting decoded message always 1?maxBits
that
work fine, and others that do not?maxBits
and the
resulting erroneous results in relation to the original data?Encode and Decode Methods:
public byte[] encodeMessage(byte[] data, byte[] message) {
byte[] encoded = data;
boolean[] messageBits = byteArrToBoolArr(message);
int index = 0;
for (int x = 0; x < messageBits.length; x++) {
encoded[index] = messageBits[x] ? setBit(encoded[index], x % this.maxBits) : unsetBit(encoded[index], x % this.maxBits);
if (x % this.maxBits == 0 && x != 0)
index++;
}
return encoded;
}
public byte[] decodeMessage(byte[] data) {
boolean[] messageBits = new boolean[data.length * this.maxBits];
int index = 0;
for (int x = 0; x < messageBits.length; x++) {
messageBits[x] = getBit(data[index], x % this.maxBits);
if (x % this.maxBits == 0 && x != 0)
index++;
}
return boolArrToByteArr(messageBits);
}
Unset, Set, and Get Methods:
public byte unsetBit(byte data, int pos) {
return (byte) (data & ~((1 << pos)));
}
public byte setBit(byte data, int pos) {
return (byte) (data | ((1 << pos)));
}
public boolean getBit(byte data, int pos) {
return ((data >>> pos) & 0x01) == 1;
}
Conversion Methods:
public boolean[] byteArrToBoolArr(byte[] b) {
boolean bool[] = new boolean[b.length * 8];
for (int x = 0; x < bool.length; x++) {
bool[x] = false;
if ((b[x / 8] & (1 << (7 - (x % 8)))) > 0)
bool[x] = true;
}
return bool;
}
public byte[] boolArrToByteArr(boolean[] bool) {
byte[] b = new byte[bool.length / 8];
for (int x = 0; x < b.length; x++) {
for (int y = 0; y < 8; y++) {
if (bool[x * 8 + y]) {
b[x] |= (128 >>> y);
}
}
}
return b;
}
Sample Code and Output:
test("Foo", 1);//Æoo
test("Foo", 2);//Foo
test("Foo", 3);//Foo
test("Foo", 4);//Foo
test("Foo", 5);//Æoo
test("Foo", 6);//Æoo
test("Foo", 7);//Foo
test("Foo", 8);//Foo
test("Test", 1);//Ôest
test("Test", 2);//Test
test("Test", 3);//Ôest
test("Test", 4);//Test
test("Test", 5);//Ôest
test("Test", 6);//Test
test("Test", 7);//Test
test("Test", 8);//Test
private static void test(String s, int x) {
BinaryModifier bm = null;
try {
bm = new BinaryModifier(x);//Takes maxBits as constructor param
} catch (BinaryException e) {
e.printStackTrace();
}
System.out.println(new String(bm.decodeMessage(bm.encodeMessage(new byte[1024], s.getBytes()))));
return;
}
Your logic of incrementing index
has two flaws, which overwrite the first bit of the first letter. Obviously, the bug is expressed when the overwriting bit is different to the first bit.
if (x % this.maxBits == 0 && x != 0)
index++;
The first problem has to do with embedding only one bit per byte, i.e. maxBits = 1
. After you have embedded the very first bit and reached the above conditional, x
is still 0, since it will be incremented at the end of the loop. You should be incrementing index
at this point, but x != 0
prevents you from doing so. Therefore, the second bit will also be embedded in the first byte, effectively overwriting the first bit. Since this logic also exists in the decode method, you read the first two bits from the first byte.
More specifically, if you embed a 00
or 11
, it will be fine. But a 01
will be read as 11
and a 10
will be read as 00
, i.e., whatever value is the second bit. If the first letter has an ascii code less or equal than 63 (00xxxxxx), or greater or equal than 192 (11xxxxxx), it will come out fine. For example:
# -> # : 00100011 (35) -> 00100011 (35)
F -> Æ : 01000110 (70) -> 11000110 (198)
The second problem has to do with the x % this.maxBits == 0
part. Consider the case where we embed 3 bits per byte. After the 3rd bit, when we reach the conditional we still have x = 2
, so the modulo operation will return false. After we have embedded a 4th bit, we do have x = 3
and we're allowed to move on to the next byte. However, this extra 4th bit will be written at the 0th position of the first byte, since x % this.maxBits
will be 3 % 3
. So again, we have a bit overwriting our very first bit. However, after the first cycle the modulo operation will correctly write only 3 bits per byte, so the rest of our message will be unaffected.
Consider the binary for "F", which is 01000110. By embedding N bits per byte, we effectively embed the following groups in the first few bytes.
1 bit 01 0 0 0 1 1 0
2 bits 010 00 11 0x
3 bits 0100 011 0xx
4 bits 01000 110x
5 bits 010001 10xxxx
6 bits 0100011 0xxxxx
7 bits 01000110
8 bits 01000110x
As you can see, for groups of 5 and 6 bits, the last bit of the first group is 1, which will overwrite our initial 0 bit. For all other cases the overwrite doesn't affect anything. Note that for 8 bits, we end up using the first bit of the second letter. If that happened to have an ascii code greater or equal than 128, it would again overwrite the firstmost 0 bit.
To address all problems, use either
for (int x = 0; x < messageBits.length; x++) {
// code in the between
if ((x + 1) % this.maxBits == 0)
index++;
}
or
for (int x = 0; x < messageBits.length; ) {
// code in the between
x++;
if (x % this.maxBits == 0)
index++;
}
Your code has another potential problem which hasn't been expressed. If your data
array has a size of 1024, but you only embed 3 letters, you will affect only the first few bytes, depending on the value of maxBits
. However, for the extraction, you define your array to have a size of data.length * this.maxBits
. So you end up reading bits from all of the bytes of the data
array. This is currently no problem, because your array is populated by 0s, which are converted to empty strings. However, if your array had actual numbers, you'd end up reading a lot of garbage past the point of your embedded data.
There are two general ways of addressing this. You either