I wish to move bits 0,8,16,24 of a 32-bit value to bits 0,1,2,3 respectively. All other bits in the input and output will be zero.
Obviously I can do that like this:
c = (c >> 21) + (c >> 14) + (c >> 7) + c;
c &= 0xF;
But is there a faster (fewer instructions) way?
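A single multiplication can perform all four shifts at once:
c = (((c&BITS_0_8_16_24) * BITS_0_7_14_21) >> 21) & 0xF;
Here BITS_0_8_16_24 is 0x01010101 and BITS_0_7_14_21 is 0x00204081; each constant is named for the bits it has set.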
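To see why the multiplication works: multiplying by a constant with bits 0, 7, 14 and 21 set adds four shifted copies of the masked input, and the copies land on distinct bit positions, so no carries occur. Bits 0, 8, 16, 24 end up at positions 21, 22, 23, 24. The sketch below is one way to check this against the shift-based version; the function names are made up for illustration, and the constant values are derived from the names used in the answer.

```c
#include <stdint.h>

/* Constant names follow the answer; the values are derived from the names. */
#define BITS_0_8_16_24 0x01010101u  /* input bits 0, 8, 16, 24 */
#define BITS_0_7_14_21 0x00204081u  /* multiplier bits 0, 7, 14, 21 */

/* Reference version: one shift per bit, as in the question. */
uint32_t gather_shift(uint32_t c)
{
    c &= BITS_0_8_16_24;
    c = (c >> 21) + (c >> 14) + (c >> 7) + c;
    return c & 0xFu;
}

/* Multiplication version: mask, multiply, then pick bits 21..24. */
uint32_t gather_mul(uint32_t c)
{
    uint32_t product = (uint32_t)((c & BITS_0_8_16_24) * BITS_0_7_14_21);
    return (product >> 21) & 0xFu;
}

/* Returns 1 if both versions agree on all 16 combinations of the four bits. */
int gather_check(void)
{
    for (uint32_t want = 0; want < 16; ++want) {
        /* Spread the 4-bit pattern out to bits 0, 8, 16, 24. */
        uint32_t c = (want & 1u) | ((want & 2u) << 7)
                   | ((want & 4u) << 14) | ((want & 8u) << 21);
        if (gather_shift(c) != want || gather_mul(c) != want)
            return 0;
    }
    return 1;
}
```

Calling gather_check() from a test harness should return 1 if the reasoning above holds on your platform.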
Or wait for Intel's Haswell processors, which can do all of this in exactly one instruction (pext, part of the BMI2 extension).
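For reference, compilers expose pext through the _pext_u32 intrinsic in <immintrin.h>. The following is a rough sketch, assuming a compiler that defines __BMI2__ when the extension is enabled (for example GCC or Clang with -mbmi2); the wrapper name gather_pext and the portable fallback loop are illustrative additions, not part of the original answer.

```c
#include <stdint.h>
#if defined(__BMI2__)
#include <immintrin.h>
#endif

/* Gather bits 0, 8, 16, 24 of c into bits 0..3. */
uint32_t gather_pext(uint32_t c)
{
#if defined(__BMI2__)
    /* One instruction: extract the bits selected by the mask and pack them. */
    return _pext_u32(c, 0x01010101u);
#else
    /* Portable fallback that emulates pext one mask bit at a time. */
    uint32_t mask = 0x01010101u;
    uint32_t result = 0;
    uint32_t out_bit = 1;
    while (mask != 0) {
        uint32_t lowest = mask & (0u - mask);  /* isolate lowest set mask bit */
        if (c & lowest)
            result |= out_bit;
        out_bit <<= 1;
        mask &= mask - 1;                      /* clear lowest set mask bit */
    }
    return result;
#endif
}
```

Unlike the multiplication approach, pext ignores every bit outside the mask, so the input does not need to be masked first.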
Update
Taking into account clarified constraints and assuming 32-bit unsigned values, the code may be simplified to this:
c = (c * BITS_7_14_21_28) >> 28;
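The simplification relies on two things: the input already has only bits 0, 8, 16, 24 set (so no mask is needed), and the product is truncated to 32 bits (so the partial products above bit 31 vanish and the final & 0xF becomes unnecessary). A minimal sketch, assuming BITS_7_14_21_28 is 0x10204080 as its name suggests; the function names are hypothetical:

```c
#include <stdint.h>

#define BITS_7_14_21_28 0x10204080u  /* multiplier bits 7, 14, 21, 28 */

/* Precondition: only bits 0, 8, 16, 24 of c may be set. */
uint32_t gather_premasked(uint32_t c)
{
    /* The cast keeps the product at 32 bits, discarding anything above bit 31. */
    return (uint32_t)(c * BITS_7_14_21_28) >> 28;
}

/* Returns 1 if all 16 valid inputs map to the expected 4-bit value. */
int gather_premasked_check(void)
{
    for (uint32_t want = 0; want < 16; ++want) {
        /* Spread the 4-bit pattern out to bits 0, 8, 16, 24. */
        uint32_t c = (want & 1u) | ((want & 2u) << 7)
                   | ((want & 4u) << 14) | ((want & 8u) << 21);
        if (gather_premasked(c) != want)
            return 0;
    }
    return 1;
}
```

With a 64-bit type the high partial products would survive the shift, so this form would need a mask again.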