[SOLVED] Redistribute least significant bits from a 4-byte array to a nibble

I wish to move bits 0,8,16,24 of a 32-bit value to bits 0,1,2,3 respectively. All other bits in the input and output will be zero.

Obviously I can do that like this:

c = (c>>21) + (c>>14) + (c>>7) + c;  /* parentheses needed: + binds tighter than >> in C */
c &= 0xF;

But is there a faster (fewer instructions) way?

c = (((c&BITS_0_8_16_24) * BITS_0_7_14_21) >> 21) & 0xF;
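The constants are not spelled out in the answer. Here is a minimal sketch, assuming each name lists the bits that are set (so BITS_0_8_16_24 would be 0x01010101 and BITS_0_7_14_21 would be 0x00204081) and that c is a 32-bit unsigned value. The function names gather_lsbs_mul and self_check_mul are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values: each mask has exactly the bits named in its identifier set. */
#define BITS_0_8_16_24 UINT32_C(0x01010101)
#define BITS_0_7_14_21 UINT32_C(0x00204081)

/* Hypothetical helper: gathers bits 0, 8, 16, 24 of c into bits 0..3. */
static uint32_t gather_lsbs_mul(uint32_t c)
{
    /* The multiply adds c<<0, c<<7, c<<14 and c<<21. The shifted copies
       never share a bit position, so no carries occur, and input bits
       0, 8, 16, 24 land on bits 21, 22, 23, 24 of the product. */
    return (((c & BITS_0_8_16_24) * BITS_0_7_14_21) >> 21) & 0xF;
}

/* Checks all 16 combinations of the four source bits. */
static void self_check_mul(void)
{
    for (uint32_t n = 0; n < 16; n++) {
        /* Spread nibble n back out to bits 0, 8, 16, 24. */
        uint32_t c = (n & 1) | ((n & 2) << 7) | ((n & 4) << 14) | ((n & 8) << 21);
        assert(gather_lsbs_mul(c) == n);
    }
}
```

Because of the leading mask, this variant should also tolerate inputs whose other bits are not zero.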

Or wait for the Intel Haswell processor, which does all of this in exactly one instruction (pext).
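For reference, a rough sketch of the pext route, assuming a compiler that exposes the BMI2 intrinsic _pext_u32 through immintrin.h and defines __BMI2__ when the instruction set is enabled (GCC and Clang do this with -mbmi2). The helper name gather_lsbs_pext and the bit-by-bit fallback are illustrative additions, not part of the answer.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__BMI2__)
#include <immintrin.h>
#endif

/* Hypothetical helper: gathers bits 0, 8, 16, 24 of c into bits 0..3. */
static uint32_t gather_lsbs_pext(uint32_t c)
{
    const uint32_t mask = UINT32_C(0x01010101); /* bits 0, 8, 16, 24 */
#if defined(__BMI2__)
    /* PEXT packs the bits selected by mask into the low bits of the result. */
    return _pext_u32(c, mask);
#else
    /* Portable fallback that emulates PEXT one mask bit at a time. */
    uint32_t out = 0;
    uint32_t pos = 0;
    for (uint32_t m = mask; m != 0; m &= m - 1) {
        uint32_t lowest = m & (~m + 1); /* isolate the lowest set mask bit */
        if (c & lowest)
            out |= UINT32_C(1) << pos;
        pos++;
    }
    return out;
#endif
}

/* Checks all 16 combinations of the four source bits. */
static void self_check_pext(void)
{
    for (uint32_t n = 0; n < 16; n++) {
        uint32_t c = (n & 1) | ((n & 2) << 7) | ((n & 4) << 14) | ((n & 8) << 21);
        assert(gather_lsbs_pext(c) == n);
    }
}
```

Like the masked multiply, pext ignores every input bit outside the mask, so no separate AND is needed.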

Update

Taking the clarified constraints into account (all other input bits are zero) and assuming 32-bit unsigned values, the code can be simplified to this:

c = (c * BITS_7_14_21_28) >> 28;
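A minimal sketch of this simplified form, assuming BITS_7_14_21_28 means a constant with exactly bits 7, 14, 21 and 28 set (0x10204080). The names gather_lsbs_mul28 and self_check_mul28 are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value: bits 7, 14, 21 and 28 set. */
#define BITS_7_14_21_28 UINT32_C(0x10204080)

/* Hypothetical helper: gathers bits 0, 8, 16, 24 of c into bits 0..3.
   Only valid when every other bit of c is already zero. */
static uint32_t gather_lsbs_mul28(uint32_t c)
{
    /* The product is truncated to 32 bits. Input bits 0, 8, 16, 24 land on
       bits 28, 29, 30, 31, so the shift alone isolates the nibble and no
       mask is needed before or after the multiply. */
    return (uint32_t)(c * BITS_7_14_21_28) >> 28;
}

/* Checks all 16 combinations of the four source bits. */
static void self_check_mul28(void)
{
    for (uint32_t n = 0; n < 16; n++) {
        uint32_t c = (n & 1) | ((n & 2) << 7) | ((n & 4) << 14) | ((n & 8) << 21);
        assert(gather_lsbs_mul28(c) == n);
    }
}
```

The saving over the earlier version comes from dropping both AND operations: the input is assumed clean, and the wanted bits sit at the very top of the 32-bit product.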