cbit-manipulationbit-shiftinteger-arithmeticsigned-integer

How to implement arithmetic right shift in C


Many lossless algorithms in signal processing require an evaluation of the expression of the form ⌊ a / 2b ⌋, where a, b are signed (a possibly negative, b non-negative) integers and ⌊·⌋ is the floor function. This usually leads to the following implementation.

int floor_div_pow2(int numerator, int log2_denominator)
{
    return numerator >> log2_denominator;
}

Unfortunately, the C standard states that the result of the >> operator is implementation-defined if the left operand has a signed type and a negative value.

To ensure the correct behaviour on all platforms, one could replace this simple function with multiple if-else conditions, resulting in poor program performance. (One must to treat an overflow of an integer and consider the case when the numerator is INT_MIN.)

Therefore I ask, what is the best practice for implementation of the arithmetic right shift in C? Ideally, I'm looking for the construct that compiles into the same code1 as the code fragment above while avoiding the implementation-defined behaviour.

1 considering, e.g., gcc and x86-64 platform

UPDATE:

After some thought, I realized that I made an incorrect implication in the above question. Computing the floor function for negative numbers using an arithmetic shift doesn't make sense if the platform doesn't use two's complement. The goal is to implement the expression ⌊ a / 2b ⌋ in a portable way.


Solution

  • #define USES_ARITHMETIC_SHR(TYPE) ((TYPE)(-1) >> 1 == (TYPE)(-1))
    
    int asr(int value, int amount) /* Better codegen on some older compilers */
    {
        return !USES_ARITHMETIC_SHR(int) && value < 0 ? ~(~value >> amount) : value >> amount ;
    }
    
    int asr2(int value, int amount) /* Completely portable */
    {
        return value < 0 ? ~(~value >> amount) : value >> amount ;
    }
    

    This code decides whether to just use the builtin >> operator or not first. You might want to either trust or not trust the preprocessor giving you the same result as the target architecture, but a safe fallback is to not trust it.

    Let's explain the value < 0 ? ~(~value >> amount) : value >> amount part.

    1. If value >= 0 then it doesn't matter whether >> is logical or arithmetic, we can use it.
    2. If value < 0 then ~value is the bitwise complement which will be a positive number and (~value >> amount) will be portable (the top amount number of bits will be cleared, the rest shifted right as expected).
      ~(~value >> amount) will flip all the bits back, including flipping the top amount number of zeroes to ones which is exactly what you want with arithmetic right shifting.

    The code assuming USES_ARITHMETIC_SHR(int) == true compiles with -O2 into:

    asr(int, int): // x86-64 GCC 4.4.7
        mov     eax, edi
        mov     ecx, esi
        sar     eax, cl
        ret
    asr(int, int): // x86-64 Clang 3.4.1
        mov     cl, sil
        sar     edi, cl
        mov     eax, edi
        ret
    asr(int, int): // ARM GCC 4.5.4
        mov     r0, r0, asr r1
        bx      lr
    

    This should be portable but I'm also unsure if it pedantically really is. If you aren't either, you can #define USES_ARITHMETIC_SHR(TYPE) false or just omit checking it and only check value < 0. But that results in less optimal code on the some older compilers.

    The newest version of the compilers (GCC 8+, Clang 7+) compile both versions, asr and asr2 to the same, efficient assembly as above, so you can use either versions of the code. Below is how older compilers do with asr2, a very portable solution.

    asr2(int, int): // x86-64 GCC 4.4.7
        test    edi, edi
        js      .L8
        mov     eax, edi
        mov     ecx, esi
        sar     eax, cl
        ret
      .L8:
        mov     eax, edi
        mov     ecx, esi
        not     eax
        sar     eax, cl
        not     eax
        ret
    asr2(int, int): // x86-64 Clang 3.4.1
        mov     cl, sil
        sar     edi, cl
        mov     eax, edi
        ret
    asr2(int, int): // ARM GCC 4.5.4
        cmp     r0, #0
        mvnlt   r0, r0
        mvnlt   r0, r0, asr r1
        movge   r0, r0, asr r1
        bx      lr