c++coptimizationbit-manipulationsaturation-arithmetic

Saturating subtract/add for unsigned bytes


Imagine I have two unsigned bytes b and x. I need to calculate bsub as b - x and badd as b + x. However, I don't want underflow/overflow occur during these operations. For example (pseudo-code):

b = 3; x = 5;
bsub = b - x; // bsub must be 0, not 254

and

b = 250; x = 10;
badd = b + x; // badd must be 255, not 4

The obvious way to do this includes branching:

bsub = b - min(b, x);
badd = b + min(255 - b, x);

I just wonder if there are any better ways to do this, i.e. by some hacky bit manipulations?


Solution

  • The article Branchfree Saturating Arithmetic provides strategies for this:

    Their addition solution is as follows:

    u32b sat_addu32b(u32b x, u32b y)
    {
        u32b res = x + y;
        res |= -(res < x);
    
        return res;
    }
    

    modified for uint8_t:

    uint8_t  sat_addu8b(uint8_t x, uint8_t y)
    {
        uint8_t res = x + y;
        res |= -(res < x);
    
        return res;
    }
    

    and their subtraction solution is:

    u32b sat_subu32b(u32b x, u32b y)
    {
        u32b res = x - y;
        res &= -(res <= x);
    
        return res;
    }
    

    modified for uint8_t:

    uint8_t sat_subu8b(uint8_t x, uint8_t y)
    {
        uint8_t res = x - y;
        res &= -(res <= x);
    
        return res;
    }