Tags: optimization, signal-processing, simd, saturation-arithmetic

Saturate a 16-bit signed integer to 12-bit signed


I'm working with an SDR that has a 12-bit signed ADC/DAC and stores IQ samples as 16-bit integers. I want to ensure that, after all the DSP is done, the samples saturate at 12 bits instead of being truncated by the SDR.

This is the equivalent C++ code:

    for (int i = 0; i < block_size_with_header; i++) {
        if (floatSamples[i].real() > 2047)
            floatSamples[i].real(2047);
        if (floatSamples[i].imag() > 2047)
            floatSamples[i].imag(2047);
        if (floatSamples[i].real() < -2048)
            floatSamples[i].real(-2048);
        if (floatSamples[i].imag() < -2048)
            floatSamples[i].imag(-2048);
    }

Is there a faster way to do this using SIMD or assembly? I've seen questions on here about saturating to 16 bits or 8 bits, but not 12.

Thanks.


Solution

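  • One useful property of clamping is that it is idempotent: applying it twice doesn’t change the output, i.e. clamp( clamp( x ) ) == clamp( x ) for all x. This greatly simplifies handling of the remainder, because the final vector is allowed to overlap elements that were already clamped. Here’s an AVX2 example, untested.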

    #include <stddef.h>    // size_t
    #include <stdint.h>
    #include <algorithm>   // std::min, std::max
    #include <immintrin.h>
    
    // Clamp 16 int16_t numbers in memory to the specified min/max values
    inline void clamp16( int16_t* ptr, __m256i min, __m256i max )
    {
        __m256i v = _mm256_loadu_si256( ( const __m256i* )ptr );
        v = _mm256_min_epi16( v, max );
        v = _mm256_max_epi16( v, min );
        _mm256_storeu_si256( ( __m256i* )ptr, v );
    }
    
    void saturate12bits_avx2( int16_t* ptr, size_t length )
    {
        if( length >= 16 )
        {
            const __m256i max = _mm256_set1_epi16( 2047 );
            const __m256i min = _mm256_set1_epi16( -2048 );
    
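            // Stop the loop early so the remainder is always [ 1 .. 16 ] elements;
            // this saves a branch testing for no remainder.
            // The final vector may overlap the previous one, which is harmless
            // because clamping an already clamped value changes nothing.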
            int16_t* const last = ptr + length - 16;
            for( ; ptr < last; ptr += 16 )
                clamp16( ptr, min, max );
            clamp16( last, min, max );
        }
        else
        {
            // Very small input, too short for a full AVX2 vector; use scalar code
            int16_t* const end = ptr + length;
            for( ; ptr < end; ptr++ )
            {
                int16_t i = *ptr;
                i = std::min( i, (int16_t)2047 );
                i = std::max( i, (int16_t)-2048 );
                *ptr = i;
            }
        }
    }
    

    The input pointer doesn’t need to be aligned. Still, when it is aligned to 32 bytes, the function will run slightly faster.
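
The overlapping-tail trick does not depend on AVX2 itself. As a rough sketch only, here is the same structure in portable C++ with a fixed-size block of 16 samples, which can serve as a reference when testing the intrinsics version or on targets without AVX2. The names `clamp_block16` and `saturate12bits_portable` are made up for this illustration and are not part of the answer above. Whether the compiler auto-vectorizes the inner loop is an assumption that depends on the compiler and flags.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <algorithm>

// Hypothetical helper: clamp exactly 16 samples to the signed 12-bit range.
// The fixed trip count and branch-free body keep the loop simple for the compiler.
inline void clamp_block16( int16_t* p )
{
    for( int k = 0; k < 16; k++ )
        p[ k ] = std::max<int16_t>( std::min<int16_t>( p[ k ], 2047 ), -2048 );
}

// Hypothetical portable counterpart with the same loop structure as the AVX2 version
void saturate12bits_portable( int16_t* ptr, size_t length )
{
    if( length >= 16 )
    {
        // Leave a remainder of [ 1 .. 16 ] samples for the final block
        int16_t* const last = ptr + length - 16;
        for( ; ptr < last; ptr += 16 )
            clamp_block16( ptr );
        // This block may overlap samples that were already clamped;
        // that is safe because clamping twice gives the same result.
        clamp_block16( last );
    }
    else
    {
        // Fewer than 16 samples: plain scalar loop
        for( size_t k = 0; k < length; k++ )
            ptr[ k ] = std::max<int16_t>( std::min<int16_t>( ptr[ k ], 2047 ), -2048 );
    }
}
```

For interleaved IQ data stored as int16_t pairs, the buffer can be passed as one flat array whose length is twice the number of complex samples, since I and Q are clamped to the same range.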