Tags: c++, visual-c++, type-conversion, intrinsics, sign-extension

_mm256_movemask_epi8 to uint64_t


Can someone please explain why tr2 and tr4 show different results:

    auto test1 = _mm256_set1_epi8(-1);

    uint64_t tr2 = _mm256_movemask_epi8(test1);
    uint32_t tr3 = _mm256_movemask_epi8(test1);
    uint64_t tr4 = tr3;

_mm256_movemask_epi8(test1) should return an int32, so assigning it to a uint64 should just fill the lower 32 bits.

Instead, tr2 prints 0xFFFFFFFFFFFFFFFF and tr4 prints 0x00000000FFFFFFFF

Is there any performance cost in doing it as tr4?

I'm new to both C++ and intrinsics so maybe I'm missing something obvious.

I'm using Visual Studio 2019 C++ compiler.


Solution

  • As Paul said above, this has to do with sign extension when a signed value is converted to a wider integer type. Here's an example:

    #include <cstdint>
    #include <iostream>
    #include <iomanip>
    
    int main()
    {
        int32_t negInt = -1;
        uint32_t unInt = static_cast<uint32_t>(negInt);       // same bits, reinterpreted as unsigned
        int64_t negBigInt = static_cast<int64_t>(negInt);     // signed source: sign-extended
        uint64_t unBigInt = static_cast<uint64_t>(negInt);    // signed source: still sign-extended
        uint64_t fromUnsigned = static_cast<uint64_t>(unInt); // unsigned source: zero-extended
    
        std::cout << std::hex;
        std::cout << "0x" << std::setfill('0') << std::setw(16) << negInt << "\n";
        std::cout << "0x" << std::setfill('0') << std::setw(16) << unInt << "\n";
        std::cout << "0x" << std::setfill('0') << std::setw(16) << negBigInt << "\n";
        std::cout << "0x" << std::setfill('0') << std::setw(16) << unBigInt << "\n";
        std::cout << "0x" << std::setfill('0') << std::setw(16) << fromUnsigned << "\n";
    }
    

    This prints:

    0x00000000ffffffff
    0x00000000ffffffff
    0xffffffffffffffff
    0xffffffffffffffff
    0x00000000ffffffff
    

    So Paul is right: widening sign-extends whenever the *source* type is signed, regardless of the destination's signedness. Notably, it doesn't happen when the source is unsigned, as with fromUnsigned above, which zero-extends instead.
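    Applied to the original question: converting the movemask result to uint32_t first (as tr3/tr4 do) makes the subsequent widening zero-extend. A minimal sketch of the same pattern using plain integers, with -1 standing in for the all-lanes-set mask that _mm256_movemask_epi8 returns (no AVX2 needed to demonstrate the conversion rules):

    ```cpp
    #include <cassert>
    #include <cstdint>

    int main()
    {
        int mask = -1;  // stand-in for _mm256_movemask_epi8 with all lanes set

        // Widening directly sign-extends, because the source type (int) is signed.
        uint64_t widenedSigned = static_cast<uint64_t>(mask);
        assert(widenedSigned == 0xFFFFFFFFFFFFFFFFull);

        // Converting to uint32_t first makes the widening zero-extend.
        uint64_t widenedUnsigned = static_cast<uint32_t>(mask);
        assert(widenedUnsigned == 0x00000000FFFFFFFFull);
    }
    ```

    Both conversions compile to at most one instruction on x64 (movsxd vs. mov of the 32-bit register, which implicitly zeroes the upper half), so there is no meaningful performance difference between the two.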