
SSE3/SSSE3 + AES/RDRAND/RDSEED under Sun Studio


I'm working with Sun Studio 12.3 on SunOS 5.11 (Solaris 11.3). I'm tuning a script that runs negative tests, including odd combinations of CPU features. We do this to understand whether and how we fail, and to make sure there are no surprises.

I'm trying to figure out a way to enable the native instruction set plus AES, RDRAND and RDSEED. The native instruction set comes courtesy of a Xeon 5100, which is effectively SSE3/SSSE3 plus some additional instructions.

Compiling all source files with /opt/solarisstudio12.3/bin/CC -DNDEBUG -g3 -xO2 -template=no%extdef -native -m64 -KPIC -xarch=aes -D__AES__=1 results in:

$ ./cryptest.exe
ld.so.1: cryptest.exe: fatal: cryptest.exe: hardware capability (CA_SUNW_HW_1) unsupported: 0x1000000  [ SSE4.2 ]
Killed

This is somewhat expected, because Sun Studio assumes a progression of features and availability. When I modify the makefile so that only cpu.cpp (used for feature tests), rijndael.cpp (provides the AES implementation), and test.cpp (performs the testing) are built with -xarch=aes, the program still fails because SSE4 is creeping into test.cpp.

I tried -xarch=aes -D__AES__=1 -xarch=no%sse4_1 -xarch=no%sse4_2 to remove the unwanted instruction sets but, as expected, it failed to compile. I guessed at no%sse4_1 by analogy with -template=no%extdef, because the no% prefix appears to be the way to turn things off.

How do I use SSE3/SSSE3 with the addition of AES/RDRAND/RDSEED under Sun Studio? Is it even possible?


The pattern we use, which has worked well up until now, is to combine compile time support with runtime support. So AES code will look like:

#if (__AES__ >= 1) || (__SUNPRO_CC >= 0x5120)
# define HAVE_AES 1
#endif

#if defined(HAVE_AES)
if (HasAES())
{
    // Optimized implementation
    ...
    return;
}
#endif
{
    // Fall into C/C++ implementation
    ...
}

For compilers like Clang and GCC, we simply use -march=native -maes -mrdrnd -mrdseed. I was happy to accept that no cross-pollination occurred.

Then I came across two messages on Oracle's message boards indicating RDRAND is broken under Sun Studio 12.3 and 12.4 (here for 12.3 and here for 12.4). So I have to make sure RDRAND is enabled so that it's being tested, and that requires -xarch=aes.


Based on "_mm_aeskeygenassist_si128 intrinsic requires at least -xarch=aes", this may not be possible. This question is effectively due diligence to make sure we are doing everything we can to provide a trouble-free experience.


$ isainfo -v
64-bit amd64 applications
        ssse3 ahf cx16 sse3 sse2 sse fxsr mmx cmov amd_sysc cx8 tsc fpu 
32-bit i386 applications
        ssse3 ahf cx16 sse3 sse2 sse fxsr mmx cmov sep cx8 tsc fpu 

Solution

  • There, I have created an AES+SSSE3 binary for you.

    $ cat tmp.c
    #include <stdint.h>
    #include <emmintrin.h>
    #include <wmmintrin.h>
    int main(int argc, char* argv[])
    {
       // SSE2
       int64_t x[2];
       __m128i y = _mm_loadu_si128((__m128i*)x);
    
       // AES
       __m128i z = _mm_aeskeygenassist_si128(y,0);
    
       return 0;
    }
    
    $ cat tmp2.c
    #include <stdint.h>
    #include <tmmintrin.h>
    void foo(void)
    {       
            __m128i x;
            x       =  _mm_hadd_epi16 (x, x);
    }
    
    $ cc tmp.c tmp2.c -xarch=aes
    tmp.c:
    tmp2.c:
    
    $ file a.out
    a.out:          ELF 32-bit LSB executable 80386 Version 1 [AES SSSE3 SSE2 SSE], dynamically linked, not stripped
    

    The Hardware Capabilities bits are assigned by the compiler depending on which instructions are actually present in the final executable.

    So tmp.o has the AES bit assigned, and tmp2.o has bits up to SSSE3 assigned.

    When linked together they produce an [AES SSSE3] binary, because the HWCAP bits are ORed together.