Oracle released Sun Studio 12.6 recently. We have SHA-1 and SHA-256 intrinsic-based implementations (for ARM and Intel), and we want to enable the extension on Solaris i86 machines.
The 12.6 manual lists the -xarch options at A.2.115.3 -xarch Flags for x86, but that section does not discuss SHA.
Which -xarch option do we use for SHA?
If Studio 12.6 doesn't support the SHA instruction set (and I strongly suspect it doesn't since I can't find "SHA" mentioned at all, in any form, in the What's New in the Oracle Developer Studio 12.6 Release documentation), you're out of luck.
Almost.
What you can do is create your own inline assembler functions. See man inline:
inline(4)
Name
inline, filename.il - Assembly language inline template files
Description
Assembly language call instructions are replaced by a copy of their corresponding function body obtained from the inline template (*.il) file.
Inline template files have a suffix of .il, for example:
% CC foo.il hello.c
Inlining is done by the compiler's code generator.
...
Examples
Please review libm.il or vis.il for examples. You can find a version of these libraries that is specific to each supported architecture under the compiler's lib/ directory.
...
An example can be found here (emphasis mine):
Performance Tuning With Sun Studio Compilers and Inline Assembly Code
...
This paper provides a demonstration of how to measure the performance of a critical piece of code. An example using a compiler flag and another example using inline assembly code are provided. The results are compared to show the benefits and differences of each approach.
...
Example 8: Inline Assembly Code for the Iterative Mandelbrot Calculation
Knowing all these facts, the inline code can be written, as shown in Example 8.
.inline mandel_il,0
// x is stored in %xmm0
// y is stored in %xmm1
// 4.0 is stored in %xmm2
// max_int is stored in %rdi
// set registers to zero
    xorps %xmm3, %xmm3
    xorps %xmm4, %xmm4
    xorps %xmm5, %xmm5
    xorps %xmm6, %xmm6
    xorps %xmm7, %xmm7
    xorq %rax, %rax
.loop:
// check to see if u2 - v2 > 4.0
    movss %xmm5, %xmm7
    addss %xmm6, %xmm7
    ucomiss %xmm2, %xmm7
    jp .exit
    jae .exit
// v = 2 * v * u + y
    mulss %xmm3, %xmm4
    addss %xmm4, %xmm4
    addss %xmm1, %xmm4
// u = u2 - v2 + x
    movss %xmm5, %xmm3
    subss %xmm6, %xmm3
    addss %xmm0, %xmm3
// u2 = u * u
    movss %xmm3, %xmm5
    mulss %xmm3, %xmm5
// v2 = v * v
    movss %xmm4, %xmm6
    mulss %xmm4, %xmm6
    incl %eax
    cmpl %edi, %eax
    jl .loop
.exit:
// end of mandel_il
.end
It's not hard at all. Back in the Solaris 8 days I had to write a lot of SPARC inline assembler functions for a customer I was consulting for. Some of them were pretty basic: effectively one-liners that wrap a single instruction. I swear some of them wound up in later versions of the Studio compiler suite. Since we were subcontracted by Sun itself, that's not surprising, never mind the fact that some of them were blatantly obvious (floor() and ceil(), IIRC, were two of them) and should have been there in the first place.
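Applied to the SHA case in the question, a one-instruction wrapper might look roughly like the sketch below. Treat all of it as an assumption rather than something verified on Studio 12.6: the file name sha_il.il, the template name sha256msg1_il, the USE_SHA_IL macro and the build line are made up for illustration, and I have not checked whether the Studio assembler accepts the sha256msg1 mnemonic (its encoding with these registers is 0F 38 CC C1, should raw bytes be needed). Because a hand-written template is easy to get wrong, the C part carries a portable scalar reference of what SHA256MSG1 computes, so the template can be compared against it.

```c
#include <stdint.h>

/*
 * Hypothetical template file sha_il.il (an assumption, untested on Studio 12.6):
 *
 *   .inline sha256msg1_il,0
 *   // a arrives in %xmm0 and b in %xmm1; the result is returned in %xmm0
 *   sha256msg1 %xmm1, %xmm0        // encoding: 0F 38 CC C1
 *   .end
 *
 * Assumed build line: cc -m64 -DUSE_SHA_IL sha_il.il sha_check.c
 */

/* Rotate a 32-bit word right by n bits (0 < n < 32). */
static uint32_t ror32(uint32_t x, unsigned n)
{
    return (x >> n) | (x << (32 - n));
}

/* SHA-256 small sigma0 function used in the message schedule. */
static uint32_t sigma0(uint32_t x)
{
    return ror32(x, 7) ^ ror32(x, 18) ^ (x >> 3);
}

/*
 * Scalar reference for SHA256MSG1: out[i] = a[i] + sigma0(a[i + 1]),
 * where the word following a[3] is taken from b[0].
 * Index 0 is the lowest 32 bits of the XMM register.
 */
static void sha256msg1_ref(const uint32_t a[4], const uint32_t b[4],
                           uint32_t out[4])
{
    out[0] = a[0] + sigma0(a[1]);
    out[1] = a[1] + sigma0(a[2]);
    out[2] = a[2] + sigma0(a[3]);
    out[3] = a[3] + sigma0(b[0]);
}

#ifdef USE_SHA_IL
#include <emmintrin.h>

/* Prototype only; the call is meant to be replaced by the template body. */
__m128i sha256msg1_il(__m128i a, __m128i b);

/* Returns 1 if the template agrees with the scalar reference for these inputs. */
static int sha256msg1_il_matches(const uint32_t a[4], const uint32_t b[4])
{
    uint32_t want[4], got[4];
    __m128i r = sha256msg1_il(_mm_loadu_si128((const __m128i *)a),
                              _mm_loadu_si128((const __m128i *)b));

    _mm_storeu_si128((__m128i *)got, r);
    sha256msg1_ref(a, b, want);
    return got[0] == want[0] && got[1] == want[1] &&
           got[2] == want[2] && got[3] == want[3];
}
#endif
```

Without -DUSE_SHA_IL the file builds with any C compiler and only the reference is available. With it, sha256msg1_il_matches() can be run over a few input vectors on a CPU that actually has the SHA extensions before the template is trusted in the real hash code.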