performance openssl debian vmware xen

"openssl speed rsa" less performant on (normally) better cpu


I'm trying to figure out why "openssl speed rsa" gives me worse results on a better CPU.

1st server: Linux Debian 8 (running on Xen) - kernel: 4.9.0-amd64

model name : Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz
cpu MHz : 2200.004
cache size : 30720 KB
flags : fpu de tsc msr pae mce cx8 apic sep mca cmov pat clflush mmx fxsr sse sse2 ss ht syscall nx lm constant_tsc rep_good nopl eagerfpu pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch fsgsbase bmi1 hle avx2 bmi2 erms rtm rdseed adx xsaveopt ibpb ibrs stibp
bogomips : 4400.00

2nd server: Linux Debian 8 (running on VMware ESXi, I don't know which version yet) - kernel: 4.9.0-amd64

model name : Intel(R) Xeon(R) CPU E5-2698 v4 @ 2.20GHz
cpu MHz : 2199.058
cache size : 51200 KB
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology tsc_reliable nonstop_tsc eagerfpu pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt aes xsave avx hypervisor lahf_lm kaiser arat
bogomips : 4399.99

Running "openssl speed rsa" gives me the following (I'm only pasting the 4096-bit results because they are the only ones relevant to what I want to do):

1st server:

Doing 4096 bits private rsa's for 10s: **1699** 4096 bits private RSA's in 10.00s
Doing 4096 bits public rsa's for 10s: 105493 4096 bits public RSA's in 10.00s

2nd server:

Doing 4096 bits private rsa's for 10s: **1229** 4096 bits private RSA's in 10.00s
Doing 4096 bits public rsa's for 10s: 78677 4096 bits public RSA's in 10.00s

What could explain the difference in the number of keys created (1699 - 1229 = 470)?

Both CPUs have the aes flag.

The only difference I see is in the available engines: the 1st server has "(rdrand) Intel RDRAND engine" and the other does not.

Any idea?


Solution

  • Edit:

    As stated by @Alexei Khlebnikov, the openssl speed rsa command only measures the speed of the rsa sign/verify functions, and these don't use random numbers. Because of that, my original answer doesn't answer the question.

    After a quick search, I found that the 1st server exposes the bmi2 and adx CPU flags, while the 2nd server doesn't. These instructions are used to speed up Montgomery integer multiplication/squaring, which is used in the RSA signing operations. It's hard to confirm that this is the reason for the performance difference, but it may be one of the reasons.
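To see which flags the 1st server exposes and the 2nd doesn't, the two flag lists from the question can be compared directly. This is only a minimal sketch: the helper name `only_in_first` is made up for illustration, and the flag strings are copied from the /proc/cpuinfo output pasted above.

```python
# Flags copied verbatim from the two /proc/cpuinfo dumps in the question.
SERVER1_FLAGS = (
    "fpu de tsc msr pae mce cx8 apic sep mca cmov pat clflush mmx fxsr sse "
    "sse2 ss ht syscall nx lm constant_tsc rep_good nopl eagerfpu pni "
    "pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c "
    "rdrand hypervisor lahf_lm abm 3dnowprefetch fsgsbase bmi1 hle avx2 bmi2 "
    "erms rtm rdseed adx xsaveopt ibpb ibrs stibp"
)
SERVER2_FLAGS = (
    "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 "
    "clflush dts mmx fxsr sse sse2 ss syscall nx rdtscp lm constant_tsc "
    "arch_perfmon pebs bts nopl xtopology tsc_reliable nonstop_tsc eagerfpu "
    "pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt aes xsave avx "
    "hypervisor lahf_lm kaiser arat"
)


def only_in_first(first: str, second: str) -> list[str]:
    """Return the flags present in `first` but absent from `second`, sorted."""
    return sorted(set(first.split()) - set(second.split()))


# Hypothetical helper, not part of any OpenSSL tooling.
missing_on_server2 = only_in_first(SERVER1_FLAGS, SERVER2_FLAGS)
print(" ".join(missing_on_server2))
```

Besides bmi2 and adx, the list also contains avx2 and rdrand, so the guest on the 2nd server sees a noticeably reduced instruction set. Whether that comes from the hypervisor masking CPU features is an assumption I can't verify from the question alone.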

    Original answer:

    To generate RSA keys you need large random prime numbers. The process of finding a large random prime consists of:

    1. Generate a random number;
    2. Check if it's prime;
    3. If it's not, repeat.

    As you can see, this involves generating a lot of random numbers, and generating good random numbers is slow. So a faster RNG means faster RSA key generation.
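The generate/check/repeat loop above can be sketched in Python as follows. This is an illustration only, not how OpenSSL actually generates primes: the function names are made up, the primality check is a plain Miller-Rabin test, and the randomness comes from Python's `secrets` module. It counts the attempts so you can see how many random candidates are consumed per prime.

```python
import secrets


def is_probable_prime(n: int, rounds: int = 40) -> bool:
    """Miller-Rabin probabilistic primality test (illustrative, not OpenSSL's)."""
    if n < 2:
        return False
    # Cheap trial division first; also handles the small primes themselves.
    for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = secrets.randbelow(n - 3) + 2  # random base in [2, n - 2]
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # `a` is a witness that n is composite
    return True


def random_prime(bits: int) -> tuple[int, int]:
    """Return (prime, attempts): step 1 generate, step 2 check, step 3 repeat."""
    attempts = 0
    while True:
        attempts += 1
        candidate = secrets.randbits(bits)
        # Force the top bit (exact bit length) and the low bit (odd number).
        candidate |= (1 << (bits - 1)) | 1
        if is_probable_prime(candidate):
            return candidate, attempts


prime, attempts = random_prime(256)
print(f"found a {prime.bit_length()}-bit prime after {attempts} attempts")
```

The number of attempts varies from run to run. Keep in mind that this loop only matters for key generation, which, as the edit at the top of this answer notes, is not what "openssl speed rsa" measures.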