I'm trying to figure out why "openssl speed rsa" gives me worse results on a better CPU.
1st server: Linux Debian 8 (running under Xen) - kernel: 4.9.0-amd64
model name : Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz
cpu MHz : 2200.004
cache size : 30720 KB
flags : fpu de tsc msr pae mce cx8 apic sep mca cmov pat clflush mmx fxsr sse sse2 ss ht syscall nx lm constant_tsc rep_good nopl eagerfpu pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch fsgsbase bmi1 hle avx2 bmi2 erms rtm rdseed adx xsaveopt ibpb ibrs stibp
bogomips : 4400.00
2nd server: Linux Debian 8 (running under VMware ESXi, I don't know which version yet) - kernel: 4.9.0-amd64
model name : Intel(R) Xeon(R) CPU E5-2698 v4 @ 2.20GHz
cpu MHz : 2199.058
cache size : 51200 KB
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology tsc_reliable nonstop_tsc eagerfpu pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt aes xsave avx hypervisor lahf_lm kaiser arat
bogomips : 4399.99
Running "openssl speed rsa" gives me this (only pasting the 4096-bit results because they are the only ones relevant to what I want to do):
1st server:
Doing 4096 bits private rsa's for 10s: **1699** 4096 bits private RSA's in 10.00s
Doing 4096 bits public rsa's for 10s: 105493 4096 bits public RSA's in 10.00s
2nd server:
Doing 4096 bits private rsa's for 10s: **1229** 4096 bits private RSA's in 10.00s
Doing 4096 bits public rsa's for 10s: 78677 4096 bits public RSA's in 10.00s
What could explain the difference in the number of private RSA operations (1699 - 1229 = 470)?
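To put the gap in relative terms, here is a small sketch that converts the counts above into operations per second and a percentage. The counts are copied from the output above; the helper names (`ops_per_second`, `slowdown_percent`) are made up for illustration.

```python
# Counts copied from the `openssl speed rsa` output above (10-second runs).
RUN_SECONDS = 10.0
results = {
    "1st server": {"private": 1699, "public": 105493},
    "2nd server": {"private": 1229, "public": 78677},
}


def ops_per_second(count, seconds=RUN_SECONDS):
    """Convert an operation count for one run into operations per second."""
    return count / seconds


def slowdown_percent(fast, slow):
    """How much lower `slow` is than `fast`, as a percentage of `fast`."""
    return (fast - slow) / fast * 100.0


for op in ("private", "public"):
    fast = results["1st server"][op]
    slow = results["2nd server"][op]
    print(
        f"{op}: {ops_per_second(fast):.1f}/s vs {ops_per_second(slow):.1f}/s, "
        f"2nd server is {slowdown_percent(fast, slow):.1f}% slower"
    )
```

Both the private and the public operations come out roughly a quarter slower on the 2nd server, so whatever causes the gap affects both.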
Both CPUs have the aes flag.
The only difference I see is in the available engines: the 1st server has "(rdrand) Intel RDRAND engine" and the other does not.
Any idea?
Edit:
As stated by @Alexei Khlebnikov, the openssl speed rsa command only measures the speed of the RSA sign/verify functions, and these don't use random numbers. Because of that, my original answer doesn't answer the question.
After a quick search, I found that the 1st server has the bmi2 and adx instructions, while the 2nd server doesn't. These instructions are used to speed up Montgomery integer multiplication/squaring, which is used in RSA signing operations. It's hard to confirm that this is the reason for the performance difference, but it may be one of the reasons.
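The flag comparison can be reproduced with a quick set difference. This is only a sketch: the two strings are the `flags` lines copied from the question, and the variable names are mine. It shows which flags are exposed to each guest, not which code path OpenSSL actually picks at run time.

```python
# Flag lists copied verbatim from the /proc/cpuinfo output in the question.
SERVER1_FLAGS = (
    "fpu de tsc msr pae mce cx8 apic sep mca cmov pat clflush mmx fxsr sse sse2 "
    "ss ht syscall nx lm constant_tsc rep_good nopl eagerfpu pni pclmulqdq ssse3 "
    "fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor "
    "lahf_lm abm 3dnowprefetch fsgsbase bmi1 hle avx2 bmi2 erms rtm rdseed adx "
    "xsaveopt ibpb ibrs stibp"
)
SERVER2_FLAGS = (
    "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 "
    "clflush dts mmx fxsr sse sse2 ss syscall nx rdtscp lm constant_tsc "
    "arch_perfmon pebs bts nopl xtopology tsc_reliable nonstop_tsc eagerfpu pni "
    "pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt aes xsave avx "
    "hypervisor lahf_lm kaiser arat"
)

flags1 = set(SERVER1_FLAGS.split())
flags2 = set(SERVER2_FLAGS.split())

# Flags visible on one guest but not on the other.
only_on_1 = sorted(flags1 - flags2)
only_on_2 = sorted(flags2 - flags1)

print("only on 1st server:", " ".join(only_on_1))
print("only on 2nd server:", " ".join(only_on_2))

# The two flags this answer singles out for big-number multiplication.
for flag in ("bmi2", "adx"):
    print(f"{flag}: 1st={flag in flags1} 2nd={flag in flags2}")
```

On a live machine the same strings can be taken from the `flags` line of `/proc/cpuinfo` on each server.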
Original answer:
To generate RSA keys you need large random prime numbers. The process of finding a large random prime number consists of:
As you can see, this involves a lot of random number generation, and generating good random numbers is slow. So, a faster RNG means faster RSA key generation.
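A minimal sketch of how random numbers enter prime generation, assuming a simple draw-and-test loop with a Miller-Rabin check. This is a common textbook approach and not necessarily what OpenSSL does internally; the function names and the `rounds` default are my own choices.

```python
import secrets

SMALL_PRIMES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)


def is_probable_prime(n, rounds=20):
    """Miller-Rabin probabilistic primality test with random bases."""
    if n < 2:
        return False
    # Cheap trial division first; also handles the small primes themselves.
    for p in SMALL_PRIMES:
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = secrets.randbelow(n - 3) + 2  # random base in [2, n - 2]
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # this base proves n is composite
    return True


def random_prime(bits):
    """Draw random odd `bits`-bit candidates until one passes the test.

    Returns the prime and the number of candidates that were drawn.
    """
    draws = 0
    while True:
        draws += 1
        # Force the top bit (exact bit length) and the low bit (odd).
        candidate = secrets.randbits(bits) | (1 << (bits - 1)) | 1
        if is_probable_prime(candidate):
            return candidate, draws


prime, draws = random_prime(256)
print(f"found a {prime.bit_length()}-bit probable prime after {draws} candidate(s)")
```

Each candidate costs one call to the random source, and many candidates are usually rejected before a prime is found, which is where the RNG speed would matter for key generation.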