
x264 library speed: AltiVec vs SSE4


I have a simple, cheap dual-core 3 GHz Intel machine running Debian, and access to a super-expensive PowerPC 7 machine running AIX.

After a few days of struggle, I compiled libx264 and tested it on both computers:

  1. GCC build of x264 on the Intel (with SSE2 capabilities), and
  2. GCC build on the 16-core PowerPC (with AltiVec).

... and the result is that the cheap Intel is 2x faster! (With AltiVec disabled, the Intel is 10x faster.)

My question: is this normal? Do other PowerPC users see the same results? Can the AltiVec-optimised x264 library run at the same speed as on Intel, or is the MMX/SSE optimisation simply at least 2 times faster for this library?

I am not interested in multi-threading options; the number of cores and threads is irrelevant. I just want simple single-threaded x264 encoding with the default "medium" preset, using raw video as the source: SSE vs AltiVec.
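A single-threaded comparison like this can be pinned down explicitly on the command line. The following is only a minimal sketch: it assumes the standard x264 CLI options `--threads`, `--preset` and `--no-asm`, and the `raw10sec.y4m` clip used for the timings in this post. `bench_x264` is a hypothetical helper name, not part of x264.

```shell
# Hypothetical helper: time one single-threaded encode of the test clip.
# Usage: bench_x264 [extra x264 options...]
bench_x264() {
    # --threads 1 forces a single encoding thread; --preset medium is the
    # default preset, spelled out here so both machines use the same settings.
    time x264 --threads 1 --preset medium "$@" -o /dev/null raw10sec.y4m
}

if command -v x264 >/dev/null 2>&1 && [ -f raw10sec.y4m ]; then
    bench_x264            # with the detected SIMD (SSE or AltiVec)
    bench_x264 --no-asm   # with all CPU-specific optimisations disabled
else
    echo "x264 or raw10sec.y4m not found; skipping benchmark"
fi
```

Running the same two commands on both machines separates the effect of the SIMD code paths from the effect of the compiler and the CPU itself.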

Maybe the native AIX XLC compiler gives better results? (I only managed to get GCC to work.)

... Maybe Mac PowerPC users know something about this.

powerPc7-Aix:$ time (cat raw10sec.y4m |x264 --input-res 720x576 --fps 50 -o /dev/null -)
x264: 64-bit XCOFF
x264 [info]: using cpu capabilities: Altivec
time: real 0m33.559s
---
intelDebian:$ time (cat raw10sec.y4m |x264 --input-res 720x576 --fps 50 -o /dev/null -)
x264: ELF 32-bit LSB executable
x264 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 FastShuffle SSE4.1 Cache64
time: real 0m16.503s
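To put numbers on the two timings, here is a small sketch of the arithmetic. The wall-clock values are copied from the runs in this post; the frame count is an assumption (a 10-second clip at 50 fps, so 500 frames), inferred only from the file name and the `--fps 50` option.

```shell
# Wall-clock times from the two runs, in seconds.
ppc_real=33.559
intel_real=16.503
# Assumption: 10 s of video at 50 fps = 500 frames.
frames=500

# Ratio of the two run times, and encoding speed in frames per second.
speedup=$(awk -v a="$ppc_real" -v b="$intel_real" 'BEGIN { printf "%.2f", a / b }')
ppc_fps=$(awk -v n="$frames" -v t="$ppc_real" 'BEGIN { printf "%.1f", n / t }')
intel_fps=$(awk -v n="$frames" -v t="$intel_real" 'BEGIN { printf "%.1f", n / t }')

echo "Intel speedup over PowerPC: ${speedup}x"
echo "PowerPC: ${ppc_fps} fps, Intel: ${intel_fps} fps"
```

Note that `time` on the whole pipeline also includes the cost of `cat` and of process start-up, so these figures are only a rough measure of encoder speed.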

Solution

  • A few things spring to mind:

    A more interesting comparison would be against a PS3 with code optimized to take advantage of all cores; apparently PS3s are great at brute-forcing crypto. Sadly they've stopped making them, and I don't know how easy it is to run Linux on one these days.