android renderscript

RenderScript pow(), powr(), and pown() are very slow on the GPU on a Nexus 5 running Android 4.4 and Android 4.4.1


To be sure, I tested my code with the following two development setups:

Development OS: Windows 7 32-bit
Phone: Nexus 5
Phone OS version: Android 4.4 and Android 4.4.1
SDK bundle: adt-bundle-windows-x86-20131030
Build-tool version: 19
SDK tool version: 22.3
Platform tool version: 19

and

Development OS: Windows 7 32-bit
Phone: Nexus 5
Phone OS version: Android 4.4 and Android 4.4.1
SDK bundle: adt-bundle-windows-x86-20130729
Build-tool version: 18.1
SDK tool version: 22.2.1
Platform tool version: 18.0.1

The code is very simple:

#pragma rs_fp_relaxed
uchar4 __attribute__((kernel)) sample(uchar4 in, uint32_t x, uint32_t y){
    // Not const: fin.x is overwritten below.
    float4 fin = convert_float4(in);
    float tmp = pow(2.f, 2.f);  // very slow on GPU
    fin.x = tmp;
    return convert_uchar4(fin);
}

The code runs on the GPU automatically. The problem is that the pow() function is very slow. Running this script on a 1600×1067 image takes 1927 ms on the GPU. If I use adb to force the code to run on the CPU, it takes only 10 ms to 12 ms. If I comment out the pow() call, it runs fast on both the CPU and the GPU. I also tried the alternative powr() and pown() functions, and the result was the same. I also tried adding:

#include "rs_cl.rsh"

but the result was the same.

I'm wondering if this is the expected behavior. Thank you in advance.


Solution

  • Two things:

    1. pow() and similar functions are generally very slow on GPUs because of their precision requirements. If your precision requirements are less strict, you can use native_powr() instead, which is often dramatically faster.

    2. If you comment out the pow() call, the kernel may do nothing except a memcpy, because the compiler optimizes out a lot in those cases. So the fast timing without pow() mostly measures a copy. But yes, pow() is very slow.
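
As a rough illustration of the first point, here is a minimal sketch of the kernel from the question with pow() swapped for native_powr(). The kernel name sample_native and the package name are hypothetical, and the sketch assumes a target API level that provides native_powr() (API 18 or later, as far as I know) and a non-negative base, which the powr() family requires. It has not been timed on a Nexus 5, so treat any speedup as something to measure rather than a given.

```c
#pragma version(1)
#pragma rs java_package_name(com.example.powtest)  // hypothetical package name
#pragma rs_fp_relaxed

// Hypothetical kernel name; same shape as the kernel in the question.
uchar4 __attribute__((kernel)) sample_native(uchar4 in, uint32_t x, uint32_t y) {
    float4 fin = convert_float4(in);
    // native_powr() trades precision for speed; the base must be non-negative.
    fin.x = native_powr(2.f, 2.f);
    return convert_uchar4(fin);
}
```

If the reduced precision matters for your output, compare a few pixels against the pow() version running on the CPU before switching.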