android android-studio adb android-sdk-tools android-traceview

Difference between Android trace-based and sampling-based method profiling and its impact on reported CPU times


What is the difference between trace-based and sampling-based method profiling in Android Traceview? I thought trace-based profiling was more accurate; however, it seems that it can distort the actual CPU times, especially when a function makes further calls.

For example, I want to evaluate a function A that has two implementations, A-1 and A-2.

  1. A-1 makes one additional function call, to A-1-1.
  2. A-2 also calls A-1-1, but in this case A-1-1 itself makes one more call, to A-1-1-1.

Now I think that trace-based profiling will report higher values for A-2, because it has to trace one extra function, A-1-1-1, and this extra CPU usage will be included in the CPU time of A-2. Am I right?

So the question becomes: does the trace-based method account for the CPU overhead of tracing child methods when it reports the CPU time of a parent function?
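To make the concern concrete, here is a toy model in plain Java (it is not Traceview and does not use any Android API). It assumes a tracer that spends a fixed, hypothetical 5 microseconds of bookkeeping per traced call and that this cost lands between the caller's entry and exit timestamps. The class name, method names, and all timings are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of an instrumenting (trace-based) profiler; all numbers are hypothetical. */
public class TracingOverheadModel {

    /** A method with its own (self) CPU time and the methods it calls. */
    static final class TracedMethod {
        final String name;
        final long selfMicros;
        final List<TracedMethod> callees = new ArrayList<>();

        TracedMethod(String name, long selfMicros) {
            this.name = name;
            this.selfMicros = selfMicros;
        }

        TracedMethod calls(TracedMethod callee) {
            callees.add(callee);
            return this;
        }
    }

    /** Real inclusive time: self time plus the real inclusive time of every callee. */
    static long realInclusive(TracedMethod m) {
        long total = m.selfMicros;
        for (TracedMethod c : m.callees) {
            total += realInclusive(c);
        }
        return total;
    }

    /**
     * Inclusive time as this toy tracer would report it. Every traced callee adds
     * overheadMicros of bookkeeping, which is charged to the caller (assumption).
     */
    static long reportedInclusive(TracedMethod m, long overheadMicros) {
        long total = m.selfMicros;
        for (TracedMethod c : m.callees) {
            total += overheadMicros + reportedInclusive(c, overheadMicros);
        }
        return total;
    }

    public static void main(String[] args) {
        long overhead = 5; // hypothetical per-call tracing cost, in microseconds

        // A-1 calls A-1-1, which does all of the work itself.
        TracedMethod a1 = new TracedMethod("A-1", 10)
                .calls(new TracedMethod("A-1-1", 100));

        // A-2 calls A-1-1, which hands half of the same work to A-1-1-1.
        TracedMethod a2 = new TracedMethod("A-2", 10)
                .calls(new TracedMethod("A-1-1", 50)
                        .calls(new TracedMethod("A-1-1-1", 50)));

        // prints: A-1 real=110 reported=115
        System.out.println("A-1 real=" + realInclusive(a1)
                + " reported=" + reportedInclusive(a1, overhead));
        // prints: A-2 real=110 reported=120
        System.out.println("A-2 real=" + realInclusive(a2)
                + " reported=" + reportedInclusive(a2, overhead));
    }
}
```

In this model both implementations do the same 110 microseconds of real work, yet A-2 is reported as slower purely because one more call is traced. Whether the real tracer behaves this way is exactly what I am asking.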

On the other hand, the problem with the sampling-based method is that it may not catch very lightweight functions. What if my function takes 0.2 milliseconds of CPU time and the sampling interval is 1 millisecond? I did some experiments, and it does not catch the lightweight function calls. Any ideas, or a reference to documentation on the differences?

The final question is: which one is more accurate for relative comparisons?


Solution

  • As far as I know, neither profiling method takes its own overhead into account.

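    Tracing will correctly count every single function call, but because every method entry and exit is instrumented, the reported times include the tracing overhead, which weighs most on short, frequently called methods.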

    Sampling takes a snapshot of the stack at a fixed interval, which gives you an overall picture of where the time is actually spent in your program.

    "What if my function takes 0.2 milliseconds cpu time, and sampling interval is 1 millisecond?"

    Go for sampling and optimize by refactoring the larger parts of your program, the ones the sampler points you to. Micro-optimizations on a JVM won't get you far.
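To illustrate why sampling still gives a usable overall picture even when it misses individual short calls, here is a rough sketch in plain Java. It is a deterministic toy timeline, not the real sampler: the class name, the 1.3 ms call period, and the 0.5 ms sampler phase are assumptions chosen for illustration, while the 0.2 ms duration and 1 ms interval come from the question.

```java
/** Toy model of a sampling profiler on a deterministic timeline; numbers are hypothetical. */
public class SamplingModel {

    /**
     * Counts how many sampler ticks land inside calls of a short function.
     * Call i starts at i * periodMicros and runs for durationMicros.
     * The sampler fires at phaseMicros, phaseMicros + intervalMicros, and so on.
     */
    static long samplesInside(long calls, long periodMicros, long durationMicros,
                              long intervalMicros, long phaseMicros) {
        long end = calls * periodMicros; // length of the simulated run
        long hits = 0;
        for (long tick = phaseMicros; tick < end; tick += intervalMicros) {
            long offsetInPeriod = tick % periodMicros;
            if (offsetInPeriod < durationMicros) {
                hits++; // the function was on the stack when the sample was taken
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        long interval = 1000; // 1 ms sampling interval
        long duration = 200;  // the function runs for 0.2 ms per call
        long period = 1300;   // assumption: a new call starts every 1.3 ms
        long phase = 500;     // assumption: first sample is taken at 0.5 ms

        long one = samplesInside(1, period, duration, interval, phase);
        long many = samplesInside(10_000, period, duration, interval, phase);

        // prints: 1 call: 0 samples
        System.out.println("1 call: " + one + " samples");
        // prints: 10000 calls: 2000 samples, estimated 2000000 us, actual 2000000 us
        System.out.println("10000 calls: " + many + " samples, estimated "
                + (many * interval) + " us, actual " + (10_000 * duration) + " us");
    }
}
```

A single 0.2 ms call is missed entirely here, but over many calls the share of samples that hit the function tracks its share of the run time. The exact match in this sketch is an artifact of the chosen numbers; a real sampler would only approximate it.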