python excel numpy percentile dynatrace

Nth percentile in Python differs from the Dynatrace result


I am trying to create a report based on the data extracted from Dynatrace.

I extract the event data on a daily basis. In my Python Django report, I need to show the Nth percentile of the data (for example the 30th, 60th, 75th and 90th percentiles).

When I pull the data from Dynatrace, I get the following list: [1563,2731,3586,3966,4174,4971,6055,9175,15667]

For this list, numpy.percentile and df.quantile both give me a value that matches the percentile formula I used in Excel. However, the Dynatrace PERCENTILE function shows a different value altogether.

For example, Excel and Python both give 6055 as the 75th percentile, while Dynatrace gives 6835.

I also tried some online percentile calculators, and they all give 6055. If someone can explain how Dynatrace calculates this value, that would be a great help.

Thanks in advance


Solution

  • This sort of discrepancy is normally due to the interpolation method, and it is most noticeable when the sample is very small.

    However, 6055 is exactly the 75th percentile of your sample, because it sits at rank 6/8 = 0.75:

    1563   2731   3586   3966   4174   4971   6055   9175  15667
     0/8    1/8    2/8    3/8    4/8    5/8    6/8    7/8    8/8
       0  0.125   0.25  0.375    0.5  0.625   0.75  0.875      1
    

    Accordingly, NumPy produces the same result, 6055, with any of its interpolation methods (linear, lower, higher, nearest, midpoint), since no interpolation between neighbouring values is needed.
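The rank table can be checked with a minimal pure-Python sketch. It is not NumPy's actual implementation, and the helper name `percentile_variants` is made up for illustration. It computes the fractional 0-based rank (n - 1) * q / 100 and applies the five rules to it:

```python
import math

data = [1563, 2731, 3586, 3966, 4174, 4971, 6055, 9175, 15667]

def percentile_variants(values, q):
    """Return the q-th percentile under five simple interpolation rules."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100          # fractional 0-based rank
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo                        # distance past the lower neighbour
    return {
        "linear": xs[lo] + frac * (xs[hi] - xs[lo]),
        "lower": xs[lo],
        "higher": xs[hi],
        # Ties go to the upper neighbour here; NumPy's tie rule may differ.
        "nearest": xs[lo] if frac < 0.5 else xs[hi],
        "midpoint": (xs[lo] + xs[hi]) / 2,
    }

for name, value in percentile_variants(data, 75).items():
    print(name, value)
```

For q = 75 the rank is exactly 6, so the lower and upper neighbours are the same element and every rule returns 6055.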

    Dynatrace may be using a more complex interpolation method like this one. One of its authors is affiliated with Dynatrace.
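One observation, offered as a guess rather than a statement about Dynatrace internals: on this sample, the Hazen rule (1-based rank n * p + 0.5) gives exactly 6835. NumPy 1.22 and later expose that rule as `method="hazen"` in `numpy.percentile`. Below is a minimal plain-Python sketch, where `hazen_percentile` is a hypothetical helper name:

```python
data = [1563, 2731, 3586, 3966, 4174, 4971, 6055, 9175, 15667]

def hazen_percentile(values, q):
    """q-th percentile using the Hazen plotting position (an assumed rule)."""
    xs = sorted(values)
    n = len(xs)
    # 0-based fractional position under the Hazen rule: n * p - 0.5,
    # clamped so it never leaves the valid index range.
    pos = min(max(n * q / 100 - 0.5, 0), n - 1)
    lo = int(pos)
    hi = min(lo + 1, n - 1)
    # Linear interpolation between the two neighbouring order statistics.
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])

print(hazen_percentile(data, 75))  # 6835.0
```

Here the position is 9 * 0.75 - 0.5 = 6.25, a quarter of the way from 6055 to 9175, so the result is 6055 + 0.25 * 3120 = 6835. A match on a single percentile does not prove which estimator Dynatrace uses. Comparing the 30th, 60th and 90th percentiles against Dynatrace's output would be a more convincing check.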