sqlsql-serverperformancenullpercentile-cont

how to avoid calling percentile_cont twice in a case clause to replace null with zero?


I need to calculate the median transaction amount for each customer in the past 52 weeks, but percentile_cont returns NULL if there's no transaction for a particular customer. In such a case, I have to replace NULL with zero, I acheived this by using a CASE clause in sql, however I am using PERCENTILE_CONT twice for this purpose which makes the query slow for a huge list of customers to process. is there a better way to use the PERCENTILE_CONT only once inside the CASE clause?

SELECT DISTINCT customer,

       CASE WHEN 
       PERCENTILE_CONT(0.5) 
       WITHIN GROUP (ORDER BY 
       transamt) OVER 
       (PARTITION BY
      customer) IS NOT NULL THEN

      PERCENTILE_CONT(0.5) WITHIN 
      GROUP (ORDER BY transamt) 
      OVER (PARTITION BY
      customer)

      ELSE 0
      END  AS median_amt

FROM trans_table

WHERE trans_date BETWEEN DATEADD(WEEK, -52, GETDATE() ) AND GETDATE() 

Solution

  • According to "...percentile_cont returns NULL if there's no transaction for a particular customer...", it could be a data issue or by design there could be entries for a customer without any transactions could have trans_date in the past 52 weeks but transamt is null. If that's the case, maybe this work for you by changing order by transamt to the following:

    select distinct 
           customer,
           percentile_cont(0.5) within group 
           (order by case when transamt is null then 0 else transamt end) 
           over (partition by customer) as median_amt
      from trans_table;
    

    Second guess: if percentile_cont() return NULL then show 0. Using coalesce().

    select distinct customer,
           coalesce(percentile_cont(0.5) within group (order by transamt) over (partition by customer),0) as median_amt
      from trans_table;