sqlgoogle-bigquerysubqueryaggregate-functionswindow-functions

What does sql error "Analytic function cannot be an argument of another analytic function" mean?


I wrote the following SQL query to find out orders whose maximum quantity is strictly greater than the average quantity of every order (including itself). One order consists of multiple products bought, and hence a quantity corresponding to each product. Refer the table in attached image

Table:order_details

SQL code:

SELECT DISTINCT
    t.order_id
FROM
    (SELECT
         order_id,
         MAX(quantity) OVER (PARTITION BY order_id) AS max_qty,
         AVG(quantity) OVER (PARTITION BY order_id) AS avg_qty,
         MAX(AVG(quantity) OVER (PARTITION BY order_id)) OVER() AS max_avg_qty
     FROM
         ordersdetails) AS t
WHERE
    max_qty > max_avg_qty
ORDER BY
    order_id

It throws an error:

Analytic function cannot be an argument of another analytic function

What does this mean?

I tried to get the answer using window functions. Using the window function:

avg(quantity) over(partition by order_id) as avg_qty,

I calculated average quantity bought for every order, then I put this inside another window function max() over() in order to get maximum value among all the average quantities. But it didn't work.


Solution

  • You have a couple of operations that are stepping on each other's feet, from an order-of-operations standpoint.

    There are three ways in Bigquery sql to filter a result set:

    1. WHERE clause: This step is executed very early in query execution, often as a first step when selecting data from disk. This can't filter data that results from an aggregation either through a GROUP BY or a Window Function.
    2. HAVING clause: This step is executed after GROUP BY aggregation and can be used to filter the result set based on that results of the GROUP BY aggregation.
    3. QUALIFY clause: This step is executed at the very end of execution after Window Functions have been run against the result set.

    Because you want to filter based on the results of window functions, a QUALIFY is the route you want to aim for.

    To further complicate things, you are wanting to run a window function against data that is determined by another window function. This simply can't happen without a subquery as the step is only ran once per select.

    Accounting for order of operations and limitations in step/processing that you have to deal with, your sql will look like:

    SELECT DISTINCT t.order_id
    FROM
        (SELECT
             order_id,
             MAX(quantity) OVER (PARTITION BY order_id) AS max_qty,
             AVG(quantity) OVER (PARTITION BY order_id) AS avg_qty         
         FROM
             ordersdetails) AS t
    QUALIFY max_qty > MAX(avg_qty) OVER ()
    ORDER BY order_id