sqloracle-databasederived-tableinline-view

Oracle SQL query is working by itself, but failing when wrapped into "select count(*) from (<query>)"


I have the following example of the Oracle SQL query which is failing with “ORA-00979: not a GROUP BY expression”. If remove the “select count(*) from (...)” wrapper the query works as expected. The idea is to get the count of rows returned by the inner query.

I am wondering why it is not working with this wrapper. Can anyone explain why it is happening? I am just interested in the reason why it is happening.

The expected result is that this query will return a count of all rows returned by the inner query.

I was searching for the reason and for the solution for days but without luck. I was not able to find information about that on the internet as well.

While I was working on it, I noticed that the following changes make the query work:

  1. Remove the concatenation from HAVING (remove this part - “NVL(D.computer_price, 9999) || NVL(t3.accessory_price, 9999)”).
  2. Remove HAVING.
  3. Change the “JSON_VALUE(laptop_info, '$.model') info” to “null info”.
  4. Remove one of the parts of the UNION ALL.

Any suggestions or comments will be much appreciated.

Notes:

  1. Version of the Oracle SQL is 19c.
  2. This is just a fake example of the query prepared from the original query. But it is demonstrating the issue. So please, never mind if the result of the query or query structure does not make sense to you.
  3. The original query is autogenerated and more complex, so it is unlikely that I can change the structure of this query much. I just hope to find an explanation for why it is failing.

Here is the query:

-- QUERY
select count(*) from (
    SELECT
      CASE WHEN GROUPING_ID(t3.accessory_name, D.name, NVL(D.computer_price, 9999) || NVL(t3.accessory_price, 9999)) != 0 THEN COUNT(*) END "Records",
      t3.accessory_name "accessory name",
      D.name "device name",
      SUM(D.computer_price) "computer price",
      SUM(D.laptop_price) "laptop price",
      SUM(t3.accessory_price) "accessory price",
      CASE WHEN GROUPING_ID(t3.accessory_name, D.name, NVL(D.computer_price, 9999) || NVL(t3.accessory_price, 9999)) = 0 THEN ANY_VALUE(t3.accessory_info) END "accessory info"
    FROM (
      SELECT
        computer_id id,
        computer_name name,
        computer_info info,
        computer_price computer_price,
        null laptop_price
      FROM computer
      UNION ALL
      SELECT
        laptop_id id,
        laptop_name name,
        JSON_VALUE(laptop_info, '$.model') info,
        null computer_price,
        laptop_price laptop_price
      FROM laptop
    ) D
    JOIN accessory t3 on t3.accessory_id = D.id  
    GROUP BY
      ROLLUP (t3.accessory_name, D.name, NVL(D.computer_price, 9999) || NVL(t3.accessory_price, 9999))
    HAVING
      (GROUPING_ID(t3.accessory_name, D.name, NVL(D.computer_price, 9999) || NVL(t3.accessory_price, 9999))=3)
    ORDER BY NVL(t3.accessory_name, 0), NVL(D.name, 0), NVL(D.computer_price, 9999) || NVL(t3.accessory_price, 9999) DESC
);

Here are the scripts for preparing test tables and test data:

-- CREATE TEST TABLES
create table computer(computer_id NUMBER, computer_name varchar2(50), computer_info varchar2(50), computer_price NUMBER);
create table laptop(laptop_id NUMBER, laptop_name varchar2(50), laptop_info varchar2(50), laptop_price NUMBER);
create table accessory(accessory_id NUMBER, accessory_name varchar2(50), accessory_info varchar2(50), accessory_price NUMBER);

-- INSERT TEST DATA
insert into computer (computer_id, computer_name, computer_info, computer_price) values (1, 'computer 1', 'some info about computer 1', 10);
insert into computer (computer_id, computer_name, computer_info, computer_price) values (2, 'computer 2', 'some info about computer 2', 20);
insert into computer (computer_id, computer_name, computer_info, computer_price) values (3, 'computer 3', 'some info about computer 3', 30);

insert into laptop (laptop_id, laptop_name, laptop_info, laptop_price) values (1, 'laptop 1', '{''model'': ''model 1''}', 15);
insert into laptop (laptop_id, laptop_name, laptop_info, laptop_price) values (2, 'laptop 2', '{''model'': ''model 2''}', 25);
insert into laptop (laptop_id, laptop_name, laptop_info, laptop_price) values (3, 'laptop 3', '{''model'': ''model 3''}', 35);

insert into accessory (accessory_id, accessory_name, accessory_info, accessory_price) values (1, 'accessory 1', 'accessory 1 info', 1);
insert into accessory (accessory_id, accessory_name, accessory_info, accessory_price) values (2, 'accessory 2', 'accessory 2 info', 2);
insert into accessory (accessory_id, accessory_name, accessory_info, accessory_price) values (3, 'accessory 3', 'accessory 3 info', 3);

I was searching for the solution for days and was not able to find any similar issue on the web. Also, I have tried to rewrite the wrapper in different ways, but the result was the same.


Solution

  • I was able to resolve this issue. It was related to the Oracle Optimizer. When we run the query, Oracle Optimizer tries to optimize the query to make it work faster and more efficiently. During optimization, Oracle does some Query Transformations. In this case, Oracle did it wrong, and the query failed.

    To fix this issue we need to disable Query Transformation by saying Oracle Optimizer to not transform the original query and execute it as is. To do this we can use Optimizer Hint NO_QUERY_TRANSFORMATION.

    After adding this Hint query works as expected. Here is an example:

    -- QUERY
    select count(*) from (
        SELECT
          /*+ NO_QUERY_TRANSFORMATION */
          CASE WHEN GROUPING_ID(t3.accessory_name, D.name, NVL(D.computer_price, 9999) || NVL(t3.accessory_price, 9999)) != 0 THEN COUNT(*) END "Records",
          t3.accessory_name "accessory name",
          D.name "device name",
          SUM(D.computer_price) "computer price",
          SUM(D.laptop_price) "laptop price",
          SUM(t3.accessory_price) "accessory price",
          CASE WHEN GROUPING_ID(t3.accessory_name, D.name, NVL(D.computer_price, 9999) || NVL(t3.accessory_price, 9999)) = 0 THEN ANY_VALUE(t3.accessory_info) END "accessory info"
        FROM (
          SELECT
            computer_id id,
            computer_name name,
            computer_info info,
            computer_price computer_price,
            null laptop_price
          FROM computer
          UNION ALL
          SELECT
            laptop_id id,
            laptop_name name,
            JSON_VALUE(laptop_info, '$.model') info,
            null computer_price,
            laptop_price laptop_price
          FROM laptop
        ) D
        JOIN accessory t3 on t3.accessory_id = D.id  
        GROUP BY
          ROLLUP (t3.accessory_name, D.name, NVL(D.computer_price, 9999) || NVL(t3.accessory_price, 9999))
        HAVING
          (GROUPING_ID(t3.accessory_name, D.name, NVL(D.computer_price, 9999) || NVL(t3.accessory_price, 9999))=3)
        ORDER BY NVL(t3.accessory_name, 0), NVL(D.name, 0), NVL(D.computer_price, 9999) || NVL(t3.accessory_price, 9999) DESC
    );
    

    Hope it will be helpful for someone else since it made me spend more than a week resolving this issue.