sqlsql-servergroup-byconditional-aggregation

Counting records based on conditions over potential null values


I have this tables:

CREATE TABLE customer_orders (
  "order_id" INTEGER,
  "customer_id" INTEGER,
  "pizza_id" INTEGER,
  "exclusions" VARCHAR(4),
  "extras" VARCHAR(4),
  "order_time" TIMESTAMP
);
INSERT INTO customer_orders
  ("order_id", "customer_id", "pizza_id", "exclusions", "extras", "order_time")
VALUES
  ('1', '101', '1', '', '', '2020-01-01 18:05:02'),
  ('2', '101', '1', '', '', '2020-01-01 19:00:52'),
  ('3', '102', '1', '', '', '2020-01-02 23:51:23'),
  ('3', '102', '2', '', NULL, '2020-01-02 23:51:23'),
  ('4', '103', '1', '4', '', '2020-01-04 13:23:46'),
  ('4', '103', '1', '4', '', '2020-01-04 13:23:46'),
  ('4', '103', '2', '4', '', '2020-01-04 13:23:46'),
  ('5', '104', '1', NULL, '1', '2020-01-08 21:00:29'),
  ('6', '101', '2', NULL, NULL, '2020-01-08 21:03:13'),
  ('7', '105', '2', NULL, '1', '2020-01-08 21:20:29'),
  ('8', '102', '1', NULL, NULL, '2020-01-09 23:54:33'),
  ('9', '103', '1', '4', '1, 5', '2020-01-10 11:22:59'),
  ('10', '104', '1', NULL, NULL, '2020-01-11 18:34:49'),
  ('10', '104', '1', '2, 6', '1, 4', '2020-01-11 18:34:49');

CREATE TABLE runner_orders (
  "order_id" INTEGER,
  "runner_id" INTEGER,
  "pickup_time" VARCHAR(19),
  "distance" DECIMAL(5,2) NULL,
  "duration" VARCHAR(10),
  "cancellation" VARCHAR(23)
);
INSERT INTO runner_orders
  ("order_id", "runner_id", "pickup_time", "distance", "duration", "cancellation")
VALUES
  ('1', '1', '2020-01-01 18:15:34', '20', '32', ''),
  ('2', '1', '2020-01-01 19:10:54', '20', '27', ''),
  ('3', '1', '2020-01-03 00:12:37', '13.4', '20', NULL),
  ('4', '2', '2020-01-04 13:53:03', '23.4', '40', NULL),
  ('5', '3', '2020-01-08 21:10:57', '10', '15', NULL),
  ('6', '3', NULL, NULL, NULL, 'Restaurant Cancellation'),
  ('7', '2', '2020-01-08 21:30:45', '25', '25mins', NULL),
  ('8', '2', '2020-01-10 00:15:02', '23.4', '15', NULL),
  ('9', '2', NULL, NULL, NULL, 'Customer Cancellation'),
  ('10', '1', '2020-01-11 18:50:20', '10', '10', NULL);

And I they gave me the question: For each customer, how many delivered pizzas had at least 1 change and how many had no changes?

Now, I tried to do this:

WITH cte_1
AS (
    SELECT co.customer_id
        , co.order_id
        , co.exclusions
        , isnull(co.exclusions,'') AS exc
        , isnull(co.extras,'') AS ext
    FROM customer_orders as co
    INNER JOIN runner_orders as ru 
        ON co.order_id = ru.order_id
    WHERE NOT ru.cancellation IN ('Restaurant Cancellation', 'Customer Cancellation')    
)    
SELECT customer_id
    , COUNT(order_id) AS no_changes 
FROM cte_1
WHERE exc = '' AND ext = ''
GROUP BY customer_id

As you can see this so long but at least works and the numbers are correct. But then when I try to do the same to get the result for pizzas with changes like this:

WITH cte_1
AS (
    SELECT co.customer_id
        , co.order_id
        , co.exclusions
        , isnull(co.exclusions,'') AS exc
        , isnull(co.extras,'') AS ext
    FROM customer_orders as co
    INNER JOIN runner_orders as ru 
        ON co.order_id = ru.order_id
    WHERE NOT ru.cancellation IN ('Restaurant Cancellation', 'Customer Cancellation')    
)    
SELECT customer_id
    , COUNT(order_id) AS with_changes 
FROM cte_1
WHERE not exc = '' 
GROUP BY customer_id
UNION
SELECT customer_id
    , COUNT(order_id) AS with_changes 
FROM cte_1
WHERE not ext = '' 
GROUP BY customer_id

It doesn't work, and I'm not sure if it's because of the Nulls and blanks or the approach in general. Please if there is a better and shorter way to do this.

EXPECTED RESULT would be something like:

customer_id orders_with_changes orders_with_NO_changes
101 --- 2
102 --- 3
103 3 ---
104 2 1
105 1 ---

Solution

  • You can try approaching this problem by:

    SELECT customer_id, 
           COUNT(CASE WHEN NOT COALESCE(extras, '') = '' 
                        OR NOT COALESCE(exclusions, '') = '' THEN 1 END) AS num_with_changes,
           COUNT(CASE WHEN COALESCE(extras, '') = ''
                       AND COALESCE(exclusions, '') = '' THEN 1 END) AS num_with_no_changes
    FROM customer_orders co
    WHERE NOT EXISTS(SELECT 1 
                     FROM runner_orders ro 
                     WHERE co.order_id = ro.order_id 
                       AND NOT COALESCE(cancellation, '') = '')
    GROUP BY customer_id
    

    Output:

    customer_id num_with_changes num_with_no_changes
    101 0 2
    102 0 3
    103 3 0
    104 2 1
    105 1 0

    Check the demo here.