Using MariaDB, I am trying to get a monthly total of items that were created minus items that were deleted that month, for each month. If no items were deleted, the total should be just the number of items that were created that month. If more items were deleted than created, the total should be a negative number.
The table has a created_at
column which is never null and a deleted_at
column which is set once the item has been 'deleted'
To illustrate, the (simplified) schema is like this: TABLE Items:
+----------------------------------------------------------------------------+
| idItem | created_at | deleted_at |
+----------------------------------------------------------------------------+
| 1 | 2020-03-20T04:28:41.000+00:00 | 2021-07-27T02:36:05.000+00:00 |
| 2 | 2020-03-20T04:28:41.000+00:00 | 2021-07-27T02:36:05.000+00:00 |
| 3 | 2021-03-02T21:39:10.000+00:00 | ∅ |
| 4 | 2021-03-05T21:13:13.000+00:00 | ∅ |
| 5 | 2021-06-08T13:49:11.000+00:00 | 2021-07-27T02:36:05.000+00:00 |
| 6 | 2021-07-13T02:36:05.000+00:00 | ∅
| 7 | 2021-09-17T21:12:13.000+00:00 | ∅ |
+----------------------------------------------------------------------------+
The information I need is the monthly total that have not been deleted, like so:
+-----------------------------------+
| total_existing | during_month |
+-----------------------------------+
| 2 | 2020-03 | -- two were added
+-----------------------------------+
| 4 | 2021-03 | -- another two were created
+-----------------------------------+
| 5 | 2021-06 | -- another was added
+-----------------------------------+
| 3 | 2021-07 | -- three deleted, one added
+-----------------------------------+
| 4 | 2021-09 | -- one added
+-----------------------------------+
Ultimately, I need to display the total for each month. I've tried this but it's not right.
SELECT
count(created.idItem) AS monthly_created_count,
count(deleted.idItem) AS monthly_deleted_count,
count(created.idItem) - count(deleted.idItem) as total,
DATE_FORMAT(created.created_at, '%Y-%m') as created_month ,
DATE_FORMAT(deleted.deleted_at, '%Y-%m') as deleted_month
FROM
Item created
LEFT JOIN
Item deleted
ON
DATE_FORMAT(deleted.deleted_at, '%Y-%m') = DATE_FORMAT(created.created_at, '%Y-%m')
GROUP BY DATE_FORMAT(created.created_at, '%Y-%m'), DATE_FORMAT(deleted.deleted_at, '%Y-%m')
I keep thinking I'm so close, but when we look at the rows where the deleted_at dates are set, it's obvious I'm off the mark.
If you're looking for a cumulative total of rows created/deleted, one approach is COUNT the number of records created and deleted by month/year separately. Then join the counts together with UNION ALL and calculate the sum totals:
SELECT t.YearMonth
, SUM(t.TotalCreated) - SUM(t.TotalDeleted) AS TotalExisting
FROM (
SELECT DATE_FORMAT(created_at, '%Y-%m') AS YearMonth
, COUNT(*) AS TotalCreated
, 0 AS TotalDeleted
FROM Item
GROUP BY DATE_FORMAT(created_at, '%Y-%m')
UNION ALL
SELECT DATE_FORMAT(deleted_at, '%Y-%m') AS YearMonth
, 0 AS TotalCreated
, COUNT(*) AS TotalDeleted
FROM Item
WHERE deleted_at IS NOT NULL
GROUP BY DATE_FORMAT(deleted_at, '%Y-%m')
) t
GROUP BY t.YearMonth
ORDER BY t.YearMonth
Results:
YearMonth | TotalExisting
:-------- | ------------:
2020-03 | 2
2021-03 | 2
2021-06 | 1
2021-07 | -2
2021-09 | 1
Then wrap those statements in a CTE and use a Window Function to calculate the rolling total:
See also db<>fiddle
WITH cte AS (
SELECT t.YearMonth
, SUM(t.TotalCreated) - SUM(t.TotalDeleted) AS TotalExisting
FROM (
SELECT DATE_FORMAT(created_at, '%Y-%m') AS YearMonth
, COUNT(*) AS TotalCreated
, 0 AS TotalDeleted
FROM Item
GROUP BY DATE_FORMAT(created_at, '%Y-%m')
UNION ALL
SELECT DATE_FORMAT(deleted_at, '%Y-%m') AS YearMonth
, 0 AS TotalCreated
, COUNT(*) AS TotalDeleted
FROM Item
WHERE deleted_at IS NOT NULL
GROUP BY DATE_FORMAT(deleted_at, '%Y-%m')
) t
GROUP BY t.YearMonth
ORDER BY t.YearMonth
)
SELECT YearMonth, SUM(TotalExisting) OVER (ORDER BY YearMonth) AS TotalExisting
FROM cte;
Final Results:
YearMonth | TotalExisting
:-------- | ------------:
2020-03 | 2
2021-03 | 4
2021-06 | 5
2021-07 | 3
2021-09 | 4