I have a database with a single table. The table includes a column called threat_group
, and another called post_date
. This query gives me a list of all posts for each threat_group
per weekday:
SELECT
distinct threat_group AS "Group"
,extract('dow' from post_date) AS "Weekday"
,count(post_id) AS "Reports"
FROM
ransomwatch_posts
WHERE
post_date BETWEEN '<start_date>' and '<end_date>'
group by 1,2
order by 2
Now, there are two issues from a usability/viewability perspective that I'm trying to solve for.
First, if there are no entries at all for a given threat_group
on a given day, there will just be no result. For example, if the group conti
does not have any posts on Sundays, there will not be any lines like this:
threat_group | weekday | reports |
---|---|---|
conti | 0 | 0 |
I would like to get NULL
(preferably 0) for the sake of building out graphs or tables.
Second, I want this data consolidated, I think through use of crosstab
, but I've never built a crosstab, so it looks like this:
threat_group | Sun | Mon | Tues | Wed | Thu | Fri | Sat |
---|---|---|---|---|---|---|---|
conti | 0 | 2 | 4 | 1 | 12 | 0 | 0 |
Can someone provide some assistance here? The best attempt I've come up with was this:
SELECT *
FROM crosstab(
$$
SELECT
distinct threat_group AS "Group"
,extract('dow' from post_date) AS "Weekday"
,count(post_id) AS "Reports"
FROM
ransomwatch_posts
WHERE
post_date BETWEEN '2022-04-01' and '2022-06-30'
group by 1,2
order by 2
$$
) as ct("Group" TEXT, "0" numeric, "1" numeric, "2" numeric, "3" numeric, "4" numeric, "5" numeric, "6" numeric)
which fails with:
ERROR: return and sql tuple descriptions are incompatible
To get NULL
for missing values, you need the 2-parameter variant of crosstab()
:
SELECT *
FROM crosstab(
$$
SELECT threat_group -- AS grp
, extract('isodow' FROM post_date) -- AS weekday
, count(*) -- AS reports
FROM ransomwatch_posts
WHERE post_date BETWEEN '2022-04-01' and '2022-06-30'
GROUP BY 1,2
ORDER BY 1,2
$$
, 'VALUES (1),(2),(3),(4),(5),(6),(7)'
) AS ct(grp text, mon int, tue int, wed int, thu int, fri int, sat int, sun int)
See:
And no DISTINCT
here.
And ORDER BY 1,2
.
And I suggest 'isodow' instead of 'dow' to get 1 - 7 for Mon - Sun.
And the faster count(*)
- unless the column post_id
can be NULL (which seems unlikely).
To get 0
instead of NULL
for missing values:
SELECT grp
, COALESCE(mon, 0) AS mon
, COALESCE(tue, 0) AS tue
, COALESCE(wed, 0) AS wed
, COALESCE(thu, 0) AS thu
, COALESCE(fri, 0) AS fri
, COALESCE(sat, 0) AS sat
, COALESCE(sun, 0) AS sun
FROM crosstab( ...