my base data:
Process ID | Location | Date | Timeliness |
---|---|---|---|
2030608 | New York | May 24 | in time |
2067393 | Ohio | May 24 | overdue |
1329306 | Ohio | May 24 | in time |
1740814 | Ohio | June 24 | overdue |
1924676 | Chicago | May 24 | overdue |
1985313 | Chicago | May 24 | overdue |
1715694 | San Fransisco | June 24 | in time |
1981887 | San Fransisco | June 24 | overdue |
1752416 | San Fransisco | July 24 | overdue |
1671006 | New York | May 24 | in time |
2030607 | New York | June 24 | in time |
and I want to aggregate it to use it in a visualisation tool.
Current output of my aggregation query:
Location | Date | intime | total | percentage |
---|---|---|---|---|
New York | May 24 | 3 | 3 | 100 |
Ohio | May 24 | 1 | 2 | 50 |
San Fransisco | June 24 | 1 | 2 | 50 |
My problem: certain data drops out of the aggregation because there are no intime processes at that location and month.
The output that I need: (follows directly from the base data above)
Location | Date | intime | total | percentage |
---|---|---|---|---|
New York | May 24 | 3 | 3 | 100 |
Ohio | May 24 | 1 | 2 | 50 |
Ohio | June 24 | 0 | 1 | 0 |
San Fransisco | June 24 | 1 | 2 | 50 |
San Fransisco | July 24 | 0 | 1 | 0 |
Chicago | May 24 | 0 | 2 | 0 |
The query that I am currently using for aggregation is:
SELECT
gesamt2.Location,
gesamt2.date,
intime2.anzahl AS intime,
gesamt2.anzahl AS total,
CAST(intime2.anzahl AS float) / CAST(gesamt2.anzahl AS float) * 100 AS percentage
FROM
(
SELECT
location,
anzahl,
date
FROM
(
SELECT
Location,
Date,
timeliness,
COUNT(1) AS anzahl
FROM
basedata
GROUP BY ROLLUP (Location, Date, Timeliness)
) AS gesamt
WHERE
(timeliness IS NULL)
) AS gesamt2
INNER JOIN
(
SELECT
mapping, anzahl, date
FROM
(
SELECT
Location,
Date,
timeliness,
COUNT(1) AS anzahl
FROM
basedata
WHERE
(BASE_FINAL_APPROVAL > GETDATE() - 366)
GROUP BY ROLLUP (Location, Date, timeliness)
) AS intime
WHERE
(timeliness = 'in time')
) AS intime2 ON
gesamt2.Location = intime2.Location
AND
gesamt2.date = intime2.date
The query is much simpler if you use conditional aggregation.
SELECT *, ROUND(intime/total*100, 0) as percentage
FROM
(SELECT Location,
[Date],
SUM(CASE WHEN Timeliness='in time' THEN 1 ELSE 0 END) as intime,
COUNT(*) as total
FROM YourTable
GROUP BY Location, [Date]) agg
;
or with CTE
WITH agg AS
(SELECT Location,
[Date],
SUM(CASE WHEN Timeliness='in time' THEN 1 ELSE 0 END) as intime,
COUNT(*) as total
FROM YourTable
GROUP BY Location, [Date])
SELECT *, ROUND(intime/total*100, 0) percentage
FROM agg
The above was not tested in SQL Server.