mysqlgroup-bycount

MySQL: How to count rows (on different conditions) on a filtered data set?


With my limited knowledge of MySQL I cannot find a proper syntax for what I want to obtain in MySQL (MariaDB 10.0; it is not possible to upgrade it to more recent releases):

  1. upload the last 24h of data (in this example: CPU_load; in real life it could be a computationally heavy filtering and not a simple last 24h like in this example);

  2. count the values that satisfy different conditions upon the filtered set.

The single query "how many values are above 95% in the last 24 hours?" is straightforward:

SELECT COUNT(*)
FROM (
    SELECT *
    FROM `server-statistics` 
    WHERE `Date_Time` >= DATE_SUB(UTC_TIMESTAMP(), INTERVAL 1 DAY) 
) as H24
WHERE CPU_load > 95

For the sake of explanation I left the (here useless) definition of H24.

Problem

If I am going to count on multiple conditions (i.e.: >90% in the last 24h, >95% in the last 24h, >98% in the last 24h, etc.) with a number of queries like the previous one, the filtered group H24 is recalculated over and over.

This is not a real problem if it is cached among calls, but I would prefer to count the rows on multiple conditions in a single shot.

In pseudo-SQL it should be something like:

SELECT COUNT(*) WHERE CPU_load > 90,
       COUNT(*) WHERE CPU_load > 95,
       COUNT(*) WHERE CPU_load > 98
FROM `server-statistics` 
WHERE `Date_Time` >= DATE_SUB(UTC_TIMESTAMP(), INTERVAL 1 DAY) 

Is it possible in pure MySQL?


Solution

  • You can use CASE for the conditional count as below

    SELECT SUM(CASE WHEN CPU_load BETWEEN 90 AND 94 THEN 1 ELSE 0 END) AS A1,
           SUM(CASE WHEN CPU_load BETWEEN 95 AND 97 THEN 1 ELSE 0 END) AS A2,
           SUM(CASE WHEN CPU_load > 98 THEN 1 ELSE 0 END) AS A3
    FROM `server-statistics` 
    WHERE `Date_Time` >= DATE_SUB(UTC_TIMESTAMP(), INTERVAL 1 DAY)
    

    It seems more meaning to count the data on range but if you still want to stay on your condition then use below instead:

    SUM(CASE WHEN CPU_load > 90 THEN 1 ELSE 0 END) AS A1
    SUM(CASE WHEN CPU_load > 95 THEN 1 ELSE 0 END) AS A2
    SUM(CASE WHEN CPU_load > 98 THEN 1 ELSE 0 END) AS A3
    

    To address decimal values as 94.5

    --Adjust the comparison value according to your requirement just provided the example.
    
    SUM(CASE WHEN CPU_load > 90 AND CPU_load <= 95 THEN 1 ELSE 0 END) AS A1
    SUM(CASE WHEN CPU_load > 95 AND CPU_load <= 98 THEN 1 ELSE 0 END) AS A2
    SUM(CASE WHEN CPU_load > 98 THEN 1 ELSE 0 END) AS A3