splunksplunk-query

Splunk - Grouping by distinct field with stats of another field


I have the following Splunk search that gathers the different Statuses from my dataset:

some type of search | eval Status = (REJECT_REASON) | bucket _time span=day | stats count by Status

This is a sample how my dataset looks:

CorrelationId      Reject_Reason    DATE_TIME
12345679           Accepted         20231030 14:00:00 
12345679           Accepted         20231030 14:00:00
12345679           Accepted         20231030 14:00:00
12345679           Sent             20231030 00:00:00
12345679           Sent             20231030 00:00:00
12345679           Sent             20231030 00:00:00
99399394           Rejected         20231030 00:00:00
99399394           Rejected         20231030 00:00:00
88393933           Accepted         20231030 14:00:00
88393933           Sent             20231030 00:00:00
33454545           Rejected         20231030 00:00:00

I would like to only get statuses for the distinct correlationId's, meaning that with the sample dataset I would only get back a count for 4 correlationId's and the statuses that are the latest date.

Example of desired result:

Status    Count
Accepted   2
Rejected   2

I have tried using "dedup correlationId" but it returned no results when I added that on to the search.


Solution

  • Based on the latest (as of Nov 1st) requirements below is my query:

    |makeresults count=11 | streamstats count
    | eval CorrelationID=case(count >=1 and count<=6, 12345679, count in (7,8), 99399394, count in (9,10), 88393933, count=11, 33454545),
           Reject_Reason=case(count in (1,2,3) OR count=9, "Accepted", count in (4,5,6) or count=10, "Sent", count in (7,8)  or count=11, "Rejected"),
           DATE_TIME=case(count in (1,2,3) or count=9, "20231030 14:00:00", true(), "20231030 00:00:00" )
    | fields - _time, count      
    ``` The above is test data setup ```
    | eval Status=Reject_Reason
    | eventstats max(DATE_TIME) as mx by CorrelationID
    | where DATE_TIME=mx
    | dedup CorrelationID, Status
    | stats dc(CorrelationID) as "Count" by Status