splunksplunk-query

splunk query not returning the burst days count correctly for fiscal year


I have structured the splunk query to return the number of burst days. The query has two cases

  1. The fiscal time period the query must consider is from 1st Febuary of current year to 31st January of next year.

  2. When the query is run in the month of January it must still consider the same fiscal year. The query should still consider Febuary 1st of theprevious year.

Here is the query :

index=license_util_summary 
source="hourly data inventory summary generation per sourcetype"
| eval now_month = tonumber(strftime(now(), "%m"))
| eval now_year = tonumber(strftime(now(), "%Y"))
| eval fiscal_start = if(
    now_month==1, strptime("01-02-" . (now_year - 1), "%d-%m-%Y"), 
    strptime("01-02-" . now_year, "%d-%m-%Y")
    )
| eval fiscal_end = now()
| where _time >= fiscal_start AND _time < fiscal_end
| rename di_ind as Index di_sourcetype as Sourcetype
| bin span=1d _time
| stats sum(di_hour_mb) as total_size_in_MB by _time
| eval total_size_in_TB = total_size_in_MB / 1024 / 1024
| where total_size_in_TB >= 41
| stats count as burst_day_count

I expect the query to return the burst day count for the fiscal year (Feb 1st current year -> Jan 31st next year). In case query is run in the month of January then it is still in current fiscal year.


Solution

  • I think this question is quite interesting, although it might be a little off-topic here, since the query seems to be woriking just fine. However here are my 3 cents:

    1. It is bad practice to filter for time later on in the question. The same can be accomplished with a time filter of earliest=-1month@year+1month. See in this table how it works:
    When query is run after "-1month" after "@year" after "+1month"
    05-01-2025 05-12-2024 01-01-2024 01-02-2024
    27-02-2025 27-01-2025 01-01-2025 01-02-2025
    06-06-2025 06-05-2025 01-01-2025 01-02-2025

    Test it today by running earliest=-5month@year+5month to pretend the fiscal year was always starting in June -> you will get the data starting from June 1st 2025. And try earliest=-6month@year+6month for a fiscal year starting in July -> you will get the data starting from July 1st 2024.

    1. When you run the query (as the second most important step after setting the appropriate time) you should reduce the fields. Since you are only interested in the Date and the di_hour_mb field you should subset this in the very beginning.

    Thus your query would become something like this:

    index=license_util_summary 
    source="hourly data inventory summary generation per sourcetype"
    earliest=-1month@year+1month
    | fields _time di_hour_mb
    | bin span=1d _time
    | stats sum(di_hour_mb) AS total_size_in_MB BY _time
    | eval total_size_in_TB = total_size_in_MB / 1024 / 1024
    | where total_size_in_TB >= 41
    | stats count AS burst_day_count