group-byazure-data-explorerkqlanomaly-detection

Kusto, doing anomaly detection using series_decompose_anomalies and run that based on groupie


I am trying to write an anomaly detection code on Kusto, say checking the rate of selling bad apples compared to all between distributors. I want to make the series and have the bad_apples_per_all_t to get bad_apples_per_all per distributor on daily basis. It is giving me error :

Semantic error: Invalid summarize aggregation expressions usage: summarize must have at least one aggregate function. error comes from make-series line

any suggestions? Here is the sample code:

datatable(table_name: string, date: datetime, total_apples: int, total_bad_apples: int)
[
 
    [“distributor1", datetime(2023-06-01), 200, 10],
    ["distributor1", datetime(2023-06-02), 300, 20],
    ["distributor1", datetime(2023-06-03), 400, 30],
    ["distributor1", datetime(2023-06-04), 500, 40],
    ["distributor1", datetime(2023-06-05), 600, 50],
    ["distributor1", datetime(2023-06-06), 700, 60],
    ["distributor2", datetime(2023-06-01), 100, 5],
    ["distributor2", datetime(2023-06-02), 200, 10],
    ["distributor2", datetime(2023-06-03), 150, 8],
    ["distributor2", datetime(2023-06-04), 250, 12],
    ["distributor2", datetime(2023-06-05), 180, 9],
    ["distributor2", datetime(2023-06-06), 220, 11]
]



let start_date=(now()-70d);
let end_date=now();
table_name
|where date between (start_date .. end_date ) 
|summarize all_apples=sum(total_apples), bad_apples=sum(total_5g_drops) by date, distributor1, distributor2
|where all_apples>=100
|project bad_apples_per_all=round(iff(all_apples!=0, todouble(bad_apples)/todouble(all_apples), 0.000000000),10), date, distributor1, distributor2
|make-series bad_apples_per_all_t=bad_apples_per_all on date from start_date to end_date step 1d by distributor1, distributor2
|extend (anomalies, score, baseline) = series_decompose_anomalies(bad_apples_per_all_t, 1.5,-1,'linefit') 

Thanks


Solution

  • The error message you received indicates that there is an issue with the summarize aggregation function in the make-series query. The error message specifically states that the summarize function must have at least one aggregate function.

    To fix this error, you need to add an aggregate function to the summarize function. An aggregate function is a function that takes a set of values and returns a single value. Examples of aggregate functions include count(), sum(), avg(), min(), and max().

    Here is an example of how to use the sum() aggregate function in the summarize function:

    table
    |make-series bad_apples_per_all_t=sum(bad_apples_per_all) on date_col from start_date to end_date step 1d by table_name
    

    Since the requirement is to get all bad apples per distributor on daily basis, sum() aggregation is to be used in the code.

    Sample code:

    let table_name1 = datatable(table_name: string, date_col: datetime, total_apples: int, total_bad_apples: int)
    
    [
    
    "distributor1", datetime(2023-06-01), 200, 10,
    
    "distributor1", datetime(2023-06-02), 300, 20,
    
    "distributor1", datetime(2023-06-03), 400, 30,
    
    "distributor1", datetime(2023-06-04), 500, 40,
    
    "distributor1", datetime(2023-06-05), 600, 50,
    
    "distributor1", datetime(2023-06-06), 700, 60,
    
    "distributor2", datetime(2023-06-01), 100, 5,
    
    "distributor2", datetime(2023-06-02), 200, 10,
    
    "distributor2", datetime(2023-06-03), 150, 8,
    
    "distributor2", datetime(2023-06-04), 250, 12,
    
    "distributor2", datetime(2023-06-05), 180, 9,
    
    "distributor2", datetime(2023-06-06), 220, 11,
    
    ];
    
    let start_date=(now()-70d);
    
    let end_date=now();
    
    table_name1
    
    |where date_col between (start_date .. end_date )
    
    |summarize all_apples=sum(total_apples), bad_apples=sum(total_bad_apples) by date_col, table_name
    
    |where all_apples>=100
    
    |project bad_apples_per_all=round(iff(all_apples!=0, todouble(bad_apples)/todouble(all_apples), 0.000000000),10), date_col, table_name
    
    |make-series bad_apples_per_all_t=sum(bad_apples_per_all) on date_col from start_date to end_date step 1d by table_name
    
    |extend (anomalies, score, baseline) = series_decompose_anomalies(bad_apples_per_all_t, 1.5,-1,'linefit')
    

    fiddle

    Reference: make-series operator | Microsoft Learn