kqlazure-log-analyticsazure-diagnostics

How better I can optimize this Kusto Query to get my logs


I have below query which I am running and getting logs for Azure K8s, but its takes hour to generate the logs and i am hoping there is a better way to write what i have already written. Can some Kusto experts advice here as how can I better the performance?

AzureDiagnostics 
| where Category == 'kube-audit'
| where TimeGenerated between (startofday(datetime("2022-03-26")) .. endofday(datetime("2022-03-27"))) 
| where (strlen(log_s) >= 32000
and not(log_s has "aksService") 
and not(log_s has "system:serviceaccount:crossplane-system:crossplane")    
or strlen(log_s) < 32000
| extend op = parse_json(log_s) 
| where not(tostring(op.verb) in ("list", "get", "watch"))   
| where substring(tostring(op.responseStatus.code), 0, 1) == "2"
| where not(tostring(op.requestURI) in ("/apis/authorization.k8s.io/v1/selfsubjectaccessreviews"))
| extend user = op.user.username
| extend decision = tostring(parse_json(tostring(op.annotations)).["authorization.k8s.io/decision"])
| extend requestURI = tostring(op.requestURI)
| extend name = tostring(parse_json(tostring(op.objectRef)).name)
| extend namespace = tostring(parse_json(tostring(op.objectRef)).namespace)
| extend verb = tostring(op.verb)
| project TimeGenerated, SubscriptionId, ResourceId, namespace, name, requestURI, verb, decision, ['user']
| order by TimeGenerated asc

Solution

  • You could try starting your query as follow.
    Please note the additional condition at the end.

    AzureDiagnostics 
    | where TimeGenerated between (startofday(datetime("2022-03-26")) .. endofday(datetime("2022-03-27"))) 
    | where Category == 'kube-audit'
    | where log_s hasprefix '"code":2'
    

    I assumed that code is integer, in case it is string, use the following (added qualifier)

    | where log_s has prefix '"code":"2'