I'm trying to visualize the browser statistics of our app hosted in Azure.
For that I'm using the nginx logs and run an Azure Log Analytics query like this:
ContainerLog
| where LogEntrySource == "stdout" and LogEntry has "nginx"
| extend logEntry=parse_json(LogEntry)
| extend userAgent=parse_user_agent(logEntry.nginx.http_user_agent, "browser")
| extend browser=parse_json(userAgent)
| summarize count=count() by tostring(browser.Browser.Family)
| sort by ['count']
| render piechart with (legend=hidden)
Then I'm getting this diagram, which is exactly what I want:
But the query is very very slow. If I set the time range to more than just the last few hours it takes several minutes or doesn't work at all.
My solution is to use a search job like this:
ContainerLog
| where LogEntrySource == "stdout" and LogEntry has "nginx"
| extend d=parse_json(LogEntry)
| extend user_agent=parse_user_agent(d.nginx.http_user_agent, "browser")
| extend browser=parse_json(user_agent)
It creates a new table BrowserStats_SRCH on which I can do this search query:
BrowserStats_SRCH
| summarize count=count() by tostring(browser.Browser.Family)
| sort by ['count']
| render piechart with (legend=hidden)
This is much faster now and only takes some seconds.
But my problem is, how can I keep this up-to-date? Preferably this search job would run once a day automatically and refreshed the BrowserStats_SRCH table so that new queries on that table run always on the most recent logs. Is this possible? Right now I can't even trigger the search job manually again, because then I get the error "A destination table with this name already exists".
In the end I would like to have a deeplink to the pie chart with the browser stats without the need to do any further click. Any help would be appreciated.
But my problem is, how can I keep this up-to-date? Preferably this search job would run once a day automatically and refreshed the BrowserStats_SRCH table so that new queries on that table run always on the most recent logs. Is this possible?
You can leverage the api to create a search job. Then use a timer triggered azure function or logic app to call that api on a schedule.
PUT https://management.azure.com/subscriptions/00000000-0000-0000-0000-00000000000/resourcegroups/testRG/providers/Microsoft.OperationalInsights/workspaces/testWS/tables/Syslog_suspected_SRCH?api-version=2021-12-01-preview
with a request body containing the query
{
"properties": {
"searchResults": {
"query": "Syslog | where * has 'suspected.exe'",
"limit": 1000,
"startSearchTime": "2020-01-01T00:00:00Z",
"endSearchTime": "2020-01-31T00:00:00Z"
}
}
}
Or you can use the Azure CLI:
az monitor log-analytics workspace table search-job create --subscription ContosoSID --resource-group ContosoRG --workspace-name ContosoWorkspace --name HeartbeatByIp_SRCH --search-query 'Heartbeat | where ComputerIP has "00.000.00.000"' --limit 1500 --start-search-time "2022-01-01T00:00:00.000Z" --end-search-time "2022-01-08T00:00:00.000Z" --no-wait
Right now I can't even trigger the search job manually again, because then I get the error "A destination table with this name already exists".
Before you start the job as described above, remove the old result table using an api call:
DELETE https://management.azure.com/subscriptions/{subscriptionId}/resourcegroups/{resourceGroupName}/providers/Microsoft.OperationalInsights/workspaces/{workspaceName}/tables/{tableName}?api-version=2021-12-01-preview
Optionally, you could check the status of the job using this api before you delete it to make sure it is not InProgress
or Deleting