I have several Windows VMs running on Azure that are configured to collect performance counters and event logs.
All of this is configured in the "Diagnostic settings..." on the VM resource inside Azure Portal. There's a Windows Azure Diagnostics agent that collects this data on the VM and stores it into a storage account (inside Table Storage).
All of this collected data (performance counters, metrics, logs, etc.) doesn't have any retention policy and there doesn't seem to be any way of setting it up. So it just accumulates in the storage account's table storage forever.
This is where my problem is -- there's now too much data in these tables (several terabytes in my case) and it's costing a lot of money just to keep it. And it's only going to keep increasing over time.
The relevant storage account tables are:
- WADMetrics* (Windows Azure Diagnostics Metrics Table)
- WADPerformanceCountersTable (Windows Azure Diagnostics Performance Counters Table)
- WADWindowsEventLogsTable (Windows Azure Diagnostics Windows Event Logs Table)

Is there some way to delete old data in these tables so it wouldn't break anything? Or even better, is there some way to configure retention policy or set it up so that it doesn't keep accumulating forever?
Is there some way to delete old data in these tables so it wouldn't break anything?
You would need to do this manually: first query the data that needs to be deleted, then delete what the query returns. The PartitionKey attribute of the entities stored in these tables represents a date/time value (in ticks, prepended with zeroes to make an equal-length string). So you would take the from and to date/time values, convert them to ticks, pad each to a 19-character string by prepending the appropriate number of zeroes, and query the data. Once you get the data on the client side, you send delete requests back to table storage.
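The PartitionKey conversion described above can be sketched in Python as follows. This is a minimal sketch, assuming the ticks are .NET ticks (100-nanosecond intervals since 0001-01-01) and that the timestamps are in UTC; the function names `to_partition_key` and `partition_key_filter` are made up for illustration.

```python
from datetime import datetime, timezone

# Assumption: ticks are .NET ticks, counted from 0001-01-01T00:00:00.
_DOTNET_EPOCH = datetime(1, 1, 1)
_TICKS_PER_SECOND = 10_000_000  # one tick is 100 nanoseconds


def to_partition_key(moment: datetime) -> str:
    """Convert a date/time to a 19-character, zero-padded ticks string."""
    # Assumption: naive datetimes are already UTC; aware ones are converted.
    if moment.tzinfo is not None:
        moment = moment.astimezone(timezone.utc).replace(tzinfo=None)
    delta = moment - _DOTNET_EPOCH
    # Integer arithmetic only, to avoid floating-point rounding on large values.
    ticks = (delta.days * 86_400 + delta.seconds) * _TICKS_PER_SECOND
    ticks += delta.microseconds * 10
    return f"{ticks:019d}"


def partition_key_filter(start: datetime, end: datetime) -> str:
    """Build a filter selecting entities with start <= PartitionKey < end."""
    return (
        f"PartitionKey ge '{to_partition_key(start)}' "
        f"and PartitionKey lt '{to_partition_key(end)}'"
    )


if __name__ == "__main__":
    print(to_partition_key(datetime(1970, 1, 1)))  # 0621355968000000000
    print(partition_key_filter(datetime(2020, 1, 1), datetime(2020, 2, 1)))
```

Because the keys are fixed-length strings, comparing them as strings gives the same order as comparing the underlying dates, which is what makes the range filter work.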
To speed up the whole process, there are a few things you could do:
- Fetch only the PartitionKey and RowKey attributes, as only these two attributes are needed for deletion.

I wrote a blog post some time ago that you may find helpful: https://gauravmantri.com/2012/02/17/effective-way-of-fetching-diagnostics-data-from-windows-azure-diagnostics-table-hint-use-partitionkey/.
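As a rough illustration of the query-then-delete approach, here is a minimal Python sketch. It assumes a client object shaped like `TableClient` from the `azure-data-tables` package, using its `query_entities(query_filter=..., select=...)` and `delete_entity(partition_key=..., row_key=...)` methods; the function name `delete_old_entities` and its parameters are hypothetical. Try it on a small date range first, since deletion is irreversible.

```python
def delete_old_entities(table_client, from_key: str, to_key: str) -> int:
    """Delete entities with from_key <= PartitionKey < to_key.

    table_client is assumed to behave like azure.data.tables.TableClient.
    from_key and to_key are 19-character, zero-padded ticks strings.
    Returns the number of entities deleted.
    """
    query_filter = f"PartitionKey ge '{from_key}' and PartitionKey lt '{to_key}'"

    # Project only the two key attributes: nothing else is needed to delete.
    entities = table_client.query_entities(
        query_filter=query_filter,
        select=["PartitionKey", "RowKey"],
    )

    deleted = 0
    for entity in entities:
        # One delete request per entity, sent back to table storage.
        table_client.delete_entity(
            partition_key=entity["PartitionKey"],
            row_key=entity["RowKey"],
        )
        deleted += 1
    return deleted
```

With the real SDK, the client would typically be created with something like `TableClient.from_connection_string(conn_str, table_name="WADPerformanceCountersTable")`, where `conn_str` is your storage account connection string.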
Or even better, is there some way to configure retention policy or set it up so that it doesn't keep accumulating forever?
Unfortunately there isn't, at least as of today. There's a retention setting, but that's only for blobs.