I'm building a few custom reports against a SQL Server 2005 database. The database belongs to a 3rd party management application we run. The data I'm pulling isn't among the primary purposes of the site, so the data is largely non-indexed except for the timestamp column. For now, there is only one table involved -- a table of some 700 million rows. So when I run a query against it that should return only 50 rows, it has to poll all 700mil.
I'm looking to speed this up, but don't want to index every column that I'm adding to the WHERE clause -- I don't know that adding that many indexes will end up improving the speed much (or am I wrong?). So I'm curious what the best practice would be if I couldn't add any new indexes to the table!
Stored procedure doesn't seem like the best fit. An indexed view might be the best idea? Thoughts?
Here's the table schema:
DeviceGuid (PK, uniqueidentifier, not null)
DeviceID (int, not null)
WindowsEventID (PK, int, not null) (indexed)
EventLog (varchar(64), not null)
EventSource (varchar(64), not null)
EventID (int, not null)
Severity (int, not null)
Description (nvarchar(max), not null)
TimeOfEvent (PK, datetime, not null) (indexed)
OccurrenceNbr (int, not null)
Here's a sample query:
SELECT COUNT(*) AS NumOcc, EventID, EventLog, EventSource, Severity, TimeOfEvent, Description
FROM WindowsEvent
WHERE DeviceID='34818'
AND Severity=1
AND TimeOfEvent >= DATEADD(hh, DATEDIFF(hh, GETDATE(), GETUTCDATE()), '2010/10/27 12:00:00 AM')
AND TimeOfEvent <= DATEADD(hh, DATEDIFF(hh, GETDATE(), GETUTCDATE()), '2010/11/3 12:00:00 AM')
AND EventID<>34113
AND EventID<>34114
AND EventID<>34112
AND EventID<>57755
AND EventSource<>'AutoImportSvc.exe'
AND EventLog='Application'
GROUP BY EventID, EventLog, EventSource, Severity, Description
ORDER BY NumOcc DESC
Maybe the query sucks ... it returns 53 rows in 4.5 minutes.
The final solution here was to run a query against the indexed fields, then filter them within the application running the query. Two fields ended up containing similar enough information that I could query against one index and get a very close approximation of the data I wanted. I looped back through and removed any non-matching entities from the result list. Took MUCH less time!