I have a table mytable with some columns including the column datekey
(which is a date
and has an index), a column contents
which is a varbinary(max)
, and a column stringhash
which is a varchar(100)
. The stringhash
and the datekey
together form the primary key of the table. Everything is running on my local machine.
Running
SELECT TOP 1 * FROM mytable where datekey='2012-12-05'
returns 0 rows and takes 0 seconds.
But if I add a datalength
condition:
SELECT TOP 1 * FROM mytable where datekey='2012-12-05' and datalength(contents)=0
it runs for a very long time and does not return anything before I give up waiting.
My question: Why? How do I find out why this takes such a long time?
Here is what I checked so far:
When I click "Display estimated execution plan" it also takes a very long time and does not return anything before I give up waiting.
If I do
SELECT TOP 1000 datalength(contents) FROM mytable order by datalength(contents) desc
it takes 7 seconds and returns a list 4228081, 4218689 etc.
exec sp_spaceused 'mytable'
returns
rows reserved data index_size unused
564019 50755752 KB 50705672 KB 42928 KB 7152 KB
So the table is quite large at 50 GB. Running
SELECT TOP 1000 * FROM mytable
takes 26 seconds.
The sqlservr.exe process is around 6 GB which is the limit I have set for the database.
It takes a long time because your query needs DATALENGTH to be evaluated for every row and then the results sorted before it can return the 1st record. If the DATALENGTH of the field (or whether it contains any value) is something you're likely to query repeatedly, I would suggest an additional indexed field (perhaps a persisted computed field) holding the result, and searching on that.