Currently I have two SQL query proposals which call DATALENGTH
on the same VARBINARY
column more than once, due to conditions in the WHERE
clause.
Do these multiple calls to DATALENGTH
in the same query interfere in the performance, or can the result of the DATALENGTH
calculation be optimized/cached by Microsoft SQL Server?
The goal is to build an EndsWith
SQL query for VARBINARY(MAX) FILESTREAM
columns.
In my case, I cannot add any extra indexes or columns to the table, however I'd like to understand what are the possible alternatives.
-- Table with BinData
DECLARE @foo TABLE(BinData VARBINARY(MAX));
INSERT INTO @foo (BinData) VALUES
(0x00001125), (0x00112500), (0x11250000),
(0x00000011), (0x00001100), (0x00110000), (0x11000000),
(0x112500001125);
-- BinKey
DECLARE @BinKey VARBINARY(MAX) = 0x1125;
-- Search for [BinData] ending with @BinKey --
What I have tried so far...
Proposal A - REVERSE
BinKey and BinData, and then use CHARINDEX
.
SELECT * FROM @foo WHERE
DATALENGTH(BinData) >= DATALENGTH(@BinKey) AND
CHARINDEX
(
CONVERT(VARCHAR(MAX), @BinKey),
CONVERT(VARCHAR(MAX), BinData)
) = 1 + DATALENGTH(BinData) - DATALENGTH(@BinKey);
Proposal B - Cut the ending part of the BinData and compare it to BinKey.
SELECT * FROM @foo WHERE
DATALENGTH(BinData) >= DATALENGTH(@BinKey) AND
SUBSTRING(
BinData,
1 + DATALENGTH(BinData) - DATALENGTH(@BinKey),
DATALENGTH(@BinKey)
) = @BinKey;
Proposal from James L. - Simply use the RIGHT
function.
SELECT * FROM @foo WHERE
RIGHT(BinData, DATALENGTH(@BinKey)) = @BinKey;
The result for the queries listed above must be:
0x00001125
0x112500001125
Is there any better algorithm for this? How well would it perform for larger blobs?
I'm not sure how to analyze how this will perform, but this does what you asked for:
-- Table with BinData
DECLARE @foo TABLE(BinData VARBINARY(MAX));
INSERT INTO @foo (BinData) VALUES
(0x00001125), (0x00112500), (0x11250000),
(0x00000011), (0x00001100), (0x00110000), (0x11000000),
(0x112500001125);
-- BinKey
DECLARE @BinKey VARBINARY(MAX) = 0x1125;
-- Search for [BinData] ending with @BinKey --
select *
from @foo
where right(BinData, datalength(@BinKey)) = @BinKey
Which returns the following:
BinData
----------------
0x00001125
0x112500001125
I added 10,000 records, each about 1kb in size. It took .020 seconds to query all of them. I also tried adding 10 records 1MB in size after the 10,000 1kb records. It took .037 seconds to run the query. When the 1kb or 1MB records end with 0x1125, then it increases the execution time by a little bit, because SSMS has to receive and format the record in the output window. 50 1kb rows that end in 0x1125 took .057 seconds to run. When one of the 1MB rows ended in 0x1125, it took .110 seconds to run. It's not very scientific, but it seems to perform adequately.