t-sqlquery-optimization

Why does TRY_CAST() / TRY_CONVERT() take so overly long for a view outer joined on itself but each side filtered on another String?


I have a daily query that runs in 5 seconds with an output of roughly 50,000 rows:

SELECT
  A.Id
, CAST(NULLIF(A.[value], '') AS DATE) AS MyDateColumn
, B.[value] AS MyStringColumn
FROM [MyServer].[MyDatabase].[MySchema].[MyView] A

FULL OUTER JOIN [MyServer].[MyDatabase].[MySchema].[MyView] B
ON B.Id = A.Id AND B.field_name = 'MyDummyTextToGetAStringInTheValueColumn'

WHERE A.field_name = 'MyDummyTextToGetADateInTheValueColumn'
ORDER BY Id

The query failed since there was a Date-String that led to a conversion error, it later turned out to be a '00000000' Date-String that threw the error. I needed to find out just that. To filter out only the mistaken Date-Strings that throw the error, a TRY_CAST() shall try to convert the String to a Date and if that fails, it throws out a NULL, which I then filtered out with is null. It later turned out that only two of the filled Date Strings were wrong ('00000000' instead of something like '20240720'). But at first, this query took far too long time:

SELECT DISTINCT
  A.Id
--, TRY_CAST(NULLIF(A.[value], '') AS DATE) AS MyDateColumn
, A.[value]
, B.[value] AS MyValueColumn
FROM [MyServer].[MyDatabase].[MySchema].[MyView] A

FULL OUTER JOIN [MyServer].[MyDatabase].[MySchema].[MyView] B
ON B.Id = A.Id AND B.field_name = 'MyDummyTextToGetAStringInTheValueColumn'

WHERE A.field_name = 'MyDummyTextToGetADateInTheValueColumn'
and TRY_CAST(NULLIF(A.[value], '') AS DATE) is null and A.[value] is not null
ORDER BY Id

I had to force stop it. Why does this query take so long for the outer join?

Table column A.b has:

The left "table" is a view filtered on a String and the right "table" is the same view but filtered on another String. Thus, it is a self-join.

The key in the view is unique, the output of the whole query will be just 50,000 rows. For the first 10,000 rows, it needs 20 minutes, and the rows come in slowly, I checked it many times with CtrlEnd in the result window. The outer join works, I can see the rows coming in slowly without any duplicates.

I first thought that it is since the TRY_CAST() is both in the SELECT and WHERE condition. Coding it like that did not make any sense anyway since I was searching for dates that could not be converted (CAST(NULLIF(A.b, '') AS DATE), it needs to be in the WHERE condition or the SELECT, not in both, since I either want to see only the failed conversions or all rows with a NULL for each failed conversion, but not both, if I only try to find the Id:s of the failed conversions, see the examples at Conversion failed when converting date and/or time from character string while inserting datetime.

I dropped the TRY_CAST() from the SELECT, and the query was still too slow. Here is the query with table constructrors instead of views so that it can be run for a quick check since I cannot share the connection to the data.

WITH A as 
(
    SELECT
    a
    , b
    , c 
    FROM 
    (
    VALUES
    (1, '20240630', 'xyz') 
    ,(2, '20240631', 'xyz') -- which is not a date
    -- ... a dozen million rows which become only 50,000 when filtered by a String (the linked value to it is a date-String column)
    )
    AS MyTable(a, b, c)
)
,
B as
(
    SELECT
    a 
    , d
    FROM 
    (
    VALUES
      (1, 'test1')
    , (2, 'test2')
    -- ... a dozen million rows which become only 15,000 when filtered by a String (the linked value to it is a String column)
    )
    AS MyTable(a, d)
)
SELECT 
A.a
, NULLIF(A.b, '') as b
--, CAST(NULLIF(A.b, '') AS DATE) -- this throws: "Conversion failed when converting date and/or time from character string"
--, TRY_CAST(NULLIF(A.b, '') AS DATE) as b_check
, B.d
from A
FULL OUTER JOIN
B
ON A.a = B.a
WHERE A.c = 'xyz' -- this shrinks down the view A from 13 million to 50,000 rows
and B.d = 'test1' -- this shrinks down the view B from 13 million to 15,000 rows 
AND TRY_CAST(NULLIF(A.b, '') AS DATE) IS NULL AND A.b IS NOT NULL

The query is fast if I run it without the outer join. But it does not seem to be the outer join itself that makes it so slow since the same query, without the AND TRY_CAST(NULLIF(A.b, '') AS DATE) IS NULL AND A.b IS NOT NULL condition, runs in 8 seconds every day, and the only thing that is new here is that I try to find the few Id:s that throw the conversion error so that I put the TRY_CAST() condition.

Why does TRY_CAST() / TRY_CONVERT() take so overly long for a query with 50K rows (filtered from a dozen million) outer joined to one with 15K rows (filtered from a dozen million)?

More on A and B

Both A and B are the same view, which has a dozen million of rows unless filtered like it is done here. The view fetches data with a parametrised cursor (that gets the passed arguments) from indexed tables. It then gets its data from inner joins of quite a few indexed tables. A and B are just two filters on the same view. A and B themselves are not materialized, I only treat them as tables here to show it as an example with table constructors that can be run for a quick test. I cannot share the data.

Both are not indexed since they come from the filtered view, but filtering the view by the String runs in seconds, and after the filter, they are small: After filtering A and B by a String, they are shrunk down to A=50,000 rows and B=15,000 rows. That is why I do not think that indexes will change the speed of the query (unless the WHERE condition is run only after the parametrised view that runs a cursor, but again: if you filter the view with the String condition, it takes seconds to get the output).


Solution

  • The TRY_CAST() does not make the query so slow. It is the 80 percent empty String values ('') in column b of the left A table that are not filtered out if you filter against NULL. There are very few NULL values, perhaps a percent, which get filtered out by the WHERE condition, but roughly 40,000 empty Strings that do not get filtered if you check against a NULL.

    Instead of:

    WHERE TRY_CAST(NULLIF(A.b, '') AS DATE) IS NULL AND A.b IS NOT NULL

    ..you need to code:

    WHERE TRY_CAST(NULLIF(A.b, '') AS DATE) IS NULL AND NULLIF(A.b,'') IS NOT NULL

    Or, which might be shorter and faster, with the same output (thank you for the hint in the remarks):

    WHERE TRY_CAST(NULLIF(A.b, '') AS DATE) IS NULL AND A.b > ''

    I still do not know why TSQL takes such a long time if you look up the Id:s of 50,000 rows in the 15,000 rows. It might be the DISTINCT together with the TRY_CAST() and the cursor that runs in the view, but I can quickly shrink down the view to just 50,000 and 15,000 rows so that it is not about checking a dozen million rows with a cursor, and that cursor seems to run fast. But I know that if you do not set the roughly 40,000 emtpy Strings out of 50,000 rows in A.b to NULL in the WHERE condition before you filter against NULL, you still have those 80 percent unneeded lookups.

    With NULLIF(A.b,'') IS NOT NULL or A.b > '', the query takes 4 seconds.

    Takeaway

    The query time needs over an hour for 50,000 rows where I mistakenly filter against NULL. If I rightly filter against NULL and empty Strings (which together comes up to > ''), I need to lookup only 2 rows and the query is done in 2 seconds.