sql-servert-sqlcommon-table-expressionjson-query

In T-SQL, why is the ISJSON function not filtering out bad data when used in a CTE?


I have a table with a varchar(max) column that stores JSON data and one of the records has an extra comma at the end of an array in the JSON. The following statement results in an error

Unexpected character ',' is found...

Code:

SELECT JSON_QUERY(JsonField, '$.SomeProperty') AS JsonData
FROM MyTable

But the following code works because of the WHERE clause:

SELECT JSON_QUERY(JsonField, '$.SomeProperty') AS JsonData
FROM MyTable 
WHERE ISJSON(JsonField) = 1

I can use that statement as a CTE as follows:

WITH cte AS 
(
    SELECT JSON_QUERY(JsonField, '$.SomeProperty') AS JsonData
    FROM MyTable 
    WHERE ISJSON(JsonField) = 1
)
SELECT JsonData 
FROM cte

But when I try to filter on the cte, I get the same error:

WITH cte AS 
(
    SELECT JSON_QUERY(JsonField, '$.SomeProperty') AS JsonData 
    FROM MyTable 
    WHERE ISJSON(JsonField) = 1
)
SELECT JsonData 
FROM cte 
WHERE LEN(JsonData) > 500

Obviously, I can move the 2nd WHERE clause up to the definition of the CTE, but there are other filters I want to use and this is the the most straight forward filter I can use to provide this example.

Am I missing something or is this a bug? My workaround is to use a temp table instead.


Solution

  • You can use the following expression so that the JSON_QUERY is only evaluated if the ISJSON condition is met.

    SELECT CASE WHEN ISJSON(JsonField) = 1 THEN JSON_QUERY(JsonField, '$.SomeProperty') END AS JsonData
    FROM MyTable 
    WHERE ISJSON(JsonField) = 1
    

    This is mostly reliable. The documentation says

    You should only depend on order of evaluation of the WHEN conditions for scalar expressions (including non-correlated subqueries that return scalars), not for aggregate expressions.

    And this is a scalar expression - not an aggregate. (One other case to be aware of is if you test this with a single literal value you can also encounter the error at plan compilation time as SQL Server tries to constant fold it but this shouldn't affect you as you are selecting from a table)

    But really you need to be using SQL Server 2022 so you can test ISJSON(JsonField, OBJECT).

    Otherwise the string '1' will pass the ISJSON check (as ISJSON(JsonField, VALUE) is 1) but still barf at the JSON_QUERY.

    I suppose on earlier versions you could also add an additional condition (to both CASE and WHERE) that JsonField is LIKE '{%'