jsonsql-serverindexingquery-optimizationvirtual-column

Create a virtual column and index to optimize a filter for an array in a JSON column


Here is the microsoft document for how to optimize a json column with out using OpenJson and only using Json_Value: https://learn.microsoft.com/en-us/sql/relational-databases/json/index-json-data?view=sql-server-ver15

My issue is that I have a JSON column that contains an array where I am trying to grab all keys called Test_ID from each element in the array to compare with a joined statements testId and, while it works, its relatively slow. Takes about 9 seconds for 400 rows. I am trying to speed this up exponentially and it seems the only way to do so is through the indexing mentioned in that article but I can not seem to figure out how to do it for an array.

My JSON is similar to this: '{"Property":{"Label":"0"},"Tests":[{"Test_ID":"GUID_HERE","Type":{"Label":" "},"Name":{"Label":" "},"Value":null,{"Test_ID":"GUID_HERE","Type":{"Label":" "},"Name":{"Label":" "},"Value":" "}]}'

Here is my scrubbed query

SELECT DISTINCT w.W_ID, 
       'Proc' ProcHeaderName, p.ProcNumber ProcValue, 
       'Class' ClassHeaderName, p.Class ClassValue 
INTO #Procs
FROM proc p
LEFT JOIN (SELECT wt.W_ID, wt.TestId 
    from TestValue wt where wt.IsDeleted = 0) as wtRow on wtRow.W_ID in (SELECT ID FROM #tmp) 
LEFT JOIN TableNameHere c on c.IsDeleted = 0 and c.col_ID in (SELECT col_ID FROM tmp)
WHERE p.IsDeleted = 0 and [dbo].[GetTestIdJson](c.Json, wtRow.TestId) = wtRow.TestId
    AND p.ProcNumber + ',' + p.RNumber = JSON_VALUE(c.Json,'$.Property.Label') + ',' + JSON_VALUE(c.Json,'$.Property.Label')
GROUP BY wtRow.W_ID, p.ProcNumber, p.Class

Solution

  • … indexed view … fiddle

    create table dbo.a (id int primary key, json nvarchar(max));   
    insert into dbo.a values(1, '{"Property":{"Label":"0"},"Tests":[{"Test_ID":"GUID_HERE1","Type":{"Label":" "},"Name":{"Label":" "},"Value":null},{"Test_ID":"GUID_HERE2","Type":{"Label":" "},"Name":{"Label":" "},"Value":" "}]}');
    insert into dbo.a values(2, '{"Property":{"Label":"0"},"Tests":[{"Test_ID":"GUID_HERE21","Type":{"Label":" "},"Name":{"Label":" "},"Value":null},{"Test_ID":"GUID_HERE22","Type":{"Label":" "},"Name":{"Label":" "},"Value":" "}]}');
    GO
    --numbers table
    create table dbo.n(n int primary key);
    insert into dbo.n values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10); --assume max 11 elements in Tests[]
    GO
    create view dbo.v
    with schemabinding
    as
    select a.id, n.n, json_value(a.json, concat('$.Tests[', n.n,'].Test_ID')) as Test_Id
    from dbo.a as a
    cross join dbo.n as n
    where json_value(a.json, concat('$.Tests[', n.n,'].Test_ID')) is not null;
    GO
    create unique clustered index vuidx on dbo.v(id,n);
    create index idTestId on dbo.v(Test_Id);
    GO
    select * from dbo.v
    GO
    set statistics xml on;
    select *
    from dbo.v with(noexpand)
    where Test_Id = 'GUID_HERE2';
    GO
    drop view if exists dbo.v;
    GO
    drop table if exists dbo.n;
    GO
    drop table if exists dbo.a;