sql-serverindexingcomposite-index

Index for two columns that will be used in an operation on where clause?


In SQL Server, I have a table order_lines with ordered and served columns.

What should I do in order to make the following query not scan the whole table?

SELECT *
FROM order_lines
WHERE ordered - served > 0

Option 1: index both columns. This shouldn't work as SQL Server should still need to scan all the index to perform the subtract and check if the row is valid for the query, right?

Option 2: create a "pending" column and index that

Option 3: any other option?


Solution

  • ordered - served > 0 isn't generally sargable.

    Nor can it be rearranged to be sargable by expressing it as ordered > served. Column to column comparisons are not sargable.

    You can't sensibly use an index seek to find a row if the value to use in the seek is contained in the row itself.

    ordered - served > 0 can be made sargable by using a computed column however.

    Assuming your original table is something like

    CREATE TABLE order_lines
    (
    order_line_id INT NOT NULL PRIMARY KEY,
    order_id INT NOT NULL,
    product_id INT NOT NULL,
    /*Other columns*/
    ordered  INT NOT NULL,
    served  INT NOT NULL
    )
    

    You can create a computed column...

    ALTER TABLE order_lines ADD pending AS ordered - served;
    

    And then index that computed column

    CREATE INDEX IX_pending ON order_lines(pending);
    

    In order to "cover" the query given in the question the index would need an INCLUDE clause containing every column in the table which is very unlikely to be justified by a cost/benefit analysis here.

    So if the query uses the above index it will also need to do lookups back to the base table to get the missing columns and this query plan will only be chosen if the predicate is fairly selective. The missing index warning in the plan below is complaining about the non included columns.

    execution plan showing filtered index seek, key lookups, and missing index warning

    Do you really need to use * here or can you select fewer columns and then include those in the index?

    Assuming unfulfilled order lines are only a small proportion of the table and you are primarily interested in finding these then a potential middle ground could have been

    CREATE INDEX IX_pending ON order_lines(pending)
    INCLUDE (order_line_id,order_id,product_id, ordered, served)
    WHERE pending > 0; 
    

    ⚠️ But this is not a valid index creation as the filter predicate can not be a computed column (or be an expression like ordered > served)

    An indexed view would allow you to do much the same thing though

    CREATE VIEW dbo.pending_order_lines
    WITH SCHEMABINDING
    AS
    SELECT order_line_id, order_id, product_id, ordered, served
    FROM   dbo.order_lines 
    WHERE ordered > served; 
    
    GO
    
    CREATE UNIQUE CLUSTERED INDEX UCIX ON dbo.pending_order_lines(order_line_id)
    

    And then with example usage

    SELECT *
    FROM dbo.pending_order_lines WITH(NOEXPAND)
    

    Depending on the edition of the engine you are using the NOEXPAND hint may or may not be required in order to use indexed views effectively.