sql-server indexing query-optimization azure-sql-managed-instance composite-index

Composite index on three columns

We have a 93M-row mapping table that stores the relationships between three tables (Seller, Store and Product). We are facing performance issues when reading data from the mapping table.

TableName	PK information	Number of Rows
Seller	SellerId Primary Key	3000 rows
Store	StoreId Primary Key	20000 rows
Product	ProductId Primary Key	200k rows
SellerStoreProductMapping	SellerId,ProductId,StoreId Composite non-clustered index. There is also one surrogate key in this table: SellerStoreProductMappingId which is used as Primary Key clustered	93M rows

Our queries can filter on any combination of the three columns (Seller, Product, Store) in the 93M-row table.

My actual query is like this:

SELECT < many columns from four tables>
FROM SellerStoreProductMapping
INNER JOIN Store
INNER JOIN Seller
INNER JOIN Product ...
WHERE SellerId = 123

However, the non-clustered index on (SellerId, ProductId, StoreId) is rarely used by our queries, even when we filter on SellerId. Instead, the plan goes for an index scan on StoreId.

Our question is, for this three-column combination:

Should we go for separate single-column non-clustered indexes (3 indexes)?
Should we go for two-column composite non-clustered indexes (4 indexes)?
Should we go for three-column composite non-clustered indexes (9 indexes)? Rather than creating all 9, we would limit them to specific usage scenarios.

Note: We cannot create a clustered columnstore index, because one of the columns in the mapping table has the ROWVERSION data type.

Solution

Your index will be used consistently only if it covers every column the query references and if the predicate is sargable, that is, written so that it can be answered with an index seek.
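To make "sargable" concrete, here is a minimal sketch against the mapping table from the question. The column name SellerId is taken from the question; the second query is a deliberately contrived counter-example, not something from the original workload.

```sql
-- Sargable: the indexed column is compared directly to a value,
-- so a seek on an index whose leading key is SellerId is possible.
SELECT SellerId, ProductId, StoreId
FROM   SellerStoreProductMapping
WHERE  SellerId = 123;

-- Typically not sargable: the column is wrapped in an expression,
-- so the optimizer cannot seek on SellerId and has to evaluate
-- the expression for every row it reads (a scan).
SELECT SellerId, ProductId, StoreId
FROM   SellerStoreProductMapping
WHERE  SellerId + 0 = 123;
```

The same effect usually appears when a function is applied to the column or when an implicit data type conversion is applied to the column side of the comparison.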

As an example, consider these queries:

-- 1
SELECT *
FROM   SellerStoreProductMapping
WHERE  SellerId = 1 AND ProductId = 2 AND StoreId = 1;
-- 2
SELECT SellerId, ProductId, StoreId
FROM   SellerStoreProductMapping
WHERE  SellerId = 1 AND ProductId = 2 AND StoreId = 1;
-- 3
SELECT anyOtherColumns
FROM   SellerStoreProductMapping
WHERE  SellerId = 1 AND ProductId = 2 AND StoreId = 1;
-- 4
SELECT SellerId, ProductId, StoreId
FROM   SellerStoreProductMapping
WHERE  SellerId = 1 AND ProductId = 2 AND StoreId = 1
ORDER  BY anyOtherColumns;

Only query 2 will use the index systematically. The other queries (1, 3 and 4) reference columns that are not in the index key, so they need a double read:

a first read (a seek in the index) to find the candidate rows that satisfy the predicate
a second read in the table (a lookup through the clustered index) to fetch the columns that the index does not contain

The cost of these two reads is compared with the cost of other strategies, such as scanning the table. If the estimated cost of the scan is lower, the index won't be used.
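One common way to avoid the second read is to make the index covering by adding the extra columns with INCLUDE. If the rows were spread evenly, 93M rows over 3,000 sellers would be roughly 31,000 rows per SellerId, so the lookup cost for WHERE SellerId = 123 can add up quickly. The following is only a sketch under assumptions: the index name and the included columns Price and IsActive are hypothetical placeholders for whatever columns the real query selects from the mapping table.

```sql
-- Hypothetical covering index: key columns support the seek on SellerId,
-- INCLUDE columns (placeholders here) remove the need for the lookup.
CREATE NONCLUSTERED INDEX IX_SellerStoreProductMapping_Seller_Covering
    ON SellerStoreProductMapping (SellerId, ProductId, StoreId)
    INCLUDE (Price, IsActive);

-- Report logical reads per table so the plans can be compared
-- before and after creating the index.
SET STATISTICS IO ON;

-- Every column referenced here is in the index (key or INCLUDE),
-- so the query can be answered from the index alone.
SELECT SellerId, ProductId, StoreId, Price, IsActive
FROM   SellerStoreProductMapping
WHERE  SellerId = 123;

SET STATISTICS IO OFF;
```

Included columns make the index larger and add write overhead on a 93M-row table, so it is probably worth including only the columns the hot queries actually read, and checking the logical reads reported by STATISTICS IO to confirm the change helps.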