sql-server-2008performancespatialspatial-index

How can I speed up this SQL Server Spatial query?


I have (what I think) is a simple SQL Server spatial query:

Grab all the USA States that exist inside some 4 sided polygon (ie. the viewport/bounding box of a web page's Google/Bing map)

SELECT CAST(2 AS TINYINT) AS LocationType, a.Name AS FullName, 
    StateId, a.Name, Boundary.STAsText() AS Boundary, 
    CentrePoint.STAsText() AS CentrePoint
FROM [dbo].[States] a
WHERE @BoundingBox.STIntersects(a.Boundary) = 1

It takes 6 seconds to run, which is too slow.

How can I debug this, or what do I need to fine tune it? Do I have any spatial indexes? I believe so.

/****** Object:  Index [SPATIAL_States_Boundary]    
        Script Date: 07/28/2010 18:03:17 ******/
CREATE SPATIAL INDEX [SPATIAL_States_Boundary] ON [dbo].[States] 
(
    [Boundary]
)USING  GEOGRAPHY_GRID 
WITH (
    GRIDS =(LEVEL_1 = HIGH,LEVEL_2 = HIGH,LEVEL_3 = HIGH,LEVEL_4 = HIGH), 
    CELLS_PER_OBJECT = 1024, PAD_INDEX  = OFF, SORT_IN_TEMPDB = OFF, 
    DROP_EXISTING = OFF, ALLOW_ROW_LOCKS  = ON, 
    ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO

Do I need to provide some more information on the GEOGRAPHY data which is returned? eg. number of points, etc? Or do I need to run profiler and give some stats from there?

Or are my Cells_per_object / Grids set incorrectly ( I really have no idea what I should be setting those values to, to be honest).

Update

After the first reply from @Bobs below confirming that the spatial index was not getting used because the Primary Key (clustered Index) would be faster than a non-clustered index on a table with 50 odd rows, I then tried to force the Spatial Index:

SELECT CAST(2 AS TINYINT) AS LocationType, a.Name AS FullName, 
    StateId, a.Name, Boundary.STAsText() AS Boundary, 
    CentrePoint.STAsText() AS CentrePoint
FROM [dbo].[States] a WITH (INDEX(SPATIAL_States_Boundary))
WHERE @BoundingBox.STIntersects(a.Boundary) = 1

Now the query runs instantly. So my new question is, why did that fix it?


Solution

  • It appears that you have an optimal plan for running the query. It’s going to be tough to improve on that. Here are some observations.

    The query is doing a Clustered Index Scan on the PK_States index. It’s not using the spatial index. This is because the query optimizer thinks it will be better to use the clustered index instead of any other index. Why? Probably because there are few rows in the States table (50 plus maybe a few others for Washington, D.C., Puerto Rico, etc.).

    SQL Server stores and retrieves data on 8KB pages. The row size (see Estimate Row Size) for the filter operation is 8052 bytes, which means there is one row per page and about 50 pages in the entire table. The query plan estimates that it will process about 18 of those rows (See Estimated Number of Rows). This is not a significant number of rows to process. My explanation doesn’t address extra pages that are part of the table, but the point is that the number is around 50 and not 50,000 pages.

    So, back to why it uses the PK_States index instead of the SPATIAL_States_Boundry index. The clustered index, by definition, contains the actual data for the table. A non-clustered index points to the page where the data exists, so there are more pages to retrieve. So, the non-clustered index becomes useful only when there are larger amounts of data.

    There may be things you can do to reduce the number of pages processes (e.g., use a covering index), but your current query is already well optimized and you won’t see much performance improvement.