sql-serverperformanceentity-framework-4sql-execution-plansql-server-2008r2-express

Entity Framework cached query plan performance degrades with different parameters


I have the following problem.

Background

I'm trying to implement an autocomplete selector with MVC3, EF4 and jquery over a table wit 4.5 million records.

This is the table:

CREATE TABLE [dbo].[CONSTA] (
  [afpCUIT] nvarchar(11) COLLATE Modern_Spanish_CI_AS NOT NULL,
  [afpNombre] nvarchar(30) COLLATE Modern_Spanish_CI_AS NULL,
  [afpGanancias] varchar(2) COLLATE Modern_Spanish_CI_AS NULL,
  [afpIVA] varchar(2) COLLATE Modern_Spanish_CI_AS NULL,
  [afpMonot] varchar(2) COLLATE Modern_Spanish_CI_AS NULL,
  [afpIntSoc] varchar(1) COLLATE Modern_Spanish_CI_AS NULL,
  [afpEmpl] varchar(1) COLLATE Modern_Spanish_CI_AS NULL,
  [afpAct] varchar(2) COLLATE Modern_Spanish_CI_AS NULL,
  CONSTRAINT [CONSTA_pk] PRIMARY KEY CLUSTERED ([afpCUIT])
)
ON [PRIMARY]
GO

CREATE NONCLUSTERED INDEX [CONSTA_Nombre_idx] ON [dbo].[CONSTA]
  ([afpNombre])
WITH (
  PAD_INDEX = OFF,
  DROP_EXISTING = OFF,
  STATISTICS_NORECOMPUTE = OFF,
  SORT_IN_TEMPDB = OFF,
  ONLINE = OFF,
  ALLOW_ROW_LOCKS = OFF,
  ALLOW_PAGE_LOCKS = OFF)
ON [PRIMARY]
GO

The table is pretty static (it only needs a monthly batch update) and read only.

if somebody cares to download the records (54MB) this is the URL:

http://www.afip.gob.ar/genericos/cInscripcion/22102011.zip

and here is the record description:

http://www.afip.gob.ar/genericos/cInscripcion/archivoCompleto.asp

Here is the code of the app:

CONTROLLER:

public class AltaMasivaController : Controller
{
    //
    // GET: /AltaMasiva/

    public ActionResult Index()
    {
        return View();
    }

    public JsonResult GetUsers(string query)
    {
        CENT2Entities db = new CENT2Entities();
        bool isCUIT = true;

        for(int j = 0; j < query.Length; j++)
            if (! Char.IsDigit(query, j))
            {
                isCUIT = false;
                break;
            }

        if (isCUIT)
        {
            // nvarchar search
            var x = from u in db.CONSTA
                    where u.afpCUIT.StartsWith(query)
                    orderby u.afpNombre
                    select new { label = u.afpNombre.TrimEnd(), id = u.afpCUIT };

            return Json(x.Take(50), JsonRequestBehavior.AllowGet);
        }
        else
        {
            // nvarchar search
            var x = from u in db.CONSTA
                    where u.afpNombre.StartsWith(query)
                    orderby u.afpNombre
                    select new { label = u.afpNombre.TrimEnd(), id = u.afpCUIT };

            return Json(x.Take(50), JsonRequestBehavior.AllowGet);
        }
    } 
}

VIEW:

@{
    viewbag.title = "index";
}

<h2>index</h2>
@html.textbox("user", "", new { style="width: 400px;" })

<script type="text/javascript">

$("input#user").autocomplete(
{ 
    source: function (request, response) 
    { 
        // define a function to call your action (assuming usercontroller) 
        $.ajax(
        { 
            url: '/altamasiva/getusers', type: "post", datatype: "json", 

            // query will be the param used by your action method 
            data: { query: request.term }, 

            success: function(data){ 
                response( $.map(data, function (item){ return { label: item.label + " (" + item.id + ")", value: item.label, id: item.id }; })); 
            } 
        }) 
    }, 
    minlength: 1, // require at least one character from the user
});

</script>

And now:

THE PROBLEM

As you can see, The code follows different paths if the query string contains only numbers.

When all the characters of the controller parameter are numbers (where u.afpCUIT.StartsWith(query) ), the query optimizer "is supposed to" perform a clustered index seek (which it does) and return the first 50 rows it finds. When the first "autocomplete" string arrives (usually one or two characters at most) the query performs extraordinarily fast, but, when the lenght of the string increases, the performance degrades notably (it takes almost between 20 seconds to 2 minutes with 9 or more chars). Surprisingly, after "restarting" the SQL Server Service, if the initial string contains 10 chars, it performs great too, but the performance degrades when we delete chars from the "query" string, the complete opposite.

Why is this happening?

When SQL server compiles the first execution plan, it optimizes it to perform really fast with a large result set (or viceversa). Subsequent queries, which narrows (or expands) the resultset, require a different execution plan ... BUT ... EF generated SQL uses commad parameters to (precisely) avoid statement recompiling ...

Cleaning the Execution Plan Cache by executing:

db.ExecuteStoreCommand("DBCC FREEPROCCACHE");

restores the performance to excellent response times ... BUT ... it kills al plans in all databases, thus degrading the performance of all other cached plans (which generally perform OK).

After doing some profiling on the EF sql statements, I executed DBCC FREEPROCCACHE in Query Analyzer prior to the sql EF generates, which turned out to generate different execution plans, all performing in the 250ms range, independently of the parameter length:

DBCC FREEPROCCACHE

exec sp_executesql N'SELECT TOP (50) 
[Project1].[C1] AS [C1], 
[Project1].[C2] AS [C2], 
[Project1].[afpCUIT] AS [afpCUIT]
FROM ( SELECT 
    [Extent1].[afpCUIT] AS [afpCUIT], 
    [Extent1].[afpNombre] AS [afpNombre], 
    1 AS [C1], 
    RTRIM([Extent1].[afpNombre]) AS [C2]
    FROM [dbo].[CONSTA] AS [Extent1]
    WHERE [Extent1].[afpCUIT] LIKE @p__linq__0 ESCAPE N''~''
)  AS [Project1]
ORDER BY [Project1].[afpNombre] ASC',N'@p__linq__0 nvarchar(4000)',@p__linq__0=N'2023291%'

THE QUESTION

Is there a more ellegant alternative to

db.ExecuteStoreCommand("DBCC FREEPROCCACHE");

?

Surprisingly, the second path of the query ( where u.afpNombre.StartsWith(query) ) are not affected by the same problem and performs great. Obviously, execution plans do not change when the lenght of the string changes...

I found an ObjectContext parameter in older versions of EF:

System.Data.EntityClient.EntityCommand.EnablePlanCaching

but I couldn't find it in EF4, and I'm not sure if the global results would be the same.

I'm really puzzled with this problem, and I don't know where the real problem lies

Poor index design? Lack of partitions? SQL SERVER 2008 Express edition? EF generated SQL? Lousy luck?

Any help would be great. Thanx in advance!


Solution

  • There's a way to remove a single plan from SQL Server's cache. It's explained in detail here: http://sqlblog.com/blogs/kalen_delaney/archive/2007/09/29/geek-city-clearing-a-single-plan-from-cache.aspx

    Also, you can create a Stored Procedure, and map it with Entity Framework instead of using LINQ2Entities, and in this way make spcific changes to the SQL syntax, and make sure it's always the same.