linqoptimizationentity-framework-4.1navigation-properties

Writing LINQ queries. Joins VS navigational properties


I am trying gain more understanding of Linq queries and Entity Framework (4.1). Please take a look at following two queries. Both queries returns car type name (CarType.Name)

In first query I used join and ignored navigational property CarType

from c in Cars.AsEnumerable().
Where(e => e.CarId == Guid.Parse("0501cc96-5610-465d-bafc-16b30890c224"))
join ct in CarTypes on c.CarTypeId equals ct.CarTypeId
select new CarType {
    Name = ct.Name
}

In second, I used navigational property CarType

from c in Cars.AsEnumerable()
where c.CarId == Guid.Parse("0501cc96-5610-465d-bafc-16b30890c224")
select new CarType {
    Name = c.CarType.Name
}

I ran both in LinqPad therefore there is Guid.Parse function.

When I run these, first statement runs faster. LinqPad reports 00:00:036. Second statement runs slower and LinqPad reports 00:00:103

Looking at results it seems that Linq queries that use joins instead of navigational properties are faster. Is that really so? Please somebody shead some light to this. Are there any general guidances, the best practices that I should follow when writing Linq queries?

Thanks


Solution

  • Since you are calling .AsEnumerable(), the queries are not evaluated using LINQ to Entities, but rather LINQ to Objects.

    This means that the first one is likely doing two round-trips: one to pull all the Cars and one to pull all the CarTypes. Then it performs a join locally, using whatever algorithm LINQ to Objects uses for such operations.

    The second one is probably doing N + 1 round-trips, where N is the number of CarTypes. You do a round-trip to grab all the cars, and then each time one of those cars has a CarTypeId that Entity Framework hasn't already loaded in, it goes back to the database to select that CarType.

    If you use the SQL tab in LINQPad, you can see all the LINQ queries that are being performed by your program.

    The best practice that you should apply in this case is to not call .AsEnumerable() on an Entity Framework object set. Instead, compose your entire query and then call .ToList() at the end to capture the results. You are probably calling .AsEnumerable() as a workaround because Guid.Parse() doesn't work inside of a LINQ to Entities query, but you can easily remove that part from the query. In LINQPad, press Ctrl-2 to switch to C# Statement(s) mode, and then run a query like this:

    var guid = Guid.Parse("0501cc96-5610-465d-bafc-16b30890c224");
    var carTypeNames = 
        (from c in Cars
        where c.CarId == guid
        select new CarType {
            Name = c.CarType.Name
        }).ToList();
    carTypeNames.Dump();
    

    The two queries given should have roughly equivalent performance when done right, so you should prefer Navigation Properties, since they are more concise and easier to read. Or, according to your preference, you could turn the query around and make it be based on the CarType collection:

    var guid = Guid.Parse("0501cc96-5610-465d-bafc-16b30890c224");
    var carTypeNames = 
        (from ct in CarTypes
        where ct.Cars.Any(c => c.CarId == guid)
        select new CarType {
            Name = c.CarType.Name
        }).ToList();
    carTypeNames.Dump();
    

    Update

    Avoid creating an entity object like this:

    public class CarTypeSummary
    {
        public string Name{get;set;}
    }
    
    void Main()
    {
        var guid = Guid.Parse("0501cc96-5610-465d-bafc-16b30890c224");
        var carTypeNames = 
            (from ct in CarTypes
            where ct.Cars.Any(c => c.CarId == guid)
            select new CarTypeSummary {
                Name = c.CarType.Name
            }).ToList();
        carTypeNames.Dump();
    }
    

    In production code, it's often a good idea to decouple your API from the underlying data type, to give you more flexibility to change things without having to modify code anywhere.

    public interface ICarTypeSummary{string Name{get;}}
    public class CarTypeSummary : ICarTypeSummary
    {
        public string Name{get;set;}
    }
    public ICarTypeSummary GetCarTypeSummaryForCar(Guid guid) 
    {
        return (from ct in CarTypes
                where ct.Cars.Any(c => c.CarId == guid)
                select new CarTypeSummary {
                    Name = c.CarType.Name
                }).FirstOrDefault();
    }
    

    This way, if you decide in the future that you would rather just return an actual CarType, to take advantage of Entity Framework's caching mechanisms, you could change your implementation without messing with the API:

    // Make the Entity class implement the role interface
    public partial class CarType : ICarTypeSummary {}
    
    public ICarTypeSummary GetCarTypeSummaryForCar(Guid guid) 
    {
        return CarTypes.FirstOrDefault(
            ct => ct.Cars.Any(c => c.CarId == guid));
    }