asp.net-coreentity-framework-corecross-database

CrossDB queries with EFCore = no / instantiating context entites as async lists = works - What is the true approach should I be taking?


I have a system with two database servers I am working with:

One of them is database first - a database managed by a legacy enterprise application and I don't have full control over changing the database structure.

The second is code first and I have full control in the code first database to make changes.

Security policies prevent me from making a view that joins tables from the two database servers in the code first DB which might be a way to make this better according to what i've seen on SO posts.

I have one context for each database since they are separate.

The data and structure in the code first tables is designed to be able to join to the non-code first database as if they were all in one database.

I CAN get what I need working using this set of queries:

        // Set up EF tables
        var person = await _context1.Person.ToListAsync();
        var labor = await _context1.Labor.ToListAsync();
        var laborCraftRate = await context1.LaborCraftRate.ToListAsync();
        var webUsers = await context2.WebUsers.ToListAsync();
        var workOrders = await _context1.Workorder
           .Where(r => r.Status == "LAPPR" || r.Status == "APPR" || r.Status == "REC")
           .ToListAsync();
        var specialRequests = await _context1.SwSpecialRequest
           .Where(r => r.Requestdate > DateTime.Now)
           .ToListAsync();

        var distributionListQuery = (
                 from l in labor
                 from p in person.Where(p => p.Personid == l.Laborcode).DefaultIfEmpty()
                 from wu in webUsers.Where(wu => wu.Laborcode == l.Laborcode).DefaultIfEmpty()
                 from lcr in laborCraftRate.Where(lcr => lcr.Laborcode == l.Laborcode).DefaultIfEmpty()
                 select new
                 {
                     Laborcode = l.Laborcode,
                     Displayname = p.Displayname,
                     Craft = lcr.Craft,
                     Crew = l.Crewid,
                     Active = wu.Active,
                     Admin = wu.FrIsAdmin,
                     FrDistLocation = wu.FrDistLocation,
                 }).Where(r => r.Active == "Y" && (r.FrDistLocation == "IPC" || r.FrDistLocation == "IPC2" || r.FrDistLocation == "both"))
                    .OrderBy(r => r.Craft)
                    .ThenBy(r => r.Displayname);

        // Build a subquery for the next query to use
        var ptoSubQuery =
           from webUser in webUsers
           join workOrder in workOrders on webUser.Laborcode equals workOrder.Wolablnk
           join specialRequest in specialRequests on workOrder.Wonum equals specialRequest.Wonum
           select new
           {
               workOrder.Wonum,
               Laborcode = workOrder.Wolablnk,
               specialRequest.Requestdate
           };

        // Build the PTO query to join with the distribution list
        var ptoQuery =
           from a in ptoSubQuery
           group a by a.Wonum into g
           select new
           {
               Wonum = g.Key,
               StartDate = g.Min(x => x.Requestdate),
               EndDate = g.Max(x => x.Requestdate),
               Laborcode = g.Min(x => x.Laborcode)
           };

        // Join the distribution list and the object list to return
        // list items with PTO information
        var joinedQuery = from dl in distributionListQuery
                          join fl in ptoQuery on dl.Laborcode equals fl.Laborcode
                          select new
                          {
                              dl.Laborcode,
                              dl.Displayname,
                              dl.Craft,
                              dl.Crew,
                              dl.Active,
                              dl.Admin,
                              dl.FrDistLocation,
                              fl.StartDate,
                              fl.EndDate
                          };

        // There are multiple records that result from the join,
        // strip out all but the first instance of PTO for all users
        var distributionList = joinedQuery.GroupBy(r => r.Laborcode)
                            .Select(r => r.FirstOrDefault()).OrderByDescending(r => r.Laborcode).ToList();

Again, this works and gets my data back in a reasonable but clearly not optimal timeframe that I can work with in my UI that needs this by preloading the data before it is needed. Not the best, but works.

If I change the variable declarations to not be async which I was told I should do in another SO post, this turns into a cross db query and netcore says no:

        // Set up EF tables
        var person = _context1.Person;
        var labor = _context1.Labor;
        var laborCraftRate = context1.LaborCraftRate;
        var webUsers = context2.WebUsers;
        var workOrders = _context1.Workorder
           .Where(r => r.Status == "LAPPR" || r.Status == "APPR" || r.Status == "REC");
        var specialRequests = _context1.SwSpecialRequest
           .Where(r => r.Requestdate > DateTime.Now);

Adding ToListAsync() is what allows the join functionality I need to work.

Q - Can anyone elaborate on possible downsides and problems with what I am doing?

Thank you for helping me understand!


Solution

  • It's not that calling ToList() "doesn't work." The problem is that it materializes (I think that's the right word) the query and returns a potentially larger than intended amount of data to the client. Any further LINQ operations are done on the client side. This can increase the load on the database and network. In your case, it works because you're bringing all that data to the client side. At that point, it no longer matters that it was a cross-database query.

    This was a frequent concern during the transition from .NET Core 2.x to 3.x. If an operation could not be performed server side, .NET Core 2.x would silently insert something like ToList(). (Well, not completely silently. I think it was logged somewhere. But many developers weren't aware of it.) 3.x stopped doing that and would give you an error. When people tried to upgrade to 3.x, they often found it difficult to convert the queries into something that could run server side. And people resisted throwing in an explicit ToList() because muh performance. But remember, that's what it was always doing. If there wasn't a performance issue before, there isn't one now. And at least now you're aware of what it's actually doing, and can fix it if you really need to.