Entity Framework’s Include method which eager loads related data in a query does not allow sorting and filtering. People ask for this feature frequently.
There is a way around the problem, I wrote about it in the June Data Points column in MSDN Magazine (Loading Related Data: http://msdn.microsoft.com/en-us/magazine/hh205756.aspx) but since I just saw another comment on twitter about this, I thought I would stop working on Chapter 7 of the DbContext book that is already delayed and write a quick blog post about using a projection instead.
One caveat…you cannot do this if you have a long running context and you want the objects to remain attached to the context. If there are other related entities in the context, the context will always give them to you. More on this at the end of the post.
Okay, today’s model will be based on a roads and trees on that road. I need some type of guide because we’ve been socked in with fog for two days and I can barely see the trees across my road.
We’ll have two classes (and no I’m not trying to rub in the fact that we don’t have geo support yet, but it’s coming in .NET 4.5.)
public class Tree { public int Id { get; set; } public string Description { get; set; } public decimal Lat { get; set; } public decimal Long { get; set; } public int RoadId { get; set; } } public class Road { public Road() { Trees = new List<Tree>(); } public int Id { get; set; } public string Name { get; set; } public string Town { get; set; } public List<Tree> Trees { get; set; } }
The way loading works in Entity Framework is an all or nothing scenario with one exception.
The DbContext lets you do a filter in an explicit load (i.e. after the root is already in memory). Here’s a passing test which loads only maple trees and checks to see that the count of non-maple trees is NOT greater than 0.
(MyInitializer seeds the database with one road that has three related Trees (a maple, a maple and a pine)).
[TestMethod()] public void CanFilterOnExplicitLoad() { Database.SetInitializer(new MyInitializer()); var context = new DataAccess(); var road = context.Roads.FirstOrDefault(); context.Entry(road).Collection(r => r.Trees) .Query().Where(t => t.Description.Contains("Maple")) .Load(); Assert.IsFalse(context.Trees.Local
.Where(t=>!t.Description.Contains("Maple"))
.Count()>0);
}
But that’s beside the point, since the question is about eager loading.
You can’t filter when eager loading with Include.
But you can get the results you want eagerly if you project.
[TestMethod()] public void CanFilterOnProjection() { Database.SetInitializer(new MyInitializer()); var context = new DataAccess(); var road = context.Roads .Select(r=>new{ r, r.Trees.Where(t=>t.Description.Contains("Maple")}) .FirstOrDefault(); Assert.IsFalse(context.Trees.Local
.Where(t => !t.Description.Contains("Maple"))
.Count() > 0); } }
That’s nice but whine whine whine, I returned an anonymous type.
If you use a repository, you can hide that.
public class DumbRepositoryButGoodEnoughForThisDemo { DataAccess _context=new DataAccess(); public List<Road> GetRoadsWithFilteredTrees(string treeFilter) { var roadAndTrees = _context.Roads .Select(r=>new{ Road=r, Trees=r.Trees.Where(t=>t.Description.Contains("Maple"))}) .ToList(); return roadAndTrees.Select(rAt=>rAt.Road).ToList(); } }
When I return the list of roads from the projection, the Trees will be attached thanks to the context recognizing the relationships.
Here’s another test that does pass:
[TestMethod()] public void RepositoryFilteredRoadsReturnsRoadWithTrees() { Database.SetInitializer(new MyInitializer()); var rep = new DumbRepositoryButGoodEnoughForThisDemo(); var roads = rep.GetRoadsWithFilteredTrees("Maple"); Assert.IsTrue(roads.FirstOrDefault().Trees.Any()); Assert.IsFalse(roads.FirstOrDefault()
.Trees
.Where(t => !t.Description.Contains("Maple"))
.Count() > 0); }
And a screenshot as further proof:
Scenario When This May Not Work As Expected
As Brian points out in the comments, there is one BIG draw back that is also a problem with the filtered explicit load. If there are already related entities tracked by the context, they will automatically be attached to any related entity in the context. That means if the Pine tree was already being tracked then as long as the Road is attached to the context, it will see ALL of the related trees in the context, including the Pine that was already there.
If you are using a pattern where you have a short-lived context that is instantiated just to execute the query, then it’s not a problem. Most of my architectures are like this. But if you are writing a Windows Form or WPF form and have a context that hangs around, you could run into this problem.
You’ll never avoid the problem when the entities are attached to the context.
But here’s a twist on the repository method that will return disconnected objects from a context that is managing multiple object.
public List<Road> GetDisconnectedRoadsWithFilteredTrees(string treeFilter) { var roadAndTrees = _context.Roads.AsNoTracking() .Select(r => new { Road = r, Trees = r.Trees.Where(t => t.Description.Contains("Maple")) }) .ToList(); var rt = new List<Road>(); foreach (var r in roadAndTrees) { r.Road.Trees.AddRange(r.Trees); } return roadAndTrees.Select(rAt => rAt.Road).ToList(); }