Entity Framework’s Include method which eager loads related data in a query does not allow sorting and filtering. People ask for this feature frequently.
There is a way around the problem, I wrote about it in the June Data Points column in MSDN Magazine (Loading Related Data: http://msdn.microsoft.com/en-us/magazine/hh205756.aspx) but since I just saw another comment on twitter about this, I thought I would stop working on Chapter 7 of the DbContext book that is already delayed and write a quick blog post about using a projection instead.
One caveat…you cannot do this if you have a long running context and you want the objects to remain attached to the context. If there are other related entities in the context, the context will always give them to you. More on this at the end of the post.
Okay, today’s model will be based on a roads and trees on that road. I need some type of guide because we’ve been socked in with fog for two days and I can barely see the trees across my road.
We’ll have two classes (and no I’m not trying to rub in the fact that we don’t have geo support yet, but it’s coming in .NET 4.5.)
public class Tree { public int Id { get; set; } public string Description { get; set; } public decimal Lat { get; set; } public decimal Long { get; set; } public int RoadId { get; set; } } public class Road { public Road() { Trees = new List<Tree>(); } public int Id { get; set; } public string Name { get; set; } public string Town { get; set; } public List<Tree> Trees { get; set; } }
The way loading works in Entity Framework is an all or nothing scenario with one exception.
The DbContext lets you do a filter in an explicit load (i.e. after the root is already in memory). Here’s a passing test which loads only maple trees and checks to see that the count of non-maple trees is NOT greater than 0.
(MyInitializer seeds the database with one road that has three related Trees (a maple, a maple and a pine)).
[TestMethod()] public void CanFilterOnExplicitLoad() { Database.SetInitializer(new MyInitializer()); var context = new DataAccess(); var road = context.Roads.FirstOrDefault(); context.Entry(road).Collection(r => r.Trees) .Query().Where(t => t.Description.Contains("Maple")) .Load(); Assert.IsFalse(context.Trees.Local
.Where(t=>!t.Description.Contains("Maple"))
.Count()>0);
}
But that’s beside the point, since the question is about eager loading.
You can’t filter when eager loading with Include.
But you can get the results you want eagerly if you project.
[TestMethod()] public void CanFilterOnProjection() { Database.SetInitializer(new MyInitializer()); var context = new DataAccess(); var road = context.Roads .Select(r=>new{ r, r.Trees.Where(t=>t.Description.Contains("Maple")}) .FirstOrDefault(); Assert.IsFalse(context.Trees.Local
.Where(t => !t.Description.Contains("Maple"))
.Count() > 0); } }
That’s nice but whine whine whine, I returned an anonymous type.
If you use a repository, you can hide that.
public class DumbRepositoryButGoodEnoughForThisDemo { DataAccess _context=new DataAccess(); public List<Road> GetRoadsWithFilteredTrees(string treeFilter) { var roadAndTrees = _context.Roads .Select(r=>new{ Road=r, Trees=r.Trees.Where(t=>t.Description.Contains("Maple"))}) .ToList(); return roadAndTrees.Select(rAt=>rAt.Road).ToList(); } }
When I return the list of roads from the projection, the Trees will be attached thanks to the context recognizing the relationships.
Here’s another test that does pass:
[TestMethod()] public void RepositoryFilteredRoadsReturnsRoadWithTrees() { Database.SetInitializer(new MyInitializer()); var rep = new DumbRepositoryButGoodEnoughForThisDemo(); var roads = rep.GetRoadsWithFilteredTrees("Maple"); Assert.IsTrue(roads.FirstOrDefault().Trees.Any()); Assert.IsFalse(roads.FirstOrDefault()
.Trees
.Where(t => !t.Description.Contains("Maple"))
.Count() > 0); }
And a screenshot as further proof:
Scenario When This May Not Work As Expected
As Brian points out in the comments, there is one BIG draw back that is also a problem with the filtered explicit load. If there are already related entities tracked by the context, they will automatically be attached to any related entity in the context. That means if the Pine tree was already being tracked then as long as the Road is attached to the context, it will see ALL of the related trees in the context, including the Pine that was already there.
If you are using a pattern where you have a short-lived context that is instantiated just to execute the query, then it’s not a problem. Most of my architectures are like this. But if you are writing a Windows Form or WPF form and have a context that hangs around, you could run into this problem.
You’ll never avoid the problem when the entities are attached to the context.
But here’s a twist on the repository method that will return disconnected objects from a context that is managing multiple object.
public List<Road> GetDisconnectedRoadsWithFilteredTrees(string treeFilter) { var roadAndTrees = _context.Roads.AsNoTracking() .Select(r => new { Road = r, Trees = r.Trees.Where(t => t.Description.Contains("Maple")) }) .ToList(); var rt = new List<Road>(); foreach (var r in roadAndTrees) { r.Road.Trees.AddRange(r.Trees); } return roadAndTrees.Select(rAt => rAt.Road).ToList(); }
Sign up for my newsletter so you don't miss my conference & Pluralsight course announcements!
great example. now, must refactor 🙂
I haven’t tried this, but is the magic that lets it recognize the relationship and return the (now filtered) related entities in any way related to the ".Local" first level cache? I’m just wondering if you could accidentally get more than you bargained for if more related entites happened to be known to the first level cache) due to previous queries to the same Entity in that context instance.
@Brian.Hmm interesting question. This is behavior that we’ve had since EFV1 so it’s not due to the DbSet.Local property. Because the context is tracking the related entities, it automatically builds up it’s own comprehension of the graphs via the navigations. The team refers to this as "relationship span". If the road was already in the context and it had a 2 related maple trees also in the context that were not in the database, then those would be returned as well. When the db retrieves the road and other trees from the database, and sees that the road is already in the context, by default it will use the existing road (not the one from the db). But that affect of returning ALL of the Maple trees (those known by the context and those that just came from the database)…. I believe you would get the same affect from an Include query when the road and some maple trees unknown to the db were already in the context.
Yeah, sorry…didn’t word that very well since I see the world through DbContext colored glasses 🙂
Specifically I was wondering if you used your setup above to query for a Road with its Maple trees and then query for that same Road and, say, its Birch trees, then after the second call to the repository would the tree collection you got back also unintentionally include the Maple trees (because even though they wouldn’t have come from the database on the second query, the context already knew about them from the previous query)?
I hadn’t even considered the addition of entities that didn’t come from the database but that’s an interesting wrinkle as well.
@Brian: are you asking about duplicates?
oh! Now I see. Ahhh, yes, a second filter is needed! I’ll add that. Thanks!
@Brian,
Dang!Harder than I thought. Will modify the post to indicate this flaw until I can find a way around it.
Sorry I had to be the bearer of the monkey wrench… Love the column and the blog. Keep up the great work (and enjoy your trip next month to glorious Sandusky…simultaneously the city of my birth and most despised name in recent popular media).
@Brian…I’m grateful you thought of that scenario! 🙂
Best name for a repository ever: DumbRepositoryButGoodEnoughForThisDemo
@Mike… hahaha 🙂 Glad someone got a laugh out of it.
Hi Julie,
You think we should use the default repository in conjunction with the first code 4.2?
Veranildo,
I don’t know what you mean by the default repository
In 2009 I wrote about similar idea for Load method: blog.cincura.net/…/229660-load-wit .
Guess I was a little late to the game, Jiri 😉
Well, your solution is about different problem. But using same idea. I think it’s valid. But OTOH you could have been a little faster. 😉
One trick here, you must have LazyLoading explicitly turned off, and changeTracking on, for this to work as described.
Thank you.
This was working perfectly for me with my ObjectContext but failed when I converted to a dbContext.
Setting db.Configuration.LazyLoadingEnabled = false; fixed the problem
Nice article Julie.
Britt thanks for the tip. It helped me out.
Whoah, this strategy just blew my mind. I had all but given up on getting a set of EF statements that would build a reasonable DB query for my situation. I was just about ready to write a stored procedure and drop back to good (bad?) ol’ ADO when I found this.
I don’t quite understand why it works fine when projecting to an anon type, but they can’t make it work when working directly with the models. Maybe they just haven’t gotten around to it yet.
Anyway, thanks for posting this
Hi Julie,
I like this solution, thank you for sharing!!! I’m wondering if you could suggest how to use this method for loading & filtering deeper relationships like grandchild collections etc?
Hey, I found your blog post in a search on how to use projection with EF6.
I have used the example from your DumbRepositoryButGoodEnoughForThisDemo, exactly as you put it. Okay, with different types, but they’re a one-to-many relationship, so it should work. Right? Right!?
Instead, my returned Road.Trees is null. Just, fricking null, while there really really should be data. I can’t for the life of me figure out why.
Are there any hidden preconditions I’ve missed, that would make projection totally and utterly fail? Hope you have suggestions… Thanks!
I’m an idiot… Huge huge typo in the “Trees” property name. Ugh..
Does that mean you got it working? I had totally forgotten about this workaround so thanks for bringing it back to my attention 🙂
currently does not have that filter feature. You can try using explicit loading instead. However with this way the query does not look nice, I would use a
True, EF6 will never have a way to filter includes. however, EF Core 5 now has this feature and EF Core 5 will be released in November. Details at https://docs.microsoft.com/en-us/ef/core/what-is-new/ef-core-5.0/whatsnew#filtered-include