Category Archives: Data Access

Use Projections and a Repository to Fake a Filtered Eager Load

December 7, 2011 | Data Access | Julie

Entity Framework’s Include method, which eager loads related data in a query, does not allow sorting or filtering of that related data. People ask for this feature frequently.

There is a way around the problem. I wrote about it in the June Data Points column in MSDN Magazine (Loading Related Data: http://msdn.microsoft.com/en-us/magazine/hh205756.aspx), but since I just saw another comment on Twitter about this, I thought I would stop working on Chapter 7 of the DbContext book (which is already delayed) and write a quick blog post about using a projection instead.

One caveat…you cannot do this if you have a long running context and you want the objects to remain attached to the context. If there are other related entities in the context, the context will always give them to you. More on this at the end of the post.

Okay, today’s model will be based on roads and the trees on those roads. I need some type of guide because we’ve been socked in with fog for two days and I can barely see the trees across my road.

We’ll have two classes (and no I’m not trying to rub in the fact that we don’t have geo support yet, but it’s coming in .NET 4.5.)

  public class Tree
  {
    public int Id { get; set; }
    public string Description { get; set; }
    public decimal Lat { get; set; }
    public decimal Long { get; set; }
    public int RoadId { get; set; }
  }

  public class Road
  {
    public Road()
    {
      Trees = new List<Tree>();
    }
    public int Id { get; set; }
    public string Name { get; set; }
    public string Town { get; set; }
    public List<Tree> Trees { get; set; }
  }

The way loading works in Entity Framework is an all or nothing scenario with one exception.

The DbContext lets you do a filter in an explicit load (i.e. after the root is already in memory). Here’s a passing test which loads only maple trees and checks to see that the count of non-maple trees is NOT greater than 0.

(MyInitializer seeds the database with one road that has three related Trees (a maple, a maple and a pine)).

    [TestMethod()]
    public void CanFilterOnExplicitLoad()
    {
      Database.SetInitializer(new MyInitializer());
      var context = new DataAccess();
      var road = context.Roads.FirstOrDefault();
      context.Entry(road).Collection(r => r.Trees)
        .Query().Where(t => t.Description.Contains("Maple"))
        .Load();
      Assert.IsFalse(context.Trees.Local
                            .Where(t=>!t.Description.Contains("Maple"))
                            .Count()>0);
    }

But that’s beside the point, since the question is about eager loading.

You can’t filter when eager loading with Include.

But you can get the results you want eagerly if you project.

    [TestMethod()]
    public void CanFilterOnProjection()
    {
      Database.SetInitializer(new MyInitializer());
      var context = new DataAccess();
      // A member built from a method call needs an explicit name in an anonymous type.
      var road = context.Roads
        .Select(r => new {
          Road = r,
          Trees = r.Trees.Where(t => t.Description.Contains("Maple")) })
        .FirstOrDefault();
      Assert.IsFalse(context.Trees.Local
                            .Where(t => !t.Description.Contains("Maple"))
                            .Count() > 0);
    }
  }

That’s nice but whine whine whine, I returned an anonymous type.

If you use a repository, you can hide that.

  public class DumbRepositoryButGoodEnoughForThisDemo
  {
    DataAccess _context=new DataAccess();

    public List<Road> GetRoadsWithFilteredTrees(string treeFilter)
    {
      var roadAndTrees = _context.Roads
       .Select(r=>new{
          Road=r,
          Trees=r.Trees.Where(t=>t.Description.Contains(treeFilter))})
        .ToList();
      return roadAndTrees.Select(rAt=>rAt.Road).ToList();
    }
  }

When I return the list of roads from the projection, the Trees will be attached thanks to the context recognizing the relationships.

Here’s another test that does pass:

   [TestMethod()]
    public void RepositoryFilteredRoadsReturnsRoadWithTrees()
    {
      Database.SetInitializer(new MyInitializer());
      var rep = new DumbRepositoryButGoodEnoughForThisDemo();
      var roads = rep.GetRoadsWithFilteredTrees("Maple");
      Assert.IsTrue(roads.FirstOrDefault().Trees.Any());
      Assert.IsFalse(roads.FirstOrDefault()
                          .Trees
                          .Where(t => !t.Description.Contains("Maple"))
                          .Count() > 0);
      
    }

And a screenshot as further proof:

Scenario When This May Not Work As Expected

As Brian points out in the comments, there is one BIG drawback that is also a problem with the filtered explicit load. If there are already related entities tracked by the context, they will automatically be attached to any related entity in the context. That means if the Pine tree was already being tracked then, as long as the Road is attached to the context, it will see ALL of the related trees in the context, including the Pine that was already there.
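To make that behavior concrete, here is a toy sketch of the fix-up idea. It is not Entity Framework code and it is not how the context is implemented: ToyTracker and its Track methods are names made up for this illustration, and the only thing it models is "any tracked tree with a matching RoadId ends up in the road's Trees".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Tree
{
    public int Id { get; set; }
    public string Description { get; set; }
    public int RoadId { get; set; }
}

public class Road
{
    public Road() { Trees = new List<Tree>(); }
    public int Id { get; set; }
    public string Name { get; set; }
    public List<Tree> Trees { get; set; }
}

// Hypothetical stand-in for a tracking context. It only imitates
// relationship fix-up by foreign key; nothing here is an EF API.
public class ToyTracker
{
    private readonly List<Road> _roads = new List<Road>();
    private readonly List<Tree> _trees = new List<Tree>();

    public void Track(Tree tree)
    {
        _trees.Add(tree);
        // Hook the new tree up to every tracked road it belongs to.
        foreach (var road in _roads.Where(r => r.Id == tree.RoadId))
        {
            if (!road.Trees.Contains(tree)) road.Trees.Add(tree);
        }
    }

    public void Track(Road road)
    {
        _roads.Add(road);
        // Hook every already-tracked tree for this road up to it,
        // whether or not the current query asked for that tree.
        foreach (var tree in _trees.Where(t => t.RoadId == road.Id))
        {
            if (!road.Trees.Contains(tree)) road.Trees.Add(tree);
        }
    }
}
```

If the Pine is tracked first and a filtered load then brings in the road and a Maple, the road's Trees list ends up holding both trees, which is the surprise described above.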

If you are using a pattern where you have a short-lived context that is instantiated just to execute the query, then it’s not a problem. Most of my architectures are like this. But if you are writing a Windows Form or WPF form and have a context that hangs around, you could run into this problem.

You’ll never avoid the problem when the entities are attached to the context.

But here’s a twist on the repository method that will return disconnected objects from a context that is managing multiple objects.

   public List<Road> GetDisconnectedRoadsWithFilteredTrees(string treeFilter)
    {
      var roadAndTrees = _context.Roads.AsNoTracking()
       .Select(r => new
       {
         Road = r,
         Trees = r.Trees.Where(t => t.Description.Contains(treeFilter))
       })
        .ToList();

      // The results are not tracked, so add each road's filtered trees to it by hand.
      foreach (var r in roadAndTrees)
      {
        r.Road.Trees.AddRange(r.Trees);
      }
      return roadAndTrees.Select(rAt => rAt.Road).ToList();
    }
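The stitching step in that method can be tried out without a database. What follows is only a sketch under assumptions: two in-memory lists stand in for the Roads and Trees tables, the class name FilteredLoadSketch and its method are hypothetical, and LINQ to Objects replaces the Entity Framework query. It shows the shape of the projection and the manual AddRange, not EF behavior.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Tree
{
    public int Id { get; set; }
    public string Description { get; set; }
    public int RoadId { get; set; }
}

public class Road
{
    public Road() { Trees = new List<Tree>(); }
    public int Id { get; set; }
    public string Name { get; set; }
    public List<Tree> Trees { get; set; }
}

public static class FilteredLoadSketch
{
    // roadRows and treeRows play the role of the two tables. Each Road is
    // expected to arrive with an empty Trees list, like a freshly
    // materialized, untracked road.
    public static List<Road> RoadsWithFilteredTrees(
        IEnumerable<Road> roadRows, IEnumerable<Tree> treeRows, string treeFilter)
    {
        // Project each road together with only the trees that pass the filter.
        var roadAndTrees = roadRows
            .Select(r => new
            {
                Road = r,
                Trees = treeRows
                    .Where(t => t.RoadId == r.Id && t.Description.Contains(treeFilter))
                    .ToList()
            })
            .ToList();

        // Nothing connects the two halves of the projection automatically,
        // so attach the filtered trees to their road by hand.
        foreach (var rAt in roadAndTrees)
        {
            rAt.Road.Trees.AddRange(rAt.Trees);
        }

        // Hand back plain roads; the anonymous type never leaves this method.
        return roadAndTrees.Select(rAt => rAt.Road).ToList();
    }
}
```

With one road and three trees (two maples and a pine), calling RoadsWithFilteredTrees with "Maple" gives back that road with only the two maples in its Trees list.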

Code First and DbContext are now “The Entity Framework”

October 21, 2011 | Data Access | Julie

In the recent blog post, How We Talk about EF and its Future Versions, the EF team announced that going forward, the stuff of EF that’s in .NET will be referred to as the EF Core Libraries. For example, .NET 4 contains EF 4…that’s now core libraries. When .NET 4.5 is released, whatever is contained within System.Data.Entity.dll will be EF 4.5 Core Libraries.

As you may know by now, Code First and DbContext were released out of band of the .NET release schedule and are contained in the EntityFramework.dll assembly, which is distributed via NuGet. Code First and DbContext rely on the core libraries to do their job. In the blog post, the team says “we are going to focus on the EntityFramework NuGet package as the primary deliverable that we refer to as ‘The Entity Framework’ or ‘EF’”.

This allows the team to be more flexible in releasing new features that leverage all of the work that’s gone into Entity Framework – features such as Code First and the DbContext API.

(For anyone writing or blogging etc about EF, please keep these differences in mind and be attentive to how you express EF.)

However, there’s something more important that I took away from the team’s blog post based on the fact that what’s in the NuGet package (DbContext and Code First) is what the team now refers to as “Entity Framework”.

What this means to me is that DbContext and Code First are the first features you should consider when approaching Entity Framework.

This makes a lot of sense to me. DbContext is much simpler to use than ObjectContext and will serve the most common development needs. If you need more, you can drop down into the ObjectContext. Remember that DbContext sits on top of the ObjectContext. The ObjectContext is always there in the background doing its work. So if you need to do something very granular, DbContext provides a hook to its underlying ObjectContext. Check out this blog post for how to do that: Accessing ObjectContext Features from EF 4.1 DbContext. Or, if you are already committed to working directly with the ObjectContext, you can still do that.

What about database first and model first? The designer is still inside of Visual Studio (and getting improvements) and there are 3rd party designers as well. If Code First doesn’t do the trick for you, you really want a visual model — or you’re already committed to EDMX — no worries. The feature is still there and you will not be a pariah for using the designer. Use the best tool for the job … for *your* job! (Here’s an article I wrote about choosing between db/model & code first)

The core EF Libraries know how to use the XML based metadata that come from the EDMX. At runtime, EF uses an in-memory representation of your model. If your model is expressed in XML (from the designer), the ObjectContext knows how to read the XML to create what it needs at runtime. (Even if you’re using DbContext + EDMX, this works…because DbContext has an ObjectContext behind it). If your model is expressed in classes plus Code First configurations, the context knows how to use those pieces to build the in-memory metadata at runtime. After that, EF doesn’t care where the model came from.

So the primary focus of Entity Framework going forward will be building models with your classes plus Code First and managing your entities with the DbContext. When the team talks about Entity Framework, (according to their blog post) that’s most likely what they’ll be referring to.

There’s ANOTHER takeaway here. Code First leverages the POCO support that was built into the core in .NET 4. (See what I just did there? :) ) EntityObject is now the ugly step-child of Entity Framework. After working mostly with POCOs in EF for over a year now, I feel the same way about EntityObject. No love lost there!

When the team makes changes to the core EF libraries (those which are part of .NET), such as the upcoming enum support and spatial data support, they’ll be clear that this is part of the core. Changes to the core benefit all of the ways to use EF, whether you build your model with a designer or with code and whether you use the ObjectContext or DbContext.

I’ve just finished up a book with Rowan Miller that is something of an extension to my book, Programming Entity Framework Second Edition (which is focused on Entity Framework 4). The new book is about 150 pages and is called “Programming Entity Framework Code First: Creating and Configuring Data Models from your Classes”. Very specifically about the modeling/mapping and DB initialization. We are now embarking on another short book that will be Programming Entity Framework DbContext. This second one will be focused on the DbContext APIs, validation and how to use them in various application patterns.

Teaching a 5-day Workshop (Entity Framework Boot Camp) Oct 3-7 in Boston

August 15, 2011 | Data Access | Julie

I’m excited about embarking on my first full-week workshop. Until now, I have done a number of one-day workshops but I have always felt that I needed more time. There’s so much to share! Maybe five days will be enough?

The workshop will be in Boston (Waltham, to be exact) the first week of October. It is being coordinated by www.dataeducation.com.

Below is the course outline.

There is currently a $400 early bird discount until the end of August.

Details and registration at http://dataeducation.com/entity-framework-bootcamp/

Day 1: Introducing Code First

Why Code First?
How does Code First work at runtime?
Configuring Code First with Data Annotations and with the Fluent API
Configuring for validation, data attributes, relationships, database mappings and hierarchies
- Understand impact on database
- Understand impact on your application at runtime
Code First database initialization
- Understand default and optional behavior and workarounds

Day 2: Introducing EF 4.1 DbContext /DbSet

How do DbContext and DbSet compare to ObjectContext/ObjectSet?
Explore features of DbContext/DbSet that streamline EF coding
Integrate DbContext with your apps
Validation API
Fun with MVC 3 and EF 4.1

Days 3-5: Hard Core EF 4 (and 4.1)

Architecting maintainable and testable enterprise apps with EF4/4.1
- Repositories, unit of work, testing
EF in distributed architectures: ASP.NET, WCF Services, WCF Data Services
EF in the Cloud: Windows Azure and SQL Azure
EF performance tips and tricks
Working with large models and multiple contexts
Explore EF core API additions and improvements introduced in the June 2011 CTP
- Enum support, spatial, TPT improvements, designer improvements and more

Entity Framework/WCF Data Services June 2011 CTP: Derived Entities with Related Data

July 2, 2011 | Data Access | Julie

Well, this change really came in the March 2011 CTP but I didn’t realize it until the June CTP was out, so I’ll call it a wash.

WCF Data Services has had a bad problem with inherited types where the derived type had a relationship to yet another type. An example is this model, where TPTTableA is a derived type (from Base) and has related data (BaseChildren).

If you expose just the Bases EntitySet (along with its derived type) in a WCF Data service, that was fine. I can browse to http://localhost:3958/WcfDataService1.svc/Bases easily.

But if I also exposed the related type (BaseChild) then the data service would have a hissy fit when you tried to access the hierarchy base (same URI as before). If you dig into the error it tells you:

“Navigation Properties are not supported on derived entity types. Entity Set ‘Bases’ has a instance of type ‘cf_model.ContextModel.TPTTableA’, which is an derived entity type and has navigation properties. Please remove all the navigation properties from type ‘cf_model.ContextModel.TPTTableA’”

Paul Mehner blogged about this and I wrote up a request on Microsoft Connect. Here’s Paul’s post: Windows Communication Foundation Data Services (Astoria) – The Stuff They Should Have Told You Before You Started

But it’s now fixed!

Using the same model, I can expose all of the entity sets in the data service and I’m still able to access the types in the inheritance hierarchy.

Here is an example where I am looking at a single Base entity, and in fact this happens to be one of the derived types. Notice that the term in the category is cf_model.TPTA. The entity is strongly typed.

You can see that strong typing in the link to the related data:

Bases(1)/cf_model.TPTA/BaseChildren

That’s how the data service is able to see the relationship, only off of the specific type that owns the relationship.

So accessing the relationship is a little different than normal. I struggled with this but was grateful for some help from data wizard, Pablo Castro.

The Uri to access the navigation requires that type specification:

http://localhost:1888/WcfDataService1.svc/Bases(1)/cf_model.TPTA/BaseChildren

You also need that type if you want to eager load the type along with its related data:

http://localhost:1888/WcfDataService1.svc/Bases(1)/cf_model.TPTA?$expand=BaseChildren

Note that I was still having a little trouble with the navigation (the first of these two Uris). It turns out that Cassini (i.e. the ASP.NET Web Development Server) was having a problem with the period (.) in between cf_model and TPTA.

Once I switched the service to use IIS Express (which was my first time using IIS Express and shockingly easy!), it was fine. (Thanks again to Pablo for setting me straight on this problem.)

So it’s come a long way and if this is how it is finalized, I can live with it though I guess it would be nice to have the URIs cleaned up.

Of course you’re more likely to use one of the client libraries that hide much of the Uri writing from us, so maybe in the end it will not be so bad. I have not yet played with the new client library that came with this CTP so I can’t say quite yet.

Entity Framework & WCF Data Services June 2011 CTP : Auto Compiled LINQ Queries

July 2, 2011 | Data Access | Julie

Ahh another one of the very awesome features of the new CTP!

Pre-compiled LINQ to Entities queries (LINQ to SQL has them too) are an incredible performance boost. Entity Framework has to do a bit of work to read your LINQ to Entities query, then scour through the metadata to figure out what tables & columns are involved, then pass all of that info to the provider (e.g., System.Data.SqlClient) to get a properly constructed store query. If you have queries that you use frequently, even if they have parameters, it is a big benefit to do this once and then reuse the store query.

I’ve written about this a lot. It’s a bit of a PIA to do especially once you start adding repositories or other abstractions into your application. And the worst part is that they are tied to ObjectContext and you cannot even trick a DbContext into leveraging CompiledQuery. (Believe me I tried and I know the team tried too.)

So, they’ve come up with a whole new way to pre-compile and invoke these queries and the best part is that it all happens under the covers by default. Yahoo!

Plus you can easily turn the feature off (and back on) as needed. With CompiledQuery in .NET 3.5 & .NET 4.0, the cost of pre-compiling a query that can be invoked later is more expensive than the one-time cost of transforming a L2E query into a store query. Auto-compiled queries work very differently, so I don’t know if you need to have the same concern about turning the feature off for single-use queries. My educated guess is that it’s the same. EF still has to work out the query, then it has to cache it, then it has to look for it in the cache. So if you won’t benefit from having the store query cached, then why pay the cost of storing and reading from the cache?
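That guess can be pictured with a toy cache. This is purely a conceptual sketch and not how EF implements auto-compiled queries: PlanCacheSketch, GetPlan and the compile delegate are hypothetical names, and the "plan" is just a string. The point is only that a query used once pays for the compile, the store and the lookup, while a repeated query skips the compile on every later call.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of a query plan cache; not an EF type.
public class PlanCacheSketch
{
    private readonly Dictionary<string, string> _plans = new Dictionary<string, string>();

    // Counts how many times the expensive "compile" step actually ran.
    public int CompileCount { get; private set; }

    public string GetPlan(string queryKey, Func<string, string> compile)
    {
        string plan;
        // Every call pays for the lookup, even a query that is never reused.
        if (_plans.TryGetValue(queryKey, out plan))
        {
            return plan;
        }

        // Cache miss: do the expensive translation once, then store the result.
        plan = compile(queryKey);
        CompileCount++;
        _plans[queryKey] = plan;
        return plan;
    }
}
```

Asking for the same key twice runs the compile delegate once. Asking for two different keys runs it twice, and both results stay in the cache whether or not they are ever used again.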

I highly recommend reading the EF team’s walk-through on Auto-Compiled Queries for information on performance and more details about this feature and how it works. Pay special attention to its note that CompiledQuery is still faster.

A bit of testing

I did a small test where I ran a simple query 10 times using the default (compiled) and 10 times where I’ve turned off the compilation. I also started with a set of warmup queries to make sure that none of the queries I was timing would be impacted by EF application startup costs. Here you can see the key parts of my test. Note that I’m using ObjectContext here and that’s where the ContextOptions property lives (same options where you find LazyLoadingEnabled, etc). You can get there from an EF 4.1 DbContext by casting back to an ObjectContext.

  public static void CompiledQueries()
    {
      using (var context = new BAEntities())
      {
        FilteredQuery(context, 3).FirstOrDefault();
        FilteredQuery(context, 100).FirstOrDefault();
        FilteredQuery(context, 113).FirstOrDefault();
        FilteredQuery(context, 196).FirstOrDefault();
      }
    }
    public static void NonCompiledQueries()
    {
      using (var context = new BAEntities())
      {
        context.ContextOptions.DefaultQueryPlanCachingSetting = false;
        FilteredQuery(context, 3).FirstOrDefault();
        FilteredQuery(context, 100).FirstOrDefault();
        FilteredQuery(context, 113).FirstOrDefault();
        FilteredQuery(context, 196).FirstOrDefault();
      }
    }

    internal static IQueryable<Contact> FilteredQuery(BAEntities context, int id)
    {
      var query= from c in context.Contacts.Include("Addresses") where c.ContactID == id select c;
      return query;
    }

I used Visual Studio’s profiling tools to get the time taken for running the compiled queries and for running the queries with the compilation set off.

When I executed each method (CompiledQueries and NonCompiledQueries) 3 times, the compiled queries ran about 5 times faster in total than the non-compiled queries.

When I executed each method 10 times, the compiled queries were about 3 times faster in total than the non-compiled queries.

Note that these are not benchmark numbers to be used, but just a general idea of the performance savings. The performance gain from pre-compiling queries is not news, although again, auto-compiled queries are not as fast as invoking a CompiledQuery. What’s news is that you now get the performance savings for free. Many developers aren’t even aware of compiled queries. Some are daunted by the code that it takes to create them. And some scenarios are just too hard or, in the case of DbContext, impossible to leverage them in.
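The numbers above came from Visual Studio’s profiling tools. For a rougher comparison without a profiler, a small Stopwatch helper is one option. This is a sketch, not the harness used for this post: TimingSketch and TimeRuns are made-up names, and the warmup count is an arbitrary default.

```csharp
using System;
using System.Diagnostics;

public static class TimingSketch
{
    // Runs 'work' a few times untimed (warmup), then times 'runs' executions
    // and returns the total elapsed time for the timed runs only.
    public static TimeSpan TimeRuns(Action work, int runs, int warmupRuns = 1)
    {
        if (work == null) throw new ArgumentNullException("work");

        // Warmup keeps one-time startup costs out of the measurement.
        for (int i = 0; i < warmupRuns; i++)
        {
            work();
        }

        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < runs; i++)
        {
            work();
        }
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}
```

With the two test methods from this post, that would look like TimingSketch.TimeRuns(CompiledQueries, 10) and TimingSketch.TimeRuns(NonCompiledQueries, 10). Like the profiler results, the totals are only a rough indication and will vary from run to run.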

WCF Data Services

I mentioned data services in the blog post title. Because this compilation is the default, when you build WCF Data Services on top of an Entity Framework model, the services will automatically get the performance savings as well.

Enjoy!

Entity Framework June 2011 CTP: TPT Inheritance Query Improvements

July 1, 2011 | Data Access | Julie

I want to look at some of the vast array of great improvements coming to EF that are part of the June 2011 CTP that was released yesterday.

Everyone’s going on and on about the enums. :)

There’s a lot more in there. Not sure how many I’ll cover but first up will be the TPT store query improvements.

I’ll use a ridiculously simple model with code first to demonstrate and I’ll share with you a surprise discovery I made (which began with some head scratching this morning).

Beginning with one base class and a single derived class.

 public class Base
 {

   public int Id { get; set; }
   public string Name { get; set; }

 }

  public class TPTA: Base
  {
    public string PropA { get; set; }
  }

By default, code first does TPH and would create a single database table with both types represented. So I use a fluent mapping to force this to TPT. (Note that you can’t do this configuration with Data Annotations).

  public class Context: DbContext
  {
    public DbSet<Base> Bases { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
      modelBuilder.Entity<Base>().Map<TPTA>(m => m.ToTable("TPTTableA"));
    }

  }

Here’s my query:

 from c in context.Bases select new {c.Id, c.Name};

Notice I’m projecting only fields from the base type.

And the store query that results:

SELECT 
[Extent1].[Id] AS [Id], 
[Extent1].[Name] AS [Name]
FROM [dbo].[Bases] AS [Extent1]

Not really anything wrong there. So what’s the big deal? (This is where I got confused… :) )

Now I’ll add in another derived type and I’ll modify the configuration to accommodate that as well.

public class TPTB: Base
{
  public string PropB { get; set; }
}

modelBuilder.Entity<Base>().Map<TPTA>(m => m.ToTable("TPTTableA"))
                           .Map<TPTB>(m => m.ToTable("TPTTableB"));

Execute the query again and look at the store query now!

SELECT 
[Extent1].[Id] AS [Id], 
[Extent1].[Name] AS [Name]
FROM  [dbo].[Bases] AS [Extent1]
LEFT OUTER JOIN  (SELECT 
    [Extent2].[Id] AS [Id]
    FROM [dbo].[TPTTableA] AS [Extent2]
UNION ALL
    SELECT 
    [Extent3].[Id] AS [Id]
    FROM [dbo].[TPTTableB] AS [Extent3]) AS [UnionAll1] ON [Extent1].[Id] = [UnionAll1].[Id]

Egad!

Now I’ll switch to the new bits (retargeting the project, removing EntityFramework.dll (4.1) and referencing System.Data.Entity.dll (4.2) instead).

The query against the new model (with two derived types) is trimmed back to all that’s truly necessary:

SELECT 
[Extent1].[Id] AS [Id], 
[Extent1].[Name] AS [Name]
FROM [dbo].[Bases] AS [Extent1]

Here’s some worse ugliness in EF 4.0. Forgetting the projection, I’m now querying for all Bases including the two derived types. I.e. “context.Bases”.

SELECT 
CASE WHEN (( NOT (([UnionAll1].[C2] = 1) AND ([UnionAll1].[C2] IS NOT NULL))) 
     AND ( NOT (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL)))) THEN '0X' WHEN (([UnionAll1].[C2] = 1) 
     AND ([UnionAll1].[C2] IS NOT NULL)) THEN '0X0X' ELSE '0X1X' END AS [C1], 
[Extent1].[Id] AS [Id], 
[Extent1].[Name] AS [Name], 
CASE WHEN (( NOT (([UnionAll1].[C2] = 1) AND ([UnionAll1].[C2] IS NOT NULL))) AND ( NOT (([UnionAll1].[C3] = 1) 
     AND ([UnionAll1].[C3] IS NOT NULL)))) THEN CAST(NULL AS varchar(1)) WHEN (([UnionAll1].[C2] = 1) 
     AND ([UnionAll1].[C2] IS NOT NULL)) THEN [UnionAll1].[PropA] END AS [C2], 
CASE WHEN (( NOT (([UnionAll1].[C2] = 1) AND ([UnionAll1].[C2] IS NOT NULL))) AND ( NOT (([UnionAll1].[C3] = 1) 
     AND ([UnionAll1].[C3] IS NOT NULL)))) THEN CAST(NULL AS varchar(1)) WHEN (([UnionAll1].[C2] = 1) 
     AND ([UnionAll1].[C2] IS NOT NULL)) THEN CAST(NULL AS varchar(1)) ELSE [UnionAll1].[C1] END AS [C3]
FROM  [dbo].[Bases] AS [Extent1]
LEFT OUTER JOIN  (SELECT 
    [Extent2].[Id] AS [Id], 
    [Extent2].[PropA] AS [PropA], 
    CAST(NULL AS varchar(1)) AS [C1], 
    cast(1 as bit) AS [C2], 
    cast(0 as bit) AS [C3]
    FROM [dbo].[TPTTableA] AS [Extent2]
UNION ALL
    SELECT 
    [Extent3].[Id] AS [Id], 
    CAST(NULL AS varchar(1)) AS [C1], 
    [Extent3].[PropB] AS [PropB], 
    cast(0 as bit) AS [C2], 
    cast(1 as bit) AS [C3]
    FROM [dbo].[TPTTableB] AS [Extent3]) AS [UnionAll1] ON [Extent1].[Id] = [UnionAll1].[Id]

And now with the new CTP:

SELECT 
CASE WHEN (( NOT (([Project1].[C1] = 1) AND ([Project1].[C1] IS NOT NULL)))
     AND ( NOT (([Project2].[C1] = 1) AND ([Project2].[C1] IS NOT NULL)))) THEN '0X' WHEN (([Project1].[C1] = 1)
     AND ([Project1].[C1] IS NOT NULL)) THEN '0X0X' ELSE '0X1X' END AS [C1], 
[Extent1].[Id] AS [Id], 
[Extent1].[Name] AS [Name], 
CASE WHEN (( NOT (([Project1].[C1] = 1) AND ([Project1].[C1] IS NOT NULL))) AND ( NOT (([Project2].[C1] = 1)
     AND ([Project2].[C1] IS NOT NULL)))) THEN CAST(NULL AS varchar(1)) WHEN (([Project1].[C1] = 1)
     AND ([Project1].[C1] IS NOT NULL)) THEN [Project1].[PropA] END AS [C2], 
CASE WHEN (( NOT (([Project1].[C1] = 1) AND ([Project1].[C1] IS NOT NULL))) AND ( NOT (([Project2].[C1] = 1)
     AND ([Project2].[C1] IS NOT NULL)))) THEN CAST(NULL AS varchar(1)) WHEN (([Project1].[C1] = 1)
     AND ([Project1].[C1] IS NOT NULL)) THEN CAST(NULL AS varchar(1)) ELSE [Project2].[PropB] END AS [C3]
FROM   [dbo].[Bases] AS [Extent1]
LEFT OUTER JOIN  (SELECT 
    [Extent2].[Id] AS [Id], 
    [Extent2].[PropA] AS [PropA], 
    cast(1 as bit) AS [C1]
    FROM [dbo].[TPTTableA] AS [Extent2] ) AS [Project1] ON [Extent1].[Id] = [Project1].[Id]
LEFT OUTER JOIN  (SELECT 
    [Extent3].[Id] AS [Id], 
    [Extent3].[PropB] AS [PropB], 
    cast(1 as bit) AS [C1]
    FROM [dbo].[TPTTableB] AS [Extent3] ) AS [Project2] ON [Extent1].[Id] = [Project2].[Id]

At first glance you may think “but it’s just as long and just as ugly” but look more closely:

Notice that the first query uses a UNION ALL to bring in the 2nd derived type, but the second uses another LEFT OUTER JOIN. Also the “cast(0 as bit)” is gone from the 2nd query. I am not a database performance guru but I’m hoping/guessing that all of the work involved to make this change was oriented towards better performance. Perhaps a DB guru can confirm. Google wasn’t able to. ;)

July 6th Update: I talked with Kati Iceva who is the EF query compiler (among other things) goddess on the EF team. She told me that the first query (projecting from the base) is where to look for benefits. The second one is not there yet. The fact that they’ve got the outer joins now rather than the Union and that they lost the extra cast is a setup for some future improvements to TPT queries that they’ll be able to implement. There is a TPT blog post forthcoming on the team blog, but if you’re interested enough in EF to read my blog, you probably read that one (and their EF Design blog) already.

I did look at the query execution plans in SSMS. They are different but I’m not qualified to understand the impact.

Here’s the plan from the first query (from EF 4 with the unions)

and the one from the CTP generated query:

EF4 books and EF 4.1

June 22, 2011 | Data Access | Julie

I’ve been asked repeatedly about the viability of my book, Programming Entity Framework, and the other EF4 books out there now that EF 4.1 has been released.

The EF4 books are still totally viable and important if you truly want to learn Entity Framework for the purpose of writing enterprise applications.

Entity Framework 4.1 simply adds two new features to Entity Framework. The first is an additional way to create a model, called “Code First”, the second is a stripped down version of the ObjectContext. They are *awesome* additions for those who like to program this way. And you can always hook back to the ObjectContext if/when you want to get the granular control you will likely need if you’re writing an enterprise app. I love code first and I love the DbContext. But I also love Database First for many scenarios, Model First for many scenarios and the extreme power that the ObjectContext provides me when I want it. And when I’m writing real apps, not just demos, I definitely want the ObjectContext available to me!

But…

The core APIs don’t change.

The need to understand the ins and outs of the conceptual model and mappings doesn’t change.

The need to understand how to use EF in an application shifts at the surface, but if you need to do more than simple drag & drop applications, you’ll need to understand how to work with the underlying API.

The need to understand how EF works and how you can manipulate it to improve performance, to do threading and to control transactions does not change.

The need to understand how changes are tracked and how relationships are managed and how to affect those features so that you can make application back ends work the way you want them to does not change.

Currently my recommendation is that if you think you want to use Code First and the DbContext, take a look at what’s in there (great blog posts by EF team at blogs.msdn.com/adonet, a series of 11 short videos (with accompanying articles and download samples) that I created for MSDN at msdn.com/data/ef, a new EF 4.1 course (and more coming) on Pluralsight and many articles & blog posts written by very knowledgeable community members) to learn how to use code first and DbContext when working with EF.

But if you are writing applications with Entity Framework, you’ll still want to learn how to program with it. And for that, you have three great books that you can learn from – all take a different approach and they are quite complementary to each other. Yes, mine is one of them and of course I recommend it. I spent a year on the first edition and another year on the second edition. I worked very hard to learn EF inside and out to help you learn it, too. I also recommend Entity Framework 4.0 Recipes and Entity Framework 4 in Action. Again, these books take a very different approach than mine and in my opinion, the three books complement each other nicely.

In case you think this might be a desperate attempt to sell more books, it is hardly that. Tech book authors get a *very* small royalty per book sold. Tech book writing is not typically a financially rewarding endeavor. Instead, the desperate attempt I’m making is to offer a better explanation for the many people who keep asking me when I will be updating my book to EF 4.1 – a better response than I can provide in the 140 characters on twitter.

The out of band release of EF 4.1 may be followed by others so right now EF is a moving target. Updating Programming Entity Framework to incorporate the new features just doesn’t make sense. I think using the core books to learn EF and then supplementing with the EF 4.1 content that can be produced and shared more quickly (online articles & videos from trusted resources) is your best bet.

Turning off Code First Database Initialization completely

June 21, 2011 | Data Access | Julie

I was using a hack to turn off the Database Initialization for code first. I didn’t want code first to do the model/database comparison, and found the best way I could figure out…I deleted the EDMMetadata table in the database.

Yes, a total hack.

Sergey Barskiy and Rowan Miller straightened me out.

Even though the custom conventions did not make it into EF 4.1, there are still some controls for the default conventions. I had totally forgotten about this. But this gives you one of the ways to turn off the initialization.

You do this in the OnModelCreating override in the context class:

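     protected override void OnModelCreating(DbModelBuilder modelBuilder)
     {
        // Removing this convention keeps the EdmMetadata table out of the model,
        // so code first has nothing to compare the model against.
        modelBuilder.Conventions.Remove<IncludeMetadataConvention>();
     }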

That’s System.Data.Entity.Infrastructure.IncludeMetadataConvention, if you are curious about which namespace I’m using.

The other way is to set the DbInitialization strategy to null, as opposed to letting code first just use the default (CreateDatabaseIfNotExists), using one of the others (e.g. DropCreateDatabaseIfModelChanges), or even using a custom strategy that you’ve created based on one of the built-in strategies.

You typically set DbInitialization strategies at application startup, for example in global.asax for web apps, and you do the same even if you are setting it to null.

Here’s an example of how to do that given that I have a context class called BlogContext that inherits from DbContext:

 Database.SetInitializer<BlogContext>(null);
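
As a rough sketch of where that line might live in a web app, here is a Global.asax.cs with the call placed in Application_Start. The MvcApplication class name is just what the MVC project template generates by default, and the empty BlogContext stands in for your real context, so treat both as assumptions rather than anything specific to this post.

```csharp
using System.Data.Entity;   // EF 4.1: DbContext and the static Database class

// Stand-in for the real context; yours would expose DbSet properties.
public class BlogContext : DbContext
{
}

// Code-behind for Global.asax (assumed default template class name).
public class MvcApplication : System.Web.HttpApplication
{
    protected void Application_Start()
    {
        // A null initializer means no initialization strategy runs
        // for BlogContext, so no model/database comparison is made.
        Database.SetInitializer<BlogContext>(null);
    }
}
```

The explicit type argument matters here: with a bare null, the compiler has nothing to infer the context type from.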

MVC3.1 Scaffolding Magic with Database (or Model) First, Not Just Code First

June 12, 2011 | Data Access | Julie

The MVC3.1 scaffolding that was released at Mix can auto-magically create an EF 4.1 DbContext class, the basic CRUD code in your controller and the relevant views all in one fell swoop. (Don’t forget the additional scaffolding tools that will build things more intelligently, i.e. with a Repository: http://blog.stevensanderson.com/2011/01/13/scaffold-your-aspnet-mvc-3-project-with-the-mvcscaffolding-package/.)

All of the demos of this, including my own [MVC 3 and EF 4.1 Code First: Here are my classes, now you do the rest, kthxbai], demonstrate the new scaffolding using Code First. In other words, just provide some classes and the scaffold will do the rest, including building the context class.

I saw a note on Twitter from someone asking about using this feature with an EDMX file instead of going the code first way. You can absolutely do that. Here’s a simple demo of how that works, using the in-the-box template in the MVC 3.1 toolkit, though admittedly, for my own purposes, I’m more likely to start with the template that creates a repository.

1) Start with a class library project where I’ve created a database first EDMX file.

2) Use the designer’s “Add Code Generation Item” and select the DbContext T4 template included with Entity Framework 4.1. That will add two templates to the project. The first creates a DbContext and the second creates a set of very simple classes – one for each of the entities in your model.

3) Add an ASP.NET MVC3 project to the solution.

4) Add a reference to the project with the model.

5) BUILD THE SOLUTION! This way the Controller wizard will find the classes in the other project.

6) Copy the connection string from the model project’s app.config into the mvc project’s web.config. Step 7 will complain otherwise.

7) Add a new controller to the project. Use the new scaffolding template (Controller with read/write actions and views using EF) and choose the generated class you want the controller for AND the generated context from the other project.

That will create the controller and the views. The controller has all of the super-basic data access code for read/write/update/delete of the class you selected. It’s a start. 🙂

Almost ready to run, but first a bit of housekeeping.

8) The most common step to forget! Modify the global.asax which will look for the Home controller by default. Change it so that the routing looks for your new controller.

9) Time to test out the app. Here’s the home page. I did some editing also. It all just works.
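
For step 8, the change is usually a one-word edit to the default route in Global.asax.cs. Here is a minimal sketch of what that might look like, assuming the scaffolded controller is named BlogsController (a made-up name for illustration; use whatever controller step 7 generated for you). The rest is roughly the routing code the MVC 3 project template produces.

```csharp
using System.Web.Mvc;
using System.Web.Routing;

public class MvcApplication : System.Web.HttpApplication
{
    public static void RegisterRoutes(RouteCollection routes)
    {
        routes.IgnoreRoute("{resource}.axd/{*pathInfo}");

        routes.MapRoute(
            "Default",                      // route name
            "{controller}/{action}/{id}",   // URL pattern
            // The template defaults to controller = "Home";
            // "Blogs" (hypothetical) points the site root at BlogsController.
            new { controller = "Blogs", action = "Index", id = UrlParameter.Optional }
        );
    }

    protected void Application_Start()
    {
        AreaRegistration.RegisterAllAreas();
        RegisterRoutes(RouteTable.Routes);
    }
}
```

Note that the route value is the controller name without the "Controller" suffix.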

I highly recommend checking out the alternate scaffolding templates available in the mvc3 scaffolding package linked to above.

Code First’s Column DataAnnotation and the ORDER attribute

June 10, 2011 | Data Access | Julie

The ColumnAttribute lets you change three things about property mapping to database columns.

The name
The order in which it appears in the field list in the table
The type

If you look at the documentation for ORDER on ColumnAttribute it simply tells you that it’s “zero-based” and I completely misunderstood that.

I set a property’s order to “1”

[Column("DateStarted",Order=1,TypeName ="date")]
public DateTime CreateDate { get; set; }

expecting it to magically become the 2nd column listed in the table.

But it was the first.

Here’s why, with thanks to David DeWinter for pointing me in the right direction.

The value is relative to all other properties.

And all of the other properties are not 0, 1, 2, 3, etc.

The default is Int32.MaxValue, i.e. 2,147,483,647.

So my DateStarted field had the lowest value after all. It was Order=1 while all of the others were Order=2,147,483,647.

You’d see the effect better if you set the order on a number of the properties. The lowest # you can use is zero, which is where “zero-based” comes in.
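
To make the relative ordering concrete, here is a small stand-alone sketch that imitates the behavior described above. It does not call Entity Framework at all; it just sorts property names by an order value, treating a missing order as Int32.MaxValue. The OrderColumns helper and the Id and Title properties are made up for illustration, and keeping declaration order for ties is an assumption of the sketch, not something I have confirmed about code first.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ColumnOrderSketch
{
    // Hypothetical helper: sorts property names by their Order value,
    // where null means "no Order was set" and is treated as int.MaxValue.
    public static List<string> OrderColumns(
        IEnumerable<KeyValuePair<string, int?>> properties)
    {
        return properties
            .OrderBy(p => p.Value ?? int.MaxValue) // LINQ's OrderBy is a stable sort
            .Select(p => p.Key)
            .ToList();
    }

    public static void Main()
    {
        var properties = new[]
        {
            new KeyValuePair<string, int?>("Id", null),
            new KeyValuePair<string, int?>("Title", null),
            new KeyValuePair<string, int?>("DateStarted", 1), // Order=1
        };

        // DateStarted has the lowest value, so it sorts first.
        Console.WriteLine(string.Join(", ", OrderColumns(properties)));
        // prints "DateStarted, Id, Title"
    }
}
```

Give Id an order of 0 in the sketch and DateStarted lands second, which is what I was expecting in the first place.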

Might have been obvious to others, but I was awfully confused. But you know, it’s Friday afternoon….

And I’ve been told the docs will get a clarification. 🙂

The Data Farm

Julie Lerman's World of Data