I want to look at some of the vast array of great improvements coming to EF that are part of the June 2011 CTP that was released yesterday.
Everyone’s going on and on about the enums.
There’s a lot more in there. Not sure how many I’ll cover but first up will be the TPT store query improvements.
I’ll use a ridiculously simple model with code first to demonstrate and I’ll share with you a surprise discovery I made (which began with some head scratching this morning).
Beginning with one base class and a single derived class.
public class Base
{
public int Id { get; set; }
public string Name { get; set; }
}
public class TPTA: Base
{
public string PropA { get; set; }
}
By default, code first does TPH and would create a single database table with both types represented. So I use a fluent mapping to force this to TPT. (Note that you can’t do this configuration with Data Annotations).
public class Context: DbContext
{
public DbSet<Base> Bases { get; set; }
protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
modelBuilder.Entity<Base>().Map<TPTA>(m => m.ToTable("TPTTableA"));
}
}
Here’s my query:
from c in context.Bases select new {c.Id, c.Name};
Notice I’m projecting only fields from the base type.
And the store query that results:
SELECT
[Extent1].[Id] AS [Id],
[Extent1].[Name] AS [Name]
FROM [dbo].[Bases] AS [Extent1]
Not really anything wrong there. So what’s the big deal? (This is where I got confused… )
Now I’ll add in another derived type and I’ll modify the configuration to accommodate that as well.
public class TPTB: Base
{
public string PropB { get; set; }
}
modelBuilder.Entity<Base>().Map<TPTA>(m => m.ToTable("TPTTableA"))
.Map<TPTB>(m => m.ToTable("TPTTableB"));
Execute the query again and look at the store query now!
SELECT
[Extent1].[Id] AS [Id],
[Extent1].[Name] AS [Name]
FROM [dbo].[Bases] AS [Extent1]
LEFT OUTER JOIN (SELECT
[Extent2].[Id] AS [Id]
FROM [dbo].[TPTTableA] AS [Extent2]
UNION ALL
SELECT
[Extent3].[Id] AS [Id]
FROM [dbo].[TPTTableB] AS [Extent3]) AS [UnionAll1] ON [Extent1].[Id] = [UnionAll1].[Id]
Egad!
Now, after switching to the new bits (retargeting the project and removing EntityFramework.dll (4.1) and referencing System.Data.Entity.dll (4.2) instead)
The query against the new model (with two derived types) is trimmed back to all that’s truly necessary:
SELECT
[Extent1].[Id] AS [Id],
[Extent1].[Name] AS [Name]
FROM [dbo].[Bases] AS [Extent1]
Here’s some worse ugliness in EF 4.0. Forgetting the projection, I’m now querying for all Bases including the two derived types. I.e. “context.Bases”.
SELECT
CASE WHEN (( NOT (([UnionAll1].[C2] = 1) AND ([UnionAll1].[C2] IS NOT NULL)))
AND ( NOT (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL)))) THEN '0X' WHEN (([UnionAll1].[C2] = 1)
AND ([UnionAll1].[C2] IS NOT NULL)) THEN '0X0X' ELSE '0X1X' END AS [C1],
[Extent1].[Id] AS [Id],
[Extent1].[Name] AS [Name],
CASE WHEN (( NOT (([UnionAll1].[C2] = 1) AND ([UnionAll1].[C2] IS NOT NULL))) AND ( NOT (([UnionAll1].[C3] = 1)
AND ([UnionAll1].[C3] IS NOT NULL)))) THEN CAST(NULL AS varchar(1)) WHEN (([UnionAll1].[C2] = 1)
AND ([UnionAll1].[C2] IS NOT NULL)) THEN [UnionAll1].[PropA] END AS [C2],
CASE WHEN (( NOT (([UnionAll1].[C2] = 1) AND ([UnionAll1].[C2] IS NOT NULL))) AND ( NOT (([UnionAll1].[C3] = 1)
AND ([UnionAll1].[C3] IS NOT NULL)))) THEN CAST(NULL AS varchar(1)) WHEN (([UnionAll1].[C2] = 1)
AND ([UnionAll1].[C2] IS NOT NULL)) THEN CAST(NULL AS varchar(1)) ELSE [UnionAll1].[C1] END AS [C3]
FROM [dbo].[Bases] AS [Extent1]
LEFT OUTER JOIN (SELECT
[Extent2].[Id] AS [Id],
[Extent2].[PropA] AS [PropA],
CAST(NULL AS varchar(1)) AS [C1],
cast(1 as bit) AS [C2],
cast(0 as bit) AS [C3]
FROM [dbo].[TPTTableA] AS [Extent2]
UNION ALL
SELECT
[Extent3].[Id] AS [Id],
CAST(NULL AS varchar(1)) AS [C1],
[Extent3].[PropB] AS [PropB],
cast(0 as bit) AS [C2],
cast(1 as bit) AS [C3]
FROM [dbo].[TPTTableB] AS [Extent3]) AS [UnionAll1] ON [Extent1].[Id] = [UnionAll1].[Id]
And now with the new CTP:
SELECT
CASE WHEN (( NOT (([Project1].[C1] = 1) AND ([Project1].[C1] IS NOT NULL)))
AND ( NOT (([Project2].[C1] = 1) AND ([Project2].[C1] IS NOT NULL)))) THEN '0X' WHEN (([Project1].[C1] = 1)
AND ([Project1].[C1] IS NOT NULL)) THEN '0X0X' ELSE '0X1X' END AS [C1],
[Extent1].[Id] AS [Id],
[Extent1].[Name] AS [Name],
CASE WHEN (( NOT (([Project1].[C1] = 1) AND ([Project1].[C1] IS NOT NULL))) AND ( NOT (([Project2].[C1] = 1)
AND ([Project2].[C1] IS NOT NULL)))) THEN CAST(NULL AS varchar(1)) WHEN (([Project1].[C1] = 1)
AND ([Project1].[C1] IS NOT NULL)) THEN [Project1].[PropA] END AS [C2],
CASE WHEN (( NOT (([Project1].[C1] = 1) AND ([Project1].[C1] IS NOT NULL))) AND ( NOT (([Project2].[C1] = 1)
AND ([Project2].[C1] IS NOT NULL)))) THEN CAST(NULL AS varchar(1)) WHEN (([Project1].[C1] = 1)
AND ([Project1].[C1] IS NOT NULL)) THEN CAST(NULL AS varchar(1)) ELSE [Project2].[PropB] END AS [C3]
FROM [dbo].[Bases] AS [Extent1]
LEFT OUTER JOIN (SELECT
[Extent2].[Id] AS [Id],
[Extent2].[PropA] AS [PropA],
cast(1 as bit) AS [C1]
FROM [dbo].[TPTTableA] AS [Extent2] ) AS [Project1] ON [Extent1].[Id] = [Project1].[Id]
LEFT OUTER JOIN (SELECT
[Extent3].[Id] AS [Id],
[Extent3].[PropB] AS [PropB],
cast(1 as bit) AS [C1]
FROM [dbo].[TPTTableB] AS [Extent3] ) AS [Project2] ON [Extent1].[Id] = [Project2].[Id]
At first glance you may think “but it’s just as long and just as ugly” but look more closely:
Notice that the first query uses a UNION for the 2nd derived type but the second uses another LEFT OUTER JOIN. Also the “Cast 0 as bit” is gone from the 2nd query. I am not a database performance guru but I’m hoping/guessing that all of the work involved to make this change was oriented towards better performance. Perhaps a DB guru can confirm. Google wasn’t able to.
July 6th Update: I talked with Kati Iceva who is the EF query compiler (among other things) goddess on the EF team. She told me that the first query (projecting from the base) is where to look for benefits. The second one is not there yet. The fact that they’ve got the outer joins now rather than the Union and that they lost the extra cast is a setup for some future improvements to TPT queries that they’ll be able to implement. There is a TPT blog post forthcoming on the team blog so but if you’re interested enough in EF to read my blog, you probably read that one (and their EF Design blog) already.
I did look at the query execution plans in SSMS. They are different but I’m not qualified to understand the impact.
Here’s the plan from the first query (from EF 4 with the unions)
and the one from the CTP generated query: