EF Table Splitting – The Opposite of Entity Splitting

There’s an Entity Framework mapping that I have paid very little attention to because I didn’t realize it was possible. It was possible, just not obvious or easy in VS2008 SP1.  Thanks to Adam Cogan, who asked a question about this feature, I’ve discovered that it gets a lot more usable in EF4.

The mapping is Table Splitting.

You may be familiar with Entity Splitting, where one Entity maps to two different tables.

My classic example of Entity Splitting begins with these two tables in my database:

entitysplittables

There is a one to one relationship, a matching primary key (PersonID) and PersonalDetails merely extends the Person table.

In the model I can combine the fields into a single entity by mapping both tables to the entity.

entiitysplitmodel

Table Splitting is just the opposite. I want to create two entities from a single table.

Take a look at Noam Ben-Ami’s December 2008 blog post about table splitting in EFv1. The only missing piece was that this requires a constraint that you had to add manually in the XML. That’s no longer the case with EF4 (in teh Beta 1 that I’m using). I’ll walk you through it.

Here’s the AdventureWorksLT SalesOrderHeader table along with the default entity created from it. It has a LOT of fields/properties.

tablesplitting_table tablesplitting_modela

I will create a separate entity just to house all of the monetary data (SubTotal, TaxAmt, Frieght and TotalDue).

Following Noam’s walkthrough, I copy & paste teh entity, then remove the unwanted fields from each entity.

tablesplitting_modelb

The new entity needs to be mapped. It maps to SalesOrderHeader just like the original entity.

tablesplitting_modelab

The next step is to create a 1:1 association between the two.

tablesplitting_modelc

And then map the association. (See how much work the wizard saves you when you are bringing pre-existing relationships in from a database?)

tablesplitting_modeld1

This is the big difference with EF4, the constraint is automatically created for me whereas in v1 you had to build it yourself. Even if you had to build it yourself, it’s a lot easier now with the constraint being in the designer.

tablesplitting_modeld

Here’s the detail window behind the constraint.

tablesplitting_modele

Now the model validates. What about some code?

I wrote a bit of code that grabs just the header, modifies and saves that, then loads a related MoneyDetail and modifies and saves that entity.

                List<SalesOrderHeader> orders = context.SalesOrderHeaders.ToList();
                SalesOrderHeader order = orders[0];
                order.SalesOrderNumber = "JS321";
                context.SaveChanges();
                order.OrderMoneyDetailsReference.Load();
                order.OrderMoneyDetails.TaxAmt=46;
                context.SaveChanges();
  
SQL Profiler 

SQL Profiler shows me the individual database requests.

On the first query, SQL Server queries for just the fields needed for the SalesOrderHeader entities. None of the money fields were requested.

SELECT
1 AS [C1],
[Extent1].[SalesOrderID] AS [SalesOrderID],
[Extent1].[RevisionNumber] AS [RevisionNumber],
[Extent1].[OrderDate] AS [OrderDate],
[Extent1].[DueDate] AS [DueDate],
[Extent1].[ShipDate] AS [ShipDate],
[Extent1].[Status] AS [Status],
[Extent1].[OnlineOrderFlag] AS [OnlineOrderFlag],
[Extent1].[SalesOrderNumber] AS [SalesOrderNumber],
[Extent1].[PurchaseOrderNumber] AS [PurchaseOrderNumber],
[Extent1].[AccountNumber] AS [AccountNumber],
[Extent1].[ShipMethod] AS [ShipMethod],
[Extent1].[CreditCardApprovalCode] AS [CreditCardApprovalCode],
[Extent1].[Comment] AS [Comment],
[Extent1].[rowguid] AS [rowguid],
[Extent1].[ModifiedDate] AS [ModifiedDate],
[Extent1].[BillToAddressID] AS [BillToAddressID],
[Extent1].[ShipToAddressID] AS [ShipToAddressID],
[Extent1].[CustomerID] AS [CustomerID]
FROM [SalesLT].[SalesOrderHeader] AS [Extent1]

Then I get the update

exec sp_executesql N'declare @p int
update [SalesLT].[SalesOrderHeader]
set @p = 0
where ([SalesOrderID] = @0)
select [SalesOrderNumber], [TotalDue]
from [SalesLT].[SalesOrderHeader]
where @@ROWCOUNT > 0 and [SalesOrderID] = @0',N'@0 int',@0=71774

Next I get the query for the other fields, as a result of the Load.

exec sp_executesql N'SELECT 
1 AS [C1], 
[Extent1].[SalesOrderID] AS [SalesOrderID], 
[Extent1].[SubTotal] AS [SubTotal], 
[Extent1].[TaxAmt] AS [TaxAmt], 
[Extent1].[Freight] AS [Freight], 
[Extent1].[TotalDue] AS [TotalDue]
FROM [SalesLT].[SalesOrderHeader] AS [Extent1]
WHERE [Extent1].[SalesOrderID] = @EntityKeyValue1',N'@EntityKeyValue1 int',@EntityKeyValue1=71774

and finally, the second update that pushes in the new Tax Amount value.

exec sp_executesql N'update [SalesLT].[SalesOrderHeader]
set [TaxAmt] = @0
where ([SalesOrderID] = @1)
select [SalesOrderNumber], [TotalDue]
from [SalesLT].[SalesOrderHeader]
where @@ROWCOUNT > 0 and [SalesOrderID] = @1',N'@0 decimal(19,4),@1 int',@0=46.0000,@1=71774

So..Table Splitting is a totally viable (and useful) type of mapping that has been around since EF was released with VS2008 SP1, but most people probably thought it wasn’t possible because of the error you would get if you didn’t know to set the constraint.

You’ll still get the error, but setting the constraint is a heck of a lot easier now.

I will find out why the constraint

#1 Dave on 8.12.2009 at 5:07 AM

...What happens if you "insert" one entity and the other entity has not-null limitations? In your example, create a SalesOrderHeader and have a not-null constraight on Taxamt.

#2 David D. on 10.14.2009 at 1:17 PM

Hi Julie,

I am currently reading your book and enjoying it so far. I'm having a problem with persistance in table splitting. I have a table thats split into two entities and it retrieves and updates fine. But I get the following error when I try to save a new one: "A dependent property in a ReferentialConstraint is mapped to a storage-generated column.". Any help would be appreciated.

#3 Leonard Lobel on 8.15.2010 at 2:18 PM

Hi Julie,

First let me say I'm a big fan and love your work. Thanks for sharing your knowledge!

I have a simple entity-splitting scenario, which I (so far) haven’t been able to properly implement with EF4.

Two tables: Customer and CustomerPreference. One row for every customer in the Customer table, which has a CustomerID primary key that’s an int identity. Some (but not all) customers have a single (one-to-one) related row in the CustomerPreference table, which also has a CustomerID primary key that’s an int (but not an identity, naturally). Essentially, some customers have “extended” information in a secondary table, which I want to abstract in EF4.

I have created the model with a single Customer entity that includes properties for the columns in both the Customer and CustomerPreference tables. If every customer has a row in both tables, everything works fine. But that’s not the case here—nor is it the case in many typical scenarios where vertical table partitioning is used to prevent needless nulls across many columns for rows that don’t required the “extended” information.

To get this to work, I’m encountering two problems. I have found a workaround for the first problem, but certainly hope to hear there’s a better way. And I’m completely stumped on the second problem.

Problem #1: The order of table names JOINed in the dynamically-generated FROM clause

Using SQL Profiler to examining the dynamic SQL generated by the EF runtime, it appears that the order in which you map tables to the entity in the EF designer directly impacts the order in which table names are JOINed in the dynamically-generated FROM clause. I certainly want to select from Customer first, and then join on CustomerPreference, because doing it in the opposite order could never return all customers; it would only return customers that had preferences. But the generated SQL was nevertheless JOINing the tables in the wrong order: CustomerPreference to Customer.

I “encouraged” EF to generate the correct SQL with this workaround: In particular, it seems that the order of tables in the generated SQL gets reversed(!) from the order in which they’re mapped in the designer. That is, if you map entity properties in the designer to the CustomerPreference table first, and then to the Customer table, the dynamically generated SQL JOINs the Customer table to the CustomerPreference as desired, rather than the undesired behavior of JOINing CustomerPreference to Customer, when entity properties are mapped in order of Customer, CustomerPreference.

So reversing the order of the mappings in the designer did the trick, but that's certainly not intuitive! I’m wondering why the runtime can’t figure out that the Customer table should always be specified first in the FROM clause (regardless of the order of mappings in the designer), since that’s the "master" table—being that it’s the only one with the “identity” PK.

Problem #2: The SQL always uses INNER JOIN when I really need a LEFT JOIN

Again, not every “master” row in Customer has an “extended” row in CustomerPreference. That means I need a LEFT JOIN and not an INNER JOIN. So even though I got the order of the tables right in the FROM clause, I'm still not retrieving customers that have no preferences! I’m hoping this is something simple, and I’m just being dense for not finding it myself, but where can I tell EF to use LEFT rather than INNER? This is an absolute EF deal-breaker. If I can't, then I can't use EF for entity-splitting.

Thanks in advance for your help...

~ Lenni