Entity Framework: Filtering Related Entity Collections

I’ve been using Entity Framework with SQL Server for several years but today I ran into a scenario that I can’t believe I’ve never encountered before. I was writing a LINQ query to pull back a single entity along with its related entity collections. The catch is that I only need a portion of the results for one of the related entity collections. Getting this to work proved to be a little less intuitive than I would have thought.  So, I thought I would write up a quick post to show you the issue and how I was able to get it to work.

I created a simplified example of the scenario to illustrate what I was trying to do.  Below is a database diagram that shows the tables for the query.  Basically, there is a product catalog that contains 1-n products from 1-n vendors.  Each vendor should only be able to edit products that belong to them.

database diagram

Eagerly loading with .Include()

I’ve always used the .Include() method to eagerly load any related entity collections that I needed populated by the query.

var catalog = (from c in context.Catalogs.Include(p => p.Products)
	       where c.CatalogId == 1
	       select c).FirstOrDefault();

This method works great when you want all of the related rows returned  as part of the query.  Executing the query above against the sample dataset in LINQPad returns the 6 rows shown below.  Notice that there are 2 vendors with 3 products each.

eagerly loaded query

But what if I only want to return the rows where VendorId = 1?  Using .Include(), there is no way to accomplish this.  As of Entity Framework 6, there isn’t a way to filter which related entities are loaded when using .Include().

Explicit Loading

The answer to this issue is to explicitly load the related collections.  With explicit loading you get access to the underlying query that will be used to load the collection and apply some filtering criteria to it.  Once you have adjusted the query to meet your needs, calling .Load() on the query will fill the collection.  Below is our modified LINQ query.  It isn’t as clean or concise since it is now 2 statements, but it does work.

var catalog = (from c in context.Catalogs
	       where c.CatalogId == 1
	       select c).FirstOrDefault();

context.Entry(catalog)
	.Collection(p => p.Products)
	.Query()
	.Where(p => p.VendorId == 1)
	.Load();

Notes: Make sure that you are “Using” the System.Data.Entity namespace.  Some of the extension methods above are in that namespace and won’t resolve without the using directive. Also, I found that lazy loading must be disabled on the context for explicit loading to work (more on lazy loading below).

Executing the LINQ statement(s) above now returns the desired results.

explicitly loaded query

Lazy Loading

As mentioned above, lazy loading must be disabled on the context for explicit loading to work. If lazy loading is enabled, an additional query will be executed against the database the first time your collection is accessed and your explicitly loaded results will be overwritten.

using(var context = new TestContext())
{
    context.Configuration.LazyLoadingEnabled = false;

    var catalog = (from c in context.Catalogs
		   where c.CatalogId == 1
		   select c).FirstOrDefault();


    context.Entry(catalog)
           .Collection(p => p.Products)
	   .Query()
	   .Where(p => p.VendorId == 1)
	   .Load();
}

If you aren’t disabling lazy loading on the context for the entire application and are trying to manage it on a method by method basis the lifetime and scope of the context and whether lazy loading is enabled / disabled at the time your related entity collection is accessed are key points to keep in mind. If you find you are getting unexpected results, open SQL Server Profiler and watch what queries are being executed against the database and when.