lazy load bug? customer.orders is null

Jul 4, 2010 at 6:18 PM

first thank you for your great code.

i found the lazy load does not work perfectly, here is my code:

var policy = new EntityPolicy();

policy.IncludeWith<Customer>(c => c.Orders, true);
policy.IncludeWith<Order>(o => o.Customer, true);

Northwind nw = new Northwind(this.provider.New(policy));

var cust = nw.Customers.Where(c => c.CustomerID == "ALFKI").First();
var order = cust.Orders.First();

var cust2 = order.Customer;
var orders2 = cust2.Orders;

AssertNotValue(null, orders2); //fail

is this a bug? or by design?

thank you!

Coordinator
Aug 11, 2010 at 10:15 PM

Are you using the customers and orders definitions from the toolkit tests?

Sep 7, 2010 at 8:28 AM

yes, i just added a new Test,

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using IQToolkit.Data;

namespace Test
{
    public class NorthwindAssociationTests : NorthwindTestHarness
    {
        public static void Run(Northwind db)
        {
            new NorthwindAssociationTests().RunTests(db, null, null, true);
        }

        public static void Run(Northwind db, string testName)
        {
            new NorthwindAssociationTests().RunTest(db, null, true, testName);
        }

        public void TestMultiLevelAssociation()
        {
            var policy = new EntityPolicy();

            policy.IncludeWith<Customer>(c => c.Orders, true);
            policy.IncludeWith<Order>(o => o.Customer, true);

            Northwind nw = new Northwind(this.provider.New(policy));

            var cust = nw.Customers.Where(c => c.CustomerID == "ALFKI").First();
            var order = cust.Orders.First();

            var cust2 = order.Customer;
            var orders2 = cust2.Orders;

            AssertNotValue(null, orders2); //fail
        }
    }
}

 

and then call it in the main function:

try
{
    var db = new Northwind(provider);

    //NorthwindTranslationTests.Run(db, true);
    //NorthwindExecutionTests.Run(db);
    //NorthwindCUDTests.Run(db);
    //MultiTableTests.Run(new MultiTableContext(provider.New(new AttributeMapping(typeof(MultiTableContext)))));
    //NorthwindPerfTests.Run(db, "TestStandardQuery");
    NorthwindAssociationTests.Run(db);
}
finally
{
    provider.Connection.Close();
}

 

Jan 3, 2012 at 3:33 AM

IQToolkit attempts to build the entire query expression at once, and was missing a critical ability to apply policy on other policy pieces. Simply applying the policy recursively like this would have led to other problems too. The main issue being that circular references between objects will never completely be satisfied. In the northwind database example, each order must query for the customer and each customer must query for that order. Applying the customer policy for each order and applying the order policy for each customer would be infinitely recursive. There are two problems to solve here.

What's necessary is the ability defer the creation of relationship query expressions, converting references into queries at the last-possible-point before they are actually going to be needed.

Even with deferred expression building, there will still be endlessly recursive relationships if neither lazy loading is not being used. This will be a particularly troublesome problem in the case of circular references.

To solve this problem, entities need to be cached. After an entity is materialized from the database, a copy of it needs to be known to the query provider, and in subsequent queries which would materialize the same entity, the cached copy should be used instead of repeatedly querying the database. 

With both of these features, it should be possible to always completely resolve all entity relationships. The worst case scenario is that to do so, you need to load the entire database into memory (and if you have a large database, this may be impossible).

To work with these features, you should really be cautious about how you define your policy and entitiies. If your policy has many relationships, you are going to be at risk of loading your entire database into memory. This can be mitigated by creating your entities with defer loaded properties (e.g. IDeferLoadable, IEnumerable, IList) and setting up policies that will use defered loading (second parameter must be set to true, the default is false).

My implementation uses a closure to propogate the cache to subqueries. A new entity cache is created for each linq expression you write. All subqueries will resolve relationships using the same cache. There may be some memory use drawbacks to this approach, but the alternatives seemed worse.

I decided against a global caching solution because it would eventually load the entire database, and then updates / deletes would never show up, or some cache invalidation strategy would be necesarry, or even worse the user would be responsible for manually invalidating the cache.

This closure based solution means that each linq query will create a dictionary, and special entity-key instances for each cached entity. Once all defer loading is completed i would expect that the dictionary could be garbage collected normally. For non-defer loaded properties the cache will disappear automatically. For defer loaded, you may be able to null out unnecesarry properties to kill the cache.

Another piece that became necesarry as a result of this change was that the context instance is important. All of the unit tests had to be refactored as highlighted in the code sample below.

        public void TestCustomersIncludeOrders()
        {
            var policy = new EntityPolicy();
            policy.IncludeWith<Customer>(c => c.Orders);
            // Northwind nw = new Northwind(this.provider.New(policy));
            NorthwindWithAttributes nw = new NorthwindWithAttributes(this.provider.New(policy));

            var custs = nw.Customers.Where(c => c.CustomerID == "ALFKI").ToList();
            AssertValue(1, custs.Count);
            AssertNotValue(null, custs[0].Orders);
            AssertValue(6, custs[0].Orders.Count);
        }

In the tests, each of the Northwind instances were changed to NorthwindWithAttributes instances. The reason is that the query provider needs to have an instance of the context. Before, the provider only needed to know the type of the context, and the instance was never used.

This is necesarry because the context (with it’s properties of type IEntityTable<T>) are used to generate the subqueries. Relationships are translated into LINQ expressions, and then the provider will be used to convert the LINQ expression into a DBEntityExpression when necesarry (e.g. when the member is assigned, or when defer loading is taking place).

There didn’t seem to be any great way to initially create this circular reference situation via constructors / etc...

public class Northwind
{
    public Northwind(IEntityProvider provider)
    {
        this.provider = provider;
        this.provider.BindContext(this);
    }
}
To work with this new system you must bind the provider to a context instance. For the unit tests i accomplished this in the context constructor.

I think there is a good chance that both the XmlMapping and ImplicitMapping are broken as a result of this patch. I did not have time to test the FindQueryable() implementations i wrote for either of these, and i’m almost completely unfamiliar with these mapping schemes. Fixing both should be a mostly trivial task though.

I’ve also not written as many test cases as i would have like to have written. All of the existing test cases pass, which satisfied me, however there are some additional considerations that are worth testing.

  • Multi-key relationships are untested and have a good chance of not working
  • I’m not clear whether there is any way that you could evaluate a relationship in an anonymous type, but i’d be worried about how that works too.
  • Anything clever about how primary key needs to be accessed could cause problems, i do have a test case for multi-table queries but there are a lot of things that might need to be consider that i just haven’t considered.

I would have also liked to allow the policy to control the entity-caching mechanism. It should be fairly trivial to let the policy decide whether entity caching should take place at the query level or for the lifetime of the database context.

There is probably a better way to initially setup the provider / context relationships. Ideally the context would be created first, and then the provider would require the context as a parameter. The context would implement some interface so that the binding relationship is setup correctly automatically.

public interface IIQToolkitContext
{
  IEntityProvider provider { get; set; }
}

For the time being the patch takes the easy-way out.

Patch can be found in the newly created issue tracker ticket: http://iqtoolkit.codeplex.com/workitem/17176