Use of Expression Trees in .NET for Lambda Decomposition

Expressions are the building block of any Lambda Expression. In C# 3.0, .NET framework has introduced a new technique to make use of Anonymous methods a better way. The "Little Gem" LINQ, uses Lambda expression extensively to invoke filter statements to IEnumerable objects and hence making the life for a programmer very easy. Simple lambda expressions like

x => x<5 

may come very handy as we do not need to define a method extensively for such a small purpose. C# also introduced a couple of Generic delegates like Func<...>, Action<...> etc. in Base Class Library which lets you to pass any type of argument to a method body. The one with Func are for methods which require something to return while ones with Action dont need to return anything.

To learn more about LINQ please have a quick pick on my article on LINQ and Lambda Expressions.But if you have already read about the basics of LINQ, it is not all that is needed. Lambda expressions are cool and extensive use of Anonymous methods are truly awesome, but basically when I think of using Lambda Expression in my own application, I always think of extending the Lambda expressions a bit further so that I could use it dynamically during runtime and later on compile them to produce the actual anonymous delegates. In this article I will put forward the story of Linq and Lambda Expressions further and introduce you with Expression Trees and also show you to create Expression Trees dynamically during runtime.

What is an Expression Tree?

Simply speaking, an expression tree is nothing but the representation of a Lambda Expression in terms of .NET objects. The Expression is the main class that holds any expression in it and it lets us to divide the whole body of the Expression into smaller individual units. For instance : 

Func<int, bool> mydelegate = x => x < 5;
Expression<Func<int, bool>> myexpressiondelegate = x => x < 5;

In the above code the delegate Func can take the expression x=>x<5 which represents an anonymous method that takes an integer argument and returns if the value is less than 5.
Download Sample - 38 KB




Decomposing an Expression ?

Now to decompose the method into an Expression Tree, lets wrap the Func into an Expression body.  The Expression has overloaded the assignment operator to decompose the body of the Function.

Now if you see the body of Expression you can see there are three parts in the whole Expression :


  1. ParameterExpression : An external parameter to the expression. Here it is X.
  2. BinaryExpression  : As the inner expression x<5 is Binary, it produces a BinaryExpression. Each of the Binary Expression has two Expressions body within it. The properties Left and Right. In our case the Expression has one ParameterExpression and another ConstantExpression.
    1. Left : Produces the left hand side of the BinaryExpression. In our case the left hand side represents the ParameterExpression the same object which is passed as X. 
    2. Right : Right represents the other side of the expression which is in our case is a constant term. So Right represents a ConstantExpression for us.
    3. NodeType : The nodetype gives you the idea what the BinaryExpression does. There are a lot of Binary types available. In our case it is LessThan. 
 Hence the entire decomposition will look like :

ParameterExpression externalParam = myexpressiondelegate.Parameters[0];
BinaryExpression bbody = myexpressiondelegate.Body as BinaryExpression;

ParameterExpression parambodyleft = bbody.Left as ParameterExpression;
ConstantExpression constRight = bbody.Right as ConstantExpression;
ExpressionType type = bbody.NodeType;

Each Expression has a ReadonlyCollection of Parameters that are passed within the body of the expression. In our case the Expression has one parameter. As you can see I have used Parameters[0] to get the Parameter that is passed to the Expression. The externalParameter represents X here and its Type is Int32.

The Expression has a body element which shows you the method body of the expression. In our case the body will hold only the part x<5. As it is a Binary expression, I can easily translate it into a reference of BinaryExpression. So in our case the Body represents the BinaryExpression.
Finally we parsed the BinaryExpression to get the Parameter X in the Left and ConstantExpression 5 in the Right property.

Now if you want to recreate the whole expression from these objects, you can do so using :
BlockExpression body = Expression.Block(
Expression.LessThan(parambodyleft, constRight));

Expression<Func<int, bool>> returnexpression = Expression.Lambda<Func<int, bool>>(body, param1);
Func<int, bool> returnFunc = returnexpression.Compile();


bool returnvalue = returnFunc(30);
The BlockExpression represents the actual body of the Expression. Based on the complexity of the code you need to define the Expression Block.

Finally the call to Compile() generates the IL code dynamically at runtime and basically gives you the example of Compiler as Service.

Now if you run the code, you will find the returnvalue holds false for 30 and true for 3.

So basically the Expression library has in built support to run expressions dynamically at runtime. This is basically the start of DLR (Dynamic Language Runtime) in .NET languages.

Story of IQueryable in context to Expressions

If you look closely into IQueryable interface or if you look into any of the Extension methods Where, Select etc. they are basically storing the annonymous method into an Expression and which eventually loads the whole Expression tree within it. So any minor change to the lambda expression will not eventually create a completely new anonymous method rather it will add up to the Expression tree and will be compiled dynamically on demand when the actual evaluation of the statement occurs.

Lets look at the structure of IQuerable :
public interface IQueryable : IEnumerable
{
    Type ElementType { get; }
    Expression Expression { get; }
    IQueryProvider Provider { get; }
}

You can see IQueryable holds an Expession, so when an expression is passed to an IQueryable it just holds it in a complete Expression Tree which it will evaluate using IQueryProvider at runtime. IQueryProvider actually calls CreateQuery to get the QueryEvaluation from the Expression and which eventually be evaluaed during runtime.

Expression Tree Object Hierarchy



Expression Trees are created based on a number of objects. When you define an Expression, you will be having internally created a number of Expression objects each of which bears a special meaning to the compiler.

The entire expression is built using those building blocks, like BinaryExpression based on its Type contains Left and Right properties which itself are Expression individually.  Any Logical Expression contains one UnaryExpression and a Constant Term. etc.

Practical Example

Now let us define a complex lambda expression and try to create the equivalent Expression Tree for the same :

The Lambda Expression :
Func<IEnumerable<int>, int, bool> dividesectionmethod = (x, y) =>
{
    int nos1 = 0;
    int nos2 = 0;
    foreach (int i in x)
    {
        if (i <= y)
            nos1++;
        else
            nos2++;
    }
    return nos1 > nos2;
}; 

In the above Expression Tree, there are two arguments x and y where X is actually an IEnumerable of integers and y is an intger value. The method Body actually creates an instance of two local variables and calculate the nos which are greater than y and less than y and finally returns which is greater.

Lambda looks like simple, but I took this to show a complex Expression Tree. Here we go :

ParameterExpression enumerableExpression = Expression.Parameter(typeof(IEnumerable<int>), "x");
ParameterExpression intexpression = Expression.Parameter(typeof(int), "y");

ParameterExpression localvarnos1 = Expression.Variable(typeof(int), "nos1");
ParameterExpression localvarnos2 = Expression.Variable(typeof(int), "nos2");
ConstantExpression zeroConstantintval = Expression.Constant(0);
BinaryExpression bexplocalnos1 = Expression.Assign(localvarnos1, zeroConstantintval);
BinaryExpression bexplocalnos2 = Expression.Assign(localvarnos2, zeroConstantintval);

//As Expression does not support Foreach we need to get Enumerator before doing loop

ParameterExpression enumerator = Expression.Variable(typeof(IEnumerator<int>), "enumerator");
BinaryExpression assignenumerator = Expression.Assign(enumerator, Expression.Call(enumerableExpression, typeof(IEnumerable<int>).GetMethod("GetEnumerator")));


var currentelement = Expression.Parameter(typeof(int), "i");
var callCurrent = Expression.Assign(currentelement, Expression.Property(enumerator, "Current"));

BinaryExpression firstlessequalsecond = Expression.LessThanOrEqual(currentelement, intexpression);

MethodCallExpression movenext = Expression.Call(enumerator, typeof(IEnumerator).GetMethod("MoveNext"));

LabelTarget looplabel = Expression.Label("looplabel");
LabelTarget returnLabel = Expression.Label(typeof(bool), "retval");

      

BlockExpression block = Expression.Block(
        new ParameterExpression[] { 
            localvarnos1, localvarnos2, enumerator, currentelement },
        bexplocalnos1,
        bexplocalnos2,
        assignenumerator,
        Expression.Loop(
            Expression.IfThenElse(
                Expression.NotEqual(movenext, Expression.Constant(false)),
                Expression.Block(
                    callCurrent,
                    Expression.IfThenElse(
                        firstlessequalsecond,
                        Expression.Assign(
                            localvarnos1,
                            Expression.Increment(localvarnos1)),
                        Expression.Assign(
                            localvarnos2,
                            Expression.Increment(localvarnos2)))),
                Expression.Break(looplabel)),
                looplabel),
                Expression.LessThan(localvarnos1, localvarnos2));

Expression<Func<IEnumerable<int>, int, bool>> lambda =
    Expression.Lambda<Func<IEnumerable<int>, int, bool>>(
        block,
        enumerableExpression,
        intexpression);
            

Func<IEnumerable<int>, int, bool> dividesectionmethod = lambda.Compile();

If you see the code above you can see we have actually mimic the process of building the Expression using the Tree.  Lets follow the Steps :

  1. In the first two lines, we declare the ParameterExpression to pass IEnumerable<int> and int as argument. We call them as x and y.



    ParameterExpression enumerableExpression = Expression.Parameter(typeof(IEnumerable<int>), "x");
    ParameterExpression intexpression = Expression.Parameter(typeof(int), "y");
    
  2. Next, we declare two objects of nos1 and nos2 and  assigned 0 as initial values. Expression.Constant(0); lets me to define a ConstantExpression which  is eventually assigned to both the variables. 
  3. As you must know, there is no support for foreach in Expression as we do for normal language, we need to use Enumerator to Move each objects in IEnumerable.  We declare a ParameterExpression which we would use later for the loop and assign the IEnumerator that we get by calling GetEnumerator method of IEnumerable.
  4. We define the CurrentElement using the Assignment call to Current property in IEnumerator. 
  5. The MethodCallExpression is used to call a method. We used it to call MoveNext in the Loop. 
Putting all things together : 

So eventually when building the actual expression I would use BlockExpression to define the Expression.Block.  When defining the Block, always remember to specify all the external parameters that you need to use while creating the object.

BlockExpression block = Expression.Block(
    new ParameterExpression[] { 
        localvarnos1, localvarnos2, enumerator, currentelement },
    bexplocalnos1,
    bexplocalnos2,
    assignenumerator,
    Expression.Loop(
        Expression.IfThenElse(
            Expression.NotEqual(movenext, Expression.Constant(false)),
            Expression.Block(
                callCurrent,
                Expression.IfThenElse(
                    firstlessequalsecond,
                    Expression.Assign(
                        localvarnos1,
                        Expression.Increment(localvarnos1)),
                    Expression.Assign(
                        localvarnos2,
                        Expression.Increment(localvarnos2)))),
            Expression.Break(looplabel)),
            looplabel),
            Expression.LessThan(localvarnos1, localvarnos2));

Here you can see I have first declared the local variables and assigned the value of Enumerator. Expression.Loop is used to create  a loop inside an expression. You must notice every Expression.Loop has two parts.
  • Body of the Loop.
  • Label to Break the Loop.
For the break on the loop we declared a Label.  Label indicates a GoTo label statement which we call eventually from inside of the loop to create a break on the loop. Expression.Break(label) is used to break the Expression.Loop to a certain Label.

Every Expression also have few methods for Logical Statements like IfThen, IfThenElse etc. We have used them to increment or decrement the local variables and Eventually we return the boolean expression using LessThan.

So we have just created the BlockExpression that eventually holds the entire Expression tree.  If you see the block it look like :

.Block(
    System.Int32 $nos1,
    System.Int32 $nos2,
    System.Collections.Generic.IEnumerator`1[System.Int32] $enumerator,
    System.Int32 $i) {
    $nos1 = 0;
    $nos2 = 0;
    $enumerator = .Call $x.GetEnumerator();
    .Loop  {
        .If (.Call $enumerator.MoveNext() != False) {
            .Block() {
                $i = $enumerator.Current;
                .If ($i <= $y) {
                    $nos1 = .Increment($nos1)
                } .Else {
                    $nos2 = .Increment($nos2)
                }
            }
        } .Else {
            .Break looplabel { }
        }
    }
    .LabelTarget looplabel:;
    $nos1 < $nos2
}

Hence it seems to be the same expression that I defined just above.
Now to get the Expression object we use Expression.Lambda call. This will cast the BlockExpression to actual Lambda expression .

Expression<Func<IEnumerable<int>, int, bool>> lambda =
    Expression.Lambda<Func<IEnumerable<int>, int, bool>>(
        block,
        enumerableExpression,
        intexpression);

The Lambda call takes passes the BlockExpression for its body and all the parameters that you need to pass into the BlockExpression.

Func<IEnumerable<int>, int, bool> dividesectionmethod = lambda.Compile();

Finally we use Compile method to generate the actual anonymous method during runtime.

Download Sample - 38 KB

Conclusion

I found a lot of fun while creating these expression. Its cool and also gives me a flavour of DLR capabilities that were introduced with C# lately. I am still learning a lot regarding this.
I would like you to criticize me if I did any mistake so that I could make this more worthy for others.

For further reading :
Expression Trees, Take Two

Thanks for reading. Like to see your feedbacks.
Shout it Submit this story to DotNetKicks Bookmark and Share
Read Disclaimer Notice

4 comments:

Anonymous said...

Good explanation and images.

cool

Anonymous said...

Very usefull

Anonymous said...

So where this advanced example would actually be usefull? Why to want to create a lamda on the runtime?

Abhishek Sur said...

@Anonymous

Expression Trees are an attempt to add dynamic compile capabilities like Reflection.Emit classes. There is very strict requirement, when you want your own application to generate IL at runtime. Actually sometimes you need your enduser to give the logic and which is not finite. May be in such cases. Lambda expression in IQueryable generally maintains an Expression tree and avoids unnecessary creation of types during compilation.

Post a Comment

Please make sure that the question you ask is somehow related to the post you choose. Otherwise you post your general question in Forum section.

Author's new book

Abhishek authored one of the best selling book of .NET. It covers ASP.NET, WPF, Windows 8, Threading, Memory Management, Internals, Visual Studio, HTML5, JQuery and many more...
Grab it now !!!