DLR Expression Trees status #3296
-
I wanted to ask what the relationship is between "Expression Trees v2" described in [1] and the changes proposed in #158 and partially implemented by @bartdesmet. I haven't been able to read all of the nearly 200 pages of that document, so maybe someone knows: are some of the expression trees described there specific to the DLR, or could all of them (in theory) be implemented for C#?

[1] https://github.com/IronLanguages/dlr/blob/master/Docs/expr-tree-spec.pdf
Replies: 3 comments
-
It's complicated :-). Some historical context may be useful.

Expression trees were initially introduced in .NET Framework 3.5 with C# 3.0 and VB 9.0 as part of the LINQ feature set. There are two parts to it: the System.Linq.Expressions API, and language support for converting a lambda expression to Expression<TDelegate> (quotation). For example:

    Expression<Func<int, int>> f = x => x * 2;

is turned into

    var x = Expression.Parameter(typeof(int), "x");
    Expression<Func<int, int>> f = Expression.Lambda<Func<int, int>>(Expression.Multiply(x, Expression.Constant(2)), x);

With very few exceptions (multi-dimensional array initializers and assignment expressions come to mind), pretty much all expressions in the C# language were supported for use in expression tree lambdas. Examples include literals, variables, unary operators, binary operators, the conditional operator, member access, indexing, method and delegate invocations, new operators, etc. Statements, on the other hand, were not supported, so the following would not work:

    Expression<Func<int, int>> f = x => { return x * 2; };

Thus, no blocks, if, switch, while, for, do, foreach, using, lock, goto, etc. statements. This limitation existed both at the level of the API and in the language support.

Arguably, statements were not needed for LINQ query providers, where the lambdas being converted to expression trees originate from the lowering of query expressions, or appear in fluent interface patterns with the LINQ standard query operators:

    from x in xs where x > 0 select x * 2

becomes

    xs.Where(x => x > 0).Select(x => x * 2)

where the lambdas may get converted to a delegate type or an expression tree type, depending on the query source (e.g. [...]

    xs.Where(x => { return x > 0; }).Select(x => { return x * 2; })

Prior to continuing the story, I should point out that expression trees not only represent code as data structures, but also support runtime compilation (using [...] Either way, this support is available behind the [...]

    Expression<Func<int, int>> f = x => x * 2;
    Func<int, int> g = f.Compile();
    int answer = g(21);

So much for C# 3.0; on to C# 4.0 and .NET Framework 4.0. Two things happened here:
The DLR project extended [...]

Moreover, the nodes that were added turned out to be more primitive building blocks rather than capturing higher-level intent. So, they don't model the union of the language constructs available in all target languages; rather, they look at a common intersection. That is, there is no such thing as a node representing a [...]

This is quite different from the .NET 3.5 expression trees, which were modeled closer to language constructs in a WYSIWYG fashion. If a language like C# were to target these expression tree APIs to support statement bodies, something like:

    Expression<Action<int>> f = (int i) =>
    {
        while (i > 0)
        {
            Console.WriteLine(i--);
        }
    };

would not look like a [...]

    Expression<Action<int>> f = (int i) =>
    {
        C:
        {
            if (!(i > 0))
                goto B;
            Console.WriteLine(i--);
            goto C;
        }
        B:
        ;
    };

represented by a [...]

That is, expression trees have become a code generation target, rather than a quotation mechanism that preserves the original user intent. If C# were to support statement trees using just the new APIs, [...]

In fact, in C# 3.0, there was already some lossy conversion, so expression trees were never really "pure" quotations. Two examples come to mind:

    int x = 1;
    Expression<Func<int>> f = () => x;

won't look at all like a [...]

The second example is nested query expressions:

    Expression<Func<IEnumerable<int>, IEnumerable<int>>> f = xs => from x in xs where x > 0 select x + 1;

This won't look like some [...]

    Expression<Func<IEnumerable<int>, IEnumerable<int>>> f = xs => xs.Where(x => x > 0).Select(x => x + 1);

While this is straightforward for simple queries, it becomes more cumbersome when operators like [...]

There are more examples to do with implicit conversions sneaking into expression trees as [...]

More recently, support for interpolated strings ended up in expression trees, and shows up as a lowered [...]

    Expression<Func<int, string>> f = x => $"The answer is {x}";

looks like

    Expression<Func<int, string>> f = x => string.Format("The answer is {0}", x);

This may not look like a big deal if the primary goal is to compile and evaluate the expression at runtime. However, if you're writing a query provider or some other form of expression tree transpiler (e.g. to some DSL) at runtime, where the target language supports interpolation, you're now faced with a decompilation task, at runtime, to turn [...]

The DLR extensions were primarily meant to support the Iron* languages, which produce expression trees at runtime and have them compiled, at runtime, into efficient IL code. They're the runtime backend for such languages. At the same time, this enabled C# 4.0 to introduce dynamic. [...]

    dynamic Add(dynamic a, dynamic b)
    {
        return a + b;
    }

the C# compiler generates code using

    Microsoft.CSharp.RuntimeBinder.Binder.BinaryOperation(flags, ExpressionType.Add, context, new[] { arg1, arg2 })

where [...]

This whole binder business gets wrapped in a [...]

    // I got two objects, a and b, please ask the binder (here, the C# compiler, at runtime) what to do

When invoked (because [...]

    Expression<Func<object, object, bool>> test = (a, b) => a is int && b is int;
    Expression<Func<object, object, object>> action = (a, b) => (int)a + (int)b;

The [...]

    if (a is int && b is int)
    {
        return (int)a + (int)b;
    }
    else
    {
        // I got two objects, a and b, please ask the binder (here, the C# compiler, at runtime) what to do
    }

When called again, e.g. with two [...]

Thus, C# 4.0 [...]

In all of this, the support for converting lambda expressions to expression trees was not enhanced in C# 4.0 (or beyond), even though new expression types were added. For example, C# 4.0 added support for named and optional parameters. "Support" for this feature in expression trees was implemented using a diagnostic check that rejects its use in expression trees (see https://github.com/dotnet/roslyn/blob/master/src/Compilers/CSharp/Portable/Lowering/DiagnosticsPass_ExpressionTrees.cs#L304).

So, we ended up with C# moving along (async lambdas, conditional access operators, pattern matching, switch expressions, etc.), while expression tree support effectively continued to capture the state of "expressions" in the C# 3.0 timeframe. In effect, an older version of the language is buried within the language. One argument that was made early on, in the C# 4.0 days, was compatibility: LINQ providers could end up seeing new node types that they were not prepared to handle.

For those interested in the gory details, the introduction of a [...]

    Expression<Func<int, int, int>> f = (a, b) => Foo(bar: 100 / a, qux: 200 / b);

When named parameters are used, we need a new way to represent a [...]

    var tmp1 = Expression.Parameter(typeof(int));
    var tmp2 = Expression.Parameter(typeof(int));
    Expression.Block(
        new[] { tmp1, tmp2 },
        Expression.Assign(tmp1, Expression.Divide(Expression.Constant(100), a)),
        Expression.Assign(tmp2, Expression.Divide(Expression.Constant(200), b)),
        Expression.Call(/*Foo*/, tmp1, tmp2)
    );

This is needed to preserve the evaluation order of the arguments. Obviously, existing libraries would never have seen a [...]

And that's where we stand today: an API that was originally intended to support some form of translation of (query) expressions to DSLs, with runtime compilation support, and that further evolved into a backend for the DLR and dynamic operations, but without co-evolution with front-end languages such as C# and VB.

The work I've been doing over at https://github.com/bartdesmet/roslyn/tree/ExpressionTrees showcases one way expression trees could capture a bigger set of language constructs (new expression types, but also statements), targeting a library built at https://github.com/bartdesmet/ExpressionFutures/tree/master/CSharpExpressions. Furthermore, the work over at https://github.com/bartdesmet/roslyn/blob/ExpressionTreeLikeTypes/docs/features/expression-types.md shows another approach where custom quotation types could be supported (similar to there being "task-like" types for async methods).

While there have been discussions with the LDM on and off, it's unclear at this point whether expression tree evolution or more general quotation mechanisms are features worth pursuing, relative to other areas of investment. Examples of new killer applications beyond query providers would be useful for considering this once again (e.g. "code shipping" mechanisms, translation to other DSLs, maybe scientific computing, ML, etc.).

FWIW, my original interest in this originates from an "expression shipping" big data event processing system built internally at Microsoft; it needed support for newer language features, async lambdas, etc., so I decided to prototype the work at the library and compiler level to unblock that effort.
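To make the "code generation target" point concrete, the countdown loop from earlier in this reply can be built by hand from the statement nodes that .NET Framework 4.0 added. This is only a sketch of one possible shape, not necessarily what a compiler would emit: the names LoopSketch, BuildCountdown and Run are hypothetical, and a sink delegate stands in for Console.WriteLine so the result can be inspected.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

public static class LoopSketch
{
    // Hand-built equivalent of:
    //   (int i) => { while (i > 0) { sink(i--); } }
    // `sink` is an illustrative stand-in for Console.WriteLine.
    public static Expression<Action<int>> BuildCountdown(Action<int> sink)
    {
        ParameterExpression i = Expression.Parameter(typeof(int), "i");
        LabelTarget breakLabel = Expression.Label("break");

        // There is no "while" node. The loop is an unconditional
        // LoopExpression whose body tests the condition itself and
        // jumps to the break label when the condition fails.
        Expression body = Expression.IfThenElse(
            Expression.GreaterThan(i, Expression.Constant(0)),
            Expression.Invoke(Expression.Constant(sink), Expression.PostDecrementAssign(i)),
            Expression.Break(breakLabel));

        return Expression.Lambda<Action<int>>(Expression.Loop(body, breakLabel), i);
    }

    // Compiles the tree at runtime and collects what the loop passes to the sink.
    public static List<int> Run(int start)
    {
        var seen = new List<int>();
        Action<int> f = BuildCountdown(seen.Add).Compile();
        f(start);
        return seen;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run(3))); // prints "3, 2, 1"
    }
}
```

A consumer that inspects this tree sees a Loop node containing a conditional and a jump, not the while statement the user wrote. That is the loss of intent described above.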
-
In the context of LoopExpression, this repo by @jbevain may also be of interest: https://github.com/jbevain/mono.linq.expressions
-
Sure, that's one of the possible implementations of additional expression or statement nodes, just like the work at https://github.com/bartdesmet/ExpressionFutures/tree/master/CSharpExpressions/Microsoft.CSharp.Expressions/Microsoft/CSharp/Expressions, which covers all expressions and statements up to C# 6.0, and some of the C# 7.0 and 8.0 features as a work in progress. The node types enum pretty much reflects the status; see https://github.com/bartdesmet/ExpressionFutures/blob/master/CSharpExpressions/Microsoft.CSharp.Expressions/Microsoft/CSharp/Expressions/CSharpExpressionType.cs. The corresponding Roslyn fork over at https://github.com/bartdesmet/roslyn/tree/ExpressionTrees binds to these node types when the library is referenced.