Expression Trees Composition
You may have noted that the structured text composers are missing tree structures, one very important data structure in computer science. A simple text tree data structure can be used to compose and manipulate semantic-free expression trees very similar in function to XML markup, but with a more readable text form. The ITextExpression interface, under the TextComposerLib.TextExpression.Ast namespace, is the base of all text expression tree nodes. Two kinds of expressions can be used:
- Simple: The TeLiteralNumber, TeLiteralString, and TeIdentifier classes implement the ISimpleTextExpression interface to represent simple number literals, string literals, and identifiers.
- Composite: The TeList and TeDictionary classes implement the ICompositeTextExpression interface to represent lists of sub-expressions and dictionaries of sub-expressions.
By nesting instances of these classes we can construct arbitrary text expression trees that can be eventually composed, by traversing, into text.
We can use classes that implement from the ITextExpressionVisitor, ITextExpressionVisitor<T> interfaces, found under the TextComposerLib.TextExpression namespace, to implement tree traversing algorithms on the text expression trees. For example the TextExpressionComposer class has the static Generate() method that formats a given text expression tree as a string. Many other tree traversal algorithms can be implemented similarly like search, replacement, or summary creation algorithms.
The shown code can be used to construct a simple expression tree using normal class constructors. This approach is useful when the tree represents information generated internally by the program, and the tree is used to pass the information to other text composition stages for example. The output of this code is as follows:
1 2 3 4 5 |
List[2.7, Var1, "Testing 1 2 3", { A : 1, B : 2, C : 3 }] |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
var dict = new TeDictionary { {"A", new TeLiteralNumber(1)}, {"B", new TeLiteralNumber(2)}, {"C", new TeLiteralNumber(3)} }; var listExpr = new TeList { new TeLiteralNumber(2.7D), new TeIdentifier("Var1"), new TeLiteralString("Testing 1 2 3"), dict }; listExpr.Name = new TeIdentifier("List"); return TextExpressionComposer.Generate(listExpr); |
The second code example parses a given text into a text expression tree using the static ParseToTextExpression() method of the TextExpressionParser static class. Then a custom class called Task1Class derived from TextExpressionVisitor is used to replace the heads of composite sub-expressions in the tree according to a given dictionary.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
//Parse text into an expression tree var textExpr = "Add{ V1 : Add[-2.5, Vector[2, 3, -4]], V2 : Vector[x, 2, -4.7] }".ParseToTextExpression(); var composer = new LinearComposer(); composer .AppendLine("Before replacement: ") .AppendLine(TextExpressionComposer.Generate(textExpr)) .AppendLine(); //Replace "Add" by "Plus" and "Vector" by "List" in the expression tree Task1Class.ReplaceNames( textExpr, new Dictionary<string, string>{{"Add","Plus"},{"Vector","List"}} ); composer .AppendLine("After replacement: ") .AppendLine(TextExpressionComposer.Generate(textExpr)); return composer.ToString(); |
The definition of the Task1Class is as follows:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 |
/// <summary> /// A simple class that replaces the names of composite expressions using a dictionary /// </summary> internal class Task1Class : TextExpressionVisitor { private readonly Dictionary<string, string> _namesDictionary; private Task1Class(Dictionary<string, string> namesDictionary) { _namesDictionary = namesDictionary; } public override void Visit(TeLiteralNumber textExpr) { } public override void Visit(TeLiteralString textExpr) { } public override void Visit(TeIdentifier textExpr) { } public override void Visit(TeList textExpr) { string newName; if (textExpr.IsNamed && _namesDictionary.TryGetValue(textExpr.Name.ToString(), out newName)) textExpr.Name.Text = newName; foreach (var subExpr in textExpr.ChildExpressions) Visit(subExpr); } public override void Visit(TeDictionary textExpr) { string newName; if (textExpr.IsNamed && _namesDictionary.TryGetValue(textExpr.Name.ToString(), out newName)) textExpr.Name.Text = newName; foreach (var subExpr in textExpr.ChildExpressions) Visit(subExpr); } public static void ReplaceNames(ITextExpression textExpr, Dictionary<string, string> namesDictionary) { var task1Object = new Task1Class(namesDictionary); task1Object.Visit(textExpr); } } |
The output of this code displays the expression tree before and after the replacement happens:
1 2 3 4 5 6 7 8 9 10 11 |
Before replacement: Add { V1 : Add[-2.5, Vector[2, 3, -4]], V2 : Vector[x, 2, -4.7] } After replacement: Plus { V1 : Plus[-2.5, List[2, 3, -4]], V2 : List[x, 2, -4.7] } |
The syntax of expressions is as follows:
- Number literals can be any familiar integer of floating point number literal like -42 or 2.456e-2 for example.
- String literals can be any C#-like string literal but can also be enclosed by single or double quotes. For example "Me and them", and 'Me and them' are the same string literal.
- An identifier is a sequence of alphanumeric or underscore characters not starting with a number. For example x, Var12, and Rotation_Z are all legal identifiers.
- List expressions optionally start with an identifier called the name of the list and are a comma separated listing of sub-expressions, possibly of mixed kinds, surrounded by square brackets. For example Max[1, 2, 3], Plus[-2.3, 5, 7.9], ["A", "B", 8, R1] are all legal list expressions. There is no specific meaning associated with the lists.
- Dictionary expressions are similar to list expressions but each item gets a key and curly braces are used. For example {X:3, Y:8, Z:-1} and Ray{ Direction : Point[2.3, 7.6], Origin : Vector[3.1, -2.7] } are typical dictionary expressions.
Using this simple Text Expression Tree DSL the user can create arbitrary complex trees of basic numbers, strings, and identifiers. Such trees are similar in structure to the XML trees but much easier to manipulate, process, and store in typical code generation scenarios.