20 Nov, 2010, JohnnyStarr wrote in the 1st comment:
Votes: 0
I'm still keen on writing an AST for my micro language. It's been very
helpful to study up stacks and trees. After understanding reverse polish
notation (RPN), it makes much more sense to me. Now that I understand the
nature of computation on a virtual stack, I understand the order in which
the AST nodes are generated. I'm curious though how tree traversal would
work.

Assuming all we have to deal with so far is mathmatic expressions, consider
the following:
infix:   7 + 13 - 6
postfix: 13 7 + 6 -

exp equals '14'
|
[-]
/ \
[+] 6
/ \
13 7


To build this tree, you could use a stack for basic operations:

push 13
push 7
create '+' node, add 13 and 7 as children
pop 7
pop 13
push '+' node
push 6
create '-' node, add '+' node and 6 as children
pop 6
pop '+' node
push '-' node


This makes sense for both building the tree, and later traversal. I'm not sure
how you would start from the bottom of the tree, pre-order perhaps?

A token stream is easy for me to understand because it is just a linear list of tokens. I'm afraid I'm not certain about tree traversal. If you start at the root node
in this case '-' and start operations you would be going backwards.

Would it make more sense to start from the bottom up? Could someone provide a clear
example of how this is done? Thanks.
21 Nov, 2010, quixadhal wrote in the 2nd comment:
Votes: 0
Typically, you'd write a recursive routine that traverses all the deeper elements and then prints the current node on the way out.
21 Nov, 2010, David Haley wrote in the 3rd comment:
Votes: 0
The thing here is that your grammar is potentially ambiguous, unless you impose some precedence on + and -, or you force evaluation from left-most or right-most direction for elements of equal precedence.

For example, x + y - z could be (x+y) - z or x + (y-z).

In either case, the resulting tree is unambiguous, and all traversal operations are unambiguous.

I'll agree with Quix but phrase it slightly differently. When you're evaluating the node OP(x,y,…,z) you need to evaluate the nodes x, y, …, z and finally use OP to combine those expressions into the result of the current node. So, if you're evaluating PLUS(x, y) you would first get the value of x, then get the value of y, then add them, and use that as the result of the current node.

You typically don't think of traversing operation trees in other orders. Unlike a traversal where you're printing out nodes, for example, for this case you simply don't know the result of a parent node until you have the results of some or all of its children nodes. Therefore, you would never emit the result of a parent before a child. Basically, you have a fairly clear dependency graph here of what needs to be evaluated before what other things, and the only question for you is whether you evaluate children right-to-left or left-to-right.
22 Nov, 2010, Runter wrote in the 4th comment:
Votes: 0
class Node
attr_accessor :left, :right

def initialize(v)
@value = v
end

def evaluate()
@left.evaluate().send(@value, @right.evaluate()) rescue @value
end
end

tree = Node.new(:-)
tree.right = Node.new(6)
tree.left = Node.new(:+)
tree.left.right = Node.new(7)
tree.left.left = Node.new(13)

puts tree.evaluate()

tree = Node.new(:join)
tree.right = Node.new(" and ")
tree.left = Node.new([1, "two", :three])
puts tree.evaluate()
22 Nov, 2010, Kaz wrote in the 5th comment:
Votes: 0
ASTs are unambiguous: the expressions are already grouped as brackets would. It's clear that, whatever code you used to generate the AST, it was equivalent to (13 + 7) - 6.

After generating the AST, performing a postorder traversal will yield the (also) unambiguous RPN for the expression or program. It's my experience that it's easier to interpret/generate code from RPN, as it directly corresponds:

In this case:

Push 13
Push 7
Operator + (Pop 13, Pop 7, Add -> Push 20)
Push 6
Operator - (Pop 6, Pop 20, Subtract -> Push 14)

(Equivalent to "postfix: 13 7 + 6 -" as you originally wrote)
23 Nov, 2010, David Haley wrote in the 6th comment:
Votes: 0
FWIW, I said grammar, not AST, when speaking of ambiguity. In fact, I further said that the resulting trees and traversals were unambiguous.
23 Nov, 2010, Kaz wrote in the 7th comment:
Votes: 0
David Haley said:
FWIW, I said grammar, not AST, when speaking of ambiguity. In fact, I further said that the resulting trees and traversals were unambiguous.


Congratulations.
23 Nov, 2010, Kaz wrote in the 8th comment:
Votes: 0
Anyway, JohnnyStar, you asked for an example of how tree traversal might work. Here's some simplistic code I knocked up that simulates your grammar having already been parsed into the tree form you specified, and a sample skeleton of how a postorder tree traversal might work in C.

#include <stdlib.h>
#include <stdio.h>

typedef enum element_type element_type;

enum element_type {
type_integer,
type_add,
type_subtract
};

typedef struct ast ast;

struct ast {
element_type type;
int element; /* only have ints; could be union of types */
ast *left;
ast *right;
};

static int stack[40];
static int pos;

void execute(ast *node)
{
int lhs, rhs;

/* Perform a postorder traversal */
if (node->left) execute(node->left);
if (node->right) execute(node->right);

/* All branches are executed, now execute this node*/
switch (node->type)
{
case type_integer :
printf("PUSH %d\n", node->element);
stack[pos++] = node->element;
break;

case type_add :
printf("ADD\n");
rhs = stack[–pos];
printf(" –> POP %d\n", rhs);
lhs = stack[–pos];
printf(" –> POP %d\n", lhs);
printf(" –> PUSH %d\n", lhs + rhs);
stack[pos++] = lhs + rhs;
break;

case type_subtract :
printf("SUB\n");
rhs = stack[–pos];
printf(" –> POP %d\n", rhs);
lhs = stack[–pos];
printf(" –> POP %d\n", lhs);
printf(" –> PUSH %d\n", lhs - rhs);
stack[pos++] = lhs - rhs;
break;
}
}

ast *create_node(element_type type, int value)
{
ast *node = malloc(sizeof (*node));
node->type = type;
node->element = value;
node->left = NULL;
node->right = NULL;
}

int main()
{
/* simulate some grammar being parsed */
ast *tree = create_node(type_subtract, 0);
tree->left = create_node(type_add, 0);
tree->left->left = create_node(type_integer, 13);
tree->left->right = create_node(type_integer, 7);
tree->right = create_node(type_integer, 6);

execute(tree);

printf("Stack[0] = %d\n", stack[0]);
return 0;
}


Using a stack isn't the only way to do this, of course, but it's one of the simplest. I do recommend writing yourself a proper stack abstraction instead of an array of 40 if you go that way, though. ;)
23 Nov, 2010, Runter wrote in the 9th comment:
Votes: 0
I prefer my example. It also didn't require any eye wink inoculation against lack of elegance.
23 Nov, 2010, David Haley wrote in the 10th comment:
Votes: 0
Kaz said:
David Haley said:
FWIW, I said grammar, not AST, when speaking of ambiguity. In fact, I further said that the resulting trees and traversals were unambiguous.


Congratulations.

It looked like you were disagreeing with something I said. There's no need to be an ass about it.
23 Nov, 2010, David Haley wrote in the 11th comment:
Votes: 0
Also, it's worth noting that the RPN-like traversal is not always appropriate, because it violates short-circuiting rules. It's usually better to change your traversal depending on the kind of node you're evaluating. Of course, in many cases (like most arithmetic) you can't finish evaluating the current node until all children are evaluated, but for very many forms of control flow, there's no need (and indeed in many cases it is quite incorrect) to evaluate all children. I recognize that in this thread we've been talking about just simple arithmetic for now, but I suspect that this is meant for more interesting things hence my jumping ahead.
0.0/11