1.1 Defining a data-structure
One of the most simple Tom program is the following:
import main.peano.types.*;
public class Main {
%gom {
module Peano
abstract syntax
Nat = zero()
| suc(pred:Nat)
| plus(x1:Nat, x2:Nat)
}
public final static void main(String[] args) {
Nat z = `zero();
Nat one = `suc(z);
System.out.println(z);
System.out.println(one);
}
}
The %gom{ ...} constructs defines a data-structure, also called
algebraic data-type. This data-structure declares a sort (Nat) that has
three operators (zero, suc, and plus).
These operators are called constructors, because they are used to construct the
data-structure.
zero is a constant (arity 0), whereas the arity of suc and
plus is respectively 1 and 2.
In Gom, a name has to be given for each slot (pred for suc, and
x1, x2 for plus).
A second construct provided by Tom is the ` (backquote), which builds an object.
The expression `zero() builds the object zero() of sort Nat.
Therefore, the variable z is a Java object that can be used to build the
object suc(zero()).
In other words, an algebraic data-structure definition can be seen
as a grammar. The ` construct allows to build trees (Abstract Syntax
Trees) over this grammar.
The implementation generated by Gom is such that objects can be converted
into readable strings, using the toString() method.
We assume that Tom is installed, as explained in chapter 9.
First we compile the file Main.t using the tom command,
then we compile the generated Java file Main.java using javac
and we run it with java, as shown below:
$ tom Main.t
$ javac Main.java
$ java Main
zero
suc(zero)
1.2 Retrieving information in a data-structure
In addition to %gom and `, Tom provides a third construct: %match.
Using the same program, we add a method evaluate(Nat n) and we modify
the method run():
public static Nat evaluate(Nat n) {
%match(n) {
plus(x, zero()) -> { return `x; }
plus(x, suc(y)) -> { return `suc(plus(x,y)); }
}
return n;
}
public final static void main(String[] args) {
Nat two = `plus(suc(zero()),suc(zero()));
System.out.println(two);
two = evaluate(two);
System.out.println(two);
}
The %match construct is similar to switch/case in the sense that
given a tree (generally called subject), the %match selects the first
pattern that matches the tree, and executes the associated action.
By matching, we mean: finding the instances of variables that make the
pattern and the subject identical.
In our example, the first pattern matches the subject n when n is
a tree rooted by plus, and whose second subterm corresponds to the tree
zero().
When it is the case, the variable x takes the first subterm as value.
We say that a tree is rooted by a symbol when its top-node corresponds
to this symbol. When a pattern matches, we say that its variables are instantiated. They can then be used in the action part (also called right-hand
side).
Similarly, the second rules can be applied when the second subterm is rooted by
a suc. In that case, x and y are instantiated.
The method evaluate defines the addition of two integers represented by
Peano successors (i.e. zero and suc).
The first case says that the addition of an integer with zero is evaluated into the integer itself.
The second case builds a new term (rooted by suc) whose subterm
(plus(x,y)) denotes the addition of x and y.
When executing the run() method, we obtain:
plus(suc(zero),suc(zero))
suc(plus(suc(zero),zero))
In a match construct, the variables should not be declared by the
programmer. They are written without (), and their type is inferred. A
variable introduced should be fresh: it should not be declared earlier in a
wrapping block, either in Java, either in a %match construct.
A Tom variable can be used in the corresponding action, but has to be wrapped in
a ` construct. Note that the scope of a back-quote expression is
delimited by well-parenthesis sentences (`zero(), `suc(zero()),
`suc(x)). The special case `x, corresponding to a variable alone, is
also correct. To write complex expressions, extra parenthesis can be added:
`(x), `(x == y) for example.
1.3 Combining Tom and Java
As illustrated below, Tom offers a simple way to retrieve information stored in subterms.
The previous example describes a single-step when adding two
integers. To get a canonical form (a term that can no longer be reduced), we
need to encode a loop, to reach a fix-point. We also have to allow reductions
in sub-trees (i.e. a plus which is not at a top-position):
public static Nat evaluate(Nat n) {
%match(n) {
plus(x, zero()) -> { return `x; }
plus(x, suc(y)) -> { return `suc(plus(x,y)); }
// congruence rules
suc(x) -> { return `suc(evaluate(x)); }
plus(x,y) -> { return `plus(evaluate(x),evaluate(y)); }
}
return n;
}
public final static void main(String[] args) {
Nat t = `plus(suc(zero()),suc(zero()));
System.out.println(t + " -> " + evaluate(t));
t = `plus(suc(zero()),plus(suc(zero()),suc(zero())));
System.out.println(t + " -> " + evaluate(t));
}
By adding two new congruence rules, a term that cannot be reduced by the first
two rule will be recursively inspected by the two last rules. Therefore, if a
transformation can be performed (at top position, or in sub-trees), it will be
done:
plus(suc(zero),suc(zero)) -> suc(plus(suc(zero),zero))
plus(suc(zero),plus(suc(zero),suc(zero))) -> plus(suc(zero),suc(plus(suc(zero),zero)))
We use Java (while-loop or recursion) to compute a fix-point:
public static Nat loop(Nat n) {
Nat tmp = `evaluate(n);
if(tmp==n) {
return n;
}
return loop(tmp);
}
public final static void main(String[] args) {
Nat t = `plus(suc(zero()),suc(zero()));
System.out.println(t + " -> " + loop(t));
t = `plus(suc(zero()),plus(suc(zero()),suc(zero())));
System.out.println(t + " -> " + loop(t));
}
And we get the expected results:
plus(suc(zero),suc(zero)) -> suc(suc(zero))
plus(suc(zero),plus(suc(zero),suc(zero))) -> suc(suc(suc(zero)))
1.4 Programming in Tom
When using Gom, the generated data-structure is particular: data are
maximally shared. This technique, also known as hash-consing ensures
that two identical subterms are implemented by the same objects. Therefore, the
memory footprint is minimal and the equality-check can be done is constant
time: two terms are identical if the pointers are equal.
As a consequence, a term (implemented using Gom) is not mutable: once
created, it cannot be modified. One can note that the generated API offers a
setx1(Nat t) and a setx2(Nat t) method. These methods can be
used to “modify” the first or the second child or a term rooted by a
plus. Note that the term on which you call these methods is never
modified: a copy where the first or the second child is modified is returned
instead.
By combining Tom, Java, and Gom, it is quite easy to quickly develop
applications which manipulates trees.
As an example, let us consider a tiny language composed of boolean expressions
(true, false, and, or, and not),
expressions (constant, variable, plus,
mult, and mod), and instructions (skip,
print, ;,
if, and while). The abstract syntax of this language can be
described using Gom:
import pico1.term.types.*;
import java.util.*;
class Pico1 {
%gom {
module Term
imports int String
abstract syntax
Bool = True()
| False()
| Not(b:Bool)
| Or(b1:Bool, b2:Bool)
| And(b1:Bool, b2:Bool)
| Eq(e1:Expr, e2:Expr)
Expr = Var(name:String)
| Cst(val:int)
| Plus(e1:Expr, e2:Expr)
| Mult(e1:Expr, e2:Expr)
| Mod(e1:Expr, e2:Expr)
Inst = Skip()
| Assign(name:String, e:Expr)
| Seq(i1:Inst, i2:Inst)
| If(cond:Bool, i1:Inst, i2:Inst)
| While(cond:Bool, i:Inst)
| Print(e:Expr)
}
...
}
Assume that we want to write an interpreter for this language, we need a notion
of environment to store the value assigned to a variable. A simple
solution is to use a Map which associate an expression (of sort
Expr) to a name of variable (of sort String). Given this
environment, the evaluation of an expression can be implemented as follows:
public static Expr evalExpr(Map env,Expr expr) {
%match(expr) {
Var(n) -> { return (Expr)env.get(`n); }
Plus(Cst(v1),Cst(v2)) -> { return `Cst(v1 + v2); }
Mult(Cst(v1),Cst(v2)) -> { return `Cst(v1 * v2); }
Mod(Cst(v1),Cst(v2)) -> { return `Cst(v1 % v2); }
// congruence rules
Plus(e1,e2) -> {
return `evalExpr(env,Plus(evalExpr(env,e1),evalExpr(env,e2)));
}
Mult(e1,e2) -> {
return `evalExpr(env,Mult(evalExpr(env,e1),evalExpr(env,e2)));
}
Mod(e1,e2) -> {
return `evalExpr(env,Mod(evalExpr(env,e1),evalExpr(env,e2)));
}
x -> { return `x; }
}
throw new RuntimeException("strange term: " + expr);
}
Similarly, the evaluation of boolean expressions can be implemented as follows:
public static Bool evalBool(Map env,Bool bool) {
%match(bool) {
Not(True()) -> { return `False(); }
Not(False()) -> { return `True(); }
Not(b) -> { return `evalBool(env,Not(evalBool(env,b))); }
Or(True(),b2) -> { return `True(); }
Or(b1,True()) -> { return `True(); }
Or(False(),b2) -> { return `b2; }
Or(b1,False()) -> { return `b1; }
And(True(),b2) -> { return `b2; }
And(b1,True()) -> { return `b1; }
And(False(),b2) -> { return `False(); }
And(b1,False()) -> { return `False(); }
Eq(e1,e2) -> {
Expr x=`evalExpr(env,e1);
Expr y=`evalExpr(env,e2);
return (x==y)?`True():`False();
}
x -> { return `x; }
}
throw new RuntimeException("strange term: " + bool);
}
Once defined the methods evalExpr and evalBool, it becomes easy
to define the interpreter:
public static void eval(Map env, Inst inst) {
%match(inst) {
Skip() -> {
return;
}
Assign(name,e) -> {
env.put(`name,evalExpr(env,`e));
return;
}
Seq(i1,i2) -> {
eval(env,`i1);
eval(env,`i2);
return;
}
Print(e) -> {
System.out.println(evalExpr(env,`e));
return;
}
If(b,i1,i2) -> {
if(evalBool(env,`b)==`True()) {
eval(env,`i1);
} else {
eval(env,`i2);
}
return;
}
w@While(b,i) -> {
Bool cond = evalBool(env,`b);
if(cond==`True()) {
eval(env,`i);
eval(env,`w);
}
return;
}
}
throw new RuntimeException("strange term: " + inst);
}
Note:
the last two rules use the conditional test if from Java to implement
a conditional rule. Note also that terms are compared using the ==
operator, thanks to maximal sharing provided by Gom.
To play with the Pico language, we just have to initialize the
environment (env), create programs (p1 and p3), and
evaluate them (eval):
public final static void main(String[] args) {
Map env = new HashMap();
Inst p1 = `Seq(Assign("a",Cst(1)) , Print(Var("a")));
System.out.println("p1: " + p1);
eval(env,p1);
Inst p3 = `Seq(Assign("i",Cst(0)),
While(Not(Eq(Var("i"),Cst(10))),
Seq(Print(Var("i")),
Assign("i",Plus(Var("i"),Cst(1))))));
System.out.println("p3: " + p3);
eval(env,p3);
}
Exercise:
write a Pico program that computes prime numbers up to 100.