# Tom-2.10:Language Basics – Level 1

 Tom-2.10 : Tutorial

In this first part, we introduce the basic notions provided by Tom: the definition of a data-type, the construction of data (i.e. trees), and the transformation of such data using pattern-matching.

# Defining a data-structure

One of the most simple Tom programs is the following. It defines a data-type to represent Peano integers and builds the integers 0=zero and 1=suc(zero):

```import main.peano.types.*;
public class Main {
%gom {
module Peano
abstract syntax
Nat = zero()
| suc(pred:Nat)
| plus(x1:Nat, x2:Nat)
}
public final static void main(String[] args) {
Nat z   = `zero();
Nat one = `suc(z);
System.out.println(z);
System.out.println(one);
}
}```

The `%gom {...}` construct defines a data-structure, also called signature or algebraic data-type. This data-structure declares a sort (`Nat`) that has three operators (`zero`, `suc`, and `plus`). These operators are called constructors, because they are used to construct the data-structure. `zero` is a constant (arity 0), whereas the arities of `suc` and `plus` are respectively 1 and 2. Note that a name has to be given for each slot (`pred` for suc, and `x1`, `x2` for plus).

 Note: to use the code generated by a `%gom {...}`, an import statement has to be added: `import main.peano.types.*;` The path corresponds to the name of the class (`Main`), followed by the name of the module (`Peano`), followed by `types.*`. These package names are in lowercase, even if the class name and the module name contain capitals.

# Retrieving information in a data-structure

In addition to `%gom` and ```, Tom provides a third construct: `%match`. Using the same program, we add a method `evaluate(Nat n)` and we modify the method `run()`:

```import main.peano.types.*;
public class Main {
%gom {
module Peano
abstract syntax
Nat = zero()
| suc(pred:Nat)
| plus(x1:Nat, x2:Nat)
}
public final static void main(String[] args) {
Nat two = `plus(suc(zero()),suc(zero()));
System.out.println(two);
two = evaluate(two);
System.out.println(two);
}
public static Nat evaluate(Nat n) {
%match(n) {
plus(x, zero()) -> { return `x; }
plus(x, suc(y)) -> { return `suc(plus(x,y)); }
}
return n;
}
}```

The `%match` construct is similar to `switch/case` in the sense that given a tree (`n`), the `%match` selects the first pattern that matches the tree, and executes the associated action. By matching, we mean: giving values to variables to make the pattern equal to the subject. In our example `n` has the value `plus(suc(zero()),suc(zero()))`. The first pattern does not match `n`, because the second subterm of `n` is not a `zero()`, but a `suc(zero())`. In this example, the second pattern matches `n` and gives the value `suc(zero())` to `x`, and `zero()` to `y`.

When a pattern matches a subject, we say that the variables (`x` and `y` in our case) are instantiated. They can then be used in the action part (also called right-hand side). Similarly, the second rules can be applied when the second subterm is rooted by a `suc`. In that case, `x` and `y` are instantiated.

The method `evaluate` defines the addition of two integers represented by Peano successors (i.e. `zero` and `suc`). The first case says that the addition of an integer with `zero` is evaluated into the integer itself. The second case builds a new term (rooted by `suc`) whose subterm (`plus(x,y)`) denotes the addition of `x` and `y`.

When executing the program we obtain:

```plus(suc(zero()),suc(zero()))
suc(plus(suc(zero()),zero()))
```
 Note: In a match construct, the variables should not be declared by the programmer. They are written without `()`, and their type is inferred. A variable introduced should be fresh: it should not be declared earlier in a wrapping block, either in Java, either in a `%match` construct. A Tom variable can be used in the corresponding action, but has to be introduced by a ``` construct. Note that the scope of a back-quote expression is delimited by well-parenthesized sentences (``zero()`, ``suc(zero())`, ``suc(x)`). The special case ``x`, corresponding to a variable alone, is also correct. To write complex expressions, extra parentheses can be added: ``(x)`, ``(x == y)` for example.

# Using rules to simplify expressions

In this section, we will show how to use the notion of rules to simplify expressions. Suppose that we want to simplify boolean expressions. It is frequent to define the relations between expressions using a set of simplification rules like the following:

 Not(a) → Nand(a,a) Or(a, b) → Nand(Not(a), Not(b)) And(a, b) → Not(Nand(a, b)) Xor(a, b) → Or(And(a, Not(b)), And(Not(a), b)) Nand(False, b) → True Nand(a, False) → True Nand(True, True) → False

To encode such a simplification system, we have to implement a function that simplifies an expression until no more reduction can be performed. This can of course be done using functions defined in Java, combined with the `%match` constructs, but it is more convenient to use an algebraic construct that ensures that a rule is applied whenever a reduction is possible. For this purpose, the `%gom` construct provides a `rule()` construct where the left and the right-hand sides are terms. The previous simplification system can be defined as follows:

```import gates.logic.types.*;
public class Gates {
%gom {
module Logic
imports int
abstract syntax
Bool = Input(n:int)
| True()
| False()
| Not(b:Bool)
| Or(b1:Bool, b2:Bool)
| And(b1:Bool, b2:Bool)
| Nand(b1:Bool, b2:Bool)
| Xor(b1:Bool, b2:Bool)

module Logic:rules() {
Not(a)   -> Nand(a,a)
Or(a,b)  -> Nand(Not(a),Not(b))
And(a,b) -> Not(Nand(a,b))
Xor(a,b) -> Or(And(a,Not(b)),And(Not(a),b))
Nand(False(),_)     -> True()
Nand(_,False())     -> True()
Nand(True(),True()) -> False()
}
}
public final static void main(String[] args) {
Bool b = `Xor(True(),False());
System.out.println("b = " + b);
}
}```
 Note: `_` can be used when a specific name for a variable is not needed.

When using the `module Logic:rules() { ... }` constructs, the simplification rules are integrated into the data-structure. This means that the rules are applied any time it is possible to do a reduction. The user does not have any control on them, and thus cannot prevent from applying a rule. Of course, the simplification system should be terminating, otherwise, infinite reductions may occur. When compiling and executing the previous program, we obtain `b = True()`.

# Separating Gom from Tom (*)

 Note: do not read this section if you are not interested in separating the data-type definition from the program itself.

When programming large applications, it may be more convenient to introduce several classes that share a common data-structure. In that case, the `%gom {...}` construct could be avoided and a Gom file (`Logic.gom` for example) could be used instead:

```module Logic
imports int
abstract syntax
Bool = Input(n:int)
| True()
| False()
| Not(b:Bool)
| Or(b1:Bool, b2:Bool)
| And(b1:Bool, b2:Bool)
| Nand(b1:Bool, b2:Bool)
| Xor(b1:Bool, b2:Bool)

module Logic:rules() {
Not(a)   -> Nand(a,a)
Or(a,b)  -> Nand(Not(a),Not(b))
And(a,b) -> Not(Nand(a,b))
Xor(a,b) -> Or(And(a,Not(b)),And(Not(a),b))
Nand(False(),_)     -> True()
Nand(_,False())     -> True()
Nand(True(),True()) -> False()
}```

This file can by compiled as follows:

`\$ gom Logic.gom`

This generates several Java classes, among them a particular file called `Logic.tom` which explains to Tom how the data-structure is implemented. This process is hidden when using the `%gom {...}` construct:

```\$ ls logic
_Logic.tom              Logic.tom               LogicAbstractType.java
strategy                types```

One of the simplest Tom program that uses the defined data-structure is the following:

```import logic.types.*;
public class Main {
%include{ logic/Logic.tom }
public final static void main(String[] args) {
Bool b = `Xor(True(),False());
System.out.println("b = " + b);
}
}```
 Note: since classes are generated in the `logic` directory, the statement `import logic.types.*;` is needed. The name comes from the name of the module (`Logic`). In the Tom program, `%include { logic/Logic.tom }` is necessary to include (like a copy and paste) the `Logic.tom` file generated by gom. The `%include` command is similar to the `#include` command provided by the C-preprocessor.

# Programming in Tom

In this section we present how Tom can be used to describe the abstract syntax and to implement an interpreter for a given language. We also give some tips and more information about the generated code.

When using `%gom { ... }`, the generated data-structure is particular: data are maximally shared. This technique, also known as hash-consing ensures that two identical subterms are implemented by the same objects. Therefore, the memory footprint is minimal and the equality-check can be done in constant time using the `==` construct (two terms are identical if the pointers are equal). As a consequence, a term is not mutable: once created, it cannot be modified.

Even if the generated API offers getters and setters for each defined slot (`Bool getb1()`, `Bool getb2()`, `Bool setb1(Bool t)` and `Bool setb2(Bool t)` for the `And(b1:Bool,b2:Bool)` defined below). These methods do not really modify the term on which they are applied: `t.setb1(‘True())` will return a copy of `t` where the first child is set to `True()`, but the previous `t` is not modified. There is no side-effect in a maximally shared setting.

In Tom, it is quite easy to quickly develop applications which manipulates trees. As an example, let us consider a tiny language composed of boolean expressions (True, False, And, Or, and Not), expressions (constant, variable, Plus, Mult, and Mod), and instructions (Skip, Print, ;, If, and While). The abstract syntax of this language can be described as follows:

```import pico1.term.types.*;
import java.util.*;
class Pico1 {
%gom {
module Term
imports int String
abstract syntax
Bool = True()
| False()
| Not(b:Bool)
| Or(b1:Bool, b2:Bool)
| And(b1:Bool, b2:Bool)
| Eq(e1:Expr, e2:Expr)
Expr = Var(name:String)
| Cst(val:int)
| Plus(e1:Expr, e2:Expr)
| Mult(e1:Expr, e2:Expr)
| Mod(e1:Expr, e2:Expr)
Inst = Skip()
| Assign(name:String, e:Expr)
| Seq(i1:Inst, i2:Inst)
| If(cond:Bool, i1:Inst, i2:Inst)
| While(cond:Bool, i:Inst)
| Print(e:Expr)
}
...
}```

Assume that we want to write an interpreter for this language, we need a notion of environment to store the value assigned to a variable. A simple solution is to use a `Map` which associate an expression (of sort `Expr`) to a name of variable (of sort `String`). Given this environment, the evaluation of an expression can be implemented as follows:

```public static Expr evalExpr(Map env,Expr expr) {
%match(expr) {
Var(n)                -> { return (Expr)env.get(`n); }
Plus(Cst(v1),Cst(v2)) -> { return `Cst(v1 + v2); }
Mult(Cst(v1),Cst(v2)) -> { return `Cst(v1 * v2); }
Mod(Cst(v1),Cst(v2))  -> { return `Cst(v1 % v2); }
// congruence rules
Plus(e1,e2) -> { return `evalExpr(env,Plus(evalExpr(env,e1),evalExpr(env,e2))); }
Mult(e1,e2) -> { return `evalExpr(env,Mult(evalExpr(env,e1),evalExpr(env,e2))); }
Mod(e1,e2)  -> { return `evalExpr(env,Mod(evalExpr(env,e1),evalExpr(env,e2)));  }
x -> { return `x; }
}
throw new RuntimeException("should not be there: " + expr);
}```
 Note: the congruence rules are needed to enforce reductions in sub-expressions of `Plus`, `Mult`, and `Mod`. We will see later how they can be avoided using the notion of strategy.

Similarly, the evaluation of boolean expressions can be implemented as follows:

```public static Bool evalBool(Map env,Bool bool) {
%match(bool) {
Not(True())     -> { return `False(); }
Not(False())    -> { return `True(); }
Not(b)          -> { return `evalBool(env,Not(evalBool(env,b))); }
Or(True(),_)    -> { return `True(); }
Or(_,True())    -> { return `True(); }
Or(False(),b2)  -> { return `evalBool(env, b2); }
Or(b1,False())  -> { return `evalBool(env, b1); }
Or(b1,b2)       -> { return `evalBool(env, Or(evalBool(env,b1),evalBool(env,b2))); }
And(True(),b2)  -> { return `evalBool(env, b2); }
And(b1,True())  -> { return `evalBool(env, b1); }
And(False(),_)  -> { return `False(); }
And(_,False())  -> { return `False(); }
And(b1,b2)      -> { return `evalBool(env, And(evalBool(env,b1),evalBool(env,b2))); }
Eq(e1,e2)       -> {
Expr x = `evalExpr(env,e1);
Expr y = `evalExpr(env,e2);
return (x==y)?`True():`False();
}
x               -> { return `x; }
}
throw new RuntimeException("should not be there: " + bool);
}```

Once defined the methods `evalExpr` and `evalBool`, it becomes easy to define the interpreter:

```public static void eval(Map env, Inst inst) {
%match(inst) {
Skip() -> {
return;
}
Assign(name,e) -> {
env.put(`name,evalExpr(env,`e));
return;
}
Seq(i1,i2) -> {
eval(env,`i1);
eval(env,`i2);
return;
}
Print(e) -> {
System.out.println(evalExpr(env,`e));
return;
}
If(b,i1,i2) -> {
if(evalBool(env,`b)==`True()) {
eval(env,`i1);
} else {
eval(env,`i2);
}
return;
}
w@While(b,i) -> {
Bool cond = evalBool(env,`b);
if(cond==`True()) {
eval(env,`i);
eval(env,`w);
}
return;
}
}
throw new RuntimeException("strange term: " + inst);
}```
 Note: the last two rules use the conditional test `if` from Java to implement a conditional rule. Note also the use of `w@...` that introduces a variable `w` (an alias). This variable can be used in the action part to avoid building the term `While(b,i)`.

To play with the `Pico` language, we just have to initialize the environment (env), create programs (p1 and p2), and evaluate them (eval):

```public final static void main(String[] args) {
Map env = new HashMap();
Inst p1 = `Seq(Assign("a",Cst(1)) , Print(Var("a")));
System.out.println("p1: " + p1);
eval(env,p1);
Inst p2 = `Seq(Assign("i",Cst(0)),
While(Not(Eq(Var("i"),Cst(10))),
Seq(Print(Var("i")),
Assign("i",Plus(Var("i"),Cst(1))))));
System.out.println("p2: " + p2);
eval(env,p2);
}```
Exercise: write a Pico program that computes prime numbers up to 100.
 Tutorial Tom-2.10 Documentation Guided Tour :: Tutorial :: Language Reference :: Tools