Previous Up Next

Chapter 13  Runtime Library

13.1  Predefined mappings

13.1.1  Builtin sorts

The system comes with several predefined signature-mappings for C, Java and Caml. Among them, let us mention:

These mappings define, for each builtin sort of the host language, an algebraic sort that can be used in a %match or a signature definition construct. Thus, builtin values, such as f(5), g(’a’) or h("foo"), can be used in patterns.

The string.tom mapping is interesting because it provides an associative operator (concString) which allows the programmer to consider a string as a list of characters. Thus, the string "foo" can be seen as the algebraic object concString(’f’,’o’,’o’). By using this mapping, it becomes possible to perform pattern matching against the content of a string. The pattern concString(’f’,X*) will match any string which begins with the character ’f’. By using the unnamed-symbol capability, this pattern can be written: (’f’,X*) instead of concString(’f’,X*).

To match any string which begins with the substring "fo", the corresponding pattern should be (’f’,’o’,X*). To simplify the definitions of such patterns, Tom supports an exception which allows the programmer to write (’fo’,X*) instead. Internally, the ill-formed character ’fo’ is expanded into the list (’f’,’o’).

In addition to these mappings, several other predefined mappings come with the system:

13.1.2  Java

To help manipulating Java data-structures, Tom provides several mappings for the Java Runtime Library. The naming convention follows Java’s one:

  java
  o--- Character.tom
  +--- util
       o--- ArrayList.tom
       o--- HashMap.tom
       o--- HashSet.tom
       o--- LinkedList.tom
       o--- MapEntry.tom
       o--- TreeMap.tom
       o--- TreeSet.tom
       +--- types
            o--- AbsractCollection.tom
            o--- AbsractList.tom
            o--- AbsractSequentialList.tom
            o--- AbsractSet.tom
            o--- ArrayList.tom
            o--- Collection.tom
            o--- HashMap.tom
            o--- HashSet.tom
            o--- LinkedHashSet.tom
            o--- Map.tom
            o--- Object.tom
            o--- Stack.tom
            o--- TreeMap.tom
            o--- TreeSet.tom
            o--- Vector.tom
            o--- WeakHashMap.tom

The directory types contains only %typeterm declarations.

13.2  Strategies

To support strategies, Tom provides a runtime library, called SL and implemented in strategy.jar. This library implements the elementary strategy combinators (Identity, Fail, All, One, Choice, Sequence, etc.) as well as basic support for the computation of positions described in section 12.1.2.

In addition, some predefined mapping are also available to allow the description of strategies:

13.3  Term viewer

The class tom.library.utils.Viewer contains a set of methods to visualize visitable terms.

/* dot representations */
// on the writer stream
public static void toDot(tom.library.sl.Visitable v, Writer w)
// on the standard output stream
public static void toDot(tom.library.sl.Visitable v)
 
/* pstree-like representations */
// on the writer stream
public static void toTree(tom.library.sl.Visitable v, Writer w)
// on standard output stream
public static void toTree(tom.library.sl.Visitable v)

/* gui display */
public static void display(tom.library.sl.Visitable v)

Note that these methods can also be used on strategies as they are also visitable.

13.4  XML

To support the transformation of Xml documents, Tom provides a specific syntax for defining patterns, as well as several predefined mappings:

See Section 10.4 for a detailed description of the Xml facilities offered by Tom.

13.5  Bytecode transformation (*)

To manipulate Java classes, Tom provides a library which supplies a Gom term usable by Tom out of a Java class. The library enables to define transformations of this term by strategic rewriting as well as functionalities to generate a new Java class from the modified term.

This approach is similar to BCEL library in the sense that we construct a complete representation of the class. But thanks to Gom, we obtain a very efficient structure with maximal sharing. Moreover, thanks to associative matching, we can easily express patterns on the bytecode and in this way, ease the definition of transformations.

The library generates a Gom term using the ASM library. This term is a memory-efficient representation of the Java class, which can then be traversed and transformed using Tom. After translating the Java class into a Gom term, we use Tom features to define transformations and traversals and to obtain a new Gom term which can be transformed into a new Java class.

13.5.1  Predefined mapping

To support the analysis and transformation of Java bytecode programs, Tom provides several mappings:

13.5.2  Java classes as Gom terms

In order to represent bytecode programs, we have defined a Gom signature that allows us to represent any bytecode program by a typed term. Given a Java class, we use ASM to read the content and build an algebraic representation of the complete Java class. This approach is similar to BCEL. Contrary to ASM, this permits multi-pass or global analysis.

module Bytecode
imports int long float double String
abstract syntax
TClass = Class(info:TClassInfo, fields:TFieldList,
                 methods:TMethodList)
...
TMethodList = MethodList(TMethod*)
TMethod = Method(info:TMethodInfo, code:TMethodCode)
TMethodCode = MethodCode(instructions:TInstructionList,
                           localVariables:TLocalVariableList,
                           tryCatchBlocks:TTryCatchBlockList)
...
TInstructionList = InstructionList(TInstruction*)
TInstruction = Nop()
             | Iload(var:int)
             | Ifeq(label:TLabel)
             | Invokevirtual(owner:String, name:String,
                               methodDesc:TMethodDescriptor)
 ...

The real signature (adt/bytecode/Bytecode.tom) contains more than 250 different constructors. The given signature shows that a class is represented by a constructor Class, which contains information such as name, packages, and imports. It also contains a list of fields and a list of methods. The latter is encoded using an associative operator MethodList. Similarly, a list of instructions is represented by the associative operator InstructionList. A method contains an info part and a code part. The code part is mainly composed by local variables and a list of instructions. Each bytecode instruction is represented by an algebraic constructor: Nop, Iload, etc.

In the package tom.library.bytecode, there exist two principal classes based on the ASM library:

public static byte[] transform(String file){
  BytecodeReader br = new BytecodeReader(file);
  TClass c = br.getTClass();
  TClass cc = transform(c);
  BytecodeGenerator bg = new BytecodeGenerator();
  return bg.toBytecode(cc);
}

13.5.3  Simulation of control flow by Strategies

When considering a Bytecode program, with the sl library, we can just define traversals without considering the control flow. Our suggestion is to use strategies in order to simulate the control flow during the traversal of the list of instructions. In the Tom language, the rules and the control are completely separated so an alternative for representing control flow graphs (CFG) is to use the control to indicate what is the possible following instruction. We have seen in the previous section that to apply a strategy to children, there exist two combinators All and One.

In the mapping bytecode/cfg.tom, the two combinators AllCfg and OneCfg behave almost as All and One but the considered children are the following instructions in the Control flow graph instead of the following instruction in the list. For example, the Goto instruction has one child with respect to the control flow graph (the instruction corresponding to the label). An If_XX instruction has two children: the one which satisfies the expression, and the one that does not.


Previous Up Next