Technical Design Overview
This page gives a high-level overview of the design of ExtendJ. It is a work in progress.
A condensed version of this information can be found in these presentation slides.
The root of each ExtendJ AST is the Program node. A program consists of several CompilationUnits, one for each source file being compiled. Each CompilationUnit has an optional package declaration, import declarations, and multiple type declarations. Type declarations are represented by the abstract TypeDecl class. Concrete subclasses of TypeDecl represent class, enum, and interface declarations in Java.
Each Java type (TypeDecl) has a list of member declarations. These include methods, fields, constructors, etc. Each member of a type is represented by the BodyDecl AST class. Concrete subtypes of BodyDecl represent different kinds of member declarations:
The code in methods, constructors, and initializers is stored in Block nodes. Statements and expressions in a code block are represented by the Stmt and Expr classes, respectively. Note that Block is a subtype of Stmt, which allows blocks to be nested.
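The containment structure described above can be sketched in plain Java. This is only a simplified model: the class names mirror the ExtendJ AST classes mentioned in the text, but the fields, constructors, and the nestingDepth helper are assumptions made for illustration and are not the classes JastAdd generates.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for the AST classes named in the text.
// Fields, constructors, and helpers are assumptions for this sketch only.
abstract class Stmt {}

// Block is a Stmt, so a Block can contain other Blocks.
class Block extends Stmt {
  final List<Stmt> stmts = new ArrayList<>();

  // Hypothetical helper: depth of the deepest nested Block, counting this one.
  int nestingDepth() {
    int deepest = 0;
    for (Stmt s : stmts) {
      if (s instanceof Block) {
        deepest = Math.max(deepest, ((Block) s).nestingDepth());
      }
    }
    return 1 + deepest;
  }
}

class EmptyStmt extends Stmt {}

// A member declaration of a type.
abstract class BodyDecl {}

class MethodDecl extends BodyDecl {
  final String name;
  final Block body = new Block();
  MethodDecl(String name) { this.name = name; }
}

// A Java type with its list of member declarations.
abstract class TypeDecl {
  final String name;
  final List<BodyDecl> bodyDecls = new ArrayList<>();
  TypeDecl(String name) { this.name = name; }
}

class ClassDecl extends TypeDecl {
  ClassDecl(String name) { super(name); }
}

// One CompilationUnit per source file.
class CompilationUnit {
  final String packageName; // an empty string models a missing package declaration
  final List<TypeDecl> typeDecls = new ArrayList<>();
  CompilationUnit(String packageName) { this.packageName = packageName; }
}

// Root of the AST.
class Program {
  final List<CompilationUnit> compilationUnits = new ArrayList<>();
}
```

Because Block extends Stmt, a method body can hold further blocks in its statement list without any special case in the model.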
Java statements are represented by the following ExtendJ AST classes:
The parts that build up Java statements are called expressions. The atoms that build up expressions are accesses. Accesses are expressions that refer to variables, fields, types, methods, constructors, etc. More generally, an access always refers to some declaration. For example, a TypeAccess refers to some TypeDecl, a MethodAccess refers to a MethodDecl, etc.
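The idea that every access refers to some declaration can be illustrated with a small Java model. The Scope class and the map-based lookup below are hypothetical simplifications introduced for this sketch; in ExtendJ the declaration is found through attribute evaluation, not through a map.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-ins for declarations; only the name is modelled.
class TypeDecl {
  final String name;
  TypeDecl(String name) { this.name = name; }
}

class MethodDecl {
  final String name;
  MethodDecl(String name) { this.name = name; }
}

// Hypothetical lookup environment used only in this sketch.
class Scope {
  final Map<String, TypeDecl> types = new HashMap<>();
  final Map<String, MethodDecl> methods = new HashMap<>();
}

// An access names something and resolves to the declaration it refers to.
abstract class Access<D> {
  final String name;
  Access(String name) { this.name = name; }

  // Returns the declaration this access refers to, or null if none is found.
  abstract D decl(Scope scope);
}

class TypeAccess extends Access<TypeDecl> {
  TypeAccess(String name) { super(name); }
  @Override TypeDecl decl(Scope scope) { return scope.types.get(name); }
}

class MethodAccess extends Access<MethodDecl> {
  MethodAccess(String name) { super(name); }
  @Override MethodDecl decl(Scope scope) { return scope.methods.get(name); }
}
```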
These are the different kinds of accesses in ExtendJ:
Automatic AST Rewriting
ExtendJ parses most Java identifiers as ParseName nodes. These are needed because the parser is not able to classify all names. Instead, name classification is done later, using attributes.
The ParseName nodes should never be visible during attribute evaluation. ParseName nodes are automatically transformed into one of several different Access subtypes. This transformation is done using JastAdd's rewrite feature.
ParseName rewrites are defined in
JastAdd rewrites are automatically computed when a node with a rewrite is first accessed (using the getChild method). Rewrites are used sparingly in ExtendJ because we want to keep the original AST structure. In most cases when the AST needs to be transformed or desugared, ExtendJ instead uses nonterminal attributes (NTAs).
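The rewrite-on-first-access behaviour can be modelled roughly as follows. This is a plain Java sketch, not the code JastAdd generates: the Node class, the rewriteTo hook, and the upper-case-means-type rule are all assumptions made for illustration. In ExtendJ the classification is computed by attributes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal node class with rewrite-on-access children.
abstract class Node {
  private final List<Node> children = new ArrayList<>();
  private final List<Boolean> rewritten = new ArrayList<>();

  void addChild(Node child) {
    children.add(child);
    rewritten.add(false);
  }

  // Rewrite hook: returns the replacement node, or this if no rewrite applies.
  Node rewriteTo() { return this; }

  // The first access applies rewrites until none applies, then caches the result.
  Node getChild(int i) {
    if (!rewritten.get(i)) {
      Node current = children.get(i);
      Node next = current.rewriteTo();
      while (next != current) {
        current = next;
        next = current.rewriteTo();
      }
      children.set(i, current);
      rewritten.set(i, true);
    }
    return children.get(i);
  }
}

class TypeAccess extends Node {
  final String name;
  TypeAccess(String name) { this.name = name; }
}

class VarAccess extends Node {
  final String name;
  VarAccess(String name) { this.name = name; }
}

// An unclassified name produced by the parser.
class ParseName extends Node {
  final String name; // assumed non-empty
  ParseName(String name) { this.name = name; }

  // Assumption for this sketch only: names starting with an upper-case
  // letter are types, all other names are variables.
  @Override Node rewriteTo() {
    return Character.isUpperCase(name.charAt(0))
        ? new TypeAccess(name)
        : new VarAccess(name);
  }
}

// Hypothetical parent node that simply holds children.
class Holder extends Node {}
```

In this model a caller that only uses getChild never observes a ParseName, and repeated accesses return the same rewritten node.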
It is possible to avoid the rewrite mechanism by using the
Static analysis problems are collected using the collection attribute feature in JastAdd. A collection attribute traverses the AST and gathers values from nodes in the tree.
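As a rough illustration of how a collection attribute gathers values, the following Java sketch traverses a tree and lets each node contribute to a shared result. All names here (Problem, contribute, collect, VarUse, Root) are hypothetical and chosen for this sketch; JastAdd generates the traversal from coll and contributes declarations instead of hand-written methods.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical problem record; only a message is modelled.
class Problem {
  final String message;
  Problem(String message) { this.message = message; }
}

class Node {
  final List<Node> children = new ArrayList<>();

  // Contribution hook: node types that report problems override this.
  void contribute(Collection<Problem> out) {}

  // Visits this node and then its whole subtree, gathering contributions.
  void collect(Collection<Problem> out) {
    contribute(out);
    for (Node child : children) {
      child.collect(out);
    }
  }
}

// Example contributor: a variable use that may refer to an undeclared name.
class VarUse extends Node {
  final String name;
  final boolean declared;
  VarUse(String name, boolean declared) {
    this.name = name;
    this.declared = declared;
  }

  @Override void contribute(Collection<Problem> out) {
    if (!declared) {
      out.add(new Problem("no variable named " + name));
    }
  }
}

// The collection is rooted here: it gathers problems from the whole tree.
class Root extends Node {
  List<Problem> problems() {
    List<Problem> result = new ArrayList<>();
    collect(result);
    return result;
  }
}
```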
The collection attribute for collecting static analysis problems is called
There are many examples of error contributions in, e.g.,
A contribution cannot be refined, but most error contributions use a helper attribute that can be refined. For example, the
The most used extension mechanism in ExtendJ is inheriting and overriding attributes. The second most used extension mechanism is JastAdd's attribute refinement feature. A refinement allows you to change the equation of a specific attribute from some other aspect.
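Since JastAdd attributes become methods on the generated AST classes, extension by inheriting and overriding attributes resembles ordinary method overriding. The sketch below shows that idea in plain Java with hypothetical classes (IntLiteral, AddExpr, SaturatingAddExpr) and hypothetical attributes (isConstant, operator, prettyPrint). Refinement has no direct plain Java counterpart and is not shown.

```java
// Hypothetical base class with two attributes modelled as methods.
abstract class Expr {
  // Attribute with a default equation that subclasses may override.
  boolean isConstant() { return false; }

  abstract String prettyPrint();
}

class IntLiteral extends Expr {
  final int value;
  IntLiteral(int value) { this.value = value; }

  @Override boolean isConstant() { return true; }
  @Override String prettyPrint() { return Integer.toString(value); }
}

class AddExpr extends Expr {
  final Expr left;
  final Expr right;
  AddExpr(Expr left, Expr right) {
    this.left = left;
    this.right = right;
  }

  // Constant only if both operands are constant.
  @Override boolean isConstant() { return left.isConstant() && right.isConstant(); }

  String operator() { return "+"; }

  @Override String prettyPrint() {
    return left.prettyPrint() + " " + operator() + " " + right.prettyPrint();
  }
}

// Hypothetical extension node: inherits isConstant and prettyPrint from
// AddExpr and overrides only the operator attribute.
class SaturatingAddExpr extends AddExpr {
  SaturatingAddExpr(Expr left, Expr right) { super(left, right); }

  @Override String operator() { return "+|"; }
}
```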
There are two techniques for generating code in ExtendJ:
Examples of direct code generation:
For extension development, the desugaring method is usually the best alternative. Desugaring is done by introducing a new nonterminal attribute (NTA) that represents the desugared version of a new language construct. An NTA must be used so that attribute evaluation works in the desugared AST. This is necessary because code generation uses some inherited attributes (such as localIndex). If the desugared AST is not an NTA, it has no parent pointer, and evaluating an inherited attribute on it leads to a NullPointerException.
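The role of the parent pointer can be shown with a small Java model. It is a sketch under simplifying assumptions: the Node, MethodBody, and NewStmt classes are hypothetical, the inherited attribute is modelled as a method that delegates to the parent, and the NTA is modelled as a lazily built, cached subtree whose parent is set to the node that owns it.

```java
// Hypothetical minimal node with a parent pointer.
class Node {
  Node parent;

  // Inherited attribute: the value is defined by an ancestor, so the
  // default is to ask the parent. A null parent causes a NullPointerException.
  int localIndex() {
    return parent.localIndex();
  }
}

// An ancestor that defines the inherited attribute for its subtree.
class MethodBody extends Node {
  @Override int localIndex() {
    return 3; // arbitrary value for the sketch
  }
}

// A new statement kind that is desugared into an existing construct.
class NewStmt extends Node {
  private Node desugared; // cached NTA value

  // NTA-like accessor: builds the desugared tree once and attaches it
  // to this node so that inherited attributes can be evaluated in it.
  Node desugared() {
    if (desugared == null) {
      Node tree = new Node();
      tree.parent = this;
      desugared = tree;
    }
    return desugared;
  }
}
```

In this model, a desugared tree obtained through desugared() can evaluate localIndex because the request travels up through NewStmt to MethodBody, whereas a free-standing Node with no parent fails with a NullPointerException.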
Examples of desugaring:
When building a desugared AST, it is often useful to copy parts of the original AST, for example, expressions used in a new kind of statement. To copy parts of the original AST, use the