Technical Design Overview

This page gives a high-level overview of the design of ExtendJ. It is a work in progress.

A condensed version of this information can be found in these presentation slides.

Abstract Grammar

The root of each ExtendJ AST is the Program node. A program consists of several CompilationUnits, one for each source file being compiled. Each CompilationUnit has an optional package declaration, import declarations, and multiple type declarations. Type declarations are represented by the abstract TypeDecl class. Concrete subclasses of TypeDecl represent class/enum/interface declarations in Java.

Each Java type (TypeDecl) has a list of member declarations. These include methods, fields, constructors, etc. Each member of a type is represented by the BodyDecl AST class. Concrete subtypes of BodyDecl represent different kinds of member declarations:

  • methods: MethodDecl
  • constructors: ConstructorDecl
  • fields: FieldDecl
  • initializers: InstanceInitializer and StaticInitializer
  • inner types: MemberClassDecl and MemberInterfaceDecl

The code in methods, constructors, and initializers is stored in Block nodes. Statements and expressions in a code block are represented by the Stmt and Expr classes, respectively. Note that Block is a subtype of Stmt, which allows blocks to be nested.
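The containment structure described above can be sketched in plain Java. This is only an illustration of the shape of the tree: the real AST classes are generated by JastAdd from the abstract grammar, and the fields, constructors, ClassDecl name, and the blockDepth() and example() helpers below are assumptions made for this sketch, not ExtendJ API.

```java
import java.util.ArrayList;
import java.util.List;

final class AstShapeSketch {
  abstract static class Stmt {}

  static final class ExprStmt extends Stmt {}

  // Block is itself a Stmt, so a block can contain other blocks.
  static final class Block extends Stmt {
    final List<Stmt> stmts = new ArrayList<>();

    Block add(Stmt s) {
      stmts.add(s);
      return this;
    }
  }

  abstract static class BodyDecl {}

  static final class FieldDecl extends BodyDecl {
    final String name;

    FieldDecl(String name) {
      this.name = name;
    }
  }

  static final class MethodDecl extends BodyDecl {
    final String name;
    final Block body;

    MethodDecl(String name, Block body) {
      this.name = name;
      this.body = body;
    }
  }

  abstract static class TypeDecl {
    final String name;
    final List<BodyDecl> bodyDecls = new ArrayList<>();

    TypeDecl(String name) {
      this.name = name;
    }
  }

  // Hypothetical concrete subclass standing in for a class declaration.
  static final class ClassDecl extends TypeDecl {
    ClassDecl(String name) {
      super(name);
    }
  }

  static final class CompilationUnit {
    final List<TypeDecl> typeDecls = new ArrayList<>();
  }

  static final class Program {
    final List<CompilationUnit> compilationUnits = new ArrayList<>();
  }

  // Depth of block nesting below a statement; 0 for a non-block statement.
  static int blockDepth(Stmt s) {
    if (!(s instanceof Block)) {
      return 0;
    }
    int deepest = 0;
    for (Stmt child : ((Block) s).stmts) {
      deepest = Math.max(deepest, blockDepth(child));
    }
    return 1 + deepest;
  }

  // Builds: Program > CompilationUnit > ClassDecl A { field count; method run { _; { _; } } }
  static Program example() {
    Block inner = new Block().add(new ExprStmt());
    Block body = new Block().add(new ExprStmt()).add(inner);
    ClassDecl decl = new ClassDecl("A");
    decl.bodyDecls.add(new FieldDecl("count"));
    decl.bodyDecls.add(new MethodDecl("run", body));
    CompilationUnit unit = new CompilationUnit();
    unit.typeDecls.add(decl);
    Program program = new Program();
    program.compilationUnits.add(unit);
    return program;
  }

  public static void main(String[] args) {
    TypeDecl type = example().compilationUnits.get(0).typeDecls.get(0);
    MethodDecl method = (MethodDecl) type.bodyDecls.get(1);
    System.out.println(type.name + "." + method.name + " block depth: " + blockDepth(method.body));
  }
}
```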

Statement Grammar

Java statements are represented by the following ExtendJ AST classes:

Java Statement          AST Class         Comment
{ _ }                   Block             List of statements.
_;                      ExprStmt          Single expression (method call, assignment, ...)
_ _ = _;                VarDeclStmt       Variable declaration(s).
for (_;_;_) _           ForStmt           Classic for-loop.
for (_ _ : _) _         EnhancedForStmt   For-each loop.
if ( _ ) _              IfStmt
while ( _ ) _           WhileStmt
do _ while ( _ )        DoStmt
try { _ }               TryStmt
try ( _ ) { _ }         TryWithResources
switch (_) { _ }        SwitchStmt
case _                  Case              Switch case label.
break                   BreakStmt
continue                ContinueStmt
return                  ReturnStmt
_: _                    LabeledStmt       Target for break and continue.
assert                  AssertStmt
synchronized (_) { _ }  SynchronizedStmt
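
To make the mapping concrete, here is a small ordinary Java method with each statement annotated by the AST class that the table above lists for it. The method itself (sumPositive) is a made-up example and is not taken from ExtendJ.

```java
final class StatementKindsSketch {
  // Sums the positive entries of an array. The trailing comments name the
  // AST class that the statement table lists for each kind of statement.
  static int sumPositive(int[] values) { // method body: Block
    int sum = 0;                         // VarDeclStmt
    for (int v : values) {               // EnhancedForStmt (body is a nested Block)
      if (v <= 0) {                      // IfStmt
        continue;                        // ContinueStmt
      }
      sum += v;                          // ExprStmt (assignment expression)
    }
    return sum;                          // ReturnStmt
  }

  public static void main(String[] args) {
    System.out.println(sumPositive(new int[] {1, -2, 3}));
  }
}
```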

Expression Grammar

The parts that build up Java statements are called expressions. The atoms that build up expressions are accesses. Accesses are expressions that refer to variables, fields, types, methods, constructors, etc. More generally, an access always refers to some declaration. For example, a TypeAccess refers to some TypeDecl, a MethodAccess refers to a MethodDecl, etc.

These are the different kinds of accesses in ExtendJ:

  • VarAccess - an access to a local variable, parameter (method parameter, exception parameter, try resource), or field.
  • MethodAccess - a method call.
  • Dot - a qualified access.
  • TypeAccess - a reference to some type.

Automatic AST Rewriting

ExtendJ parses most Java identifiers as ParseName nodes. These are needed because the parser cannot classify every name on its own. Instead, name classification is done later, using attributes.

The ParseName nodes should never be visible during attribute evaluation. ParseName nodes are automatically transformed into one of several different Access subtypes. This transformation is done using JastAdd's rewrite feature.

ParseName rewrites are defined in java4/frontend/ResolveAmbiguousNames.jrag.

JastAdd rewrites are automatically computed when a node with a rewrite is first accessed (using the getChild method). Rewrites are used sparingly in ExtendJ, because we want to keep the original AST structure. In most cases when the AST needs to be transformed or desugared, ExtendJ instead uses nonterminal attributes (NTAs).

It is possible to avoid the rewrite mechanism by using the _NoTransform variant of child accessor methods. The NoTransform accessors should only be used if you have a good reason.
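
The rewrite-on-first-access behaviour can be modelled in plain Java as follows. This is a simplified sketch and not how JastAdd implements rewrites: the ExprHolder class, the rewriteTo() method, and the rule "a name that matches a local variable becomes a VarAccess, anything else a TypeAccess" are assumptions made for illustration. Real name classification in ExtendJ is considerably more involved.

```java
import java.util.Collections;
import java.util.Set;

final class RewriteSketch {
  abstract static class Node {
    // Returns the node that should replace this one, or this node itself
    // when no rewrite applies.
    Node rewriteTo(Set<String> localVars) {
      return this;
    }
  }

  static final class VarAccess extends Node {
    final String name;

    VarAccess(String name) {
      this.name = name;
    }
  }

  static final class TypeAccess extends Node {
    final String name;

    TypeAccess(String name) {
      this.name = name;
    }
  }

  // Unclassified name produced by the parser.
  static final class ParseName extends Node {
    final String name;

    ParseName(String name) {
      this.name = name;
    }

    @Override
    Node rewriteTo(Set<String> localVars) {
      // Hypothetical classification rule, for illustration only.
      if (localVars.contains(name)) {
        return new VarAccess(name);
      }
      return new TypeAccess(name);
    }
  }

  // Hypothetical parent node with a single child.
  static final class ExprHolder {
    private final Set<String> localVars;
    private Node child;
    private boolean rewriteDone = false;
    int rewriteCount = 0;

    ExprHolder(Node child, Set<String> localVars) {
      this.child = child;
      this.localVars = localVars;
    }

    // Transforming accessor: rewrites the child on first access, then
    // keeps returning the rewritten node.
    Node getChild() {
      if (!rewriteDone) {
        Node replacement = child.rewriteTo(localVars);
        if (replacement != child) {
          child = replacement;
          rewriteCount++;
        }
        rewriteDone = true;
      }
      return child;
    }

    // Non-transforming accessor: returns the stored child as it is,
    // without triggering a rewrite.
    Node getChildNoTransform() {
      return child;
    }
  }

  public static void main(String[] args) {
    ExprHolder holder = new ExprHolder(new ParseName("x"), Collections.singleton("x"));
    System.out.println(holder.getChildNoTransform().getClass().getSimpleName());
    System.out.println(holder.getChild().getClass().getSimpleName());
  }
}
```

In this model, code that only ever calls getChild() never observes a ParseName, which mirrors the guarantee described above.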

Error Collecting

Static analysis problems are collected using the collection attribute feature in JastAdd. A collection attribute traverses the AST and gathers values from nodes in the tree.

The collection attribute for collecting static analysis problems is called CompilationUnit.problems(). Errors and warnings are added to this attribute by so-called contributions. A contribution is declared using the contributes construct. Here is an example of a simple error contribution from java4/frontend/DefiniteAssignment.jrag:

PostfixExpr contributes
    error("++ and -- can not be applied to final variable " + getOperand().varDecl().name())
    when getOperand().isVariable()
        && getOperand().varDecl() != null
        && getOperand().varDecl().isFinal()
    to CompilationUnit.problems();

There are many examples of error contributions in, e.g., java4/frontend/TypeCheck.jrag.

A contribution cannot be refined, but most error contributions use a helper attribute that can be refined. For example, the FieldDeclarator.typeProblems() attribute is used to compute the typing problems in a field declaration.
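
As a rough mental model, a collection attribute behaves like a tree traversal that asks every node for its contributions. The plain Java sketch below illustrates that idea, together with the pattern of delegating to an overridable helper method. The names localProblems() and collectProblems() and the simplified PostfixExpr fields are assumptions for this sketch; JastAdd generates the actual traversal.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class ProblemCollectionSketch {
  static class Node {
    final List<Node> children = new ArrayList<>();

    Node add(Node child) {
      children.add(child);
      return this;
    }

    // Helper method standing in for a refinable helper attribute:
    // the problems reported by this node alone.
    List<String> localProblems() {
      return Collections.emptyList();
    }

    // Visits the subtree and gathers each node's contribution.
    final void collectProblems(List<String> out) {
      out.addAll(localProblems());
      for (Node child : children) {
        child.collectProblems(out);
      }
    }
  }

  // Simplified stand-in: the operand is reduced to a name and a final flag.
  static final class PostfixExpr extends Node {
    final String operandName;
    final boolean operandIsFinal;

    PostfixExpr(String operandName, boolean operandIsFinal) {
      this.operandName = operandName;
      this.operandIsFinal = operandIsFinal;
    }

    @Override
    List<String> localProblems() {
      if (operandIsFinal) {
        return Collections.singletonList(
            "++ and -- can not be applied to final variable " + operandName);
      }
      return Collections.emptyList();
    }
  }

  static final class CompilationUnit extends Node {
    // Recomputed on every call in this sketch; no caching is modelled.
    List<String> problems() {
      List<String> out = new ArrayList<>();
      collectProblems(out);
      return out;
    }
  }

  static CompilationUnit example() {
    CompilationUnit unit = new CompilationUnit();
    unit.add(new Node().add(new PostfixExpr("x", true)).add(new PostfixExpr("y", false)));
    return unit;
  }

  public static void main(String[] args) {
    for (String problem : example().problems()) {
      System.out.println(problem);
    }
  }
}
```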

Refining Attributes

The most used extension mechanism in ExtendJ is inheriting and overriding attributes. The second most used extension mechanism is JastAdd's attribute refinement feature. A refinement allows you to change the equation of a specific attribute from some other aspect.

Name Bindings

Type Analysis

Subtyping

Type Inference

Code Generation

There are two techniques for generating code in ExtendJ:

  • direct code generation by calling methods on a CodeGeneration object, or,
  • code generation by desugaring.

Examples of direct code generation:

  • EnhancedForStmt.createBCode() in java5/backend/EnhancedForCodegen.jrag
  • BasicTWR.createBCode() in java7/backend/TryWithResources.jrag

For extension development, the desugaring method is usually the best alternative. Desugaring is done by introducing a new nonterminal attribute (NTA) that represents the desugared version of a new language construct. An NTA must be used so that attribute evaluation works in the desugared AST. This is necessary because code generation uses some inherited attributes, such as localIndex. If the desugared AST is not an NTA, it has no parent pointer, and inherited attribute evaluation leads to a NullPointerException.
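
The role of the parent pointer can be illustrated with a small plain Java model. This is a sketch under simplifying assumptions, not JastAdd-generated code: the inherited attribute is modelled as a method that delegates to the parent, the NTA as a lazily created and cached subtree whose parent is set to the declaring node, and the MethodBody and SugarStmt classes and the value 3 are invented for the example.

```java
final class NtaParentSketch {
  static class Node {
    Node parent;

    // Stand-in for an inherited attribute: the value is defined by an
    // ancestor, so a node without a parent cannot evaluate it.
    int localIndex() {
      return parent.localIndex();
    }
  }

  // Hypothetical ancestor that defines the inherited value.
  static final class MethodBody extends Node {
    @Override
    int localIndex() {
      return 3;
    }
  }

  // Hypothetical new language construct that is desugared.
  static final class SugarStmt extends Node {
    private Node transformed;

    // Stand-in for an NTA: computed once, cached, and attached to this
    // node so that lookups through the parent chain work.
    Node getTransformed() {
      if (transformed == null) {
        transformed = new Node();
        transformed.parent = this;
      }
      return transformed;
    }

    // Desugared tree built without attaching it anywhere.
    Node buildDetached() {
      return new Node();
    }
  }

  static SugarStmt example() {
    MethodBody body = new MethodBody();
    SugarStmt sugar = new SugarStmt();
    sugar.parent = body;
    return sugar;
  }

  // Returns true when evaluating localIndex() on a detached tree fails.
  static boolean detachedLookupFails() {
    try {
      example().buildDetached().localIndex();
      return false;
    } catch (NullPointerException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println("attached: " + example().getTransformed().localIndex());
    System.out.println("detached lookup fails: " + detachedLookupFails());
  }
}
```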

Examples of desugaring:

  • TryWithResources.getTransformed() in java7/backend/TryWithResources.jrag
  • Lambda.toClass() in java8/frontend/LambdaToAnonymousDecl.jrag used in java8/backend/CreateBCode.jrag

When building a desugared AST, it is often useful to copy parts of the original AST, for example the expressions used in a new kind of statement. To copy parts of the original AST, use the ASTNode.treeCopy() method. See, for instance, the TryWithResources.getTransformed() NTA.
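
The reason for copying rather than reusing a subtree is that a node can only have one parent: moving an expression into the desugared tree would detach it from the original tree. The plain Java sketch below models this with a hand-written deep copy. The Node class, its label field, and the desugar() helper are assumptions for illustration; in ExtendJ the copy is made by the generated ASTNode.treeCopy() method.

```java
import java.util.ArrayList;
import java.util.List;

final class TreeCopySketch {
  static final class Node {
    final String label;
    Node parent;
    final List<Node> children = new ArrayList<>();

    Node(String label) {
      this.label = label;
    }

    // Adds a child and sets its parent pointer to this node.
    Node add(Node child) {
      child.parent = this;
      children.add(child);
      return this;
    }

    // Deep copy of this subtree; the returned root has no parent yet.
    Node treeCopy() {
      Node copy = new Node(label);
      for (Node child : children) {
        copy.add(child.treeCopy());
      }
      return copy;
    }
  }

  // Builds a new statement around copies of the original's children,
  // leaving the original tree untouched.
  static Node desugar(Node original) {
    Node desugared = new Node("desugared-" + original.label);
    for (Node child : original.children) {
      desugared.add(child.treeCopy());
    }
    return desugared;
  }

  public static void main(String[] args) {
    Node original = new Node("sugar").add(new Node("expr"));
    Node desugared = desugar(original);
    System.out.println(desugared.label + " has child " + desugared.children.get(0).label);
    System.out.println("original child still attached: "
        + (original.children.get(0).parent == original));
  }
}
```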