Introducing Type-Systems to Xtext-Languages

Domain-specific languages (DSLs) are very popular among software developers, as they provide a more compact and constrained way to formulate certain aspects of the resulting system. There is already a wide range of different approaches and tool-chains for DSL development. In this article, we use the existing Xtext framework for external DSLs. While Xtext’s partner project Xbase provides a Java type-system and an expression language fragment for Xtext that supports Java types, we developed an approach to design and use arbitrary type-systems on the source and target model and code levels.

In this first article, we describe how to build a simple type-system on the source model level and how to integrate it into an Xtext-based language, so that it can be used in the editor and the generators. We start with a brief introduction to type-systems, followed by an example DSL and the motivation for its type-system. Then, we develop the meta-model and language constructs for the DSL. Finally, we show how to design the necessary Java classes and configuration to build the type-system.

Brief Introduction to Type-Systems

Type-systems are a broad topic in computer science, mathematics and especially in formal languages. For a comprehensive discussion of type-systems and their properties, see [Pierce 2002]. In short, type-systems come in two main styles: static typing and dynamic typing. There are also languages without a type-system. However, they can be considered mono-typed languages.

Type-systems, especially for statically typed languages, serve a special purpose in programming and modeling languages. They allow expressions and declarations to be checked for sanity at compile-time, which has many benefits. Types also serve as a sort of documentation, and they are helpful in specifying the interfaces of modules and components.

In this article, we focus on statically typed languages, meaning all types are present at compile-time, and new types are (normally) not created at run-time. C, Java, and Ada are examples of such statically typed languages.

Primitive Types

One trait all type-systems share is a set of primitive types, such as Integer, Float, String etc. For the type-system itself these types are atomic, have no internal structure, and are pairwise disjoint. Seen as sets of values, however, types may have a subset relationship. For example, a type representing 8 bit unsigned integer values can represent values from 0 to 255, while a 16 bit signed integer can have values from -32768 to 32767. Obviously the 8 bit values are also covered by the 16 bit value set. However, the sets as such are not equal and therefore the types are not equal.
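
The subset relation between value sets can be made concrete with a small sketch. The class and method names below are hypothetical and not part of the example project; the sketch only models integer types by their value range.

```java
// Hypothetical sketch: a primitive integer type described by its value range.
final class IntRange {
    final String name;
    final long min;
    final long max;

    IntRange(String name, long min, long max) {
        this.name = name;
        this.min = min;
        this.max = max;
    }

    /** True if every value of this type is also a value of the other type. */
    boolean isSubsetOf(IntRange other) {
        return other.min <= min && max <= other.max;
    }

    /** True only if both types cover exactly the same values. */
    boolean sameValues(IntRange other) {
        return min == other.min && max == other.max;
    }
}

final class IntRangeDemo {
    static final IntRange UINT8 = new IntRange("uint8", 0, 255);
    static final IntRange INT16 = new IntRange("int16", -32768, 32767);

    public static void main(String[] args) {
        System.out.println(UINT8.isSubsetOf(INT16)); // true
        System.out.println(INT16.isSubsetOf(UINT8)); // false
        System.out.println(UINT8.sameValues(INT16)); // false
    }
}
```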

One special primitive type is the type Unit, which has only one value called unit. Unit is used to model expressions whose return value is of no importance. In C style languages this function is provided by the type void. For more details, see [Pierce 2002, p. 118f].
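
As a minimal sketch, a Unit type can be mimicked in Java with an enumeration that has exactly one literal. The names Unit and UnitDemo are chosen for illustration only.

```java
// Hypothetical sketch: a type with exactly one value.
enum Unit {
    UNIT
}

final class UnitDemo {
    /** Appends a message; the result carries no information beyond "the call returned". */
    static Unit append(StringBuilder sink, String message) {
        sink.append(message);
        return Unit.UNIT;
    }
}
```

Unlike void, such a value can be stored and passed around like any other value, which is what makes Unit convenient in expression languages.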

Derived Types

Besides primitive types, a type-system often provides composed types such as arrays, records, unions, classes and references. There are more typing structures than these, as you can see in [Pierce 2004]. However, for this article, arrays, classes and references shall suffice.

A very basic composed type is the array. It is normally an ordered sequence of values of the same type. Each element of the sequence is called a field and can be accessed by its index number. Each array provides two operations, one to set a value and one to get a value. In general, the array index can start at any number and end at any number, as long as the start number is smaller than the end number.
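
The following sketch shows an array type with freely chosen index bounds and its two operations. BoundedArray is a hypothetical class for illustration. As a small deviation from the description, it also accepts equal start and end numbers, which yields a one-element array.

```java
// Hypothetical sketch: an array whose index runs from lower to upper (inclusive).
final class BoundedArray<T> {
    private final int lower;
    private final Object[] fields;

    BoundedArray(int lower, int upper) {
        // Reject ranges whose start lies above their end.
        if (lower > upper) {
            throw new IllegalArgumentException("lower bound must not exceed upper bound");
        }
        this.lower = lower;
        this.fields = new Object[upper - lower + 1];
    }

    /** Stores a value in the field with the given index. */
    void set(int index, T value) {
        fields[offset(index)] = value;
    }

    /** Reads the value of the field with the given index (null if never set). */
    @SuppressWarnings("unchecked")
    T get(int index) {
        return (T) fields[offset(index)];
    }

    /** Maps the external index to a zero-based position and checks the bounds. */
    private int offset(int index) {
        int offset = index - lower;
        if (offset < 0 || offset >= fields.length) {
            throw new IndexOutOfBoundsException("index " + index + " is outside the array bounds");
        }
        return offset;
    }
}
```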

Records are tuples with named fields. Every field has its own type, which can either be restricted to primitive types or, as in most languages, be any valid type. The advantage of records is that every field can be accessed by name. The fields are by definition unordered. Most implementations of records imply an order, but that is an implementation detail.
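
A record type can be sketched as a mapping from field names to field types. RecordTypeSketch and the field names used in the test are hypothetical. The sketch stores types as plain strings to stay short.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: a record type as a set of named, typed fields.
final class RecordTypeSketch {
    // Field name -> type name. The map keeps insertion order, but equality ignores it.
    private final Map<String, String> fieldTypes = new LinkedHashMap<>();

    /** Declares a field; a field name may only be used once per record. */
    RecordTypeSketch field(String name, String typeName) {
        if (fieldTypes.putIfAbsent(name, typeName) != null) {
            throw new IllegalArgumentException("duplicate field: " + name);
        }
        return this;
    }

    /** Looks up the type of a field by its name. */
    String typeOf(String fieldName) {
        String typeName = fieldTypes.get(fieldName);
        if (typeName == null) {
            throw new IllegalArgumentException("unknown field: " + fieldName);
        }
        return typeName;
    }

    /** Two record types match if they declare the same fields, in any order. */
    boolean sameFields(RecordTypeSketch other) {
        return fieldTypes.equals(other.fieldTypes);
    }
}
```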

The third derived type is the reference type. A reference type is used to express a reference to a value (for example, to a variable holding that value). To restrict references to certain types of values, they have a type assigned, like arrays. To make use of references, a refer-to and a dereference operation accompany the type.
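
The two operations of a reference type can be sketched as follows. Variable and Ref are hypothetical names. Since Java cannot take the address of a local variable, the sketch models a variable as a small holder object.

```java
// Hypothetical sketch: a variable holding a value of type T.
final class Variable<T> {
    T value;

    Variable(T value) {
        this.value = value;
    }
}

// Hypothetical sketch: a typed reference to such a variable.
final class Ref<T> {
    private final Variable<T> target;

    private Ref(Variable<T> target) {
        this.target = target;
    }

    /** The refer-to operation: creates a reference to the given variable. */
    static <T> Ref<T> referTo(Variable<T> variable) {
        return new Ref<>(variable);
    }

    /** The dereference operation: reads the current value of the referenced variable. */
    T dereference() {
        return target.value;
    }
}
```

The type parameter plays the role of the assigned type: a Ref<Integer> can only refer to a variable holding integers.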

The most complex type described in this article is the class type. Classes have a wide range of properties, like method overloading, shadowing of attributes, sub-typing, access to the parent class, and interfaces. In this article, we only discuss sub-typing and assume that attributes cannot be redefined or shadowed. We also do not provide access to the parent class, and multiple inheritance, as in C++, is out of scope as well. All these features require a lot of preparation in the meta-model, the grammar and later in the checks for the language. While they are interesting features, and the gentle reader might want to add them later, we focus on a much simpler model, which suffices to illustrate the construction of a source level type-system.

Sub-typing is a mechanism to create a specialized type on the basis of a general type. The specialized type has the same attributes as the general type. However, it can add additional attributes. Instances of the specialized type are then also instances of the general type, as they can provide a value for every general type property under the same name. Sub-typing is also a transitive property of a type. So if S, T and U are types, S <: T (where <: is the symbol for the sub-typing relation) and T <: U, then also S <: U.
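
With single inheritance, the sub-typing relation and its transitivity can be sketched as a walk along the parent chain. ClassSketch is a hypothetical stand-in for a class type. The sketch treats the relation as reflexive, so every type is a sub-type of itself.

```java
// Hypothetical sketch: a class type with at most one parent.
final class ClassSketch {
    final String name;
    final ClassSketch parent; // null for a class without a parent

    ClassSketch(String name, ClassSketch parent) {
        this.name = name;
        this.parent = parent;
    }

    /**
     * True if this type is the other type or one of its direct or indirect
     * descendants. Following the parent chain makes the relation transitive.
     */
    boolean isSubTypeOf(ClassSketch other) {
        for (ClassSketch current = this; current != null; current = current.parent) {
            if (current == other) {
                return true;
            }
        }
        return false;
    }
}
```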

DSL for App Development

Let's assume we want to develop a multi-platform app development language, which is able to communicate with services on the net or on the device itself and provides a nice user interface (UI). The basic properties of that language are:

  • Separation of views and services
  • Data storage
  • User interface design
  • REST service support

To fulfill these properties, a set of basic typing structures is required. Most apps have to handle currencies, floats, integers, dates and times, locations, names and images. Integer, float, and string are primitive types, whereas date, time, location and image are predefined types with an internal structure. A location can be modeled with two or three float attributes. A date is either a long integer with a set of methods to extract day, month, year, and any other time value, or a record with fields for all these properties. An image has a specific size in two dimensions, expressed as integers for pixel graphics or as floats for vector graphics. It also incorporates a file handle, which we will model as a primitive type even though this is different on the implementation layer.

As the primitive types do not have any internal structure, we can use an enumeration to model them:

public enum PrimitiveTypes {
    BOOLEAN, INT, FLOAT, FILE, STRING
}

Besides the primitive types, we already defined special composite types. These are image, location and date.

From the “requirements” above, we can conclude that structures for views, services and composite data are required. The composite data structures are modeled as records with sub-typing capabilities, but without any behavior attached to them. In the DSL we call them classes. Services, on the other hand, are sets of operations, which communicate via a REST API with services on the Internet. They can be called and they return data. While services communicate with Internet or local services, the UI is handled by views. A view displays preprocessed content using a set of widgets and reacts to user input. The user input then triggers service operations, which enrich the data model implemented using predefined classes.

Meta-Model and Language Design

The meta-model of the DSL must reflect the type-system. We could construct a meta-model class for every primitive type and use this class to determine the type in subsequent transformations and the code generator. Instead, we use only one class to hold all primitive types and a name attribute to store the type. This has the advantage that user-defined types and primitive types can use the same lookup code, and we do not need different implementation concepts for user and primitive types in model transformations and code generation. Also, giving each type its own meta-model class cannot be carried over to user types: if every type had to be modeled in the same way, each new user type would require a change of the meta-model. Therefore, the best way to model primitive types is to use one meta-model class called PrimitiveType.
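
The benefit of a shared lookup can be sketched in plain Java. The classes below are hypothetical stand-ins, as the real meta-model classes are generated by EMF. NamedType, PrimitiveTypeSketch, ClassTypeSketch and TypeTable are names chosen for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the common base of all types: only a name.
abstract class NamedType {
    private final String name;

    NamedType(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

/** One class for all primitive types; the name attribute tells them apart. */
final class PrimitiveTypeSketch extends NamedType {
    PrimitiveTypeSketch(String name) {
        super(name);
    }
}

/** One example of a user-defined type. */
final class ClassTypeSketch extends NamedType {
    ClassTypeSketch(String name) {
        super(name);
    }
}

/** A single lookup by name serves primitive and user-defined types alike. */
final class TypeTable {
    private final Map<String, NamedType> byName = new LinkedHashMap<>();

    void add(NamedType type) {
        byName.put(type.getName(), type);
    }

    /** Returns the type with the given name, or null if none is registered. */
    NamedType lookup(String name) {
        return byName.get(name);
    }
}
```

Adding a new primitive type then means adding one more instance, not one more class.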

For user types, we have a wide variety of, so to speak, types of types. We concluded that we need classes, which we will model with ClassType. The same applies to services, views, and widgets.

To cover all types we define a major type class, which has only a name attribute. Figure 1 provides a visualization of the type-system so far.

Figure 1: Simplified Meta-Model for the Type-System

In Figure 1 you can see a class UserType, which provides a parent class for all user types to come. This class is not strictly necessary, as we could derive all user type classes directly from Type. However, for a better understanding and for documentation purposes the UserType class is quite handy.

We define for the different types of types the following classes:

  • Type
    • PrimitiveType
    • UserType
      • ClassType
      • ServiceType
      • ViewType
      • WidgetType
    • ArrayType (is a Type, but also a TypeReference)

As you can see, we added a type ArrayType to the types meta-model, so we can describe array types in our language. Having an array type allows us to model operation calls where arrays are passed as parameters and return types. If the array property were instead attached to attributes and parameters, the type checker could not use it.
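
Why a first-class array type helps the type checker can be sketched as follows. TypeNode, SimpleTypeNode, ArrayTypeNode and CallCheckSketch are hypothetical plain-Java stand-ins, not the generated meta-model classes, and the "[]" naming is an assumption of this sketch.

```java
// Hypothetical stand-in for the base class of all types.
abstract class TypeNode {
    final String name;

    TypeNode(String name) {
        this.name = name;
    }
}

/** Stand-in for primitive and user types, which are identified by name. */
final class SimpleTypeNode extends TypeNode {
    SimpleTypeNode(String name) {
        super(name);
    }
}

/** An array is itself a type that refers to the type of its elements. */
final class ArrayTypeNode extends TypeNode {
    final TypeNode elementType;

    ArrayTypeNode(TypeNode elementType) {
        super(elementType.name + "[]");
        this.elementType = elementType;
    }
}

final class CallCheckSketch {
    /**
     * Decides whether an argument type matches a parameter type. Arrays are
     * compared by their element types; all other types by identity.
     */
    static boolean sameType(TypeNode argument, TypeNode parameter) {
        if (argument instanceof ArrayTypeNode && parameter instanceof ArrayTypeNode) {
            return sameType(((ArrayTypeNode) argument).elementType,
                    ((ArrayTypeNode) parameter).elementType);
        }
        return argument == parameter;
    }
}
```

Because the array is a type of its own, the same comparison works for parameters, return types and nested arrays.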

For the complete meta-model check out the git repository for this example:

git clone git@github.com:rju/xtext-typing-example.git
git clone https://github.com/rju/xtext-typing-example.git

Introducing the Type-System to an Xtext Language

After this brief introduction of our example type-system, it must be integrated into our Xtext-based language. Assuming we already created an Xtext language project called de.cau.cs.se.lad (language for app development) with its two sub-projects for UI and tests, we start by creating a new package in the source folder for the type-system and call it de.cau.cs.se.lad.typing.

The first class to create is the enumeration PrimitiveTypes:

public enum PrimitiveTypes {
    /** primitive types for lad */
    BOOLEAN, INT, FLOAT, FILE, STRING;

    public String lowerCaseName() {
        return this.name().toLowerCase();
    }
}

The lowerCaseName method returns the enumeration literal in lower case, just the way we want it in our language.
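
One detail is worth a hedge: String.toLowerCase() without an argument uses the default locale of the platform. Under a Turkish locale, for example, the letter I is mapped to a dotless ı, so INT would not become int. A locale-independent variant could look like the following sketch. The name PrimitiveTypesSketch is only used here to keep the example self-contained.

```java
import java.util.Locale;

// Sketch of a locale-independent variant of the enumeration.
enum PrimitiveTypesSketch {
    BOOLEAN, INT, FLOAT, FILE, STRING;

    /** Returns the literal in lower case, independent of the default locale. */
    String lowerCaseName() {
        return name().toLowerCase(Locale.ROOT);
    }
}
```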

With this enumeration, the primitive types are declared, but they have to be made visible in the Xtext scoping. A solution is the definition of our own scope provider TypeGlobalScopeProvider. It is subclassed from the DefaultGlobalScopeProvider, which we use because the primitive types cannot be found in the local scope. The TypeGlobalScopeProvider decides if a given reference to a class (referenceType) of the meta-model belongs to the type-system (see the isAssignableFrom check in getParentTypeScope) and if so creates a new PrimitiveTypeScope. If the class is something else, IScope.NULLSCOPE is returned.

Note: To have a sound scope chain, it is required to return NULLSCOPE instead of null.

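import com.google.common.base.Predicate;
import com.google.inject.Inject;
import org.eclipse.emf.ecore.EClass;
import org.eclipse.emf.ecore.EReference;
import org.eclipse.emf.ecore.resource.Resource;
import org.eclipse.emf.ecore.resource.ResourceSet;
import org.eclipse.xtext.EcoreUtil2;
import org.eclipse.xtext.naming.IQualifiedNameConverter;
import org.eclipse.xtext.resource.IEObjectDescription;
import org.eclipse.xtext.scoping.IScope;
import org.eclipse.xtext.scoping.impl.DefaultGlobalScopeProvider;

import de.cau.cs.se.lad.types.TypesPackage;

public class TypeGlobalScopeProvider extends DefaultGlobalScopeProvider {
    @Inject
    private TypeProviderFactory typeProviderFactory;

    @Inject
    private IQualifiedNameConverter qualifiedNameConverter;

    @Override
    public IScope getScope(Resource resource, EReference reference, Predicate<IEObjectDescription> filter) {
        // The primitive type scope becomes the parent of the default global scope.
        IScope parentTypeScope = getParentTypeScope(resource, reference, filter, reference.getEReferenceType());
        return super.getScope(parentTypeScope, resource, false, reference.getEReferenceType(), filter);
    }

    protected IScope getParentTypeScope(Resource resource, EReference reference,
            Predicate<IEObjectDescription> filter, EClass referenceType) {
        // Only references to types of the type-system get the primitive type scope.
        if (EcoreUtil2.isAssignableFrom(TypesPackage.Literals.TYPE, referenceType)) {
            if (resource == null) {
                throw new IllegalStateException("context must be contained in a resource");
            }
            ResourceSet resourceSet = resource.getResourceSet();
            if (resourceSet == null) {
                throw new IllegalStateException("context must be contained in a resource set");
            }
            ITypeProvider typeProvider = typeProviderFactory.getTypeProvider(resourceSet);
            return new PrimitiveTypeScope(typeProvider, qualifiedNameConverter, filter);
        } else {
            // Never return null here, otherwise the scope chain is broken.
            return IScope.NULLSCOPE;
        }
    }

}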
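
The reason for returning a null scope object instead of null can be illustrated without Xtext. The following sketch uses hypothetical types (ScopeSketch, MapScopeSketch) that only mimic the idea of a scope chain. They are not the Xtext IScope API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a scope: maps names to elements.
interface ScopeSketch {
    /** Terminal scope that contains nothing; it plays the role of a null scope. */
    ScopeSketch EMPTY = name -> null;

    /** Returns the element registered under the name, or null if there is none. */
    String lookup(String name);
}

// Hypothetical sketch of a scope that delegates to its parent.
final class MapScopeSketch implements ScopeSketch {
    private final ScopeSketch parent;
    private final Map<String, String> elements = new HashMap<>();

    MapScopeSketch(ScopeSketch parent) {
        if (parent == null) {
            throw new IllegalArgumentException("use ScopeSketch.EMPTY instead of null");
        }
        this.parent = parent;
    }

    /** Registers an element under a name and returns this scope for chaining. */
    MapScopeSketch with(String name, String element) {
        elements.put(name, element);
        return this;
    }

    @Override
    public String lookup(String name) {
        String local = elements.get(name);
        // Delegation needs no null check because every chain ends in EMPTY.
        return local != null ? local : parent.lookup(name);
    }
}
```

With a null parent, every delegation would need its own null check, and a single forgotten check would end in a NullPointerException during linking.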

The PrimitiveTypeScope is an AbstractScope, which can be found in detail in the code repository. Its primary function is to model the scope for the primitive types.

Now we have a scope, but without anything in it. Something must provide the types. This is done by a TypeProvider. The TypeProvider controls two major objects. The first is a Resource, which is necessary because all model data in Eclipse must be contained in a resource. For the type-system, we call it TypeResource. The second object is called PrimitiveMirror. This mirror object builds the types in conformance with the type-system and adds them to the TypeResource. It is also used by the TypeResource to look up the type instances.

/*
 * Science Blog
 *
 * http://www.se.informatik.uni-kiel.de
 * 
 * Copyright 2012 by
 * + Christian-Albrechts-University of Kiel
 *   + Department of Computer Science
 *     + Software Engineering Group
 * 
 * This code is provided under the terms of the Eclipse Public License (EPL).
 * See the file epl-v10.html for the license text.
 */
package de.cau.cs.se.lad.typing;

import org.eclipse.emf.ecore.EObject;
import org.eclipse.emf.ecore.resource.Resource;
import org.eclipse.xtext.resource.IFragmentProvider;

import de.cau.cs.se.lad.types.TypesFactory;
import de.cau.cs.se.lad.types.TypesPackage;
import de.cau.cs.se.lad.types.PrimitiveType;
import de.cau.cs.se.lad.types.Type;

/**
 * @author Christian Schneider - Initial contribution
 */
public class PrimitiveMirror {

	/**
	 * Constructs the primitive mirror. The primitive types themselves are created later in
	 * {@link #initialize(TypeResource)}.
	 */
	public PrimitiveMirror() {
	}

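	/**
	 * Searches for an object in a resource described by a fragment.
	 * 
	 * @param resource the resource whose top-level contents are searched
	 * @param fragment the fragment identifying the object
	 * @param fallback the fallback used for objects that are not primitive types
	 * @return the matching top-level object, or the result of the fallback lookup
	 */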
	public EObject getEObject(final Resource resource, final String fragment,
	        final IFragmentProvider.Fallback fallback) {
		for (EObject obj : resource.getContents()) {
			String otherFragment = getFragment(obj, fallback);
			if (fragment.equals(otherFragment))
				return obj;
		}
		return fallback.getEObject(fragment);
	}

	/**
	 * Computes the fragment of an object. A primitive type is identified by its name; any other
	 * object is delegated to the fallback.
	 */
	public String getFragment(EObject obj, IFragmentProvider.Fallback fallback) {
		if (TypesPackage.eINSTANCE.getPrimitiveType().isInstance(obj)) {
			return ((PrimitiveType) obj).getName();
		} else {
			return fallback.getFragment(obj);
		}
	}

	/**
	 * Initialises a given type resource with the primitive types from the type enumeration.
	 * 
	 * @param typeResource
	 *            the type resource to be initialised
	 * 
	 */
	public void initialize(final TypeResource typeResource) {
		for (PrimitiveTypes primitiveType : PrimitiveTypes.values()) {
			Type type = TypesFactory.eINSTANCE.createPrimitiveType();
			type.setName(primitiveType.lowerCaseName());
			typeResource.getContents().add(type);
		}
	}
}

The whole type-system support is illustrated in the following diagram.

[Diagram: type-system overview]

Now, with the typing infrastructure in place, the last thing to do is to register the TypeGlobalScopeProvider and give the application a try. Check out the project from our git repository and import it into your Eclipse.
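
For readers who want to register the scope provider by hand, the following fragment sketches the usual way to do so in the Guice runtime module of an Xtext language. It only compiles inside the Xtext project. The names LadRuntimeModule and AbstractLadRuntimeModule are assumptions derived from Xtext's naming convention for a language called Lad, and the package of TypeGlobalScopeProvider is assumed to be de.cau.cs.se.lad.typing. Please check the repository for the actual names.

```java
package de.cau.cs.se.lad;

import org.eclipse.xtext.scoping.IGlobalScopeProvider;

// Assumption: the scope provider lives in the typing package.
import de.cau.cs.se.lad.typing.TypeGlobalScopeProvider;

// Assumption: Xtext generated AbstractLadRuntimeModule for the language.
public class LadRuntimeModule extends AbstractLadRuntimeModule {

    // Xtext discovers bind* methods reflectively, so this method determines
    // which class is injected wherever an IGlobalScopeProvider is requested.
    public Class<? extends IGlobalScopeProvider> bindIGlobalScopeProvider() {
        return TypeGlobalScopeProvider.class;
    }
}
```

The @Override annotation is left out on purpose, as the generated parent module may or may not declare this method, depending on the scoping fragments used in the workflow.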

The example was developed using

  • Eclipse 4.2.x modeling edition with
  • Xtext 2.3.1 with Xtend 2.3.1, MWE 2
  • EGit 2.0.0 and JGit 2.0.0

Build the project and launch it. As you can see (hopefully), simple definitions for classes, services and views can be made. There is an example file provided in the src folder of de.cau.cs.se.lad.

Right now the meta-model cannot handle expressions with these primitive types, as the specific portions of the meta-model are missing. To provide a more complete app development language, expressions will be targeted in one of my next posts.