Kieker | Reiner Jung

The instrumentation record language (IRL) is part of a model-driven instrumentation approach being developed in the context of the iObserve project. The goal of this project is to forecast the behavior of cloud-based applications by simulation. The simulation utilizes the application model in conjunction with a usage profile. However, profiles defined at design time do not reflect the real system usage. Therefore, the models must be calibrated based on the observed behavior. Our model-driven instrumentation approach addresses this issue.

Since the last post on the instrumentation record language, additional requirements have surfaced from an analysis of all record structures currently included in the Kieker distribution. This led to a more comprehensive list of language features and a more complex type system. Furthermore, a few Kieker record types exhibit properties that may either be good ideas for extending the language or reflect concerns that do not belong to the record definition aspect. As no final decision has been made on those properties, they are described in the fourth section of this post to foster the discussion.

Language Features

The instrumentation record language (IRL) is designed to represent monitoring records. It therefore belongs to the domain of entity languages, such as the entity language of the JPA, which is defined by special annotations and Java classes. However, the IRL is much more specialized for the needs of defining monitoring records. In short, the realized features are:

Definition of record types based on a set of primitive types and arrays.
A multi-inheritance typing system supporting inheritance of one parent record and additional property sets, which we call templates.
Default initialization of record properties, either by value or by constant.
Definition of constants usable during analysis.

Typing

The type system supports multi-inheritance. In general, this could be realized by simply specifying multiple parent structures, which would be a very clean approach. However, due to the present record structure, which is implemented in Java, we must differentiate between parent classes defining record types with properties and interfaces emulating record types with properties. Therefore, the type system provides template types and entity types to define interfaces and classes, respectively.

In our type system, templates are ordered sets of properties, which may even come with initialization values. However, they are not instantiable entities for the instrumentation aspect language (IAL). Templates can inherit properties from any number of other templates, but not from entities.

Entities can define constants and properties and can have one parent entity from which they inherit properties. Furthermore, they can realize any number of templates. An entity can also be abstract, which means it is not instantiable in the IAL.

Properties declared in entities or templates have a primitive or array type, where arrays can have fixed or dynamic sizes. Furthermore, arbitrary data structures specified in the instrumented application can also be used as types. However, neither this feature nor arrays are presently supported by the Kieker infrastructure.
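To make the distinction between templates and entities more tangible, the following Java sketch shows how they could map onto interfaces and classes. All type and property names (ITimestamped, OperationEvent, and so on) are made up for illustration and are not taken from Kieker or from the output of the IRL generators.

```java
// Hypothetical names throughout; this is not output of the IRL generators.
final class TypingSketch {

    // A template maps to an interface: it declares properties but cannot be instantiated.
    interface ITimestamped {
        long getTimestamp();
    }

    // Templates may inherit from any number of other templates.
    interface ITraceElement extends ITimestamped {
        long getTraceId();
    }

    interface IHostBound {
        String getHostname();
    }

    // An abstract entity maps to an abstract class.
    abstract static class AbstractEvent implements ITimestamped {
        private final long timestamp;

        protected AbstractEvent(final long timestamp) {
            this.timestamp = timestamp;
        }

        @Override
        public long getTimestamp() {
            return this.timestamp;
        }
    }

    // An entity has at most one parent entity but may realize several templates.
    static class OperationEvent extends AbstractEvent implements ITraceElement, IHostBound {
        private final long traceId;
        private final String hostname;

        OperationEvent(final long timestamp, final long traceId, final String hostname) {
            super(timestamp);
            this.traceId = traceId;
            this.hostname = hostname;
        }

        @Override
        public long getTraceId() {
            return this.traceId;
        }

        @Override
        public String getHostname() {
            return this.hostname;
        }
    }

    public static void main(final String[] args) {
        final OperationEvent event = new OperationEvent(42L, 7L, "node-1");
        final ITraceElement traceView = event; // view through one template
        final IHostBound hostView = event;     // view through another template
        System.out.println(traceView.getTimestamp() + " " + traceView.getTraceId() + " " + hostView.getHostname());
    }
}
```

The single-parent restriction for entities follows directly from Java's single class inheritance, while the unrestricted number of templates corresponds to Java's ability to implement several interfaces.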

Code Generation

The generator is realized with the normal Xtext API, meaning that we have one central class defining the generator. However, to support multiple target languages, we use this central class only as a dispatcher, which calls dedicated generators for the different languages and IRL data types.

The API for generators for the different target languages is described in AbstractPartialRecordTypeGenerator and AbstractRecordTypeGenerator, located in the de.cau.cs.se.instrumentation.rl.generator package.
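The dispatcher idea can be sketched in plain Java as follows. This is only an illustration under simplifying assumptions: RecordModel, LanguageGenerator, JavaGenerator and CGenerator are hypothetical stand-ins and do not reproduce the actual API of AbstractRecordTypeGenerator, and every property is emitted as a 64-bit integer to keep the example short.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DispatcherSketch {

    // Hypothetical, simplified stand-in for a record type from the IRL model.
    static final class RecordModel {
        final String name;
        final List<String> propertyNames;

        RecordModel(final String name, final List<String> propertyNames) {
            this.name = name;
            this.propertyNames = propertyNames;
        }
    }

    // Hypothetical contract of a language-specific generator.
    interface LanguageGenerator {
        String fileName(RecordModel record);

        String generate(RecordModel record);
    }

    static final class JavaGenerator implements LanguageGenerator {
        @Override
        public String fileName(final RecordModel record) {
            return record.name + ".java";
        }

        @Override
        public String generate(final RecordModel record) {
            final StringBuilder code = new StringBuilder("public class " + record.name + " {\n");
            for (final String property : record.propertyNames) {
                code.append("    private long ").append(property).append(";\n");
            }
            return code.append("}\n").toString();
        }
    }

    static final class CGenerator implements LanguageGenerator {
        @Override
        public String fileName(final RecordModel record) {
            return record.name + ".h";
        }

        @Override
        public String generate(final RecordModel record) {
            final StringBuilder code = new StringBuilder("typedef struct {\n");
            for (final String property : record.propertyNames) {
                code.append("    long long ").append(property).append(";\n");
            }
            return code.append("} ").append(record.name).append(";\n").toString();
        }
    }

    // The central class only dispatches; it contains no language-specific logic.
    static Map<String, String> dispatch(final RecordModel record, final List<LanguageGenerator> generators) {
        final Map<String, String> files = new LinkedHashMap<>();
        for (final LanguageGenerator generator : generators) {
            files.put(generator.fileName(record), generator.generate(record));
        }
        return files;
    }

    public static void main(final String[] args) {
        final RecordModel record = new RecordModel("OperationEvent", Arrays.asList("timestamp", "traceId"));
        final List<LanguageGenerator> generators =
                Arrays.<LanguageGenerator>asList(new JavaGenerator(), new CGenerator());
        dispatch(record, generators).forEach((file, code) -> System.out.println("// " + file + "\n" + code));
    }
}
```

With this structure, supporting a further target language means adding one more LanguageGenerator implementation to the list, without touching the dispatcher.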

Special Properties

Right now, there are three architectural properties present in the Java implementation of Kieker records, which are not covered by the language.

Kieker declares some properties in multiple interfaces with the same signature. In the IRL, this would imply defining a property in different templates and inheriting all of them in one entity. The type-checker of the IRL prohibits this, as it results in redefining properties. Redefining properties can make the record structure less comprehensible, as two properties that are declared in the same way but have different semantics shadow one another. A user who is not aware of the situation will not be able to identify those cases and might use properties in the wrong way. However, there might also be arguments in favor of supporting such multiple definitions.
Only a few record types implement special operations (see #1101). These operations are only used in a few classes throughout the analysis framework and are mostly utilized in tests. Therefore, it could be argued that these methods should be moved to the filters requiring them. However, no final decision has been made on this point, which is why the feature is listed here.
In one interface, getters are defined which are later implemented in a class as aliases for another property, emulating an alias or a renaming of properties. This can help users to better understand the semantics of the properties in certain contexts. However, it could also lead to confusion or, even worse, be used to change the semantics. As with the other two features, no final decision has been made.
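The first and the third property are easier to discuss with a concrete shape in mind. The following Java sketch shows both situations; the interface and class names are invented for this example and do not refer to actual Kieker record types.

```java
// Hypothetical record and interface names, chosen only for illustration.
final class SpecialPropertiesSketch {

    // First case: two interfaces declare the same getter with the same signature.
    interface IEventRecord {
        long getTimestamp();
    }

    interface IFlowRecord {
        long getTimestamp();
    }

    // Third case: an interface offering a context-specific name for an existing property.
    interface ICallRecord {
        String getCalleeSignature();
    }

    static final class CallEvent implements IEventRecord, IFlowRecord, ICallRecord {
        private final long timestamp;
        private final String operationSignature;

        CallEvent(final long timestamp, final String operationSignature) {
            this.timestamp = timestamp;
            this.operationSignature = operationSignature;
        }

        // One method satisfies both interfaces; Java cannot tell whether
        // both declarations were meant to carry the same semantics.
        @Override
        public long getTimestamp() {
            return this.timestamp;
        }

        public String getOperationSignature() {
            return this.operationSignature;
        }

        // Alias: no storage of its own, it only forwards to operationSignature.
        @Override
        public String getCalleeSignature() {
            return this.getOperationSignature();
        }
    }

    public static void main(final String[] args) {
        final CallEvent event = new CallEvent(42L, "void run()");
        System.out.println(event.getOperationSignature().equals(event.getCalleeSignature()));
    }
}
```

In Java, both cases compile without any warning, which is precisely why a reader of the record structure cannot see whether two declarations are intended to be the same property.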

Resources

The record language is presently only available from a git repository, which can be accessed via our gitlab instance at https://build.se.informatik.uni-kiel.de/gitlab/iobserve/model-driven-instrumentation. The site also includes more precise documentation of the language and its features.

Instrumentation of software systems is used to monitor their behavior in order to determine performance properties and the internal state, and also to recover the actual architecture of systems. Today, systems are implemented in different programming languages. Monitoring different parts of a system therefore requires monitoring record structures in the different languages. To perform joint analyses over all components of a system, the different records must be compatible. This can be achieved by defining common serialization standards for the various languages, as we did for the Kieker Data Bridge, and by implementing suitable runtime libraries for them. However, the monitoring records must still be rewritten from scratch for each language.

Furthermore, model-driven software development uses models to describe software systems. Code generators interpret these models and generate source code in the programming languages used. Therefore, it would be beneficial to express monitoring at the model level as well.

A solution to these two issues is to use modeling languages to express instrumentation. At present, we divide instrumentation into monitoring records, a monitoring runtime, and weaving instructions for the monitoring aspect. In this post, we address modeling and code generation for monitoring records and their supplemental serialization functions.

This post first introduces the requirements for such an instrumentation record language (IRL), followed by the language features realizing these requirements. Then the technological context is briefly introduced, before the grammar and the code generation are explained. Finally, a short summary of the results is given.

Requirements

The model-driven instrumentation record language must fulfill the following distinct requirements to be part of a model-driven monitoring approach:

Target language independent description of monitoring data
Mapping of types from other languages to Java types
Code generation for all languages supported by Kieker
Ability to add additional payload to records
Limit the size of the payload, to ensure reasonable response times
Reuse of record definition, especially in context of payload definitions
Fit in model-driven monitoring approach

Language Features

The instrumentation record language must be able to support different target languages, which requires a language-independent type system. The type system must provide the necessary structures and primitive types used in monitoring records. As the analysis side of Kieker is written in Java, primitive types of a target language must be mappable to Java types.

To initialize properties of records with default values, the language must support the definition of constants for defaults.

Kieker has many different record types, and in the future, with the payload feature, the number of records might increase. Therefore, it is helpful to reuse record definitions, which can easily be realized through a sub-typing feature.

Payload data is formulated in the type system of the target language of the software system. Therefore, the generators must be able to inherit those types and provide serialization for them. This feature can be implemented later, as it requires quite some work on the level of code generator design.

On the modeling level, the language must be able to import the type system of the modeling language. It must also provide a transformation for these type structures to the target language of the project and a transformation to map the types onto Java-types.

In OO-languages a payload can be described by a reference to an object. As objects can have references to other objects, this could result in a large quantity of data to be transmitted every time the probe is executed. To limit the size, references should not be followed implicitly. They must be explicitly specified.

Finally, the language requires code generators for every supported programming language. These code generators should be easy to implement and should share model navigation routines. They also require an API specification and a runtime. The runtime is not part of the language itself, but it is necessary to specify the code generator, as it has to generate code utilizing that API.

Development Environment

The IRL is based on the Xtext DSL development framework, which utilizes Eclipse as IDE and EMF as the basis for meta-modeling. Xtext provides a grammar language and editor, and utilizes ANTLR to generate the actual parsers used for syntax highlighting in editors and for model generation. The framework is bundled with Xtend, a model-to-model and model-to-text language, which is used to implement the IRL code generators.

Grammar

The grammar of the IRL can be divided into four parts: first, the generic header of any Xtext language, importing terminals and meta-models; second, a service part to specify package names and to import ecore models and other resources; third, the actual record type definition; and fourth, literals and additional terminals.

Header

The header is not very spectacular. It defines the class name for the parser, specifies the import of the terminals grammar, and imports the ecore package so ecore types can be used in the grammar.

grammar de.cau.cs.se.instrumentation.rl.RecordLang with org.eclipse.xtext.common.Terminals

generate recordLang "http://www.cau.de/cs/se/instrumentation/rl/RecordLang"
import "http://www.eclipse.org/emf/2002/Ecore" as ecore

Service Part

The service part defines the root of the model. Therefore, the first rule is called Model. The model has a name attribute holding the name of the package to which the model belongs. The grammar then allows ecore models to be specified. In general, these should be ecore-based meta-models comprising a type system of some sort. They are used to introduce the typing of DSLs to formulate the payload. However, the mechanism behind them might not be perfect and will be subject to change.

The language also allows record entities to be imported with the import statement. However, they should normally be imported automatically if they are in the same project or in an associated project.

Model: 'package' name = QualifiedName
       (packages += Package)*
       (imports += Import)*
       (records += RecordType)*
;

Import: 'import' importedNamespace = QualifiedNameWithWildcard ;

Package: 'use' name=ID package=[ecore::EPackage|STRING] ;

In the future, the use and import infrastructure must be able to import the type systems of other languages and reflect their code generation principles in order to realize arbitrary payloads for any language.

Record Declaration

The record declaration must fulfill different tasks. At present, it must be able to specify record properties for the monitoring and default values for these properties. Depending on the present source code of record structures in Kieker, it might be necessary to augment these basic declarations. First, there might be automatic values in the record structure, which the language must support. Second, some properties might be mandatory while others are optional. Those optional properties then require a default value for the serialization. At present, this is realized through default constants.

TemplateType:
	'template' name=ID (':' inherits+=[TemplateType|QualifiedName] (',' inherits+=[TemplateType|QualifiedName])*)? 
	(
		('{' (properties+=Property | constants+=Constant)* '}') |
		(properties+=Property)
	)?
;

RecordType:
	(abstract?='abstract')? ('struct') name=ID 
	('extends' parent=[RecordType|QualifiedName])?
	(':' inherits+=[TemplateType|QualifiedName] (',' inherits+=[TemplateType|QualifiedName])*)?
	('{'
		(properties+=Property | constants+=Constant)*
	'}')?
;

In the grammar, each record is represented by a RecordType instance. The language uses the keyword struct, as the word record is already part of the Kieker package names and would otherwise clash with the keyword. A RecordType can have a parent RecordType, which it extends with additional properties. Previously defined properties and default constants may not be overwritten. The inheritance pattern is classic sub-typing, so all properties of the parent are visible at the child level. The body of a RecordType consists of Property and Default declarations. However, at present there is also a combined form that declares a default together with a property.
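The sub-typing rule, including the ban on overwriting inherited properties, can be illustrated with a small Java sketch. RecordDecl is a hypothetical, much simplified stand-in for a RecordType in the model, and listing parent properties before the record's own properties is an assumption made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class InheritanceSketch {

    // Hypothetical, simplified stand-in for a RecordType with an optional parent.
    static final class RecordDecl {
        final String name;
        final RecordDecl parent; // null for a record without a parent
        final List<String> properties;

        RecordDecl(final String name, final RecordDecl parent, final List<String> properties) {
            this.name = name;
            this.parent = parent;
            this.properties = properties;
        }
    }

    // Collects all properties visible in a record, parent properties first,
    // and rejects any property that redefines an already visible one.
    static List<String> unfold(final RecordDecl record) {
        final List<String> all = new ArrayList<>();
        if (record.parent != null) {
            all.addAll(unfold(record.parent));
        }
        final Set<String> seen = new HashSet<>(all);
        for (final String property : record.properties) {
            if (!seen.add(property)) {
                throw new IllegalArgumentException(
                        "Property '" + property + "' is redefined in " + record.name);
            }
            all.add(property);
        }
        return all;
    }

    public static void main(final String[] args) {
        final RecordDecl base = new RecordDecl("AbstractEvent", null, Arrays.asList("timestamp"));
        final RecordDecl child = new RecordDecl("OperationEvent", base,
                Arrays.asList("traceId", "operationSignature"));
        System.out.println(unfold(child)); // prints [timestamp, traceId, operationSignature]
    }
}
```

A record that declared timestamp again below AbstractEvent would be rejected by unfold with an IllegalArgumentException, mirroring the rule that previously defined properties may not be overwritten.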

Default:
	'default' type=Classifier name=ID '=' value=Literal
;

A normal default constant starts with the keyword default, followed by a type. In general, all ecore types are allowed. However, the literals are limited to primitive types. Therefore, at present only primitive classifiers are useful. The language defines a set of primitive types, which can be extended programmatically by modifying the enumeration de.cau.cs.se.instrumentation.rl.typing.PrimitiveTypes.

Properties are defined in a similar way. However, they do not have an initial keyword. They are defined by a type followed by a name.

Property: type=Classifier name=ID
             ('{' (properties+=ReferenceProperty)* '}' | 
              '=' value=Literal |
              '=' const=Constant)
;

Constant: name=ID value=Literal ;

After the name, one of three things can follow. The first option addresses payloads. The second option allows a literal to be specified, which can be a value of some kind or a previously defined default constant. The third option defines a default constant and automatically assigns it to the property, just like a combination of a default constant declaration and a property using that default constant as its literal.

The type is declared by a reference to an ecore classifier, which can be one of the built-in types or one of the structures from a package imported above.

Classifier:
	(package=[Package] '.')? class=[ecore::EClassifier|ID]
;

To specify a payload, the language also uses a property. In that case, the classifier is not a primitive type, but a complex type defined in an imported model. As complex types may imply recursive structures, which could result in referencing the whole runtime state, each required value must be stated explicitly if it is complex. For example, if a property has the type of a Java class, then all properties of that class that have a primitive type are serialized and stored in the transmission, whereas every property referencing a complex type is only represented by an object id. If the content of such a complex type is also required, it must be specified as a ReferenceProperty between the two braces.

ReferenceProperty:
	ref=[ecore::EStructuralFeature|ID] ('{' (properties+=ReferenceProperty)* '}')?
;

The ReferenceProperty allows recursing down into data structures with nested ReferenceProperty declarations. While this looks complicated for deeply nested structures, it ensures that only a minimum of data is retrieved from the system's runtime state and that the data size is limited.
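The effect of explicit reference declarations on the transmitted data can be illustrated with a small Java sketch. It does not show the IRL generators or the Kieker serialization; ObjectNode, Selection and serialize are hypothetical helpers that model an object graph as maps, under the assumption that primitive values are always copied and references are replaced by an object id unless they are listed explicitly.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class PayloadSketch {

    // Hypothetical runtime object: an id plus named fields (primitive values or other ObjectNodes).
    static final class ObjectNode {
        final int id;
        final Map<String, Object> fields = new LinkedHashMap<>();

        ObjectNode(final int id) {
            this.id = id;
        }

        ObjectNode with(final String name, final Object value) {
            this.fields.put(name, value);
            return this;
        }
    }

    // Hypothetical counterpart of nested ReferenceProperty declarations:
    // only references listed here are followed.
    static final class Selection {
        final Map<String, Selection> children = new LinkedHashMap<>();

        Selection follow(final String reference, final Selection nested) {
            this.children.put(reference, nested);
            return this;
        }
    }

    static Map<String, Object> serialize(final ObjectNode object, final Selection selection) {
        final Map<String, Object> out = new LinkedHashMap<>();
        for (final Map.Entry<String, Object> field : object.fields.entrySet()) {
            final Object value = field.getValue();
            if (value instanceof ObjectNode) {
                final ObjectNode target = (ObjectNode) value;
                final Selection nested = selection.children.get(field.getKey());
                if (nested == null) {
                    out.put(field.getKey(), target.id); // not selected: object id only
                } else {
                    out.put(field.getKey(), serialize(target, nested)); // explicitly selected
                }
            } else {
                out.put(field.getKey(), value); // primitive values are always included
            }
        }
        return out;
    }

    public static void main(final String[] args) {
        final ObjectNode customer = new ObjectNode(1).with("name", "Ada");
        final ObjectNode address = new ObjectNode(2).with("city", "Kiel").with("owner", customer);
        customer.with("address", address); // cyclic object graph

        System.out.println(serialize(customer, new Selection()));
        // prints {name=Ada, address=2}
        System.out.println(serialize(customer, new Selection().follow("address", new Selection())));
        // prints {name=Ada, address={city=Kiel, owner=1}}
    }
}
```

Because the recursion only descends along the finite selection tree, the cyclic reference between customer and address cannot lead to unbounded output, which is the size guarantee the explicit declaration is meant to provide.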

Literals and Terminals

The language supports literals for all supported primitive types and references to default constants.

Literal:
	StringLiteral | IntLiteral | FloatLiteral | BooleanLiteral | DefaultLiteral
;

StringLiteral:
	value=STRING
;

IntLiteral:
	value=INTEGER
;

FloatLiteral:
	value=FLOAT
;

BooleanLiteral: 
	{BooleanLiteral} ((value?='true')|'false')
;

DefaultLiteral:
	value=[Default|ID]
;

To model qualified names and express imports, standard qualified name rules have been added. The literals require signed floating point and integer values. Therefore, two suitable terminals complete the language.

QualifiedName:
  ID (=>'.' ID)*;

QualifiedNameWithWildcard:
	QualifiedName ('.' '*')?
;

// terminals
terminal INTEGER returns ecore::EInt: ('+'|'-')? INT;

terminal FLOAT returns ecore::EFloat: ('+'|'-')? (INT '.' INT? | '.' INT);
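To see which inputs the two terminals accept, they can be mirrored as Java regular expressions. This is only an approximation for experimentation, assuming that INT from org.eclipse.xtext.common.Terminals is a non-empty sequence of decimal digits; the class and method names are made up for this example, and the sketch ignores how the Xtext lexer resolves conflicts between terminals.

```java
import java.util.regex.Pattern;

final class NumberTerminalsSketch {

    // Mirrors: terminal INTEGER: ('+'|'-')? INT;
    static final Pattern INTEGER = Pattern.compile("[+-]?[0-9]+");

    // Mirrors: terminal FLOAT: ('+'|'-')? (INT '.' INT? | '.' INT);
    static final Pattern FLOAT = Pattern.compile("[+-]?(?:[0-9]+\\.[0-9]*|\\.[0-9]+)");

    static boolean isInteger(final String text) {
        return INTEGER.matcher(text).matches();
    }

    static boolean isFloat(final String text) {
        return FLOAT.matcher(text).matches();
    }

    public static void main(final String[] args) {
        for (final String sample : new String[] { "-42", "+3.25", "1.", ".5", "42", "1e5", "." }) {
            System.out.println(sample + " integer=" + isInteger(sample) + " float=" + isFloat(sample));
        }
    }
}
```

Two consequences of the FLOAT rule become visible this way: a decimal point is mandatory, so 42 is only an INTEGER, and exponent notation such as 1e5 is not covered by either terminal.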

Code Generation

At present, the code generator for the IRL can produce Java and C files representing the monitoring records, and it provides some serialization functionality for C-based records.

Every code generator must provide a type mapping routine called createTypeName, which converts a classifier of a primitive type into a primitive type of the target language. In addition, all generators must be able to unfold the list of properties that belong to a monitoring record. This functionality is realized in the RecordLangGenericGenerator class with the compileProperties method.
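A type mapping routine of this kind could look roughly like the following Java sketch. It is not the implementation found in the repository: the real createTypeName operates on a classifier, whereas this sketch uses plain strings, and both the set of primitive type names and the chosen C types are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class TypeMappingSketch {

    // Hypothetical contract; the real routine takes a classifier instead of a String.
    interface TypeNameMapper {
        String createTypeName(String primitiveType);
    }

    // Table-driven mapper that rejects unknown primitive types.
    static final class TableMapper implements TypeNameMapper {
        private final Map<String, String> table;

        TableMapper(final Map<String, String> table) {
            this.table = table;
        }

        @Override
        public String createTypeName(final String primitiveType) {
            final String target = this.table.get(primitiveType);
            if (target == null) {
                throw new IllegalArgumentException("Unsupported primitive type: " + primitiveType);
            }
            return target;
        }
    }

    // Assumed primitive type names and their Java counterparts.
    static TypeNameMapper javaMapper() {
        final Map<String, String> table = new LinkedHashMap<>();
        table.put("int", "int");
        table.put("long", "long");
        table.put("float", "float");
        table.put("double", "double");
        table.put("boolean", "boolean");
        table.put("string", "String");
        return new TableMapper(table);
    }

    // Assumed C counterparts (fixed-width integers from stdint.h, bool from stdbool.h).
    static TypeNameMapper cMapper() {
        final Map<String, String> table = new LinkedHashMap<>();
        table.put("int", "int32_t");
        table.put("long", "int64_t");
        table.put("float", "float");
        table.put("double", "double");
        table.put("boolean", "bool");
        table.put("string", "const char*");
        return new TableMapper(table);
    }

    public static void main(final String[] args) {
        System.out.println(javaMapper().createTypeName("string"));
        System.out.println(cMapper().createTypeName("long"));
    }
}
```

Using fixed-width C types is one way to keep the serialized size of a property identical across languages, which matters when records written by a C probe are read on the Java analysis side.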

The detailed structure of the monitoring records the generators have to produce is partially documented for Java in the source code of IMonitoringRecord. For the two other example languages, C and Perl, the generators follow the records used in the KDB test client written in C and in the eprints instrumentation written in Perl by Nis Wechselberg.

Summary

This post introduced the instrumentation record language for the Kieker monitoring framework and the Kieker Data Bridge. At present, it is a simple language with sub-typing to build monitoring record structures in three programming languages (C, Java, Perl). It is the first prototype of a generic record language for Kieker, to be used in future releases to ease the instrumentation of applications written in different languages. Furthermore, it is a building block in the effort to develop model-driven instrumentation.

The whole language is based on Xtext and on the typing mechanism explained in a previous post, which will get its typing rules from XSemantics in the near future. The code generation is based on Xtend, and thanks to a generator API, it is easy to write generators for new languages.

At the moment, the code can be found at:

git clone https://github.com/research-iobserve/instrumentation-language.git

The repository also contains the previous record language in de.cau.cs.se.instrumentation.language, which is deprecated, and an instrumentation application language in de.cau.cs.se.instrumentation.al, which is one of the next steps toward model-driven monitoring. The tree also contains an old version of the Kieker Data Bridge, which has since been moved to the main Kieker repository.

Reiner Jung

Research, Digital Transformation, Climate Action and other Random Stuff

Tag Archives: Kieker

Present State of the Instrumentation Record Language for Kieker

An Instrumentation Record Language for Kieker
