An Instrumentation Record Language for Kieker

Instrumentation of software systems is used to monitor their behavior, to determine performance properties and internal state, and to recover the actual architecture of systems. Today, systems are implemented in different programming languages. Monitoring different parts of such a system therefore requires monitoring record structures in these different languages. To perform joint analyses over all components of a system, the different records must be compatible. This can be achieved by defining common serialization standards for various languages, as we did for the Kieker Data Bridge, and by implementing suitable runtime libraries for the different languages. However, the monitoring records themselves must still be rewritten from scratch for each language.

Furthermore, model-driven software development uses models to describe software systems. Code generators interpret these models and generate source code in the programming languages used. Therefore, it would be beneficial to express monitoring at the model level as well.

A solution to these two issues is a modeling language to express instrumentation. At present, we divide instrumentation into monitoring records, a monitoring runtime, and weaving instructions for the monitoring aspect. In this post, we address modeling and code generation for monitoring records and their supplemental serialization functions.

This post first introduces the requirements for such an instrumentation record language (IRL), followed by the language features realizing these requirements. Then the technological context is briefly introduced, before the grammar and code generation are explained. Finally, a short summary of the results is given.

Requirements

The model-driven instrumentation record language must fulfill the following distinct requirements to be part of a model-driven monitoring approach:

  1. Target language independent description of monitoring data
  2. Mapping of types from other languages to Java types
  3. Code generation for all languages supported by Kieker
  4. Ability to add additional payload to records
  5. Limit the size of the payload to ensure reasonable response times
  6. Reuse of record definition, especially in context of payload definitions
  7. Fit in model-driven monitoring approach

Language Features

The instrumentation record language must be able to support different target languages, which requires a language-independent type system. The type system must provide the necessary structures and primitive types used in monitoring records. As the analysis side of Kieker is written in Java, primitive types of a target language must be mappable to Java types.

To initialize properties of records with default values, the language must support the definition of constants for defaults.

Kieker has many different record types and, in the future, with the payload feature, the number of records might increase further. Therefore, it is helpful to reuse record definitions, which can easily be realized through a sub-typing feature.

Payload data is formulated in the type system of the target language of the software system. Therefore, the generators must be able to process those types and provide serialization for them. This feature can be implemented later, as it requires considerable work at the level of code generator design.

On the modeling level, the language must be able to import the type system of the modeling language. It must also provide a transformation of these type structures to the target language of the project and a transformation to map the types onto Java types.

In OO-languages, a payload can be described by a reference to an object. As objects can have references to other objects, this could result in a large quantity of data being transmitted every time the probe is executed. To limit the size, references should not be followed implicitly; they must be specified explicitly.

Finally, the language requires code generators for every supported programming language. These code generators should be easy to implement and share model navigation routines. They also require an API specification and a runtime. The runtime is not part of the language itself, but it is necessary to specify the code generator, as the generator has to produce code utilizing that API.

Development Environment

The IRL is based on the Xtext DSL development framework, which utilizes Eclipse as IDE and EMF as the basis for meta-modeling. Xtext provides a grammar editor and language, and utilizes Antlr to generate the actual parsers for the editor's syntax highlighting and for model generation. The framework is bundled with the model-to-model and model-to-text language Xtend, which is used to implement the IRL code generators.

Grammar

The grammar of the IRL can be divided into four different parts: first, the generic header of any Xtext language, importing terminals and meta-models; second, a service part to specify package names and import ecore models and other resources; third, the actual record type definitions; and fourth, literals and additional terminals.

Header

The header is not very spectacular. It defines the class name for the parser, specifies the import of the terminals grammar, and imports the ecore package so that ecore types can be used in the grammar.

grammar de.cau.cs.se.instrumentation.rl.RecordLang with org.eclipse.xtext.common.Terminals

generate recordLang "http://www.cau.de/cs/se/instrumentation/rl/RecordLang"
import "http://www.eclipse.org/emf/2002/Ecore" as ecore

Service Part

The service part defines the root of the model. Therefore, the first rule is called Model. The model has a name attribute holding the package name the model belongs to. The grammar then allows the specification of ecore models. In general, these should be ecore-based meta-models comprising a type system of some sort. They are used to introduce the typing of DSLs to formulate the payload. However, the mechanism behind them might not be perfect and will be subject to change.

The language also allows importing record entities with the import statement. However, normally they should be imported automatically if they are in the same project or in an associated project.

Model: 'package' name = QualifiedName
       (packages += Package)*
       (imports += Import)*
       (records += RecordType)*
;

Import: 'import' importedNamespace = QualifiedNameWithWildcard ;

Package: 'use' name=ID package=[ecore::EPackage|STRING] ;
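
Under these rules, the head of a hypothetical model file could look as follows (the package name, the ecore URI, and the imported namespace are made up for illustration):

```
package de.example.monitoring

use types "http://www.example.org/types"

import de.example.common.*
```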

In the future, the use and import infrastructure must be able to import other languages' type systems and reflect their code generation principles in order to realize arbitrary payloads for any language.

Record Declaration

The record declaration must fulfill different tasks. At present, it must be able to specify record properties for the monitoring and default values for these properties. Depending on the present source code of record structures in Kieker, it might be necessary to augment these basic declarations. First, there might be automatic values in the record structure, which the language must support, and second, some properties might be mandatory while others are optional. Those optional properties then require a default value for the serialization. At present, this is realized through default constants.

TemplateType:
	'template' name=ID (':' inherits+=[TemplateType|QualifiedName] (',' inherits+=[TemplateType|QualifiedName])*)? 
	(
		('{' (properties+=Property | constants+=Constant)* '}') |
		(properties+=Property)
	)?
;

RecordType:
	(abstract?='abstract')? 'struct' name=ID 
	('extends' parent=[RecordType|QualifiedName])?
	(':' inherits+=[TemplateType|QualifiedName] (',' inherits+=[TemplateType|QualifiedName])*)?
	('{'
		(properties+=Property | constants+=Constant)*
	'}')?
;
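
To illustrate the two rules, a hypothetical template and record definition might read as follows (the keywords follow the grammar above; all names and values are invented):

```
template Timed {
	long timestamp = 0
}

struct OperationEntryEvent : Timed {
	string operationSignature = ""
	int stackDepth = 0
}
```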

In the grammar, each record is represented by a RecordType instance. The language uses the keyword struct, as the word record is already part of the Kieker package names and would clash with the keyword. A RecordType can have a parent RecordType, which it extends with additional properties. Previously defined properties and default constants may not be overwritten. The inheritance pattern is a classic sub-typing pattern, so all properties are visible on the child level. The body of a RecordType consists of Property and Default declarations. However, at present there also exists a combined method to declare defaults in combination with properties.

Default:
	'default' type=Classifier name=ID '=' value=Literal
;

A normal default constant starts with the keyword default, followed by a type. In general, all ecore types are allowed. However, the literals are limited to primitive types; therefore, at present only primitive classifiers are useful. The language defines a set of primitive types, which are programmatically extendable by modifying the enumeration de.cau.cs.se.instrumentation.rl.typing.PrimitiveTypes.

Properties are defined in a similar way. However, they do not have an initial keyword; they are defined by a type followed by a name.

Property: type=Classifier name=ID
             ('{' (properties+=ReferenceProperty)* '}' | 
              '=' value=Literal |
              '=' const=Constant)?
;

Constant: name=ID value=Literal ;

After the name, three different things can happen. The second option allows specifying a literal, which can be a value of some kind or a previously defined default value. The third option allows defining a default constant and automatically assigning it to the property, just like a combination of a default constant declaration and a property using that default constant as its literal. The first option addresses payloads.
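
As a sketch, the value-bearing forms might look like this inside a record body (all names and values are invented, and the concrete syntax of the combined constant form is read directly off the Property and Constant rules above):

```
struct ExampleRecord {
	// a property with a literal default value
	int version = 1
	// a property declaring its own default constant inline
	string hostname = HOSTNAME "unknown"
}
```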

The type is declared by a reference to an ecore classifier, which can be one of the built-in types or one of the structures in a package imported above.

Classifier:
	(package=[Package] '.')? class=[ecore::EClassifier|ID]
;

To specify a payload, the language also uses a property. In that case, the classifier is not a primitive type but a complex type defined in an imported model. As complex types may imply recursive structures, which could result in referencing the whole runtime state, each required value must be stated explicitly if it is complex. For example, a property may have the type of a Java class; then all properties of that class which have a primitive type will be serialized and stored in the transmission, while every property referencing a complex type will only be represented by an object id. If the content of such a complex type is also required, it must be specified as a ReferenceProperty between the two braces.

ReferenceProperty:
	ref=[ecore::EStructuralFeature|ID] ('{' (properties+=ReferenceProperty)* '}')?
;

The ReferenceProperty allows recursing down into data structures with nested ReferenceProperty declarations. While this looks complicated for deeply nested structures, it ensures that only a minimum of data is retrieved from the system's runtime state and that the data size is limited.
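
A hypothetical payload declaration along these lines might read as follows, assuming an imported ecore package named types containing a Session class with a user reference that in turn has a name attribute (all of these names are invented):

```
struct SessionAwareRecord {
	long timestamp = 0
	types.Session session {
		user {
			name
		}
	}
}
```

Only the explicitly listed features (user and its name) would be serialized in full; every other complex reference of Session would be reduced to an object id.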

Literals and Terminals

The language supports literals for all supported primitive types and references to default constants.

Literal:
	StringLiteral | IntLiteral | FloatLiteral | BooleanLiteral | DefaultLiteral
;

StringLiteral:
	value=STRING
;

IntLiteral:
	value=INTEGER
;

FloatLiteral:
	value=FLOAT
;

BooleanLiteral: 
	{BooleanLiteral} ((value?='true')|'false')
;

DefaultLiteral:
	value=[Default|ID]
;

To model qualified names and express imports, standard qualified name rules have been added. The literals require signed floating point and integer values; therefore, two suitable terminals complete the language.

QualifiedName:
  ID (=>'.' ID)*;

QualifiedNameWithWildcard:
	QualifiedName ('.' '*')?
;

// terminals
terminal INTEGER returns ecore::EInt: ('+'|'-')? INT;

terminal FLOAT returns ecore::EFloat: ('+'|'-')? (INT '.' INT? | '.' INT);

Code Generation

At present, the code generator for the IRL can produce Java and C files representing the monitoring records, and provides some serialization functionality for C-based records.

Every code generator must provide a type mapping routine called createTypeName, which converts a classifier of a primitive type into a target language primitive type. All generators must also be able to unfold the set of properties belonging to a monitoring record. This functionality is realized in the RecordLangGenericGenerator class with the compileProperties method.
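
As a sketch of what such a createTypeName routine might do for a C target, consider the following Java fragment. The mapping table is an assumption for illustration, not the actual generator's table, and the class name is invented:

```java
// Hypothetical per-language type mapping in the spirit of createTypeName.
// Maps IRL primitive type names to C type names (mapping is an assumption).
public final class CTypeMapping {
    public static String createTypeName(final String irlType) {
        switch (irlType) {
            case "boolean": return "uint8_t";
            case "byte":    return "int8_t";
            case "short":   return "int16_t";
            case "int":     return "int32_t";
            case "long":    return "int64_t";
            case "float":   return "float";
            case "double":  return "double";
            case "string":  return "const char *";
            default:
                throw new IllegalArgumentException("unknown IRL type: " + irlType);
        }
    }
}
```

A generator for another target language would provide its own table while keeping the same routine name, which is what lets the generators share the surrounding model navigation code.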

The detailed structure of the monitoring records the generators have to produce is partially documented for Java in the source code of IMonitoringRecord. The two other example languages, C and Perl, used records from the KDB test client written in C and the eprints instrumentation written in Perl by Nis Wechselberg.

Summary

This post introduced the instrumentation record language for the Kieker monitoring framework and the Kieker Data Bridge. At present, it is a simple language with sub-typing to build monitoring record structures in three programming languages (C, Java, Perl). It is the first prototype of a generic record language for Kieker, to be used in future releases to ease instrumentation of applications written in different languages. Furthermore, it is a building block in the effort to develop model-driven instrumentation.

The whole language is based on Xtext, with the typing mechanism explained in a previous post, which will get its typing rules from XSemantics in the near future. The code generation is based on Xtend and, due to a generator API, it is easy to write generators for new languages.

At the moment, the code can be found at:

git clone https://github.com/research-iobserve/instrumentation-language.git

The repository also contains the previous record language in de.cau.cs.se.instrumentation.language, which is deprecated, and an instrumentation application language in de.cau.cs.se.instrumentation.al, which is one of the next steps towards model-driven monitoring. The tree also contains an old version of the Kieker Data Bridge, which has since been moved to the main Kieker repository.

Proposal: Kieker Data Bridge and Adaptive Monitoring

The Kieker Data Bridge is a service designed to support Kieker probes for non-JVM languages. Instead of reimplementing the wide range of Kieker data stores, the bridge allows implementing Kieker records and probes for non-JVM languages, like Perl or C, which send their monitoring data to the bridge for conversion and storage. While it is possible to run the KDB and the monitored application on the same machine, we use the term remote site or remote application to refer to the monitored application from the viewpoint of the KDB. Likewise, the KDB is called remote from the viewpoint of the monitored application.

The present design of the Kieker Data Bridge (KDB) is only able to handle incoming Kieker monitoring records and store them in a Kieker data store. However, Kieker has evolved and supports adaptive monitoring based on regular expression patterns for method or function signatures. In our effort to support a wide range of programming and modeling languages, the Kieker bridge has to be extended accordingly.

In this blog post, we discuss this feature and different implementation ideas. First, we explain the present KDB behavior and the behavior specification of probes. Second, the properties of the adaptation feature are defined and then used in the third section to discuss solution concepts. Fourth, we define a new behavior for probes and the bridge.

Behavior Model of the KDB and Probes

The Kieker Data Bridge, in its current design, is configured on startup by dynamically loading Kieker record classes, configuring a Kieker monitoring controller, and setting up a network connection with one of its network connectors for TCP or JMS. Then it starts listening for incoming records, as shown in Figure 1.

Figure 1: KDB and probe behavior illustrated in an activity diagram

Of course, there can be more than one probe and each probe can be triggered multiple times. Each time a probe is triggered, it sends a record to the KDB, which then processes the data and stores it in a Kieker store, which can use a number of different storage methods, like files, databases, or message queues.

While the KDB design can use multiple threads to handle incoming data, the probe can run embedded in normal code and does not require a task switch. The TCP and JMS network connectors of the KDB are also single-threaded, as they can wait for incoming data and block while listening. This results in low system overhead for the transmission of record data.

Understanding the Adaptive Monitoring Feature

Adaptive monitoring allows activating and deactivating probes at runtime. Every time a probe is triggered, the system must evaluate whether the triggered probe is active and only then collect data and store it with Kieker.

The adaptive monitoring in Kieker is based on regular expression patterns, which describe method or function signatures similar to the expressions used in AspectJ. In general, if a probe is triggered, it passes that method's signature as a string to the ProbeController, which checks if the given signature matches any stored pattern. Successful matches are stored in an internal signature cache to speed up later lookups. The controller returns either true or false, indicating that the probe can proceed with data collection and storage, or must skip those tasks, respectively.

These lookups are performed every time a probe is triggered. Therefore, the cost of the evaluation should be minimized as much as possible. In the present implementation of the probes, the method signature is provided to the ProbeController and is first checked against a signature cache. On cache misses, the signature is passed to a pattern matching routine.
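
The cache-in-front-of-patterns scheme can be sketched in a few lines of Java. This is an illustrative sketch, not Kieker's actual ProbeController API; all class and method names here are invented:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.regex.Pattern;

// Minimal sketch of an activation lookup: a signature cache in front
// of a regex pattern match, invalidated when patterns change.
public final class SignatureCache {
    private final Map<String, Boolean> cache = new ConcurrentHashMap<>();
    private final List<Pattern> activationPatterns;

    public SignatureCache(final List<Pattern> activationPatterns) {
        this.activationPatterns = activationPatterns;
    }

    // cache hit: constant-time lookup; cache miss: full pattern scan
    public boolean isActive(final String signature) {
        return cache.computeIfAbsent(signature, sig -> {
            for (final Pattern pattern : activationPatterns) {
                if (pattern.matcher(sig).matches()) {
                    return true;
                }
            }
            return false;
        });
    }

    // pattern changes invalidate the cache, as in the present ProbeController
    public void invalidate() {
        cache.clear();
    }
}
```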

In the KDB context, each time a probe is triggered in a remote program, the activation lookup must be performed prior to the data collection. If this lookup had to use IPC for every call, the overall response time of the probes would be unacceptably slow. In particular, the presently implemented TCP and JMS connections add a lot of latency to remote calls.

Another problem arises when patterns for probe activation are changed. In that case a probe activation cache must be invalidated or at least updated. The ProbeController performs such invalidation, but in the KDB context, these changes must be communicated to the remote application, at best before the next probe is triggered. However, the probe design should stay as simple as possible.

Design Concepts for Adaptive Monitoring with KDB

The adaptive monitoring via KDB must solve all the above problems and provide a general approach which is suitable for a wide range of languages and run-time contexts.

In many bi-directional communication scenarios, two threads are used to realize the communication: one thread listens for incoming requests, while the other actively processes something and initiates communication when necessary. For monitoring probes, this is a complex solution. First, it requires multi-threading, which is not available in all runtime environments; in some contexts it is expensive, and in some contexts threads are hard to handle. Second, it includes synchronization and could lead to a wide range of race conditions or other execution artifacts you do not want to handle in monitoring probes, as they would add to the overall execution time. A simpler solution for two-way communication is a simple query-response pattern, initiated by the probe, either when the probe is triggered or when the probe is executed.

If the update for the signature cache is requested in the probe trigger routine, it would be called for every probe. That would render an application-side cache completely useless, because a request to the KDB would have to be made, and its reply waited for, every time. In this scenario, it would be simpler to just send the signature and wait for a one-byte result containing 0 or 1.

The alternative scenario would trigger a request for a cache update only when a probe is actually executed. While that would reduce the number of update calls, it has two downsides. First, updates to the KDB signature cache are only propagated on active probes; if the program is not triggering any probe, no updates can be made. Second, every probe execution would delay the overall execution, as the update request must be processed.
In consequence, the application side needs some sort of update thread, dedicated to listening for signature update requests.

We therefore propose a classic model with two application threads and two KDB threads to implement the two-way communication. On the application side, the application main thread includes probes, which first check if they are active and, if so, execute their code. On cache misses, they send the signature to the KDB and request an answer. This operation also updates the signature cache. If the probe activation patterns are changed on the KDB side, the KDB reevaluates the signature cache and sends an update list to the application side.

Figure 2 illustrates the interaction of one probe in the application main thread with the KDB. The figure focuses on the interaction on cache misses. Therefore, the send data event between Send Data to Bridge and Receive Command is omitted.

Figure 2: Behavior model in the adaptive monitoring scenario for a probe and the KDB

These two activity charts can be embedded into the present concept of the KDB, as the communication is initiated by the probe. However, to handle signature cache updates on the KDB side, a push mechanism must be modeled as well. The push mechanism is also triggered by an event, which originates either from a JMX call or from the periodically invoked pattern reader ConfigFileReader inside the ProbeController.

As the JMX method is more suitable for the KDB, the following proposal focuses on that technology for pattern modification. Via JMX, methods of the MonitoringController can be called, especially the methods

  • boolean activateProbe(final String pattern)
  • boolean deactivateProbe(final String pattern)

Both methods take one pattern string and add it to the pattern list by invoking corresponding functions in the ProbeController. The present implementation of the ProbeController then invalidates its signature cache and registers the pattern and the activation state in an internal list.

For the KDB, this does not suffice, as the signature cache on the probe side must be updated as well. Therefore, the ProbeController must be extended to use the ServiceContainer and cause an update of the application signature cache through one of the ServiceContainer methods. As stated earlier, this must be done in a separate thread. Fortunately, we already have that thread, as the JMX MBeanServer runs in one. Also, the file update mechanism is a runnable and is executed by a scheduler in its own thread.

Realization of the Adaptive KDB

The previous section explained some of the design issues of the KDB; in this section we propose an implementation of the design within the KDB. This can be accomplished by modifying and extending three places in the current implementation. First, the ServiceContainer requires one additional method, implementing an update push. Second, the IServiceConnector interface requires a push method signature, and its implementations the corresponding push method. And third, the ProbeController must be extended to trigger the ServiceContainer.

For the ServiceContainer and the ServiceConnectors, two implementation options are available. We could use synchronous calls throughout, which would cause the JMX thread to be blocked until the update is sent. Or we could use an asynchronous approach, where we deliver the updates internally either to one other thread or to a new thread instantiated for every update.

The last method is quite simple on the KDB implementation side, but would require unique and strictly increasing transaction ids to be able to sort out updates that arrive out of order. This approach would increase the memory footprint on the application side and require extra execution time, as for every update not only the signature must be found, but also the id must be checked. As branches can be very expensive on modern CPUs, we should avoid this.

The second approach would not require such transaction ids, but it would require an internal cache for updates generated by the ProbeController. Furthermore, it does not require the instantiation of a thread for every update request. Its advantage is that the JMX or ConfigFileReader thread can terminate or go back to listening directly after it calculates the update. Its disadvantages are that it still requires another thread to handle the update transmission, and that, if updates appear faster than they can be transmitted, the internal buffer would grow. The latter would mainly occur for slow remote connections, while on fast connections the delay of a blocking synchronous approach would be minimal anyway.

We therefore propose the synchronous approach. Every time the ConfigFileReader is triggered and updates are computed, or every time a JMX pattern change causes an update, the event handling thread can directly call the ServiceContainer's push method and communicate the updates. After that communication, the thread can terminate or listen for the next event.

In a Java environment, the probes call the ProbeController and hand over the method signature as a string to ask whether the probe is active or inactive. This can be slow, but might be necessary in a Java environment. Also, there is only one lookup per call. However, for the signature cache there can be multiple signature cache updates at once, which would require a search for all updates. Furthermore, all the signatures must be sent over the (Internet) connection.

A more efficient solution is the use of signature ids. Every time a signature is not present in the application cache, the signature is sent to the KDB, which answers with a boolean value, indicating the probe status, and a signature id. These ids are generated uniquely in the KDB. In future updates, only a boolean and a signature id suffice to describe the update.

A slightly different approach would generate the id on the application side. On a cache miss, a signature is added to the signature cache on the application side, and then the signature is sent together with the cache slot number to the KDB, which answers with a boolean and the slot number. This implementation would require no complicated lookup on the application side and reduce the load, for the price of 4 bytes more traffic towards the KDB for every cache miss. As many method signatures are at least 80 characters long, the payload overhead is approx. 5%. Together with the network communication overhead, it is negligible.

The last piece to build is a suitable ProbeController. The present ProbeController throws away its signature cache on pattern updates and regenerates the cache. This makes sense in a Java environment, because long running update threads working on common data may result in long probe activation status checks. In the KDB, the lookup is decoupled. Therefore, the update does not need to dump the signature cache. Even better, it should preserve the cache and calculate a delta for the application side. If the cache were dropped, the application-side cache would have to be dropped too, and in consequence, the application signature cache would have to be rebuilt.

Protocol Specification

We have two communication directions in the adaptive monitoring, which require a proper protocol for communication. This protocol must be able to work in a wide range of transport protocol contexts, which requires data serialization.

The first protocol describes communication initiated by a probe. There are two types of communication coming from a probe: either a data record, as in the present KDB, or a signature request. Based on the present serialization scheme, the necessary extension can be modeled via the existing class type indicator. The difference is that, besides normal Kieker IMonitoringRecord type ids, command ids are allowed as well. This requires reserving some ids for KDB commands, moving the lowest possible id for IMonitoringRecord types to 32. This allows us to specify 32 different commands, which should be sufficient for future extensions.

The record structure for signature requests can be defined as follows:

  1. 4 bytes (int32), defining the data type. In this case the id is 0 for SignatureRequest.
  2. 4 bytes (int32), defining the signature cache position
  3. 4 bytes (int32), defining the length of the signature string
  4. A byte sequence representing the string content

The string notation with the size prefix allows faster read operations; otherwise, we would either need to read byte by byte until we read 0, or read into a buffer and parse it.
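
A probe-side serialization of this request can be sketched as follows in Java. The class name, the UTF-8 encoding of the signature, and big-endian (network) byte order are assumptions for illustration, not the normative KDB wire format:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Sketch of serializing a signature request per the four fields above.
public final class SignatureRequest {
    public static final int TYPE_ID = 0; // reserved command id for SignatureRequest

    public static byte[] serialize(final int cachePosition, final String signature) {
        final byte[] sig = signature.getBytes(StandardCharsets.UTF_8);
        // ByteBuffer defaults to big-endian (network byte order)
        final ByteBuffer buffer = ByteBuffer.allocate(4 + 4 + 4 + sig.length);
        buffer.putInt(TYPE_ID);       // 1. data type id
        buffer.putInt(cachePosition); // 2. signature cache position
        buffer.putInt(sig.length);    // 3. length of the signature string
        buffer.put(sig);              // 4. string content
        return buffer.array();
    }
}
```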

The answer from the KDB to the application probe controller has the following serialization:

  1. 4 bytes (int32), defining the data type (in this case 0)
  2. 4 bytes (int32), defining the signature cache position
  3. 1 byte (int8), representing true or false (0 or 1), indicating whether the probe is active or inactive.

The same structure is also used for updates triggered by the KDB. These records also have a prefix containing a 4 byte integer representing the answer type id. This is necessary to allow future extensions without breaking code.
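
On the application side, reading one such answer could be sketched like this; the class and field names are invented, and big-endian byte order is again assumed:

```java
import java.nio.ByteBuffer;

// Sketch of deserializing one KDB answer per the three fields above.
public final class SignatureResponse {
    public final int typeId;         // answer type id
    public final int cachePosition;  // signature cache position
    public final boolean active;     // probe active or inactive

    public SignatureResponse(final byte[] data) {
        final ByteBuffer buffer = ByteBuffer.wrap(data); // big-endian by default
        this.typeId = buffer.getInt();
        this.cachePosition = buffer.getInt();
        this.active = buffer.get() != 0;
    }
}
```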

For plain TCP connections, it makes no difference whether one or multiple updates are sent in a chunk, as TCP can be used as a stream. However, in many cases using a block reader can be faster in low-level programming languages. In addition, with other technologies, like JMS, it might be advisable to send updates in groups. Therefore, multiple records must be storable in one message or communication event. To distinguish these two types, a second data type id is required.

The format is defined as:

  1. 4 bytes (int32), defining the data type (in this case 1)
  2. 4 bytes (int32), defining the nested type (in this case 0)
  3. n times, where n is the number of transmitted signature cache updates:
    • 4 bytes (int32), defining the signature cache position
    • 1 byte (int8), representing true or false (0 or 1), indicating whether the probe is active or inactive.
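
Writing such a grouped update message could look as follows; class names, the update representation, and big-endian byte order are assumptions made for this sketch:

```java
import java.nio.ByteBuffer;
import java.util.List;

// Sketch of serializing a grouped signature cache update message
// following the three-field layout above.
public final class SignatureUpdateBatch {
    public static final int BATCH_TYPE_ID = 1;  // data type id for grouped updates
    public static final int UPDATE_TYPE_ID = 0; // nested type id

    // each update is a pair { cachePosition, activeFlag (0 or 1) }
    public static byte[] serialize(final List<int[]> updates) {
        final ByteBuffer buffer = ByteBuffer.allocate(4 + 4 + updates.size() * 5);
        buffer.putInt(BATCH_TYPE_ID);
        buffer.putInt(UPDATE_TYPE_ID);
        for (final int[] update : updates) {
            buffer.putInt(update[0]);      // signature cache position
            buffer.put((byte) update[1]);  // 1 = active, 0 = inactive
        }
        return buffer.array();
    }
}
```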

Summary

The presented proposal specifies the generic parts of the KDB extension for adaptive monitoring. It defines the transport serialization format for all technologies based on streams of buffers. Furthermore, it explains which components require modification. However, it is not a complete design document.