Proposal: Kieker Data Bridge and Adaptive Monitoring

The Kieker Data Bridge is a service designed to support Kieker probes for non-JVM languages. Instead of reimplementing the wide range of Kieker data stores, the bridge makes it possible to implement Kieker records and probes for non-JVM languages, like Perl or C, which send their monitoring data to the bridge for conversion and storage. While it is possible to run the KDB and the monitored application on the same machine, we use the terms remote site or remote application to refer to the monitored application from the viewpoint of the KDB. Likewise, the KDB is called remote from the viewpoint of the monitored application.

The present design of the Kieker Data Bridge (KDB) is only able to handle incoming Kieker monitoring records and store them in a Kieker data store. However, Kieker has evolved and now supports adaptive monitoring based on regular expression patterns for method or function signatures. In our effort to support a wide range of programming and modeling languages, the Kieker bridge has to be extended accordingly.

In this blog post, we discuss this feature and the different implementation ideas. First, we explain the present KDB behavior and the behavior specification of probes. Second, the properties of the adaptation feature are defined, and then used in the third section to discuss solution concepts. Fourth, we define a new behavior for probes and the bridge.

Behavior Model of the KDB and Probes

The Kieker Data Bridge, in its current design, is configured on startup by dynamically loading Kieker record classes, configuring a Kieker monitoring controller, and setting up a network connection with one of its network connectors for TCP or JMS. Then it starts listening for incoming records, as shown in Figure 1.

Figure 1: KDB and probe behavior illustrated in an activity diagram

Of course, there can be more than one probe, and each probe can be triggered multiple times. Each time a probe is triggered, it sends a record to the KDB, which then processes the data and stores it in a Kieker store; the store can use a number of different storage methods, like files, databases, or message queues.

While the KDB design can use multiple threads to handle incoming data, the probe can run embedded in normal code and does not require a task switch. The TCP and JMS network connectors of the KDB are single-threaded, as they can block while waiting for incoming data. This results in low system overhead for the transmission of record data.

Understanding the Adaptive Monitoring Feature

Adaptive monitoring allows probes to be activated and deactivated at runtime. Every time a probe is triggered, the system must evaluate whether the triggered probe is active and, only if so, collect data and store it with Kieker.

The adaptive monitoring in Kieker is based on regular expression patterns, which describe method or function signatures similar to the expressions used in AspectJ. In general, when a probe is triggered, it passes the method's signature as a string to the ProbeController, which checks if the given signature matches any stored pattern. Successful matches are stored in an internal signature cache to speed up later lookups. The controller returns either true or false, indicating whether the probe can proceed with data collection and storage or must skip those tasks.

These lookups are performed every time a probe is triggered. Therefore, the cost of the evaluation should be minimized as much as possible. In the present implementation of the probes, the method signature is provided to the ProbeController, which first checks it against a signature cache. Only on a cache miss is the signature passed to a pattern matching routine.
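As an illustration of this cache-first lookup, consider the following Java sketch. It is a simplified stand-in, not the actual Kieker ProbeController implementation; all class and field names are chosen for illustration.

import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.regex.Pattern;

public class SignatureActivationCheck {

	private final Map<String, Boolean> signatureCache = new ConcurrentHashMap<String, Boolean>();
	private final List<Pattern> activationPatterns; // compiled activation patterns

	public SignatureActivationCheck(final List<Pattern> activationPatterns) {
		this.activationPatterns = activationPatterns;
	}

	/** Cache-first check: only a cache miss triggers the expensive pattern matching. */
	public boolean isProbeActivated(final String signature) {
		Boolean active = this.signatureCache.get(signature);
		if (active == null) { // cache miss
			active = Boolean.valueOf(this.matchesAnyPattern(signature));
			this.signatureCache.put(signature, active);
		}
		return active.booleanValue();
	}

	private boolean matchesAnyPattern(final String signature) {
		for (final Pattern pattern : this.activationPatterns) {
			if (pattern.matcher(signature).matches()) {
				return true;
			}
		}
		return false;
	}
}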

In the KDB context, each time a probe is triggered in a remote program, the activation lookup must be performed prior to the data collection. If this lookup had to use IPC for every call, the overall response time of the probes would be unacceptably slow. In particular, the presently implemented TCP and JMS connections add considerable latency to remote calls.

Another problem arises when patterns for probe activation are changed. In that case, a probe activation cache must be invalidated or at least updated. The ProbeController performs such invalidation, but in the KDB context, these changes must be communicated to the remote application, ideally before the next probe is triggered. However, the probe design should stay as simple as possible.

Design Concepts for Adaptive Monitoring with KDB

The adaptive monitoring via KDB must solve all the above problems and provide a general approach which is suitable for a wide range of languages and run-time contexts.

In many bi-directional communication scenarios, two threads are used to realize the communication. One thread listens for incoming requests, while the other actively processes data and initiates communication when necessary. For monitoring probes, this is a complex solution. First, it requires multi-threading, which is not available in all run-time environments; in some contexts it is expensive, and in some contexts threads are hard to handle. Second, it requires synchronization and could lead to a wide range of race conditions or other execution artifacts you do not want to handle in monitoring probes, as they would add to the overall execution time. A simpler solution for two-way communication is a plain query-response pattern, initiated by the probe, either when the probe is triggered or when the probe is executed.

If the update for the signature cache is requested in the probe trigger routine, it would be called for every probe. That would render an application-side cache completely useless, because a request to the KDB would have to be made, and its reply awaited, on every trigger. In this scenario, it would be simpler to just send the signature and wait for a one-byte result containing 0 or 1.

The alternative scenario would trigger a request for a cache update only when a probe is actually executed. While that would reduce the number of update calls, it has two downsides. First, updates in the KDB signature cache are only propagated on active probes. If the program is not triggering any probe, no updates can be made. Second, every probe execution would delay the overall execution, as the update request must be processed.
In consequence, the application side needs some sort of update thread, dedicated to listening for signature update requests.

We therefore propose a classic model with two application threads and two KDB threads to implement the two-way communication. On the application side, the application main thread includes probes, which first check if they are active and, if so, execute their code. On cache misses, they send the signature to the KDB and request an answer. This operation also updates the signature cache. If the probe activation patterns are changed on the KDB side, the KDB reevaluates the signature cache and sends an update list to the application side.

Figure 2 illustrates the interaction of one probe in the application main thread with the KDB. The figure focuses on the interaction on cache misses. Therefore, the send data event between Send Data to Bridge and Receive Command is omitted.

Figure 2: Behavior model in the adaptive monitoring scenario for a probe and the KDB

These two activity charts can be embedded into the present concept of the KDB, as the communication is initiated by the probe. However, to handle signature cache updates on the KDB side, a push mechanism must be modeled as well. The push mechanism is also triggered by an event, which originates either from a JMX call or from the periodically invoked pattern reader ConfigFileReader inside the ProbeController.

As the JMX method is more suitable for the KDB, the following proposal focuses on that technology for pattern modification. Via JMX, methods of the MonitoringController can be called, especially the methods

  • boolean activateProbe(final String pattern)
  • boolean deactivateProbe(final String pattern)

Both methods take one pattern string and add it to the pattern list by invoking corresponding functions in the ProbeController. The present implementation of the ProbeController then invalidates its signature cache and registers the pattern and the activation state in an internal list.
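A pattern change could then be triggered remotely as in the following sketch, using only the standard javax.management API. The JMX service URL and the MBean object name are assumptions that depend on the monitoring configuration of the running KDB.

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public final class ProbePatternClient {

	public static void main(final String[] args) throws Exception {
		// Service URL and MBean name are examples; both depend on the
		// kieker.monitoring.properties of the running KDB.
		final JMXServiceURL url = new JMXServiceURL(
				"service:jmx:rmi:///jndi/rmi://localhost:59999/jmxrmi");
		final JMXConnector jmx = JMXConnectorFactory.connect(url);
		try {
			final MBeanServerConnection connection = jmx.getMBeanServerConnection();
			final ObjectName controller = new ObjectName(
					"kieker.monitoring:type=MonitoringController");
			// invoke boolean deactivateProbe(final String pattern)
			final Object result = connection.invoke(controller, "deactivateProbe",
					new Object[] { "public * Bookstore.searchBook(..)" },
					new String[] { String.class.getName() });
			System.out.println("deactivateProbe returned " + result);
		} finally {
			jmx.close();
		}
	}
}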

For the KDB, this does not suffice, as the signature cache on the probe side must be updated as well. Therefore, the ProbeController must be extended to use the ServiceContainer and cause an update of the application signature cache through one of the ServiceContainer methods. As stated earlier, this must be done in a separate thread. Fortunately, we already have that thread, as the JMX MBeanServer runs in one. Also, the file update mechanism is a runnable and is executed by a scheduler in its own thread.

Realization of the Adaptive KDB

The previous section explained some of the design issues of the KDB; in this section we propose an implementation of the design within the KDB. This can be accomplished by modifying and extending three places in the current implementation. First, the ServiceContainer requires one additional method, implementing an update push. Second, the IServiceConnector interface requires a push method signature, and each connector implementation a corresponding push method (see the interface sketch below). Third, the ProbeController must be extended to trigger the ServiceContainer.
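The extension of the connector interface could look like the following sketch. The push method's name and parameters are proposals for illustration only, while setup(), close(), and deserialize() correspond to the existing hooks of the interface.

import kieker.common.record.IMonitoringRecord;

public interface IServiceConnector {

	/** Setup the data source (open a socket, connect to a queue, ...). */
	void setup() throws Exception;

	/** Close and cleanup the source connection. */
	void close() throws Exception;

	/** Retrieve and deserialize the next record from the data source. */
	IMonitoringRecord deserialize() throws Exception;

	/**
	 * Proposed addition (name and signature are illustrative): push one
	 * signature cache update to the remote application.
	 *
	 * @param cacheSlot position in the application-side signature cache
	 * @param active new activation state for that signature
	 */
	void pushSignatureUpdate(int cacheSlot, boolean active) throws Exception;
}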

For the ServiceContainer and the ServiceConnectors, two implementation options are available. We could use synchronous calls throughout, which would cause the JMX thread to be blocked until the update is sent. Or we could use an asynchronous approach, where we deliver the updates internally either to one other thread or to a new thread instantiated for every update.

The latter method is quite simple on the KDB implementation side, but would require unique and strictly increasing transaction ids to be able to sort out updates that arrive out of order. This approach would increase the memory footprint on the application side and require extra execution time, as for every update not only the signature must be found, but also the id must be checked. As branches can be very expensive on modern CPUs, we should avoid this.

The second approach would not require such transaction ids, but it would require an internal cache for updates generated by the ProbeController. Furthermore, it does not require the instantiation of a thread for every update request. Its advantage is that the JMX or ConfigFileReader thread can terminate or go back to listening directly after it calculates the update. Its disadvantages are that it still requires another thread to handle the update transmission, and that the internal buffer would grow if updates appear faster than they can be transmitted. The latter might be a rare case for slow remote connections, but on fast connections the delay of a blocking synchronous approach would be minimal.

Therefore, we propose the synchronous approach. Every time the ConfigFileReader is triggered and updates are computed, or every time a JMX pattern change causes an update, the event handling thread can directly call the ServiceContainer's push method and communicate the updates. After that communication, the thread can terminate or listen for the next event.

In a Java environment, the probes call the ProbeController and hand over the method signature as a string to ask if the probe is active or inactive. This can be slow, but might be necessary in a Java environment. Also, there is only one lookup per call. However, for the signature cache updates there can be multiple updates at once, which would require a search for all updated signatures. Furthermore, all the signatures must be sent over the (Internet) connection.

A more efficient solution is the use of signature ids. Every time a signature is not present in the application cache, the signature is sent to the KDB, which answers with a boolean value, indicating the probe status, and a signature id. These ids are generated uniquely by the KDB. In future updates, a boolean and a signature id suffice to describe the update.

A slightly different approach would generate the id on the application side. On a cache miss, a signature is added to the signature cache on the application side, and then the signature is sent together with the cache slot number to the KDB, which answers with a boolean and the slot number. This implementation would require no complicated lookup on the application side and reduce the load, at the price of 4 bytes more traffic towards the KDB for every cache miss. As many method signatures are at least 80 characters long, the payload overhead is approx. 5%. Together with the network communication overhead, it is negligible.

The last piece to build is a suitable ProbeController. The present ProbeController throws away its signature cache on pattern updates and regenerates the cache. This makes sense in a Java environment, because long running update threads working on common data may result in long probe activation status checks. In the KDB, the lookup is decoupled. Therefore, the update does not need to dump the signature cache. Even more, it should preserve the cache and calculate a delta for the application side. If the cache were dropped, the application-side cache would have to be dropped too, and in consequence, the application signature cache would have to be rebuilt.
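The delta calculation could be sketched as follows, as a fragment of such an extended ProbeController. The helpers matchesActivePattern, slotOf, and pushSignatureUpdate are hypothetical names used for illustration.

	/**
	 * Sketch only: on a pattern change, re-evaluate every cached signature
	 * and push only the entries whose activation state actually changed,
	 * instead of dumping and rebuilding both caches.
	 */
	private void propagatePatternChange(final Map<String, Boolean> signatureCache)
			throws Exception {
		for (final Map.Entry<String, Boolean> entry : signatureCache.entrySet()) {
			final boolean newState = this.matchesActivePattern(entry.getKey());
			if (newState != entry.getValue().booleanValue()) {
				entry.setValue(newState); // preserve the KDB-side cache, only update states
				// transmit only the changed entry (the delta) to the application
				this.serviceContainer.pushSignatureUpdate(this.slotOf(entry.getKey()), newState);
			}
		}
	}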

Protocol Specification

We have two communication directions in the adaptive monitoring, which require a proper protocol. This protocol must be able to work in a wide range of transport protocol contexts, which requires data serialization.

The first protocol describes communication initiated by a probe. There are two types of communication coming from a probe: either it is a data record, as in the present KDB, or it is a signature request. Based on the present serialization scheme, the necessary extension can be modeled via the existing class type indicator. The difference is that, beside normal Kieker IMonitoringRecord type ids, command ids are allowed as well. This requires reserving some ids for KDB commands, moving the lowest possible id for IMonitoringRecord types to 32. This allows us to specify 32 different commands, which should suffice for future extensions.

The record structure for signature requests can be defined as follows (a serialization sketch in Java is given after the list):

  1. 4 bytes (int32), defining the data type. In this case the id is 0 for SignatureRequest.
  2. 4 bytes (int32), defining the signature cache position
  3. 4 bytes (int32), defining the length of the signature string
  4. A byte sequence representing the string content
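The following Java sketch shows how such a request could be serialized. The class and constant names are illustrative; note that ByteBuffer's default big-endian order matches the network byte order used throughout this post.

import java.nio.ByteBuffer;
import java.nio.charset.Charset;

public final class SignatureRequestSerializer {

	private static final int TYPE_SIGNATURE_REQUEST = 0;

	/** Serialize a signature request following the format above. */
	public static byte[] serialize(final int cacheSlot, final String signature) {
		final byte[] content = signature.getBytes(Charset.forName("UTF-8"));
		final ByteBuffer buffer = ByteBuffer.allocate(4 + 4 + 4 + content.length);
		buffer.putInt(TYPE_SIGNATURE_REQUEST); // data type id (0 = SignatureRequest)
		buffer.putInt(cacheSlot);              // signature cache position
		buffer.putInt(content.length);         // length of the signature string
		buffer.put(content);                   // string content
		return buffer.array();
	}
}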

The string notation with the size prefix allows faster read operations; otherwise we either need to read byte by byte until we reach 0, or we read into a buffer and must parse the buffer.

The answer from the KDB to the application probe controller has the following serialization:

  1. 4 bytes (int32), defining the data type (in this case 0)
  2. 4 bytes (int32), defining the signature cache position
  3. 1 byte (int8), representing true or false (0 or 1) indicating the probe is active or inactive.

The same structure is also used for updates triggered by the KDB. These records also have a prefix containing a 4 byte integer representing the answer type id. This is necessary to allow future extensions without breaking code.

For plain TCP connections, it makes no difference whether one or multiple updates are sent in a chunk, as TCP can be used as a stream. However, in many cases using a block reader can be faster in low-level programming languages. In addition, with other technologies, like JMS, it might be advisable to send updates in groups. Therefore, multiple records must fit into one message or communication event. To distinguish these two types, a second data type id is required.

The format is defined as follows (a reading sketch in Java is given after the list):

  1. 4 bytes (int32), defining the data type (in this case 1)
  2. 4 bytes (int32), defining the nested type (in this case 0)
  3. n times, where n is the number of updates in the message
    • 4 bytes (int32), defining the signature cache position
    • 1 byte (int8), representing true or false (0 or 1) indicating the probe is active or inactive.
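A reading routine for such a bulk update could look like the following sketch. Since the format carries no explicit entry count, the sketch derives n from the message length, which is an assumption on how JMS-style block messages would be framed.

import java.io.DataInput;
import java.io.IOException;

public final class UpdateReader {

	/**
	 * Read one KDB-initiated bulk update message. The entry count is
	 * derived from the message length, as the format itself contains no
	 * explicit count field (an assumption of this sketch).
	 */
	public static void readUpdates(final DataInput in, final int messageLength,
			final boolean[] applicationCache) throws IOException {
		final int answerType = in.readInt(); // 1 = bulk update
		final int nestedType = in.readInt(); // 0 = signature state update
		if ((answerType != 1) || (nestedType != 0)) {
			throw new IOException("Unexpected update message type");
		}
		int remaining = messageLength - 8;
		while (remaining >= 5) { // 4 bytes slot + 1 byte state per entry
			final int cacheSlot = in.readInt();
			final boolean active = in.readByte() != 0;
			applicationCache[cacheSlot] = active; // apply to application-side cache
			remaining -= 5;
		}
	}
}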

Summary

The presented proposal specifies the generic parts of the KDB extension for adaptive monitoring. It defines the transport serialization format for all technologies based on streams of buffers. Furthermore, it explains which components require a modification. However, it is not a complete design document.

Kieker Data Bridge

Kieker is a Java-based monitoring and analysis framework, which can be used to instrument any kind of Java application, either by directly introducing instrumentation code or by using AOP techniques such as AspectJ. Furthermore, it can be introduced into servlet contexts. Monitoring data can be stored in files or databases, or passed through messaging services, and later be processed with our analysis tools. For more detail, you may read the user guide or visit our wiki and ticket site.

As not all applications on earth are written in Java, other languages cannot be directly instrumented by the Java framework. This shortcoming has been addressed for particular languages in the DynaMod (C#, VB6, Cobol) and MENGES (IEC 61131-3 languages) projects, which ended in 2012. In the active project Pubflow, Kieker is used to instrument different languages, including Perl and Java. In the near future, additional monitoring scenarios will be addressed in the iObserve project. Therefore, I decided to build a commonly usable Kieker Data Bridge (KDB), which allows support for new host languages to be added in a more elegant way.

The KDB is presently not available in binary form or as installation packages, but its sources are publicly available in a git repository.

Public read only access

git clone http://build.se.informatik.uni-kiel.de/de.cau.se.instrumentation.language.git

Read write access (login required)

git clone git@build.se.informatik.uni-kiel.de:de.cau.se.instrumentation.language

Note: The Kieker Data Bridge is currently integrated into the main Kieker repository. Therefore, the specified URLs will change in the near future.

The Kieker Data Bridge

The Kieker Data Bridge (KDB) is designed to support a wide range of monitoring sources: it allows monitoring to be added to any language and is extensible with respect to the means of data relay. Furthermore, it can be integrated into any other Java application, as the core of the bridge is designed as a container providing all the functionality. That container is embedded in two service implementations, a command line application and an Eclipse plugin.

Kieker Data Bridge Core

The core of the KDB is implemented in ServiceContainer. The class provides central service hooks for Kieker and a main loop, implemented by the run() method, for retrieving records and storing them with a Kieker MonitoringWriter.

The constructor takes two parameters. The first is a Kieker configuration object, which is used to set up the Kieker MonitoringWriter. It can be created with different factory methods provided by the Kieker framework through the ConfigurationFactory. The second is a service connector conforming to the IServiceConnector interface. This interface defines three hooks for a service connector, providing a connector setup, a connector shutdown, and a record receiver method. In detail they are:

  • setup() is used to set up a data source. This can be opening a socket, connecting to a queuing service, or using other data sources, like RMI, Corba, OLE, etc.
  • close() is used to close and cleanup the source connection.
  • deserialize() is used to retrieve and deserialize data from a data source.
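As a usage sketch, the container could be set up as follows. The exact constructor signature and the available connector classes depend on the KDB version, so the connector construction is left as a placeholder.

import kieker.common.configuration.Configuration;
import kieker.monitoring.core.configuration.ConfigurationFactory;

public final class ExampleBridgeService {

	public static void main(final String[] args) throws Exception {
		// Kieker configuration used to set up the MonitoringWriter
		final Configuration configuration = ConfigurationFactory.createSingletonConfiguration();
		// Any IServiceConnector implementation may be passed here
		final IServiceConnector connector = createConnector();
		final ServiceContainer container = new ServiceContainer(configuration, connector);
		container.run(); // main loop: retrieve records and store them with Kieker
	}

	private static IServiceConnector createConnector() {
		return null; // placeholder for, e.g., a TCP or JMS connector
	}
}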

Beside retrieving, deserializing, and storing monitoring records, a user might want to know what is going on or what is going wrong. The ServiceContainer therefore provides a listener registration for IServiceListener, which must implement a handleEvent method.

public interface IServiceListener {

	/**
	 * Called by the main service loop to inform the listener about processed records and
	 * an optional message.
	 * 
	 * @param recordCount number of processed records
	 * @param message optional message (could be null)
	 */
	void handleEvent(long recordCount, String message);
}

The method has two arguments, which represent the number of records transferred and an optional message, which can be null.

In real world use cases, thousands of records could be received every second. Informing a server application about every record that is received without an error would lead to a slow service, which is more occupied with informing the user about new records than with actually transferring data. To avoid this, the internal update method is only called for every 100th record. As this might still be too often, the method setListenerUpdateInterval allows a different update interval to be set.

	/**
	 * Set the update interval for the listener information. The default is 100 records. 
	 * @param listenerUpdateInterval the new update interval in number of records
	 */
	public void setListenerUpdateInterval(final long listenerUpdateInterval) {
		this.listenerUpdateInterval = listenerUpdateInterval;
	}

At present, the Kieker Data Bridge supports five different connection realizations. First, it can act as a service waiting for one incoming connection from a client providing monitoring records. Second, it can be run as a service allowing multiple sources to connect and reconnect. Third, it can connect itself to a monitoring record provider by acting as a client. Fourth, it can be a JMS listener. And fifth, as the setup of a JMS messaging queue might be difficult, it can provide one itself and auto-connect to it.

Record Formats

These five implementations are able to receive monitoring data. To make any sense of it, the data must follow a predefined format. Right now two format schemes have been defined and are used in the TCP (binary) and JMS (binary and textual) implementations.

In general, both formats must be able to identify record types. These types are encoded with numbers mapping to Kieker IMonitoringRecord classes. For all currently implemented services, this mapping must be provided by a Map<Integer, Class<IMonitoringRecord>>, which has to be composed by a server application. The two present implementations use a mapping configuration and dynamically load classes from URIs.
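A minimal sketch of composing such a map by hand follows; the ids and class names mirror the mapping file example shown in the CLI section below.

import java.util.HashMap;
import java.util.Map;

import kieker.common.record.IMonitoringRecord;

public final class RecordMapFactory {

	/** Compose the id-to-class mapping used by the service connectors. */
	@SuppressWarnings("unchecked")
	public static Map<Integer, Class<IMonitoringRecord>> createRecordMap()
			throws ClassNotFoundException {
		final Map<Integer, Class<IMonitoringRecord>> map =
				new HashMap<Integer, Class<IMonitoringRecord>>();
		map.put(1, (Class<IMonitoringRecord>) Class.forName(
				"kieker.common.record.flow.trace.operation.BeforeOperationEvent"));
		map.put(2, (Class<IMonitoringRecord>) Class.forName(
				"kieker.common.record.flow.trace.operation.AfterOperationEvent"));
		return map;
	}
}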

Binary Record Format

The binary record format is defined with a TCP connection in mind, which ensures that, beside connection interruptions, all sent data is received in the same order. Therefore, no additional transmission control is defined.

A record starts with a 32 bit signed integer (as Java is not able to handle unsigned values) identifying the data type. The rest of the data stream is determined by the data structure implemented in the corresponding IMonitoringRecord described in its TYPES property. Each property is read, interpreted and then stored in a record of the right type.

Numeric values are all in network byte order (which is big endian, by the way). As Java supports both primitive types (lower case) and classes for primitive types, the deserialization routine must support both representations. The TYPES array of an IMonitoringRecord must be used to identify the correct Java type. The following primitive types are supported (conforming to the serialization described in DataInput); a deserialization sketch is given after the list.

  • Boolean: one byte; non-zero = true, zero = false
  • Byte: one byte (signed 8 bit)
  • Short: two bytes (signed 16 bit)
  • Integer: four bytes (signed 32 bit)
  • Long: eight bytes (signed 64 bit)
  • Float: four bytes (IEEE 754 floating-point “single format” bit layout)
  • Double: eight bytes (IEEE 754 floating-point “double format” bit layout)
  • String: four bytes indicating the buffer length of the String, followed by n bytes representing the buffer content. The String is encoded in UTF-8.
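The following sketch reads one value per TYPES entry according to the format above. It is a simplified illustration, not the actual KDB deserialization code.

import java.io.DataInput;
import java.io.IOException;

public final class BinaryValueReader {

	/** Read one value according to one entry of a record's TYPES array. */
	public static Object readValue(final DataInput in, final Class<?> type)
			throws IOException {
		if ((type == boolean.class) || (type == Boolean.class)) {
			return Boolean.valueOf(in.readByte() != 0);
		} else if ((type == byte.class) || (type == Byte.class)) {
			return Byte.valueOf(in.readByte());
		} else if ((type == short.class) || (type == Short.class)) {
			return Short.valueOf(in.readShort());
		} else if ((type == int.class) || (type == Integer.class)) {
			return Integer.valueOf(in.readInt());
		} else if ((type == long.class) || (type == Long.class)) {
			return Long.valueOf(in.readLong());
		} else if ((type == float.class) || (type == Float.class)) {
			return Float.valueOf(in.readFloat());
		} else if ((type == double.class) || (type == Double.class)) {
			return Double.valueOf(in.readDouble());
		} else if (type == String.class) {
			final int length = in.readInt(); // buffer length prefix
			final byte[] buffer = new byte[length];
			in.readFully(buffer);
			return new String(buffer, "UTF-8");
		}
		throw new IOException("Unsupported type " + type.getName());
	}
}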

Textual Record Format

The textual format is similar to the binary format. However, the received package is one large String containing all values. The values are separated by a semicolon (;). Therefore, strings need to escape ; with a backslash (\). The whole text is encoded in UTF-8, and the usual Java parse* methods are used to convert values to numbers. A String in textual representation is NOT preceded by an integer indicating its length, because the String length is determined by the next ; or the end of the message.
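For illustration only, a textual message might look like the following line. The number and order of fields depend entirely on the record's TYPES array, and the values here are invented; note the escaped semicolon inside the signature string.

1;1361874587000000000;42;0;public void Shop.buy(int\; int);Shop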

Instrumentation in Perl

The Perl instrumentation is a work in progress and has initially been developed by Nis Börge Wechselberg in his bachelor thesis. At present, an example implementation is available at de.cau.se.measure.instrumentation.language/de.cau.cs.se.kieker.service.perltest/src/Kieker in the repository. It comprises monitoring records corresponding to the event-based set of Kieker monitoring records: BeforeOperationEvent.java (OperationEntryEvent.pm), AfterOperationEvent.java (OperationExitEvent.pm), and Trace.java (Trace.pm). In the future, the different record types can be generated with an instrumentation record description language, which will provide generators for a wide range of host languages.

Monitoring probes in Perl create new records by directly instantiating a record class, filling in parameter values, and finally calling a writer instance to store the records. The writer used in conjunction with the Kieker Data Bridge is JMSWriter.pm. This writer requires an active JMS message queue service. In our studies we used ActiveMQ, which provides support for the Stomp message protocol, used by Perl, and a Stomp-JMS mapping.

Instrumentation in C

The instrumentation of C code, or of other languages able to use C object files, is in its infancy. However, the basic primitives for monitoring programs written in C have been implemented. The code is available in the KDB repository in the directory de.cau.se.measure.instrumentation.language/de.cau.cs.se.kieker.service.testclient. The library code and one example record type are located in src/kieker and src/kieker/records respectively.

  • src/kieker
    • socket.[hc] implements some convenience functions to establish and handle TCP connections.
    • binary_serializer.[hc] implements primitives for data serialization conforming to the format defined above.
  • src/kieker/records
    • operation_execution_record.[hc] implements a Kieker record structure for an operation execution record, which conforms to the Java pendant OperationExecutionRecord.java, and a specialized serialization function for that record type.

It is intended to extend the C instrumentation library and use a generator to produce a wide range of record types. Furthermore, an automated weaving mechanism should be implemented to ease the use of instrumentation for C-based languages.

The Command Line Server

The current KDB implements two servers based on the KDB core. One is the command line server (CLI), which provides most functions of the KDB and has a rich set of options.

usage: cli-kieker-service [-d] [-h <hostname>] [-k <configuration>] -L
       <paths> [-l <jms-url>] -m <map-file> [-p <number>] [-s] -t <type>
       [-u <username>] [-v <arg>] [-w <password>]
 -d,--daemon                   detach from console; TCP server allows
                               multiple connections
 -h,--host <hostname>          connect to server named <hostname>
 -k,--kieker <configuration>   kieker configuration file
 -L,--libraries <paths>        List of library paths separated by :
 -l,--url <jms-url>            URL for JMS server
 -m,--map <map-file>           Class name to id (integer or string)
                               mapping
 -p,--port <number>            listen at port (tcp-server or jms-embedded)
                               or connect to port (tcp-client)
 -s,--stats                    output performance statistics
 -t,--type <type>              select the service type: tcp-client,
                               tcp-server, tcp-single-server, jms-client,
                               jms-embedded
 -u,--user <username>          user name for a JMS service
 -v,--verbose <arg>            output processing information
 -w,--password <password>      password for a JMS service

The primary option is -t. It determines which type of data source the bridge will use and which other parameters are required.

A tcp-client requires a port and a host name to connect to and receive data from the data source. A tcp-server or tcp-single-server opens a port for listening; therefore only the port is necessary. A jms-client requires a URL of a JMS service. In addition, that service might require a user name and password.

Beside the configuration for the data source, the server requires a set of Kieker MonitoringRecords, which must be provided by a library. Normally this is kieker-1.6.jar. If a user defines new record types, they must be provided in the same way.

Subsequently, a mapping file (-m filename) must be specified, which maps numerical ids to fully qualified Java class names, as shown in the following listing.

1=kieker.common.record.flow.trace.operation.BeforeOperationEvent
2=kieker.common.record.flow.trace.operation.AfterOperationEvent
3=kieker.common.record.flow.trace.Trace
10=kieker.common.record.controlflow.OperationExecutionRecord

Ids and names are separated by an equal sign (=). Finally, a Kieker configuration file should be specified. If no configuration is provided, the server tries to use the default configuration.
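For example, a single-connection TCP server listening on port 10133 could be started as follows; all file paths are illustrative:

cli-kieker-service -t tcp-single-server -p 10133 -L /opt/kieker/kieker-1.6.jar -m record-mapping.map -k kieker.monitoring.properties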

The Eclipse Plugin

When using Kieker in a development rather than a pure monitoring environment, an integration into Eclipse could be helpful. The Eclipse plugin for KDB provides such integration.

The Eclipse plugin comes with a run configuration (Kieker Service) for the KDB, where the same options are available. The configuration is organized in two tabs: one for the basic connectivity and the project, and a second tab to configure the mapping. While the plugin uses normal Kieker configuration files, the mapping is serialized with Eclipse services, and therefore the mapping file from the CLIServer cannot be reused directly.