17 March 2017

Lightning Memory-mapped Database on OSGi

Now and then I get what I call a niche requirement, something that does not adhere to any standard or good practice. These kinds of requirements are usually very development intensive, time consuming and great fun, or a real PITA when having to deal with a deadline.

I am in need of a fast, local, in-process atomic counter for daily file name creation, one that is thread- and crash-safe, runs within a Java/OSGi environment and can participate in XA transactions. The files being published are consumed and modified by other applications as soon as they are marked with a done file, making a stateless file counting solution impossible. An RDBMS springs to mind when thinking of state, but the overhead in production would be disproportionate, as the data generated is only valid for a short period of time and in limited scope.

After turning my toolbox upside down and finding nothing even remotely fitting this job description, I heated up the search engines, ready to burn time. My data structure is simple and the amount of data generated is small, so a simple key/value store should be just right. This results in a long hit list, but only few libraries support file persistence, and only one made a near perfect match: lmdb. Let me emphasize: near perfect match, since it is not Java, not OSGi and has no transactional integration.

The Symas page lists wrappers, and sure enough, Java is listed with a JNI implementation called lmdbjni. The strange thing is that this binding has no Maven support for Windows, MacOS and Android from 0.4.7 onward, bummer. While browsing the github repo I stumbled (by accident, via RxLMDB) upon lmdbjava, which supports the latest and greatest lmdb version.

So, Java is tackled, with a big thanks to the people of lmdbjava. OSGi brings its own mechanism for handling runtime files, and since version 0.0.6 of lmdbjava the path to the lmdb binary, extracted by OSGi, can be set manually to allow the JNI part to link to it. An example bundle is on github and I leave the JCA integration as an exercise.
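A minimal sketch of the activator side, assuming the lmdbjava.native.lib system property (the name used by lmdbjava's Library class, check there for your version) and a hypothetical native/liblmdb.so entry inside the bundle:

import java.io.File;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;
import org.osgi.framework.BundleActivator;
import org.osgi.framework.BundleContext;

public class LmdbActivator implements BundleActivator {

    @Override
    public void start(BundleContext context) throws Exception {
        // extract the native library shipped inside this bundle into the bundle's data area
        URL entry = context.getBundle().getEntry("native/liblmdb.so");
        File extracted = context.getDataFile("liblmdb.so");
        try (InputStream in = entry.openStream()) {
            Files.copy(in, extracted.toPath(), StandardCopyOption.REPLACE_EXISTING);
        }
        // point lmdbjava at the extracted binary before the Env class is first touched
        System.setProperty("lmdbjava.native.lib", extracted.getAbsolutePath());
    }

    @Override
    public void stop(BundleContext context) throws Exception {
    }
}

With the binary linked, the daily counter itself is a single LMDB write transaction; key layout and sizes below are my own choices:

import java.io.File;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import org.lmdbjava.Dbi;
import org.lmdbjava.DbiFlags;
import org.lmdbjava.Env;
import org.lmdbjava.Txn;

File dir = context.getDataFile("lmdb");
dir.mkdir();                                       // LMDB wants an existing directory
Env<ByteBuffer> env = Env.create().setMapSize(10_485_760).setMaxDbs(1).open(dir);
Dbi<ByteBuffer> db = env.openDbi("counters", DbiFlags.MDB_CREATE);

ByteBuffer key = ByteBuffer.allocateDirect(env.getMaxKeySize());
key.put("2017-03-17".getBytes(StandardCharsets.UTF_8)).flip();

try (Txn<ByteBuffer> txn = env.txnWrite()) {       // single writer, fully ACID
    ByteBuffer current = db.get(txn, key);
    long next = current == null ? 1 : current.getLong() + 1;
    ByteBuffer value = ByteBuffer.allocateDirect(Long.BYTES);
    value.putLong(next).flip();
    db.put(txn, key, value);
    txn.commit();                                  // durable once committed
}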

Happy coding.

2 March 2017

Twelvemonkeys on OSGi

Using Java for working with images has never been a first choice, especially if a requirement for advanced computer vision functionality needs to be met. For those cases, please study OpenCV and JavaCV.
For the more daily use cases there is of course the standard Java ImageIO, providing limited image format support. My project required extensive multi-page TIFF (MTIFF) with CCITT compression capabilities, and the good old JAI-ImageIO seemed to be a good candidate. But MTIFF support requires some extra steps to be taken, and the ten-year-old code base is missing active development. I wanted something fresh, and after some research I found myself staring at the github page of TwelveMonkeys ImageIO.
TwelveMonkeys is a pure Java collection of plugins and extensions for Java's ImageIO with few optional dependencies, and it is under active development. It provides all I need with a flick of the service provider switch. But let's not make this too easy now. The library is not bundled like JAI, and since I am a heavy OSGi user I want to put it on Karaf.
The first thing to do is to get the jars into a bundle, ready for Karaf deployment. The Apache Felix maven-bundle-plugin brings the solution in the form of embedded dependencies, putting the TwelveMonkeys jars inside a single bundle. The registration of these plugins with ImageIO is handled by the classloader through the javax.imageio.spi.ServiceRegistry. This is a type of Java service loader responsible for finding ImageIO extensions on the classpath under the META-INF/services folder, bringing us straight to the core of our problem. Every OSGi bundle has its own classloader, meaning any image plugin loaded in one bundle is not visible to ImageIO in others. This is also true for every other library on OSGi using the Java service loader mechanism.
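A minimal maven-bundle-plugin configuration along these lines should do the embedding (the groupId filter is my own choice, adjust to your dependency set):

<plugin>
    <groupId>org.apache.felix</groupId>
    <artifactId>maven-bundle-plugin</artifactId>
    <extensions>true</extensions>
    <configuration>
        <instructions>
            <!-- pull every TwelveMonkeys artifact into this single bundle -->
            <Embed-Dependency>*;groupId=com.twelvemonkeys.imageio</Embed-Dependency>
            <Embed-Transitive>true</Embed-Transitive>
        </instructions>
    </configuration>
</plugin>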
The OSGi Enterprise R5 ServiceLoader Mediator specification describes a standard solution for this kind of integration problem, and the reference implementation is Apache Aries SPI Fly. Under the hood it uses some fancy runtime bytecode weaving, resulting in a short classloader switch as soon as the ImageIO ServiceRegistry in a bundle starts to look for plugins on the classpath, loading the plugins from the bundle containing the TwelveMonkeys embedded dependencies.
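With SPI Fly installed, the wiring boils down to manifest headers. The bundle embedding the TwelveMonkeys jars advertises its META-INF/services entries with:

SPI-Provider: *

and a bundle consuming ImageIO opts into the weaving with something along the lines of the following (the exact method reference is an assumption on my side, check the SPI Fly documentation):

SPI-Consumer: javax.imageio.ImageIO#scanForPlugins()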
Getting the MTIFF utilities in the contrib package functional under OSGi is as simple as adding a blueprint or declarative service to the bundle containing the TwelveMonkeys jars.
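Once wired up, reading a multi-page TIFF is plain ImageIO again; a minimal sketch (the file name is hypothetical):

import java.awt.image.BufferedImage;
import java.io.File;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;

ImageIO.scanForPlugins();   // make sure freshly registered providers are picked up
try (ImageInputStream input = ImageIO.createImageInputStream(new File("scan.tif"))) {
    Iterator<ImageReader> readers = ImageIO.getImageReaders(input);
    if (!readers.hasNext()) {
        throw new IllegalStateException("no TIFF reader registered");
    }
    ImageReader reader = readers.next();
    reader.setInput(input);
    int pages = reader.getNumImages(true);   // true allows a full scan of the MTIFF
    for (int page = 0; page < pages; page++) {
        BufferedImage image = reader.read(page);
        // hand each page down the processing pipeline...
    }
    reader.dispose();
}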

Sample project on github.

Recommended reading:

OSGi Alliance: java.util.ServiceLoader in OSGi

10 September 2016

A biomimetic human retina model with JavaCV

In my job as a software developer for HMM Deutschland, I received a requirement to preprocess and enhance binary images of documents taken in the field with a tablet or smartphone. The people taking these pictures are not professional photographers and have neither the knowledge nor the time to model the lighting to produce a perfectly illuminated and sharp picture of the document. The result is loss of detail and strong luminance changes within one image, making it hard to process further down the pipeline.
Nature has already solved the problem at hand, and though we may take it for granted, our vision is an expert at dealing with such circumstances. Jeanny Herault's research (1) has produced a model of human retina spatio-temporal image processing, performing texture analysis with an enhanced signal-to-noise ratio and enhanced details, robust against the input image's luminance range. Briefly, here are the main human retina model properties:

  • spectral whitening (mid-frequency details enhancement)
  • high frequency spatio-temporal noise reduction (temporal noise and high frequency spatial noise are minimized)
  • low frequency luminance reduction (luminance range compression): high luminance regions no longer hide details in darker regions
  • local logarithmic luminance compression allows details to be enhanced even in low light conditions
This is not a complete representation of our complex human vision system, but it already presents interesting properties that can be leveraged for an enhanced image processing experience.
OpenCV has an implementation of this model in its contributed section, under the bioinspired module (2). The OpenCV documentation can be found under (3).
Since I favor my OpenCV on caffeine, I added the bioinspired module to the JavaCV project. At this time it is parked in pull request #282 in the javacpp-presets project, meaning for now you have to build it yourself. Samuel Audet did a great job with JavaCPP and describes how to do this on the Bytedeco github pages (4).
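For a first impression, running an image through the retina looks roughly like this; a sketch assuming the bioinspired preset from the pull request above (createRetina follows the OpenCV 3.1 C++ factory, names may differ in later preset versions):

import static org.bytedeco.javacpp.opencv_bioinspired.createRetina;
import static org.bytedeco.javacpp.opencv_imgcodecs.imread;
import static org.bytedeco.javacpp.opencv_imgcodecs.imwrite;
import org.bytedeco.javacpp.opencv_bioinspired.Retina;
import org.bytedeco.javacpp.opencv_core.Mat;

Mat input = imread("document.jpg");            // noisy, unevenly lit phone shot
Retina retina = createRetina(input.size());    // default retina parameters
retina.run(input);                             // push one frame through the model
Mat parvo = new Mat();
retina.getParvo(parvo);                        // detail channel with luminance compression applied
imwrite("document_retina.jpg", parvo);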
A sample project can be found on my github (5), which produced the following output:

[Images: original document photo and retina-processed result]
The first image is a picture of a document taken with my mobile phone, showing a lot of noise and luminance variation. The second picture was processed with the retina model using its standard settings. The luminance difference and noise are greatly reduced. The retina model has quite some parameters to play with, so have some fun.

Update: available in the javacpp-presets as of version 1.2.2


Acknowledgments

HMM Deutschland GmbH for investing work-time into the project



  1. Herault, Jeanny. Vision: Images, Signals and Neural Networks. Models of Neural Processing in Visual Perception. World Scientific, 2010.
  2. https://github.com/opencv/opencv_contrib
  3. http://docs.opencv.org/3.1.0/d3/d86/tutorial_bioinspired_retina_model.html
  4. https://github.com/bytedeco
  5. https://github.com/Maurice-Betzel/net.betzel.bytedeco.javacv.bioinspired

11 July 2016

ServiceMix-7.0.0.M2 Camel Context not resolving

ServiceMix 7.0.0.M2, containing Karaf 4.0.5, installs Blueprint Core version 1.6.0 on first boot, which clashes at the XSD level with the bundled Camel Blueprint and produces the following error if you deploy a bundle with a Camel context:

Unable to start blueprint container for bundle xxx/x.x.x.SNAPSHOT
org.xml.sax.SAXParseException: src-import.3.1: The namespace attribute, 'http://aries.apache.org/blueprint/xmlns/blueprint-ext/v1.0.0',
of an <import> element information item must be identical to the targetNamespace attribute,
'http://camel.apache.org/schema/blueprint', of the imported document.


The strange thing is that the parent POM declares version 1.6.1. I did not search for the root cause of this issue, since 1.6.1 has issues with offline environments.
To get this solved, we override the old version with 1.6.2. In the ServiceMix build, just add the following to overrides.properties:

# workaround for Blueprint Core version 1.6.0 Camel XSD alignment issue, force usage of version 1.6.2
mvn:org.apache.aries.blueprint/org.apache.aries.blueprint.core/${aries.blueprint.core.version};range="[1,2)"

and add the following to <installedBundles> in the assembly POM:

<bundle>mvn:org.apache.aries.blueprint/org.apache.aries.blueprint.core/${aries.blueprint.core.version}</bundle>

This forces Karaf to use version 1.6.2 of Blueprint Core. Do not forget to update the version behind the property placeholder in the parent POM.
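For completeness, the placeholder in the parent POM then reads (property name taken from the override above):

<aries.blueprint.core.version>1.6.2</aries.blueprint.core.version>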

10 March 2016

KIE Server Apache Thrift extension

I have been using the JBoss KIE Server and Drools for some time now in the Java domain, telling my colleagues, who are mostly PHP experts, how great it is to get the spaghetti code out of my imperative code base. After getting on their nerves one too many times, they nailed me down to give a presentation on the subject and to find a way to let the PHP faction benefit from this great piece of technology.
Since I had already introduced the Apache Thrift (1) protocol (without the underlying transport part of the Thrift stack) a while ago for binary integration with Java based micro-services, it seemed natural to extend the KIE Server REST transport with Apache Thrift. JBoss principal software engineer Maciej Swiderski wrote a great blog post (2) about the new possibilities for extending the KIE Server.

So why add Apache Thrift to the equation, since we already have JSON/XML and even SOAP right out of the KIE Server box?
  • Speed
    • very few bytes on the wire, cheap (de)serialization (think large object graphs)
  • Validation
    • encoding the semantics of the business objects once in a Thrift IDL scheme for all services, for all supported languages, in an object-oriented, typed manner (as supported by the target language); see the IDL sketch after this list
  • Natural language bindings
    • for example, Java uses ArrayList, C++ uses std::vector
  • Backward compatibility
    • the protocol is allowed to evolve over time, within certain constraints, without breaking already running implementations
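
For illustration, a minimal Thrift IDL struct (hypothetical, not part of the project); the Thrift compiler turns this single definition into typed classes for every supported target language:

struct Person {
  1: required string name,
  2: optional i32 age,
  3: optional string email
}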

There are a number of protocols to choose from within Apache Thrift, but for optimal performance there is only one: TCompactProtocol. It is the most compact binary format and is typically more efficient to process than the other protocols.
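A short sketch of what this looks like on the Java side; Person stands in for any IDL generated class (hypothetical here), and the protocol factory is the only thing that changes when switching formats:

import org.apache.thrift.TDeserializer;
import org.apache.thrift.TSerializer;
import org.apache.thrift.protocol.TCompactProtocol;

TSerializer serializer = new TSerializer(new TCompactProtocol.Factory());
byte[] wire = serializer.serialize(person);        // very few bytes on the wire

Person copy = new Person();
new TDeserializer(new TCompactProtocol.Factory()).deserialize(copy, wire);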

The project is published on github (3) and consists mainly of two parts: the org.kie.server.thrift repo and the thrift-maven-plugin repo. Please build the thrift-maven-plugin first, as it is a dependency of the server. It contains the Thrift compiler version 0.9.2 for Windows and Linux (tested on CentOS / RHEL) for compiling the Thrift IDL files.
The org.kie.server.thrift repo downloads the KIE Server war file, extracts it, adds the Thrift extension and repackages the sources into a new war file ready for deployment. Tested on WildFly 8.2.1.
How to set up a KIE Server and the accompanying workbench is explained under (4).
Test facts and rules with matching PHP and Java clients are also provided under (3).

Workflow

[Workflow diagram]

Architecture

From the viewpoint of the KIE Server there is a known and an unknown model. The known model consists of the objects used internally by the KIE Server to handle its commands (Command pattern). These objects are mirrored to IDL in the kie-server-protocol maven module to make them available to all Thrift supported languages. The unknown model is of course the graph of objects that needs to be transported into the KIE core engine for execution. The unknown model must also be designed with the Thrift IDL, so all objects that have to pass the Thrift protocol layer are of type TBase. These two object models force a sequential, two step (de)serialization.
In the first step the known model is (de)serialized, revealing the KIE Server internal objects. This is handled by the Thrift message reader and writer classes that are registered on the RESTEasy framework, as used by WildFly, for the application/xthrift content type. These known objects contain binary fields holding the unknown object bytes.
For the second deserialization I am forced to use a not-so-smooth trick. Since there is no way to tell from the bytes which type is represented (due to the compactness of the Thrift TCompactProtocol, which does not deliver a class name like XStream does), the fully qualified Java class name of the IDL generated objects must be provided within the transporting known KIE Server objects. Now the (de)serialization can take place using the classloader of the deployed KIE container holding the unknown object model. On the client side, deserialization of the reply is easy, as both models are known there.
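Sketched in a few lines (the accessors on the command object are hypothetical, the mechanism is as described above):

import org.apache.thrift.TBase;
import org.apache.thrift.TDeserializer;
import org.apache.thrift.protocol.TCompactProtocol;

// the known KIE Server object carries the class name and the raw bytes of the unknown model
ClassLoader kieClassLoader = kieContainer.getClassLoader();
TBase payload = (TBase) Class.forName(command.getPayloadClassName(), true, kieClassLoader).newInstance();
new TDeserializer(new TCompactProtocol.Factory()).deserialize(payload, command.getPayloadBytes());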
To allow other languages the use of Java objects like BigDecimal, which is great for monetary calculations, there are integrated converters with Thrift IDL representations in the maven kie-server-java module to ease development. If such a TBase Java representation is not wrapped within another struct, it is converted automatically. Wrapped representations (forced to bind to a TBase type) can make use of the static conversion helper methods.
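As an illustration of the wrapped case (struct and helper names are hypothetical, see the kie-server-java module for the actual converters):

import java.math.BigDecimal;

// IDL side sketch: BigDecimal travels as its string representation
// struct TBigDecimal { 1: required string value }
TBigDecimal wire = new TBigDecimal("19.99");
BigDecimal amount = new BigDecimal(wire.getValue());   // lossless monetary value on the Java side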

Please study the source code for further details on the combination of technologies used.

Acknowledgments

HMM Deutschland GmbH for investing work-time into the project
Maciej Swiderski for his informative blog http://mswiderski.blogspot.de
My colleague Alexander Knyn, for being my PHP integration sparring partner

Links:


(1) Apache Thrift
(2) Extending the KIE Server

(3) KIE Server Apache Thrift extension
(4) Installing KIE Server and Workbench