The Oracle Coherence Incubator contains some really cool bits of code and the Coherence Incubator Commons in particular contains functionality useful in just about any Coherence project. One of the things lacking from the Incubator though is a decent set of “How To” documentation showing examples of its use. The Incubator was started as a side project by a few already busy guys at Oracle and, if they are anything like me, documentation often takes a back seat compared to writing code. There are code examples, but you need to work through these to figure things out and they do not cover all the functionality available; you also need to know Coherence pretty well to work out what some of the more advanced Incubator project code does. Unless you attend one of the Oracle Coherence SIG meetings (if there is one near you then you should) where the Incubator sometimes gets presented, you probably don’t know too much about what the Incubator contains either. In this post, and maybe a few more forthcoming posts, I will try to explain how some of the functionality works and how you might use it.
The Runtime Package
This post is going to be about the com.oracle.coherence.common.runtime package in the Coherence Incubator Commons 2.1.2 release. This package was also in earlier releases but may have had other names. The runtime package is basically a set of classes that allow you to build and realise instances of Java processes, predominantly Coherence processes, but it will support other processes too. This allows you to write code to configure, start and stop whole Coherence clusters or individual cluster members very easily.
During the post I mention a few places where the code could be improved. I address these later in the post when I cover some extensions to the runtime package that make it a bit more useful, and there is a link to the customised source at the very end. If any of the Incubator Team are reading this, don’t take this as a criticism; the Incubator is great, but having used the Runtime Package in anger, and while writing up this blog, I put together a list of things that would make it even more useful.
The main use I have for the runtime package is in testing as it makes writing and running test cases very simple. It can easily be used to run a cluster of a number of different processes as part of an automated build and test cycle.
Builders and Schema
The runtime package is built around a typical builder pattern and contains two types of classes: Schema classes that define a configuration for a process, and Builder classes that take a given schema and realise one or more running instances of a process based on the schema. The Incubator comes with concrete schema and builder implementations for running Coherence processes.
Usage
An example is always better than lots of text, so here is the simplest one I could think of. We will start a simple single Coherence node that will just use the defaults for everything.
ClusterMemberSchema storageNodeSchema = new ClusterMemberSchema();
ClusterMemberBuilder memberBuilder = new ClusterMemberBuilder();
ClusterMember clusterMember = memberBuilder.realize(storageNodeSchema, "storage-node", new SystemApplicationConsole());
Thread.sleep(60000);
clusterMember.destroy();
That is pretty much all there is to it. The code above will fork a process to run an instance of com.tangosol.net.DefaultCacheServer with all of the default settings. It then sleeps for 60 seconds before killing the process. If you run the code and look at the processes on your machine you should see the new Java process start and then, after a minute, die.
To stop a realised running process you just call the destroy method on the ClusterMember instance.
The Schema Hierarchy
The top class of the Schema class hierarchy is obviously the most basic. The ApplicationSchema has methods on it to set the environment variables for a process and set any command line arguments. The JavaApplicationSchema then extends this to set the class whose main method will be executed, set the classpath, system properties and JVM arguments. The only concrete implementations of the builders and schema classes currently available are to build ClusterMember instances. You can still use the ClusterMemberBuilder and ClusterMemberSchema to configure non-Coherence processes, and with a few tweaks covered later, non-Java processes too.
Fluent Interface
While you could configure a cluster member using just the environment variables and system property settings of the ClusterMemberSchema class, it has a number of methods on it to make life a little easier and make the code a little more obvious. All of the schema class implementations use a fluent style of interface, that is, the setter methods always return the instance of the schema so that you can chain calls together in your code like this…
ClusterMemberSchema schema = new ClusterMemberSchema()
        .setStorageEnabled(true)
        .setCacheConfigURI("gridman-cache-config.xml")
        .setPofConfigURI("gridman-pof-config.xml")
        .setClusterPort(10000)
        .setClusterName("GridMan")
        .setMulticastTTL(0)
        .setLocalHostAddress("localhost")
        .setRemoteJMXManagement(true)
        .setJMXPort(40000)
        .setJMXManagementMode(ClusterMemberSchema.JMXManagementMode.LOCAL_ONLY)
        .setJMXAuthentication(false);
As I am sure you will agree, that is an elegant interface that makes it easy to see exactly what the intention is. Fluent interfaces such as this are very good for builder type patterns and we use them a lot in the system that I currently work on, which is probably why I find their use in the Incubator so good. It is debatable whether the set methods should instead have been prefixed with "with", which I think makes for a more readable fluent interface, for example…
ClusterMemberSchema schema = new ClusterMemberSchema()
        .withStorageEnabled(true)
        .withCacheConfigURI("gridman-cache-config.xml")
        .withPofConfigURI("gridman-pof-config.xml");
…but I’m just being picky really.
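For anyone wondering how setters inherited from a base schema can still return the concrete schema type, here is a minimal sketch of the usual self-typed generics trick. The class names BaseSchema and NodeSchema are made up for illustration and are not Incubator classes, and mapping setStorageEnabled onto the tangosol.coherence.distributed.localstorage system property is my assumption about how such a convenience method could work, not a statement about what ClusterMemberSchema does internally.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical base schema (not an Incubator class). S is the concrete schema
// type, so inherited setters can return it and keep the chain fully typed.
abstract class BaseSchema<S extends BaseSchema<S>> {
    private final Map<String, String> systemProperties = new LinkedHashMap<>();

    @SuppressWarnings("unchecked")
    protected S self() {
        return (S) this;
    }

    // Fluent setter: records the property and returns the concrete schema.
    public S setSystemProperty(String name, Object value) {
        systemProperties.put(name, String.valueOf(value));
        return self();
    }

    public Map<String, String> getSystemProperties() {
        return systemProperties;
    }
}

// Hypothetical concrete schema with one convenience setter that simply
// delegates to a system property (an assumption made for this sketch).
final class NodeSchema extends BaseSchema<NodeSchema> {
    public NodeSchema setStorageEnabled(boolean enabled) {
        return setSystemProperty("tangosol.coherence.distributed.localstorage", enabled);
    }
}
```

Because setSystemProperty returns NodeSchema rather than BaseSchema, a call such as new NodeSchema().setSystemProperty("a", 1).setStorageEnabled(false) compiles without a cast.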
Starting a Cluster
Besides the schema and builder for an individual cluster member there is also a ClusterBuilder, which can start a whole cluster of processes in one go. This allows you to define the schema once for each of the individual process types in the cluster, for example a storage node and an Extend proxy node, and then start the required number of each. Here is an example:
ClusterMemberSchema storageNodeSchema = new ClusterMemberSchema();
ClusterMemberSchema proxyNodeSchema = new ClusterMemberSchema();
proxyNodeSchema.setStorageEnabled(false)
               .setSystemProperty("tangosol.coherence.extend.enabled", true);
ClusterMemberBuilder memberBuilder = new ClusterMemberBuilder();
ClusterBuilder clusterBuilder = new ClusterBuilder();
clusterBuilder.addBuilder(memberBuilder, storageNodeSchema, "Data", 2);
clusterBuilder.addBuilder(memberBuilder, proxyNodeSchema, "Proxy", 1);
Cluster cluster = clusterBuilder.realize(new SystemApplicationConsole());
Thread.sleep(60000);
cluster.destroy();
Again, the above example does nothing fancy; it starts a cluster using the out of the box configuration built into the Coherence jar file. What we get this time though is a cluster of two storage nodes and an Extend proxy node. The Extend proxy node schema is identical to the storage node schema, except that it sets the storage enabled flag to false and sets the system property that enables the Extend proxy service in the default cache configuration file. If you run the above code you will see three processes start on your system, then after a minute die.
In the case of the cluster builder’s realize method, what you get back as a return value is a Cluster instance, which encapsulates the group of processes just realised (or started). 
To stop all the running processes of a Cluster you just call the destroy method on the Cluster instance. You could also stop individual members of the cluster by getting the required ClusterMember from the Cluster instance and calling its destroy method.
Unfortunately the Cluster class is very basic and does not give you much. It would be better if internally it held its applications in a Map keyed by name, since all applications have a name, so that they could be accessed individually. This would make it useful for running JMX queries on an individual process or stopping an individual process to test things like fail-over. The changes for this are actually very simple to apply to the AbstractApplicationGroup class, which is the super class that the Cluster class extends.
public abstract class AbstractApplicationGroup<A extends Application> implements ApplicationGroup<A>
{
    /**
     * The collection of {@link Application}s that belong to the {@link ApplicationGroup}.
     */
    protected LinkedHashMap<String,A> m_applications;
    /**
     * Constructs an {@link AbstractApplicationGroup} given a list of {@link Application}s.
     *
     * @param applications  The list of {@link Application}s in the {@link ApplicationGroup}.
     */
    public AbstractApplicationGroup(List<A> applications)
    {
        m_applications = new LinkedHashMap<String,A>();
        for (A application : applications)
        {
            m_applications.put(application.getName(), application);
        }
    }
    /**
     * Returns the application in this group with the given name or null
     * if no application has been realized with the given name.
     * @param name - the name of the application to get
     *
     * @return the application in this group with the given name or null
     * if no application has been realized with the given name.
     */
    public A getApplication(String name)
    {
        return m_applications.get(name);
    }
    /**
     * {@inheritDoc}
     */
    @Override
    public Iterator<A> iterator()
    {
        return m_applications.values().iterator();
    }
    /**
     * {@inheritDoc}
     */
    @Override
    public void destroy()
    {
        for (A application : m_applications.values())
        {
            if (application != null)
            {
                application.destroy();
            }
        }
    }
}
Output Capturing
If you ran any of the examples above you will have seen that the output from std out and std err (System.out and System.err in Java land) is captured and displayed on the console. This is done by passing an instance of ApplicationConsole to the realize method of the builders. In the examples above we pass an instance of SystemApplicationConsole, which routes the process output to the console of our controlling process. There are other versions of ApplicationConsole in the Incubator that route the output to other places such as a log file, or just lose it if you are not interested in seeing it, and of course you can write your own if you want something else. Each line of captured output is prefixed with the process name so it is easy to identify lines of logging from the different running processes, which is very useful when trying to work out what is happening across the processes of your mini test cluster.
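To give a feel for what name-prefixed output capturing involves, here is a minimal sketch using only the JDK. PrefixingOutputPump is a hypothetical helper of my own and not one of the Incubator's ApplicationConsole implementations, and the "[name] " prefix format is an assumption rather than the exact format the Incubator prints.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not an Incubator class): copies a stream line by line
// to a PrintStream, prefixing every line with the name of the process.
final class PrefixingOutputPump implements Runnable {
    private final String name;
    private final InputStream source;
    private final PrintStream target;

    PrefixingOutputPump(String name, InputStream source, PrintStream target) {
        this.name = name;
        this.source = source;
        this.target = target;
    }

    @Override
    public void run() {
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(source, StandardCharsets.UTF_8))) {
            String line;
            // readLine returns null once the process closes its output stream
            while ((line = reader.readLine()) != null) {
                target.println("[" + name + "] " + line);
            }
        } catch (IOException e) {
            target.println("[" + name + "] (output stream failed: " + e.getMessage() + ")");
        }
    }
}
```

With a plain java.lang.Process you would typically run one pump per stream on its own thread, for example new Thread(new PrefixingOutputPump("Data-1", process.getInputStream(), System.out)).start(), and a second one for process.getErrorStream().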
Setting Port Numbers
While the various port numbers used by the cluster members can be configured in the schema as fixed port numbers, one very useful feature of the schema is the ability to configure a port at runtime to be an available port rather than a hard coded one. This allows you to set ports and realise multiple members of a cluster without having to worry about port clashes. Setting port numbers is done by using the com.oracle.coherence.common.network.AvailablePortIterator class instead of a fixed port number when setting properties. For example, suppose we are creating a schema to start a cluster member and we want to set its JMX port, but we are starting multiple members in the cluster so we cannot hard code the port. We could do it by creating a schema per member, which is more code, or we could use the AvailablePortIterator like this…
AvailablePortIterator jmxPorts = new AvailablePortIterator(40000);
ClusterMemberSchema schema = new ClusterMemberSchema()
        .setRemoteJMXManagement(true)
        .setJMXPort(jmxPorts);
ClusterMemberBuilder memberBuilder = new ClusterMemberBuilder();
ClusterMember clusterMember = memberBuilder.realize(schema);
The example above will set the JMX port of each process as that process is realised. In this case the JMX port will be set to the first free port >= 40000. The port allocated can be discovered by calling the getRemoteJMXPort() method on the realised cluster member application.
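If you are curious how an iterator over available ports might work, the sketch below shows one simple approach using only the JDK: probe each candidate port by briefly binding a ServerSocket to it. FreePortIterator is a hypothetical name of mine, and I am not claiming that the Incubator's AvailablePortIterator is implemented this way.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical stand-in for the idea behind AvailablePortIterator; the real
// Incubator class may well work differently.
final class FreePortIterator implements Iterator<Integer> {
    private static final int MAX_PORT = 65535;
    private int candidate;

    FreePortIterator(int firstPort) {
        this.candidate = firstPort;
    }

    @Override
    public boolean hasNext() {
        // Approximation: there are still candidate ports left to probe.
        return candidate <= MAX_PORT;
    }

    @Override
    public Integer next() {
        while (candidate <= MAX_PORT) {
            int port = candidate++;
            // A port counts as free if we can briefly bind a server socket to it.
            try (ServerSocket probe = new ServerSocket(port)) {
                return port;
            } catch (IOException portInUse) {
                // Something else holds this port, so move on to the next one.
            }
        }
        throw new NoSuchElementException("no free port found up to " + MAX_PORT);
    }
}
```

One caveat with any probe of this kind is that the port is only known to be free at the moment of the check; another process could grab it before the realised process binds to it, so it reduces port clashes rather than ruling them out.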
The Application Class
All of the builders in the runtime package return instances of classes implementing the com.oracle.coherence.common.runtime.Application interface. This allows you to interrogate the Application to obtain information about it, such as its environment, System properties etc. All applications also have a name to identify them, which ideally you should make unique in some way.
The Application also has a destroy method, which is how you kill the application when you want it to stop.
The top level com.oracle.coherence.common.runtime.AbstractApplication allows you to get the environment variables and the PID of the forked process. The environment variables will be those used to start the process; if the process itself changes its environment somehow then this will not be reflected in the information available from the application instance.
The next level down is the com.oracle.coherence.common.runtime.JavaConsoleApplication which expands the information available to include System properties and also allows interaction with the process JMX server to make MBean queries. Anyone who knows Coherence knows that being able to do MBean queries is a useful way to know what is going on in a cluster of nodes – even something as useful as knowing all the processes you have just started have now joined the cluster. 
The com.oracle.coherence.common.runtime.ClusterMember application does not add too much more but has some shortcut methods to get the cluster size and service MBean information. 
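As a rough illustration of the kind of MBean query mentioned above, here is a sketch written against the standard javax.management API only. MBeanQueries is a hypothetical helper, not part of the Incubator, and how you obtain the MBeanServerConnection for a realised process is left out; for a forked process it would normally come from a JMX connector pointed at the process's JMX port.

```java
import java.io.IOException;
import javax.management.MBeanServerConnection;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

// Hypothetical helper (not an Incubator class) built on the standard JMX API.
final class MBeanQueries {
    private MBeanQueries() {
    }

    // Counts the MBeans whose names match the given ObjectName pattern.
    // Against a Coherence member with management enabled, a pattern such as
    // "Coherence:type=Node,*" should match one MBean per cluster member.
    static int countMatching(MBeanServerConnection connection, String pattern) {
        try {
            return connection.queryNames(new ObjectName(pattern), null).size();
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException("Bad ObjectName pattern: " + pattern, e);
        } catch (IOException e) {
            throw new IllegalStateException("Lost the connection to the MBean server", e);
        }
    }
}
```

In a test you could poll a method like this until the count matches the number of members you started, which is one way to know the whole cluster has formed before the test proper begins.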
Enhancements
There are a number of things that I personally would change about the runtime package to make it a bit more useful. One of the beauties of the Incubator is that it has always been a type of open-source, in that you can take it and freely modify it to your heart’s content. It is not true open source as you cannot get your changes checked back in, but the guys at Oracle have always been open to hearing good ideas and taking feedback for changes.
Running Simple Non-Java Processes
One change I would make to the runtime package is to make concrete implementations of some of the classes at the top of the builder and schema hierarchies, probably following the normal Coherence naming convention and prefixing them with Simple. This, with a tweak to allow passing in the executable name, which is currently fixed to be Java, would then allow them to be used to start non-Java applications. One other change would be to allow the working directory to be set in the schema too.
It is easy enough to write your own implementations of the required three classes by subclassing the Abstract classes.
public class SimpleApplicationSchema 
extends AbstractApplicationSchema<SimpleApplication,SimpleApplicationSchema>
{
    /** The name of the executable that will be run */
    private String executable;
    private File directory;
    public SimpleApplicationSchema(String executable) {
        this.executable = executable;
    }
    public String getExecutable() {
        return executable;
    }
    public File getDirectory() {
        return directory;
    }
    public SimpleApplicationSchema setDirectory(File directory) {
        this.directory = directory;
        return this;
    }
    public SimpleApplicationSchema setDirectory(String directory) {
        this.setDirectory(new File(directory));
        return this;
    }
}
Above is the code for the SimpleApplicationSchema which, as already mentioned, has a couple of additional methods to get the executable name and to get and set the working directory.
public class SimpleApplicationBuilder
extends AbstractApplicationBuilder<SimpleApplication, SimpleApplicationSchema, SimpleApplicationBuilder>
{
    @Override
    public SimpleApplication realize(SimpleApplicationSchema schema, 
                                     String name, ApplicationConsole console) throws IOException {
        ProcessBuilder processBuilder;
        processBuilder = new ProcessBuilder(schema.getExecutable());
        File directory = schema.getDirectory();
        if (directory != null) {
            processBuilder.directory(directory);
        }
        
        // determine the environment variables for the process (based on the Environment Variables Builder)
        Properties environmentVariables = schema.getEnvironmentVariablesBuilder().realize();
        // we always clear down the process environment variables as by default they are inherited from
        // the current process, which is not what we want as it doesn't allow us to create a clean environment
        if (!schema.shouldCloneEnvironment())
        {
            processBuilder.environment().clear();
        }
        // add the environment variables to the process
        for (String variableName : environmentVariables.stringPropertyNames())
        {
            processBuilder.environment().put(variableName, environmentVariables.getProperty(variableName));
        }
        List<String> command = processBuilder.command();
        // add the arguments to the command for the process
        for (String argument : schema.getArguments())
        {
            command.add(argument);
        }
        // start the process
        SimpleApplication application = new SimpleApplication(processBuilder.start(),
                                              name,
                                              console,
                                              environmentVariables);
        // raise lifecycle event for the application
        LifecycleEvent<SimpleApplication> event = new LifecycleStartedEvent<SimpleApplication>(application);
        for (EventProcessor<LifecycleEvent<SimpleApplication>> processor : m_lifecycleEventProcessors)
        {
            processor.process(null, event);
        }
        return application;
    }
}
The SimpleApplicationBuilder above just has to use the values from the schema to create a ProcessBuilder, which in turn is used to start the process. The code above is pretty much a stripped down version of the code in the ClusterMemberBuilder.
public class SimpleApplication extends AbstractApplication
{
    private final Process process;
    public SimpleApplication(Process process, String name, 
                             ApplicationConsole console, Properties environmentVariables)
    {
        super(process, name, console, environmentVariables);
        this.process = process;
    }
}
You can see that it would have been very simple to have put the code above into the Abstract classes but even so, we now have a schema and builder that can run any type of executable. Here is an example.
SimpleApplicationSchema schema = new SimpleApplicationSchema("java");
schema.addArgument("-help");
SimpleApplicationBuilder builder = new SimpleApplicationBuilder();
SimpleApplication application = builder.realize(schema, "java-app");
application.waitFor();
int exitCode = application.exitValue();
In the example above I have used the Java executable, even though the point is to show that we can run any executable. I use a MacBook Pro so I thought choosing an executable that I could run probably would not make much sense to readers who are still to see the light and switch to a Mac instead of a PC. The code above does nothing more than run the Java executable with the -help argument, which will just print the standard Java help text and exit. The code waits for the executable to exit and gets its exit code; all pretty simple really.
Running Non-Coherence Java Processes
The next enhancement I would make to the runtime package is to add a schema and builder to create simple Java processes that are not Coherence cluster members. I know you could make the ClusterMemberSchema and ClusterMemberBuilder run any Java process, but for completeness it would be nice to have these classes. If you look at the runtime package you will see there is already a JavaConsoleApplication, but the schema and builder are abstract. Again it would not be hard to make them concrete classes or provide your own implementations in the same way we did above for non-Java processes.
public class SimpleJavaApplicationSchema extends AbstractJavaApplicationSchema {
    public SimpleJavaApplicationSchema(String applicationClassName) {
        super(applicationClassName);
    }
    public SimpleJavaApplicationSchema(String applicationClassName, String classPath) {
        super(applicationClassName, classPath);
    }
}
Above is the schema for a simple Java application; you can see it actually adds nothing to the abstract parent class.
public class SimpleJavaApplicationBuilder 
extends AbstractJavaApplicationBuilder 
implements JavaApplicationBuilder {
    public SimpleJavaApplicationBuilder() {
    }
}
And here is the builder, which again adds nothing to the abstract class apart from making it a concrete implementation that we can use.
Putting them all together we can easily build and run a non-Coherence Java class. For example, if we want to run a class called TestApp and pass in some System properties and arguments to its main method we can do this…
SimpleJavaApplicationSchema schema = 
        new SimpleJavaApplicationSchema(TestApp.class.getCanonicalName())
            .addArgument("arg1")
            .addArgument("arg2")
            .setSystemProperty("prop1", "value1");
SimpleJavaApplicationBuilder builder = new SimpleJavaApplicationBuilder();
CapturingApplicationConsole console = new CapturingApplicationConsole();
JavaApplication application = builder.realize(schema, "java", console);
So now we have enough schema and builders to run any type of process.
Process Return Code (Exit Values)
Sometimes we might be using the builder to run a process that, unlike a cluster member, only lives for a short time. In this case we might want to wait for it to finish and/or obtain its exit value. This is easy to do with the addition of two methods to the AbstractApplication class:
    /**
     * causes the current thread to wait, if necessary, until the
     * process represented by this <code>Process</code> object has
     * terminated. This method returns
     * immediately if the subprocess has already terminated. If the
     * subprocess has not yet terminated, the calling thread will be
     * blocked until the subprocess exits.
     *
     * @return     the exit value of the process. By convention,
     *             <code>0</code> indicates normal termination.
     * @exception  InterruptedException  if the current thread is
     *             {@linkplain Thread#interrupt() interrupted} by another
     *             thread while it is waiting, then the wait is ended and
     *             an {@link InterruptedException} is thrown.
     */
    public int waitFor() throws InterruptedException
    {
        return this.m_process.waitFor();
    }
    /**
     * Returns the exit value for the subprocess.
     *
     * @return  the exit value of the subprocess represented by this
     *          <code>Process</code> object. by convention, the value
     *          <code>0</code> indicates normal termination.
     * @exception  IllegalThreadStateException  if the subprocess represented
     *             by this <code>Process</code> object has not yet terminated.
     */
    public int exitValue()
    {
        return m_process.exitValue();
    }
Running Nodes In-Process
Now we get to probably the biggest enhancement and, in my opinion, the most useful. The one problem that the runtime package has for me is that all of these processes are forked processes that run outside of the JVM that created them. For some use cases this is fine; I often run a mini cluster locally to do ad-hoc testing. But for real testing, in particular as part of a build process, I do not want to risk leaving orphaned processes behind, or “hanging chads” as they were once described to me (I think you need to be American to get that one). It would be great if we could configure the schema so that the builder realised processes to run inside the same JVM as the controlling code.
As anyone who knows me knows, some colleagues and I have had code that does this for quite a while now so after a bit of work I patched the Coherence Incubator Commons code to allow processes to be configured to be optionally run in-process or forked as separate processes. When running in process the code needed to do the following:
- Isolate each pseudo-process's classpath
- Isolate each pseudo-process's System properties
- Isolate each pseudo-process's JMX server
The code works a bit like a Java application server in the way that the pseudo-processes are isolated by ClassLoader. Each pseudo-process has a ChildFirstClassLoader which will always load classes from its own classpath before going up to the parent loader. This class loader is also used to isolate the system properties and JMX server.
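For readers who have not written one before, here is a minimal sketch of the child-first idea using only the JDK. ChildFirstSketchLoader is a hypothetical class written for this illustration; the ChildFirstClassLoader in the customised source may differ in detail, and this sketch does not attempt the system property or JMX isolation.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical sketch of a child-first class loader; the class in the
// customised Commons source may be implemented differently.
final class ChildFirstSketchLoader extends URLClassLoader {
    ChildFirstSketchLoader(URL[] classpath, ClassLoader parent) {
        super(classpath, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> loaded = findLoadedClass(name);
            if (loaded == null) {
                if (name.startsWith("java.")) {
                    // Core JDK classes must always come from the parent chain.
                    loaded = super.loadClass(name, false);
                } else {
                    try {
                        // Look in this loader's own classpath first...
                        loaded = findClass(name);
                    } catch (ClassNotFoundException notOnOwnClasspath) {
                        // ...and only then fall back to normal parent delegation.
                        loaded = super.loadClass(name, false);
                    }
                }
            }
            if (resolve) {
                resolveClass(loaded);
            }
            return loaded;
        }
    }
}
```

The effect is that two loaders given the same classpath each define their own copy of a class, statics and singletons included, which is what lets several pseudo-processes coexist in one JVM.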
The usage of the runtime package is identical to the examples above with the addition of extra methods on the AbstractJavaApplicationSchema to specify whether the realised processes run in the same JVM or externally.
If we repeat the example above where we start a cluster we can now see how easy it is to make this cluster run inside a single JVM.
ClusterMemberSchema storageNodeSchema = new ClusterMemberSchema().runInProcess();
ClusterMemberSchema proxyNodeSchema = new ClusterMemberSchema();
proxyNodeSchema.runInProcess()
               .setStorageEnabled(false)
               .setSystemProperty("tangosol.coherence.extend.enabled", true);
ClusterMemberBuilder memberBuilder = new ClusterMemberBuilder();
ClusterBuilder clusterBuilder = new ClusterBuilder();
clusterBuilder.addBuilder(memberBuilder, storageNodeSchema, "Data", 2);
clusterBuilder.addBuilder(memberBuilder, proxyNodeSchema, "Proxy", 1);
Cluster cluster = clusterBuilder.realize(new SystemApplicationConsole());
Thread.sleep(60000);
cluster.destroy();
The code above is identical to the previous example but this time the runInProcess() method is called on the storageNodeSchema and proxyNodeSchema so that they run in a single JVM. If you run this code and look at the processes running on your machine you will see that there are no new Java processes started like there were last time. 
Obviously there are some caveats with this technique. You will certainly need to run the process with a large heap size as it is running a whole cluster in one JVM. You will also need to increase the size of the JVM’s Perm Gen, which is where the JVM stores classes, as we now have multiple class loaders. Typically we run this type of process with the -XX:MaxPermSize=256m JVM parameter, which is fine for most things, although if you try running a pseudo-cluster with a lot of nodes you might need to go higher. Another setting we use when running in a single JVM is -Xss64k, which will reduce the amount of memory used for the stack of each thread. As you can imagine, running a cluster in a single JVM can start a lot of threads which all consume memory for their stacks. Reducing the stack size is not an issue for our tests and builds but could be if you have very deep recursive method calls which need a big stack.
The Source Code
So that’s it. The Commons Runtime package is a simple, elegant way to start processes, which makes it especially useful for building test harnesses. The fact that you can also modify it to run the processes in a single JVM makes it ideal for building an in-process cluster that you can run as part of a functional test suite in a continuous integration environment. The changes above, especially the in-process changes, have been something I have had coded for a while now, so it is quite easy to back-port them to earlier versions of the Incubator and hence earlier versions of Oracle Coherence.
The source code for the Incubator Commons is available from the Oracle Coherence Incubator Commons site.
The customised and expanded version of Commons 2.1.2 discussed above is available here: coherence-common-2.1.2.gridman.zip. The customised code also includes the start of some unit tests (yes I know, the tests should have been done first and all that). To compile and run the tests you will need, as well as the normal Incubator Commons dependencies, Mockito mockito-core-1.8.5.jar and Hamcrest hamcrest-all-1.2.jar, which are available from the sites for those tools.
Sweet! I’ve faced problems starting multiple private clusters in the same JVM – all because of the Coherence singletons in the classloader.
BTW, do you plan on putting this on Github or Google code?
I have not put the code on public repositories yet as ideally I would like the changes incorporated back into the Incubator Commons source code so everyone gets them from there rather than patching them from my code. We will have to see what the guys at Oracle think about it.