Tuesday, January 17, 2012

GroovyMX: New Remote JVM Test Cases

I just completed a check-in to gmx with a small number of critical test cases.
I am attempting to make gmx a Groovy-oriented but Java-usable API, so for the most part, each logical test case has a Java and a Groovy version. Here's an example of a Groovy test:

    public void testRemoteClosureForMBeanCountAndDomains() throws Exception {
        def port = 18900;
        def gmx = null;
        def jvmProcess = null;
        try {
            jvmProcess = JVMLauncher.newJVMLauncher().timeout(120000).basicPortJmx(port).start();
            gmx = Gmx.remote(jmxUrl(port));
            def remoteDomains = gmx.exec({ return it.getDomains();});
            def domains = gmx.getDomains();
            Assert.assertArrayEquals("Domain array", domains, remoteDomains);
            def remoteMBeanCount = gmx.exec({ return it.getMBeanCount();});
            def mbeanCount = gmx.getMBeanCount();
            Assert.assertEquals("MBean Count", mbeanCount, remoteMBeanCount);
        } finally {
            if(gmx!=null) try { gmx.close(); } catch (Exception e) {}
            if(jvmProcess!=null) try { jvmProcess.destroy(); } catch (Exception e) {}           
        }
    }
In a nutshell, this test:
  1. Starts a new JVM with the JMX management agent enabled.
  2. Retrieves the MBeanServer's MBean domains using a remoted closure and then using a plain remote JMX call. Verifies they're equal.
  3. Retrieves the MBeanServer's MBean count using a remoted closure and then using a plain remote JMX call. Verifies they're equal.
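
The remote half of that comparison can be approximated with nothing but the JDK. The following is a minimal sketch, not the gmx test itself: instead of launching a second JVM it starts an RMI connector server in-process on a fresh MBeanServer, connects back to it, and compares the domains and MBean count seen through the connection with the local values. The class and method names (RemoteJmxRoundTrip, remoteMatchesLocal) are made up for illustration.

```java
import java.util.Arrays;
import javax.management.MBeanServer;
import javax.management.MBeanServerConnection;
import javax.management.MBeanServerFactory;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXConnectorServer;
import javax.management.remote.JMXConnectorServerFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical stand-in for the remote JVM: an in-process connector server.
final class RemoteJmxRoundTrip {

    static boolean remoteMatchesLocal() throws Exception {
        // A fresh MBeanServer, so nothing else registers MBeans behind our back.
        MBeanServer local = MBeanServerFactory.newMBeanServer();
        JMXConnectorServer server = JMXConnectorServerFactory.newJMXConnectorServer(
                new JMXServiceURL("service:jmx:rmi://"), null, local);
        server.start();
        // getAddress() returns the URL (with the encoded RMI stub) clients connect to.
        try (JMXConnector connector = JMXConnectorFactory.connect(server.getAddress())) {
            MBeanServerConnection remote = connector.getMBeanServerConnection();

            String[] localDomains = local.getDomains();
            String[] remoteDomains = remote.getDomains();
            // Domain order is not guaranteed, so sort before comparing.
            Arrays.sort(localDomains);
            Arrays.sort(remoteDomains);

            return Arrays.equals(localDomains, remoteDomains)
                    && local.getMBeanCount().equals(remote.getMBeanCount());
        } finally {
            server.stop();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Remote view matches local view: " + remoteMatchesLocal());
    }
}
```

The real test goes further, since it also ships a closure to the other JVM; this sketch only covers the plain remote JMX call that the closure result is checked against.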


The latest snapshot is available here:
Of course, these tests raised a bunch of issues which I am in arrears entering into GitHub. If you encounter any issues, please leave me some feedback.

//Nicholas

Friday, January 13, 2012

Remoting Groovy with Generated Closures

On attempting to implement remote closure execution in Gmx, I envisioned simply generating a closure on the fly and passing it over the wire to a remote Gmx counterpart.
In my mind, it looked a bit like this very simple example that would print "Hello Jupiter" to the standard out on the target remote server:


This is the simplest version of the idea, and so serves as a good proof of concept.

Gmx already has a lot of the plumbing in place, namely:
  • The ReverseClassLoader service spins up an HTTP server to provide remote class loading to foreign MBeanServers and performs an acceptable job of locating and serving the bytecode (in the form of a byte[]) to requesting classloaders.
  • Gmx automatically detects when remoting will be required and starts the ReverseClassLoader and installs the remote counterpart (RemotableMBeanServer) on the foreign MBeanServer. (This means that some of the code in this example is redundant).
The major barrier I ran into was discovering that it is quite difficult to acquire the bytecode (the class representation in the form of bytes) of a closure compiled on the fly. Excuse my use of an imprecise term like "on the fly". What I mean is that if a closure undergoes a formal compilation process, such as using groovyc or a CompilationUnit, then the class itself is accessible from the class's code source URL in the form of a stream of bytes. However, if you define an inline closure in an inline script and execute it (such as you might in GroovyConsole), the class appears to have a "fake" code source URL that cannot be read from. I am not that familiar with the internals of Groovy so this needs further explanation by way of an example:


The output of this script, when run in GroovyConsole, is as follows:

Class [Name:ClosureFactory$_getClosure_closure1] Bytes:1192 Interfaces: [interface org.codehaus.groovy.runtime.GeneratedClosure] 
Hello Venus 
Exception:/groovy/shell (The system cannot find the file specified) 
Class [Name:ConsoleScript5$_run_closure1] Bytes:[] Interfaces: [interface org.codehaus.groovy.runtime.GeneratedClosure] 
Hello Jupiter 


In lines 4..6, the example defines a class that creates a closure and returns it from a call to getClosure(). Using the CompilationUnit is the equivalent of using groovyc. The bytecode of the closure is then read using clozure.getClass().getProtectionDomain().getCodeSource().getLocation().getBytes(), and in this first instance the read is successful. However, when the script defines an "on the fly" closure on line 16 (the closure works fine, as can be seen on line 26), the same method does not work. The difference is that the bytecode for the "compiled" closure is written to disk. The value of the URL returned by clozureClass.getProtectionDomain().getCodeSource().getLocation() is file:/home/nwhitehe/groovy/groovy-1.8.5/./ and the following can be seen in that directory:

-rw-r--r-- 1 nwhitehe nwhitehe 5893 2012-01-13 11:54 ClosureFactory.class 
-rw-r--r-- 1 nwhitehe nwhitehe 2454 2012-01-13 11:54 ClosureFactory$_getClosure_closure1.class 

For the "on the fly" closure on the other hand, the script gets an exception when attempting to read the bytes from the code source URL pointing to a file /groovy/shell which does not exist. The bytecode disappears off into the aether.
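
Plain Java has a close analogue of this problem, which may help if you are not a Groovy person. The sketch below is my own illustration (the class name ClassBytesProbe is hypothetical): it looks up a class's bytecode as a class path resource. That works for a class that was compiled ahead of time, such as java.lang.String, but a lambda's class is spun up at runtime and has no backing .class resource, so the lookup comes back empty.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper: reads a class's bytecode from its class loader, if it is there.
final class ClassBytesProbe {

    /** Returns the bytes of the .class resource for the type, or null if none exists. */
    static byte[] classBytes(Class<?> type) throws IOException {
        String resource = type.getName().replace('.', '/') + ".class";
        ClassLoader loader = type.getClassLoader();
        // Classes loaded by the bootstrap loader report a null class loader.
        try (InputStream in = (loader == null)
                ? ClassLoader.getSystemResourceAsStream(resource)
                : loader.getResourceAsStream(resource)) {
            if (in == null) {
                return null; // no resource backs this class
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            return out.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        Runnable onTheFly = () -> { };
        byte[] compiled = classBytes(String.class);
        byte[] generated = classBytes(onTheFly.getClass());
        System.out.println("String bytes found: " + (compiled != null));
        System.out.println("Lambda bytes found: " + (generated != null));
    }
}
```

The mechanism differs from Groovy's (there is no fake code source URL involved), but the symptom is the same: the class exists and runs, yet there is nowhere to read its bytes from.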

There are a few hints on this challenge in a Jira ticket filed at the end of 2010 titled Ability to get class bytes of closures at runtime, including nested closures (for remote control). RemoteControl is a groovy package for groovy closure remoting, so this seemed like a good place to start, but the documentation [more concisely than I did above] alerts the reader to the same problem:
The remote execution mechanism works by sending the definition of the closure class to the server. It does this by finding the corresponding .class file for the closure on the class path. This means that there must be a .class file for the closure on the class path for the closure to be able to be remotely executed. Closure's whose class has been generated dynamically at runtime are currently not supported.
The Jira issue also points to a script by Guillaume Laforge that outlines a method of finding nested closures with an advisory that this might be useful in resolving the problem, but the script uses a CompilationUnit to acquire the bytecode so it was not clear to me if that strategy would work.

Aside from using the Class/ProtectionDomain/CodeSource/Location.getBytes method to get class bytecode, another path I tried was Javassist, which did not work either: it returned a null CtClass when the class was requested from the ClassPool.

The solution that ended up working was using a ClassFileTransformer. This interface is defined in the java.lang.instrument package and an instance of it can be registered with the JVM's Instrumentation instance. It has a single method:

byte[] transform(ClassLoader loader, String className, Class classBeingRedefined, ProtectionDomain protectionDomain, byte[] classfileBuffer)

Accordingly, a ClassFileTransformer can provide the bytecode of a class. For our purposes, it needs to:
  • Be registered before the class loads or be invoked by a request to retransform the target class.
  • Filter for the class that has been targeted.
The following is a contrived example that retrieves a flat (i.e. no nested closures) closure's class byte code. The instrumentation instance is provided by a Gmx utility class called ByteCodeRepository. (More on that later).
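
As a stand-in sketch in plain Java (this is not the Gmx code, and the class name CapturingTransformer is hypothetical), a capturing transformer could look roughly like this. It assumes you already hold an Instrumentation instance from somewhere; the sketch itself only covers the transformer.

```java
import java.lang.instrument.ClassFileTransformer;
import java.security.ProtectionDomain;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical transformer that records the bytecode of one targeted class.
final class CapturingTransformer implements ClassFileTransformer {
    // The JVM passes class names in internal form: slashes instead of dots.
    private final String targetInternalName;
    private final AtomicReference<byte[]> captured = new AtomicReference<>();

    CapturingTransformer(String targetBinaryName) {
        this.targetInternalName = targetBinaryName.replace('.', '/');
    }

    @Override
    public byte[] transform(ClassLoader loader, String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        if (targetInternalName.equals(className)) {
            // Keep a private copy of the first bytecode seen for the target.
            captured.compareAndSet(null, classfileBuffer.clone());
        }
        // Returning null tells the JVM that no transformation was performed.
        return null;
    }

    /** The captured bytecode, or null if the target has not been seen yet. */
    byte[] capturedBytes() {
        return captured.get();
    }
}
```

With an Instrumentation instance in hand, the usual sequence would be addTransformer(transformer, true), then retransformClasses(closureClass) to force a call to transform, then removeTransformer(transformer).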


In short, the class file transformer captures the bytecode of the class it is configured to capture. Once the bytecode has been captured, the transformer is unregistered. In order to trigger a call to the transformer, the instrumentation instance requests a retransform on the closure's class. Acquiring an instrumentation can be simplified, but the underlying mechanics are complicated. The JVM's instrumentation instance is not automatically created or available. It is typically created when the JVM is started with a -javaagent JVM startup option.

Alternatively, it is possible to use the Java Attach API to load a Java Agent into a running JVM which will trigger the creation of an instrumentation instance. This step is implemented by the Gmx utility class LocalAgentInstaller's static method getInstrumentation().

That's pretty much all that is required to get a reference to the instrumentation with some caveats:
  • The Attach API is only supported in Java 1.6+
  • The Attach API is variably implemented by different JVMs. You may have trouble with IBM JVMs, for example.
  • The Attach API is contained in the JDK's tools.jar so if you're using the JRE, you need to specifically add tools.jar to the classpath.
Gmx uses a reflective wrapper around the Attach API to avoid sticky compilation sores, but the details can be seen in the eponymous classes in the org.helios.vm package.

The next piece of Gmx is the ByteCodeRepository class. It is a singleton and its functionality is as follows:
  • It is a ClassFileTransformer that targets classes that implement org.codehaus.groovy.runtime.GeneratedClosure. These are typically the closures we're looking for. In my parlance, GeneratedClosure means "compiled on the fly". See this gist for the transformer basics.
  • It is a caching repository and cross-indexer for closure classes, bytecode and class names (binary and resource).
  • It automatically loads an instrumentation instance, so one can ignore the LocalAgentInstaller when using the ByteCodeRepository.
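
The caching and cross-indexing part is the easiest to picture. Here is a minimal sketch, assuming a simple map keyed by binary class name; the class name ClosureByteCodeCache and its methods are hypothetical and are not the ByteCodeRepository API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical cache: bytecode indexed by binary name, also reachable by resource name.
final class ClosureByteCodeCache {
    private final Map<String, byte[]> byBinaryName = new ConcurrentHashMap<>();

    /** Stores a copy of the bytecode under a binary name such as "a.b.C$_run_closure1". */
    void put(String binaryName, byte[] bytecode) {
        byBinaryName.put(binaryName, bytecode.clone());
    }

    /** Looks up by binary name; returns null if nothing was captured for it. */
    byte[] getByBinaryName(String binaryName) {
        byte[] bytecode = byBinaryName.get(binaryName);
        return bytecode == null ? null : bytecode.clone();
    }

    /** Looks up by resource name such as "a/b/C$_run_closure1.class". */
    byte[] getByResourceName(String resourceName) {
        String name = resourceName.endsWith(".class")
                ? resourceName.substring(0, resourceName.length() - ".class".length())
                : resourceName;
        return getByBinaryName(name.replace('/', '.'));
    }
}
```

The resource-name lookup matters because a remote class loader typically asks for a class in resource form, while the transformer and the rest of the code tend to deal in binary names.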

Capturing a closure's bytecode is now simplified to:


There's one subtlety left [that will be discussed here] in the quest to get bytecode for remoting closures and that's the rider on the Groovy Jira issue I mentioned, namely "including nested closures". Consider a closure like this that prints the declared method signatures for each class in an array:

def Class[] classes = .....; 
classes.each() { 
     it.getDeclaredMethods().each() { 
          println it.toGenericString(); 
     } 
}

That's a closure within a closure, and they're two different classes. The bytecode for the outer closure does not contain the bytecode for the inner, so the earlier examples have hidden this problem. The class file transformer will still see the inner closure, and even though the name of the inner closure class is not as obvious (we could derive or guess it), here's why you don't need it:
  1. For the purposes of remoting, so long as the ByteCodeRepository is installed, it will have captured all closures and indexed them by class name. When the remote class loader gets a serialized closure to invoke, it will know the names of the classes it needs and will request each of them from the ReverseClassLoader, which in turn looks them up in the ByteCodeRepository.
  2. The following code prints (among other things that my optimizing editor has elided) the names of the loaded generated closures in this modified example. 


The output is:

Generated Closure: ClosureFactory3$_getClosure_closure1
Generated Closure: ClosureFactory3$_getClosure_closure1_closure2
All this nonsense was dedicated to getting all the bytecode necessary to remote closures, which I have not addressed at all, but I will in the next post.

Cheers.



Friday, January 06, 2012

GroovyMX: A Groovy JMX Client


I bootstrapped a new project on Github called GroovyMX (or just gmx). It is a monitoring oriented API but you may find it useful for various things. In the spirit of Groovy SQL, I am attempting to provide a JMX client API that is rich in functionality, terse in code and that extends the natural abilities of the native Java client.

This is a quick example which demonstrates how to connect to a remote MBeanServer and list the committed memory in bytes of each of the JVM's Memory Pools:


The output of this script is:

java.lang:type=MemoryPool,name=PS Eden Space:  402653184
java.lang:type=MemoryPool,name=PS Survivor Space:   16777216
java.lang:type=MemoryPool,name=Code Cache:  3407872
java.lang:type=MemoryPool,name=PS Perm Gen: 84738048
java.lang:type=MemoryPool,name=PS Old Gen:  268435456
 
Briefly, what this code is doing:
  1. This is usually the only import you will need.
  2. The Gmx class represents an MBeanServer, or more specifically, an MBeanServerConnection. There are a few ways to acquire one, depending on the situation, but in this case, I am connecting to a remote MBeanServer using its JMXServiceURL.
  3. The mbeans method has several overloads. In this case, I am providing an ObjectName pattern that will match to all the JVM's MemoryPool MXBeans, and a closure which executes for each ObjectName returned. 
  4. The closure is passed an instance of a MetaMBean which is notionally a proxy that combines the ObjectName of the MBean that it represents, and a MBeanServerConnection to the MBeanServer where the MBean is registered (the Gmx). In as much as possible, MetaMBeans act just like regular Pojos (or Pogos)  so the MBean attributes are accessed as simple properties and MBean operations are invoked like regular methods.
  5. The MetaMBean also has various local properties which are also accessed as simple properties in a Pogo. An example of this is objectName in line 4.
  6. The MemoryPool MBeans publish an attribute called Usage which is an instance of CompositeData. Fortunately, Groovy allows the composite sub-values to be referenced with simple dot notation, so the expression it.Usage.committed retrieves the nested value from the Usage composite structure that is keyed by the key committed.
  7. Keen observers might wonder why Usage is cased that way. This is because the Groovy property specifier Usage is directly mapped to the MemoryPool MBean attribute Usage. Since it is perfectly legal for an MBean to have two separate attributes Foo and foo, the MetaMBean only honors the correctly specified case.
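
For comparison, this is roughly what the same lookup costs with the native Java client. It is a sketch using only the standard javax.management API against whatever MBeanServerConnection you pass in; the class and method names (MemoryPoolReport, committedByPool) are mine.

```java
import java.lang.management.ManagementFactory;
import java.util.Map;
import java.util.TreeMap;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.openmbean.CompositeData;

// Hypothetical helper: committed bytes per memory pool, keyed by ObjectName.
final class MemoryPoolReport {

    static Map<String, Long> committedByPool(MBeanServerConnection connection) throws Exception {
        Map<String, Long> committed = new TreeMap<>();
        // The pattern matches every MemoryPool MXBean in the target JVM.
        ObjectName pattern = new ObjectName("java.lang:type=MemoryPool,*");
        for (ObjectName pool : connection.queryNames(pattern, null)) {
            // Attribute names are case sensitive: "Usage", not "usage".
            CompositeData usage = (CompositeData) connection.getAttribute(pool, "Usage");
            if (usage != null) { // Usage can be null for a pool that is no longer valid
                committed.put(pool.toString(), (Long) usage.get("committed"));
            }
        }
        return committed;
    }

    public static void main(String[] args) throws Exception {
        // Uses the local platform MBeanServer; a remote connection works the same way.
        committedByPool(ManagementFactory.getPlatformMBeanServer())
                .forEach((name, bytes) -> System.out.println(name + ": " + bytes));
    }
}
```

The explicit pattern, cast and key lookup are exactly the ceremony that the mbeans method and it.Usage.committed are meant to hide.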
One of the tricky parts of making MetaMBean behave like a regular Pojo is that MBean operation invocations require the exact signature of the operation to be passed as an argument, which is very exacting and JMX can be quite fussy in this regard. Unlike regular Java and Groovy, there is no implicit [un/]boxing or inheritance easing done on your behalf. Consider these signature pairs:

void foo(int)                void foo(Integer)
void log(CharSequence)       void log(String)

If these were MBeanOperations, they would represent two different signatures, so the tricky part is inspecting the MetaMBean invocation name and arguments and executing a multidimensional pattern match against the MBean provided MBeanOperationInfos.
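
The fussiness is easy to reproduce with a platform MXBean. The sketch below (class and method names are hypothetical) invokes the Threading MXBean's getThreadInfo(long) operation twice with the same argument, once with the primitive signature and once with the boxed one, and reports whether each invocation was accepted.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.ReflectionException;

// Hypothetical probe showing that JMX matches operation signatures exactly.
final class ExactSignatureProbe {

    /** Invokes getThreadInfo with the given signature; false if JMX rejects the signature. */
    static boolean accepts(String... signature) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName threading = new ObjectName(ManagementFactory.THREAD_MXBEAN_NAME);
        // The argument is necessarily a boxed Long either way.
        Object[] arguments = { Thread.currentThread().getId() };
        try {
            server.invoke(threading, "getThreadInfo", arguments, signature);
            return true;
        } catch (ReflectionException e) {
            // No operation with that name and exactly that signature.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("long:           " + accepts("long"));
        System.out.println("java.lang.Long: " + accepts("java.lang.Long"));
    }
}
```

Only the "long" signature matches the declared operation. The boxed variant is rejected even though the argument object is identical, which is why the MetaMBean has to work out the right signature from the MBeanOperationInfos itself.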

Gmx is also integrated with the Java Attach API so another way of acquiring a Gmx instance is to specify the PID of the JVM you want to attach to. The following example illustrates a script that discovers all (Attach API compatible) JVMs on the local machine and then prints the MBeanServerID attribute for each from the MBeanServerDelegate MBean.


The output of this script is:

helios104_1324051453215
helios104_1325860886519
helios104_1325876067431
helios104_1325782180150

Not super interesting, but I think the brevity is great.
This last example demonstrates some of the powerful optimizations of the Groovy remoting implemented by Gmx. To contrive an example, consider determining the total number of thread blocks across all threads in a JVM. Rather than retrieving every ThreadInfo from the ThreadMXBean and computing the total locally, I install the Gmx remoting on the remote MBeanServer and pass a script to perform the computation on the remote JVM and then return the result.
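
To make the computation itself concrete, here it is in plain Java, run against the local JVM. This is only a sketch of the arithmetic that gets pushed to the remote side, not the Gmx script, and the class name BlockedCountTotal is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical local version of the "total thread blocks" computation.
final class BlockedCountTotal {

    static long totalBlockedCount() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long total = 0;
        // One ThreadInfo per thread id; entries are null for threads that have died.
        for (ThreadInfo info : threads.getThreadInfo(threads.getAllThreadIds())) {
            if (info != null) {
                total += info.getBlockedCount();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("Total blocked count: " + totalBlockedCount());
    }
}
```

Run through a plain JMX connection, every ThreadInfo would have to be serialized and sent back to the client first; run on the remote JVM, only a single long crosses the wire.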

The raw script passing is a bit awkward, so I have been working on seamless remote closure invocation (see Issue #15), and native closures are now supported. See here and here.

There are several more features which are complete, defective, in the roadmap/documentation or just in my imagination, but if you are interested, please check out the Gmx GitHub Site's Source Code, Issues and Wiki. I just started this recently so it's not ready for a release, but you can download a snapshot from my Cloudbees snapshot repository.

Snapshots:
The dependencies can be viewed in the Gmx Maven Pom.

There are some additional examples in the project unit tests:
If you have any feedback, please drop a comment on this blog.

Cheers.

Monday, August 15, 2011

Java 7 for .Net - IKVM Not Slouching

I've always been impressed with IKVM, a compiler that converts Java Byte Code into .NET CLR byte code in the form of assemblies. It is astonishing how much core (and a lot of non-core) Java re-compiles into .Net with relatively little effort.

Shortly after the release of Java 7, the IKVM team has already released a development snapshot of IKVM supporting it.

After a massive amount of work, we finally have a new development snapshot based on OpenJDK 7 b147. There is still a lot of work to do to implement all the new functionality, but at least all the OpenJDK code has now been integrated.
Nice work guys.

Tuesday, July 12, 2011

JBoss AS7 Wicked Fast Startup, But Has Serious Deal Breaker ?

The new JBoss AS7 Release definitely has a blazingly fast startup. When I ran it for the first time:

14:26:24,770 INFO  [org.jboss.as] (Controller Boot Thread) JBoss AS 7.0.0.Final "Lightning" started in 1489ms

And not for nothing, it has a wicked fast shutdown too !

14:39:03,703 INFO  [org.jboss.as] JBoss AS 7.0.0.Final "Lightning" stopped in 7ms

A colleague commented:


"Yes, I've already been heckling the Tomcat guys, since they staked all their arguments on startup times. :)"

So that's all very exciting (I mean the whole AS7) but I have to say that the one deal breaker is the absence of JMX-Console. (Note: I might be wrong here, but it sure doesn't look like it's in there) For those not in the know, this is a very spartan, but genius and super functional admin and development console for JBoss AS. I always loved the darn thing. It cuts straight to the chase giving you the info and operation invocation links you need without any of the fluff other application servers seem to insist on putting in.

The JNDIViewer alone was incredibly useful and if more widely used would preempt a bunch of problems because developers can actually see what's bound in JNDI !  What a concept !

So has JMX been eschewed completely ? I seriously doubt it. Uh... wait...  what is that.... 
I mean, I seriously hope not, that's what it is.

At any rate, it seems the platform MBeanServer has a few bits and pieces registered in it, and I suspect (or is it hope ?)  there is more that is optionally configurable.




Thursday, May 19, 2011

Graphite Server Virtual Box Appliance

I have made some additional improvements to my automated Graphite installer. Last time I posted the install script, it was admittedly missing several things, but the new one goes from ./install.sh to the Graphite web console in one go.

At any rate, the new installer has significantly eased the process of creating VM appliances for Graphite which I have made available for download if you are interested in taking Graphite for a test drive, or you need a pre-built development environment. Since the VMs, as configured, are resource limited, I am not sure if they are suitable for any production use, but if you bump up the CPUs, the memory and modify the setup to use some fast native (or SAN) disks for the Whisper storage, I imagine they would be pretty decent.

Here are some details you should know:

  • Created with Virtual Box 4.0.6
  • The OS user is helios. The password is helios.
  • The Graphite admin user is helios. The password is helios. Now that I think about it, the user's email address is graphite@heliosdev.org so you might want to review my summary below on how you can change this.
  • Once you start the VM, the Graphite Console is available at http://localhost and remotely at whatever IP you designate. (Out of the box, the NIC is configured for bridging).
  • The Graphite install script and supporting files are in /home/helios/graphite.
  • The carbon listener is started by /etc/init.d/carbon which runs the process as user www-data and I recommend you stick to using it. If you start carbon directly with sudo, the process will run as root and may create files that are subsequently not readable by www-data.
  • The carbon [linetext] listener listens on port 2003.

Changing the Graphite Admin User

The user accounts are stored in a sqlite database file, /opt/graphite/storage/graphite.db, specifically in a table called auth_user. You can run the sqlitebrowser GUI tool and edit the file directly (using sudo). Or, you can delete the file (using sudo) and re-initialize as follows:

  1. cd /opt/graphite/webapp/graphite 
  2. sudo python manage.py syncdb
This will recreate the database and prompt you to create a new admin user.

And I thank HostMonster for their nicely priced hosting that affords me the ability to serve these files.

Next post perhaps I will explain:
  • Why did I build this appliance using Desktop instead of Server ?
  • Where is the VMware version of this appliance ?
  • Where and when is the OpenTrace client for passing collected JMX metrics to Graphite ?

Saturday, May 14, 2011

Graphite - Great App, Laborious Install (Until now ?)

Graphite is a great APM supporting tool. It is basically (and broadly speaking)  a python based server (Linux only AFAICT) with 3 components:
  1. Whisper: An RRD like database which is quite fast.
  2. Carbon: A high performance cache and network listener that listens for submitted data and stashes it in Whisper.
  3. Graphite: A web application that allows you to visualize and report data historically or in real time.
The data submission protocol and data format are very simple so wherever you collect performance data from, it's easy to deliver it to Graphite. You can optionally submit data through RabbitMQ but I have observed that the basic text-over-socket submission performs better since you can batch up many metrics in one go, while the Rabbit submission is one metric per message.
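
To give a feel for how simple it is, here is a minimal sketch of a batching sender in Java. It assumes the carbon plaintext listener, where each metric is a single line of the form "path value timestamp" with the timestamp in epoch seconds. The class name, metric names and host are made up for illustration.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical sender for carbon's plaintext listener.
final class GraphiteLineWriter {

    /** One metric per line: "<path> <value> <epoch seconds>" followed by a newline. */
    static String formatLine(String path, double value, long epochSeconds) {
        return path + " " + value + " " + epochSeconds + "\n";
    }

    /** Writes a whole batch of pre-formatted lines over a single connection. */
    static void send(String host, int port, List<String> lines) throws IOException {
        try (Socket socket = new Socket(host, port);
             OutputStream out = socket.getOutputStream()) {
            for (String line : lines) {
                out.write(line.getBytes(StandardCharsets.UTF_8));
            }
            out.flush();
        }
    }
}
```

A call such as GraphiteLineWriter.send("localhost", 2003, lines) would then push the whole batch in one go, which is the advantage over the one-metric-per-message Rabbit route.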

This python language is pretty interesting. Not that I am trying to learn it, but perusing the .py files in the Graphite distro is edifying and for a java person, the general working bits-and-pieces of python are not hard to understand. (Now writing something useful.... different story).

After the umpteenth DZone posting from some code rock-star cum superhero admonishing us all to learn multiple languages, my feeling of inadequacy led me to consider delving into python. Sadly, I chose Erlang, having heard that women in bars are the most impressed with that. I'll save that story for another day, but suffice it to say that after I recovered from that blazing flame-out, I picked Groovy. Yes, it is still Java based, but the rock-stars let it slide because it has closures and befriends both sides of the statically-typed-versus-dynamically-typed interstellar war. Apparently my prior list up until Groovy which was Java 1.3, Java 1.4, Java 1.5, Java 1.6 and bash (for a total of 5) was not passing muster.

I have been adapting OpenTrace to optionally send collected metrics to Graphite. Naturally, since I am A.R. about low level metrics, I publish the tracer's metrics through JMX.


The  idea is the tracer accumulates metrics until t time elapses, or n metrics accumulate, and then the buffer is flushed up to Carbon. So can you see the metrics in real time ? Well yes. (Nice of you to ask). The Graphite Dev team explains:


How are Graphite graphs always real-time, even when carbon hasn't had time to write it's cached data to disk yet?

Upon receiving a rendering request, the Graphite webapp simultaneously retrieves data for the requested metrics from the disk and from carbon's cache via a simple cache query socket that carbon-cache provides. Graphite then combines these two sources of data points into a single series, which is then rendered. This ensures that graphs are always real-time, even when the data hasn't been written to disk yet.



I have been working on an Ubuntu 10.10 VirtualBox appliance that I can distribute because the installation of Graphite is a bit laborious, and each time I do it, I always forget one thing or another. The installation documentation is decent, but there's a lot of moving parts here, and many of them differ by Linux distro, so I can't really fault the Graphite dev team in this regard. I saw something somewhere about an effort to put together Linux packages, but I can't find reference to it now.

At any rate, after the 23rd consecutive time of installing Graphite (and all the requisite packages) onto my Ubuntu guest, I think I finally nailed down a script that does most of the hard work, and it is here for your productivity and reading pleasure:

Updated: Friday May 20, 2011