Thursday, May 19, 2011

Graphite Server Virtual Box Appliance

I have made some additional improvements to my automated Graphite installer. Last time I posted the install script, it was admittedly missing several things, but the new one goes from ./install.sh to the Graphite web console in one go.

At any rate, the new installer has significantly eased the process of creating VM appliances for Graphite, which I have made available for download if you are interested in taking Graphite for a test drive or need a pre-built development environment. Since the VMs, as configured, are resource limited, I am not sure they are suitable for any production use, but if you bump up the CPUs and the memory and modify the setup to use some fast native (or SAN) disks for the Whisper storage, I imagine they would be pretty decent.

Here are some details you should know:

  • Created with Virtual Box 4.0.6
  • The OS user is helios. The password is helios.
  • The Graphite admin user is helios. The password is helios. Now that I think about it, the user's email address is graphite@heliosdev.org, so you might want to review my summary below on how to change this.
  • Once you start the VM, the Graphite Console is available at http://localhost and remotely at whatever IP you designate. (Out of the box, the NIC is configured for bridging.)
  • The Graphite install script and supporting files are in /home/helios/graphite.
  • The carbon listener is started by /etc/init.d/carbon, which runs the process as user www-data, and I recommend you stick to using it. If you start carbon directly with sudo, the process will run as root and may create files that are subsequently not readable by www-data.
  • The carbon [linetext] listener listens on port 2003.

Changing the Graphite Admin User

The user accounts are stored in a SQLite database file, /opt/graphite/storage/graphite.db, specifically in a table called auth_user. You can run the sqlitebrowser GUI tool (using sudo) and edit the file directly. Or you can delete the file (using sudo) and re-initialize it as follows:

  1. cd /opt/graphite/webapp/graphite 
  2. sudo python manage.py syncdb
This will recreate the database and prompt you to create a new admin user.

And I thank HostMonster for their nicely priced hosting that affords me the ability to serve these files.

Next post perhaps I will explain:
  • Why did I build this appliance using Desktop instead of Server ?
  • Where is the VMware version of this appliance ?
  • Where and when is the OpenTrace client for passing collected JMX metrics to Graphite ?

Saturday, May 14, 2011

Graphite - Great App, Laborious Install (Until now ?)

Graphite is a great APM supporting tool. Broadly speaking, it is a Python-based server (Linux only, AFAICT) with 3 components:
  1. Whisper: An RRD-like database which is quite fast.
  2. Carbon: A high performance cache and network listener that listens for submitted data and stashes it in Whisper.
  3. Graphite: A web application that allows you to visualize and report data historically or in real time.
The data submission protocol and data format are very simple, so wherever you collect performance data from, it's easy to deliver it to Graphite. You can optionally submit data through RabbitMQ, but I have observed that the basic text-over-socket submission performs better, since you can batch up many metrics in one go, while the Rabbit submission is one metric per message.
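
To make the batching point concrete, here is a minimal Java sketch of the plaintext submission: one line per metric in the form "path value timestamp" (timestamp in epoch seconds), with many lines written over a single connection. The class name, method names and metric names are all made up for illustration, and port 2003 is the line listener port of the appliance described in the post above. Treat it as a sketch rather than a finished client.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class CarbonBatchSender {

    /** Formats one metric as a plaintext line: "<path> <value> <epochSeconds>\n". */
    public static String formatLine(String path, double value, long epochSeconds) {
        return path + " " + value + " " + epochSeconds + "\n";
    }

    /** Concatenates many metrics (sharing one timestamp) into a single payload. */
    public static String formatBatch(Map<String, Double> metrics, long epochSeconds) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Double> e : metrics.entrySet()) {
            sb.append(formatLine(e.getKey(), e.getValue(), epochSeconds));
        }
        return sb.toString();
    }

    /** Writes the whole batch over one socket connection. */
    public static void send(String host, int port, String payload) throws IOException {
        try (Socket socket = new Socket(host, port);
             OutputStream out = socket.getOutputStream()) {
            out.write(payload.getBytes(StandardCharsets.US_ASCII));
            out.flush();
        }
    }

    public static void main(String[] args) throws IOException {
        Map<String, Double> metrics = new LinkedHashMap<>();
        metrics.put("test.heap.used", 42.5);   // hypothetical metric names
        metrics.put("test.threads", 17.0);
        String payload = formatBatch(metrics, System.currentTimeMillis() / 1000L);
        System.out.print(payload);
        // Only connect when a host is given, so the sketch runs without a server.
        if (args.length > 0) {
            send(args[0], 2003, payload);
        }
    }
}
```

Run it with no arguments to just print the payload, or pass the Graphite host name as the first argument to actually send the batch.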

This Python language is pretty interesting. Not that I am trying to learn it, but perusing the .py files in the Graphite distro is edifying, and for a Java person, the general working bits-and-pieces of Python are not hard to understand. (Now writing something useful.... different story).

After the umpteenth DZone posting from some code rock-star cum superhero admonishing us all to learn multiple languages, my feeling of inadequacy led me to consider delving into Python. Sadly, I chose Erlang, having heard that women in bars are the most impressed with that. I'll save that story for another day, but suffice it to say that after I recovered from that blazing flame-out, I picked Groovy. Yes, it is still Java based, but the rock-stars let it slide because it has closures and befriends both sides of the statically-typed-versus-dynamically-typed interstellar war. Apparently my prior list up until Groovy, which was Java 1.3, Java 1.4, Java 1.5, Java 1.6 and bash (for a total of 5), was not passing muster.

I have been adapting OpenTrace to optionally send collected metrics to Graphite. Naturally, since I am A.R. about low level metrics, I publish the tracer's metrics through JMX.


The idea is that the tracer accumulates metrics until t time elapses or n metrics accumulate, and then the buffer is flushed up to Carbon. So can you see the metrics in real time ? Well yes. (Nice of you to ask). The Graphite Dev team explains:
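
The flush-on-time-or-count idea can be sketched roughly as below. This is not the OpenTrace code; MetricBuffer, Flusher and all the parameter names are hypothetical, and the clock is passed in explicitly just to keep the sketch easy to test. Note that the age check here only happens when a metric is added, so a real implementation would presumably also need a timer to flush a buffer that has gone quiet.

```java
import java.util.ArrayList;
import java.util.List;

public class MetricBuffer {

    /** Hypothetical callback that receives each flushed batch (e.g. a socket writer). */
    public interface Flusher {
        void flush(List<String> batch);
    }

    private final int maxSize;       // "n metrics"
    private final long maxAgeMs;     // "t time"
    private final Flusher flusher;
    private final List<String> buffer = new ArrayList<String>();
    private long lastFlushMs;

    public MetricBuffer(int maxSize, long maxAgeMs, Flusher flusher, long nowMs) {
        this.maxSize = maxSize;
        this.maxAgeMs = maxAgeMs;
        this.flusher = flusher;
        this.lastFlushMs = nowMs;
    }

    /** Buffers one metric line and flushes if either threshold has been reached. */
    public synchronized void add(String line, long nowMs) {
        buffer.add(line);
        if (buffer.size() >= maxSize || nowMs - lastFlushMs >= maxAgeMs) {
            flushNow(nowMs);
        }
    }

    /** Hands a copy of the buffered lines to the flusher and resets the buffer. */
    public synchronized void flushNow(long nowMs) {
        if (!buffer.isEmpty()) {
            flusher.flush(new ArrayList<String>(buffer));
            buffer.clear();
        }
        lastFlushMs = nowMs;
    }

    public static void main(String[] args) {
        MetricBuffer mb = new MetricBuffer(2, 1000L,
                batch -> System.out.println("flushing " + batch.size() + " metrics"),
                System.currentTimeMillis());
        mb.add("test.a 1 0", System.currentTimeMillis());
        mb.add("test.b 2 0", System.currentTimeMillis()); // reaches n = 2, triggers a flush
    }
}
```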


How are Graphite graphs always real-time, even when carbon hasn't had time to write it's cached data to disk yet?

Upon receiving a rendering request, the Graphite webapp simultaneously retrieves data for the requested metrics from the disk and from carbon's cache via a simple cache query socket that carbon-cache provides. Graphite then combines these two sources of data points into a single series, which is then rendered. This ensures that graphs are always real-time, even when the data hasn't been written to disk yet.
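
As a rough illustration of the combining step described in that answer (and not the Graphite webapp's actual code, which is Python), here is a small Java sketch that merges points read from disk with points still sitting in the cache, keyed by timestamp. The class and method names are hypothetical, and letting the cached value win when both sources hold the same timestamp is my assumption.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class SeriesMerge {

    /**
     * Merges persisted and cached data points into one series ordered by timestamp.
     * Assumption: when both sources have a point for the same timestamp, the cached one wins.
     */
    public static TreeMap<Long, Double> merge(Map<Long, Double> fromDisk,
                                              Map<Long, Double> fromCache) {
        TreeMap<Long, Double> merged = new TreeMap<Long, Double>(fromDisk);
        merged.putAll(fromCache);
        return merged;
    }

    public static void main(String[] args) {
        Map<Long, Double> disk = new HashMap<Long, Double>();
        disk.put(100L, 1.0);
        disk.put(160L, 2.0);
        Map<Long, Double> cache = new HashMap<Long, Double>();
        cache.put(220L, 3.0);   // not yet written to disk
        System.out.println(merge(disk, cache));
    }
}
```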



I have been working on an Ubuntu 10.10 VirtualBox appliance that I can distribute, because the installation of Graphite is a bit laborious and each time I do it, I forget one thing or another. The installation documentation is decent, but there are a lot of moving parts here, and many of them differ by Linux distro, so I can't really fault the Graphite dev team in this regard. I saw something somewhere about an effort to put together Linux packages, but I can't find a reference to it now.

At any rate, after the 23rd consecutive time of installing Graphite (and all the requisite packages) onto my Ubuntu guest, I think I finally nailed down a script that does most of the hard work, and it is here for your productivity and reading pleasure:

Updated: Friday May 20, 2011

Monday, May 09, 2011

Spin vs. Sleep vs. Join

As I was reading about why Esper sometimes implements Spin Locks instead of thread suspension, I wondered what the actual benefits were. I put together a quick and very simple test to measure elapsed time and CPU utilization of 3 different suspension operations (using java version "1.6.0_21" on Windows 7):
  1. Spin: Compute a deadline, until (the current time plus the suspend time in ms.), and loop until the current time is equal to or greater than until.
  2. Sleep: Call Thread.sleep()
  3. Join: Call Thread.currentThread().join() with a timeout (an untimed join on the current thread would never return).
This was a single thread, called in main, suspending for 5 ms. in a loop of 100,000 iterations, for an ideal total of 500,000 ms. As it turns out, Spin is the absolute most precise of the three, running with zero loss of precision, but at a cost of 499,905 ms. of CPU time.

Sleep lost the most precision with 1,358 ms. worth of latency and a cost of 31 ms. of CPU time.

Join seems to be the best compromise with zero CPU utilization and only 7 ms. loss of precision.

So the moral of the story, it seems to me, is to always use Join unless you need a hyper-precise wake-up time. Furthermore, it would seem that Sleep is the worst way to achieve a suspension, with much lower precision than Join at an infinitely higher CPU cost.
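
For anyone who wants to try this at home, a test along these lines could look roughly like the sketch below. To be clear, this is a reconstruction and not the code that produced the output in this post; the class and method names are made up, and using ThreadMXBean.getCurrentThreadCpuTime() (which reports nanoseconds) for the CPU measurement is my assumption.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class SuspendBench {

    /** One way of suspending the current thread for roughly ms milliseconds. */
    public interface Suspender {
        void suspend(long ms) throws InterruptedException;
    }

    public static final Suspender SPIN = ms -> {
        long until = System.currentTimeMillis() + ms;
        while (System.currentTimeMillis() < until) { /* busy-wait */ }
    };
    public static final Suspender SLEEP = ms -> Thread.sleep(ms);
    // Timed join on the current thread: waits up to ms because the thread is still alive.
    public static final Suspender JOIN = ms -> Thread.currentThread().join(ms);

    /** Returns {elapsed wall-clock ms, CPU time in ns} for the given number of suspensions. */
    public static long[] run(Suspender s, long ms, int loops) {
        ThreadMXBean tmx = ManagementFactory.getThreadMXBean();
        long cpuStart = tmx.getCurrentThreadCpuTime();
        long start = System.currentTimeMillis();
        try {
            for (int i = 0; i < loops; i++) {
                s.suspend(ms);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // stop early but keep the interrupt flag
        }
        long elapsed = System.currentTimeMillis() - start;
        long cpu = tmx.getCurrentThreadCpuTime() - cpuStart;
        return new long[] { elapsed, cpu };
    }

    public static void main(String[] args) {
        int loops = 200; // far fewer than 100,000 so the sketch finishes quickly
        long[] spin = run(SPIN, 5, loops);
        long[] sleep = run(SLEEP, 5, loops);
        long[] join = run(JOIN, 5, loops);
        System.out.println("Spin:   " + spin[0] + "/" + spin[1]);
        System.out.println("Sleep:  " + sleep[0] + "/" + sleep[1]);
        System.out.println("Join:   " + join[0] + "/" + join[1]);
    }
}
```

The numbers it prints will differ by OS, JVM and timer resolution, so don't expect them to match the figures quoted in this post.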

Output (elapsed time in ms. / CPU time in ns.):

SpinLock vs. Sleep vs. Join
Warming Up....
Warm Up Complete
Starting Spin Test
Spin:   500000/499905204500
Starting Sleep Test
Sleep:  501358/31200200
Starting Join Test
Join:   500007/0

====  Update ====

I tested the exact same code using java version "1.6.0_24" on Ubuntu 10.10 X64 and the results were a little different. Spin did have some precision loss (but not much). The biggest difference was that Join went from zero CPU cost to a cost of 3,970 ms., making it more expensive than Sleep and slightly less precise. Not sure what to make of that, except that I need to rerun the tests to make sure there is no skew from background processes or something.

SpinLock vs. Sleep vs. Join
Warming Up....
Warm Up Complete
Starting Spin Test
Spin:   500034/499590000000
Starting Sleep Test
Sleep:  516430/3680000000
Starting Join Test
Join:   516650/3970000000