JAX-RS Client Filters

Life goes on with JAX-RS/Jersey. I wasted a couple of moments figuring out how to add custom headers to a Jersey generated JAX-RS client. Might as well write it down in the hope of saving someone a couple of minutes.

For starters, you’ll need a Client Filter that does the actual heavy(ish) lifting.

import java.io.IOException;

import javax.ws.rs.client.ClientRequestContext;
import javax.ws.rs.client.ClientRequestFilter;
 
/**
 * Add the X-GARBAGE header to all requests.
 */
public class GarbageFilter implements ClientRequestFilter {
	@Override
	public void filter(final ClientRequestContext requestContext) throws IOException {
		requestContext.getHeaders().add("X-GARBAGE", "This is added to all requests");
	}
}
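
The header value doesn’t have to be a constant, of course; nothing stops you from passing it in. A quick, hypothetical variant (same imports as above; the X-AUTH-TOKEN header and class name are made up):

public class AuthTokenFilter implements ClientRequestFilter {
	private final String token;

	public AuthTokenFilter(final String token) {
		this.token = token;
	}

	@Override
	public void filter(final ClientRequestContext requestContext) throws IOException {
		requestContext.getHeaders().add("X-AUTH-TOKEN", token);
	}
}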

And then you’ll have to register the filter with the Client(Config).

// import javax.ws.rs.client.Client;
// import javax.ws.rs.client.ClientBuilder;
// import org.glassfish.jersey.client.ClientConfig;
 
final ClientConfig clientConfig = new ClientConfig();
clientConfig.register(new GarbageFilter()); // Yes, you could use JDK8 magic :-)
 
final Client client = ClientBuilder.newClient(clientConfig);
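
To see it in action, fire off a request with the resulting client. A minimal sketch, with a made-up endpoint:

// import javax.ws.rs.core.Response;

final Response response = client.target("http://example.com/api/garbage")
		.request()
		.get(); // The filter adds X-GARBAGE before the request goes out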

And that’s all. Every request you launch using the generated client will now contain your X-GARBAGE header.

JAX-RS Client File Upload

Another hiccup in using the wadl2java client generated from a (Jersey) JAX-RS app. This time, it concerns multipart/form-data.

The method:

	@POST
	@Produces(MediaType.APPLICATION_JSON)
	@Consumes(MediaType.MULTIPART_FORM_DATA)
	public JsonFoo create(@FormDataParam("file") InputStream data, 
		@FormDataParam("file") FormDataContentDisposition fileDetail, 
		@FormDataParam("file") FormDataBodyPart bodyPart) {
			// Implementation foo bar baz
	}

The WADL exposes something that looks like this.

<method id="create" name="POST">
	<request>
		<representation mediaType="multipart/form-data"/>
	</request>
	<!-- Response omitted for the sake of brevity -->
</method>

And the generated client has a method to go along with it. Unfortunately, it gives you no hints whatsoever as to how to actually provide a file/data.

// Long class names shortened
public static Create create(Client client, URI baseURI) {
	return new Create(client, baseURI);
}
 
// The Create object contains this little gem
public <T> T postMultipartFormDataAsJson(Object input, GenericType<T> returnType);

That’s wonderful. Unfortunately, if you pass in a java.io.File, nothing happens. The client barfs.

Many DuckDuckGo-searches, StackOverflow hunts and headscratchings later, I came up with a working solution:

// import javax.ws.rs.core.GenericType;
// import javax.ws.rs.core.MediaType;
// import org.glassfish.jersey.media.multipart.FormDataMultiPart;
// import org.glassfish.jersey.media.multipart.file.FileDataBodyPart;
 
File file = new File("/path/to/your/image.jpg"); // Your file!
FormDataMultiPart form = new FormDataMultiPart();
form.field("filename", file.getName());
form.bodyPart(new FileDataBodyPart("file", file, new MediaType("image", "jpeg")));
 
Api.create(client, uri).postMultipartFormDataAsJson(form, new GenericType<CreateResponse>() {});

But wait! That won’t cut it. You also need to tell your Client that you want to use the Multipart Feature. Makes sense. If you don’t, you’ll end up with this exception.

org.glassfish.jersey.message.internal.MessageBodyProviderNotFoundException: MessageBodyWriter not found for media type=multipart/form-data, type=class org.glassfish.jersey.media.multipart.FormDataMultiPart, genericType=class org.glassfish.jersey.media.multipart.FormDataMultiPart.

Registering the MultiPartFeature on your ClientConfig fixes that:

// import org.glassfish.jersey.client.ClientConfig;
// import org.glassfish.jersey.media.multipart.MultiPartFeature;
 
final ClientConfig clientConfig = new ClientConfig();
clientConfig.register(MultiPartFeature.class);
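
As with the client filter above, the config still has to end up in your actual Client, otherwise the feature registration never takes effect:

// import javax.ws.rs.client.Client;
// import javax.ws.rs.client.ClientBuilder;

final Client client = ClientBuilder.newClient(clientConfig);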

And there you have it. File upload with JAX-RS and a wadl2java generated client.

JAX-RS vs Collections vs JSON Arrays

The Problem

Looks like I’ve managed to get myself into a pickle. I have a (Jersey) JAX-RS app, which automagically publishes a WADL. If I then generate a Client (using wadl2java), I end up with generated code that doesn’t work.

	@GET
	@Produces(MediaType.APPLICATION_JSON)
	public List<Foo> get() {
		return myList;
	}

Foo is a simple POJO with an @XmlRootElement annotation. The resulting JSON is perfectly fine: I get a JSON Array with a bunch of Foo objects. Perfect!
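
For reference, Foo is about as exciting as POJOs get; something along these lines (the field name is made up):

// import javax.xml.bind.annotation.XmlRootElement;

@XmlRootElement
public class Foo {
	private String foo;

	public String getFoo() { return foo; }
	public void setFoo(final String foo) { this.foo = foo; }
}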

The generated WADL is a different matter:

    <method id="get" name="GET">
        <response>
            <!-- foo is a reference to an XSD Complex Type ... but where's my list? -->
            <ns2:representation element="foo" mediaType="application/json"/>
        </response>
    </method>

For some reason, the WADL generator is smart enough to realise that we’re dealing with Foo instances. But it’s too stupid to realise that we’re dealing with more than one.

If you then generate a Client using wadl2java, you’ll end up with something like this:

    public Foo getAsFoo() {
        return response.readEntity(Foo.class);
    }

Well that’s not going to work, is it? Trying to read a single Foo when you’ve got an array of them. And indeed … you get a wonderful exception.

Internal Exception: java.lang.ClassCastException: Foo cannot be cast to java.util.Collection

This seems to be a fundamental XML vs JSON problem. After all, there is no such thing as an “XML Array”. Either you have one element, or you have multiple elements nested under a common root.

I could solve this by not relying on a plain JSON Array and encapsulating the result.

// Not this:
[
   {
      "foo":"bar"
   },
   {
      "bar":"baz"
   }
]
 
// But this:
{
   "listOfFoos":[
      {
         "foo":"bar"
      },
      {
         "bar":"baz"
      }
   ]
}

But that’s ugly. And instead of using a native Java collection, I’ll have to create a useless intermediary object (something along the lines of the sketch below). And it would break compatibility with current API clients.
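
For illustration, that intermediary would be a wrapper along these lines (ListOfFoos and its field name are made up):

// import java.util.List;
// import javax.xml.bind.annotation.XmlRootElement;

@XmlRootElement
public class ListOfFoos {
	private List<Foo> listOfFoos;

	public List<Foo> getListOfFoos() { return listOfFoos; }
	public void setListOfFoos(final List<Foo> listOfFoos) { this.listOfFoos = listOfFoos; }
}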

The Solution

Whelp … I haven’t found one yet. To be continued, I hope. But I did manage to find a workaround, thanks to Adam Bien’s blog.

The incorrectly generated getAsFoo() method doesn’t work. But we can use getAsJson() instead — which doesn’t necessarily have to return a JSON string.

    List<Foo> foos = client.getFoo().getAsJson(
        new GenericType<List<Foo>>() {
        }
    );

GenericType is a dirty JAX-RS hack in my opinion, but it works. It’s a shame that I have to rely on getAsJson(), though. It would’ve been much cleaner to use the getAsFoo() method directly.

Yet Another Battery Widget (Awesome 3.5.1)

Yet another battery widget for Awesome. This one actually works (shock! horror!) on Awesome 3.5.1 on my Thinkpad x230. Your mileage may vary. Colours used are from the excellent Solarized colour scheme. Behold the mighty widget, in all its unobtrusive glory!

[Screenshot: the battery widget]

The implementation is in two parts: a simple shell script to output the battery status, and a bit of rc.lua tweaking to display the widget. This is mostly the result of a bit of copy/pasting from different sources I forgot to bookmark. Oh well.

~/bin/battery.sh:

#!/bin/bash
 
healthy='#859900'
low='#b58900'
discharge='#dc322f'
 
capacity=`cat /sys/class/power_supply/BAT0/capacity`
if (($capacity <= 25));
then
        capacityColour=$low
else
        capacityColour=$healthy
fi
 
status=`cat /sys/class/power_supply/BAT0/status`
 
if [[ "$status" = "Discharging" ]]
then
        statusColour=$discharge
        status="▼"
else
        statusColour=$healthy
        status="▲"
fi
 
echo "<span color=\"$capacityColour\">$capacity%</span> <span color=\"$statusColour\">$status</span>"

Add the following snippets to /path/to/awesome/rc.lua. I’ll attempt to indicate the approximate location at the top of each snippet.

Create the widget, and don’t forget to adjust the path to the battery.sh script.

-- This goes below the line containing mytextclock = awful.widget.textclock()
 
-- Create a battery widget
battery = wibox.widget.textbox()
function getBatteryStatus()
   local fd= io.popen("/path/to/battery.sh")
   local status = fd:read()
   fd:close()
   return status
end

Add the widget to the right-hand layout.

-- This goes above the line containing right_layout:add(mytextclock)
    right_layout:add(battery)

Get the widget to refresh every 30 seconds. Put this somewhere near the end of the config file.

-- Battery status timer
batteryTimer = timer({timeout = 30})
batteryTimer:connect_signal("timeout", function()
  battery:set_markup(getBatteryStatus())
end)
batteryTimer:start()
battery:set_markup(getBatteryStatus())

That’s all! Restart awesome and you’ll see a relatively purdy yet unobtrusive battery status display.

Gnuplot data analysis, real world example

Creating graphs in LibreOffice is a nightmare. They’re ugly, nearly impossible to customize and creating pivot tables with data is bloody tedious work. In this post, I’ll show you how I took the output of a couple of performance test scripts and turned it into reasonably pretty graphs with a few standard command line tools (gnuplot, awk, a bit of (ba)sh and a Makefile).

The Data

I ran a series of query performance tests against data sets of different sizes. The sets contain 10k, 100k, 1M, 10M, 100M and 500M documents. One of the basic constraints is that it has to be easy to add/remove sets. I don’t want to faff about with deleting columns or updating pivot tables. If I add a set to my test data, I want it to automagically show up in my graphs.

The output of the test script is a simple tab separated file, and looks like this:

#Set	Iteration	QueryID	Duration
500M	1	101	10.497499465942383
500M	1	102	3.9973576068878174
500M	1	103	9.4201889038085938
500M	1	104	2.8091645240783691
500M	1	105	2.944718599319458
500M	1	106	5.1576917171478271
500M	1	107	5.7224125862121582
500M	1	108	5.7259769439697266
500M	1	109	4.7974696159362793

Each row contains the query duration (in seconds) for a single execution of a single query.

Processing the data

I don’t just want to graph random numbers. Instead, for each query in each set, I want the shortest execution time (MIN), the longest (MAX) and the average across iterations (AVG). So we’ll create a little awk script to output data in this format. In order to make life easier for gnuplot later on, we’ll create a file per dataset.

% head -n 3 output/500M.dat

#SET	QUERY	MIN	MAX	AVG	ITERATIONS
500M	200	0.071	2.699	0.952	3
500M	110	0.082	5.279	1.819	3

Here’s the source of the awk script, transform.awk. The code is quite verbose, to make it a bit easier to understand.

BEGIN {
}
 
{
        if($0 ~ /^[^#]/) {
                key = $1"_"$3
                first = iterations[key] > 0 ? 0 : 1
                sets[$1] = 1
                queries[$3] = 1
                totals[key] += $4
                iterations[key] += 1
 
                if(1 == iterations[key]) {
                        minima[key] = $4
                        maxima[key] = $4
                } else {
                        minima[key] = $4 < minima[key] ? $4 : minima[key]
                        maxima[key] = $4 > maxima[key] ? $4 : maxima[key]
                }
        }
}
 
END {
 
        for(set in sets) {
                outfile = "output/"set".dat"
                print "#SET\tQUERY\tMIN\tMAX\tAVG\tITERATIONS" > outfile
                for(query in queries) {
                        key = set"_"query
                        iterationCount = iterations[key]
                        average = totals[key] / iterationCount
                        printf("%s\t%d\t%.3f\t%.3f\t%.3f\t%d\n", set, query, minima[key], maxima[key], average, iterationCount) >> outfile
 
                }
        }
}

This code will read our input data, calculate MIN, MAX, AVG, number of iterations for each query and dump the contents in a tab-separated dat file with the same name as the set. Again, this is done to make life easier for gnuplot later on.

I want to see the effect of dataset size on query performance, so I want to plot averages for each set. Gnuplot makes this nice and easy: all I have to do is name my sets and tell it where to find the data. But ah … I don’t want to tell gnuplot what my sets are, because they should be determined dynamically from the available data. Enter a wee shell script that outputs gnuplot commands.

#!/bin/bash
 
# Output plot commands for all data sets in the output dir
# Usage: ./plotgenerator.sh column-number
# Example for the AVG column: ./plotgenerator.sh 5
 
prefix=""
 
echo -n "plot "
for s in `ls output | sed 's/\.dat//'` ;
do
        echo -n "$prefix \"output/$s.dat\" using 2:$1 title \"$s\""
 
        if [[ "$prefix" == "" ]] ; then
                prefix=", "
        fi
done

This script will generate a gnuplot “plot” command. Each datafile gets its own title (this is why we named our data files after their dataset name) and its own colour in the graph. We want to plot two columns: the QueryID, and the AVG duration. In order to make it easier to plot the MIN or MAX columns, I’m parameterizing the second column: the $1 value is the number of the AVG, MIN or MAX column.

Plotting

Gnuplot will call the plotgenerator.sh script at runtime. All that’s left to do is write a few lines of gnuplot!

Here’s the source of average.gnp

#!/usr/bin/gnuplot
reset
set terminal png enhanced size 1280,768
 
set xlabel "Query"
set ylabel "Duration (seconds)"
set xrange [100:]
 
set title "Average query duration"
set key outside
set grid
 
set style data points
 
eval(system("./plotgenerator.sh 5"))

The result

% ./average.gnp > average.png

[Graph: average query duration]

Wrapping it up with a Makefile

I don’t like having to remember which steps to execute in which order, and instead of faffing about with yet another shell script, I’ll throw in another *nix favourite: a Makefile.

It looks like this:

average:
        rm -rf output
        mkdir output
        awk -f transform.awk queries.dat
        ./average.gnp > average.png

Now all you have to do is run

make

whenever you’ve updated your data file, and you’ll end up with a nice’n purdy new graph. Yay!

Having a bit of command line proficiency goes a long way. It’s so much easier and faster to analyse, transform and plot data this way than it is using graphical “tools”. Not to mention that you can easily integrate this with your build system…that way, each new build can ship with up-to-date performance graphs. Just sayin’!

Note: I’m aware that a lot of this scripting could be eliminated in gnuplot 4.6, but it doesn’t ship with Fedora yet, and I couldn’t be arsed building it.

SSH Gateway Shenanigans

I love OpenSSH. Part of its awesomeness is its ability to function as a gateway. I’m going to describe how I (ab)use SSH to connect to my virtual machines. Now, on a basic level, this is pretty easy to do: you can simply forward different ports to different virtual machines. However, I don’t want to mess about with non-standard ports. SSH runs on port 22, and anyone who says otherwise is wrong. Or you could give each of your virtual machines a separate IP address, but then, we’re running out of IPv4 addresses and many ISPs stubbornly refuse to use IPv6. Quite the pickle!

ProxyCommand to the rescue!

ProxyCommand in ~/.ssh/config pretty much does what it says on the tin: it proxies … commands!

Host fancyvm
        User foo
        HostName fancyvm
        ProxyCommand ssh foo@physical.box nc %h %p -w 3600 2> /dev/null 

This allows you to connect to fancyvm by first connecting to physical.box. This works like a charm, but it has a couple of very important drawbacks:

  1. If you’re using passwords, you have to enter them twice
  2. If you’re using password protected key files without an agent, you have to enter that one twice as well
  3. If you want to change passwords, you have to do it twice
  4. It requires configuration on each client you connect from

Almighty command

Another option is the “command=” option in ~/.ssh/authorized_keys on the physical box:

command="bash -c 'ssh foo@fancyvm ${SSH_ORIGINAL_COMMAND:-}'" ssh-rsa [your public key goes here]

Prefixing your key with command="foo" will ensure that "foo" is executed whenever you connect using that key. In this case, it will automagically connect you to fancyvm when you log in to physical.box using your SSH key. This has a small amount of setup overhead on the server side, but it’s generally the way I do things. The only real drawback here is that it’s impossible to change your public key, which isn’t too bad, as long as you keep it secure.

The Actual Shenanigans

The command option is wonderful, but some users can’t or won’t use SSH key authentication. That’s a bit trickier, and here’s the solution I’ve come up with — but if you have a better one, please do share!

We need three things:

  1. A nasty ForceCommand script on the physical box
  2. A user on the physical box (with a passwordless ssh key pair)
  3. A user on the VM, with the above user’s public key in ~/.ssh/authorized_keys

This will grant us the magic ability to log in to the VM by logging in to the physical box. We only have to log in once (because the second part of the login is done automagically by means of the key files). A bit of trickery will also allow us to change the gateway password, which was impossible with any of our previous approaches.

Let’s start with a change in the sshd_config file:

Match User foo
        ForceCommand /usr/local/bin/vmlogin foo fancyvm "$SSH_ORIGINAL_COMMAND"

This will force the execution of our magic script whenever the user connects. And don’t worry, things like scp will work just fine.

And then there’s the magic script, /usr/local/bin/vmlogin:

#!/bin/bash

user=$1
host=$2
command="${3:-}"

if [ "$command" = "passwd" ] ; then
        bash -c "passwd"
        exit 0
fi
command="'$command'"
bash -c "ssh -e none $user@$host $command"

Update 2016

The above script no longer works with SFTP on CentOS 7 with Debian guests. Not sure why, and I’m too lazy to find out. So here’s a script that works around the problem.

#!/bin/bash

user=$1
host=$2
command="${3:-}"

if [ "$command" = "passwd" ] ; then
        bash -c "passwd"
        exit 0
fi

# SFTP has been fucking up. This ought to fix it.
if [ "$command" = "/usr/libexec/openssh/sftp-server" ] || [ "$command" = "internal-sftp" ] ; then
        bash -c "ssh -s -e none $user@$host sftp"
        exit 0
fi

command="'$command'"
bash -c "ssh -e none $user@$host $command"

And there you have it, that’s all the magic you really need. Everything works exactly as if you were connecting to just another machine. The only tricky bit is changing the gateway password: you have to explicitly provide the passwd command when connecting, like so:

ssh foo@physical.box passwd

Symfony2 and Jenkins

I was a bit surprised to see that Symfony2 doesn’t come with an Ant build file for Jenkins by default, so I spent a bit of time whipping one up for you (well, for me, really): you can get it here.

Maybe it’ll work for you, maybe it won’t. The project I’m working on has the complete SF2 distribution in version control and all of our bundles in the src folder. It’s easier to test & ship the code this way. If you want to test a specific bundle or don’t want to have SF2 in version control, then you’re on your own :-).

Howto: soapUI integration tests with Maven

Running soapUI tests with Maven is surprisingly easy; all it requires is a few simple steps. This howto will walk you through deploying your web project to an embedded container and running the soapUI tests in the integration test phase.

Cargo configuration

With the cargo plugin you can deploy your project to just about any container. For the sake of simplicity I’ll be using an embedded Jetty 6 container.

<!-- Deploy the project WAR to a built-in container during the integration test phase -->
<build>
[...]
<plugin>
	<groupId>org.codehaus.cargo</groupId>
	<artifactId>cargo-maven2-plugin</artifactId>
	<executions>
		<!--Start the container in the pre-integration-test phase -->
		<execution>
			<id>start-container</id>
			<phase>pre-integration-test</phase>
			<goals>
				<goal>start</goal>
			</goals>
		</execution>
		<!-- Stop the container after integration tests are done -->
		<execution>
			<id>stop-container</id>
			<phase>post-integration-test</phase>
			<goals>
				<goal>stop</goal>
			</goals>
		</execution>
	</executions>
	<configuration>
		<wait>false</wait> <!-- We want to deploy, run tests and exit, not wait -->
		<container>
			<containerId>jetty6x</containerId>
			<type>embedded</type>
		</container>
		<configuration>
			<properties>
				<cargo.servlet.port>${my.project.port}</cargo.servlet.port>
			</properties>
		</configuration>
	</configuration>
</plugin>
[...]
</build>

soapUI project configuration

If you haven’t already created a soapUI test suite, now’s the time to do so. Once this is done, copy the test suite to your test resources folder (src/test/resources). Then set up your project to filter its test resources.

<build>
[...]
	<testResources>
		<testResource>
			<filtering>true</filtering>
			<directory>src/test/resources</directory>
		</testResource>
	</testResources>
[...]
</build>

With that out of the way, you can now edit the soapUI project file with your favourite XML editor. What you want to do is replace all endpoint references (and possibly WSDL locations) with property keys. So <con:endpoint>http://localhost:8080/MyProject/endpoint</con:endpoint> becomes <con:endpoint>${my.project.endpoint}</con:endpoint>.
Your webapp will be deployed to http://localhost:${my.project.port}/${project.artifactId}-${project.version}, so I suggest using that as a property value.

<project | profile>
[...]
<properties> 
	<my.project.port>8888</my.project.port>
	<my.project.endpoint>http://localhost:${my.project.port}/MyProject/endpoint</my.project.endpoint>
</properties>
[...]
</project | profile>

soapUI plugin configuration

First, add the Eviware soapUI Maven repository to your list of plugin repositories.

<pluginRepositories>
	<pluginRepository>
		<id>eviwarePluginRepository</id>
		<url>http://www.eviware.com/repository/maven2/</url>
	</pluginRepository>
</pluginRepositories>

Then, add the plugin to your build and let maven know when you want to execute it. Considering the container is starting up before the integration test phase, and is shutting down afterwards, running the tests as integration tests seems like the best option ;-).

<!-- Run SOAP UI tests during the integration phase. -->
<plugin>
	<groupId>eviware</groupId>
	<artifactId>maven-soapui-plugin</artifactId>
	<version>2.5.1</version>
	<configuration>
		<junitReport>yes</junitReport>
		<exportAll>yes</exportAll>
		<projectFile>target/test-classes/soapui-project.xml</projectFile>
		<outputFolder>target/soapui-reports</outputFolder>
	</configuration>
	<executions>
		<execution>
			<id>wsn-server-test</id>
			<phase>integration-test</phase>
			<goals>
				<goal>test</goal>
			</goals>
		</execution>
	</executions>
</plugin>

All done!

Now, when you run mvn verify (or install, or ..), your soapUI tests will automagically be executed and you’ll be informed of any failures.

Howto: PostgreSQL data source in JDeveloper/OC4J

Today I had to create a postgres connection pool in JDeveloper’s embedded oc4j container. JDeveloper being the horrible piece of software that it is, and its documentation being rather lacking, this took a lot longer than it should have. The pretty GUI wizards aren’t able to pull it off either — these measly conjurers really aren’t worthy of the name.

The biggest hurdle was postgres’ connection pool not being happy with just a JDBC URL. Instead, it expects a hostname, port number and database name. These things are all in the JDBC URL, but never mind, that would’ve been too simple. After reading through the XSD for data-sources.xml, I realised that there’s an option to provide custom properties to the factory. Quite simple really. A connection pool definition looks something like this:

<connection-pool name="myPool" disable-server-connection-pooling="false">
	<connection-factory
		factory-class="org.postgresql.jdbc3.Jdbc3PoolingDataSource"
		user="postgres" password="1234"
		url="jdbc:postgresql://localhost:5432/db">
		<property name="serverName" value="localhost" />
		<property name="portNumber" value="5432" />
		<property name="databaseName" value="db" />
	</connection-factory>
</connection-pool>
<managed-data-source name="dataSource" jndi-name="jdbc/postgresDS" connection-pool-name="myPool" />

Once this is done, all that’s left to do is place the postgres driver JAR in the j2ee/home/applib folder in your JDeveloper folder. If you don’t place it there, you’ll get very nice class not found errors.
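
To actually use the data source from code, a plain JNDI lookup should do. A minimal sketch (depending on how your application references the resource, you may need the java:comp/env/ prefix):

// import java.sql.Connection;
// import javax.naming.InitialContext;
// import javax.sql.DataSource;

final InitialContext context = new InitialContext();
final DataSource dataSource = (DataSource) context.lookup("jdbc/postgresDS");
final Connection connection = dataSource.getConnection();
// ... run your queries, and close the connection when you're done
connection.close();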

That’s it. Not very hard at all!