An open-source, federated content repository

ModeShape 3.0 Alpha1 is here, and it rocks!

The ModeShape team is happy to announce that we’ve issued the first alpha release of ModeShape 3. This is the first alpha release we’ve ever made, and it’s still rough around the edges. But we’re so excited about ModeShape 3 that we had to share. (And, yes, this post is really long, but it’s a good read.)

Our goal for ModeShape 3 is for it to be the seriously fast, very scalable, and highly available JCR implementation. To do that, we’ve made some pretty significant architectural changes. Some of these are:

  • We’re using Infinispan for all caching and storage. This gives the foundation we need to meet our goals while giving us the flexibility for how to store the content (via cache stores). ModeShape can still be embedded into applications, but Infinispan will help us scale out to create truly distributed, multi-site, content grids. This completely replaces our old connector framework.
  • So far our tests show ModeShape 3 is ridiculously fast. It’s all around faster than 2.7 – in fact, most operations are at least one (if not several!) orders of magnitude faster. We’ll publish proper performance and benchmarking results closer to the final release.
  • Scalability not only includes clustering (and “scaling out”), but it also means handling a wider range of node structures. We’ve tested our new approach with hundreds of thousands of child nodes under a single parent, even when those nodes are ordered and use same-name siblings. Yet performance remains nearly as fast as with only a few child nodes!
  • Configuring repositories is hopefully much easier. There is no more global configuration of the engine; instead, each repository is configured with a separate JSON file that conforms to a JSON Schema and that your application can validate with one method call. Check out this entirely valid sample configuration file. You can deploy new repositories at runtime, and can even change a repository’s configuration while it is running (some restrictions apply). For example, you can add/change/remove sequencers, authorization providers, and many other configuration options while the repository is being actively used.
  • ModeShape continues to have great options for storing your content. ModeShape 2 had its own connector framework, but with ModeShape 3 we’re simply using Infinispan’s cache stores, with a number of great options out-of-the-box:
    • In-memory (no cache store)
    • BerkeleyDB, which is quite fast but has license restrictions
    • JDBM, a free alternative to BerkeleyDB
    • Relational databases (via JDBC), including in-memory, disk-based, or remote
    • File system
    • Cassandra
    • Cloud storage (e.g., Amazon’s S3, Rackspace’s Cloudfiles, or any other provider supported by JClouds)
    • Remote Infinispan grid
  • Every session now immediately sees all changes persisted/committed by other sessions, although the session’s own transient changes still take precedence. This behavior differs from 2.x, and combined with the new way node content is stored it should reduce the potential for conflicts during session save operations. It also means that all the sessions using a given workspace can share the cache of persisted content, resulting in faster performance and a smaller memory footprint, so ModeShape can handle more concurrent sessions in a single process.
  • Our Session, Workspace, NodeTypeManager and other components are thread safe. The JCR specification only requires that the Repository and RepositoryFactory interfaces are thread-safe. But making our implementations thread-safe means that it’s possible for multiple threads to share one Session for reading. Of course, Session is inherently stateful, so sharing a Session for writes is still a bad thing to do.
  • We have a new public API for monitoring the history, activity and health of ModeShape.
  • We’ve changed our sequencing API to use the JCR API. This should make it much easier to create your own sequencers, plus sequencers can also dynamically register namespaces and node types. We’ve already migrated most of our 2.x sequencers to this new API, and will be migrating the rest over the next few weeks.
  • Handling of binary values is greatly improved with a new facility that can store binary values of all sizes, including those that are (much) larger than available memory. In fact, only small binary values are stored in memory (this is configurable), while all other binary values are only streamed. We’ve started out with a file system store that will work even in clustered environments, but we also plan to add stores that use Infinispan and DBMSes.
  • We’re still using Lucene for our indexes, but we’re now using Hibernate Search to give us durable and fast ways to update the indexes, even in a cluster. Note that Hibernate Search is part of the Hibernate family, but it’s a small library that does not use, depend on, or require JPA or the Hibernate ORM.

As if that’s not enough, we still have a lot to do:

  • Kits for deploying ModeShape 3 as a service in JBoss AS7, allowing you to use the AS7 tooling to configure, deploy, manage, monitor, and undeploy your JCR repositories. Infinispan and JGroups are also built-in services in AS7 and can be managed the same way. Plus, ModeShape clustering will work out of the box using AS7’s built-in clustering (domain management) mechanism. ModeShape and JBoss AS7 will be the easiest way to deploy, manage and operate enterprise-grade repositories.
  • JTA support will allow JCR Sessions to participate in XA and container-managed transactions. We’re already using JTA transactions internally with Infinispan, so we’re already a good way toward this feature.
  • Map-Reduce is a great way to process large amounts of information in parallel. ModeShape will let you validate the entire repository content against the current set of node types or even a proposed set of node types, making it far easier to safely and confidently change the node types in a large repository. And we’ll provide a way for you to write your own mappers, reducers, and collectors to implement any kind of (read-only) analysis you want.

Hopefully you’re just as excited as we are. We love how far we’ve been able to come with ModeShape 3, and we’re only part way there.

The good news is that you can start kicking the tires and seeing for yourself just how fast ModeShape 3 is. Most of the JCR features are working and are ready for trial and testing. In fact, please file bug reports if you find anything that doesn’t work. But unfortunately a few things still aren’t complete or working well enough:

  • Queries will parse but can’t yet be executed. Most of the query machinery works, but a few key pieces don’t. Consequently, the JDBC drivers don’t work either.
  • Clustering and shareable nodes don’t work.
  • AS7 kits are incomplete and not yet usable.
  • The RESTful and WebDAV services aren’t working as we’d like, so we excluded them from the alpha.
  • Federation is not yet working; see this discussion for how we want to expand federation capabilities.

We’re also overhauling our documentation to make it even more useful, but it’s a little sparse at the moment since we’re focusing on the code. Our What’s New and Getting Started pages are pretty useful, though, and should help you get your testing going. We also have some stand-alone example Maven projects on GitHub that you can clone and hack to start putting ModeShape 3 through its paces.

What’s next? Well, we’re continuing to implement the missing and incomplete features, and we plan to release a second alpha in the next few weeks. We’ll follow that up over the following month with a couple of feature-complete beta releases and the final 3.0 release. Stay tuned!

Now, wasn’t that worth a few minutes of your time? We’re really excited about ModeShape 3, and think you’ll really like it, too.


Filed under: features, jcr, news, releases, repository, testing

Running tests against different DBMSes

Testing software against multiple database management systems can be tricky. It’s usually a headache to configure and maintain these tests, and then they usually take a long time to run.

Of course, there are multiple ways of doing this, and each approach has advantages depending on your system, your build environment, and whether your developers have access to the test databases.

JBoss DNA is an open source project, and open source projects run into some walls that commercial software developers often don’t. With open source, each developer likely has a very different environment and likely doesn’t have access to the same databases or the same set of DBMSes. It doesn’t work to hard-code the connection information or even to put it in property files, because each developer would have to go and change all those settings. Even if we agreed upon a convention, not everyone has access to the same DBMS systems. What we want is to run all of our builds normally without relying upon any external resources, and then, at our choosing, easily run our tests against whatever database each developer has access to.

Luckily, we use Maven, and so we can use Maven profiles to define a different environment for each database we want to use. In each database profile, we can add dependencies for the appropriate JDBC driver and specify connection properties. And it becomes very easy to turn profiles on and off, which means developers can choose which databases they want to test against. And, we can use HSQLDB or H2 by default, since these are fast, all developers have them (merely because of Maven dependencies), and they’re transient.

The only challenge is that we already use profiles for the different kinds of builds we do. One of our profiles (the one used by default) is fast because it simply compiles the source and runs only the unit tests. This is the default because we run this build all of the time – it’s actually the easiest way to run all of the unit tests, so I run it constantly throughout the day as I make changes locally.

We have another profile that also compiles and runs the integration tests – these take several minutes to run, so it’s not very nice to have to wait for all these tests while you’re just verifying some changes didn’t cause some unintended behavior. Although I run these tests locally, I suspect most of the developers let our automated continuous integration server do the work.

Other profiles also generate JavaDoc, build our documentation (compiling DocBook into multiple HTML formats and a PDF), and create the ZIP archives that we publish in our downloads area.

The database profiles are actually orthogonal to these other profiles. Despite this, there is (at least) one way to make them stay independent and play nicely together, and that’s activating them with properties. Our Maven command line becomes:

mvn clean install -Ddatabase=postgresql

and if we want to use any of our other profiles, we can just use the “-P” switch as we’ve always done:

mvn -P integration clean install -Ddatabase=postgresql

That would work great. All we have to do is define each database profile to activate based upon the “database” property.

Oh, you may be wondering why we don’t just explicitly name each profile on the command line. Well, actually we can, and it works as long as the user always remembers to include one database profile along with the other profiles they want to use. With Maven, if the user names a profile, then no other profile is activated by default, which means that our builds may run without a database profile, and this can break the tests. Activating the database profile with a property solves this problem.

Defining the database profiles

As I mentioned earlier, we really want a profile for each database that we want to test against. If we define the profiles in the parent POM, then all subprojects inherit them. So, the first step is to put in our parent POM a separate profile for each database configuration. Here is the HSQLDB configuration:

    <profile>
      <id>hsqldb</id>
      <activation>
        <property>
          <name>database</name>
          <value>hsqldb</value>
        </property>
      </activation>
      <dependencies>
        <dependency>
          <groupId>hsqldb</groupId>
          <artifactId>hsqldb</artifactId>
          <scope>test</scope>
        </dependency>
      </dependencies>
      <properties>
        <jpaSource.dialect>org.hibernate.dialect.HSQLDialect</jpaSource.dialect>
        <jpaSource.driverClassName>org.hsqldb.jdbcDriver</jpaSource.driverClassName>
        <jpaSource.url>jdbc:hsqldb:mem:test</jpaSource.url>
        <jpaSource.username>sa</jpaSource.username>
        <jpaSource.password />
      </properties>
    </profile>

Note how the profile is activated when the “database” property matches a value (“hsqldb” in this case). We also declare the dependency on the HSQLDB JDBC driver JAR, and we define the properties that we’ll use in our tests. We have one of these profiles for each database we want to test against.

But before we look at how those properties get injected into our test cases, let’s define the database profile that should be used if the “database” property is not set. We do this with a different activation strategy:

    <profile>
      <id>default-database</id>
      <activation>
        <property>
          <name>!database</name>
        </property>
      </activation>
      <dependencies>
        <dependency>
          <groupId>hsqldb</groupId>
          <artifactId>hsqldb</artifactId>
          <scope>test</scope>
        </dependency>
      </dependencies>
      <properties>
        <jpaSource.dialect>org.hibernate.dialect.HSQLDialect</jpaSource.dialect>
        <jpaSource.driverClassName>org.hsqldb.jdbcDriver</jpaSource.driverClassName>
        <jpaSource.url>jdbc:hsqldb:mem:test</jpaSource.url>
        <jpaSource.username>sa</jpaSource.username>
        <jpaSource.password />
      </properties>
    </profile>

I’ve chosen to make the default database profile identical to one of the other profiles, which means some duplication in the POM file. Personally, the convenience for the user seems worth it.

Before we leave our parent POM, we also should probably define default values for all of the properties we use (or can use) in the database profiles. This will make it easier to inject those properties into our tests. So in the “<properties>” section of the parent POM, define a few default values:
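These defaults might look like the following sketch, which simply mirrors the HSQLDB profile (the exact values are whatever you want the no-profile build to use):

```xml
<properties>
  <!-- Default database settings; the database profiles override these -->
  <jpaSource.dialect>org.hibernate.dialect.HSQLDialect</jpaSource.dialect>
  <jpaSource.driverClassName>org.hsqldb.jdbcDriver</jpaSource.driverClassName>
  <jpaSource.url>jdbc:hsqldb:mem:test</jpaSource.url>
  <jpaSource.username>sa</jpaSource.username>
  <jpaSource.password />
</properties>
```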


Injecting the database properties

Each of the database profiles defines a bunch of properties, and we want to use those property values in each of our tests. The easiest way to do this is to use Maven filters to substitute the property values into some of our resource files when it copies them into the ‘target’ directory. We could do that in the parent POM, but it’s probably best to do that in each subproject, where we can be specific about the files we want filtered. JBoss DNA has a “dna-integration-test” subproject, and so in its POM file we just need to turn on filtering:

   <testResources>
     <testResource>
       <directory>src/test/resources</directory>
       <filtering>false</filtering>
     </testResource>
     <!-- Apply the properties set in the POM to the resource files -->
     <testResource>
       <directory>src/test/resources</directory>
       <includes>
         <include>tck/jpa/configRepository.xml</include>
       </includes>
       <filtering>true</filtering>
     </testResource>
   </testResources>

Here the first “<testResource>” fragment copies all of the files in the “src/test/resources” directory without doing any filtering, while the latter fragment copies the “src/test/resources/tck/jpa/configRepository.xml” file and substitutes the property values. For example, any “${jpaSource.url}” strings in the file are replaced with the appropriate value for this property as defined in the database profile (or the default).

The “configRepository.xml” file is a configuration file for the DNA JCR engine. What if we want to inject the database properties into our test cases? Well, one easy way is to have Maven filter a property file and then just have our test cases load that property file and use the data in it. We actually do this in one of our other projects, and it works great.
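That idea takes only a few lines of code. Here’s a sketch (the file name “database.properties” and the key shown are hypothetical, not DNA’s actual files):

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

public class DatabaseConfig {
    /**
     * Load connection settings from a Maven-filtered properties file.
     * By the time the tests run, Maven has already replaced tokens such as
     * ${jpaSource.url} with the values from the active database profile,
     * so the tests just read plain values.
     */
    public static Properties load(InputStream stream) throws IOException {
        Properties props = new Properties();
        props.load(stream);
        return props;
    }
}
```

A test would then call something like DatabaseConfig.load(getClass().getResourceAsStream("/database.properties")) and pull out the connection values it needs.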

Running the tests

Now we can see the fruits of our labor. We can run

mvn clean install

or

mvn -P integration clean install

to run all unit and integration tests, just as we could before. Since we don’t specify “-Ddatabase=dbprofile” on the command line, these builds will use the default database profile, which for us is HSQLDB. That means any database-related tests we run are fast and require no external resources. Brilliant!

Of course, once all of our tests pass with the default configuration, we can then run all the tests against different databases with a few simple commands:

mvn -P integration install -Ddatabase=mysql5
mvn -P integration install -Ddatabase=postgresql8
mvn -P integration install -Ddatabase=oracle10g

Pretty sweet! Well, as long as we have some patience.

Hat tip to the Hibernate and jBPM teams, since our approach was largely influenced by a combination of their setups.

Filed under: techniques, testing

SAX broken in Java 6 and 7 (and how open source really works)

Not long ago Serge, one of DNA’s committers, found a bug when running our XML sequencer with Java 6. Turns out the culprit is actually Java 6’s SAX parser, which calls the handler methods for entity references in the wrong order! Ouch!

Normally, when processing an XML entity reference, whether a standard entity like “&lt;” or a custom one like “&version;“, the SAX parser first calls the startEntity(String) method on LexicalHandler with the name of the entity (e.g., “lt”). It then calls the characters(...) method on ContentHandler to process the replacement content (e.g., “<“). Finally, the parser calls the endEntity(String) method on LexicalHandler (again with the name of the entity).

That’s the way it’s supposed to work. In Java 6, the startEntity(String) method is called correctly, but the endEntity(String) method is called immediately afterward, and the content then passed to characters(...) includes both the replacement content and the next set of content that follows the entity. WTF?
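A quick way to see which order a given JRE actually produces is a small harness like this (just a sketch using the JDK’s built-in SAX parser, not the actual DNA test):

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

public class EntityOrderCheck {
    /** Parse the XML and record the order of entity and character callbacks. */
    public static List<String> callbackOrder(String xml) throws Exception {
        final List<String> events = new ArrayList<String>();
        DefaultHandler2 handler = new DefaultHandler2() {
            @Override
            public void startEntity(String name) {
                events.add("startEntity:" + name);
            }
            @Override
            public void endEntity(String name) {
                events.add("endEntity:" + name);
            }
            @Override
            public void characters(char[] ch, int start, int length) {
                events.add("characters:" + new String(ch, start, length));
            }
        };
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
        // LexicalHandler callbacks (startEntity/endEntity) are registered via this property
        reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return events;
    }
}
```

Feeding it a document with a custom entity, such as <!DOCTYPE a [<!ENTITY version "1.0">]><a>&version;</a>, and printing the events shows whether characters(...) lands between startEntity and endEntity on your JRE.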

I had a tough time finding out whether anyone else had encountered this. Surely someone had run into a problem like this? In fact, my search results kept including the DNA bug among the top results. (In hindsight, the only reason we care is that we want to keep the original entity reference rather than use the replacement content, as SAX normally does. So most people may not actually notice the problem.)

But, for the moment, it didn’t matter whether it was a JDK problem or not. We wanted to release DNA in a few days, and the XML sequencer worked correctly in Java 5 but not in Java 6. So, either find a fix or treat it as a known issue. Well, this issue could come up a lot with XML sequencing, so we tried a fix. In fact, I found a workaround that was actually pretty minimal, so now the sequencer works on both Java versions. Score!

Okay, now that DNA was on track, back to the JDK bug. Based upon our unit tests, I had a hunch it was just Java 6 on OS-X. So, would I file the bug with Apple or with Sun? This morning I started working on a simple test to show the problem, which I could use to find out exactly which JDKs were a problem and where I’d need to file the bug.

But then open source to the rescue! Daniel (from the Smooks project) found the same problem and he filed a bug report. Today no less. Turns out my hunch about OS-X was wrong, and it’s a problem in Java 6 and 7. Of course, like a good open-source citizen, Daniel commented on our JIRA issue. That’s how I found out about his bug report. Thanks, Daniel – you saved me some time, and hopefully our initial triaging served you well.

I love how open source works.

Filed under: open source, testing

Internationalization in JBoss DNA

It’s pretty important to be able to create applications that support multiple languages, and most libraries should provide some kind of support for this. The first step is making your code support internationalization, but then you need to localize the application for each language (or locale). We’ve included internationalization (or “i18n”) in JBoss DNA from the beginning, but we haven’t done much with localization (or “L10n”), and have only one (default) localization in English.

Java really does a crappy job at supporting internationalization. Sure, it has great Unicode support, and it does provide a standard mechanism for identifying locales and looking up bundles given a locale. But where is the standard approach for representing an internationalized message ready for localization into any locale? ResourceBundle.getString()? Seriously?

What I want is something analogous to an internationalized String capable of holding onto the replacement parameters. Each internationalized string should be associated with the key used in the resource bundles. I want to localize an internationalized string into the default locale, or into whatever locale you supply, and even into multiple locales (after all, web applications don’t support just one locale). And I should be able to use my IDE to find where each internationalized string is used. I should be able to test that my localization files contain localized messages for each of the internationalized strings used in the code, and that there are no duplicate or obsolete entries in the files. I also don’t want that many resource files (one per package – like Eclipse used to do – sucks); one per Maven project is just about right.

I’m not asking for much.

Meet the players

There are quite a few existing internationalization (aka, “i18n”) open source libraries, including JI18n, J18n, and Apache Commons I18n, just to name a few. Too many of these try to be too smart and do too much. (Like automatically localizing a message identified by a Java annotation into the current locale, or using aspects to do things automatically.) This stuff just tends to confuse IDE dependency analysis, searching, and/or debugging. We found nothing we liked, and lots of things we didn’t like. Internationalization shouldn’t be this hard.

Sweet and to the point
So we did what we don’t like to do: we invented our own very simple framework. And by simple, I mean there’s only one I18n class that represents an internationalized string, with some static utility methods (and an abstract JUnit test class; see below). To use it, simply create an “internationalization class” that contains a static I18n instance for each message in a bundle, and then create a resource bundle properties file for each of these classes. That’s it!

So, let’s assume that we have a Maven project and we want to create an internationalization class that represents the internationalized strings for that project. (We could create as many as we want, but one is the simplest.) Here’s the code:

public final class DnaSubprojectI18n {

    // These are the internationalized strings ...
    public static I18n propertyIsRequired;
    public static I18n nodeDoesNotExistAtPath;
    public static I18n errorRemovingNodeFromCache;

    static {
        // Initializes the I18n instances from the localization files
        try {
            I18n.initialize(DnaSubprojectI18n.class);
        } catch (final Exception err) {
            System.err.println(err); // logging depends on I18n, so we can't log
        }
    }
}

Notice that we have a static I18n instance for each of our internationalized strings. The name of each I18n variable corresponds to the key in the corresponding property file. Pretty simple boilerplate code.

The actual localized messages are kept in a properties file in the same package as this class (but since we’re using Maven, the file goes under src/main/resources):

propertyIsRequired = The {0} property is required but has no value
nodeDoesNotExistAtPath = No node exists at {0} (or below {1})
errorRemovingNodeFromCache = Error while removing {0} from cache

Again, pretty simple and nothing new.

Using in your code
At this point, all we’ve done is defined a bunch of internationalized strings. Now all we need to do to use one is reference the I18n instance we want (e.g., DnaSubprojectI18n.propertyIsRequired) and pass it (and any parameter values) around. When you’re ready, localize the message by calling I18n.text(Object...params) or I18n.text(Locale locale, Object...params). The beauty of this approach is that IDEs love it. Want to know where an internationalized message is used? Go to the static I18n member and find where it’s used.

The logging framework used in JBoss DNA has methods that take an I18n instance and zero or more parameters. (Debug and trace methods just take String, since in order to understand these messages you really have to have access to the code, so English messages are sufficient.) This static typing helps make sure that all the developers internationalize where they’re supposed to.

With exceptions, we’ve chosen to have our exceptions use Strings (just like JDK exceptions), so we simply call the I18n.text(Object...params) method:

  throw new RepositoryException(DnaSubprojectI18n.propertyIsRequired.text(path));

We’ll probably make this even easier by adding constructors that take the I18n instance and the parameters, saving a little bit of typing and delaying localization until it’s actually needed.

Testing localizations
Testing your internationalization classes is equally simple. Create a JUnit test class and subclass the AbstractI18nTest class, passing to its constructor your DnaSubprojectI18n class reference:

public class DnaSubprojectI18nTest extends AbstractI18nTest {
    public DnaSubprojectI18nTest() {
        super(DnaSubprojectI18n.class);
    }
}

That’s it. The test class inherits test methods that compare the messages in the properties file with the I18n instances in the class, ensuring there aren’t any extra or missing messages in any of the localization files. That’s a huge benefit!

One more thing …
Remember when I said there was only one class to our framework? Okay, I stretched the truth a bit. We also abstracted how the framework loads the localized messages, so there’s an interface and an implementation class that loads from standard resource bundle property files. So if you want to use a different loading mechanism for your localized messages, feel free.

Props to John Verhaeg and Dan Florian for the design of this simple but really powerful framework.

Filed under: techniques, testing, tools

Mockito 1.5

Mockito 1.5 has been released with several nice enhancements. Perhaps one of the most useful is the ability to spy on non-mock objects. In other words, you can verify that methods are called on the non-mock object. So, for example (from the release notes):

   List list = new LinkedList();
   List spy = spy(list);

   //wow, I can stub it!
   when(spy.size()).thenReturn(100);

   //wow, I can use it and add real elements to the list!
   spy.add("one");
   spy.add("two");

   //wow, I can verify it!
   verify(spy).add("one");

I haven’t wanted to do this too often, but there was an occasion or two.

Another improvement is supposed to result in more readable code. Instead of

   when(mockedList.get(0)).thenReturn("foo");

it is now possible to write:

   doReturn("foo").when(mockedList).get(0);

Notice that the code is about the same length, so it’s clearly up to you whether you think it’s more or less readable. In addition to doReturn(), there’s also doThrow(), doAnswer(), and doNothing().

Check out the Mockito documentation for examples and details on how to use.

Filed under: techniques, testing, tools

Testing behaviors

I mentioned in my last post how learning Ruby has made me a better Java developer. In particular, learning RSpec opened my eyes to a new way of unit testing.

RSpec is a library for Ruby that is built around Behavior Driven Development (BDD). In BDD and with RSpec, you focus on specifying the behaviors of a class and write code (tests) that verify that behavior. Whether you do this before you write the class is up to you, but I’ve found that outlining the class’ behaviors before (or while) I write the class helps me figure out what exactly the implementation should do.

You may be thinking that BDD sounds awfully similar to Test Driven Development (TDD). In some ways they are similar: they both encourage writing tests first and for fully testing the code you write. However, TDD doesn’t really guide you into the kinds of tests you should be writing, and I think a lot of people struggle with what they should be testing. BDD attempts to give you this guidance by getting the words right so that you focus on what the behaviors are supposed to be.

Let’s look at an example of a class that represents a playlist. The first step will be to decide what the class should and should not do:


  • should not allow a null name
  • should not allow a blank name
  • should always have a name
  • should allow the name to change
  • should maintain the order of the songs
  • should allow songs to be added
  • should allow songs to be removed
  • should allow songs to be reordered
  • should have a duration that is a summation of the durations of each song
  • should not allow a song to appear more than once

Really, these are just the requirements written as a list. With BDD and JUnit 4.4, we can capture each behavior specification as a single unit test method. Initially, we’ll just stub the methods, but later on we’ll implement the test methods to verify the class actually exhibits that behavior. And since JUnit 4.4 gives us the freedom to name our test methods anything we want, let’s take a play from the RSpec playbook and put these behavior specifications directly in our test method names. Pretty cool! Just start listing the expected behaviors, and the test methods simply fall out:

public class PlaylistTest {
 @Test public void shouldNotAllowANullName() {}
 @Test public void shouldNotAllowABlankName() {}
 @Test public void shouldAlwaysHaveAName() {}
 @Test public void shouldAllowTheNameToChange() {}
 @Test public void shouldMaintainTheOrderOfTheSongs() {}
 @Test public void shouldAllowSongsToBeAdded() {}
 @Test public void shouldAllowSongsToBeRemoved() {}
 @Test public void shouldAllowSongsToBeReordered() {}
 @Test public void shouldHaveADurationThatIsASummationOfTheDurationsOfEachSong() {}
 @Test public void shouldNotAllowASongToAppearMoreThanOnce() {}
}

By capturing the requirements/behaviors in the test class, we don’t need to document them elsewhere. We can even add JavaDoc if the name isn’t clear. And, with a little work, we could generate that list of requirements by processing (or sequencing!) our code, as long as we follow the convention that the method names form a camel-case but readable English description of the behavior. (In fact, the org.jboss.dna.common.text.Inflector has a method to “humanize” camel-case and underscore-delimited strings, making it a cinch to output a human-readable description.)
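The core of that convention is trivial to implement. Here’s a toy version (not DNA’s actual Inflector, just a sketch of the idea):

```java
public class BehaviorNames {
    /**
     * Turn a camel-case test method name into a readable behavior phrase
     * by lower-casing each capital letter and preceding it with a space.
     */
    public static String humanize(String methodName) {
        StringBuilder words = new StringBuilder();
        for (char c : methodName.toCharArray()) {
            if (Character.isUpperCase(c)) {
                words.append(' ').append(Character.toLowerCase(c));
            } else {
                words.append(c);
            }
        }
        return words.toString();
    }
}
```

For example, humanize("shouldNotAllowANullName") yields "should not allow a null name", which is exactly the requirement as originally written.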

And our test class even compiles. Pretty cool, huh? Oh, and that last requirement that’s not very intuitive? We now have a specification (test method) that verifies the seemingly odd behavior, so if a developer later on changes this behavior, it’ll get caught. (Of course the developer might just blindly change the test, but that’s another problem, isn’t it?)

But back to our development process. At this point, we could implement the test methods using the non-existent Playlist class. It may not compile, but we could then use our IDE to help us create the Playlist class and the methods we actually want. Of course, if this is too weird, you can always stub out the class and then implement the test methods. Personally, I like to implement some of the test methods before going any further, and we’ll use Mockito to stub out a Song implementation.

public class PlaylistTest {
  private Playlist playlist;
  private String validName;
  private Song song1;
  private Song song2;

  @Before
  public void beforeEach() {
    validName = "Pool party songs";
    playlist = new Playlist();
    song1 = mock(Song.class);
    song2 = mock(Song.class);
  }

  @Test(expected = IllegalArgumentException.class)
  public void shouldNotAllowANullName() {
    playlist.setName(null);
  }

  @Test(expected = IllegalArgumentException.class)
  public void shouldNotAllowABlankName() {
    playlist.setName("   ");
  }

  @Test
  public void shouldAlwaysHaveAName() {
    assertThat(playlist.getName(), is("New Playlist"));
  }

  @Test
  public void shouldAllowTheNameToChange() {
    validName = "New valid playlist name";
    playlist.setName(validName);
    assertThat(playlist.getName(), is(validName));
  }

  @Test
  public void shouldHaveADurationThatIsASummationOfTheDurationsOfEachSong() {
    when(song1.getDurationInSeconds()).thenReturn(217);
    when(song2.getDurationInSeconds()).thenReturn(122);
    playlist.add(song1);
    playlist.add(song2);
    assertThat(playlist.getDurationInSeconds(), is(339));
    verify(song1, times(1)).getDurationInSeconds();
    verify(song2, times(1)).getDurationInSeconds();
  }

  @Test
  public void shouldNotAllowASongToAppearMoreThanOnce() {
    assertThat(playlist.add(song1), is(true));
    assertThat(playlist.add(song1), is(false));
  }
}

Now we can complete the Playlist class and round out more tests as we discover new requirements and behaviors. Rinse and repeat. And we’ve done it all with just a little convention and JUnit 4.4, meaning it works in our IDE and in our continuous integration system.

The real change that BDD brings is just thinking differently. So while there are some Java frameworks for BDD (e.g., JDave and JBehave), the real benefit comes from changing your testing behavior, not changing your tools.

I hope this long post has inspired you to rethink how you do testing and to give BDD a try. Let us know what you find!

Filed under: techniques, testing, tools

Continuous integration

JBoss DNA is now using automated continuous integration to make sure that everyone knows whether the codebase compiles and whether unit tests pass.

We’re currently running two jobs for DNA:

  • Continuous – Consists of compiling all DNA projects and executing all unit tests. This job is run only when changes have been made, and results of a run are published about every 30 minutes.
  • Nightly – Consists of compiling all DNA projects and executing all unit and integration tests. This job is run every morning, whether or not there are changes to the code.

By the way, this information along with the instructions for getting the code is on our project’s Subversion page.

Filed under: techniques, testing

Speedy unit testing with Jackrabbit

We’re using Apache Jackrabbit for one of the JCR implementations in our unit tests. Jackrabbit isn’t intuitive at first (like many libraries, it’s highly configurable and thus non-trivial to set up), so the trick for us was figuring out how we wanted to use it in our unit tests.

One of the more important qualities of a unit test is that it’s fast. We do a lot of unit testing, and so we run unit tests very frequently. Change, compile, run tests. Repeat. Repeat again, and again. The longer the tests take to run, the more they interrupt this process and your train of thought. (More on our testing philosophy and techniques in a future post.)

So we’ve found that the easiest way to speed up Jackrabbit is to use the in-memory persistence manager and the in-memory file system implementations. Here’s a snippet of the XML configuration showing the in-memory file system for the “/repository” branch:

<FileSystem class="org.apache.jackrabbit.core.fs.mem.MemoryFileSystem">
  <param name="path" value="${rep.home}/repository"/>
</FileSystem>

and here’s a snippet showing the XML configuration for the in-memory persistence manager:

<PersistenceManager class="org.apache.jackrabbit.core.persistence.mem.InMemPersistenceManager">
  <param name="persistent" value="false"/>
</PersistenceManager>

Remember, there are two persistence managers and three file systems in the normal configuration, so make sure to change all of them.
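Putting it together, the relevant portions of a fully in-memory repository.xml would look roughly like the following. This is a sketch following Jackrabbit’s configuration DTD; the Security and Workspaces sections, which don’t change, are elided.

```xml
<Repository>
  <!-- 1st file system: repository-level storage -->
  <FileSystem class="org.apache.jackrabbit.core.fs.mem.MemoryFileSystem">
    <param name="path" value="${rep.home}/repository"/>
  </FileSystem>
  <!-- ... Security and Workspaces sections unchanged ... -->
  <Workspace name="${wsp.name}">
    <!-- 2nd file system and 1st persistence manager: per-workspace storage -->
    <FileSystem class="org.apache.jackrabbit.core.fs.mem.MemoryFileSystem">
      <param name="path" value="${wsp.home}"/>
    </FileSystem>
    <PersistenceManager class="org.apache.jackrabbit.core.persistence.mem.InMemPersistenceManager">
      <param name="persistent" value="false"/>
    </PersistenceManager>
  </Workspace>
  <Versioning rootPath="${rep.home}/version">
    <!-- 3rd file system and 2nd persistence manager: versioning storage -->
    <FileSystem class="org.apache.jackrabbit.core.fs.mem.MemoryFileSystem">
      <param name="path" value="${rep.home}/version"/>
    </FileSystem>
    <PersistenceManager class="org.apache.jackrabbit.core.persistence.mem.InMemPersistenceManager">
      <param name="persistent" value="false"/>
    </PersistenceManager>
  </Versioning>
</Repository>
```

If any one of the five entries is left pointing at a disk-based implementation, the tests will still pass but will quietly lose most of the speedup.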

Then in your test code, create an instance of the TransientRepository class by passing in the location of your configuration file and the location of the directory used for the repository data. We’re using Maven 2, so our configuration file goes in “./src/test/resources/” while we use “./target/testdata/jackrabbittest/repository” for the test data directory.

We’re also using JUnit (version 4.4), so one decision we had to make was whether to set up the repository in a @Before method and tear it down in an @After method. This makes all the tests easy to write, but it also means that the repository is set up and torn down for every test case, which is slower than necessary. And since I like to have a single test class for each class, my test classes often have a mixture of test methods that need a repository and test methods that don’t.

The pattern we’ve settled on is to create an abstract base class that sets up the repository in a “startRepository()” method and automatically tears it down in the @After method if needed. Our unit test classes that use Jackrabbit then simply extend the base class and call “startRepository()” in those test methods that need the repository. Test methods that don’t need a repository don’t take the time to set one up. Plus, I personally like that this explicit call makes it more obvious which tests need the repository.

There’s one final twist. The TransientRepository cleans itself up when the last session is closed (not when the instance is garbage collected). Since some tests save changes in a session, closing that session and opening a new one would make all the data go away. To fix this, our “startRepository()” method creates a “keep alive” session, and our @After tear-down method closes that session if it’s there.

Here’s the basics of our abstract base class:

public abstract class AbstractJackrabbitTest { // class name for illustration

    // Paths described above; the config file name is assumed
    private static final String REPOSITORY_CONFIG_PATH = "./src/test/resources/repository.xml";
    private static final String REPOSITORY_DIRECTORY_PATH = "./target/testdata/jackrabbittest/repository";

    private static Repository repository;
    private Session keepAliveSession;

    @BeforeClass
    public static void beforeAll() throws Exception {
        // Clean up the test data ...

        // Set up the transient repository (this shouldn't do anything yet)...
        repository = new TransientRepository(REPOSITORY_CONFIG_PATH, REPOSITORY_DIRECTORY_PATH);
    }

    @AfterClass
    public static void afterAll() throws Exception {
        try {
            JackrabbitRepository jackrabbit = (JackrabbitRepository)repository;
            jackrabbit.shutdown();
        } finally {
            // Clean up the test data ...
        }
    }

    public void startRepository() throws Exception {
        if (keepAliveSession == null) {
            keepAliveSession = repository.login();
        }
    }

    @After
    public void shutdownRepository() throws Exception {
        if (keepAliveSession != null) {
            try {
                keepAliveSession.logout();
            } finally {
                keepAliveSession = null;
            }
        }
    }
}
So setting up unit tests is a piece of cake, and they run very quickly. Now we’re getting somewhere.

Filed under: techniques, testing, tools

ModeShape is

a lightweight, fast, pluggable, open-source JCR repository that federates and unifies content from multiple systems, including file systems, databases, data grids, other repositories, etc.

Use the JCR API to access the information you already have, or use it like a conventional JCR system (just with more ways to persist your content).

ModeShape used to be 'JBoss DNA'. It's the same project, same community, same license, and same software.