
Tuesday, April 29, 2014

JUnit Testing with Jackrabbit

Writing unit tests for your code is not only a best practice, it's essential for writing quality code. To write good unit tests, you should mock the code that is not under test. But what if you're using a technology or an API that would require quite a lot of complicated mocks?
In this article I'd like to describe how to write unit tests for code that accesses a JCR repository.

At first I really tried to mock the JCR API using Mockito, but stopped my attempt at the point where I had to mock the behavior of the Query Object Model. It became apparent that writing the mocks would far outweigh the effort of writing the actual production code. So I had to search for an alternative and found one.

The reference implementation of JCR is the Apache Jackrabbit project. It comes with a set of JCR Repository implementations, one of which is the TransientRepository. The TransientRepository starts the repository on the first login and shuts it down when the last session is closed. The repository only lives while sessions are open and starts quickly, which makes it a good fit for unit testing. Nevertheless, a directory structure is created on disk for the repository, and unless otherwise specified a config file is created as well.

For writing unit tests against this repository, we need the following:
  • a temporary directory to hold the repository's directory structure
  • a configuration file (unless you want one created on every startup)
  • the repository instance
  • a CND content model description to initialize the repository data model (optional)
  • an admin session to perform administrative operations
  • a cleanup operation to remove the directory structure
  • the Maven dependencies to satisfy all of the above
Let's start with the Maven dependencies. You need the JCR Spec, the Jackrabbit core implementation and the Jackrabbit commons for setting up the repository.

<properties>
  <!-- JCR Spec -->
  <javax.jcr.version>2.0</javax.jcr.version>
  <!-- JCR Impl -->
  <apache.jackrabbit.version>2.6.5</apache.jackrabbit.version>
</properties>
...
<dependencies>
  <!-- The JCR API -->
  <dependency>
    <groupId>javax.jcr</groupId>
    <artifactId>jcr</artifactId>
    <version>${javax.jcr.version}</version>
  </dependency>
  <!-- Jackrabbit content repository -->
  <dependency>
    <groupId>org.apache.jackrabbit</groupId>
    <artifactId>jackrabbit-core</artifactId>
    <version>${apache.jackrabbit.version}</version>
    <scope>test</scope>
  </dependency>
  <!-- Jackrabbit Tools like the CND importer -->
  <dependency>
    <groupId>org.apache.jackrabbit</groupId>
    <artifactId>jackrabbit-jcr-commons</artifactId>
    <version>${apache.jackrabbit.version}</version>
    <scope>test</scope>
  </dependency>
</dependencies> 

Now let's create the directory for the repository. I recommend locating it in a temporary folder so that multiple test runs don't affect each other if a cleanup failed. We use Java's temporary-directory facility for that:
import java.nio.file.Files;
import java.nio.file.Path;
...
// prefix for the repository folder
private static final String TEST_REPOSITORY_LOCATION = "test-jcr_";
...
final Path repositoryPath = 
      Files.createTempDirectory(TEST_REPOSITORY_LOCATION);

Next, you require a configuration file. If you already have a configuration file available on the classpath, e.g. in src/test/resources, you should load it first:
final InputStream configStream = 
  YourTestCase.class.getResourceAsStream("/repository.xml");

Knowing the location and the configuration, we can create the repository:
import org.apache.jackrabbit.core.config.RepositoryConfig;
import org.apache.jackrabbit.core.TransientRepository;
...
final Path repositoryLocation = 
      repositoryPath.toAbsolutePath();
final RepositoryConfig config = 
      RepositoryConfig.create(configStream, repositoryLocation.toString());
final TransientRepository repository = 
  new TransientRepository(config);

If you omit the config parameter, the repository is created in the working directory, including a generated repository.xml file, which is good for a start if you have no such file yet.
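As a minimal sketch of that variant (assuming, as later in this article, a static repository field on the test class and a setup method that may throw checked exceptions):
import org.junit.BeforeClass;
...
@BeforeClass
public static void createRepository() throws Exception {
  // no config and no home directory given: Jackrabbit falls back to its
  // defaults and generates a repository.xml on the first startup
  repository = new TransientRepository();
}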

Now that we have the repository, we want to log in to create an (admin) session in order to populate the repository. Therefore we create the credentials (the default admin user is admin/admin) and perform a login:
import javax.jcr.Credentials;
import javax.jcr.Session;
import javax.jcr.SimpleCredentials;
...
final Credentials creds = 
  new SimpleCredentials("admin", "admin".toCharArray());
final Session session = repository.login(creds);

With the repository running and an open session, we can initialize the repository with our content model if we require extensions beyond the standard JCR/Jackrabbit content model. In the next step I import a model defined in the Compact Node Type Definition (CND) format, as described in JCR 2.0:
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URL;
import org.apache.jackrabbit.commons.cnd.CndImporter;
...
private static final String JCR_MODEL_CND = "/jcr_model.cnd.txt";
...
final URL cndFile = YourTestCase.class.getResource(JCR_MODEL_CND);
final Reader cndReader = new InputStreamReader(cndFile.openStream());
// the third parameter re-registers node types that already exist
CndImporter.registerNodeTypes(cndReader, session, true);

All the code examples above should be placed in the @BeforeClass annotated method, so that the repository is created only once for the entire test class; otherwise a lot of overhead is generated. The node structures needed by the individual tests, however, should be created and erased again in the @Before and @After annotated methods (addNode() etc.), as sketched below.
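A minimal sketch of what such per-test setup and teardown could look like, assuming the admin session from above is kept in a static field and using a hypothetical testRoot node (names and node types are just examples):
import javax.jcr.Node;
import javax.jcr.RepositoryException;
import javax.jcr.Session;
import org.junit.After;
import org.junit.Before;
...
private static Session adminSession; // created in the @BeforeClass method shown above

private Node testRoot;

@Before
public void createTestData() throws RepositoryException {
  // create a fresh subtree below the root node for every single test
  testRoot = adminSession.getRootNode().addNode("testRoot", "nt:unstructured");
  testRoot.setProperty("title", "test data");
  adminSession.save();
}

@After
public void removeTestData() throws RepositoryException {
  // erase the subtree again so the next test starts from a clean repository
  testRoot.remove();
  adminSession.save();
}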

Finally, after you have performed your tests, you should clean up the test environment again. Because a directory was created for the transient repository, we have to remove that directory, otherwise the temp folder will grow over time.
There are three options for cleaning it up.
  1. Cleaning up in the @AfterClass annotated method
  2. Cleaning up using File::deleteOnExit()
  3. Cleaning up using a shutdown hook
I prefer combining 1 and 3 for fail-safe deletion. For option 1 we need a method that shuts down the repository and cleans up the directory. For deleting the directory I use FileUtils from Apache Commons IO, as it can delete directory structures containing files and subdirectories. The method could look like this:

import java.io.File;
import java.io.IOException;
import org.apache.commons.io.FileUtils;
...
@AfterClass
public static void destroyRepository(){
  repository.shutdown();
  final String repositoryLocation = repository.getHomeDir();
  try {
    FileUtils.deleteDirectory(new File(repositoryLocation));
  } catch (final IOException e) {
   ...
  }
  repository = null;
}

As a fail-safe, I prefer to add an additional shutdown hook that is executed when the JVM shuts down. This deletes the repository even when the @AfterClass method is not invoked by JUnit. I do not use File's deleteOnExit() method, as it requires the directory to be empty, whereas in a shutdown hook I can run my own cleanup implementation and call any code I like.


A shutdown hook can easily be added to the runtime by specifying a Thread to be executed on VM shutdown. We simply call the destroy method from its run() method:
Runtime.getRuntime().addShutdownHook(new Thread("Repository Cleanup") {
  @Override
  public void run() {
    destroyRepository();
  }
});

Now you should have everything you need to set up your test JCR repository and tear down the test environment afterwards. Happy testing!

Monday, April 28, 2014

Configuring SLF4J with Log4J2 using Maven

In our current project we had to decide on a logging framework. While it appears to be a common standard to use the Simple Logging Facade for Java (SLF4J) on the caller side, there were various options for the logging framework itself. In the end we decided to use Log4j 2, mainly for reasons of better performance in comparison to other frameworks (see The Logging Olympics).
The setup of this combination is fairly easy, and there are good examples of how to add an SLF4J binding to your Maven project, as in this blog article. But as Log4j 2 is rather new and no final release has been announced yet (the current version is release candidate 1), it's hard to find a definitive configuration example, so I decided to put down ours.

<properties>
  <slf4j.version>1.7.6</slf4j.version>
  <!-- current log4j 2 release -->
  <log4j.version>2.0-rc1</log4j.version> 
</properties>
...
<dependencies>
  <dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-api</artifactId>
    <version>${slf4j.version}</version>
  </dependency>
  <!-- Binding for Log4J -->
  <dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-slf4j-impl</artifactId>
    <version>${log4j.version}</version>
  </dependency>
  <!-- Log4j API and Core implementation required for binding -->
  <dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-api</artifactId>
    <version>${log4j.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-core</artifactId>
    <version>${log4j.version}</version>
  </dependency>
</dependencies> 

And a basic Log4j 2 configuration (a log4j2.xml file located on your src/main/resources path) could look like this:
<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="OFF">
 <Appenders>
  <Console name="Console" target="SYSTEM_OUT">
   <PatternLayout pattern="%d{HH:mm:ss} [%t] %-5level %logger{36} - %msg%n" />
  </Console>
 </Appenders>
 <Loggers>
  <Logger name="yourLogger" level="debug" additivity="false">
   <AppenderRef ref="Console" />
  </Logger>
  <Root level="error">
   <AppenderRef ref="Console" />
  </Root>
 </Loggers>
</Configuration>
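
For completeness, here is a minimal sketch of the caller side using the SLF4J API; the logger name "yourLogger" matches the <Logger> entry in the configuration above, while the class itself is just an example:
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class LoggingExample {

  // resolves to the "yourLogger" configuration above
  private static final Logger LOG = LoggerFactory.getLogger("yourLogger");

  public static void main(final String[] args) {
    // parameterized messages avoid string concatenation if the level is disabled
    LOG.debug("Application started with {} argument(s)", args.length);
    LOG.error("Something went wrong");
  }
}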

Tuesday, April 1, 2014

Guard Conditions of UML Decision Nodes

At the beginning of the year I attended a requirements engineering class where we discussed - among other topics - activity diagrams. One element of an activity diagram is the decision node, the diamond shape, where you can branch the flow of activities depending on a guard condition on each outgoing edge. There we stumbled upon an aspect of guard conditions that was - at least to me - completely new.

According to the UML specification, the guard condition protects the entry of an edge: if the condition is not fulfilled, the edge is not entered. Before the above-mentioned discussion I had always assumed that an edge without a condition evaluates to true, is evaluated last of all outgoing edges, and is therefore entered only if none of the other edges' conditions applies. But that is wrong!

The truth is that, according to the UML specification, the order in which the edges are "processed" is not determined and may therefore be completely random. So if you define an outgoing edge of a decision node without a guard condition, that edge may be entered even though another edge's condition is met but would only be evaluated after the unguarded edge.

The conclusion: if you want deterministic behavior of your activities, always define the guard conditions of all outgoing edges of your decision nodes.