An open-source, federated content repository

ModeShape isn’t your father’s JCR

It’s true: ModeShape is a JCR implementation that is pretty new.  Why on earth would we create another JCR implementation when other implementations have been around for so long?

For many years, the assumption in the persistent storage world has been that each store should own all the information. Database vendors tried to sell their databases and claimed it would be easy to migrate all of your data into their system. ETL vendors talked about how to load up a data warehouse with all the useful information you need, so it’s all in one place. Document storage systems and other content management systems worked wonders, as long as everything was in their repository. And the JCR implementations followed suit by implementing the JCR API on top of silo repositories (that often used a relational database under the covers).

We see the world differently. We understand that you already have too many information stores, be they databases, file systems, repositories, document stores, or proprietary systems. We believe you shouldn’t need separate APIs to access all of it, and that you shouldn’t have to move all this information into one big silo (and rewrite the applications you already have).  Instead, your databases and repositories should federate all this existing information and provide the view of your information that your applications need [1].  And you should be able to write applications that can take advantage of the information you already have where it is today. And those applications should have to change as little as possible when you have new or different information tomorrow.

It all boils down to using the JCR API to access a variety of information in all kinds of places. The JCR API is an excellent abstraction with powerful features that make it very easy to work with the information in the shape it wants to be today while easily adapting to the shape it will take tomorrow.  This is what the ModeShape project is doing, and here’s how we’re doing it:

Killer Feature #1: Connectors

Implementing the full JCR API on top of multiple kinds of systems would be expensive, time consuming, and painful. We’ve created a connector framework that is simple enough that it’s easy to write new connectors, yet efficient enough to do many, many operations with one (potentially remote) call. Your applications can use ModeShape to make these existing systems look and behave like a real JCR repository.
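
To make the idea concrete, here is a minimal sketch of what a small connector contract could look like. Everything in it is an assumption made for illustration: the `Connector` interface, its `readAll` method, and the `InMemoryConnector` class are hypothetical names, not ModeShape's actual connector API. The only point is that a narrow interface which accepts many paths per call lets a remote source answer in a single round trip.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, simplified connector contract (NOT ModeShape's real API).
// One call carries many paths, so a remote source needs only one round trip.
interface Connector {
    // Returns the properties of every requested path that exists in the source.
    Map<String, Map<String, String>> readAll(List<String> paths);
}

// Illustrative connector backed by an in-process map.
class InMemoryConnector implements Connector {
    private final Map<String, Map<String, String>> nodes = new TreeMap<>();

    // Adds or updates one property on the node at the given path.
    void put(String path, String property, String value) {
        nodes.computeIfAbsent(path, p -> new TreeMap<>()).put(property, value);
    }

    @Override
    public Map<String, Map<String, String>> readAll(List<String> paths) {
        Map<String, Map<String, String>> found = new LinkedHashMap<>();
        for (String path : paths) {
            Map<String, String> props = nodes.get(path);
            if (props != null) {
                found.put(path, new TreeMap<>(props)); // defensive copy
            }
        }
        return found;
    }
}
```

A connector for a file system or a database would implement the same hypothetical interface and translate each path into a file lookup or a query.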

ModeShape JCR and connectors

We’ve begun building a library of connectors that allow us to store content on a data grid (Infinispan), on a distributed cache (JBoss Cache), in relational databases (via Hibernate), and in-memory within the Java process (for small transient use cases).  Our library also includes connectors that access existing file systems, SVN repositories, and even the schemas from existing JDBC databases.

ModeShape connector library

Of course, we’re already working on more connectors, including a connector to other JCR repositories.  And we envision lots of connectors, including connectors to other CMIS repositories, version control systems (like Git and CVS), document databases (like CouchDB and Cassandra), distributed file systems, customer management systems, Maven repositories, LDAP directories, and existing databases. Just to name a few. And we designed the connector framework so that you can write your own.

Killer Feature #2: Federation

Remember all those different silos of information? Using JCR to access each of these is pretty interesting, but what’s really killer is federating the information from multiple sources into a single virtual repository. To your applications, ModeShape looks and behaves like a regular JCR repository. They use the standard JCR API to navigate, search, create, change, and listen for changes in the content. But under the covers, ModeShape is able to federate content from multiple back-end systems using our connectors, ensuring that the repository content stays up-to-date and in-sync with those systems.  And those external systems can continue “owning” the information, and existing applications can continue using them, but new applications using ModeShape can easily access the unified and integrated information.
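
One way to picture federation is as a projection of each source under a mount point in the virtual repository. The sketch below is a toy model under that assumption, not ModeShape's implementation: `FederatedView`, `mount`, and `read` are hypothetical names, and each source is reduced to a plain lookup function.

```java
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;
import java.util.function.Function;

// Hypothetical sketch of federation by path projection (not ModeShape's implementation).
class FederatedView {
    // Mount point in the virtual repository -> lookup function for that source.
    private final Map<String, Function<String, Optional<String>>> mounts = new TreeMap<>();

    void mount(String mountPoint, Function<String, Optional<String>> source) {
        mounts.put(mountPoint, source);
    }

    // Resolves a virtual path by delegating to the source whose mount point
    // is the longest matching prefix of that path.
    Optional<String> read(String virtualPath) {
        String best = null;
        for (String mountPoint : mounts.keySet()) {
            boolean matches = virtualPath.equals(mountPoint)
                    || virtualPath.startsWith(mountPoint + "/");
            if (matches && (best == null || mountPoint.length() > best.length())) {
                best = mountPoint;
            }
        }
        if (best == null) {
            return Optional.empty(); // no source is projected at this path
        }
        String sourcePath = virtualPath.substring(best.length());
        return mounts.get(best).apply(sourcePath.isEmpty() ? "/" : sourcePath);
    }
}
```

In this model the application only ever sees virtual paths such as `/files/readme.txt`. Which back-end system owns the content is decided by the mount table, so sources can be added or swapped without changing the caller.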

ModeShape federation

Killer Feature #3: Sequencing

A lot of repositories exist to store files and other important artifacts, and contained in all those files is a ton of very valuable information. Sure, the repository might process them for searching, but that just extracts the words and phrases. Or, your applications can read the files and process them one at a time. ModeShape sequencers are able to unlock this valuable structured information and put it back into the repository, where it’s accessible via navigation, queries, and searches.

Sequencing is fully automated and done in the background. Simply configure the sequencer and start uploading content.  ModeShape has a library of sequencers, including support for CND, DDL, XML, ZIP, MP3, images, Java source, Java class, text files (character-separated and fixed-width), and Microsoft Office® documents. Of course, we designed it so that you can write your own sequencers, too.
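
As a rough illustration of what a sequencer does, the sketch below derives structured nodes from the raw content of a character-separated text file. It is a simplified model under stated assumptions: `DelimitedTextSequencer`, its `sequence` method, and the `row[n]` naming of derived nodes are hypothetical and are not ModeShape's sequencer API. It also assumes a comma separator and a header row.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sequencer sketch (not ModeShape's sequencer API): it derives
// structured nodes from the raw content of an uploaded character-separated file.
class DelimitedTextSequencer {
    // Returns derived node path -> (column name -> value).
    // Assumes the first line is a header row and the separator is a comma.
    static Map<String, Map<String, String>> sequence(String outputPath, String content) {
        Map<String, Map<String, String>> derived = new LinkedHashMap<>();
        String[] lines = content.split("\\R");
        if (lines.length == 0 || lines[0].trim().isEmpty()) {
            return derived; // nothing to extract
        }
        String[] header = lines[0].split(",");
        for (int row = 1; row < lines.length; row++) {
            if (lines[row].trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] cells = lines[row].split(",", -1);
            Map<String, String> props = new LinkedHashMap<>();
            for (int col = 0; col < header.length && col < cells.length; col++) {
                props.put(header[col].trim(), cells[col].trim());
            }
            derived.put(outputPath + "/row[" + row + "]", props);
        }
        return derived;
    }
}
```

Once the derived nodes are stored back in the repository, each cell becomes an ordinary property that can be reached by navigation or matched by a query, rather than text buried inside a file.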

Killer Feature #4: JCR-SQL2

The JCR API provides a single mechanism for querying the repository content, using a variety of query languages. JSR-170 (aka “JCR 1.0”) requires that repositories support the JCR XPath language (a subset of XPath 2.0), and defines an optional language called “JCR-SQL” that is a simple subset of SQL SELECT statements. JSR-283 (aka “JCR 2.0”) deprecates both XPath and JCR-SQL, and instead mandates support for an improved “JCR-SQL2” language that is a better and more powerful adaptation of SQL.

ModeShape currently supports JCR 1.0, and thus it does support the XPath query language defined by the spec. However, ModeShape also supports the newer JCR-SQL2 query language, along with several major enhancements [2]. In fact, our enhanced JCR-SQL2 is so powerful that ModeShape implements the XPath support by translating XPath expressions into JCR-SQL2 queries.
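
To show the flavor of translating XPath into JCR-SQL2, here is a toy translator that handles exactly one restricted query shape: "all nodes of a given type, optionally below a given path". It is only a sketch. The class name `XPathToSql2` and its regular expression are invented for this example, and a real translator such as ModeShape's has to cover far more of the language.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy translator for ONE restricted JCR XPath shape (an illustration only):
//   //element(*, <type>)
//   /jcr:root<path>//element(*, <type>)
class XPathToSql2 {
    private static final Pattern SHAPE = Pattern.compile(
        "(?:/jcr:root((?:/[\\w:.-]+)*))?//element\\(\\*,\\s*([\\w:.-]+)\\)");

    static String translate(String xpath) {
        Matcher m = SHAPE.matcher(xpath.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unsupported XPath in this sketch: " + xpath);
        }
        String ancestor = m.group(1); // null or empty when no path is given
        String type = m.group(2);
        String sql = "SELECT * FROM [" + type + "] AS n";
        if (ancestor != null && !ancestor.isEmpty()) {
            // Restrict the result to nodes below the given path.
            sql += " WHERE ISDESCENDANTNODE(n, '" + ancestor + "')";
        }
        return sql;
    }
}
```

For example, `//element(*, nt:file)` becomes `SELECT * FROM [nt:file] AS n`, and `/jcr:root/a/b//element(*, nt:file)` gains a `WHERE ISDESCENDANTNODE(n, '/a/b')` constraint. With a JCR 2.0 style API, the resulting statement would typically be handed to `QueryManager.createQuery` together with the `Query.JCR_SQL2` language identifier.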

Not your father’s JCR

Traditionally, applications that use JCR are working with content repositories and content management systems. But chances are you have a lot of valuable information that your JCR repository can’t get to. And you’ve probably come to really like the JCR API, and can imagine how nice it would be to use it to access all that existing information.

So chances are, you need ModeShape. Or at least you need to give it a try. After all, ModeShape is not your father’s JCR. It’s better. Much better.

[1] Federation is in our DNA. The ModeShape project actually came out of the team that built the MetaMatrix commercial data integration and federation engine. MetaMatrix was the first true EII product that allowed applications to access unified and integrated data housed in multiple disparate back-end systems through a single, scalable, virtual database using SQL via JDBC and ODBC. MetaMatrix was acquired by Red Hat in 2007, and seeded the Teiid and Teiid Designer open source projects.

[2] Though not included in JSR-283’s JCR-SQL2, ModeShape adds support for: all the JOIN operators; UNION, INTERSECT, and EXCEPT [ALL] set operations; removal of duplicates via SELECT DISTINCT; LIMIT and OFFSET clauses; new DEPTH and PATH dynamic operands for use in constraint clauses; constraints using IN, NOT IN, and BETWEEN clauses; and arithmetic operations on dynamic operands. For details, see our Reference Guide.


Filed under: features, federation, jcr, repository



ModeShape is

a lightweight, fast, pluggable, open-source JCR repository that federates and unifies content from multiple systems, including file systems, databases, data grids, other repositories, etc.

Use the JCR API to access the information you already have, or use it like a conventional JCR system (just with more ways to persist your content).

ModeShape used to be 'JBoss DNA'. It's the same project, same community, same license, and same software.


