ModeShape

An open-source, federated content repository

New disk storage option for ModeShape

We’re introducing a new feature that allows ModeShape to store content directly on disk using the native file system. It’s called the Disk Connector, and is capable of storing any content that applications can put into a repository. It’s already in the ‘master’ branch and will be in the upcoming 2.6.0.Beta1 release of ModeShape. (If you want to give it a try before the release, grab the latest from our repository, run a local build to install it into your local Maven repository, and use the ‘2.6-SNAPSHOT’ version in your application’s POM file.)

So now ModeShape offers five connectors that can store all valid JCR content (including ‘mix:referenceable’ and ‘mix:versionable’ nodes, REFERENCE properties, version histories, etc.) and can also find nodes by identifier. We’ve designed all these connectors to own their data, meaning other applications should not directly access the underlying storage system. Any one of these is a great fit for most applications:

  • JPA Connector – stores all content in one of the 17 relational DBMS systems supported by Hibernate, including DB2, Oracle, MySQL, PostgreSQL, and SQL Server (to name a few)
  • Infinispan Connector – stores all content in a fast, scalable, distributed, and fault-tolerant Infinispan data grid
  • JBoss Cache Connector – stores all content in a JBoss Cache instance, and is useful for small-to-medium sized repositories when Infinispan is not available
  • In-memory Connector – stores all content in-memory, and is fast and useful for small transient repositories or when importing XML and using JCR to read and search the content
  • Disk Connector – stores all content on disk in a binary format defined by ModeShape

ModeShape also offers other connectors that enable accessing the information in external systems, even when other applications use those same systems:

  • File System Connector – reads and writes ‘nt:file’, ‘nt:folder’ and ‘nt:resource’ nodes on the native file system using regular files and directories, mapping the properties defined by these node types to the actual file and directory attributes, and storing extra properties added to nodes via mixins in UTF-8 files (BINARY properties stored encoded in hexadecimal) that your applications can even read
  • JCR Connector – reads and writes content into an external JCR repository, and is useful when migrating from other JCR implementations or when federating existing JCR repositories into a single repository
  • Subversion Connector – reads and writes ‘nt:file’, ‘nt:folder’ and ‘nt:resource’ nodes as files and directories in a SVN repository; unlike the File System Connector, this only supports the standard properties defined on the ‘nt:file’, ‘nt:folder’, and ‘nt:resource’ node types
  • JDBC Metadata Connector – a read-only connector that maps the JDBC metadata into nodes representing the databases, catalogs, schemas, tables, columns, procedures, and other metadata information, and is very useful if you want to have a JCR repository that contains an accurate schema representation of one or more databases

Filed under: features, jcr, news, techniques

Finding a JCR repository

Updated 6/21/2011: Added section describing the Seam JCR module
Updated 6/23/2011: Added more detail about the JNDI location when ModeShape is deployed to JBoss AS

Okay, you’re using JCR in your application, and you’re writing all of your code to the JCR API. That’s great, because your application doesn’t have any implementation-specific calls, and you can rely only upon the “javax.jcr” packages.

“But,” you ask, “how do I get a reference to the javax.jcr.Repository instance without using implementation-specific code in my app?”

If you’re using JCR 1.0, you’re basically out of luck. The spec didn’t specify how to do that, and so the implementations all do it differently.

But thankfully JCR 2.0 introduced the javax.jcr.RepositoryFactory interface and described how to use the Java SE Service Locator pattern to get that initial reference to your repository instance without any implementation-specific code. Here’s how that works.

Using the JCR 2.0 RepositoryFactory

Your application will have one (or more) JCR implementations on the classpath, and per JCR 2.0 they will each provide their own RepositoryFactory implementation and register it in a “META-INF/services/javax.jcr.RepositoryFactory” file inside their JAR so that the JVM can find it. Your application can find them by using the Service Locator pattern, via java.util.ServiceLoader:

Map parameters = ...
Repository repository = null;
for (RepositoryFactory factory : ServiceLoader.load(RepositoryFactory.class)) {
  repository = factory.getRepository(parameters);
  if (repository != null) break;
}

This iterates over all of the RepositoryFactory implementations, and asks each factory to return the JCR Repository instance for the given map of parameters. Per JCR 2.0, if the RepositoryFactory understands the parameters, it will return a Repository instance; otherwise, it will return null. Now, each JCR implementation is allowed to define its own parameters, so these definitely are still implementation-specific. But since they’re just properties, your application can remain independent of the JCR implementation by simply loading them from a file:

Properties parameters = new Properties();
// Read from a file or from other input streams or readers,
// closing the stream once the properties have been loaded ...
try (FileInputStream stream = new FileInputStream(file)) {
  parameters.load(stream);
}
// Find the Repository instance ...
Repository repository = null;
for (RepositoryFactory factory : ServiceLoader.load(RepositoryFactory.class)) {
  repository = factory.getRepository(parameters);
  if (repository != null) break;
}

Look, Ma! No implementation-specific code!

ModeShape parameters for RepositoryFactory

So what parameters does ModeShape expect? Just one:

org.modeshape.jcr.URL

If the value of this parameter is a URL that resolves to a ModeShape configuration file, the factory will start up a new ModeShape engine using that configuration file, and will look for the repository named in the URL’s “repositoryName” query parameter. For example:

file:config/configRepository.xml?repositoryName=MyRepository

will look for a ModeShape configuration file named “configRepository.xml” that is in the “config” directory relative to where the JVM was started, and will return the repository defined in the configuration file with the name “MyRepository”. (Remember that a single ModeShape engine can host multiple JCR repositories.) Other URLs are possible, as long as they can be resolved to the configuration file.
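As a minimal sketch of how that single parameter might sit in a properties file, the snippet below loads some example file text with java.util.Properties and reads the value back. The class name `ParameterFileSketch` and the file contents are illustrative assumptions, not something shipped with ModeShape. Because only the first “=” or “:” on a line ends the key, the URL value can be written without any escaping.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper for illustration; not part of ModeShape or the JCR API.
class ParameterFileSketch {

    // Assumed contents of a parameters file, using the "file:" URL from the text above.
    static final String EXAMPLE_FILE_CONTENTS =
          "# Parameters handed to RepositoryFactory.getRepository(...)\n"
        + "org.modeshape.jcr.URL=file:config/configRepository.xml?repositoryName=MyRepository\n";

    // Parses properties-file text into a Properties object (which is also a Map).
    static Properties parse(String fileContents) {
        Properties parameters = new Properties();
        try (StringReader reader = new StringReader(fileContents)) {
            parameters.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return parameters;
    }

    public static void main(String[] args) {
        Properties parameters = parse(EXAMPLE_FILE_CONTENTS);
        System.out.println(parameters.getProperty("org.modeshape.jcr.URL"));
    }
}
```

In a real application the text would come from a file on disk, and the resulting Properties object would be passed to each RepositoryFactory as shown earlier.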

If the value of the “org.modeshape.jcr.URL” parameter is a URL that begins with “jndi:”, then the ModeShape factory will attempt to look for a ModeShape engine instance registered in JNDI, and will ask that engine for the named repository. For example:

jndi:name/in/jndi?repositoryName=MyRepository

will look in JNDI for a ModeShape engine at “name/in/jndi”, and will ask it for the repository named “MyRepository”.
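To make the two URL forms concrete, here is a rough sketch of how such a value could be split into its parts. This is not ModeShape’s own parsing code: the class `ModeShapeUrlSketch` and its fields are hypothetical, and it relies on plain string operations, so it ignores escaping and does nothing with any other query parameters.

```java
// Hypothetical illustration only; ModeShape's real factory does its own parsing.
final class ModeShapeUrlSketch {
    private static final String JNDI_PREFIX = "jndi:";
    private static final String REPOSITORY_NAME_PARAM = "repositoryName=";

    final boolean jndi;          // true for "jndi:" URLs, false for configuration-file URLs
    final String location;       // the JNDI name, or the URL of the configuration file
    final String repositoryName; // value of "repositoryName", or null if it is absent

    private ModeShapeUrlSketch(boolean jndi, String location, String repositoryName) {
        this.jndi = jndi;
        this.location = location;
        this.repositoryName = repositoryName;
    }

    static ModeShapeUrlSketch parse(String url) {
        // Everything before the first '?' locates the engine; the rest is the query.
        int queryStart = url.indexOf('?');
        String base = queryStart < 0 ? url : url.substring(0, queryStart);
        String query = queryStart < 0 ? "" : url.substring(queryStart + 1);

        // Pick out the repositoryName parameter, if present.
        String repositoryName = null;
        for (String pair : query.split("&")) {
            if (pair.startsWith(REPOSITORY_NAME_PARAM)) {
                repositoryName = pair.substring(REPOSITORY_NAME_PARAM.length());
            }
        }

        // "jndi:" means an already-running engine; anything else is a configuration file URL.
        boolean isJndi = base.startsWith(JNDI_PREFIX);
        String location = isJndi ? base.substring(JNDI_PREFIX.length()) : base;
        return new ModeShapeUrlSketch(isJndi, location, repositoryName);
    }
}
```

Parsing “jndi:name/in/jndi?repositoryName=MyRepository” this way gives a JNDI name of “name/in/jndi” and a repository name of “MyRepository”, while the “file:” example is treated as a configuration file location.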

The JNDI form is what you’ll use if you’ve deployed ModeShape to JBoss AS and your applications need to access the repositories. ModeShape runs as a service within JBoss AS, so when the app server is started ModeShape will automatically register the engine in JNDI at “jcr/local”. If you’ve not changed the configuration, there will be a repository called “repository” (with a default workspace called “default”, though you can create other workspaces using the JCR API), and you can use the following URL for the “org.modeshape.jcr.URL” parameter:

jndi:jcr/local?repositoryName=repository

Of course, you probably want to change the configuration to add other repositories or to control where and how the repositories store the content (by default it is stored in-memory). If you add repositories or change the name of the repository, you’ll need to change the URL accordingly.
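If your application builds the parameters in code rather than reading them from a file, one way to keep the repository name in a single place is a small helper like the sketch below. The class `JBossAsParametersSketch` and its method are hypothetical names used only for illustration; the JNDI name and parameter key are the ones described above, and the sketch assumes you have not changed where the engine is registered.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of ModeShape.
class JBossAsParametersSketch {
    static final String URL_PARAMETER = "org.modeshape.jcr.URL";
    // JNDI name where the engine is registered by default, per the text above.
    static final String DEFAULT_JNDI_NAME = "jcr/local";

    // Builds the parameter map to hand to RepositoryFactory.getRepository(...).
    static Map<String, String> parametersFor(String repositoryName) {
        Map<String, String> parameters = new HashMap<>();
        parameters.put(URL_PARAMETER,
                       "jndi:" + DEFAULT_JNDI_NAME + "?repositoryName=" + repositoryName);
        return parameters;
    }

    public static void main(String[] args) {
        System.out.println(parametersFor("repository").get(URL_PARAMETER));
    }
}
```

If you rename the repository or add another one, only the argument passed to the helper needs to change.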

Injecting JCR Repositories

If you’re building an application that uses CDI, there’s another option for getting hold of your Repository instance. The Seam JCR project is a portable extension to CDI that provides annotations for automatically injecting a javax.jcr.Repository object into your application, and Seam JCR works with ModeShape and Jackrabbit. Simply ensure that Seam JCR and your JCR implementation are on your classpath, and then use annotations to provide the same parameters normally supplied to the RepositoryFactory. Here’s an example of injecting ModeShape with the same “file:” URL used above:

  @Inject @JcrConfiguration(name="org.modeshape.jcr.URL",
                            value="file:config/configRepository.xml?repositoryName=MyRepository")
  Repository repository;

Seam JCR also makes it easy to inject a JCR Session into your application:

  @Inject @JcrConfiguration(name="org.modeshape.jcr.URL",
                            value="file:config/configRepository.xml?repositoryName=MyRepository")
  Session session;

This code will obtain a Session using the default workspace and no credentials, but the Seam JCR team is working on supporting Credentials and workspace names.

Of course, Seam JCR also works with Jackrabbit, but uses Jackrabbit-specific parameters. For more details, see the Seam JCR site.

Filed under: features, jcr, repository, techniques

What distinguishes ModeShape?

One question we often get about ModeShape is what makes it different from other JCR implementations, including the reference implementation. We’ve answered it in a previous blog post, but it’s important enough to give a more recent and succinct answer.

Here’s a really brief, very high-level summary of what ModeShape is and where our emphases lie:

ModeShape is a lightweight, embeddable, extensible open source JCR repository implementation that federates and unifies content from multiple systems, including file systems, databases, data grids, other repositories, etc. You can use the JCR API to access the information you already have, or use it like a conventional JCR system. It’s useful for portals, for knowledge bases, for storing/versioning artifacts, for managing configuration, for managing metadata, and more. ModeShape is easy to configure, easy to cluster, and easy to extend.

Of course, we can look at some of the ModeShape features to get an even better understanding of what it does and why it rocks:

  • Supports all the JCR 2.0 required features: repository acquisition; authentication; reading/navigating; query; export; node type discovery; permissions and capability checking
  • Supports most of the JCR 2.0 optional features: writing; import; observation; workspace management; versioning; locking; node type management; same-name siblings; orderable child nodes; shareable nodes; and mix:etag, mix:created and mix:lastModified mixins with autocreated properties.
  • Supports the JCR 1.0 and JCR 2.0 languages (e.g., XPath, JCR-SQL, JCR-SQL2, and JCR-QOM) plus a full-text search language based upon the JCR-SQL2 full-text search expression grammar. Additionally, ModeShape supports some very useful extensions to JCR-SQL2:
    • subqueries in criteria
    • set operations (e.g., “UNION”, “INTERSECT”, “EXCEPT”, each with optional “ALL” clause)
    • limits and offsets
    • duplicate removal (e.g., “SELECT DISTINCT”)
    • depth, reference and path criteria
    • set and range criteria (e.g., “IN”, “NOT IN”, and “BETWEEN”)
    • arithmetic criteria (e.g., “SCORE(t1) + SCORE(t2)”)
    • full outer join and cross joins
    • and more
  • Choose from multiple storage options, including RDBMSes (via Hibernate), data grids (e.g., Infinispan), file systems, or write your own storage connectors as needed.
  • Use the JCR API to access information in existing services, file systems, and repositories. ModeShape connectors project the external information into a JCR repository, potentially federating the information from multiple systems into a single workspace. Write custom connectors to access other systems, too.
  • Upload files and have ModeShape automatically parse them, derive structured information representing what’s in those files, and store this derived information in the repository so you can query and access it just like any other content. ModeShape supports a number of file types out-of-the-box, including: CND, XML, XSD, WSDL, DDL, CSV, ZIP/JAR/EAR/WAR, Java source, Java classfiles, Microsoft Office, image metadata, and Teiid models and VDBs. Writing sequencers for other file types is also very easy.
  • Automated and extensible MIME type detection, with out-of-the-box detection using file extensions and content-based detection using Aperture.
  • Extensible text extraction framework, with out-of-the-box support for Microsoft Office, PDF, HTML, plain text, and XML files using Tika.
  • Simple clustering using JGroups.
  • Embed ModeShape into your own application, or deploy on JBoss Application Server, or use in any other application server.
  • RESTful API (requires deployment into an application server).
  • WebDAV support

These are just some of the highlights. For details on these and other ModeShape features, please see the ModeShape documentation.

Filed under: features, jcr, repository

ModeShape is

a lightweight, fast, pluggable, open-source JCR repository that federates and unifies content from multiple systems, including file systems, databases, data grids, other repositories, etc.

Use the JCR API to access the information you already have, or use it like a conventional JCR system (just with more ways to persist your content).

ModeShape used to be 'JBoss DNA'. It's the same project, same community, same license, and same software.
