The ModeShape team is happy to announce that we’ve issued the first alpha release of ModeShape 3. This is the first alpha release we’ve ever made, and it’s still rough around the edges. But we’re so excited about ModeShape 3 that we had to share. (And, yes, this post is really long, but it’s a good read.)
Our goal for ModeShape 3 is for it to be the seriously fast, very scalable, and highly available JCR implementation. To do that, we’ve made some pretty significant architectural changes. Some of these are:
- We’re using Infinispan for all caching and storage. This gives the foundation we need to meet our goals while giving us the flexibility for how to store the content (via cache stores). ModeShape can still be embedded into applications, but Infinispan will help us scale out to create truly distributed, multi-site, content grids. This completely replaces our old connector framework.
- So far our tests show ModeShape 3 is ridiculously fast. It’s all around faster than 2.7 – in fact, most operations are at least one (if not several!) orders of magnitude faster. We’ll publish proper performance and benchmarking results closer to the final release.
- Scalability not only includes clustering (and “scaling out”), but it also means handling a wider range of node structures. We’ve tested our new approach with 100s of thousands of child nodes under a single parent, even when those nodes have ordered children with same-name-siblings. Yet it’s still almost just as fast as nodes with just a few child nodes!
- Configuring repositories is hopefully much easier. There is no more global configuration of the engine; instead, each repository is configured with a separate JSON file that conforms to a JSON Schema and that your application can validate with one method call. Check out this entirely valid sample configuration file. You can deploy new repositories at runtime, and can even change a repository’s configuration while it is running (some restrictions apply). For example, you can add/change/remove sequencers, authorization providers, and many other configuration options while the repository is being actively used.
- ModeShape continues to have great options for storing your content. ModeShape 2 had its own connector framework, but with ModeShape 3 we’re simply using Infinispan’s cache stores, with a number of great options out-of-the-box:
- In-memory (no cache store)
- BerkleyDB, which is quite fast but has license restrictions
- JDBM, a free alternative to BerkleyDB
- Relational databases (via JDBC), including in-memory, disk-based, or remote
- File system
- Cassandra
- Cloud storage (e.g., Amazon’s S3, Rackspace’s Cloudfiles, or any other provider supported by JClouds)
- Remote Infinispan grid
- Every session now immediately sees all changes persisted/committed by other sessions, although transient changes of the session still take precedence. This behavior is different from in 2.x, and when combined with the new way node content is being store will hopefully reduce the potential for conflicts during session save operations. This means that all the Sessions using a given workspace can share the cache of persisted content, resulting in faster performance and smaller memory footprint. That means that ModeShape can handle more sessions at the same time in a single process.
- Our Session, Workspace, NodeTypeManager and other components are thread safe. The JCR specification only requires that the Repository and RepositoryFactory interfaces are thread-safe. But making our implementations thread-safe means that it’s possible for multiple threads to share one Session for reading. Of course, Session is inherently stateful, so sharing a Session for writes is still a bad thing to do.
- We have a new public API for monitoring the history, activity and health of ModeShape.
- We’ve changed our sequencing API to use the JCR API. This should make it much easier to create your own sequencers, plus sequencers can also dynamically register namespaces and node types. We’ve already migrated most of our 2.x sequencers to this new API, and will be migrating the rest over the next few weeks.
- Handling of binary values is greatly improved with a new facility that can store binary values of all sizes, including those that are (much) larger than available memory. In fact, only small binary values are stored in memory (this is configurable), while all other binary value are only streamed. We’ve started out with a file system store that will work even in clustered environments, but we also plan to add stores that use Infinispan and DBMSes.
- We’re still using Lucene for our indexes, but we’re now using Hibernate Search to give us durable and fast ways to update the indexes, even in a cluster. Note that Hibernate Search is part of the Hibernate family, but it’s a small library that does not use, depend on, or require JPA or the Hibernate ORM.
As if that’s not enough, we still have a lot to do:
- Kits for deploying ModeShape 3 as a service in JBoss AS7, allowing you to use the AS7 tooling to configure, deploy, manage, monitor, and undeploy your JCR repositories. Infinispan and JGroups are also built-in services in AS7 and can be managed the same way. Plus, ModeShape clustering will work out of the box using AS7’s built-in clustering (domain management) mechanism. ModeShape and JBoss AS7 will be the easiest way to deploy, manage and operate enterprise-grade repositories.
- JTA support will allow JCR Sessions to participate in XA and container-managed transactions. We’re already using JTA transactions internally with Infinispan, so we’re already a good way toward this feature.
- Map-Reduce is a great way to process in parallel large amounts of information. ModeShape will let you validate the entire repository content against the current set of node types or even a proposed set of node types, making it far easier to safely and confidently change the node types in a large repository. And we’ll provide a way for you to write your own mappers, reducers, and collectors to implement any kind of (read-only) analysis you want.
Hopefully you’re just as excited as we are. We love how far we’ve able to come with ModeShape 3, and we’re only part way there.
The good news is that you can start kicking the tires and seeing for yourself just how fast ModeShape 3 is. Most of the JCR features are working and are ready for trial and testing. In fact, please file bug reports if you find anything that doesn’t work. But unfortunately a few things still aren’t complete or working well enough:
- Queries will parse but can’t be executed. Most of it works, but a few key pieces don’t work. Consequently, the JDBC drivers don’t work.
- Clustering and shareable nodes don’t work.
- AS7 kits are incomplete and not yet usable.
- The RESTful and WebDAV services aren’t working as we’d like, so we excluded them from the alpha.
- Federation is not yet working; see this discussion for how we want to expand federation capabilities.
We’re also overhauling our documentation to make it even more useful. But it’s a little sparse at the moment, we’re focusing on the code. Our What’s New and Getting Started pages are pretty useful, though, and should help you get your testing going. We also have some sample (and stand-alone) example Maven projects on GitHub that you can clone and hack to start putting ModeShape 3 through its paces.
What’s next? Well, we’re continuing to implement the missing and incomplete features, and we plan to release a second alpha in the next few weeks. We’ll follow that up over the following month with a couple of feature-complete beta releases and the final 3.0. release. Stay tuned!
Now, wasn’t that worth a few minutes of your time? We’re really excited about ModeShape 3, and think you’ll really like it, too.
Filed under: features, jcr, news, releases, repository, testing