It’s almost a certainty that you will have multiple applications, and multiple threads within those applications, simultaneously updating data in your database. The speed of your application will depend significantly on how fast your database can perform these simultaneous updates.
If you’re using ModeShape, the first thing to know is that reading content does not require any locks. In other words, applications or threads that are reading content can always do so with no contention. (ModeShape doesn’t need read locks because it uses MVCC, via Infinispan, to isolate readers from writers. See the details for more.)
The second thing to know is that, because ModeShape is a hierarchical database, all data is stored in a tree-like structure of nodes and properties, and any transaction updating content must obtain locks for all nodes being updated. Much of the time, applications and threads that change content do tend to update different parts (subtrees) of the database, which means completely different write locks are acquired by the different transactions. In other words, updates to different parts of the database never block each other.
There are times, however, when multiple applications and/or threads do attempt to update the same node at the same time. In this case, the transactions do compete for the node’s lock, and these transactions complete in essentially a serialized fashion. (Again, they still do not block any reading operations or any transactions updating other areas of the repository.) Occasionally two transactions may deadlock, because they each obtain a lock on separate nodes and then try to obtain a lock on the node currently locked by the other. If you run into this situation, you can enable deadlock detection to automatically detect such cases and roll back one of the deadlocked transactions, which your application can simply re-try by performing the save again.
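The "simply re-try" step can be sketched as a small loop. This is a minimal sketch in plain Java, not ModeShape API: `RetryingSave`, `withRetry`, and `RolledBackException` are hypothetical names, and the exception is only a stand-in for whatever your repository or transaction manager actually reports when it rolls back a deadlocked transaction.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper: re-runs a unit of work when its transaction was rolled back. */
final class RetryingSave {

    /** Assumption: stand-in for the exception your stack raises on a deadlock rollback. */
    static final class RolledBackException extends RuntimeException {
        RolledBackException(String message) {
            super(message);
        }
    }

    /**
     * Runs {@code work} up to {@code maxAttempts} times. The work should make
     * its changes and save them each time it is called, because a rollback
     * discards whatever the previous attempt did.
     */
    static <T> T withRetry(int maxAttempts, Callable<T> work) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RolledBackException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return work.call();
            } catch (RolledBackException e) {
                last = e; // rolled back as a deadlock victim: try again
            } catch (Exception e) {
                // Anything else is not treated as retryable in this sketch.
                throw new IllegalStateException("non-retryable failure", e);
            }
        }
        throw last; // every attempt was rolled back
    }
}
```

In a real application the `Callable` would redo the node changes and call save on the session. A short pause between attempts may help keep the same two transactions from colliding again.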
It’s nice to know that most of the time, applications will not have any contention. And when there is contention for concurrent writes to the same areas, ModeShape does the logical thing by serializing the transactions. (Isn’t ACID behavior nice?!)
But even after all this, you may find that your applications are still highly contentious while trying to concurrently update the same nodes. In these cases, you have several options:
- Can you initialize the highly-contentious area when the database is created? If so, then the different transactions will update different areas of the database.
- Can you alter the hierarchical design of your database to eliminate the contention? Consider if your hierarchy would improve by adding one or more time-based levels. Or consider inserting a level for different contexts (e.g., users, groups, customers, etc.).
- Can you centralize where/how your application is updating these areas? For example, a hierarchy that includes a level for users might have contention when adding users. Try centralizing the process of adding users. (Queues often work great for these kinds of patterns.)
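Two of these options can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions: `ContentionSketch`, `timeBucketedPath`, and `SingleWriter` are hypothetical names, the year/month/day layout is just one possible choice of time-based levels, and the queue is an ordinary `java.util.concurrent` single-thread executor rather than anything ModeShape provides.

```java
import java.time.LocalDate;
import java.util.Locale;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

final class ContentionSketch {

    /**
     * Inserts year/month/day levels between a root and a node name, so that
     * writers on different days add children under different parent nodes.
     */
    static String timeBucketedPath(String root, LocalDate date, String name) {
        return String.format(Locale.ROOT, "%s/%04d/%02d/%02d/%s",
                root, date.getYear(), date.getMonthValue(), date.getDayOfMonth(), name);
    }

    /**
     * Funnels every write to one hot area through a single thread, so those
     * writes queue up in the application instead of competing for a node lock.
     */
    static final class SingleWriter implements AutoCloseable {
        private final ExecutorService queue = Executors.newSingleThreadExecutor();

        /** Enqueues a write; tasks run one at a time, in submission order. */
        Future<?> submit(Runnable write) {
            return queue.submit(write);
        }

        /** Stops accepting writes and waits briefly for queued ones to finish. */
        @Override
        public void close() {
            queue.shutdown();
            try {
                queue.awaitTermination(10, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt
            }
        }
    }
}
```

In practice each submitted `Runnable` would open a session, add the user node (or other contended node), and save. Callers that need the outcome can wait on the returned `Future`.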
By the way, how does ModeShape compare to other hierarchical data stores? Really well, actually. One of the more popular JCR implementations uses a single, cluster-wide, global write lock that guarantees that only one write will proceed at a time. Yikes.