java design-patterns concurrency tdd identity-map

Is there a concurrency problem here? How to test it during development?


Scenario: There are 'n' teams, each working on its own virtual 'wall' (like Facebook's wall). Each team sees only its own wall and the posts on it. A post can be edited by its author or by another team member (if so configured; assume this is the case, since it's a must-have).

Design/technology decisions: RESTful web app using Restlet + GlassFish/Java + MySQL (EDIT: using Apache Commons DbUtils for DB access. No ORM; it seemed like overkill.)

Question: Multiple teams log on, say T1, T2 and T3, each with some number of members. There is concurrency in data access within a team, but not across teams, i.e., different teams access disjoint data sets. To optimize frequent reads and writes against the DB, we are considering a TeamGateway that controls access to the DB and handles concurrency. The web server would cache the data retrieved by the teams to speed up reads (and also to help with updating the list of wall posts).

If 6 people from each of T1 to T3 log on, would ONLY 3 TableGateways be created? Would that help catch concurrent writes (a simple timestamp comparison before committing, or a "conflict-flagged" append) and manage the caching effectively? We plan on having identity maps for the entities: there are 4-5 different entities to track, 4 in a composition hierarchy plus one that is associated with each of the 4.

How would one unit test the gateway (TDD-based or after the fact)?

Thanks in advance!


Solution

  • If you just write to the DB, or to a cache solution on top of the DB (e.g. Spring + Hibernate + EhCache), you don't need to worry about corrupting your tables. In other words, there is no concurrency issue from a low-level point of view.

    If you want to write a cache yourself and deal with the concurrency issues yourself, that will take some effort. If you shard your cache, have one "global lock" per partition (i.e. synchronize on a common mutex), and acquire that lock for every access, then it will work, although it's not the most performant way to do it. Anything more sophisticated than a global lock would involve quite a lot of work.
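
    As a rough illustration of the per-partition lock described above, here is a minimal sketch. It assumes keys are spread across partitions by hash code (a team ID would be a natural key in this scenario). `PartitionedCache` and its methods are hypothetical names, not part of any library.

    ```java
    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.function.UnaryOperator;

    // Hypothetical sketch: a cache split into partitions, each guarded by its own lock.
    final class PartitionedCache<K, V> {
        private final List<Map<K, V>> partitions = new ArrayList<>();

        PartitionedCache(int partitionCount) {
            if (partitionCount <= 0) {
                throw new IllegalArgumentException("partitionCount must be positive");
            }
            for (int i = 0; i < partitionCount; i++) {
                partitions.add(new HashMap<>());
            }
        }

        // Picks the partition for a key; floorMod keeps the index non-negative.
        private Map<K, V> partitionFor(K key) {
            return partitions.get(Math.floorMod(key.hashCode(), partitions.size()));
        }

        V get(K key) {
            Map<K, V> partition = partitionFor(key);
            synchronized (partition) { // every access takes the partition's lock
                return partition.get(key);
            }
        }

        void put(K key, V value) {
            Map<K, V> partition = partitionFor(key);
            synchronized (partition) {
                partition.put(key, value);
            }
        }

        // Read-modify-write as one atomic step; 'change' receives null for a missing key.
        V update(K key, UnaryOperator<V> change) {
            Map<K, V> partition = partitionFor(key);
            synchronized (partition) {
                V updated = change.apply(partition.get(key));
                partition.put(key, updated);
                return updated;
            }
        }
    }
    ```

    The function passed to `update` runs while the partition lock is held, so it should stay short and must not do slow work such as a DB or network call.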

    This is a minor point, but I'm not sure why you'd want to use an identity hash map. I can't think of any particular reason to do that (if you are thinking about performance, the performance of a normal hash map would be the last thing to worry about in this situation!).

    If your entities are articles, then you probably have another form of concurrency issue: the kind that is solved by version control software like SVN or Mercurial. That is, if you don't build merging capability into your app, it becomes an annoyance when somebody edits an article only to find that somebody else has "committed" another edit first. Whether you need to add such a capability depends on the use case.
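
    Short of full merging, the conflicting-edit case can at least be detected with a version check on save. The following is a minimal in-memory sketch of that idea, not a definitive implementation; `WallPostStore`, `VersionedPost` and `tryUpdate` are hypothetical names. Against a real table the same check is usually expressed as an `UPDATE ... WHERE id = ? AND version = ?` followed by a look at the affected row count (column names assumed).

    ```java
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Hypothetical immutable snapshot of a post together with its version number.
    final class VersionedPost {
        final long version;
        final String body;

        VersionedPost(long version, String body) {
            this.version = version;
            this.body = body;
        }
    }

    // Hypothetical in-memory store that rejects edits based on a stale version.
    final class WallPostStore {
        private final Map<Long, VersionedPost> posts = new ConcurrentHashMap<>();

        void create(long postId, String body) {
            posts.put(postId, new VersionedPost(1, body));
        }

        VersionedPost read(long postId) {
            return posts.get(postId);
        }

        // Returns true if the edit was saved, false if the post is missing or
        // somebody else committed an edit after expectedVersion was read.
        boolean tryUpdate(long postId, long expectedVersion, String newBody) {
            VersionedPost current = posts.get(postId);
            if (current == null || current.version != expectedVersion) {
                return false;
            }
            // replace(key, old, new) swaps atomically, and only if the stored value
            // is still 'current' (VersionedPost keeps reference equality).
            return posts.replace(postId, current,
                    new VersionedPost(expectedVersion + 1, newBody));
        }
    }
    ```

    When `tryUpdate` returns false, the caller can reload the post and either ask the user to redo the edit or store it as a flagged conflict.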

    As for testing your app for concurrency, unit testing is a good option. Concurrent unit tests make it much easier to catch concurrency bugs. Writing them is very tough, though, so I recommend going through a good book like "Java Concurrency in Practice" first. It's better to catch your concurrency bugs before integration testing, when it becomes hard to guess what the hell is going on!
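
    One common shape for such a test is to start several threads at the same instant and have them all hit the same object. Below is a minimal sketch in plain Java (no test framework assumed); `ConcurrentHammer` is a hypothetical helper and `HitCounter` is a stand-in for whatever component is really under test.

    ```java
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.Callable;
    import java.util.concurrent.CountDownLatch;
    import java.util.concurrent.ExecutionException;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.TimeoutException;
    import java.util.concurrent.atomic.AtomicInteger;

    // Hypothetical helper: runs 'action' from many threads that start together.
    final class ConcurrentHammer {
        static void run(int threads, int iterationsPerThread, Runnable action) {
            ExecutorService pool = Executors.newFixedThreadPool(threads);
            CountDownLatch ready = new CountDownLatch(threads);
            CountDownLatch start = new CountDownLatch(1);
            Callable<Void> worker = () -> {
                ready.countDown();
                start.await(); // block until every worker is ready
                for (int i = 0; i < iterationsPerThread; i++) {
                    action.run();
                }
                return null;
            };
            try {
                List<Future<Void>> results = new ArrayList<>();
                for (int t = 0; t < threads; t++) {
                    results.add(pool.submit(worker));
                }
                ready.await();
                start.countDown(); // release all workers at once
                for (Future<Void> result : results) {
                    result.get(30, TimeUnit.SECONDS); // rethrows worker failures
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            } catch (ExecutionException | TimeoutException e) {
                throw new IllegalStateException("concurrent run failed", e);
            } finally {
                pool.shutdownNow();
            }
        }
    }

    // Stand-in for the component under test; thread-safe thanks to AtomicInteger.
    final class HitCounter {
        private final AtomicInteger hits = new AtomicInteger();

        void hit() {
            hits.incrementAndGet();
        }

        int total() {
            return hits.get();
        }
    }
    ```

    A test would then call `ConcurrentHammer.run(6, 10_000, counter::hit)` and check that `counter.total()` equals 60000. A passing run does not prove the code is thread-safe, but a lost update in an unsafe implementation tends to show up quickly under this kind of load.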

    UPDATE:
    @Nupul: That's a difficult question to answer. However, if you just have 18 humans typing stuff, my bet is that writing to the DB every time would be just fine.

    If you don't store any state elsewhere (i.e. only in the DB), you should get rid of any unnecessary mutexes. In my opinion, you should not store state anywhere other than the DB unless you have a very good reason to do so in your situation.

    It's easy to make a mistake and hold a mutex while doing something slow like a network operation, and so cause serious usability issues (e.g. the app not responding for many seconds). It's also easy to end up with nasty concurrency bugs such as thread deadlocks.

    So my recommendation is to keep your app stateless and just write to the DB every time. Should you find performance issues due to DB access, turning to a cache solution like EhCache would be the best bet.

    Unless you want to learn from the project or have to deliver an app with extreme performance requirements, I don't think writing your own cache layer is justified.