erlangets

What is the best distributed Erlang in-memory cache?


I need some suggestion for the erlang in-memory cache system.

  1. The cache item is key-value based storage. key is usually an ASCII string; value is erlang's types include number / list / tuple / etc.
  2. The cache item can be set by any of the node.
  3. The cache item can be get by any of the node.
  4. The cache item is shared cross all nodes even on different servers
  5. dirty-read is permitted, I don't want any lock or transaction to reduce the performance.
  6. Totally distributed, no centralized machine or service.
  7. Good performance
  8. Easy install and deployment and configuration and maintenance

First choice seems to me is mnesia, but I have no experence on it. Does it meet my requirement? How the performance can I expect?

Another option is memcached -- But I am afraid the performance is lower than mnesia because extra serialization/deserialization are performed as memcached daemon is from another OS process.


Solution

  • Yes. Mnesia meets your requirements. However, like you said, a tool is good when the one using it understands it in depth. We have used mnesia on a distributed authentication system and we have not experienced any problem thus far. When mnesia is used as a cache it is better off than memcached, for one reason "Memcached cannot guarantee that what you write, you can read at any time, due to memory swap out issues and stuff" (follow here).

    However, this means that your distributed system is going to be built over Erlang. Indeed mnesia in your case beats most NoSQL cache solutions because their systems are Eventually consistent. Mnesia is consistent, as long as network availability can be ensured across the cluster. For a distributed cache system, you dont want a situation where you read different values for the same key from different nodes, hence mnesia's consistency comes in handy here.

    Something you should think about, is that, it is possible to have a centralised Memory cache for a distributed system. This works like this: You have RABBITMQ server running and accessible by AMQP clients on each Cluster node. Systems interact over the AMQP interface. Because, the cache is centralised, consistency is ensured by the process/system responsible for writing and reading from the cache. The other systems just place a request for a key, onto the AMQP message bus, and the system responsible for cache receives this message and replies it with the value.

    We have used the Message bus Architecture using RABBITMQ for a recent system which involved integration with banking systems, an ERP system and Public online service. What we built was responsible for fusing all these together and we are glad that we used RABBITMQ. The details are many but what we did is to come up with a message format, and a system identification mechanism. All systems must have a RABBITMQ client for writing and reading from the message bus. Then you would create a read Queue for each system, so that other system write their requests into that queue, whose name inside RABBITMQ, is the same as the system owning it. Then, later, you must encrypt the messages passing over the bus. In the end, you have systems bound together over large distance/across states, but with an efficient network, you wont believe how fast RABBITMQ binds these systems. Anyhow, RABBITMQ can also be clustered, and i should tell you that it is Mnesia which powers RABBITMQ (that tells you how good mnesia can be).

    Another thing is that, you should do some reading and write many programs until you are comfortable with it.