clojure · architecture · jvm · high-load

JVM: Using global atom as an application cache storage in Clojure - Is it appropriate?


I have a high-load app that many users hit with various GET params. Think of answering different polls: save a vote, then show the latest poll results.

To mitigate the back-pressure issue, I was thinking about creating a top-level atom to store the latest results for all polls.

So the workflow is like this:

  1. boot the app => the app pulls in the latest poll results and populates the atom.

  2. a new request comes in => increment the vote counter in that atom for the specific poll, and put the vote payload onto a core.async channel whose listener (running in a separate thread) eventually persists it to the database.

The goal I'm trying to achieve:

each new request gets the latest poll results with a near-instant response time (avoiding a network call to persistent storage)
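
A minimal sketch of the workflow above, using core.async; the names (poll-results, vote-chan, load-results-from-db!, persist-vote!) are hypothetical placeholders, not from the original question:

    (ns poll.cache
      (:require [clojure.core.async :as async]))
    
    ;; top-level atom holding {poll-id {choice-id vote-count}}
    (def poll-results (atom {}))
    
    ;; buffered channel of vote payloads waiting to be persisted
    (def vote-chan (async/chan 10000))
    
    (defn load-results-from-db! []
      {})   ; placeholder: pull the latest counts from storage on boot
    
    (defn persist-vote! [vote]
      nil)  ; placeholder: write one vote to the database
    
    (defn boot! []
      (reset! poll-results (load-results-from-db!))
      ;; single consumer thread draining the queue into the database
      (async/thread
        (loop []
          (when-some [vote (async/<!! vote-chan)]
            (persist-vote! vote)
            (recur)))))
    
    (defn handle-vote! [poll-id choice-id]
      (swap! poll-results update-in [poll-id choice-id] (fnil inc 0))
      (async/put! vote-chan {:poll-id poll-id :choice-id choice-id})
      ;; answer from memory, no round trip to persistent storage
      (get @poll-results poll-id))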

An obvious drawback of this approach is that a redeploy will cause some temporary data loss. That is not critical here; deploys can be postponed.

The reason I'm interested in this tricky approach, rather than just using RabbitMQ/Kafka, is that it sounds like a really cool and simple architecture with very few "moving parts" (just the JVM + a database) to get the job done.


Solution

  • More data is always good. Let's time incrementing a counter in an atom:

    (ns tst.demo.core
      (:use demo.core tupelo.core tupelo.test)
      (:require
        [criterium.core :as crit]))
    
    (def cum (atom 0))   ; top-level atom holding the counter
    
    (defn incr []
      (swap! cum inc))   ; atomic, lock-free increment
    
    (defn timer []
      (spy :time
        (crit/quick-bench
          (dotimes [ii 1000]   ; time 1000 calls to incr
            (incr)))))
    
    (dotest
      (timer))
    

    with the result:

    -------------------------------
       Clojure 1.10.1    Java 14
    -------------------------------
    
    Testing tst.demo.core
    Evaluation count : 1629096 in 6 samples of 271516 calls.
                 Execution time mean : 328.476758 ns
        Execution time std-deviation : 37.482750 ns
       Execution time lower quantile : 306.738888 ns ( 2.5%)
       Execution time upper quantile : 393.249204 ns (97.5%)
                       Overhead used : 1.534492 ns
    

    So 1000 calls to incr take only about 330 ns. How long does it take to ping google.com?

    PING google.com (172.217.4.174) 56(84) bytes of data.
    64 bytes from lax28s01-in-f14.1e100.net (172.217.4.174): icmp_seq=1 ttl=54 time=14.6 ms
    64 bytes from lax28s01-in-f14.1e100.net (172.217.4.174): icmp_seq=2 ttl=54 time=14.9 ms
    64 bytes from lax28s01-in-f14.1e100.net (172.217.4.174): icmp_seq=3 ttl=54 time=15.0 ms
    64 bytes from lax28s01-in-f14.1e100.net (172.217.4.174): icmp_seq=4 ttl=54 time=17.8 ms
    64 bytes from lax28s01-in-f14.1e100.net (172.217.4.174): icmp_seq=5 ttl=54 time=16.9 ms
    

    Let's call it 15 ms. So the ratio is:

    ratio = 15e-3 / 330e-9  =>  45000x
    

    Your atom operations are dwarfed by the network I/O time, so there is no problem storing the application state in an atom, even for a large number of users.
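
    To sanity-check the "large number of users" part, you can hammer a single atom from several threads and look at the wall-clock cost per swap!; the function below (contended-increments, and the 16/100000 figures in the comment) is an illustrative sketch, not something measured in this answer:

    (defn contended-increments
      "Runs n-threads futures, each doing n-iters swap! increments on one
       shared atom, and returns the average wall-clock nanoseconds per swap!."
      [n-threads n-iters]
      (let [counter (atom 0)
            start   (System/nanoTime)
            workers (doall (repeatedly n-threads
                             #(future (dotimes [_ n-iters]
                                        (swap! counter inc)))))]
        (run! deref workers)   ; wait for every worker thread to finish
        (let [elapsed (- (System/nanoTime) start)]
          {:total       @counter
           :ns-per-swap (double (/ elapsed (* n-threads n-iters)))})))
    
    ;; e.g. (contended-increments 16 100000)
    ;; => {:total 1600000, :ns-per-swap ...}   ; still orders of magnitude below 15 ms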

    You may also be interested to know that the authors of the Datomic database have stated that concurrency within the database is managed by a single atom.