
Concurrent Authentication Token Renewal in Haskell


I am developing a web service in Haskell. It makes requests to other web services, some of which require authentication. Currently, I have something like this:

renewToken :: Credentials -> IORef Token -> IO Token
renewToken credentials tokenRef = do
    token <- readIORef tokenRef
    time <- getCurrentTime
    if time < expiration token then pure token else do
        newToken <- request "http://example.com/login" credentials
        writeIORef tokenRef newToken
        pure newToken

The renewToken function can then be used before requests which require authentication:

do
    token <- renewToken credentials tokenRef
    request "http://example.com/data" token

This setup does not handle concurrency very well. Concurrent uses of renewToken may renew the token multiple times and overwrite each other's results.

How can I implement the token renewal process in a way that is safe and efficient in a concurrent setting?

I have heard that STM is a great way of doing concurrency in Haskell, so I wanted to use that. However, I do not understand how to use it here, since the token renewal process performs IO. As far as I understood, there is no sensible way to do IO in an STM transaction, since it may retry. I have often heard STM proposed as a higher-level alternative to MVars, but I do not understand how this would work if there is IO.

I could use withMVar to ensure that only one thread can enter the token renewal process at a time. However, this would needlessly serialize all the read-only accesses to the token when it has not expired yet. I am afraid that this might lead to increased latency when there are many concurrent requests. Moreover, MVar seems like a very low-level primitive that does not have the composability of something like STM. Am I really supposed to use MVar directly in a situation like this?

Finally, I considered the solution of having a separate thread that is responsible for token renewal. However, this again serializes all accesses to the token. On top of that, it creates significant complexity and potential overhead via communication between threads.


Solution

  • this would needlessly serialize all the read-only accesses to the token

    You would serialize checking whether the token is expired or not, but you don't need to serialize the network requests that are made using the token. The cost of that check is tiny compared to the network request itself.

    We can do this in STM without putting any IO inside the transactions. The idea is a small state machine with a "pending" state, so that a thread can see when another thread is already renewing the token. The state lives in a TVar holding something like this:

    data TokenState = None | Pending | Got Token
    

    (If you always get a token at startup maybe you can do without the None state.)

    Our thread checks if there is a token we can use. If there isn't, we set the state to Pending. If the state was already set to Pending by some other thread, we block: retry suspends the transaction until the TVar changes, then runs it again.

    do mt <- atomically
               (do tokenState <- readTVar var
                   case tokenState of
                     None -> do writeTVar var Pending
                                pure Nothing
                     Got t | expired t -> do writeTVar var Pending
                                             pure Nothing
                           | otherwise -> pure (Just t)
                     Pending -> retry)
    

    If mt is Nothing, it means we decided we need a new token and made ourselves responsible for fetching it. Once we have fetched it, we set the state to Got, which also wakes up any threads blocked in retry.

       t <- case mt of
              Nothing -> do newToken <- ... fetch new token ...
                            atomically (writeTVar var (Got newToken))
                            pure newToken
              Just existingToken -> pure existingToken
    
       ... use t ...
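
    Putting the two fragments above together, a minimal self-contained sketch might look like the following. The Token record, the fetch parameter and fakeLogin are placeholders that do not appear in the original code, and expired is replaced by a comparison against a timestamp read just before the transaction, as in the question. The sketch also resets the state to None if the fetch throws, because otherwise the state would stay Pending and every waiting thread would block forever; that handling is an addition, not part of the fragments above.

```haskell
import Control.Concurrent.STM
import Control.Exception (mask, onException)
import Data.Time.Clock (UTCTime, addUTCTime, getCurrentTime)

-- Hypothetical token type; the question only implies an `expiration` field.
data Token = Token { tokenValue :: String, expiration :: UTCTime }
  deriving (Eq, Show)

data TokenState = None | Pending | Got Token

-- Hypothetical stand-in for the login request: a token valid for one hour.
fakeLogin :: IO Token
fakeLogin = do
  now <- getCurrentTime
  pure (Token "secret" (addUTCTime 3600 now))

-- Return a usable token, running `fetch` only if this thread claimed the renewal.
getToken :: TVar TokenState -> IO Token -> IO Token
getToken var fetch = mask $ \restore -> do
  -- The clock is read outside the transaction; STM itself performs no IO.
  now <- getCurrentTime
  mt <- atomically $ do
    tokenState <- readTVar var
    case tokenState of
      None -> claim
      Got t
        | now >= expiration t -> claim
        | otherwise -> pure (Just t)
      Pending -> retry -- another thread is renewing; wait until the TVar changes
  case mt of
    Just t -> pure t
    Nothing -> do
      -- If the fetch fails, give up the claim so that waiting threads can try.
      newToken <- restore fetch `onException` atomically (writeTVar var None)
      atomically (writeTVar var (Got newToken))
      pure newToken
  where
    claim :: STM (Maybe Token)
    claim = writeTVar var Pending >> pure Nothing
```

    With var created once by newTVarIO None and shared between threads, each request would call getToken var fakeLogin (or the real login action) before using the token. Only the expiry check is serialized; the requests made with the token run concurrently.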
    

    (Another way to structure such code is to have a transaction return an IO action which you immediately execute, outside of that transaction. For example, here, instead of returning a Maybe from the first atomically block, we would return either an action for renewing the token or an action for doing nothing and using the existing token. I don't know if it would make things clearer in this case, but it's worth considering sometimes.)
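
    As a rough illustration of that alternative, the transaction below has type STM (IO Token) and join runs the returned action once the transaction has committed. The Token record, the fetch parameter and fakeLogin are hypothetical placeholders, and masking of asynchronous exceptions is left out for brevity, so treat this as a sketch rather than a finished implementation.

```haskell
import Control.Concurrent.STM
import Control.Exception (onException)
import Control.Monad (join)
import Data.Time.Clock (UTCTime, addUTCTime, getCurrentTime)

-- Hypothetical token type; the question only implies an `expiration` field.
data Token = Token { tokenValue :: String, expiration :: UTCTime }
  deriving (Eq, Show)

data TokenState = None | Pending | Got Token

-- Hypothetical stand-in for the login request: a token valid for one hour.
fakeLogin :: IO Token
fakeLogin = do
  now <- getCurrentTime
  pure (Token "secret" (addUTCTime 3600 now))

getToken :: TVar TokenState -> IO Token -> IO Token
getToken var fetch = do
  now <- getCurrentTime
  -- The transaction decides what to do and returns that decision as an
  -- IO action; `join` executes the action after the transaction commits.
  join . atomically $ do
    tokenState <- readTVar var
    case tokenState of
      Got t | now < expiration t -> pure (pure t) -- nothing to do, reuse t
      Pending -> retry                            -- someone else is renewing
      _ -> do                                     -- None, or an expired token
        writeTVar var Pending
        pure renew
  where
    renew :: IO Token
    renew = do
      -- Release the claim on failure so other threads do not block forever.
      newToken <- fetch `onException` atomically (writeTVar var None)
      atomically (writeTVar var (Got newToken))
      pure newToken
```

    The behaviour is intended to be the same as with the Maybe version; the difference is only that the case analysis on the result disappears from the IO code.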

  • Finally, I considered the solution of having a separate thread that is responsible for token renewal.

    I think this would also work fine, especially if you are keeping a token available at all times. It may be a little more complicated if you only want the token to be maintained while threads are using it.
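
    For the "token available at all times" variant, a minimal sketch could look like this. The names startRenewer, currentToken and fakeLogin, the Token record and the margin parameter are all assumptions made for illustration. Readers only read a TVar, so they are not serialized behind the renewal thread. The sketch has no error handling: if fetch throws, the renewal thread dies and readers keep seeing the last token, so real code would need retries or supervision.

```haskell
import Control.Concurrent (ThreadId, forkIO, threadDelay)
import Control.Concurrent.STM
import Control.Monad (forever)
import Data.Time.Clock
  (NominalDiffTime, UTCTime, addUTCTime, diffUTCTime, getCurrentTime)

-- Hypothetical token type; the question only implies an `expiration` field.
data Token = Token { tokenValue :: String, expiration :: UTCTime }
  deriving (Eq, Show)

-- Hypothetical stand-in for the login request: a token valid for one hour.
fakeLogin :: IO Token
fakeLogin = do
  now <- getCurrentTime
  pure (Token "secret" (addUTCTime 3600 now))

-- Fork a thread that fetches a token, publishes it, then sleeps until
-- `margin` seconds before it expires and repeats.
-- Note: if tokens live for less than `margin`, this loop fetches continuously.
startRenewer :: NominalDiffTime -> IO Token -> IO (TVar (Maybe Token), ThreadId)
startRenewer margin fetch = do
  var <- newTVarIO Nothing
  tid <- forkIO . forever $ do
    token <- fetch
    atomically (writeTVar var (Just token))
    now <- getCurrentTime
    let remaining = diffUTCTime (expiration token) now - margin
    -- threadDelay takes microseconds; never pass a negative delay.
    threadDelay (max 0 (truncate (remaining * 1000000)))
  pure (var, tid)

-- Readers never renew anything; they only block until the first token exists.
currentToken :: TVar (Maybe Token) -> IO Token
currentToken var = atomically (readTVar var >>= maybe retry pure)
```

    A service would call startRenewer once at startup, for example startRenewer 60 fakeLogin, and then call currentToken var before each authenticated request.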