Tags: haskell, conduit, network-conduit

Haskell server does not reply to client


I tried building a simple client-server program following this tutorial about Haskell's network-conduit library.

This is the client, which concurrently sends a file to the server and receives the answer:

{-# LANGUAGE OverloadedStrings #-}

import Control.Concurrent.Async (concurrently)
import Data.Functor (void)

import Conduit
import Data.Conduit.Network

main = runTCPClient (clientSettings 4000 "localhost") $ \server ->
    void $ concurrently
        (runConduitRes $ sourceFile "input.txt" .| appSink server)
        (runConduit $ appSource server .| stdoutC)

And this is the server, which counts the occurrences of each word and sends the result back to the client:

{-# LANGUAGE OverloadedStrings #-}

import Data.ByteString.Char8 (pack)
import Data.Foldable (toList)
import Data.HashMap.Lazy (empty, insertWith)
import Data.Word8 (isAlphaNum)

import Conduit
import Data.Conduit.Network
import qualified Data.Conduit.Combinators as CC

main = runTCPServer (serverSettings 4000 "*") $ \appData -> do
    hashMap <- runConduit $ appSource appData 
        .| CC.splitOnUnboundedE (not . isAlphaNum)
        .| foldMC insertInHashMap empty
    runConduit $ yield (pack $ show $ toList hashMap)
        .| iterMC print
        .| appSink appData

insertInHashMap x v = do
    return (insertWith (+) v 1 x)

The problem is that the server doesn't reach the yield phase until I manually shut down the client, so it never answers the client. I noticed that if I remove the concurrency from the client and keep only the part that sends data to the server, everything works fine.

So, how can I preserve the receiving part of the client without breaking the flow?


Solution

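  • You have a deadlock: the client is waiting for the server to respond before it closes the connection, but the server is unaware that the client is done sending data and is waiting for more. The server's fold only finishes once appSource reaches end of input, and that only happens when the client closes its sending side of the connection. This is basically the problem described at https://cr.yp.to/tcpip/twofd.html: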

    When the generate-data program finishes, the same fd is still open in the consume-data program, so the kernel has no idea that it should send a FIN.

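    In your case, the fix needs to go on the client side. You need to call shutdown with ShutdownSend on the socket once conduit is done sending the contents of input.txt over it. That half-closes the connection: the server sees end of input, while the client can still read the reply.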

    Here's one way to do so (I'm not sure if there's a nicer one):

    {-# LANGUAGE OverloadedStrings #-}
    
    import Control.Concurrent.Async (concurrently)
    import Data.Functor (void)
    import Data.Foldable (traverse_)
    
    import Conduit
    import Data.Conduit.Network
    
    import Data.Streaming.Network (appRawSocket)
    import Network.Socket (shutdown, ShutdownCmd(..))
    
    main = runTCPClient (clientSettings 4000 "localhost") $ \server ->
        void $ concurrently
            ((runConduitRes $ sourceFile "input.txt" .| appSink server) >> doneWriting server)
            (runConduit $ appSource server .| stdoutC)
    
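    -- Half-close the connection: nothing more will be sent, but the socket
    -- stays open for reading. appRawSocket returns a Maybe Socket.
    doneWriting :: AppData -> IO ()
    doneWriting = traverse_ (`shutdown` ShutdownSend) . appRawSocket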
    

    Side note: you don't really need concurrency in the client in this case, since there will never be anything to read from the server until you're done writing to the server. You could just do the reading after the writing and shutdown.
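
    The side note above could look roughly like this. It is only a sketch, assuming the same port, host and input.txt as in the question, and the same imports as the concurrent version; the three steps simply run one after another in the same callback:

    ```haskell
    {-# LANGUAGE OverloadedStrings #-}

    import Data.Foldable (traverse_)

    import Conduit
    import Data.Conduit.Network

    import Data.Streaming.Network (appRawSocket)
    import Network.Socket (shutdown, ShutdownCmd(..))

    main :: IO ()
    main = runTCPClient (clientSettings 4000 "localhost") $ \server -> do
        -- 1. Send the whole file to the server.
        runConduitRes $ sourceFile "input.txt" .| appSink server
        -- 2. Half-close the socket so the server sees end of input.
        --    appRawSocket returns a Maybe Socket, hence traverse_.
        traverse_ (`shutdown` ShutdownSend) (appRawSocket server)
        -- 3. Read the reply until the server closes its side.
        runConduit $ appSource server .| stdoutC
    ```

    This only works because the server sends nothing before it has read all of its input. With a server that replies while it is still receiving, the concurrent version is the safer choice.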