haskell bytestring http-conduit

Constructing RequestBodyStream from Lazy ByteString when length is known


I am trying to adapt this AWS S3 upload code to handle a lazy ByteString whose length is already known, so that it does not have to be read into memory in its entirety (it arrives over the network, and the length is sent beforehand). It seems I have to define a GivesPopper function over the lazy ByteString to convert it to a RequestBodyStream. Because of the convoluted way GivesPopper is defined, I am not sure how to write it for a lazy ByteString, and I would appreciate pointers. Here is how it is written for reading from a file:

let file = "test"
-- streams large file content, without buffering more than 10k in memory
let streamer sink = withFile file ReadMode $ \h -> sink $ S.hGet h 10240

If I understand correctly, streamer in the code above has type GivesPopper (). Given a lazy ByteString with known length len, what would be a good way to write a GivesPopper function over it? Reading one chunk at a time is fine.


Solution

  • Is this what you're looking for?

    import qualified Data.ByteString as S
    import qualified Data.ByteString.Lazy as L
    import System.IO
    
    file = "test"
    -- original streamer for feeding a sink from a file
    streamer :: (IO S.ByteString -> IO r) -> IO r
    streamer sink = withFile file ReadMode $ \h -> sink $ S.hGet h 10240
    
    -- feed a lazy ByteString to sink    
    lstreamer :: L.ByteString -> (IO S.ByteString -> IO r) -> IO r
    lstreamer lbs sink = sink (return (L.toStrict lbs))
    

    lstreamer type checks, but it probably doesn't do what you want: it returns the same data every time the sink calls it, so the sink never sees the end of the input. By contrast, S.hGet h ... eventually returns the empty string, which is how the end of the stream is signalled.

    Here is a solution that uses an IORef to keep track of whether we should start returning the empty string:

    import Data.IORef
    
    mklstream :: L.ByteString -> (IO S.ByteString -> IO r) -> IO r
    mklstream lbs sink = do
      ref <- newIORef False
      let fetch :: IO S.ByteString
          fetch = do sent <- readIORef ref
                     writeIORef ref True
                     if sent
                       then return S.empty
                       else return (L.toStrict lbs)
      sink fetch
    

    Here fetch is the action that gets the next chunk. The first call returns the original lazy ByteString, converted to a strict one; every subsequent call returns the empty string. Note that L.toStrict still loads the whole input into memory at once.
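
    To make the contract concrete, here is a minimal sketch of how a sink might consume such an action: it keeps calling it until an empty chunk comes back. The three type synonyms are local stand-ins that mirror how http-client defines Popper, NeedsPopper and GivesPopper; oneShot (a copy of the IORef-based streamer) and drain are hypothetical names used only for illustration.

    ```haskell
    import qualified Data.ByteString as S
    import qualified Data.ByteString.Lazy as L
    import Data.IORef

    -- Local stand-ins mirroring http-client's synonyms (assumed shapes).
    type Popper = IO S.ByteString
    type NeedsPopper a = Popper -> IO a
    type GivesPopper a = NeedsPopper a -> IO a

    -- One-shot streamer: the whole input on the first call, empty afterwards.
    oneShot :: L.ByteString -> GivesPopper r
    oneShot lbs sink = do
      ref <- newIORef False
      let fetch = do
            sent <- readIORef ref
            writeIORef ref True
            return (if sent then S.empty else L.toStrict lbs)
      sink fetch

    -- Hypothetical sink: pops until it sees an empty chunk and
    -- returns the non-empty chunks in the order they arrived.
    drain :: NeedsPopper [S.ByteString]
    drain pop = go []
      where
        go acc = do
          chunk <- pop
          if S.null chunk
            then return (reverse acc)
            else go (chunk : acc)
    ```

    Running oneShot lbs drain should terminate after two calls to the action: one that yields the data and one that yields the empty string. With lstreamer in place of oneShot, the same sink would loop forever on non-empty input.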

    Update

    Here's how to hand out one chunk at a time, using the strict chunks the lazy ByteString is already made of:

    mklstream :: L.ByteString -> (IO S.ByteString -> IO r) -> IO r
    mklstream lbs sink = do
      ref <- newIORef (L.toChunks lbs)
      let fetch :: IO S.ByteString
          fetch = do chunks <- readIORef ref
                     case chunks of
                       [] -> return S.empty
                       (c:cs) -> do writeIORef ref cs
                                    return c
      sink fetch
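
    L.toChunks hands out whatever chunk sizes the lazy ByteString happens to contain. If an upper bound per call is wanted, similar to the 10240-byte reads in the file version, one possible variant slices the input with L.splitAt instead. This is only a sketch: mklstreamN and its chunkSize parameter are hypothetical names, and chunkSize is assumed to be positive (with zero or a negative value the first call would return the empty string and end the stream early).

    ```haskell
    import qualified Data.ByteString as S
    import qualified Data.ByteString.Lazy as L
    import Data.IORef
    import Data.Int (Int64)

    -- Hypothetical variant: returns at most chunkSize bytes per call.
    -- Assumes chunkSize > 0.
    mklstreamN :: Int64 -> L.ByteString -> (IO S.ByteString -> IO r) -> IO r
    mklstreamN chunkSize lbs sink = do
      ref <- newIORef lbs
      let fetch :: IO S.ByteString
          fetch = do
            rest <- readIORef ref
            -- Split off the next piece; once rest is empty, so is the piece,
            -- which tells the sink that the stream has ended.
            let (now, later) = L.splitAt chunkSize rest
            writeIORef ref later
            return (L.toStrict now)
      sink fetch
    ```

    Since the result has the same shape as GivesPopper (), and assuming http-client's RequestBodyStream takes an Int64 length followed by a GivesPopper (), the known length could then be supplied as RequestBodyStream len (mklstreamN 10240 lbs), with len converted via fromIntegral if it is not already an Int64.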