
Downloading large files from the Internet in Haskell


Are there any suggestions about how to download large files in Haskell? I figure Http.Conduit is a good library for this. However, how does it solve this? There is an example in its documentation, but it does not seem suited to downloading large files; it just downloads a file:

 import Data.Conduit.Binary (sinkFile)
 import Network.HTTP.Conduit
 import qualified Data.Conduit as C

 main :: IO ()
 main = do
      request <- parseUrl "http://google.com/"
      withManager $ \manager -> do
          response <- http request manager
          responseBody response C.$$+- sinkFile "google.html"

What I want is to be able to download large files without running out of RAM, i.e. to do it efficiently in terms of memory and performance. Preferably, I would also be able to continue a download "later", meaning "some part now, another part later".

I also found the download-curl package on Hackage, but I'm not sure it is a good fit, or even that it downloads files chunk by chunk as I need.


Solution

  • Network.HTTP.Conduit provides three functions for performing a request: simpleHttp, httpLbs, and http.

    Of these three, the first two keep the entire response body in memory. If you want to operate in constant memory, use the http function, which gives you access to a streaming interface through a ResumableSource.

    The example in your question already uses the http function: it interleaves reading the response body from the network with writing it to the file, so it runs in constant memory. You will not run out of memory when downloading a large file.
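
For the "some part now, another part later" part of the question, one possible approach is to ask the server for only the missing bytes with an HTTP Range header and append them to the partial file. The following is only a rough sketch under several assumptions: it uses the Network.HTTP.Simple module found in newer http-conduit versions (not the older API shown in the question), it assumes the server honours Range requests, and it assumes the remote file has not changed between attempts. The names rangeFrom and resumeDownload, the URL, and the file name are made up for illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import           Control.Monad.IO.Class       (liftIO)
import           Control.Monad.Trans.Resource (runResourceT)
import           Data.ByteString              (ByteString)
import qualified Data.ByteString.Char8        as BS8
import           Data.Conduit.Binary          (sinkFile, sinkIOHandle)
import           Network.HTTP.Simple          (addRequestHeader,
                                               getResponseStatusCode,
                                               httpSink, parseRequest)
import           System.Directory             (doesFileExist, getFileSize)
import           System.IO                    (IOMode (AppendMode),
                                               openBinaryFile)

-- Hypothetical helper: value of a Range header asking for everything
-- from the given byte offset to the end of the resource.
rangeFrom :: Integer -> ByteString
rangeFrom offset = BS8.pack ("bytes=" ++ show offset ++ "-")

-- Hypothetical helper: download url into path, continuing from whatever
-- is already on disk. The body is streamed chunk by chunk into the file.
resumeDownload :: String -> FilePath -> IO ()
resumeDownload url path = do
  exists  <- doesFileExist path
  offset  <- if exists then getFileSize path else pure 0
  request <- addRequestHeader "Range" (rangeFrom offset) <$> parseRequest url
  runResourceT $ httpSink request $ \response ->
    case getResponseStatusCode response of
      -- 206 Partial Content: the server sent only the missing bytes.
      206  -> sinkIOHandle (openBinaryFile path AppendMode)
      -- 200 OK: the server ignored the Range header, so start over.
      200  -> sinkFile path
      -- Anything else (for example 416 when the file is already complete)
      -- is reported as an error instead of being written to disk.
      code -> liftIO (ioError (userError ("unexpected status " ++ show code)))

main :: IO ()
main = resumeDownload "http://example.com/large-file.bin" "large-file.bin"
```

Running the program again after an interrupted download would request only the bytes that are still missing. If resuming is not needed, passing a function that ignores the response and returns sinkFile path to httpSink should be enough to stream the whole body to disk in constant memory.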