haskell conduit jsonstream

httpSink and parsing JSON


I think this is a bit advanced for me, but my goal is to get the raw JSON from an HTTP API, parse the first list from it, do whatever I need with that list, then move on to the next one, and so on. My hope is that this keeps only one list in memory at a time (each list is pretty small, but there are a LOT of lists in the JSON). I tried it with Aeson, but it ate up all the RAM and ran for hours until I had to kill it.

If I understand correctly, httpSink should be the way to go, maybe with json-stream doing the actual parsing. I read the conduit tutorial, but I'm clearly not understanding it properly, since I can't make this work.

I know how to use parseByteString to decode a ByteString the way I need (at least my tests seem to work), but I can't figure out how to use parseByteString as a Sink for httpSink's second parameter. Am I missing something obvious, or am I mistaken about the way conduit works?

Thanks


Solution

  • I haven't tested this, since I'm honestly not that familiar with the library, but I think this adapter function will make it work with conduit:

    module Data.JsonStream.Parser.Conduit
      ( jsonConduit
      , JsonStreamException (..)
      ) where
    
    import Data.Conduit
    import Data.JsonStream.Parser
    import Data.ByteString (ByteString)
    import Control.Monad.Catch
    import Data.Typeable
    
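    -- | Run a json-stream 'Parser' incrementally over the incoming chunks,
    -- yielding each parsed value downstream as soon as it is complete.
    jsonConduit
      :: MonadThrow m
      => Parser a
      -> ConduitM ByteString a m ()
    jsonConduit =
        go . runParser
      where
        -- A value is ready: emit it, then continue with the rest of the parse.
        go (ParseYield x p) = yield x >> go p
        -- The parser wants more input: feed it the next chunk,
        -- or fail if the upstream has ended.
        go (ParseNeedData f) = await >>= maybe
          (throwM JsonStreamNotEnoughData)
          (go . f)
        -- The parser rejected the input: surface its error message.
        go (ParseFailed str) = throwM $ JsonStreamException str
        -- Parsing finished: hand any unconsumed bytes back to the stream.
        go (ParseDone bs) = leftover bs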
    
    data JsonStreamException
      = JsonStreamException !String
      | JsonStreamNotEnoughData
      deriving (Show, Typeable)
    instance Exception JsonStreamException
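
  To connect this to the question, here is an equally untested sketch of how the adapter might be plugged into httpSink. It assumes the response body is a top-level JSON array whose elements are small lists of integers. The URL and the names listParser and processLists are hypothetical placeholders, and printing each list stands in for whatever per-list work you need.

  ```haskell
  module Main (main) where

  import Data.ByteString (ByteString)
  import Data.Conduit (ConduitM, (.|))
  import qualified Data.Conduit.List as CL
  import Data.JsonStream.Parser (Parser, arrayOf, value)
  import Data.JsonStream.Parser.Conduit (jsonConduit) -- the adapter module above
  import Data.Void (Void)
  import Network.HTTP.Simple (httpSink, parseRequest)

  -- Assumed JSON shape: [[1,2],[3,4],...].
  -- 'arrayOf value' yields one inner list per element of the outer array.
  listParser :: Parser [Int]
  listParser = arrayOf value

  -- Parse the body chunk by chunk and handle each list as it arrives.
  -- 'print' is a placeholder for the real processing step.
  processLists :: ConduitM ByteString Void IO ()
  processLists = jsonConduit listParser .| CL.mapM_ print

  main :: IO ()
  main = do
    -- Hypothetical endpoint; replace with the real API URL.
    req <- parseRequest "https://example.com/lists.json"
    -- The response headers are ignored here; only the body is streamed.
    httpSink req (\_response -> processLists)
  ```

  Since each list is passed downstream as soon as it has been parsed, this should avoid building the whole document in memory, provided the processing step does not hold on to earlier lists.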