I am reading a CSV file with the pipes-csv library. I want to read the first line now and the rest later. Unfortunately, after Pipes.Prelude.head returns, the pipe seems to be closed somehow. Is there a way to read the head of the CSV first and the rest later?
import qualified Data.Vector as V
import Pipes
import qualified Pipes.Prelude as P
import qualified System.IO as IO
import qualified Pipes.ByteString as PB
import qualified Data.Text as Text
import qualified Pipes.Csv as PCsv
import Control.Monad (forever)
showPipe :: Proxy () (Either String (V.Vector Text.Text)) () String IO b
showPipe = forever $ do
    -- The pattern type signature below needs the ScopedTypeVariables extension.
    x::(Either String (V.Vector Text.Text)) <- await
    yield $ show x
main :: IO ()
main = do
    IO.withFile "./test.csv"
                IO.ReadMode
                (\handle -> do
                    let producer = (PCsv.decode PCsv.NoHeader (PB.fromHandle handle))
                    headers <- P.head producer
                    putStrLn "Header"
                    putStrLn $ show headers
                    putStrLn $ "Rows"
                    runEffect ( producer>->
                                (showPipe) >->
                                P.stdoutLn)
                )
If we do not read the header first, we can read the whole CSV without any problem:
main :: IO ()
main = do
    IO.withFile "./test.csv"
                IO.ReadMode
                (\handle -> do
                    let producer = (PCsv.decode PCsv.NoHeader (PB.fromHandle handle))
                    putStrLn $ "Rows"
                    runEffect ( producer>->
                                (showPipe) >->
                                P.stdoutLn)
                )
Pipes.Csv has material for handling headers, but I think that this question is really looking for a more sophisticated use of Pipes.await or else Pipes.next. First, next:
>>> :t Pipes.next
Pipes.next :: Monad m => Producer a m r -> m (Either r (a, Producer a m r))
next is the basic way of inspecting a producer. It is sort of like pattern matching on a list. With a list the two possibilities are [] and x:xs; here they are Left () and Right (headers, rows). The latter pair is what you are looking for. Of course an action (here in IO) is needed to get one's hands on it:
main :: IO ()
main = do
    handle <- IO.openFile "./test.csv" IO.ReadMode
    let producer :: Producer (V.Vector Text.Text) IO ()
        producer = PCsv.decode PCsv.NoHeader (PB.fromHandle handle) >-> P.concat
    e <- next producer
    case e of
        Left () -> putStrLn "No lines!"
        Right (headers, rows) -> do
            putStrLn "Header"
            print headers
            putStrLn $ "Rows"
            runEffect ( rows >-> P.print)
    IO.hClose handle
Since the Either values are a distraction here, I eliminate the Left values (the lines that don't parse) with P.concat.
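Both ideas can be tried without a CSV file. The following is only a minimal sketch: sampleRows, goodRows and headAndRest are hypothetical names made up for illustration, and plain lists of strings stand in for decoded records.

```haskell
import Pipes
import qualified Pipes.Prelude as P

-- Hypothetical stand-in for decoded CSV rows: Left marks a row that failed to parse.
sampleRows :: [Either String [String]]
sampleRows = [Right ["name", "grade"], Left "parse error", Right ["alice", "90"]]

-- P.concat forwards the contents of each Foldable value; for Either this
-- means Right payloads pass through and Left values are dropped.
goodRows :: Monad m => Producer [String] m ()
goodRows = each sampleRows >-> P.concat

-- Split a producer into its first element and a list of the rest,
-- much like matching a list against [] and (x:xs).
headAndRest :: Monad m => Producer a m () -> m (Maybe (a, [a]))
headAndRest p = do
    e <- next p
    case e of
        Left ()         -> return Nothing          -- producer was empty
        Right (x, rest) -> do
            xs <- P.toListM rest                   -- drain the remaining producer
            return (Just (x, xs))

main :: IO ()
main = do
    headAndRest goodRows >>= print
    -- prints: Just (["name","grade"],[["alice","90"]])
    headAndRest (each ([] :: [[String]])) >>= print
    -- prints: Nothing
```

Collecting the rest into a list is only for display; in a real program the remaining producer would normally be streamed onward, as in the program above.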
next does not act inside a pipeline, but directly on the Producer, which it treats as a sort of "effectful list" with a final return value at the end. The particular effect we got above can of course be achieved with await, which acts inside a pipeline. I can use it to intercept the first item that comes along in a pipeline, do some IO based on it, and then forward the remaining elements:
main :: IO ()
main = do
    handle <- IO.openFile "./test.csv" IO.ReadMode
    let producer :: Producer (V.Vector Text.Text) IO ()
        producer = PCsv.decode PCsv.NoHeader (PB.fromHandle handle) >-> P.concat
        handleHeader :: Pipe (V.Vector Text.Text) (V.Vector Text.Text) IO ()
        handleHeader = do
            headers <- await  -- intercept first value
            liftIO $ do       -- use it for IO
                putStrLn "Header"
                print headers
                putStrLn $ "Rows"
            cat               -- pass along all later values
    runEffect (producer >-> handleHeader >-> P.print)
    IO.hClose handle
The difference is just that if producer is empty, the await never receives a value and the pipeline simply ends, so I won't be able to report this as I do with "No lines!" in the previous program.
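The await-then-forward pattern, and its silence on empty input, can be seen in a pure setting. This is a small sketch under assumed names: labelFirst is a hypothetical pipe, and strings stand in for CSV records.

```haskell
import Pipes
import qualified Pipes.Prelude as P

-- Treat the first value specially, then handle every later value uniformly.
labelFirst :: Monad m => Pipe String String m r
labelFirst = do
    header <- await                  -- intercept the first value
    yield ("header: " ++ header)
    P.map ("row: " ++)               -- transform all later values

main :: IO ()
main = do
    print (P.toList (each ["id", "1", "2"] >-> labelFirst))
    -- prints: ["header: id","row: 1","row: 2"]
    print (P.toList (each [] >-> labelFirst))
    -- prints: []
```

With an empty upstream the pipe never gets past await, so nothing is emitted and there is no hook for an "empty input" message; that is the case next handles explicitly.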
Note, by the way, that showPipe can be defined as P.map show, or simply as P.show (but with the specialized type you add).
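As a quick check of that equivalence, here is a minimal sketch in which the sample values are made up and Int stands in for a record type:

```haskell
import Pipes
import qualified Pipes.Prelude as P

-- Hypothetical sample input: one failed row and one parsed value.
rows :: [Either String Int]
rows = [Left "bad row", Right 1]

viaMap :: [String]
viaMap = P.toList (each rows >-> P.map show)

viaShow :: [String]
viaShow = P.toList (each rows >-> P.show)

main :: IO ()
main = do
    mapM_ putStrLn viaMap
    print (viaMap == viaShow)   -- both pipes render each element with show
```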