I'm currently working on a 3rd year project involving data from Twitter. The department have provided me with .lzo's of a months worth of Twitter. The smallest is 4.9gb and when decompressed is 29gb so I'm trying to open the file and read as I'm going. Is this possible or do I need to decompress and work with the data that way?
EDIT: Have attempted to read it line by line and decompress the read line
UPDATE: Found a solution - reading the STDOUT of lzop -dc works like a charm
How about starting an lzop
binary in a subprocess with -c
switch and then read its STDOUT line by line?