I'm creating data dumps from my site for others to download and analyze. Each dump will be a giant XML file.
I'm trying to figure out the best compression algorithm that:
I know the basics of compression, but haven't a clue as to which algo fits the bill. I'll be using MySQL and Python to generate the dump, so I'll need something with a good python library.
GZIP with standard compression level should be fine for most cases. Higher compression levels=more CPU time. BZ2 is packing better but is also slower. Well, there is always a trade-off between CPU consumption/running time and compression efficiency...all compressions with default compression levels should be fine.