Monday, December 9, 2013

Massive power of the (liblzma-based) XZ archiver

I've recently resumed gathering bitcoin market data. I grab a sample of market depth every 3 seconds and I also collect trade events.

Market depth samples can be quite large. Every mtgox sample appears to be about 90 kilobytes. So 1 hour of samples is about 100 megs of data, a day is about 2.5 gigs, and a month is well over 70 gigs. Which is a bit too much.
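Here is a quick back-of-the-envelope check of those volumes. It is only a sketch: it assumes exactly one 90-kilobyte sample every 3 seconds, decimal units (1 meg = 1000 kilobytes) and a 30-day month. The variable names are mine.

```python
# Rough storage estimate for uncompressed market depth samples.
# Assumptions: one sample every 3 seconds, ~90 KB per sample,
# decimal units, 30-day month.
SAMPLE_INTERVAL_S = 3
SAMPLE_SIZE_KB = 90

samples_per_hour = 3600 // SAMPLE_INTERVAL_S             # samples collected in one hour
mb_per_hour = samples_per_hour * SAMPLE_SIZE_KB / 1000   # kilobytes -> megabytes
gb_per_day = mb_per_hour * 24 / 1000                     # megabytes -> gigabytes
gb_per_month = gb_per_day * 30

print(f"samples per hour: {samples_per_hour}")
print(f"per hour:  {mb_per_hour:.0f} MB")
print(f"per day:   {gb_per_day:.2f} GB")
print(f"per month: {gb_per_month:.1f} GB")
```

Changing the two constants at the top is enough to redo the estimate for another exchange or sampling rate.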

gzip is able to compress that about 5x. But that's still a bit too large.

I've found xz to really shine on that kind of data. More than 1 gig of data gets squeezed down to less than a meg! And what's extra cool is that xz is very quick to decompress. For static data like a BTC market archive that's very useful.
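To get a feel for the difference without real market data, here is a small sketch using Python's standard gzip and lzma modules (the latter is a binding to liblzma). The order book is synthetic and entirely made up: 1000 hypothetical price levels, with a handful of levels changing between consecutive samples. One plausible reason xz does so well on this kind of data is that consecutive samples are nearly identical, and xz's default 8 MiB dictionary can reach back to the previous sample, while gzip's 32 KB window cannot once a single sample is larger than that. The exact ratios will differ from what I see on real mtgox data.

```python
import gzip
import json
import lzma
import random

random.seed(0)  # make the synthetic data reproducible

# Hypothetical order book: 1000 price levels (not real market data).
book = [
    {"price": round(800 + i * 0.01, 2), "amount": round(random.uniform(0.01, 50), 8)}
    for i in range(1000)
]

samples = []
for _sample in range(50):
    # Assumption: only a few levels change between consecutive samples.
    for _change in range(5):
        level = random.choice(book)
        level["amount"] = round(random.uniform(0.01, 50), 8)
    samples.append(json.dumps(book))

raw = "\n".join(samples).encode("utf-8")
gz = gzip.compress(raw)   # DEFLATE, 32 KB window
xz = lzma.compress(raw)   # .xz container, default preset 6

print(f"raw:  {len(raw)} bytes")
print(f"gzip: {len(gz)} bytes ({len(raw) / len(gz):.1f}x)")
print(f"xz:   {len(xz)} bytes ({len(raw) / len(xz):.1f}x)")

# Decompression must round-trip to the original bytes.
assert lzma.decompress(xz) == raw
```

For real archives the command line xz tool does the same job; the Python version is only here because it is easy to experiment with.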

So quality compression does matter. And I just wanted to express my ultimate respect to the authors of that extremely useful software.

Have a nice day and happy hacking!