I've recently restarted gathering bitcoin market data. I'm grabbing samples of market depth every 3 seconds and collecting trade events.
Market depth samples can be quite large. Every mtgox sample appears to be about 90 kilobytes. At one sample every 3 seconds, 1 hour of samples is about 100 megs of data, a day is about 2.6 gigs, and a month is close to 80 gigs. Which is a bit too much.
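Here's a rough sketch of that arithmetic, in case you want to plug in your own numbers. The 90 kilobyte sample size and the 3 second interval come from above; treating a kilobyte as 1000 bytes and a month as 30 days are my simplifying assumptions.

```python
# Back-of-the-envelope volume estimate for uncompressed depth samples.
SAMPLE_BYTES = 90 * 1000      # ~90 kilobytes per sample (decimal KB is an assumption)
SAMPLE_INTERVAL_S = 3         # one sample every 3 seconds


def bytes_per(seconds: int) -> int:
    """Raw bytes collected over the given number of seconds."""
    return (seconds // SAMPLE_INTERVAL_S) * SAMPLE_BYTES


per_hour = bytes_per(60 * 60)
per_day = bytes_per(24 * 60 * 60)
per_month = bytes_per(30 * 24 * 60 * 60)   # 30-day month is an assumption

for label, size in [("hour", per_hour), ("day", per_day), ("month", per_month)]:
    print(f"per {label}: {size / 1e9:.2f} GB")
```

Under those assumptions an hour comes to a bit over 100 megs, a day to roughly 2.6 gigs, and a 30-day month to just under 80 gigs.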
gzip is able to compress that about 5x. But that's still a bit too large.
I've found xz to really shine on that kind of data. More than 1 gig of data gets squeezed down to less than a meg! And what's extra cool is that xz is very quick to decompress. For static data like a btc market archive that's very useful.
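If you want to see the effect for yourself, here's a minimal sketch using Python's standard `gzip` and `lzma` modules (the latter writes the xz format). The order book below is entirely made up: `synthetic_depth_samples`, the number of price levels and the number of changes per sample are hypothetical stand-ins, not my real mtgox data, so the exact ratios will differ. A likely reason for the gap is that consecutive samples are nearly identical but each one is larger than gzip's 32 KiB window, so gzip can't reach back to the previous sample, while xz's much larger dictionary can.

```python
import gzip
import json
import lzma
import random


def synthetic_depth_samples(n_samples, n_levels=3000, seed=0):
    """Yield fake order-book snapshots that change only slightly between samples."""
    rng = random.Random(seed)
    # Each level is [price, amount]; the values are invented for illustration.
    book = [[round(100 + i * 0.01, 2), round(rng.uniform(0.1, 50), 4)]
            for i in range(n_levels)]
    for _ in range(n_samples):
        # Touch only a handful of levels, roughly like a book between two samples.
        for _ in range(10):
            book[rng.randrange(n_levels)][1] = round(rng.uniform(0.1, 50), 4)
        yield json.dumps(book).encode()


raw = b"\n".join(synthetic_depth_samples(40))
gz = gzip.compress(raw)
xz = lzma.compress(raw, preset=6)   # preset 6 is the xz default level

# The archive must round-trip exactly.
assert lzma.decompress(xz) == raw

print(f"raw:  {len(raw):>9} bytes")
print(f"gzip: {len(gz):>9} bytes ({len(raw) / len(gz):.1f}x)")
print(f"xz:   {len(xz):>9} bytes ({len(raw) / len(xz):.1f}x)")
```

On real samples the numbers depend on how much the book actually moves between snapshots, so treat this as a way to experiment rather than a benchmark.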
So quality compression does matter. And I just wanted to express my utmost respect for the authors of that extremely useful software.
Have a nice day and happy hacking!