Is the LZMA Algorithm Better than Bzip2?

There are a few tools that can compress with LZMA (such as the p7zip archiver), but I chose LZMA Utils (http://tukaani.org/lzma/) because its command line is compatible with gzip and bzip2, so replacing them with LZMA is simple. The command is called lzma and produces .lzma files by default.

Comparison

The first thing I used LZMA for was compressing my mail archive. The spam file I chose (mail in mbox format) is 528MB, and I used the maximum compression level. During compression the lzma process grew to 370MB of memory, which is a lot :) while bzip2 stayed below 7MB. Compressing the file took almost 15 minutes with lzma and less than 4 minutes with bzip2. The compression ratio was very similar: the output file is 373MB for bzip2 and 370MB for lzma. Decompression took 1m12s for lzma and 1m48s for bzip2.
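A comparison like this can be tried on a small scale with Python's standard-library `bz2` and `lzma` modules. This is only a minimal sketch, not the way the numbers above were measured: the helper name `compare` and the synthetic sample data are assumptions for illustration, and `lzma.FORMAT_ALONE` is chosen because it writes the legacy .lzma container. Sizes and timings will vary a lot with the input and the machine.

```python
import bz2
import lzma
import time


def compare(data: bytes, level: int = 9) -> dict:
    """Compress data with bzip2 and LZMA at the given level (1-9); report size and time."""
    codecs = {
        "bzip2": lambda d: bz2.compress(d, compresslevel=level),
        # FORMAT_ALONE is the legacy .lzma container rather than .xz.
        "lzma": lambda d: lzma.compress(d, format=lzma.FORMAT_ALONE, preset=level),
    }
    results = {}
    for name, compress in codecs.items():
        start = time.perf_counter()
        packed = compress(data)
        elapsed = time.perf_counter() - start
        results[name] = {"bytes": len(packed), "seconds": elapsed}
    return results


if __name__ == "__main__":
    # Synthetic, mail-like sample (hypothetical); substitute the bytes of a real file.
    sample = b"From: someone@example.com\nSubject: hello\n\nBody text.\n" * 20000
    # level=9 is the maximum setting, but LZMA then needs far more memory.
    for name, stats in compare(sample, level=6).items():
        print(f"{name}: {stats['bytes']} bytes in {stats['seconds']:.2f}s")
```

Highly repetitive sample data like this compresses far better than real mail, so only results from real files are meaningful.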

Not very impressive, but compressing text files is easy. Anyone who has tried to implement or invent a simple compression algorithm knows that good results are easy to get with text, so what about binary data? I created a tar archive of the /usr/bin directory on my laptop. It is 308MB. The bzip2 file is 127MB (59% smaller) and the LZMA file is 83MB (73% smaller). This is a real difference!
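The percentages quoted for the /usr/bin archive are space savings, that is, one minus compressed size over original size. A short check of the arithmetic (the function name `space_saving` is just an illustrative choice):

```python
def space_saving(original_mb: float, compressed_mb: float) -> float:
    """Return the percentage of space saved by compression."""
    return (1 - compressed_mb / original_mb) * 100


# Sizes in MB taken from the /usr/bin tar archive test.
for name, size in (("bzip2", 127), ("lzma", 83)):
    print(f"{name}: {space_saving(308, size):.0f}% saved")
# prints:
# bzip2: 59% saved
# lzma: 73% saved
```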

Integration with software

Since my mail archive is now lzma-compressed (for the faster access time), I needed to teach mutt to open such mailboxes. This was simple: just copy the gzip archive support in ~/.muttrc and adapt it, because the lzma command line is the same:

open-hook \\.lzma$ "lzma -cd '%f' > '%t'"
close-hook \\.lzma$ "lzma -c '%t' > '%f'"
append-hook \\.lzma$ "lzma -c '%t' >> '%f'"
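The open-hook boils down to "decompress into a temporary file, then read that as an ordinary mbox". Outside mutt, the same idea can be sketched with Python's standard `lzma` and `mailbox` modules. The function name `open_lzma_mbox` and the file name in the usage note are hypothetical, and this is an illustration of the idea rather than anything mutt itself does.

```python
import lzma
import mailbox
import shutil
import tempfile


def open_lzma_mbox(path: str) -> mailbox.mbox:
    """Decompress an LZMA-compressed mbox into a temporary file and open it."""
    # delete=False keeps the temporary file around after it is closed.
    tmp = tempfile.NamedTemporaryFile(suffix=".mbox", delete=False)
    # When reading, lzma.open auto-detects the legacy .lzma and the .xz container.
    with lzma.open(path, "rb") as packed, tmp:
        shutil.copyfileobj(packed, tmp)
    return mailbox.mbox(tmp.name, create=False)
```

For example, `len(open_lzma_mbox("spam.lzma"))` would give the number of messages in the archive. The temporary file is left on disk, so the caller should remove it when done.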

Recent versions of the tar archiver (1.20 and later) also have the --lzma switch.
