I have a number of very large files on a Linux machine that I would like to compress to save some space. I have tried the tar/gzip combination and noticed that the compression ratio is not very good: a 1.2GB file was compressed into a 1.1GB file. I tried increasing the compression level as suggested here: How to specify level of compression when using tar -zcvf? but it still wasn't any better. I copied the same file to a Windows machine and ran WinRAR on it. The resulting compressed file was only 0.45GB in size.
Is there a reason for such a huge discrepancy? Is there a better compression tool for Linux?
UPDATE: I've even tried lzma, and it still wasn't much better.
Answer
Gzip is not a very strong compression algorithm compared to RAR.
A more common choice on Linux these days is bzip2, which is installed by default on almost all Linux distributions.
You can switch the tar archiver to use bzip2 compression by changing your command line to tar -cvjf rather than tar -cvzf, the key being the replacement of the z with j in the options.
This should hopefully yield a good increase in compression ratio.
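As a quick way to see the difference, here is a sketch (file names are hypothetical) that compresses the same input with both gzip and bzip2 and compares the resulting archive sizes:

```shell
# Sketch with hypothetical file names: compress the same input with
# gzip (-z) and bzip2 (-j) and compare the resulting archive sizes.

# Create a sample compressible file (the same line repeated many times).
printf 'a repetitive log line that compresses well\n%.0s' $(seq 1 100000) > sample.log

tar -czf sample.tar.gz  sample.log   # gzip compression
tar -cjf sample.tar.bz2 sample.log   # bzip2 compression

ls -l sample.tar.gz sample.tar.bz2   # compare the sizes
```

On real data the gain varies, but bzip2 usually does noticeably better than gzip on text-heavy files.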
The reason for the discrepancy is that they are fundamentally different compression algorithms. Gzip is an older algorithm, and older algorithms tend to be less computationally intensive so that they could finish in a reasonable time on the hardware of their day. Now that more processing power is readily available, better and more computationally intensive algorithms can finish in a similar time to what an older algorithm took on an older computer. Conversely, the older algorithms complete compression much faster on a newer machine.
Almost any Windows archiver has an equivalent on Linux. 7zip is a nice archiver that gets good results on Windows and has an unofficial Linux version.
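For example, on distributions that ship the p7zip package (which provides an unofficial 7z command for Linux), creating a 7z archive looks like this; the file names are hypothetical, and the sketch skips the step if 7z is not installed:

```shell
# Sketch, assuming the p7zip package provides the `7z` command;
# file names here are hypothetical.
printf 'another repetitive line\n%.0s' $(seq 1 50000) > bigfile.dat

if command -v 7z >/dev/null 2>&1; then
    # -mx=9 selects the maximum compression level (LZMA by default)
    7z a -mx=9 archive.7z bigfile.dat
fi
```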