14-Jan-2021 | Like this? Dislike this? Let me know |
I have been using gzip for ... well, forever. It's practically muscle memory at this point. Many utils (and emacs) recognize the magic bytes at the start of the file and will automatically decompress before doing what they need to do, a classic example being tar tvf myfile.tar.gz.
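As a rough illustration of what "recognizing the magic bytes" means, here is a minimal sketch that looks at the first three bytes of a file. The function name detect_compression is made up for this example, and it only knows the three formats discussed here; it assumes head -c and od are available, as they are on typical Linux images.

```shell
# Hypothetical helper: guess the compression format from the leading bytes.
detect_compression() {
    # Dump the first 3 bytes as hex and strip spaces/newlines, e.g. "1f8b08".
    magic=$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')
    case "$magic" in
        1f8b*)  echo gzip ;;     # gzip starts with 0x1f 0x8b
        425a68) echo bzip2 ;;    # bzip2 starts with the ASCII text "BZh"
        504b03) echo zip ;;      # zip local file header starts with "PK" 0x03
        *)      echo unknown ;;
    esac
}
```

For example, detect_compression myfile.tar.gz would print gzip. The standard file utility does the same job far more thoroughly; this is only meant to show the idea.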
Some recent experimenting showed that the popular compressors differ in both speed and output size. This is not a surprise, but in some cases the difference is significant enough to warrant choosing one over the other. In particular, pbzip2 is not part of the standard install on some images, yet in the tests below it produced the same output size as regular bzip2 in well under half the time, and much smaller files than gzip or zip.
[Event "Rated Classical game"]
[Site "https://lichess.org/j1dkb5dw"]
[White "BFG9k"]
[Black "mamalak"]
[Result "1-0"]
[UTCDate "2012.12.31"]
[UTCTime "23:01:03"]
[WhiteElo "1639"]
[BlackElo "1403"]
[WhiteRatingDiff "+5"]
[BlackRatingDiff "-8"]
[ECO "C00"]
[Opening "French Defense: Normal Variation"]
[TimeControl "600+8"]
[Termination "Normal"]

1. e4 e6 2. d4 b6 3. a3 Bb7 4. Nc3 Nh6 5. Bxh6 gxh6 6. Be2 Qg5 7. Bg4 h5 8. Nf3 Qg6 9. Nh4 Qg5 10. Bxh5 Qxh4 11. Qf3 Kd8 12. Qxf7 Nc6 13. Qe8# 1-0

[Event "Rated Classical game"]
[Site "https://lichess.org/a9tcp02g"]
[White "Desmond_Wilson"]
[Black "savinka59"]
[Result "1-0"]
[UTCDate "2012.12.31"]
[UTCTime "23:04:12"]
[WhiteElo "1654"]
[BlackElo "1919"]
[WhiteRatingDiff "+19"]
[BlackRatingDiff "-22"]
[ECO "D04"]
[Opening "Queen's Pawn Game: Colle System, Anti-Colle"]
[TimeControl "480+2"]
[Termination "Normal"]

1. d4 d5 2. Nf3 Nf6 3. e3 Bf5 4. Nh4 Bg6 5. Nxg6 hxg6 6. Nd2 e6 7. Bd3 Bd6 8. e4 dxe4 9. Nxe4 Rxh2 10. Ke2 Rxh1 11. Qxh1 Nc6 12. Bg5 Ke7 13. Qh7 Nxd4+ 14. Kd2 Qe8 15. Qxg7 Qh8 16. Bxf6+ Kd7 17. Qxh8 Rxh8 18. Bxh8 1-0
Original file size: 5,766,951,106 bytes (5.77 GB). Times are wall-clock seconds, measured from the shell using the shell builtin 'time'.
Util | Mode | Compress Time | Compress Size | Compress Ratio | Decompress Time |
---|---|---|---|---|---|
pbzip2 | -9 (best) | 221 sec | 0.998 GB | 5.8X | 67 sec |
pbzip2 | -1 (fast) | 152 sec | 1.225 GB | 4.7X | 53 sec |
gzip | -1 (fast) | 92 sec | 2.062 GB | 2.8X | 43 sec |
zip | -1 (fast) | 81 sec | 2.062 GB | 2.8X | 48 sec |
bzip2 | -1 (fast) | 490 sec | 1.225 GB | 4.7X | 157 sec |
bzip2 | -9 (best) | 588 sec | 0.998 GB | 5.8X | 210 sec |
gzip | -9 (best) | 551 sec | 1.631 GB | 3.5X | 37 sec |
zip | -9 (best) | 542 sec | 1.631 GB | 3.5X | 42 sec |
Special test with m5.16xlarge (64 vCPU, 256 GB RAM) | | | | | |
pbzip2 | -9 (best) | 16 sec | 0.998 GB | 5.8X | 8 sec |
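A measurement like the ones in the table could be sketched roughly as follows. This is not the exact procedure used for the numbers above (those came from the shell builtin 'time'); the bench function is a hypothetical helper that takes timestamps with date +%s so the results can be printed on one line. It assumes the tool accepts -c (write to stdout) and -d (decompress), which holds for gzip, bzip2, and pbzip2 but not for zip.

```shell
# Hypothetical helper: bench TOOL LEVEL FILE
# Prints compress time, compressed size, ratio, and decompress time.
bench() {
    tool=$1 level=$2 src=$3
    out=$(mktemp)

    start=$(date +%s)
    "$tool" "$level" -c "$src" > "$out"       # compress, keep the original
    mid=$(date +%s)
    "$tool" -d -c "$out" > /dev/null          # decompress, discard the output
    end=$(date +%s)

    # File sizes in bytes (tr strips padding that some wc versions add).
    orig=$(wc -c < "$src" | tr -d ' ')
    comp=$(wc -c < "$out" | tr -d ' ')
    # Compression ratio = original size / compressed size.
    ratio=$(awk -v o="$orig" -v c="$comp" 'BEGIN { printf "%.1f", o / c }')

    echo "$tool $level compress=$((mid - start))s size=$comp ratio=${ratio}X decompress=$((end - mid))s"
    rm -f "$out"
}
```

Calling bench pbzip2 -9 myfile and then bench gzip -9 myfile would give two comparable lines. Whole-second resolution is coarse, but it is adequate when runs take tens or hundreds of seconds as they do here.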
If pbzip2 is missing from an el7-based image, it can be installed from an RPM:
curl -o /tmp/pbzip2.rpm https://ftp.tu-chemnitz.de/pub/linux/dag/redhat/el7/en/x86_64/rpmforge/RPMS/pbzip2-1.0.5-1.el7.rf.x86_64.rpm && sudo rpm -Uvh /tmp/pbzip2.rpm
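Since pbzip2 is not always installed, a script that has to run on several images may want a fallback. A minimal sketch, with pick_bzip as a made-up helper name, relying on the fact that pbzip2 writes bzip2-compatible files:

```shell
# Hypothetical helper: prefer pbzip2 when it is on the PATH, else use bzip2.
pick_bzip() {
    if command -v pbzip2 >/dev/null 2>&1; then
        echo pbzip2
    else
        echo bzip2
    fi
}

# Example use (myfile is a placeholder name):
#   BZ=$(pick_bzip)
#   "$BZ" -9 -c myfile > myfile.bz2
```

Either way the result is a .bz2 file that regular bzip2 can decompress, so downstream consumers do not need to know which tool produced it.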