bzip2 is a free and open source lossless data compression algorithm and program developed by Julian Seward. Seward made the first public release of bzip2, version 0.15, in July 1996. The compressor's stability and popularity grew over the next several years, and Seward released version 1.0 in late 2000.


Compression efficiency

bzip2 compresses most files more effectively than the older LZW (.Z) and Deflate (.zip and .gz) compression algorithms, but is considerably slower. LZMA is generally more efficient than bzip2, while having much faster decompression.[2]
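The comparison above can be tried informally with Python's standard library, which ships bindings for all three families. This is only a rough sketch: `compressed_sizes` and the sample text are made up for illustration, `zlib` emits a raw Deflate stream in a zlib wrapper rather than a .gz or .zip file, and `lzma` emits the .xz container by default. Results depend heavily on the input, so no particular ranking should be assumed from one run.

```python
import bz2
import lzma
import zlib


def compressed_sizes(data: bytes) -> dict[str, int]:
    """Return the compressed size in bytes for each codec at its default settings."""
    return {
        "deflate (zlib)": len(zlib.compress(data)),
        "bzip2": len(bz2.compress(data)),
        "lzma (xz)": len(lzma.compress(data)),
    }


# Hypothetical sample input: highly repetitive text, chosen only for illustration.
sample = b"the quick brown fox jumps over the lazy dog\n" * 2000

for name, size in compressed_sizes(sample).items():
    print(f"{name:>15}: {size} bytes (from {len(sample)})")
```

Timing the same calls (for example with `time.perf_counter`) would give a similarly informal view of the speed trade-off.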

bzip2 compresses data in blocks of between 100 and 900 kB and uses the Burrows–Wheeler transform to convert frequently recurring character sequences into strings of identical letters. It then applies a move-to-front transform and Huffman coding. bzip2's ancestor, bzip, used arithmetic coding instead of Huffman coding; the change was made because of a software patent restriction.[3]
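To make the two transforms concrete, here is a minimal Python sketch. It is a naive textbook version, not bzip2's implementation: the function names are invented for illustration, the rotation sort is far too slow for real block sizes, the input is assumed to be non-empty, and the Huffman stage is left out.

```python
def bwt(data: bytes) -> tuple[bytes, int]:
    """Naive Burrows-Wheeler transform of a non-empty byte string.

    Sorts all rotations and returns the last column together with the
    row index of the original string, which is needed to invert it.
    """
    n = len(data)
    rotations = sorted(range(n), key=lambda i: data[i:] + data[:i])
    last_column = bytes(data[(i - 1) % n] for i in rotations)
    return last_column, rotations.index(0)


def inverse_bwt(last_column: bytes, index: int) -> bytes:
    """Invert bwt() by following the last-column to first-column mapping."""
    n = len(last_column)
    # A stable sort of positions by byte value pairs each first-column
    # entry with the same occurrence of that byte in the last column.
    order = sorted(range(n), key=lambda i: last_column[i])
    out = bytearray()
    pos = order[index]
    for _ in range(n):
        out.append(last_column[pos])
        pos = order[pos]
    return bytes(out)


def move_to_front(data: bytes) -> list[int]:
    """Replace each byte by its position in a recency-ordered alphabet."""
    alphabet = list(range(256))
    out = []
    for byte in data:
        idx = alphabet.index(byte)
        out.append(idx)
        alphabet.insert(0, alphabet.pop(idx))
    return out


transformed, row = bwt(b"banana")
print(transformed, row)            # b'nnbaaa' 3
print(move_to_front(transformed))  # [110, 0, 99, 99, 0, 0]
```

The transform groups identical letters into runs ("nn", "aaa"), and move-to-front turns every repeat within a run into a zero. A skewed distribution with many small values is the kind of input that Huffman coding compresses well.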

bzip2 is asymmetric: compression is slow, while decompression is relatively fast. Motivated by the large CPU time required for compression, a modified version called pbzip2 was created in 2003. It supports multi-threading, giving almost linear speed improvements on multi-CPU and multi-core computers.[4] As of May 2010, this functionality has not been incorporated into the main project.
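The general idea behind parallel compression can be sketched in Python: split the input into chunks, compress each chunk independently, and concatenate the resulting streams. This is not pbzip2's code. `parallel_bz2_compress`, its chunk size and its worker count are illustrative assumptions, and the sketch assumes the interpreter's `bz2` module releases the global interpreter lock while compressing (otherwise a process pool would be the substitute).

```python
import bz2
from concurrent.futures import ThreadPoolExecutor


def parallel_bz2_compress(data: bytes, chunk_size: int = 900_000,
                          workers: int = 4) -> bytes:
    """Compress independent chunks concurrently and concatenate the streams."""
    # Each chunk becomes its own complete bzip2 stream; an empty input
    # still yields one (empty) stream so the output is a valid file.
    chunks = [data[i:i + chunk_size]
              for i in range(0, len(data), chunk_size)] or [b""]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        streams = list(pool.map(bz2.compress, chunks))
    return b"".join(streams)


# Hypothetical payload, used only to exercise the function.
payload = bytes(range(256)) * 2000
packed = parallel_bz2_compress(payload, chunk_size=100_000)

# Python's bz2.decompress accepts a concatenation of several streams.
assert bz2.decompress(packed) == payload
```

Because each chunk is compressed without knowledge of the others, the output may be slightly larger than a single-stream compression of the same data.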

Like gzip, bzip2 is only a data compressor. It is not an archiver like RAR or ZIP; the program itself has no facilities for multiple files, encryption or archive-splitting, but, in the UNIX tradition, relies instead on separate external utilities such as tar and GnuPG for these tasks.
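The same division of labour shows up in Python's standard library, where `tarfile` bundles the files and bzip2 only compresses the resulting stream. The sketch below works in memory; the helper names `make_tar_bz2` and `read_tar_bz2` and the file names are made up for illustration. Encryption and splitting would still need separate tools.

```python
import io
import tarfile


def make_tar_bz2(files: dict[str, bytes]) -> bytes:
    """Bundle several named byte strings into one bzip2-compressed tar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:bz2") as archive:
        for name, content in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(content)
            archive.addfile(info, io.BytesIO(content))
    return buf.getvalue()


def read_tar_bz2(blob: bytes) -> dict[str, bytes]:
    """Read every regular file back out of a bzip2-compressed tar archive."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:bz2") as archive:
        return {member.name: archive.extractfile(member).read()
                for member in archive.getmembers() if member.isfile()}


# Hypothetical file names and contents.
bundle = make_tar_bz2({"notes.txt": b"hello\n", "data.bin": bytes(16)})
print(sorted(read_tar_bz2(bundle)))
```

On the command line, the equivalent is typically a `.tar.bz2` file produced by running tar and bzip2 together.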

Compression stack

bzip2 uses several layers of compression techniques stacked on top of each other, which are applied in a fixed order during compression and in the reverse order during decompression.
