Tar (file format)

In computing, tar (derived from "tape archive") is both a file format (in the form of a type of archive bitstream) and the name of a program used to handle such files; an archive in this format is commonly called a "tarball". The format was created in the early days of Unix and was standardized by POSIX.1-1988 and later by POSIX.1-2001.

Initially developed to be written directly to sequential I/O devices for tape backup purposes, it is now commonly used to collect many files into one larger file for distribution or archiving, while preserving file system information such as user and group permissions, dates, and directory structures.
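As a rough illustration of collecting several files into one archive while keeping their metadata, the following sketch uses Python's standard tarfile module rather than the tar program itself. The helper names build_archive and list_members, the file names, and the owner, mode, and timestamp values are all hypothetical and chosen only for the example.

```python
import io
import tarfile


def build_archive(files):
    """Pack a {name: bytes} mapping into an uncompressed tar archive in memory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)          # size must be set before adding
            info.mode = 0o640              # hypothetical permission bits
            info.mtime = 1_000_000_000     # hypothetical modification time
            info.uname = "alice"           # hypothetical owner name
            info.gname = "staff"           # hypothetical group name
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def list_members(archive_bytes):
    """Return (name, size, mode, owner, group, mtime) for each archive member."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:") as tar:
        return [
            (m.name, m.size, oct(m.mode), m.uname, m.gname, m.mtime)
            for m in tar.getmembers()
        ]


archive = build_archive({"docs/readme.txt": b"hello", "docs/notes.txt": b"tar"})
for member in list_members(archive):
    print(member)
```

Each member's header carries the path, permissions, ownership, and timestamp alongside the file data, which is how a single archive can restore a directory tree later.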


Compression and naming

Conventionally, uncompressed tar archive files have names ending in ".tar" (for example, somefile.tar). Unlike ZIP archives, in which each file is compressed separately, tar files are commonly compressed as a whole. Applying a compression utility such as gzip, bzip2, lzip, lzma or compress to a tar file produces a compressed tar file, typically named with an extension indicating the type of compression (e.g., somefile.tar.gz).
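The idea of compressing the archive as a whole can be sketched with Python's standard gzip and tarfile modules: the tar bitstream is built first, and the compressor is then applied to that single stream. The helper names make_tar, compress_whole, and read_names and the sample file names are hypothetical.

```python
import gzip
import io
import tarfile


def make_tar(files):
    """Build an uncompressed tar archive in memory from a {name: bytes} mapping."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def compress_whole(tar_bytes):
    """Compress the entire tar stream at once, as gzip would for somefile.tar."""
    return gzip.compress(tar_bytes)


def read_names(tgz_bytes):
    """List member names of a gzip-compressed tar archive."""
    with tarfile.open(fileobj=io.BytesIO(tgz_bytes), mode="r:gz") as tar:
        return tar.getnames()


raw = make_tar({"a.txt": b"first file", "b.txt": b"second file"})
tgz = compress_whole(raw)          # plays the role of somefile.tar.gz
print(read_names(tgz))
print(gzip.decompress(tgz) == raw)  # the original tar stream is recovered intact
```

Because the compressor sees one continuous stream, reaching a single member generally means decompressing the data that precedes it in the archive.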

Popular tar programs such as the BSD and GNU versions of tar support the command-line options -z (gzip) and -j (bzip2) to automatically compress or decompress the archive file they are working with. GNU tar from version 1.20 onwards also supports --lzma (LZMA). Version 1.21 adds support for lzop via --lzop, version 1.22 adds support for xz via --xz or -J, and version 1.23 adds support for lzip via --lzip. Both BSD tar and GNU tar will automatically extract gzip- and bzip2-compressed archives with or without these options.
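Automatic detection of the compression type on extraction has an analogue in Python's standard tarfile module, where the read mode "r:*" selects the decompressor transparently. The sketch below illustrates that module's behavior, not the internals of BSD or GNU tar; the helper names pack and unpack_any and the member name are hypothetical.

```python
import io
import tarfile


def pack(mode):
    """Create a one-member archive in memory using the given tarfile write mode."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode=mode) as tar:
        data = b"payload"
        info = tarfile.TarInfo(name="somefile.txt")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def unpack_any(blob):
    """Open an archive without stating its compression and read the first member."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:*") as tar:
        member = tar.getmembers()[0]
        return member.name, tar.extractfile(member).read()


# "w" is uncompressed, "w:gz" is gzip, "w:bz2" is bzip2.
for write_mode in ("w", "w:gz", "w:bz2"):
    print(write_mode, unpack_any(pack(write_mode)))
```

The reader never names the compression format; the choice is made by inspecting the data, much as the tar programs do when extracting without -z or -j.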
