In algorithmic information theory (a subfield of computer science), the Kolmogorov complexity of an object, such as a piece of text, is a measure of the computational resources needed to specify the object.
Kolmogorov complexity is also known as descriptive complexity, Kolmogorov-Chaitin complexity, stochastic complexity, algorithmic entropy, or program-size complexity.
For example, consider the following two strings of length 64, each containing only lowercase letters, numbers, and spaces:
abababababababababababababababababababababababababababababababab
4c1j5b2p0cv4w1x8rx2y39umgw5q85s7uraqbjfdppa0q7nieieqe9noc4cvafzf
The first string has a short English-language description, namely "ab 32 times", which consists of 11 characters. The second one has no obvious simple description (using the same character set) other than writing down the string itself, which has 64 characters.
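The contrast between the two strings can be sketched in Python. This is an illustration under stated assumptions, not a computation of Kolmogorov complexity: the zlib-compressed size is used only as a rough, computable stand-in for "length of a short description", and the names first, second and compressed_size are chosen here for the example.

```python
import zlib

# The two 64-character strings from the example above.
first = "ab" * 32  # this Python expression is itself a short description
second = "4c1j5b2p0cv4w1x8rx2y39umgw5q85s7uraqbjfdppa0q7nieieqe9noc4cvafzf"

def compressed_size(text: str) -> int:
    """Bytes zlib needs to store text (a proxy, not the true complexity)."""
    return len(zlib.compress(text.encode("ascii"), 9))

print(len(first), len(second))
print(compressed_size(first), compressed_size(second))
```

The regular string compresses to fewer bytes than the irregular one. A general-purpose compressor can only give an upper-bound style estimate, and for inputs this short its fixed overhead is significant.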
This image illustrates part of the Mandelbrot set fractal. Simply storing the 24-bit color of each pixel in this image would require 1.62 million bits, but a small computer program can reproduce these 1.62 million bits using the definition of the Mandelbrot set. Thus, the Kolmogorov complexity of the raw file encoding this bitmap is much less than 1.62 million bits.
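A minimal sketch of such a program follows. It does not reproduce the image referred to above: the 300 by 225 pixel size is an assumption chosen only because 300 × 225 × 24 = 1,620,000, and the viewing region, iteration cap and colour map are likewise arbitrary choices made for illustration.

```python
WIDTH, HEIGHT = 300, 225  # assumed size: 300 * 225 pixels * 24 bits = 1,620,000 bits
MAX_ITER = 40             # assumed cap on the escape-time iteration

def mandelbrot_bitmap(width: int = WIDTH, height: int = HEIGHT) -> bytes:
    """Return raw 24-bit RGB pixel data for one view of the Mandelbrot set."""
    pixels = bytearray()
    for row in range(height):
        for col in range(width):
            # Map the pixel to a point c in the region [-2, 1] x [-1.125, 1.125].
            c = complex(-2.0 + 3.0 * col / width, -1.125 + 2.25 * row / height)
            z = 0j
            steps = 0
            # Iterate z -> z*z + c until z escapes or the cap is reached.
            while abs(z) <= 2.0 and steps < MAX_ITER:
                z = z * z + c
                steps += 1
            # Arbitrary colour map: escape time spread over three channels.
            shade = 255 * steps // MAX_ITER
            pixels += bytes((shade, (shade * 3) % 256, (shade * 7) % 256))
    return bytes(pixels)

bitmap = mandelbrot_bitmap()
print(len(bitmap) * 8)  # number of bits in the raw pixel data
```

The source text of this program is well under a thousand characters, yet its output is 202,500 bytes long. With Python as the description language, the program is therefore a description of its own output that is far shorter than the output itself.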
More formally, the complexity of a string is the length of the string's shortest description in some fixed universal description language. The sensitivity of complexity relative to the choice of description language is discussed below. It can be shown that the Kolmogorov complexity of any string cannot be more than a few bytes larger than the length of the string itself. Strings whose Kolmogorov complexity is small relative to the string's size are not considered to be complex. The notion of Kolmogorov complexity can be used to state and prove impossibility results akin to Gödel's incompleteness theorem and Turing's halting problem.
Definition
To define Kolmogorov complexity, we must first specify a description language for strings. Such a description language can be based on any programming language, such as Lisp, Pascal, or Java Virtual Machine bytecode. If P is a program which outputs a string x, then P is a description of x. The length of the description is just the length of P as a character string. In determining the length of P, the lengths of any subroutines used in P must be accounted for. The length of any integer constant n which occurs in the program P is the number of bits required to represent n, that is (roughly) log_{2}n.
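The accounting rule for integer constants can be checked with a short sketch. The helper name bits_for_integer is an assumption made for this example; it relies on Python's built-in int.bit_length, which counts the binary digits of an integer.

```python
import math

def bits_for_integer(n: int) -> int:
    """Number of binary digits needed to write the positive integer n."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return n.bit_length()  # equals floor(log2(n)) + 1 for n >= 1

# Compare the exact digit count with the rough estimate log2(n).
for n in (1, 32, 1000, 10**6):
    print(n, bits_for_integer(n), round(math.log2(n), 2))
```

The exact count and log_{2}n differ by less than one bit, which is why the text can say "roughly".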
We could alternatively choose an encoding for Turing machines, where an encoding is a function which associates to each Turing Machine M a bitstring <M>. If M is a Turing Machine which on input w outputs string x, then the concatenated string <M> w is a description of x. For theoretical analysis, this approach is more suited for constructing detailed formal proofs and is generally preferred in the research literature. The binary lambda calculus may provide the simplest definition of complexity yet. In this article we will use an informal approach.
Any string s has at least one description, namely the program
    function GenerateFixedString()
        return s
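The same idea can be sketched with Python standing in for the description language. The helpers literal_program and run are hypothetical names introduced for this example. The sketch shows that the literal program is longer than s by a fixed number of characters, at least for strings that contain no characters needing escapes in a Python string literal.

```python
import contextlib
import io

def literal_program(s: str) -> str:
    """Source of a Python program that prints s and nothing else."""
    return f"print({s!r}, end='')"

def run(program: str) -> str:
    """Execute program text and return what it wrote to standard output."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(program, {})
    return buffer.getvalue()

for s in ("ab" * 32, "hello", ""):
    program = literal_program(s)
    assert run(program) == s                # the program really describes s
    print(len(s), len(program) - len(s))    # string length, constant overhead
```

Because such a program always exists, the complexity of s in this language is at most the length of s plus that overhead.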
If a description of s, d(s), is of minimal length (i.e. it uses the fewest characters), it is called a minimal description of s. The length of d(s) (i.e. the number of characters in the description) is then the Kolmogorov complexity of s, written K(s). Symbolically,

K(s) = |d(s)|.
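To make "length of a minimal description" concrete, the sketch below computes it for a deliberately tiny, hypothetical description language with two rules: "L" followed by a string describes that string literally, and "R<count>:<unit>" describes unit repeated count times. This toy language is an assumption of the example and is not universal, so the value returned is not the Kolmogorov complexity. For a universal language the minimum cannot in general be found by exhaustive search like this, since K is not computable.

```python
def toy_descriptions(s: str):
    """Yield every description of s in the hypothetical two-rule language."""
    yield "L" + s  # the literal rule always applies
    n = len(s)
    for unit_len in range(1, n // 2 + 1):
        if n % unit_len == 0:
            unit = s[:unit_len]
            count = n // unit_len
            if unit * count == s:
                yield f"R{count}:{unit}"  # the repeat rule applies

def toy_complexity(s: str) -> int:
    """Length of a shortest description of s in the toy language."""
    return min(len(d) for d in toy_descriptions(s))

regular = "ab" * 32
irregular = "4c1j5b2p0cv4w1x8rx2y39umgw5q85s7uraqbjfdppa0q7nieieqe9noc4cvafzf"
print(toy_complexity(regular), toy_complexity(irregular))
```

In this toy language the regular string has the six-character description "R32:ab", while the irregular string has only its literal description of 65 characters.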
We now consider how the choice of description language affects the value of K and show that the effect of changing the description language is bounded.
Theorem. If K_{1} and K_{2} are the complexity functions relative to description languages L_{1} and L_{2}, then there is a constant c (which depends only on the languages L_{1} and L_{2}) such that, for all strings s,

-c \leq K_{1}(s) - K_{2}(s) \leq c.
Proof. By symmetry, it suffices to prove that there is some constant c such that for all bitstrings s,

K_{1}(s) \leq K_{2}(s) + c.
Now, suppose there is a program in the language L_{1} which acts as an interpreter for L_{2}:
    function InterpretLanguage(string p)
where p is a program in L_{2}. The interpreter is characterized by the following property: running InterpretLanguage on input p returns the result of running p.
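The role of the interpreter can be sketched as follows, with Python playing the part of L_{1} and a hypothetical toy language playing the part of L_{2}. The toy language (programs such as "R32:ab" for a repeated unit, or "Lhello" for a literal) and the names INTERPRETER_SOURCE, wrap and run are assumptions of this example. The point is only that any L_{2} program p becomes an L_{1} program by prepending a fixed interpreter, so the length grows by an amount that does not depend on p.

```python
import contextlib
import io

# Python source of an interpreter for the hypothetical toy language.
INTERPRETER_SOURCE = '''
def interpret_language(p):
    if p.startswith("L"):
        return p[1:]                      # literal rule
    count, unit = p[1:].split(":", 1)     # repeat rule "R<count>:<unit>"
    return unit * int(count)
'''

def wrap(p: str) -> str:
    """Turn a toy-language program p into a Python program with the same output."""
    return INTERPRETER_SOURCE + f"print(interpret_language({p!r}), end='')"

def run(program: str) -> str:
    """Execute Python program text and return what it wrote to standard output."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(program, {})
    return buffer.getvalue()

for p in ("R32:ab", "Lhello", "R3:xyz"):
    wrapped = wrap(p)
    # Output of the wrapped program, and the overhead added by the interpreter.
    print(repr(run(wrapped)), len(wrapped) - len(p))
```

For programs p without characters that need escaping, the printed overhead is the same every time. That fixed overhead corresponds to the constant c in the inequality K_{1}(s) \leq K_{2}(s) + c.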
