The star height problem in formal language theory is the question of whether all regular languages can be expressed using regular expressions of limited star height, i.e. with a limited nesting depth of Kleene stars. Specifically, is a nesting depth greater than 1 required? If so, is there an algorithm to determine how much nesting is required? The problem was raised by Eggan (1963).
Families of regular languages with unbounded star height
The first question was answered in the negative when, in 1963, Eggan gave examples of regular languages of star height n for every n. Here, the star height h(L) of a regular language L is defined as the minimum star height among all regular expressions representing L. The first few languages found by Eggan (1963) are described below by giving a regular expression for each language:
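A reconstruction of the first three expressions, assuming that the base case is a single starred letter (an assumption, since the original display is not reproduced here) and numbering the alphabet symbols a_1, a_2, ... in order of introduction:

```latex
\begin{align*}
e_1 &= a_1^{*} \\
e_2 &= \left(a_1^{*} a_2^{*} a_3\right)^{*} \\
e_3 &= \left(\left(a_1^{*} a_2^{*} a_3\right)^{*} \left(a_4^{*} a_5^{*} a_6\right)^{*} a_7\right)^{*}
\end{align*}
```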
The construction principle for these expressions is that expression e_{n + 1} is obtained by concatenating two copies of e_{n}, appropriately renaming the letters of the second copy using fresh alphabet symbols, concatenating the result with another fresh alphabet symbol, and then surrounding the resulting expression with a Kleene star. The remaining, more difficult part is to prove that for e_{n} there is no equivalent regular expression of star height less than n; a proof is given by Eggan (1963).
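The construction principle can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the base case e_1 is taken to be a single starred letter, and the names `Sym`, `Cat`, `Star`, `eggan`, `star_height`, `letters`, and `show` are hypothetical helpers introduced here, not part of any library. Note that `star_height` measures the nesting depth of the expression it is given; it does not show that no equivalent expression of lower star height exists, which is the difficult part of Eggan's result.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Sym:
    name: str          # a single alphabet symbol


@dataclass(frozen=True)
class Cat:
    parts: tuple       # concatenation of sub-expressions


@dataclass(frozen=True)
class Star:
    inner: object      # Kleene star applied to a sub-expression


def eggan(n, fresh=None):
    """Build e_n; the base case e_1 = a1* is an assumption of this sketch."""
    if fresh is None:
        fresh = count(1)               # indices for fresh letters a1, a2, ...
    if n == 1:
        return Star(Sym(f"a{next(fresh)}"))
    left = eggan(n - 1, fresh)         # first copy of e_{n-1}
    right = eggan(n - 1, fresh)        # second copy, over fresh letters
    last = Sym(f"a{next(fresh)}")      # one more fresh letter
    return Star(Cat((left, right, last)))


def star_height(e):
    """Nesting depth of Kleene stars in the expression e (not its language)."""
    if isinstance(e, Sym):
        return 0
    if isinstance(e, Cat):
        return max(star_height(p) for p in e.parts)
    return 1 + star_height(e.inner)


def letters(e):
    """Set of alphabet symbols occurring in e."""
    if isinstance(e, Sym):
        return {e.name}
    if isinstance(e, Cat):
        return set().union(*(letters(p) for p in e.parts))
    return letters(e.inner)


def show(e):
    """Render e as text, parenthesising starred concatenations."""
    if isinstance(e, Sym):
        return e.name
    if isinstance(e, Cat):
        return " ".join(show(p) for p in e.parts)
    inner = show(e.inner)
    return f"{inner}*" if isinstance(e.inner, Sym) else f"({inner})*"


if __name__ == "__main__":
    for n in range(1, 5):
        e = eggan(n)
        print(n, star_height(e), len(letters(e)), show(e))
```

Under these assumptions, the expression built for n has star height n and uses 2^{n} - 1 distinct letters, since each step doubles the alphabet and adds one fresh symbol.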
However, Eggan's examples use a large alphabet, of size 2^{n} - 1 for the language with star height n. He therefore asked whether such examples can also be found over a binary alphabet. This was proved to be true shortly afterwards by Dejean & Schützenberger (1966). Their examples can be described by an inductively defined family of regular expressions over the binary alphabet {a,b} as follows (cf. Salomaa 1981):
Again, a rigorous proof is needed for the fact that e_{n} does not admit an equivalent regular expression of lower star height. Proofs are given by Dejean & Schützenberger (1966) and by Salomaa (1981).
Computing the star height of regular languages
In contrast, the second question turned out to be much more difficult, and it became a famous open problem in formal language theory for over two decades (Brzozowski 1980). In fact, the problem remained open for more than 25 years until it was settled by Hashiguchi, who in 1988 published an algorithm to determine the star height of any regular language. The algorithm was far from practical, being of non-elementary complexity. To illustrate the immense resource consumption of that algorithm, Lombardy and Sakarovitch (2002) give some actual numbers:
