LR parser

related topics
{math, number, function}
{area, part, region}

In computer science, an LR parser is a parser that reads input from Left to right (as it would appear if visually displayed) and produces a Rightmost derivation. The term LR(k) parser is also used; where the k refers to the number of unconsumed "look ahead" input symbols that are used in making parsing decisions. Usually k is 1 and the term LR parser is often intended to refer to this case.

The syntax of many programming languages can be defined by a grammar that is LR(1), or close to being so, and for this reason LR parsers are often used by compilers to perform syntax analysis of source code.

An LR parser can be created from a context-free grammar by a program called a parser generator or hand written by a programmer. A context-free grammar is classified as LR(k) if there exists an LR(k) parser for it, as determined by the parser generator.

An LR parser is said to perform bottom-up parsing because it attempts to deduce the top level grammar productions by building up from the leaves.

A deterministic context-free language is a language for which some LR(k) grammar exists. Every LR(k) grammar for k > 1 can be mechanically transformed into an LR(1) grammar for the same language, while an LR(0) grammar for the same language may not exist; the LR(0) languages are a proper subset of the deterministic ones.

An LR parser is based on an algorithm which is driven by a parser table, a data structure which contains the syntax of the computer language being parsed. So the term LR parser actually refers to a class of parser that can process almost any programming language, as long as the parser table for a programming language is supplied. The parser table is created by a program called a parser generator.

LR parsing can be generalized as arbitrary context-free language parsing without a performance penalty, even for LR(k) grammars. This is because most programming languages can be expressed with LR(k) grammars, where k is a small constant (usually 1). Note that parsing of non-LR(k) grammars is an order of magnitude slower (cubic instead of quadratic in relation to the input length).

LR parsing can handle a larger range of languages than LL parsing, and is also better at error reporting, i.e. it detects syntactic errors when the input does not conform to the grammar as soon as possible. This is in contrast to an LL(k) (or even worse, an LL(*) parser) which may defer error detection to a different branch of the grammar due to backtracking, often making errors harder to localize across disjunctions with long common prefixes.

LR parsers are difficult to produce by hand and they are usually constructed by a parser generator or a compiler-compiler. Depending on how the parsing table is generated, these parsers can be called simple LR parsers (SLR), look-ahead LR parsers (LALR), or canonical LR parsers. LALR parsers have more language recognition power than SLR parsers. Canonical LR parsers have more recognition power than LALR parsers.

Contents

Full article ▸

related documents
Limit (category theory)
Elliptic curve cryptography
Binary search tree
Recurrence relation
Calculus
Numeral system
Fourier series
Kolmogorov complexity
Scheme (programming language)
Functional programming
Collatz conjecture
Topological space
Computational complexity theory
Markov chain
Variable
Hash table
Adjoint functors
Zermelo–Fraenkel set theory
Matrix multiplication
Tensor product
Peano axioms
Limit superior and limit inferior
Axiom
Design Patterns
Naive set theory
Axiom of choice
Inverse function
Finite field
Recursion
Word problem for groups