De Bruijn index

In mathematical logic, the De Bruijn index is a notation invented by the Dutch mathematician Nicolaas Govert de Bruijn for representing terms of the λ-calculus without naming the bound variables.^[1] Terms written using these indexes are invariant with respect to α-conversion, so checking α-equivalence reduces to checking syntactic equality. Each De Bruijn index is a natural number that represents an occurrence of a variable in a λ-term, and denotes the number of binders in scope between that occurrence and its corresponding binder, counting the corresponding binder itself. The following are some examples:

The term λx. λy. x, sometimes called the K combinator, is written as λ λ 2 with De Bruijn indexes. The binder for the occurrence x is the second λ in scope, counting outwards from the occurrence.
The term λx. λy. λz. x z (y z) (the S combinator), with De Bruijn indexes, is λ λ λ 3 1 (2 1).
The term λz. (λy. y (λx. x)) (λx. z x) is λ (λ 1 (λ 1)) (λ 2 1). See the following illustration, where the binders are coloured and the references are shown with arrows.
Pictorial depiction of example
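
The conversion used in the examples above can be sketched in Python. This is a minimal illustration, not a reference implementation: the tuple encoding of terms and the name `to_de_bruijn` are assumptions made for this sketch, and only closed terms are handled.

```python
def to_de_bruijn(term, scope=()):
    """Convert a closed named term to its De Bruijn form (indexes start at 1).

    Named terms (hypothetical encoding): a str is a variable,
    ("lam", name, body) an abstraction, ("app", fun, arg) an application.
    `scope` holds the names of the enclosing binders, innermost first.
    """
    if isinstance(term, str):
        # The index is the position of the nearest binder with this name.
        # A free variable raises ValueError: this sketch covers closed terms only.
        return scope.index(term) + 1
    if term[0] == "lam":
        _, name, body = term
        return ("lam", to_de_bruijn(body, (name,) + scope))
    _, fun, arg = term
    return ("app", to_de_bruijn(fun, scope), to_de_bruijn(arg, scope))


K = ("lam", "x", ("lam", "y", "x"))
S = ("lam", "x", ("lam", "y", ("lam", "z",
     ("app", ("app", "x", "z"), ("app", "y", "z")))))

print(to_de_bruijn(K))  # ('lam', ('lam', 2)), i.e. λ λ 2
print(to_de_bruijn(S))
```

Because bound names disappear in the output, two α-equivalent inputs such as λx. λy. x and λa. λb. a produce the same tuple, so comparing them with `==` decides α-equivalence.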

De Bruijn indexes are commonly used in higher-order reasoning systems such as automated theorem provers and logic programming systems.^[2]

Formal definition

Formally, λ-terms (M, N, …) written using De Bruijn indexes have the following syntax (parentheses allowed freely):

M, N, … ::= n | M N | λ M

where the variables n are natural numbers greater than 0. A variable n is bound if it is in the scope of at least n binders (λ); otherwise it is free. The binding site for a variable n is the nth binder it is in the scope of, starting from the innermost binder.
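
The bound/free distinction can be made concrete with a short Python sketch. The encoding is an assumption of this example: an `int` is a variable, `("lam", body)` an abstraction, and `("app", fun, arg)` an application; `free_variables` is a hypothetical helper name.

```python
def free_variables(term, depth=0):
    """Return the free variables of a De Bruijn term, numbered as seen from outside it.

    `depth` counts the binders enclosing the current subterm. A variable n
    is bound when n <= depth and free otherwise; a free n refers to the
    (n - depth)th free variable of the whole term.
    """
    if isinstance(term, int):
        return {term - depth} if term > depth else set()
    if term[0] == "lam":
        return free_variables(term[1], depth + 1)
    return free_variables(term[1], depth) | free_variables(term[2], depth)


# λ λ 4 2 (λ 1 3): only the 4 is free; it sits under two binders.
example = ("lam", ("lam", ("app", ("app", 4, 2), ("lam", ("app", 1, 3)))))
print(free_variables(example))  # {2}
```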

The most primitive operation on λ-terms is substitution: replacing free variables in a term with other terms. In the β-reduction (λ M) N, for example, we must:

1. find the variables n₁, n₂, …, n_k in M that are bound by the λ in λ M,
2. decrease the free variables of M to match the removal of the outer λ-binder, and
3. replace n₁, n₂, …, n_k with N, suitably increasing the free variables occurring in N each time, to match the number of λ-binders the corresponding variable occurs under when substituted.

To illustrate, consider the application

(λ λ 4 2 (λ 1 3)) (λ 5 1)

which might correspond to the following term written in the usual notation

(λx. λy. z x (λu. u x)) (λx. w x).

After step 1, we obtain the term λ 4 □ (λ 1 □), where the variables that are substituted for are replaced with boxes. Step 2 lowers the free variables, giving λ 3 □ (λ 1 □). Finally, in step 3 we replace the boxes with the argument; the first box is under one binder, so we replace it with λ 6 1 (which is λ 5 1 with the free variables increased by 1); the second is under two binders, so we replace it with λ 7 1. The final result is λ 3 (λ 6 1) (λ 1 (λ 7 1)).
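
The three steps can be sketched in Python as a single traversal of M. This is one possible way to organise the computation, under an assumed tuple encoding (`int` for variables, `("lam", body)`, `("app", fun, arg)`); the function names are hypothetical.

```python
def shift(term, k, cutoff=0):
    """Add k to every variable of `term` that is free (greater than `cutoff`)."""
    if isinstance(term, int):
        return term + k if term > cutoff else term
    if term[0] == "lam":
        return ("lam", shift(term[1], k, cutoff + 1))
    return ("app", shift(term[1], k, cutoff), shift(term[2], k, cutoff))


def subst_top(body, arg, depth=0):
    """Substitute `arg` for the variable bound by the λ that is being removed."""
    if isinstance(body, int):
        if body == depth + 1:         # step 1: bound by the removed λ
            return shift(arg, depth)  # step 3: raise arg's free variables
        if body > depth + 1:          # step 2: free in M, one binder fewer now
            return body - 1
        return body                   # bound inside M: unchanged
    if body[0] == "lam":
        return ("lam", subst_top(body[1], arg, depth + 1))
    return ("app", subst_top(body[1], arg, depth), subst_top(body[2], arg, depth))


def beta(redex):
    """Contract a redex of the form (λ M) N."""
    _, (_, body), arg = redex
    return subst_top(body, arg)


# (λ λ 4 2 (λ 1 3)) (λ 5 1)
redex = ("app",
         ("lam", ("lam", ("app", ("app", 4, 2), ("lam", ("app", 1, 3))))),
         ("lam", ("app", 5, 1)))
print(beta(redex))  # encodes λ 3 (λ 6 1) (λ 1 (λ 7 1))
```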

Formally, a substitution is an unbounded list of term replacements for the free variables, written M₁.M₂…, where M_i is the replacement for the ith free variable. The increasing operation in step 3 is sometimes called shift and written ↑^k, where k is a natural number indicating the amount by which to increase the variables; it is the substitution (k+1).(k+2).(k+3)…. For example, ↑⁰ is the identity substitution 1.2.3…, leaving a term unchanged.

The application of a substitution s to a term M is written M[s]. The composition of two substitutions s₁ and s₂ is written s₁ s₂ and defined by

M [s₁ s₂] = (M [s₁]) [s₂].

The rules for application are as follows:

n [N₁ … N_n …] = N_n
(M₁ M₂) [s] = (M₁ [s]) (M₂ [s])
(λ M) [s] = λ (M [1.1[s′].2[s′].3[s′]…]), where s′ = s ↑¹

The steps outlined for the β-reduction above are thus more concisely expressed as:

(λ M) N →_β M [N.1.2.3…].
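
The substitution rules and the β-rule can be sketched in Python by modelling an unbounded list M₁.M₂… as a function from an index i to the term M_i. This modelling choice, the tuple encoding of terms (`int`, `("lam", body)`, `("app", fun, arg)`), and all function names are assumptions of this sketch.

```python
def shift(k):
    """The substitution ↑^k = (k+1).(k+2).(k+3)…"""
    return lambda i: i + k


def cons(term, s):
    """The substitution term.s: index 1 maps to `term`, index i+1 to s(i)."""
    return lambda i: term if i == 1 else s(i - 1)


def apply_subst(term, s):
    """Compute term[s] by the three rules for application."""
    if isinstance(term, int):
        return s(term)
    if term[0] == "app":
        return ("app", apply_subst(term[1], s), apply_subst(term[2], s))
    # (λ M)[s] = λ (M[1.1[s'].2[s']…]) where s' = s ↑^1
    lifted = cons(1, lambda i: apply_subst(s(i), shift(1)))
    return ("lam", apply_subst(term[1], lifted))


def compose(s1, s2):
    """The composition s1 s2, satisfying M[s1 s2] = (M[s1])[s2]."""
    return lambda i: apply_subst(s1(i), s2)


def beta(redex):
    """(λ M) N →_β M[N.1.2.3…]"""
    _, (_, body), arg = redex
    return apply_subst(body, cons(arg, shift(0)))


redex = ("app",
         ("lam", ("lam", ("app", ("app", 4, 2), ("lam", ("app", 1, 3))))),
         ("lam", ("app", 5, 1)))
print(beta(redex))  # encodes λ 3 (λ 6 1) (λ 1 (λ 7 1))
```

Representing substitutions as functions keeps the lists lazily unbounded without building them; the cost is that nested binders create nested closures, so this form favours clarity over efficiency.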

Alternatives to De Bruijn indexes

When using the standard "named" representation of λ-terms, where variables are treated as labels or strings, one has to explicitly handle α-conversion when defining any operation on the terms. The standard Variable Convention^[3] of Barendregt is one such approach where α-conversion is applied as needed to ensure that:

bound variables are distinct from free variables, and
all binders bind variables not already in scope.

In practice this is cumbersome, inefficient, and often error-prone. It has therefore led to the search for different representations of such terms. On the other hand, the named representation of λ-terms is more pervasive and can be more immediately understandable by others because the variables can be given descriptive names. Thus, even if a system uses De Bruijn indexes internally, it will present a user interface with names.
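
To illustrate the bookkeeping involved, here is a minimal Python sketch of capture-avoiding substitution on named terms, which α-renames a binder on demand. The encoding (`str` for variables, `("lam", name, body)`, `("app", fun, arg)`), the helper names, and the `v0, v1, …` scheme for fresh names are all assumptions of this example.

```python
import itertools


def free_vars(t):
    """Names occurring free in a named term."""
    if isinstance(t, str):
        return {t}
    if t[0] == "lam":
        return free_vars(t[2]) - {t[1]}
    return free_vars(t[1]) | free_vars(t[2])


def fresh(avoid):
    """Pick the first name v0, v1, … that is not in `avoid`."""
    return next(f"v{i}" for i in itertools.count() if f"v{i}" not in avoid)


def subst(t, x, n):
    """Capture-avoiding substitution t[x := n] on named terms."""
    if isinstance(t, str):
        return n if t == x else t
    if t[0] == "app":
        return ("app", subst(t[1], x, n), subst(t[2], x, n))
    _, y, body = t
    if y == x:                # x is shadowed below this binder
        return t
    if y in free_vars(n):     # the binder would capture a free variable of n
        z = fresh(free_vars(n) | free_vars(body) | {x})
        body, y = subst(body, y, z), z   # α-rename the binder first
    return ("lam", y, subst(body, x, n))


# (λy. x)[x := y] must not become λy. y
print(subst(("lam", "y", "x"), "x", "y"))  # ('lam', 'v0', 'y')
```

Every operation on named terms needs this kind of renaming logic, and the free-variable sets are recomputed at each binder, which is the source of the inefficiency mentioned above.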

De Bruijn indexes are not the only representation of λ-terms that obviates the problem of α-conversion. Among named representations, the nominal approach of Pitts and Gabbay is one alternative, in which the representation of a λ-term is treated as an equivalence class of all terms rewritable to it using variable permutations.^[4] This approach is taken by the Nominal Datatype Package of Isabelle/HOL.^[5]

Another common alternative is an appeal to higher-order representations where the λ-binder is treated as a true function. In such representations, the issues of α-equivalence, substitution, etc. are identified with the same operations in a meta-logic.
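
A rough Python sketch of the higher-order idea follows, with Python itself playing the role of the meta-language rather than a meta-logic. The constructors `lam` and `app`, the use of strings as free constants, and the reducer `reduce_head` are assumptions of this illustration.

```python
def lam(f):
    """An abstraction whose body is a Python function from terms to terms."""
    return ("lam", f)


def app(fun, arg):
    return ("app", fun, arg)


def reduce_head(t):
    """Weak head reduction: a β-step is simply a Python function call."""
    while isinstance(t, tuple) and t[0] == "app":
        f = reduce_head(t[1])
        if isinstance(f, tuple) and f[0] == "lam":
            t = f[1](t[2])    # substitution is delegated to the meta-language
        else:
            return ("app", f, t[2])
    return t


K = lam(lambda x: lam(lambda y: x))
print(reduce_head(app(app(K, "a"), "b")))  # a
```

No index arithmetic or renaming appears here because Python's own variable binding does that work; in exchange, terms contain opaque functions and cannot be compared or printed structurally without further machinery.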

References

[1] De Bruijn, Nicolaas Govert (1972). "Lambda Calculus Notation with Nameless Dummies: A Tool for Automatic Formula Manipulation, with Application to the Church-Rosser Theorem". Indagationes Mathematicae. 34. Elsevier: 381–392. ISSN 0019-3577.
[2] Gabbay, Murdoch J. (1999). "A New Approach to Abstract Syntax and Binding". 14th Annual Symposium on Logic in Computer Science. pp. 214–224.
[3] Barendregt, Henk P. (1984). The Lambda Calculus: Its Syntax and Semantics. North Holland. ISBN 0-444-87508-5.
[4] Pitts, Andy M. (2003). "Nominal Logic: A First Order Theory of Names and Binding". Information and Computation. 186: 165–193. doi:10.1016/S0890-5401(03)00138-X. ISSN 0890-5401.
[5] "Nominal Isabelle web-site". Retrieved 2007-03-28.
