The computational content of equality in Martin-Löf type theory
If you ask a type theorist what "axioms" their system assumes, you may get a look of indignation. Type theorists take pride in assuming no axioms. Instead, a type theory has grammatical rules which allow the user to define types and terms. An "axiom" in the context of type theory refers to an assumption that a certain type is inhabited, even when it is impossible to construct a term of the given type purely by following the grammatical rules. The distinction may sound like mere semantics; but a type theory is a programming language, and the constructions of the grammar have computational content even when their logical meaning is ignored. One need not appeal to any kind of logical principles in justifying the terms; one can justify them only in terms of programming.
Equality in (intensional) Martin-Löf type theory is strangely behaved. In MLTT, proofs of equality are mathematical objects in their own right that can be referred to within the theory. In particular, if \(A\) is a type, \(a, b\) are elements of \(A\), and \(p, q\) are proofs that \(a=b\), we may ask whether \(p=q\). It is known that this statement is not provable in general, although it holds, for example, when \(A\) is \(\mathbb{N}\) or the set of booleans \(tt,ff\). This often does cause genuine problems in trying to reason within the theory. So why not add that as an extra axiom, i.e., assume that any two proofs of \(a=b\) are equal? This is known to be consistent, it does not lead to a proof of \(0=1\). The response is usually: "You can always assume that the type is inhabited, but you cannot execute the inhabitant if it does not really exist; the proofs of equality constructed this way do not have computational content."
Now, this response is utterly unconvincing if you do not know what it means for equality to have computational content! The goal of this post is to explain what this means, in order that we can understand why someone would prefer to work in the pure system and not assume additional axioms governing equality.
More generally the subject of this post is induction and recursion, two key computational concepts.
First, let us note that the principle of induction in the usual sense, on the natural numbers, absolutely has computational content. The natural numbers are an inductively defined data structure. Proof by induction is proof by appeal to recursion on this structure. Probably Church numerals represent the first formalization of this idea: in the \(\lambda\) calculus, the natural number \(n\) is represented by the ability to iterate an arbitrary function \(f\) \(n\) many times, i.e., as the higher order operator that sends an arbitrary function \(f\) to the \(n\)-fold iteration of \(f\), \(f\mapsto f^{(n)}\), or \(\lambda f.\lambda x.ffff\dots f x\). From the point of view of constructive mathematics, we can see Church numerals as encoding the computational content of the principle of induction: for some property \(P(n)\) of the natural numbers, if \(t\) is a proof of \(P(0)\), and \(s\) is a proof of \(\forall n, P(n)\implies P(n+1)\), then the conclusion \(\forall n, P(n)\) can be proven directly by using Church numerals: to prove \(P(n)\), apply the \(n\)-th Church numeral \(\overline{n}\) to \(s\) and \(t\); this then \(\beta\) reduces as \(\lambda f.\lambda x.fff\dots f x)st\to_{\beta}ssss\dots st\), which by assumption is a proof of \(P(n)\).
In categorical logic, it is often said that the principle of induction for \(\mathbb{N}\) can be equivalently stated as the observation that the triple \((\mathbb{N},0,S)\) is the initial algebra for a theory with one constant symbol and one unary operation, see https://ncatlab.org/nlab/show/induction. To me this is an inappropriate statement of the induction principle as it is too weak. The problem is that it states a universal property that allows one to construct a map into some fixed object (a \((0,S)\)-algebra). But this is not what the principle of induction states: it states that we can construct evidence for a family of propositions \(\{P(n)\}_{n\in\mathbb{N}}\), or construct elements of a family of sets. Thus the principle of induction on the linked \(n\)lab page is inadequate, because it is not dependent.
A correct formulation of the principle of induction is as follows. Let \(P= \{P(n)\}_{n\in\mathbb{N}}\) be a family of sets indexed by the natural numbers. If we are given \(x\in P(0)\), and if we are given a family of functions \(f : \prod_{n\in\mathbb{N}} P(n)\to P(n+1)\), then we can construct an element \(\mathsf{natind}\;f\;x : \prod_{n\in\mathbb{N}} P(n)\) of the Cartesian product of all \(P(n)\). It is obvious that this principle of induction has computational significance: we know how to write a program which implements this. On input \(0\), return \(x\); on input \(S(n)\); return \(f(\mathsf{natind}\;f\;x\;n)\).
A similar computational principle can be given for the standard data type of Booleans, although this may seem too trivial to be worth mentioning, in fact having a good understanding of the 'degenerate' case of induction is vital. There are only two rules to construct a Boolean: \(tt\) is a Boolean, and \(ff\) is a Boolean. Given a family of sets \(\{P(b)\}_{b\in \mathsf{bool}}\), to construct an element of \(\prod_{b\in \mathsf{bool}}P(b)\), it suffices to give \(x\in P(tt)\) and \(y\in P(ff)\). This is the induction principle of the Booleans: case analysis! It is certainly of computational significance, as it is the basis of the "if-then-else" construction in all programming languages. Note again that for full expressiveness it is essential that \(P\) be allowed to depend on the Boolean, i.e., it is a predicate on the set of Booleans rather than a constant.
Before moving onto equality we will last consider a more complicated example - my suspicion is that the inductive principle of equality is "too simple to be simple" and one must see a more complex example in order to fully understand the trivial example. Let \(G\) be a directed graph, i.e, it consists of a set of vertices \(V\) and for each pair of vertices \(v,v'\) a set of edges \(E(v,v')\) from \(v\) to \(v'\). Fix some vertex \(v_0\), which we will hold constant. We would all agree that the concept of "the set of paths through \(G\) rooted at \(v_0\)" is an inductively defined set. My claim is that, like any other inductively defined data structure, it has computational content, namely, one should be able to define a function on paths through \(G\) by structural recursion on the path.
Actually, I would argue that it is more natural to inductively define the set of paths through \(G\) from \(v_0\) to \(v\), with \(v\) as an argument, and then take the disjoint union (coproduct) across all \(v\). This is what is most interesting about this example: we are not just defining a single set by induction, but a family of sets by induction, simultaneously.
So, how do we construct a path through \(G\) from \(v_0\) to \(v\), where \(v\) is variable?
- There is an empty path \(\mathsf{nil}\) from \(v_0\) to \(v_0\).
- If \(p\) is a path from \(v_0\) to \(v'\), and \(e\in E(v',v)\) is an edge from \(v'\) to \(v\), then \(\mathsf{cons}\; v'\;p\; e\) is a path from \(v_0\) to \(v\).
These are the only ways in which we can construct a path from \(v_0\) to \(v\).
What is the computational content of this inductive definition? It is that we can define functions on the set of paths by structural recursion on the path, and I will spell this out explicitly:
Let \(P(v,p)\) be a family of sets indexed by all pairs \((v,p)\) where \(v\in V\) and \(p : v_0\to v\). To construct a family \(k(v,p\) \in P(v,p\) for all such pairs \((v,p)\), the following suffices:
- Give an element of \(P(v_0,\mathsf{nil})\)
- For all \(v',v\), for all \(e \in E(v', v)\), for all paths \(p\) from \(v_0\) to \(v'\), give an element of \(P(v,\mathsf{cons}\; v'\;p\; e)\), possibly depending on \(k(v', p)\).
I encourage you to think about why this is valid as an induction principle and a computational principle, and how this principle can be realized by simple structural recursion on the body of the path \(p\).
Now, what does this have to do with equality? Because equality is, in a way, a special case of the previous example. Let \(V\) be a set, and \(v_0\) be a fixed element of \(V\), which we will hold constant. But, this time we do not have a set of edges. What are the rules by which we can construct a proof of equality \(v_0=v\), where \(v\) is some arbitrary element of \(V\)? There is only one such rule, the reflexivity rule:
- There is a proof of equality \(\mathsf{refl} \) that \(v_0=v_0\).
Notice that this rule is exactly the same as the base case rule generating the set of paths through a graph! And what is the computational significance of this rule? It should be analogous to the computational rule we have given for structural recursion on paths through a graph, except without the recursion step, and it is:
Let \(P(v, p)\) be a family of sets indexed by all possible pairs (\(v,p)\), where \(v\) is an element of \(V\) and \(p\) is a proof of equality \(v_0=v\). Then to construct a family \(k(v,p)\in P(v,p)\) for all pairs \((v,p)\), the following suffices:
- Give an element of \(P(v_0,\mathsf{refl})\)
And that's it. That is what is meant by the computational content of equality. The purpose of this post is to convince you that if you believe the recursion principle for paths through a graph has computational content, you should also believe this about recursion principle for equality, because equality is just the special case of the empty path through a graph.
Note that in the special case where \(P\) happens not to depend on \(p\) explicitly, and \(P(v)\) represents the set of evidence for a proposition about \(v\) so \(P\) can purely be regarded as a predicate on \(V\), then the principle we have just stated reduces to:
To construct a piece of evidence \(k(v,p)\) for \(P(v)\), it suffices to construct a piece of evidence for \(P(v_0)\).
Or:$$\forall v, \forall P, v_0=v \implies P(v_0)\implies P(v)$$
which is the usual substitution principle of equality, and which entails symmetry, transitivity and all the usual other properties of equality.
Comments
Post a Comment