A response to an undergrad - What do I study?
An undergrad that I was a T.A. for recently asked me what my research is in. I decided to write up an answer as a blog post, in a way that is accessible to the average early undergrad student in STEM.
The short version: I am interested in the "meanings" of various logics - geometric logic, forcing logic, modal logic, constructive logic, all of which I describe below - in making them easier to understand as deductive systems by giving concrete mathematical structures in which we can "interpret" the things those logics talk about. Are there settings which support a logic very similar to traditional/classical logic, except that the law of excluded middle or other nonconstructive proof techniques may fail? Categorical logic gives ways to attack questions like these, which all turn out to be related to each other, and I find this very interesting. The rest of this post is the longer version.
I study logic. More specifically, I study semantics. In linear
algebra, you start off with a certain formal algebraic language, the
language of vector spaces - it has certain distinguished constant
symbols you are permitted to use, like "0" for the zero vector, or any
real number; you are permitted to use variables v, w, ... that range
across the vectors; there are also special symbols that represent
functions, like "+", for addition. There are also certain axioms,
written in the language. One also has a list of formal logical laws that
allow you to combine the axioms in various ways to derive theorems in
the theory of linear algebra. Linear algebra from this point of view is
the collection of all formal statements in the language of linear
algebra that can be derived from the axioms under the accepted logical
laws. All this is considered "syntax" - it is the purely formal, textual
part of mathematics. This can all be coded into a computer system - a
computer can check whether a sentence in the language of linear algebra
is grammatically correct, or whether a proof is correct (i.e. the
logical laws are applied correctly and in the right order.)
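To make "syntax is purely formal" concrete, here is a small sketch in Haskell - a toy encoding of my own for this post, covering only the zero vector and addition and ignoring scalar multiplication - of terms and formulas in the language of vector spaces, as data a computer can manipulate without any idea of what they mean:

```haskell
-- Toy syntax for (a fragment of) the language of vector spaces.
-- These are just trees of symbols; no meaning is attached yet.

data Term
  = Zero              -- the constant symbol "0"
  | Var String        -- variables v, w, ...
  | Add Term Term     -- the function symbol "+"
  deriving (Show, Eq)

data Formula
  = Equal Term Term          -- t1 = t2
  | Not Formula
  | And Formula Formula
  | ForAll String Formula    -- "for all v, ..."
  | Exists String Formula    -- "there exists v such that ..."
  deriving (Show, Eq)

-- One of the vector space axioms, written as a syntax tree:
-- "for all v, v + 0 = v"
rightIdentity :: Formula
rightIdentity = ForAll "v" (Equal (Add (Var "v") Zero) (Var "v"))
```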
But
this is only half of it. The subject of linear algebra only exists
because we think of these axioms, constants, function symbols, and
theorems as having meaning. From this point of view, a vector space is a
mathematical structure which serves as a "model" for the theory - it is
a set V of elements, which we call vectors, for which the various
constants (0, ...) in the language can be interpreted and given meaning; for which the function symbols ("+", ...) can be interpreted as actual functions; for which the axioms and theorems can be translated into statements about the structure; and where the axioms hold true, so that the theorems tell us something about the structure in question.
Semantics imbues syntax with content. Without actual structures to talk
about and describe, syntax is meaningless. The notion of
"provable/unprovable" belongs to syntax; the notion of "true/false"
belongs to semantics.
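In the same toy Haskell spirit as above, a "model" amounts to picking an actual collection of vectors and interpreting the symbols as an actual element and an actual function. Again this covers only 0 and +, and the names are my own, not anything standard:

```haskell
-- A minimal sketch of a "model" for the toy fragment above: choose an actual
-- set of vectors (here, a Haskell type) and interpret the constant symbol 0
-- and the function symbol + as an actual element and an actual function.

class Model v where
  zero :: v            -- interpretation of the symbol "0"
  add  :: v -> v -> v  -- interpretation of the symbol "+"

-- R^3, with the usual componentwise structure, is one model.
newtype R3 = R3 (Double, Double, Double) deriving Show

instance Model R3 where
  zero = R3 (0, 0, 0)
  add (R3 (x1, y1, z1)) (R3 (x2, y2, z2)) = R3 (x1 + x2, y1 + y2, z1 + z2)
```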
The study of semantics is
also of some technical use to the study of syntax. One of the worries a
mathematician might experience when seeing the theory of linear algebra
for the first time is this: what if it is possible, from the axioms, to derive an absurdity - like "There exists some vector v which is not equal to itself" - or even to derive every sentence in the language whatsoever, both "P" and "not P" for every possible sentence P? This would clearly render the whole theory garbage, so it is important to work out a way to assure oneself that this does not happen. A good
way to do this is by working out the relationship between syntax and
semantics. It is a standard part of an introductory logic class to develop an interpretation of each formula and sentence in the language of linear algebra in each particular vector space, where it can then be assigned a value of "true" or "false" in the context of that particular vector space. It is then possible to show that if a sentence P is provable from the vector space axioms, it will be true when you interpret it in each particular vector space. We refer to this as the soundness
of the logic. This is, of course, as we would hope! But it is still
necessary to show it. (Question: What do you think about the converse
direction, which is called completeness? Suppose that some
sentence P holds true in every vector space in existence. Must there
necessarily be a proof of P from the axioms? Or is it the case that
whenever P is unprovable, there necessarily exists a counterexample to
P? Could it be the case that there are true statements about all vector
spaces which nevertheless cannot be proven to be true?)
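In the notation usually used for this - writing T for the vector space axioms, "T |- P" for "P is provable from T", and "T |= P" for "P is true in every model of T" - soundness is the first implication below, and the question above is asking about the second:

```latex
% Soundness: provable implies true in every vector space.
T \vdash P \;\Longrightarrow\; T \models P
% The converse direction (completeness), which the question above asks about:
T \models P \;\overset{?}{\Longrightarrow}\; T \vdash P
```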
Once
this is done, you can use provability to figure things out about truth,
and vice versa. Using the theorem I just mentioned - that if P is provable, it is true in every vector space - here is a proof that
the sentence "There exists a vector v which is not equal to itself" is
not provable from the vector space axioms.
1.
If the sentence "There exists a vector v which is not equal to itself"
is provable from the axioms, it is true when interpreted in every
vector space.
2. Take the specific vector space (R^3, 0, +), equipped with the usual structure. If the sentence above is provable
from the axioms, it is true when interpreted in this vector space.
3. But each vector in R^3 is equal to itself; so the sentence cannot be provable.
Of
course this is just the notion of "counterexample", which is a thing
that makes sense only in light of the relationship between truth and
provability which we commented on above.
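Here is a tiny Haskell sketch of the same counterexample idea, using a structure small enough for a computer to check exhaustively: pairs of bits with componentwise XOR as addition. For the fragment of the axioms that mention only 0 and + (the fragment the toy sketches above cover), this is a perfectly good model, and the candidate absurdity comes out false in it - so, by soundness, it cannot be provable:

```haskell
-- A finite model for the 0-and-+ fragment: pairs of bits, added componentwise
-- by XOR. Every sentence over this fragment can be checked by brute force.

type V = (Bool, Bool)

domain :: [V]
domain = [ (x, y) | x <- [False, True], y <- [False, True] ]

addV :: V -> V -> V
addV (x1, y1) (x2, y2) = (x1 /= x2, y1 /= y2)  -- componentwise XOR

-- The candidate absurdity: "there exists a vector v which is not equal to itself".
existsSelfUnequal :: Bool
existsSelfUnequal = any (\v -> v /= v) domain   -- evaluates to False

-- One of the axioms, checked in this model: "for all v, v + 0 = v".
rightIdentityHolds :: Bool
rightIdentityHolds = all (\v -> addV v (False, False) == v) domain  -- True
```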
I
could go on about many of the key results of classical logic - Gödel's
Incompleteness theorems for the formal theory of arithmetic, Cantor's
diagonal argument about the existence of "multiple infinities" and
Hilbert's hotel, the unsolvability of the halting problem, Gödel's
completeness theorem for the semantics of first-order logic using set
theory, and so on. I am sure most of these have very nice Youtube
explainers by Numberphile or 3blue1brown or someone like that;
absolutely worth checking out sometime. But I should move on for the
moment.
Now you might think all this is a
little silly as we are talking about these very simple theories, where
the relationships are perhaps so obvious that they are not worth
commenting on. However, as the mathematical formal language one wants to
work in gets more and more complicated, the problem of actually
figuring out what the semantics are grows more and more complicated. The
reason you might want a more complicated language is because it's more
expressive, and easier to work in and to translate your proofs into. It
could even be the case that the new language lets you say things or
phrase arguments that couldn't be phrased at all in the old language.
This is not really a problem for mathematicians, because they don't actually ever translate their arguments into formal systems in practice. There is a formal system called ZFC set theory + first-order logic, which is pretty simple in a technical sense, into which all of existing mathematics could theoretically be translated. However, it is so simple (i.e. technically non-expressive) that doing so would be excruciating in actual practice, and far more work than it would be worth. Mathematicians
pull this sleight of hand where they tell their students "this can be
rigorously justified in the language of ZFC set theory" when this would
be such an unfeasibly monumental task that their bluff can never be
called. The benefit is that first-order logic has a very easy-to-understand semantics (meaning), and it is easy to study, which has given rise to a huge field of mathematical logic that studies first-order logic, called model theory.
However, there is a group of workers who actually do work in complex formal languages all the time: programmers.
A programming language is a formal system. And because programmers want
their jobs to be easier, and their code to be better and cleaner and
easier to understand and less buggy, they are constantly clamoring for
better, more expressive programming languages in which it is easier to
express their thoughts. (This is a total lie that I'm telling for the purpose of my story. Programmers are human beings, and human beings hate change, so you have to force new programming languages and new ideas about programming languages down their throats half the time. Also, thinking is
hard, and if you give people languages in which it's possible to
express richer and more complex ideas, they might just be like "well,
having richer and more complex ideas seems like a lot of work." This
isn't even me being cynical, I don't like thinking hard more than a few
hours a day either!)
So programming
languages are much more complicated than mathematical languages. The
problem of determining the semantics of a programming language - what is
"meant" by all the types and terms and polymorphic operators and so on -
is much harder. This is what I study - the semantics of
programming languages, and how to use this as a tool to reason about
programs written in the language, and prove things about programming
languages.
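As a very small illustration of what "giving a semantics" can look like, here is a toy expression language - invented for this post, not any real language - together with a function assigning each program its meaning. Real programming languages need far more sophisticated machinery than this, which is where the categorical tools below come in:

```haskell
-- A toy programming language and its semantics: the function `eval` assigns
-- each piece of syntax a meaning (an integer or a boolean), or Nothing for
-- nonsensical programs like `If (Lit 3) ...`.

data Expr
  = Lit Int
  | Add Expr Expr
  | BoolLit Bool
  | If Expr Expr Expr
  deriving Show

data Value = IntV Int | BoolV Bool
  deriving (Show, Eq)

-- The semantic function: syntax in, meaning out.
eval :: Expr -> Maybe Value
eval (Lit n)     = Just (IntV n)
eval (BoolLit b) = Just (BoolV b)
eval (Add e1 e2) = case (eval e1, eval e2) of
  (Just (IntV m), Just (IntV n)) -> Just (IntV (m + n))
  _                              -> Nothing
eval (If c t e)  = case eval c of
  Just (BoolV True)  -> eval t
  Just (BoolV False) -> eval e
  _                  -> Nothing
```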
Some developments in mathematics
of this past century have helped to clarify how this should be done. A
modern perspective on linear algebra is that it is equally important to
study both vector spaces and linear transformations
between them. A great deal of the structure of vector spaces is
reflected in the study of maps which preserve that structure. Take, for
example, dimension, which is intimately related to the problem of
whether there exists an injective linear transformation between two
spaces.
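To make that last example precise (a standard fact, stated here for finite-dimensional spaces over the same field):

```latex
\text{there exists an injective linear map } T : V \to W
\quad\Longleftrightarrow\quad
\dim V \le \dim W .
```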
One school of thought is that linear
algebra is really the study of a huge, interconnected network or
directed graph, whose nodes are all of the vector spaces, and whose
edges or links are the linear transformations between them. A huge
hairball of a network, massively entangled. This hairball is called the category
of linear spaces and linear transformations. Category theory is a field
which studies branches of math through their "hairball", trying to see
how much of the subject can be understood, phrased or argued from the
point of view of the network. For example, whenever you have two sets, each equipped with a notion of distance between points (called a metric), as R^n is, you can formulate the epsilon-delta definition of continuity, and thus define a continuous function between
them. If we call a set equipped with a distance function a metric space, then there is a category whose nodes are all the metric spaces and whose edges are the continuous maps between them.
My
research deals with trying to understand a programming language as
having an associated "category", whose nodes are the types (integers,
booleans, arrays, structs, ...) and whose arrows are the functions
between them that can be coded in the language. The semantics of the programming
language - what a program "means" - I try to express in terms of this
graph. As the programming language grows more complex, the structural
requirements of the associated graph do, too. This is categorical logic.
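Haskell itself makes this point of view easy to see: its standard library has a Category class that packages exactly the "hairball" data (identity edges and composition of edges), and ordinary types and functions form one instance. The little functions below are made up for the sake of the example:

```haskell
-- Nodes are Haskell types, edges are Haskell functions; Control.Category's
-- class records the identity edge at each node and composition of edges.

import Prelude hiding (id, (.))
import Control.Category (Category (..))

-- The instance `Category (->)` already ships with base; spelled out, it is just:
--   id    = \x -> x
--   g . f = \x -> g (f x)

-- Two edges in the category of Haskell types and functions...
parse :: String -> Int
parse = length          -- a stand-in "parser" for the sake of the example

describe :: Int -> String
describe n = "got " ++ show n

-- ...and their composite, another edge from String to String.
pipeline :: String -> String
pipeline = describe . parse
```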
I
will attach some notes if you're curious. Be warned that category
theory is infamous for how abstract it is; it makes linear algebra and
calculus look downright concrete. Most students of mathematics don't
study it until grad school, when they have picked up enough of the
terrain of mathematics that they are able to understand the purpose of
all this abstraction and have seen enough examples that they can root
the abstraction in their own personal experiences.
There
are a couple very closely related areas which I also find interesting.
Often when dealing with a specific concrete problem or class of
problems, it becomes necessary or useful to introduce a customized logic
which reflects the subtleties of the issue at hand. For example, if you
were investigating some dynamical system which was evolving in time,
you might find it useful to invent a logical system for which time was
an explicit part of the logic, and you could have "next", "now",
"later", "eventually", and "always" as modifiers in your logic. Instead
of just "P(x)" or "not P(x)" you could have "eventually, P(x)" or
"always P(x)" or "never P(x)" Similarly you could design a logical
system which allowed us to deal with incoming information which could
change our perspective. In addition to firmly concluding "P(x)" or "not
P(x)", you could add qualifiers to deal with the effects of new
information not yet seen - "necessarily P(x)" or "possibly P(x)". This
is called modal logic.
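Here is a small Haskell sketch - my own toy encoding over finite traces, not a real temporal-logic library - of how such modifiers turn a property of a single state into a property of a whole evolution of the system:

```haskell
-- Temporal modifiers interpreted over a finite trace of states.

type Trace a = [a]          -- the states of the system, in time order
type Prop a  = a -> Bool    -- an ordinary property of a single state

eventually :: Prop a -> Trace a -> Bool
eventually p trace = any p trace     -- p holds at some point

always :: Prop a -> Trace a -> Bool
always p trace = all p trace         -- p holds at every point

never :: Prop a -> Trace a -> Bool
never p trace = not (any p trace)    -- p holds at no point

-- "next": p holds in the state immediately after the current one.
next :: Prop a -> Trace a -> Bool
next p (_ : s : _) = p s
next _ _           = False

-- Example: along the trace [1,2,3,4], the value eventually exceeds 3,
-- is always positive, and is never negative.
demo :: (Bool, Bool, Bool)
demo = ( eventually (> 3) [1, 2, 3, 4 :: Int]
       , always (> 0)     [1, 2, 3, 4 :: Int]
       , never  (< 0)     [1, 2, 3, 4 :: Int] )
```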
The central method of
set theory revolves around such a custom logic, which is a logic
dedicated to building and constructing things in stages, reasoning about
them as they are built. So the logic doesn't just revolve around a set of "things"; it has both a set of "things" and partial approximations to those things, or partial specifications of those things. The logic is meant to help relate what we know about the object we're building at finite initial stages to what the properties of the finished object would be. This is called "forcing logic."
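Here is a very rough Haskell sketch of the flavor of "partial approximations", and nothing more than the flavor; all of the names are my own. The finished objects are infinite binary sequences, the approximations are finite initial segments, and the key notion is a requirement that every approximation can be extended to meet:

```haskell
-- Finite initial segments as partial approximations to an infinite binary
-- sequence being built in stages.

type Approx = [Bool]   -- a finite initial segment of the object being built

-- q carries at least as much information as p.
extends :: Approx -> Approx -> Bool
extends q p = take (length p) q == p

-- The requirement "somewhere beyond the current stage, the sequence takes the
-- value True": every approximation p can be extended to one that secures it.
extendToHitTrue :: Approx -> Approx
extendToHitTrue p = p ++ [True]

-- A sanity check of that claim on one particular approximation.
check :: Bool
check = let p = [False, False, True]
            q = extendToHitTrue p
        in  q `extends` p && or (drop (length p) q)
```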
There are
also logics that arise in geometry. If you had some space like R^n, and were studying functions on it (say, continuous or smooth functions from R^n into the reals), there might be a difference between "f : R^n -> R has property P" and "f : R^n -> R has property P, locally", i.e., at every point x in R^n there is some open ball containing x where P(f) holds if you restrict the domain of f to that open ball. For example, you could take "f is bounded", i.e. |f(x)| < M for some M and for all x, vs. "f is locally bounded" - for every x there is some open ball U containing x and some M with |f(y)| < M for all y in U. Here, then, "locally"
becomes a modifier of property P in the same way that "eventually P"
and "possibly P" modify P, in a way that is unintelligible to classical
logic. Call this "geometric logic."
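Written out for f : R^n -> R, the two properties are:

```latex
% f is bounded: one bound M works everywhere.
\exists M \;\forall x \in \mathbb{R}^n, \; |f(x)| \le M
% f is locally bounded: each point has an open ball on which some bound works.
\forall x \in \mathbb{R}^n \;\exists \text{ an open ball } U \ni x \;\exists M \;\forall y \in U, \; |f(y)| \le M
```

The entire difference is where the "for every point" quantifier sits relative to the choice of bound.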
Here is
another example. In homework you often argue as follows. "We will show
that this system of equations has a solution. Suppose, for the sake of a
contradiction, that it does not.... (proof proof proof)... but this is
impossible, because we can derive this absurdity, 0=1! So there must,
indeed, be a solution to the system of equations." Say an engineer hired
you to find a solution to this system of equations, and you delivered
them this. What would they say? What use is this to them? It claims that there exists a solution but does not tell them what the solution is. How bizarre! The proof is devoid of numerical meaning!
We have shown that classical logic fails to have the following
desirable property:
1. Whenever we prove that there exists some x with property P, we construct a specific x with property P.
It also fails to have this (sometimes desirable) property:
2. Whenever we prove that "A or B" is true, we have a proof of A or a proof of B.
Indeed,
you used many times in your homework that "A or not A" must be true,
even before you had found a proof of A or a proof of not A.
A constructive logic (or intuitionistic
logic) is a logic which drops some of the permitted nonconstructive
proof techniques like proof by contradiction or the law of the excluded
middle in order to make the statements that it proves acquire more
numerical meaning.
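One standard way to see property 2 concretely is through the Curry-Howard correspondence (not discussed above, but closely tied to the programming-language-semantics story): a constructive proof of "A or B" is literally a value of type Either a b, which tells you which side holds, and "not A" is a function from a into the empty type. A sketch in Haskell:

```haskell
-- Constructive "or" as Either, constructive negation as a map into Void.

import Data.Void (Void)

type Not a = a -> Void

-- The law of excluded middle, "A or not A", would be a value of this type for
-- every a. No such program can be written: there is no general way to decide
-- which side to produce.
-- lem :: Either a (Not a)
-- lem = ???

-- Its double negation, however, *is* constructively provable:
nnLem :: Not (Not (Either a (Not a)))
nnLem k = k (Right (\a -> k (Left a)))
```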