III Logic - Computability theory

3Computability theory

III Logic

3.1 Computability

We begin by discussing some number theory. Consider the Diophantine equation

+ y

= z

. (∗)

We can certainly write down some solutions to this equation, e.g.

+ 4

= 5

But is it possible to find all solutions? It is a well-known result that we can. Of

course, to do so, it suffices to enumerate all solutions where

x, y, z

are coprime

It turns out every such solution of (∗) can be written as

x = m

− n

, y = 2mn, z = m

+ n

for some

m, n ∈ N

, and conversely, for any

m, n ∈ N

, the (

x, y, z

) given by this

formula is a solution. Thus, if we want to list all solutions to (

∗

), then it suffices

to plug in all possible values of

and

. In other words, we have an algorithm

that enumerates all solutions to (∗).

Given any Diophantine equation, it would be great to have some similar

algorithm that enumerates all solutions to the equation. But even before that, it

would be great to have some algorithms that tell us whether there is any solution

at all. This was Hilbert’s 10th problem — is there an algorithm that determines

whether or not a Diophantine equation has a solution?

If such an algorithm existed, then to prove this, we merely have to exhibit

such an algorithm, and prove that it works? But how could we prove that there

isn’t one. To do so, we must formulate an “algorithm” as a mathematical object.

Alternatively, we want a formal notion of what is a “computable” function.

One way to do so is to define “computable functions” recursively. We will be

concerned with functions

→ N

for different

and

, and the idea is to list

some functions that any sensible computer should be able to compute, and then

close it under operations that really should preserve computability. We then

hope that the set of functions produced this way is exactly the functions that

can be “computed” by an “algorithm”.

We start with a naive attempt, and what we obtain is known as “primitive

recursive” functions.

Definition

(Primitive recursive functions)

The class of primitive recursive

functions N

→ N

for all n, m is defined inductively as follows:

– The constantly zero function f(x) = 0, f : N

→ N is primitive recursive.

–

The successor function

succ

N → N

sending a natural number to its

successor (i.e. it “plus one”) is primitive recursive.

– The identity function id : N → N is primitive recursive.

– The projection functions

proj

(x) : N

, · · · , x

) x

are primitive recursive.

Moreover,

–

Let

→ N

and

, · · · , g

→ N

be primitive recursive. Then

the function

, · · · , x

) 7→ f(g

, · · · , x

), · · · , g

, · · · , x

)) : N

→ N

is primitive recursive.

Finally, we have closure under primitive recursion

–

→ N

and

m+k+1

→ N

are primitive recursive, then so is

the function h : N

k+1

→ N

defined by

h(0, x) = g(x)

h(succ n, x) = f(h(n, x), n, x).

Do these give us everything we want? We first notice that we can define

basic arithmetic using primitive recursion by

plus(0, n) = n

plus(succ m, n) = succ(plus(m, n))

mult(1, n) = n

mult(succ m, n) = plus(n, mult(m, n)),

Similarly, we can define exp etc.

What else can we do? We might want to define the Fibonacci function

Fib(0) = Fib(1) = 1, Fib(n + 1) = Fib(n) + Fib(n − 1)?

But the definition of primitive recursion only allows us to refer to

(

n, x

) when

computing

(

succ n, x

). This isn’t really a problem. The trick is to not define

Fib directly, but to define the function

Fib

(n) =



Fib(n)

Fib(n − 1)



instead, using primitive recursion, and then projecting back to the first compo-

nent. But this requires us to fix the number of things we are required to refer

back to. What if we want to refer back to everything less than n?

The solution to this is that we can actually encode pairs as natural numbers

themselves:

Proposition.

There exists primitive recursion functions

pair

→ N

and

unpair : N → N

such that

unpair(pair(x, y)) = (x, y)

for all x, y ∈ N.

Proof. We can define the pairing function by

pair(x, y) =



x + y + 1



+ y.

The unpairing function can be shown to be primitive recursive, but is more

messy.

The benefit of this is that we can now encode lists as naturals. What would

it mean to be a list? We will informally denote a list by square brackets, e.g.

, · · · , x

]. The idea is that we have functions head, tail, cons such that

head [x

, · · · , x

] = x

tail [x

, · · · , x

] = [x

, · · · , x

]

cons(x, [x

, · · · , x

]) = [x, x

, · · · , x

]

If we think a bit harder, then what we are really looking for is really the following

property:

Corollary.

There exists

cons

N → N, head

N → N

and

tail

N → N

such that

cons(head x, tail x) = x

Proof.

Take

cons

pair

;

head

to be the first projection of the unpairing and

tail

to be a second projection.

So we can encode lists as well. We can lock ourselves in a room with enough

pen and paper, and try to prove that most functions

N → N

we encounter “in

daily life” are indeed primitive recursive.

However, it turns out there are some contrived examples that really should

be computable, but not primitive recursive.

Definition (Ackermann function). The Ackermann function is defined to be

A(0, n) = n + 1

A(m, 0) = A(m − 1, 1)

A(m + 1, n + 1) = A(m, A(m + 1, n)).

Proposition. The Ackermann function is well-defined.

Proof.

To see this, note that computing

(

+ 1

, n

+ 1) requires knowledge of

the values of A(m + 1, n), and A(m, A(m + 1, n)).

Consider

N × N

ordered lexicographically. Then computing

(

+ 1

, n

+ 1)

requires knowledge of the values of

at pairs lying below

+ 1

, n

+ 1

in this

order. Since the lexicographic order of

N × N

is well-ordered (it has order type

ω × ω), by transfinite induction, we know A is well-defined.

We can indeed use this definition to compute

, and it is going to take

forever. But we can still do it.

However, this function is not primitive recursive! Intuitively, this is because

the definition of

does “transfinite recursion over

ω × ω

”, while the definition

of primitive recursion only allows us to do “transfinite recursion over ω”.

That is the idea, but obviously not a proof. To prove it properly, this is done

by proving that the Ackermann “grows too fast”.

Definition

(Dominating function)

Let

f, g

N → N

be functions. Then we

write

f < g

if for all sufficiently large integer

, we have

(

)

< g

(

). We say

dominates f.

It is a remarkable theorem that

Theorem.

The function

n 7→ A

(

n, n

) dominates all primitive recursive functions.

We will not prove this, as it is messy, and not really the point of the course.

The point of bringing up the Ackermann function is just to know that we are

not doing good enough!

The obvious fix would be to allow some more complicated form of recursion,

and hope that we will be safe. But this doesn’t really do the job.

It turns out the problem is that we are only defining total functions. If we

tried to compute any of the primitive recursive functions, we are guaranteed

that we can do it in finite time. In particular, our functions are always defined

everywhere. But anyone with programming experience knows very well that

most programs we write are not everywhere defined. Often, they are nowhere

defined, and just doesn’t run, or perhaps gets stuck in a loop for ever! But we

did write some code for this program, so surely this nowhere defined function

should be computable as well!

Our way of thinking about functions that are not everywhere defined is via

non-termination. We imagine we have a computer program trying to compute

a function

. If

(42) is undefined, instead of the program running for a while

and then saying “I give up”, the program keeps running forever,

It turns out we can “fix” the problem merely by introducing one operation —

inverses. This is slightly subtle. Imagine we are writing a compute program, and

suppose we have a (computable) function

N → N

. We want to compute an

inverse to

. Say we want to find

−1

(17). To compute this, we just compute

(0),

(1),

(2), etc., and if 17 is in the image, then we will eventually find

something that gets sent to 17.

There are two minor problems and one major problems with this:

– f

might not be injective, and there are multiple values that get sent to 17.

If this happens, we just agree that we return the first natural number that

gets sent to 17, which is what is going to happen if we perform the above

procedure.

– f

might not be surjective, and we can never find something that gets sent

to 17. But this is not a problem, since we just decided that our functions

don’t have to be total! If

is not surjective, we can just let the program

keep running forever.

But there is still a major problem. Since we decided functions need not be total,

what happens if

is total? We might have

(23) = 17, but

(11) never halts.

So we never learn that f(23) = 17.

We might think we can get around this problem by “diagonalizing”. We

assume our computers perform computations for discrete time steps. Then we

compute

(1) for one step; then

(1) and

(2) for two steps; then

(1),

(2)

and

(3) for three steps, etc. Then if

(23) = 17, and it takes 100 steps of

computation to compute this, then we will figure this out on the 100th iteration

of this process.

This doesn’t really solve our problem, though. If

is not injective, then this

doesn’t guarantee to return the minimum

such that

(

) = 17. It just returns

some arbitrary x such that f (x) = 17. This is bad.

So we make a compromise. The obvious compromise is to just allow computing

inverses of total functions, but since total functions are hard to invert, we shall

make a further compromise to just allow computing inverse of primitive recursive

functions.

Before we continue, we define some useful conventions:

Notation.

We write

(

)

↑

(

) is undefined, or alternatively, after we define

what this actually means, if the computation of

(

) doesn’t halt. We write

f(x) ↓ otherwise.

Definition

(Partial recursive function)

The class of partial recursive functions

is given inductively by

– Every primitive recursive function is partial recursive.

–

The inverse of every primitive recursive function is partial recursive, where if

k+n

→ N

, then the/an inverse of

is the function

−1

k+m

→ N

given by

−1

(x; y) =

(

min{z : f(x, z) = y} if exists

↑ otherwise

–

The set of partial recursive functions is closed under primitive recursion

and composition.

Of course, this allows us to compute the inverse of functions of several inputs

using pairing and unpairing functions.

It turns out this definition is “correct”. What do we mean by this?

What we have described so far is syntactic declarations of functions. We

describe how we can build up our functions. We can also define computability

semantically, by describing some rules for constructing machines that compute

functions. In other words, we actually define what a “computer” is, and then a

function is computable if it can be computed by a computer. What we want to

prove is that the partial recursive functions are exactly the functions that can

be computed by a machine.

We will not go into details about how we can define these machines. It is pos-

sible to do so; it doesn’t really matter; and the actual details are messy. However,

any sensible construction of machines will satisfy the following description:

–

A machine can be described by a finite string. In particular, it is possible

to number all the possible machines. Thus, each machine can be specified

by a G¨odel number uniquely. We write

{m}

for the machine corresponding

to the numbering m.

–

A machine has a countable (enumerated) number of possible states, and

one of them is called

HALT

. A machine also has a register that can store a

natural number.

–

A machine can take in a fixed number of inputs. Afterwards, in every

discrete time step, it (potentially) changes the content of its register, and

moves to a new state. However, if the current state is

HALT

, then it does

nothing.

For inputs x

, · · · , x

into the machine {m}, we write

{m}(x

, · · · , x

) =

(

value in the register after halting if machine halts

↑ otherwise

Thus, we will also write {m} for the function computed by {m}.

–

The above process is “computable” in the following precise sense — we

define Kleene’s

function

(

m, i, t

) :

→ N

whose value encodes (in a

primitive recursive way) the states and register contents during the first

steps in the evaluation of

{m}

(

). Then

is a primitive recursive function.

–

For any partial recursive function

, there is a machine

{m}

such that

{m}(x) = f(x) for all x ∈ N. In particular {m}(x) always halts.

Now of course, given our requirements for what a “machine” should be, it is

absolutely obvious that functions computable by programs in the above sense is

exactly the same as the functions that are partial recursive. The only perhaps

slightly non-trivial part is to show that every function computed by a machine

is partial recursive. To prove this, given a machine

{m}

, inversion gives us the

function

g(i) = min{t : T (m, i, t) has final state HALT},

and then we can define

(

) by computing

(

m, i, g

(

)) and then extracting the

process. Then h(i) = {m}(i), and so {m} is a partial recursive function.