
III Quantum Computation

1 Classical computation theory

To appreciate the difference between quantum and classical computing, we need

to first understand classical computing. We will only briefly go over the main

ideas instead of working out every single technical detail. Hence some of the

definitions might be slightly vague.

We start with the notion of “computable”. To define computability, one
has to come up with a sensible mathematical model of a computer, and then
“computable” means that this theoretical computer can compute it. So far, all the
sensible mathematical models of computation we have managed to come up with are
equivalent, so we can just pick any one of them. Consequently, we will not spend
much time working out a technical definition of computable.

Example. Let N be an integer. We want to figure out if N is a prime. This is
clearly computable, since we can try all numbers less than N and see if any of
them divides N.
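A minimal Python sketch of this trial-division idea (illustrative only, not an efficient primality test):

```python
def is_prime(N: int) -> bool:
    """Decide primality by trying every candidate divisor below N."""
    if N < 2:
        return False
    for k in range(2, N):  # try all numbers less than N
        if N % k == 0:     # k divides N, so N is composite
            return False
    return True

print(is_prime(17))  # True: no number in 2..16 divides 17
print(is_prime(15))  # False: 3 divides 15
```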

This is not too surprising, but it turns out there are some problems that are

not computable! Most famously, we have the Halting problem.

Example (Halting problem). Given the code of a computer program, we want
to figure out if the computer will eventually halt. In 1936, Turing proved that
this problem is uncomputable! So we cannot have a program that determines if
an arbitrary program halts.
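Turing's proof is by a diagonalization argument: if a halting tester existed, we could build a program that halts exactly when it "shouldn't". A hedged sketch in Python (`halts` and `troublemaker` are illustrative names, not real APIs; no such `halts` can actually be implemented):

```python
def halts(program, arg) -> bool:
    """Hypothetical: return True iff program(arg) eventually halts.
    Turing's theorem says no such function can exist."""
    raise NotImplementedError  # placeholder for the impossible tester

def troublemaker(program):
    # Halt exactly when `program`, run on its own code, does NOT halt.
    if halts(program, program):
        while True:     # loop forever
            pass
    return              # halt immediately

# Asking whether troublemaker halts on itself is contradictory:
# if halts(troublemaker, troublemaker) were True, it would loop forever;
# if it were False, it would halt at once. So `halts` cannot exist.
```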

For a less arbitrary problem, we have

Example. Given a polynomial with integer coefficients in many variables,
e.g. 2x^2 y − 17zw^19 + x^5 w^3 + 1, does this have a root in the integers? It was
shown in 1970 that this problem is uncomputable as well!

These results are all for classical computing. If we expect quantum computing
to be somehow different, can we get around these problems? This turns out not to
be the case, for the very reason that the laws of quantum physics (e.g. state
descriptions, evolution equations) are all computable on a classical computer
(in principle). So it follows that quantum computing, being a quantum process,
cannot compute any classically uncomputable problem.

Despite this limitation, quantum computation is still interesting! In practice,
we do not only care about computability. We care about how efficiently we can do
the computation. This is the problem of complexity — the complexity of
a quantum computation might be much lower than that of its classical counterpart.

To make sense of complexity, we need to make our notion of computation a
bit more precise.

Definition (Input string). An input bit string is a sequence of bits x = i_1 i_2 ··· i_n,
where each i_k is either 0 or 1. We write B_n for the set of all n-bit strings, and
B = ⋃_{n∈N} B_n. The input size is the length n. So in particular, if the input is
regarded as an actual number, the size is not the number itself, but its logarithm.
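For instance, in Python the input size of a number, viewed as a bit string, is its bit length, roughly log_2 of the number, not the number itself:

```python
N = 1_000_003
n = N.bit_length()  # bits needed to write N in binary, about log2(N)
print(n)            # 20, even though N itself is about a million
```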

Definition (Language). A language is a subset L ⊆ B.

Definition (Decision problem). Given a language L, the decision problem is to
determine whether an arbitrary x ∈ B is a member of L. The output is thus 1
bit of information, namely yes or no.

Of course, we can have a more general task with multiple outputs, but for

simplicity, we will not consider that case here.

Example. If L is the set of all prime numbers, then the corresponding decision
problem is determining whether a number is prime.
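As a sketch, the decision problem for this language takes a bit string, decodes it as a number, and tests membership (using naive trial division purely for illustration):

```python
def is_prime(N: int) -> bool:
    # Naive primality check, for illustration only
    return N >= 2 and all(N % k for k in range(2, N))

def in_language(x: str) -> bool:
    """Decide membership of bit string x in L = {binary encodings of primes}."""
    return is_prime(int(x, 2))

print(in_language("101"))   # "101" encodes 5, a prime -> True
print(in_language("1000"))  # "1000" encodes 8, composite -> False
```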

We also have to talk about models of computations. We will only give an

intuitive and classical description of it.

Definition (Computational model). A computational model is a process with
discrete steps (elementary computational steps), where each step requires a
constant amount of effort/resources to implement.

If we think about actual computers that work with bits, we can imagine a

step as an operation such as “and” or “or”. Note that addition and multiplication

are not considered a single step — as the number gets larger, it takes more effort

to add or multiply them.

Sometimes it is helpful to allow some randomness.

Definition (Randomized/probabilistic computation). This is the same as a usual
computational model, but the process also has access to a string r_1, r_2, r_3, ···
of independent, uniform random bits. In this case, we will often require the
answer/output to be correct with “suitably good” probability.

In computer science, there is a separate notion of “non-deterministic” computation,
which is different from probabilistic computation. In probabilistic
computation, every time we ask for a random number, we just pick one of the
possible outputs and follow that. With a non-deterministic computer, we
simultaneously consider all possible choices with no extra overhead. This is extremely
powerful, and also obviously physically impossible, but it is a convenient thing
to consider theoretically.

Definition (Complexity of a computational task (or an algorithm)). The complexity
of a computational task or algorithm is the “consumption of resources as
a function of input size n”. The resources are usually the time

T(n) = number of computational steps needed,

and space

Sp(n) = amount of memory/work space needed.

In each case, we take the worst-case input of a given size n.

We usually consider the worst-case scenario, since, e.g. for primality testing,

there are always some numbers which we can easily rule out as being not

prime (e.g. even numbers). Sometimes, we will also want to study the average

complexity.

In the course, we will mostly focus on the time complexity, and not work

with the space complexity itself.

As one would imagine, the actual time or space taken would vary a lot with the

actual computational model. Thus, the main question we ask will be whether

T (n) grows polynomially or super-polynomially (“exponentially”) with n.

Definition (Polynomial growth). We say T(n) grows polynomially, and write

T(n) = O(poly(n)) = O(n^k)

for some k, if there is some constant c, some integer k and some integer n_0
such that T(n) < cn^k for all n > n_0.

The other possible cases are exponential growth, e.g. T(n) = c_1 2^{c_2 n}, or
super-polynomial and sub-exponential growth such as T(n) = 2^{√n} or n^{log n}.

We will usually regard polynomial time processes as “feasible in practice”,
while super-polynomial ones are considered “infeasible”. Of course, this is not
always actually true. For example, we might have a polynomial time of n^{10^{10^{10}}},
or an exponential time of 2^{0.0000...0001n}. However, this distinction of polynomial
vs non-polynomial is robust, since any computational model can “simulate” other
computational models in polynomial time. So if something is polynomial in one
computational model, it is polynomial in all models.

In general, we have more refined complexity classes of decision problems:

(i) P (polynomial time): The class of decision problems having a deterministic
polynomial-time algorithm.

(ii) BPP (bounded error, probabilistic polynomial time): The class of decision
problems having probabilistic polynomial-time algorithms such that for
every input,

Prob(answer is correct) ≥ 2/3.

The number 2/3 is sort of arbitrary — we see that we cannot put 1/2, or
else we could just guess the answer at random. So we need something greater
than 1/2, and “bounded” refers to it being bounded away from 1/2. We could
replace 2/3 with any other constant 1/2 + δ with 0 < δ < 1/2, and BPP is the
same. This is because if we have a (1/2 + δ)-algorithm, we simply repeat the
algorithm K times, and take the majority vote. By the Chernoff bound (a
result in probability), the probability that the majority vote is correct is
> 1 − e^{−2δ²K}. So as we do more and more runs, the probability of getting
the right answer approaches 1 exponentially fast. This can be made bigger than
1 − ε by choosing a suitably large K. Since K times a polynomial time is still
polynomial time, we still have a polynomial time algorithm.
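The majority-vote amplification can be checked empirically. Below is a sketch that simulates a hypothetical noisy algorithm which is correct with probability 1/2 + δ, repeats it K times, and takes the majority (the exact guarantee 1 − e^{−2δ²K} is the Chernoff bound; the numbers here are illustrative):

```python
import random

def noisy_answer(delta: float) -> bool:
    # One run of a BPP-style algorithm: correct with probability 1/2 + delta
    return random.random() < 0.5 + delta

def majority_vote(delta: float, K: int) -> bool:
    # Repeat K times and take the majority; True means the majority is correct
    correct = sum(noisy_answer(delta) for _ in range(K))
    return correct > K / 2

random.seed(0)  # reproducible illustration
delta, K, trials = 0.1, 101, 2000
success = sum(majority_vote(delta, K) for _ in range(trials)) / trials
print(success)  # close to 1, well above the single-run rate of 0.6
```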

These two are often considered “classically feasible computations”, or
“computable in practice”. In the second case, we tolerate small errors, but that is fine

in practice, since in genuine computers, random cosmic rays and memory failures

can also cause small errors in the result, even for a deterministic algorithm.

It is clear that P is contained in BPP, but we do not know about the other
direction. It is not known whether P and BPP are the same — in general, not
much is known about whether two complexity classes are the same.

Example (Primality testing). Let N be an integer. We want to determine if it
is prime. The input size is log_2 N. The naive method of primality testing is to
test all numbers and see if any divides N. We only need to test up to √N, since if
N has a factor, there must be one below √N. This is not polynomial time: we need
√N = 2^{(1/2) log N} operations, so this is exponential in the input size.
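The growth of √N in terms of the input size n = log_2 N can be seen numerically: since √N = 2^{n/2}, doubling the input size squares the step count. A small Python check:

```python
import math

for n in (10, 20, 40, 80):    # input size in bits
    N = 2 ** n
    steps = math.isqrt(N)     # about sqrt(N) = 2^(n/2) trial divisions
    print(n, steps)
# doubling n squares the step count: 2^5, 2^10, 2^20, 2^40
```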

How about a probabilistic algorithm? We can choose a random k < N, and
see if k divides N. This is a probabilistic, polynomial-time algorithm, but it is
not bounded, since the probability of getting a correct answer is not > 1/2.
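To see how far from bounded this is, count how many choices of k actually divide a composite N with few factors (N = 101 · 103 is an illustrative choice; the naive test only answers “composite” when k happens to divide N):

```python
N = 101 * 103  # a composite number with very few divisors
witnesses = sum(1 for k in range(2, N) if N % k == 0)
print(witnesses, "of", N - 2, "choices of k detect compositeness")
# Only 2 of 10401 choices reveal that N is composite, so the
# success probability is nowhere near 1/2, let alone bounded above it.
```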

In reality, primality testing is known to be in

BPP

(1976), and it is also

known to be in P (2004).

Finally, we quickly describe a simple model of (classical) computation that we

will use to build upon later on. While the most famous model for computation

is probably the Turing machine, for our purposes, it is much simpler to work

with the circuit model.

The idea is simple. In general, we are working with bits, and a program is a
function f : B_m → B_n. It is a mathematical fact that any such function can be
constructed by combinations of boolean AND, OR and NOT gates. We say that
this is a universal set of gates. Thus a “program” is a specification of how to
arrange these gates in order to give the function we want, and the time taken by
the circuit is simply the number of gates we need.
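As a small illustration of universality, XOR is not itself in the set {AND, OR, NOT}, but it can be assembled from a constant number of those gates. A Python sketch with booleans as bits:

```python
def AND(a, b): return a and b
def OR(a, b):  return a or b
def NOT(a):    return not a

def XOR(a, b):
    # (a OR b) AND NOT (a AND b), built only from the universal set
    return AND(OR(a, b), NOT(AND(a, b)))

for a in (False, True):
    for b in (False, True):
        print(int(a), int(b), int(XOR(a, b)))
# truth table: 0 0 0 / 0 1 1 / 1 0 1 / 1 1 0
```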

Of course, we could have chosen a different universal set of gates, and the
programs would be different. However, since only a fixed number of gates is
needed to construct AND, OR and NOT from any universal set, and vice versa,
it follows that the difference in time is always just polynomial.