3Some quantum algorithms
III Quantum Computation
3.3 Shor’s algorithm
All that was a warm up for Shor’s algorithm. This is a quantum algorithm that
factorizes numbers in p olynomial time. The crux of the algorithm will be a
modified version of the quantum Fourier transform.
The precise statement of the problem is as follows — given an integer
N
with
n
=
log N
digits, we want to find a factor 1
< K < N
. Shor’s algorithm will
achieve this with constant probability (1
− ε
) in
O
(
n
3
) time. The b est known
classical algorithm is e
O(n
1/3
(log n)
2/3
)
.
To do this, we will use the periodicity algorithm. However, there is one
subtlety involved. Instead of working in
Z/nZ
, we need to work in
Z
. Since
computers cannot work with infinitely many numbers, we will have to truncate
it somehow. Since we have no idea what the period of our function will be, we
must truncate it randomly, and we need to make sure we can control the error
introduced by the truncation.
We shall now begin. Given an
N
, we first choose some 1
< a < N
uniformly
randomly, and compute
hcf
(
a, N
). If it is not equal to 1, then we are finished.
Otherwise, by Euler’s theorem, there is a least power
r
of
a
such that
a
r
≡
1
mod N
. The number
r
is called the order of
a
mod
N
. It follows that the
function
f
:
Z → Z/N Z
given by
f
(
k
) =
a
k
mod N
has period
r
, and is injective
in each period.
Note that
f
(
k
) can be efficiently computed in
poly
(
log k
) time, by repeated
squaring. Also note that classically, it is hard to find
r
, even though
f
has a
simple formula!
It was known to Legendre in 1800 that knowing
r
means we can factor
n
.
Suppose we can find r, and further suppose r is even. Then we have
a
r
− 1 ≡ (a
r/2
+ 1)(a
r/2
− 1) ≡ 0 (mod N).
So
N
exactly divides the product. By minimality of
r
, we know
N
does not
divide
a
r/2
−
1. So if
N
does not divide
a
r/2
+ 1 as well, then
hcf
(
N, a
r/2
±
1)
are nontrivial factors of N .
For this to work, we needed two assumptions –
r
is even, and
a
r/2
6≡ −
1
(
mod N
). Fortunately, there is a theorem in number theory that says if
N
is
odd and not a prime power, and
a
is chosen uniformly at random, then the
probability that these two things happen is at least
1
2
. In fact, it is
≥
1
−
1
2
m−1
,
where m is the number of prime factors of N.
So if we repeat this
k
times, the probability that they all fail to give a factor
is less than
1
2
k
. So this can be as small as we wish.
What about the other possibilities? If
N
is even, then we would have noticed
by looking at the last digit, and we can just write down 2. If
N
=
c
`
for
c, ` >
2,
then there is a classical p olynomial time algorithm that outputs
c
, which is a
factor. So these are the easy cases.
Everything we’ve done so far is classical! The quantum part comes in when
we want to compute
r
. We know that
f
(
k
) =
a
k
is periodic on
Z
, which is an
infinite domain. So we cannot just apply our periodicity algorithm.
By number theory, we know that
r
is at most
N
. But other than that, we
have no idea what
r
actually is, nor do we know of any multiple of
r
. So we
cannot apply the periodicity argument directly. Instead, we pick a big number
2
m
, and work on the domain
D
=
{
0
,
1
, ··· ,
2
m
−
1
}
=
Z/
2
m
Z
. How do we
choose
m
? The idea is that we want 0
, ··· ,
2
m
−
1 to contain
B
full periods,
plus some extra “corrupt” noise b, so
2
m
= Br + b,
with 0
≤ b < r
. Since we want to separate out the periodicity information from
the corrupt noise, we will want
b
to be relatively small, compared to
Br
. We
know the size of
b
is bounded by
r
, hence by
N
. So we need 2
m
to be “much
larger” than
N
. It turns out picking 2
m
> N
2
is enough, and we will pick
m
to
be the smallest number such that this holds.
We now study the effect of corruption on the periodicity algorithm. We again
make the state
fi =
1
√
2
m
X
xif (x)i.
and measure the value of f. We then get
peri =
1
√
A
A−1
X
k=0
x
0
+ kri,
where
A
=
B
or
B
+ 1, depending on whether
x
0
≤ b
or not. As before, we
apply QFT
2
m
to obtain
QFT
2
m
peri =
2
n
−1
X
c=0
˜
f(c) ci.
When we did this before, with an exact period, most of the
ˆ
f
(
c
) is zero. However,
this time things are a bit more messy. As before, we have
˜
f(c) =
ω
cx
0
√
A
√
2
m
[1 + α + ··· + α
A−1
], α = e
2πicr/2
m
.
The important question is, when we measure this, which
c
’s will we see with
“good probability”? With exact periodicity, we knew that
2
m
r
=
A
is an exact
integer. So
˜
f
(
c
) = 0 except when
c
is a multiple of
A
. Intuitively, we can think
of this as interference, and we had totally destructive and totally constructive
interference respectively.
In the inexact case, we will get constructive interference for those
c
such that
the phase
α
is c lose to 1. These are the
c
’s with
cr
2
m
nearest to integers
k
, and
the powers up to
α
A−1
don’t spread too far around the unit circle. So we avoid
cancellations.
So we look at those special
c
’s having this particular prop erty. As
c
increases
from 0 to 2
m
−
1, the angle
cr
2
m
increments by
r
2
m
each time from 0 up to
r
. So
we have c
k
’s for each k = 0, 1, ··· , r − 1 such that
c
k
r
2
m
− k
<
1
2
·
r
2
m
.
In other words, we have
c
k
− k
2
m
r
<
1
2
.
So the c
k
are the integers nearest to the multiples of 2
m
/r.
In
˜
f
(
c
), the
α
’s corresponding to the
c
k
’s have the smallest phases, i.e. nearest
to the positive real axis. We write
c
k
r
2
m
= k + ξ,
where
k ∈ Z, ξ <
1
2
r
2
m
.
Then we have
α
n
= exp
2πi
c
k
r
2
m
n
= exp (eπi(k + ξ)n) = exp(2πiξn)
Now for
n < A
, we know that

2
ξn < π
, and thus 1
, α, α
2
, ··· , α
A−1
all lie in
the lower half plane or upper half plane.
Doing all the algebra needed, we find that if
QFT peri
is measured, then for
any c
k
as above, we have
Prob(c
k
) >
γ
r
,
where
γ =
4
π
2
≈ 0.4.
Recall that in the exact periodicity case, the points
c
k
hit the integers exactly,
and instead of γ we had 1. The distribution of the c’s then look like:
With inexact periods, we obtain something like
Now how do we get r from a c
k
? We know
c
k
2
m
−
k
r
<
1
2
m+1
<
1
2N
2
.
We claim that there is at most 1 fraction
k
r
with denominator
< N
such that
this inequality holds. So this inequality do es uniquely determine k/r.
Indeed, suppose
k
r
and
k
0
r
0
both work. Then we have
k
0
r
0
−
k
r
=
k
0
r − r
0
k
rr
0
>
1
rr
0
>
1
N
2
.
However, we also have
k
0
r
0
−
k
r
≤
k
0
r
0
−
c
k
2
m
+
c
k
2
m
−
k
r
<
1
2N
2
+
1
2N
2
=
1
N
2
.
So it follows that we must have
k
0
r
0
=
k
r
.
We introduce the notion of a “good”
c
k
value, which is when
k
is coprime to
r. The probability of getting a good c
k
is again
O(1/ log log r) > O(1/ log log N).
Note that this is the same rate as the case of exact periodicity, since we have
only lost a constant factor of
γ
! If we did have such a
c
k
, then now
r
is uniquely
determined.
However, there is still the problem of finding
r
from a good
c
k
value. At this
point, this is just classical number theory.
We can certainly try all
k
0
r
0
with
k
0
< r
0
< N
and find the closest one to
c
k
/
2
m
, but there are
O
(
N
2
) fractions to try, but we want a
O
(
poly
(
log N
))
algorithm. Indeed, if we were to do it this way, we might as well try all numbers
less than N and see if they divide N . And this is just O(N )!
The answer comes from the nice theory of continued fractions. Any rational
number
s
t
< 1 has a continued fraction expansion
s
t
=
1
a
1
+
1
a
2
+
1
a
3
+ ···
.
Indeed to do this, we simply write
s
t
=
1
t
s
=
1
a
1
+
s
1
t
1
,
where we divide
t
by
s
to get
t
=
a
1
s
+
s
1
, and then put
t
1
=
s
. We then keep
going on with
s
1
t
1
. Since the numbers
s
i
, t
i
keep getting smaller, it follows that
this process will eventually terminate.
Since it is very annoying to type these continued fractions in L
A
T
E
X, we often
write the continued fraction as
s
t
= [a
1
, a
2
, a
3
, ··· , a
n
].
We define the kth convergent of
s
t
to be
p
k
q
k
= [a
1
, a
2
, ··· , a
k
].
There are some magic results from number theory that gives us a simple recur
rence relation for the convergents.
Lemma. For a
1
, a
2
, ··· , a
`
any positive reals, we set
p
0
= 0 q
0
= 1
p
1
= 1 q
1
= a
1
We then define
p
k
= a
k
p
k−1
+ p
k−2
q
k
= a
k
q
k−1
+ q
k−2
Then we have
(i) We have
[a
1
, ··· , a
k
] =
p
k
q
k
.
(ii) We also have
q
k
p
k−1
− p
k
q
k−1
= (−1)
k
.
In particular, p
k
and q
k
are coprime.
From a bit more number theory, we find that
Fact.
If
s < t
are
m
bit integers, then the continued fraction has length
O
(
m
),
and all convergents
p
k
q
k
can be computed in O(m
3
) time.
More importantly, we have the following result:
Fact. Let 0 < x < 1 be rational, and suppose
p
q
is rational with
x −
p
q
<
1
2q
2
.
Then
p
q
is a convergent of the continued fraction of x.
Then by this theorem, for a goo d
c
, we know
k
r
must be a conve rgent of
c
2
m
.
So we compute all convergents find a (unique) one whose denominator is less
than
N
and is within
1
2N
2
of
c
2
m
. This gives us the value of
r
, and we are done.
In fact, this last classical part is the slowest part of the algorithm.
Example.
Suppose we want to factor
N
= 39. Suppose the random
a
we chose
is
a
= 7
<
39, which is coporime to
N
. Let
r
be the period of
f
(
x
) = 7
x
mod
39.
We notice
1024 = 2
10
< N
2
= 1621 < 2
11
= 2048.
So we pick m = 11. Suppose the measurement of QFT
2
m
peri yeilds c = 853.
By the theory, this has a constant probability (approximately 0
.
4) to satisfy
853
2
m
−
k
r
<
1
2
m+1
=
1
2
12
<
1
2N
2
.
We also have a probability of
O
(1
/ log log r
) to have
k
and
r
coprime. In this
case, c is indeed “good”. So there is a unique
k
r
satisfying
853
2048
−
k
r
<
1
2
12
.
So to find
k
r
, we do the continued fraction expansion of
853
2048
. We have
853
2048
=
1
2048
853
=
1
2 +
342
853
=
1
2 +
1
853
342
=
1
2 +
1
2 +
169
342
= ··· = [2, 2, 2, 42, 4].
We can then compute the convergents
[2] =
1
2
[2, 2] =
2
5
[2, 2, 2] =
5
12
[2, 2, 2, 42] =
212
509
[2, 2, 2, 42, 4] =
853
2048
Of all these numbers, only
5
12
is within
1
2
12
of
853
2048
and whose denominator is
less than N = 39.
If we do not assume k and r are coprime, then the possible
k
r
are
5
12
,
10
24
,
15
36
.
If we assume that
k
r
are coprime, then r = 12. Indeed, we can try that
7
12
≡ 1 (mod 39).
So we now know that
39  (7
6
+ 1)(7
6
− 1).
We now hope/expect with probability
>
1
2
exactly that it goes partly into each
factor. We can compute
7
6
+ 1 = 117650 ≡ 26 (mod 39)
7
6
− 1 = 117648 ≡ 24 (mod 39)
We can then compute
hcf(26, 39) = 13, hcf(24, 39) = 3 (mod 39).
We see that 3 and 13 are factors of 39.