Part III Percolation and Random Walks on
Graphs
Based on lectures by P. Sousi
Notes taken by Dexter Chua
Michaelmas 2017
These notes are not endorsed by the lecturers, and I have modified them (often
significantly) after lectures. They are nowhere near accurate representations of what
was actually lectured, and in particular, all errors are almost surely mine.
A phase transition means that a system undergoes a radical change when a continuous
parameter passes through a critical value. We encounter such a transition every day
when we boil water. The simplest mathematical model for phase transition is percolation.
Percolation has a reputation as a source of beautiful mathematical problems that are
simple to state but seem to require new techniques for a solution, and a number of
such problems remain very much alive. Amongst connections of topical importance
are the relationships to so-called Schramm–Loewner evolutions (SLE), and to other
mo dels from statistical physics. The basic theory of percolation will be described in
this course with some emphasis on areas for future development.
Our other major topic includes random walks on graphs and their intimate connection
to electrical networks; the resulting discrete potential theory has strong connections
with classical potential theory. We will develop tools to determine transience and
recurrence of random walks on infinite graphs. Other topics include the study of
spanning trees of connected graphs. We will present two remarkable algorithms to
generate a uniform spanning tree (UST) in a finite graph
G
via random walks, one
due to Aldous-Broder and another due to Wilson. These algorithms can be used to
prove an important property of uniform spanning trees discovered by Kirchhoff in the
19th century: the probability that an edge is contained in the UST of
G
, equals the
effective resistance between the endpoints of that edge.
Pre-requisites
There are no essential pre-requisites beyond probability and analysis at undergraduate
levels, but a familiarity with the measure-theoretic basis of probability will be helpful.
Contents
0 Introduction
1 Percolation
1.1 The critical probability
1.2 Correlation inequalities
1.3 Two dimensions
1.4 Conformal invariance and SLE in d = 2
2 Random walks
2.1 Random walks in finite graphs
2.2 Infinite graphs
3 Uniform spanning trees
3.1 Finite uniform spanning trees
3.2 Infinite uniform spanning trees and forests 0 Introduction
This course is naturally divided into two parts percolation, and random
walks on graphs. Percolation is one of the simplest models that experience phase
transition — an abrupt change in quantitative feature due to a continuous change
of a parameter. More sophisticated examples of percolation the boiling of water
and the loss of long-range correlation in magnets when temperature increases.
But we are not physicists. So let’s talk about percolation. For reasons that
become clear later, consider an n × (n + 1) lattice connected by edges:
We now fix some
p
[0
,
1], and for each edge in the graph, we either keep it
or remove it with probability
p
. There are many questions we may ask. For
example, we may ask for the probability that there is a left-to-right crossing of
open edges.
For example, we have f
n
(0) = 0 and f
n
(1) = 1.
An interesting choice of
p
to consider is
p
=
1
2
. We can argue that
f
n
(
1
2
)
must be
1
2
, by symmetry. More precisely, consider the dual lattice:
Note that this lattice is isomorphic to the original lattice we were thinking about,
by applying a rotation. Now each edge in the dual lattice crosses exactly one edge
in the original lattice, and we can set the edge to be open iff the corresponding
edge in the original lattice is open. This gives rise to a percolation on the dual
lattice with p =
1
2
as well.
Now notice that that there is a left-right crossing of open edges in the original
lattice iff there is no top-bottom crossing in the dual graph. But since the dual
and original lattices are isomorphic, it follows that the probability that there is
a left-right crossing in the original lattice is equal to the probability that there
is no left-right crossing in the original lattice. So both of these must be
1
2
.
The ability to talk about the dual graph is a very important property that
is only true in 2 dimensions. In general, there are many things known for 2
dimensions via the dual, which do not generalize to higher dimensions.
The other topic we are going to discuss is random walks in graphs. In IB
Markov chains, and maybe IA Probability, we considered random walks on the
integer lattice
Z
d
. Here we shall consider random walks on any graph. We
shall mostly think about finite graphs, but we will also talk about how certain
results can be extended to infinite graphs. It turns out a rather useful way of
Then many concepts familiar from high school physics translate to interesting
properties about the graph, and importing well-known (and elementary) results
from electrical networks helps us understand graphs better.
1 Percolation 1.1 The critical probability
There are two models of percolation bond percolation and site percolation. In
this course, we will focus on bond percolation, but we will look at site percolation
in the example sheets.
The very basic set up of percolation theory involves picking a graph
G
=
(
V, E
), where
V
is the set of vertices and
E
is the set of edges. We also pick a
percolation probability
p
[0
,
1]. For each edge
e E
, we keep it with probability
p
and throw it with probability 1
p
. In the first case, we say the edge is open,
and in the latter, we say it is closed.
More precisely, we define the probability space to be Ω =
{
0
,
1
}
E
, where 0
denotes a closed edge and 1 denotes an open one (in the case of site percolation,
we have Ω =
{
0
,
1
}
V
). We endow with the
σ
-algebra generated by cylinder
sets
{ω Ω : ω(e) = x
e
for all e A},
where
A
is a finite set and
x
e
{
0
,
1
}
for all
e
. In other words, this is the
product
σ
-algebra. As probability measure, we take the product measure
P
p
, i.e.
every edge is 1 with probability
p
and 0 with probability 1
p
. We will write
η
p
{0, 1}
E
for the state of the system.
Now what can we say about the graph resulting from this process? One
question we may ask is whether we can connect two points in the graphs via the
edges that remain. To further the discussion, we introduce some notation.
Notation. We write x y if there is an open path of edges from x to y.
Notation. We write C(x) = {y V : y x}, the cluster of x.
Notation. We write x if |C(x)| = .
From now on, we shall take
G
=
L
d
= (
Z
d
, E
(
Z
d
)), the
d
-dimensional integer
lattice. Then by translation invariance,
|C
(
x
)
|
has the same distribution as
|C
(0)
|
for all x. We now introduce a key piece of notation:
Definition (θ(p)). We define θ(p) = P
p
(|C(0)| = ).
Most of the questions we ask surround this
θ
(
p
). We first make the most
elementary observations:
Example. θ(0) = 0 and θ(1) = 1.
A natural question to ask is then if we can find
p
(0
,
1) such that
θ
(
p
)
>
0.
But even before answering that question, we can ask a more elementary one
is θ an increasing function of p?
Intuitively, it must be. And we can prove it. The proof strategy is known as
coupling. We have already seen coupling in IB Markov Chains, where we used it
to prove the convergence to the invariant distribution under suitable conditions.
Here we are going to couple all percolation processes for different values of P .
Lemma. θ is an increasing function of p.
Proof.
We let (
U
(
e
))
eE(Z
d
)
be iid
U
[0
,
1] random variables. For each
p
[0
,
1],
we define
η
p
(e) =
(
1 U(e) p
0 otherwise
Then
P
(
η
p
(
e
) = 1) =
P
(
U
(
e
)
< p
) =
p
. Since the
U
(
e
) are independent, so are
η
p
. Thus η
p
has the law of bond percolation with probability p.
Moreover, if p q, then η
p
(e) η
q
(e). So the result follows.
Note that this is not only useful as a theoretical tool. If we want to simulate
percolation with different probabilities
p
, we can simply generate a set of
U
[0
,
1]
variables, and use it to produce a percolation for all p.
If we wish, we can provide an abstract definition of what coupling is, but the
detailed definition is not of much practical use:
Definition
(Coupling)
.
Let
µ
and
ν
be two probability measures on (potentially)
different probability spaces. A coupling is a pair of random variables (
X, Y
)
defined on the same probability space such that the marginal distribution of
X
is µ and the marginal distribution of Y is ν.
With the lemma, we can make the definition
Definition (Critical probability). We define p
c
(d) = sup{p [0, 1] : θ(p) = 0}.
θ
(
p
) can be non-zero for
p
(0
,
1). We
can now rephrase and strengthen this question by asking for the value of p
c
(d).
c
and θ(p).
For example, we know that
θ
(
p
) is a
C
function on (
p
c
,
1]. However, we
do not know if
θ
is continuous at
p
c
in
d
= 3. We will see soon that
p
c
=
1
2
in
d = 2, but the exact value of p
c
is not known in higher dimensions.
Let’s start actually proving things about p
c
. We previously noted that
Proposition. p
c
(1) = 1.
The first actually interesting theorem is the following:
Theorem. For all d 2, we have p
c
(d) (0, 1).
We shall break this up into two natural parts:
Lemma. For d 2, p
c
(d) > 0.
Proof.
Write Σ
n
for the number of open self-avoiding paths of length
n
starting
at 0. We then note that
P
p
(|C(0)| = ) = P
p
(n 1 : Σ
n
1) = lim
n→∞
P
p
n
1) lim
n→∞
E
p
n
].
We can now compute
E
p
n
]. The point is that expectation is linear, which
makes this much easier to compute. We let
σ
n
be the number of self-avoiding
paths of length n from 0. Then we simply have
E
p
n
] = σ
n
p
n
.
We can bound
σ
n
by 2
d ·
(2
d
1)
n1
, since we have 2
d
choices of the first step,
and at most 2d 1 choices in each subsequent step. So we have
E
p
n
] 2d(2d 1)
n1
p
n
=
2d
2d 1
(p(2d 1))
n
.
So if p(2d 1) < 1, then θ(p) = 0. So we know that
p
c
(d)
1
2d 1
.
Before we move on to the other half of the theorem, we talk a bit more about
self-avoiding paths.
Definition
(
σ
n
)
.
We write
σ
n
for the number of self-avoiding paths of length
n
starting from 0.
In the proof, we used the rather crude bound
σ
n
2d · (2d 1)
n1
.
More generally, we can make the following bound:
Lemma. We have σ
n+m
σ
n
σ
m
.
Proof.
A self-avoiding path of length
n
+
m
can be written as a concatenation of
self-avoiding paths of length
n
starting from 0 and another one of length
m
.
Taking the logarithm, we know that
log σ
n
is a subadditive sequence. It turns
out this property alone is already quite useful. It is an exercise in analysis to
prove the following lemma:
Lemma
(Fekete’s lemma)
.
If (
a
n
) is a subadditive sequence of real numbers,
then
lim
n→∞
a
n
n
= inf
n
a
k
k
: k 1
o
[−∞, ).
In particular, the limit exists.
This allows us to define
Definition (λ and κ). We define
λ = lim
n→∞
log σ
n
n
, κ = e
λ
.
κ is known as the connective constant.
Then by definition, we have
σ
n
= e
(1+o(1))
= κ
n+o(n)
.
as n .
Thus, asymptotically, the growth rate of
σ
n
is determined by
κ
. It is natural
to then seek for the value of
κ
, but unfortunately we don’t know the value of
κ
for the Euclidean lattice. A bit more on the positive side, the value for the
hexagonal lattice has been found recently:
Theorem (Duminil-Copin, Smirnov, 2010). The hexagonal lattice has
κ
hex
=
q
2 +
2.
We might want to be a bit more precise about how
σ
n
grows. For
d
5, we
have the following theorem:
Theorem
.
For
d
5, there exists a constant
A
such
that
σ
n
=
n
(1 + O(n
ε
))
for any ε <
1
2
.
We don’t really know what happens when
d <
5, but we have the following
conjecture:
Conjecture.
σ
n
n
11/32
κ
n
d = 2
n
γ
κ
n
d = 3
(log n)
1/4
κ
n
d = 4
One can also instead try to bound
σ
n
from above. We have the following
classic theorem:
Theorem (Hammersley and Welsh, 1962). For all d 2, we have
σ
n
Cκ
n
exp(c
0
n)
for some constants C and c
0
.
In fact, a better bound was recently found:
Theorem (Hutchcroft, 2017). For d 2, we have
σ
n
Cκ
n
exp(o(
n)).
This will be proved in the example sheet.
What would be a good way to understand self-avoiding walks? Fixing an
n
,
there are only finitely many self-avoiding walks of length
n
. So we can sample
such a self-avoiding walk uniformly at random. In general, we would expect
the total displacement of the walk to be
n
. Thus, what we can try to do
is to take
n
while simultaneously shrinking space by a factor of
1
n
. We
would then hope that the result converges toward some Brownian motion-like
trajectory. If we can characterize the precise behaviour of this scaling limit,
then we might be able to say something concrete about
κ
and the asymptotic
behaviour of σ
n
.
But we don’t really know what the scaling limit is. In the case
d
= 2, it is
conjectured to be
SLE
(
8
3
). Curiously, it was proven by Gwynne and Miller in
2016 that if we instead looked at self-avoiding walks on a random surface, then
the scaling limit is SLE(
8
3
).
That’s enough of a digression. Let’s finish our proof and show that
p
c
(
d
)
<
1.
A first observation is that it suffices to show this for
d
= 2. Indeed, since
Z
d
embeds into
Z
d+1
for all
d
, if we can find an infinite cluster in
Z
d
, then the same
is true for
Z
d+1
. Thus, it is natural to restrict to the case of
d
= 2, where duality
will be prove to be an extremely useful tool.
Definition
(Planar graph)
.
A graph
G
is called planar if it can be embedded
on the plane in such a way that no two edges cross.
Definition
(Dual graph)
.
Let
G
be a planar graph (which we call the primal
graph). We define the dual graph by placing a vertex in each face of
G
, and
connecting 2 vertices if their faces share a boundary edge.
Example. The dual of Z
2
is isomorphic to Z
2
The dual lattice will help us prove a lot of properties for percolation in Z
2
.
Lemma. p
c
(d) < 1 for all d 2.
Proof.
It suffices to show this for
d
= 2. Suppose we perform percolation on
Z
2
.
Then this induces a percolation on the dual lattice by declaring an edge of the
dual is open if it crosses an open edge of Z
2
, and closed otherwise.
Suppose
|C
(0)
| <
in the primal lattice. Then there is a closed circuit in
the dual lattice, given by the “boundary” of
C
(0). Let
D
n
be the number of
closed dual circuits of length
n
that surround 0. Then the union bound plus
Markov’s inequality tells us
P
p
(|C(0)| < ) = P
p
(n D
n
1)
X
n=4
E
p
[D
n
],
using the union bound and Markov’s inequality.
It is a simple exercise to show that
Exercise.
Show that the number of dual circuits of length
n
that contain 0 is
at most n · 4
n
.
From this, it follows that
P
p
(|C(0)| < )
X
n=4
n · 4
n
(1 p)
n
.
Thus, if
p
is sufficiently near 1, then
P
p
(
|C
(0)
| <
) is bounded away from 1.
By definition, if
p < p
c
(
d
), then 0 is almost surely not contained in an infinite
cluster. If
p > p
c
(
d
), then there is a positive probability that 0 is contained
in an infinite cluster. However, it is of course not necessarily the case that
0 is connected to
with probability 1. In fact, there is at least probability
(1
p
)
2d
that 0 is not connected to
, since 0 cannot be connected to
if all
its neighbouring edges are closed. However, it is still possible that there is some
infinite cluster somewhere. It’s just that it does not contain 0.
Proposition. Let A
be the event that there is an infinite cluster.
(i) If θ(p) = 0, then P
p
(A
) = 0.
(ii) If θ(p) > 0, then P
p
(A
) = 1.
Proof.
(i) We have
P
p
(A
) = P
p
(x : |C(x)| = )
X
xZ
d
P
p
(|C(x)| = ) =
X
θ(p) = 0.
(ii)
We need to apply the Kolmogorov 0-1 law . Recall that if
X
1
, X
2
, . . .
are
independent random variables, and
F
n
=
σ
(
X
k
:
k n
),
F
=
T
n0
F
n
,
Then F
is trivial, i.e. for all A F
, P(A) {0, 1}.
So we order the edges of Z
d
as e
1
, e
2
, . . . and denote their states
w(e
1
), w(e
2
), . . . .
These are iid random variables. We certainly have
P
p
(
A
)
θ
(
p
)
>
0.
So if we can show that
A
F
, then we are done. But this is clear,
since changing the states of a finite number of edges does not affect the
occurrence of A
.
The next follow up questions is how many infinite clusters do we expect to
get?
Theorem
(Burton and Keane)
.
If
p > p
c
, then there exists a unique infinite
cluster with probability 1.
This proof is considerably harder than the ones we have previously done. We
might think we can use the Kolmogorov 0-1 law, but we can’t, since changing
a finite number of edges can break up or join together infinite clusters, so the
event that there are
k
infinite clusters for
k >
0 is not in
F
. However, we can
exploit the fact that N is translation invariant.
Exercise.
Let
A
be an event that is translation invariant. Then
P
p
(
A
) = 0 or 1
almost surely.
Proof.
Let
N
be the number of infinite clusters. Then by the lemma, we know
N
is constant almost surely. So there is some
k N {∞}
such that
P
p
(
N
=
k
) = 1.
First of all, we know that
k 6
= 0, since
θ
(
p
)
>
0. We shall first exclude 2
k <
,
and then exclude k = .
Assume that
k <
. We will show that
P
p
(
n
= 1)
>
0, and hence it must
be the case that P
p
(n = 1) = 1.
To bound this probability, we let
B
(
n
) = [
n, n
]
d
Z
d
(which we will
sometimes write as B
n
), and let B(n) be its boundary. We know that
P
p
(all infinite clusters intersect B(n)) 1
as
n
. This is since with probability 1, there are only finitely many clusters
by assumption, and for each of these configurations, all infinite clusters intersect
B(n) for sufficiently large n.
In particular, we can take n large enough such that
P
p
(all infinite clusters intersect B(n))
1
2
.
We can then bound
P
p
(N = 1) P
p
(all infinite clusters intersect B(n)
and all edges in B(n) are open).
Finally, note that the two events in there are independent, since they involve
different edges. But the probability that all edges in
B
(
n
) are open is just
p
E(B(n))
. So
P
p
(N = 1)
1
2
p
E(B(n))
> 0.
So we are done.
We now have to show that
k 6
=
. This involves the notion of a trifurcation.
The idea is that we will show that if k = , then the probability that a vertex
is a trifurcation is positive. This implies the expected number of trifurcations
is
n
d
. We will then show deterministically that the number of trifurcations
inside
B
(
n
) must be
|B
(
n
)
|
, and so there are
O
(
n
d1
) trifurcations, which is
We say a vertex x is a trifurcation if the following three conditions hold:
(i) x is in an infinite open cluster C
;
(ii) There exist exactly three open edges adjacent to x;
(iii) C
\ {x} contains exactly three infinite clusters and no finite ones.
This is clearly a translation invariant notion. So
P
p
(0 is a trifurcation) = P
p
(x is a trifurcation)
for all x Z
d
.
Claim. P
p
(0 is a trifurcation) > 0.
We need to use something slightly different from
B
(
n
). We define
S
(
n
) =
{x Z
d
: kxk
1
n}.
The crucial property of this is that for any
x
1
, x
2
, x
3
S
(
n
), there exist three
disjoint self-avoiding paths joining
x
i
to 0 (exercise!). For each triple
x
1
, x
2
, x
3
,
we arbitrarily pick a set of three such paths, and define the event
J(x
1
, x
2
, x
3
) = {all edges on these 3 paths are open
and everything else inside S(n) is closed}.
Next, for every possible infinite cluster in
Z
d
\ S
(
n
) that intersects
S
(
n
) at at
least one point, we pick a designated point of intersection arbitrarily.
Then we can bound
P
p
(0 is a trifurcation) P
p
(∃C
1
, C
2
, C
3
Z
d
\ S(n)
infinite clusters which intersect S(n) at x
1
, x
2
, x
3
, and J(x
1
, x
2
, x
3
)).
Rewrite the right-hand probability as
P
p
(J(x
1
, x
2
, x
3
) | ∃C
1
, C
2
, C
3
Z intersecting S(n))
× P
p
(∃C
1
, C
2
, C
3
Z
d
\ S(n))
We can bound the first term by
min(p, 1 p)
E(S(n))
.
To bound the second probability, we have already assumed that
P
p
(
N
=
) = 1.
So
P
p
(
∃C
1
, C
2
, C
3
Z
d
\ S
(
n
)
intersecting S
(
n
))
1 as
n
. We can
then take
n
large enough such that the probability is
1
2
. So we have shown
that c P
p
(0 is a trifurcation) > 0.
Using the linearity of expectation, it follows that
E
p
[number of trifurcations inside B(n))] c|B(n)| n
d
.
On the other hand, we can bound the number of trifurcations in
B
(
n
) by
|B
(
n
)
|
.
To see this, suppose
x
1
is a trifurcation in
B
(
n
). By definition, there exists
3 open paths to
B
(
n
). Fix three such paths. Let
x
2
be another trifurcation.
It also has 3 open paths to the
B
(
n
), and its paths to the boundary could
intersect those of
x
1
. However, they cannot create a cycle, by definition of a
trifurcation. For simplicity, we add the rule that when we produce the paths for
x
2
, once we intersect the path of x
1
, we continue following the path of x
1
.
Exploring all trifurcations this way, we obtain a forest inside B(n), and the
boundary points will be the leaves of the forest. Now the trifurcations have
degree 3 in this forest. The rest is just combinatorics.
Exercise.
For any tree, the number of degree 3 vertices is always less than the
number of leaves. 1.2 Correlation inequalities
In this section, we are going to prove some useful inequalities and equalities,
and use them to prove some interesting results about
θ
(
p
) and the decay of
P
p
(0 B
n
).
To motivate our first inequality, suppose we have 4 points
x, y, u, v
, and we
want to ask for the conditional probability
P
p
(x y | u v).
Intuitively, we expect this to be greater than
P
p
(
x y
), since
u v
tells us
there are some open edges around, which is potentially helpful. The key property
underlying this intuition is that having more edges is beneficial to both of the
events. To quantify this, we need the notion of an increasing random variable.
Again, let
G
= (
V, E
) be a graph and Ω =
{
0
,
1
}
E
. We shall assume that
E
is countable.
Notation (). Given ω, ω
0
Ω, we write ω ω
0
if ω(e) ω
0
(e) for all e E.
This defines a partial order on Ω.
Definition
(Increasing random variable)
.
A random variable
X
is increasing if
X(ω) X(ω
0
) whenever ω ω
0
, and is decreasing if X is increasing.
Definition
(Increasing event)
.
An event
A
is increasing (resp. decreasing) if
the indicator 1(A) is increasing (resp. decreasing)
Example. {|C(0)| = ∞} is an increasing event.
An immediate consequence of the definition is that
Theorem. If N is an increasing random variable and p
1
p
2
, then
E
p
1
[N] E
p
2
[N],
and if an event A is increasing, then
P
p
1
(A) P
p
2
(A).
Proof. Immediate from coupling.
What we want to prove is the following result, which will be extremely useful.
Theorem
(Fortuin–Kasteleyn–Ginibre (FKG) inequality)
.
Let
X
and
Y
be
increasing random variables with E
p
[X
2
], E
p
[Y
2
] < . Then
E
p
[XY ] E
p
[X]E
p
[Y ].
In particular, if A and B are increasing events, then
P
p
(A B) P
p
(A)P
p
(B).
Equivalently,
P
p
(A | B) P
p
(A).
Proof.
The plan is to first prove this in the case where
X
and
Y
depend on a
finite number of edges, and we do this by induction. Afterwards, the desired
result can be obtained by an application of the martingale convergence theorem.
In fact, the “real work” happens when we prove it for
X
and
Y
depending on
a single edge. Everything else follows from messing around with conditional
probabilities.
If
X
and
Y
depend only on a single edge
e
1
, then for any
ω
1
, ω
2
{
0
,
1
}
, we
claim that
(X(ω
1
) X(ω
2
))(Y (ω
1
) Y (ω
2
)) 0.
Indeed,
X
and
Y
are both increasing. So if
ω
1
> ω
2
, then both terms are
positive; if
ω < ω
2
, then both terms are negative, and there is nothing to do if
they are equal.
In particular, we have
X
ω
1
2
∈{0,1}
(X(ω
1
) X(ω
2
))(Y (ω
1
) Y (ω
2
))P
p
(ω(e
1
) = ω
1
)P
p
(ω(e
1
) = ω
2
) 0.
Expanding this, we find that the LHS is 2(
E
p
[
XY
]
E
p
[
X
]
E
p
[
Y
]), and so we
are done.
Now suppose the claim holds for
X
and
Y
that depend on
n < k
edges. We
shall prove the result when they depend on k edges e
1
, . . . , e
k
. We have
E
p
[XY ] = E
p
[E
p
[XY | ω(e
1
), . . . , ω(e
k1
)]].
Now after conditioning on
ω
(
e
1
)
, . . . , ω
(
e
k1
), the random variables
X
and
Y
become increasing random variables of ω(e
k
). Applying the first step, we get
E
p
[XY | ω(e
1
), . . . , ω(e
k1
)]
E
p
[X | ω(e
1
), . . . , ω(e
k1
)]E
p
[Y | ω(e
1
), . . . , ω(e
k1
)]. ()
But
E
p
[
X | ω
(
e
1
)
, . . . , ω
(
e
k1
)] is a random variable depending on the edges
e
1
, . . . , e
k1
, and moreover it is increasing. So the induction hypothesis tells us
E
p
E
p
[X | ω(e
1
), . . . , ω(e
k1
)]E
p
[Y | ω(e
1
), . . . , ω(e
k1
)]
E
p
E
p
[X | ω(e
1
), . . . , ω(e
k1
)]
E
p
E
p
[Y | ω(e
1
), . . . , ω(e
k1
)]
Combining this with (the expectation of) () then gives the desired result.
Finally, suppose
X
and
Y
depend on the states of a countable set of edges
e
1
, e
2
, . . .. Let’s define
X
n
= E
p
[X | ω(e
1
), . . . , ω(e
n
)]
Y
n
= E
p
[Y | ω(e
1
), . . . , ω(e
n
)]
Then
X
n
and
Y
n
are martingales, and depend only on the states of only finitely
many edges. So we know that
E
p
[X
n
Y
n
] E
p
[X
n
]E
p
[Y
n
] = E
p
[X]E
p
[Y ].
By the
L
2
-martingale convergence theorem,
X
n
X
,
Y
n
Y
in
L
2
and almost
surely. So taking the limit n , we get
E
p
[XY ] E
p
[X]E
p
[Y ].
What we want to consider next is the notion of disjoint occurrence. For
example, we want to able to ask the probability that there exists two disjoint
paths connecting a pair of points.
To formulate this disjointness, suppose we have an event
A
, and let
ω A
.
To ask whether this occurrence of
A
depends only on some set
S E
of edges,
we can look at the set
[ω]
S
= {ω
0
Ω : ω
0
(e) = ω(e) for all e S}.
If [
ω
]
S
A
, then we can rightfully say this occurrence of
A
depends only on
the edges in
S
. Note that this depends explicitly and importantly on
ω
, i.e. the
“reason”
A
happened. For example, if
A
=
{x y}
, and
ω A
, then we can
take
S
to be the set of all edges in a chosen path from
x
to
y
in the configuration
of ω. This choice will be different for different values of ω.
Using this, we can define what it means for two events to occur disjointly.
Definition
(Disjoint occurrence)
.
Let
F
be a set and =
{
0
,
1
}
F
. If
A
and
B
are events, then the event that A and B occurs disjointly is
A B = {ω Ω : S F s.t. [ω]
S
A and [ω]
F \S
B}.
Theorem
(BK inequality)
.
Let
F
be a finite set and Ω =
{
0
,
1
}
F
. Let
A
and
B be increasing events. Then
P
p
(A B) P
p
(A)P
p
(B).
This says if
A
and
B
are both events that “needs” edges to occur, then requir-
ing that they occur disjointly is more difficult than them occurring individually.
The proof is completely magical. There exist saner proofs of the inequality,
but they are rather longer.
We prove by induction on the size
n
of the set
F
.
For n = 0, it is trivial.
Suppose it holds for
n
1. We want to show it holds for
n
. For
D {
0
,
1
}
F
and i = 0, 1, set
D
i
= {(ω
1
, . . . , ω
n1
) : (ω
1
, . . . , ω
n1
, i) D}.
Let A, B {0, 1}
F
, and C = A B. We check that
C
0
= A
0
B
0
, C
1
= (A
0
B
1
) (A
1
B
0
).
Since
A
and
B
are increasing,
A
0
A
1
and
B
0
B
1
, and
A
i
and
B
i
are also
increasing events. So
C
0
(A
0
B
1
) (A
1
B
0
)
C
1
A
1
B
1
.
By the induction hypothesis, we have
P
p
(C
0
) = P
p
(A
0
B
0
) P
p
(A
0
)P
p
(B
0
)
P
p
(C
1
) P
p
(A
1
B
1
) P
p
(A
1
)P
p
(B
1
)
P
p
(C
0
) + P
p
(C
1
) P
p
((A
0
B
1
) (A
1
B
0
)) + P
p
((A B
1
) (A
1
B
0
))
= P
p
(A
0
B
1
) + P
p
(A
1
B
0
)
P
p
(A
0
)P
p
(B
1
) + P
p
(A
1
)P
p
(B
0
).
Now note that for any D, we have
P
p
(D) = pP
p
(D
1
) + (1 p)P
p
(D
0
).
By some black magic, we multipy the first inequality by (1
p
)
2
, the second by
p
2
and the third by p(1 p). This gives
pP
p
(C
1
) + (1 p)P
p
(C
0
) (pP
p
(A
1
) + (1 p)P
p
(A
0
))(pP
p
(B
1
) + (1 p)P
p
(B
0
)).
Expand and we are done.
It turns out the increasing hypothesis is not necessary:
Theorem
(Reimer)
.
For all events
A, B
depending on a finite set, we have
P
p
(A B) P
p
(A)P
p
(B).
But the proof is much harder, and in all the case where we want to apply
this, the variables are increasing.
As an application of the BK inequality, we first prove a preliminary result
P
p
(0
B
(
n
)). To prove our result, we will need a stronger
condition than p < p
c
. Recall that we defined
θ(p) = P
p
(|C(0)| = ).
We also define
χ(p) = E
p
[|C(0)|].
If χ(p) is finite, then of course θ(p) = 0. However, the converse need not hold.
Theorem.
If
χ
(
p
)
<
, then there exists a positive constant
c
such that for all
n 1,
P
p
(0 B(n)) e
cn
.
Later, we will show that in fact this holds under the assumption that
p < p
c
.
However, that requires a bit more technology, which we will develop after this
proof.
The idea of the proof is that if we want a path from, say, 0 to
B
(2
n
), then
the path must hit a point on
B
(
n
). So there is a path from 0 to a point on
B
(
n
), and a path from that point to
B
(2
n
). Moreover, these two paths are
disjoint, which allows us to apply the BK inequality.
Proof. Let
X
n
=
X
xB(n)
1(0 x).
Now consider
X
n=0
E[X
n
] =
X
n
X
xB(n)
P
p
(0 x) =
X
xZ
d
P
p
(0 x) = χ(p).
Since
χ
(
p
) is finite, we in particular have
E
p
[
X
n
]
0 as
n
. Take
m
large
enough such that E
p
[X
m
] < δ < 1.
Now we have
P
p
(0 B(m + k)) = P
p
(x B(m) : 0 x and x B(m + k) disjointly)
X
xB(m)
P
p
(0 x)P
p
(x B(m + k)) (BK)
X
xB(m)
P
p
(0 x)P
p
(0 B(k)) (trans. inv.)
P
p
(0 B(k))E
p
[X
m
].
So for any
n > m
, write
n
=
qm
+
r
, where
r
[0
, m
1]. Then iterating the
above result, we have
P
p
(0 B(n)) P
p
(0 B(mq)) δ
q
δ
1+
n
m
e
cn
.
To replace the condition with the weaker condition
p < p
c
, we need to
understand how
θ
(
p
) changes with
p
. We know
θ
(
p
) is an increasing function in
p
. It would be great if it were differentiable, and even better if we could have an
explicit formula for
dθ
dp
.
To do so, suppose we again do coupling, and increase
p
by a really tiny bit.
Then perhaps we would expect that exactly one of the edges switches from being
closed to open. Thus, we want to know if the state of this edge is pivotal to the
event |C(0)| = , and this should determine the rate of change of θ.
Definition
(Pivotal edge)
.
Let
A
be an event and
ω
a percolation configuration.
The edge e is pivotal for (A, ω) if
1(ω A) 6= 1(ω
0
A),
where ω
0
is defined by
ω
0
(f) =
(
ω(f) f 6= e
1 ω(f) f = e
.
The event that
e
is pivotal for
A
is defined to be the set of all
ω
such that
e
is
pivotal for (A, ω).
Note that whether or not e is pivotal for (A, ω) is independent of ω(e).
Theorem
(Russo’s formula)
.
Let
A
be an increasing event that depends on the
states of a finite number of edges. Then
d
dp
P
p
(A) = E
p
[N(A)],
where N (A) is the number of pivotal edges for A.
Proof.
Assume that
A
depends the states of
m
edges
e
1
, . . . , e
m
. The idea is to
let each
e
i
be open with probability
p
i
, whree the
{p
i
}
may be distinct. We then
vary the p
i
one by one and see what happens.
Writing ¯p = (p
1
, . . . , p
m
), we define
f(p
1
, . . . , p
m
) = P
¯p
(A),
Now
f
is the sum of the probability of all configurations of
{e
1
, . . . , e m}
for
which
A
happens, and is hence a finite sum of polynomials. So in particular, it
is differentaible.
We now couple all percolation process. Let (
X
(
e
) :
e L
d
) be iid
U
[0
,
1]
random variables. For a vector ¯p = (p(e) : e L
d
), we write
η
¯p
(e) = 1(X(e) p(e)).
Then we have P
¯p
(A) = P(η
¯p
A).
Fix an edge
f
and let
¯p
0
= (
p
0
(
e
)) be such that
p
0
(
e
) =
p
(
e
) for all
e 6
=
f
, and
p
0
(f) = p(f ) + δ for some δ > 0. Then
P
¯p
0
(A) P
¯p
(A) = P(η
¯p
0
A) P(η
¯p
A)
= P(η
¯p
0
A, η
¯p
A) + P(η
¯p
0
A, η
¯p
A) P(η
¯p
A).
But we know
A
is increasing, so
P
(
η
¯p
0
A, η
¯p
A
) =
P
(
η
¯p
A
). So the first
and last terms cancel, and we have
P
¯p
0
(A) P
¯p
(A) = P(η
¯p
0
A, η
¯p
6∈ A).
But we observe that we simply have
P(η
¯p
0
A, η
¯p
6∈ A) = δ · P
¯p
(f is pivotal for A).
Indeed, we by definition of pivotal edges, we have
P(η
¯p
0
A, η
¯p
6∈ A) = P
¯p
(f is pivotal for A, p(f) < X(f) p(f) + δ).
Since the event
{f is pivotal for A}
is independent of the state of the edge
f
,
we obtain
P
¯p
(f is pivotal, p(f) < X(f) p(f) + δ) = P
¯p
(f is pivotal) · δ.
Therefore we have
p(f)
P
¯p
(A) = lim
δ0
P
¯p
0
(A) P
¯p
(A)
δ
= P
¯p
(f is pivotal for A).
The desired result then follows from the chain rule:
d
dp
P
p
(A) =
m
X
i=1
p(e
i
)
P
¯p
(A)
¯p=(p,...,p)
=
m
X
i=1
P
p
(e
i
is pivotal for A)
= E
p
[N(A)].
If
A
depends on an infinite number of edges, then the best we can say is that
lim inf
δ0
P
p+δ
(A) P
p
(A)
δ
E
p
[N(A))].
To see this, again set B(n) = [n, n]
d
Z
d
. Define ¯p
n
by
¯p
n
(e) =
(
p e 6∈ B(n)
p + δ e B(n)
.
Then since a is increasing, we know
P
p+δ
(A) P
p
(A)
δ
P
¯p
n
(A) P
p
(A)
δ
.
We can then apply the previous claim, take successive differences, and take
n .
Corollary.
Let
A
be an increasing event that depends on
m
edges. Let
p q
[0, 1]. Then P
q
(A) P
p
(A)
q
p
m
.
Proof.
We know that
{f is pivotal for A}
is independent of the state of
f
, and
so
P
p
(ω(f) = 1, f is pivotal for A) = pP
p
(f is pivotal for A).
But since
A
is increasing, if
ω
(
f
) = 1 and
f
is pivotal for
A
, then
A
occurs.
Conversely, if f is pivotal and A occurs, then ω(f) = 1.
Thus, by Russo’s formula, we have
d
dp
P
p
(A) = E
p
[N(A)]
=
X
e
P
p
(e is pivotal for A)
=
X
e
1
p
P
p
(ω(e) = 1, e is pivotal for A)
=
X
e
1
p
P
p
(e is pivotal | A)P
p
(A)
= P
p
(A)
1
p
E
p
[N(A) | A].
So we have
d
dp
P
p
(A)
P
p
(A)
=
1
p
E
p
[N(A) | A].
Integrating, we find that
log
P
q
(A)
P
p
(A)
=
Z
q
p
1
u
E
u
[N(A) | A] du.
Bounding E
u
[N(A) | A] m, we obtain the desired bound.
With Russo’s formula, we can now prove the desired theorem.
Theorem. Let d 2 and B
n
= [n, n]
d
Z
d
.
(i)
If
p < p
c
, then there exists a positive constant
c
for all
n
1,
P
p
(0
B
n
) e
cn
.
(ii) If p > p
c
, then
θ(p) = P
p
(0 )
p p
c
p(1 p
c
)
.
This was first proved by Aizenman and Barsky who looked at the more
general framework of long-range percolation. Menshikov gave an alternative
proof by analyzing the geometry of pivotal edges. The proof we will see is by
Duminil-Copin and Tassion. Recall that we defined
χ(p) = E
p
[|C(0)|].
We saw that if
χ
(
p
)
<
, then
P
p
(0
B
n
)
e
cn
. We now see that
χ
(
p
)
<
iff p < p
c
.
The strategy of the proof is to define a new percolation probability
˜p
c
, whose
definition makes it easier to prove the theorem. Once we have established the
two claims, we see that (i) forces ˜p
c
p
c
, and (ii) forces ˜p
c
p
c
. So they must
be equal.
Proof (Duminil-Copin and Tassion). If S V is finite, we write
S = {(x, y) E : x S, y 6∈ S}.
We write
x
S
y
if there exists an open path of edges from
x
to
y
all of whose
end points lie in S.
Now suppose that 0 S. We define
ϕ
p
(S) = p
X
(x,y)∂S
P
p
(0
S
x).
Define
˜p
c
= sup{p [0, 1] : exists a finite set S with 0 S and ϕ
p
(S) < 1}.
Claim. It suffices to prove (i) and (ii) with p
c
replaced by ˜p
c
.
Indeed, from (i), if
p < ˜p
c
, then
P
p
(0
B
n
)
e
cn
. So taking the limit
n
, we see
θ
(
p
) = 0. So
˜p
c
p
c
. From (ii), if
p > ˜p
c
, then
θ
(
p
)
>
0. So
p
c
˜p
c
. So p
c
= ˜p
c
.
We now prove (i) and (ii):
(i)
Let
p < ˜p
c
. Then there exists a finite set
S
containing 0 with
ϕ
p
(
S
)
<
1.
Since
S
is finite, we can pick
L
large enough so that
S B
L1
. We will
prove that P
p
(0 B
kL
) (ϕ
p
(S))
k1
for k 1.
Define C = {x S : 0
S
x}. Since S B
L1
, we know S B
kL
= .
Now if we have an open path from 0 to
B
kL
, we let
x
be the last element
on the path that lies in
C
. We can then replace the path up to
x
by a path
that lies entirely in
S
, by assumption. This is then a path that lies in
C
up
to
x
, then takes an edge on
S
, and then lies entirely outside of
C
c
. Thus,
P
p
(0 B
kL
)
X
AS
0A
X
(x,y)∂A
P
p
(0
A
x, (x, y) open, C = A, y
A
C
B
kL
).
Now observe that the events
{C
=
A,
0
S
x}
,
{
(
x, y
)
is open}
and
{y
A
c
B
kL
} are independent. So we obtain
P
p
(0 B
kL
)
X
AS,0A
X
(x,y)∂S
p P
p
(0
S
x, C = A) P
p
(y
A
c
B
kL
).
Since we know that y B
L
, we can bound
P
p
(y
A
c
B
kL
) P
p
(0 B
(k1)L
).
So we have
P
p
(0 B
kL
) p P
p
(0 B
(k1)L
)
X
AS,0A
X
(x,y)∂S
P
p
(0
S
x, C = A)
= P
p
(0 B
(k1)L
) p
X
(x,y)∂S
P
p
(0
S
x)
= P
p
(0 B
(k1)L
)ϕ
p
(S).
Iterating, we obtain the deseired result.
(ii) We want to use Russo’s formula. We claim that it suffices to prove that
d
dp
P
p
(0 B
n
)
1
p(1 p)
inf
SB
n
,0S
ϕ
p
(S)(1 P
p
(0 B
n
)).
Indeed, if
p > ˜p
c
, we integrate from
˜p
c
to
p
, use in this range
ϕ
p
(
S
)
1,
and then take the limit as n .
The event
{
0
B
n
}
is increasing and only dependson a finite number of
edges. So we can apply Russo’s formula
d
dp
P
p
(0 B
n
) =
X
eB
n
P
p
(e is pivotal for {0 B
n
})
Since being pivotal and being open/closed are independent, we can write
this as
=
X
eB
n
1
1 p
P
p
(e is pivotal for {0 B
n
}, e is closed)
=
X
eB
n
1
1 p
P
p
(e is pivotal for {0 B
n
}, 0 6↔ B
n
)
Define S = {x B
n
: x 6↔ B
n
}. Then {0 6↔ B
n
} implies 0 S. So
d
dp
P
p
(0 B
n
) =
1
1 p
X
eB
n
X
AB
n
,0A
P
p
(e is pivotal, S = A)
Given that
S
=
A
, an edge
e
= (
x, y
) is pivotal iff
e A
and 0
A
x
. So
we know
d
dp
P
p
(0 B
n
) =
1
1 p
X
AB
n
,0A
X
(x,y)∂A
P
p
(0
A
x, S = A).
Observe that
{
0
A
x}
and
{S
=
A}
are independent, since to determine if
S
=
A
, we only look at the edges on the boundary of
A
. So the above is
equal to
1
1 p
X
AB
n
,0A
X
(x,y)∂A
P
p
(0
A
x)P
p
(S = A)
=
1
p(1 p)
X
AB
n
,0A
ϕ
p
(A)P
p
(S = A)
1
p(1 p)
inf
SB
n
,0S
ϕ
P
(S)P
p
(0 6↔ B
n
),
as desired.
P
p
(0
B
n
)
e
cn
is the best convergence rate when
p < p
c
, but we cannot do better, since P
p
(0 B
n
) p
n
.
Also, if p < p
c
, then we can easily bound
P
p
(|C(0)| n) P
p
(0 B
n
1/d
) exp(cn
1/d
).
However, this is not a good bound. In fact,
n
1/d
can be replaced by
n
, but we
will not prove it here. This tells us the largest cluster in
B
n
will have size of
order log n with high probability. 1.3 Two dimensions
We now focus on 2 dimensions. As discussed previously, we can exploit duality to
prove a lot of things specific to two dimensions. In particular, we will show that,
at
p
=
p
c
=
1
2
, certain probabilities such as
P
1
2
(0
B
(
n
)) exhibit a power law
decay. This is in contrast to the exponential decay for subcritical percolation
(and being bounded away from zero for supercritical percolation).
First we establish that p
c
is actually
1
2
.
Theorem. In Z
2
, we have θ
1
2
= 0 and p
c
=
1
2
.
It is conjectured that for all
d
2, we have
θ
(
p
c
(
d
)) = 0. It is known to be
true only in d = 2 and d 11.
This was proved first by Harris, Kesten, Russo, Seymour, Welsh in several
iterations.
Proof. First we prove that θ
1
2
= 0. This will imply that p
c
1
2
.
Suppose not, and θ
1
2
> 0. Recall that B(n) = [n, n]
2
, and we define
C(n) = [(n 1), (n 1)]
2
+
1
2
,
1
2
in the dual lattice. The appearance of the
1 is just a minor technical inconve-
nience. For the same n, our B(n) and C(n) look like
B(n)C(n)B(n)
We claim that for large
n
, there is a positive probability that there are open
paths from the left and right edges of
B
(
n
) to
, and also there are closed
paths from the top and bottom edges of
C
(
n
) to
. But we know that with
probability 1, there is a unique infinite cluster in both the primal lattice and
the dual lattice. To connect up the two infinite open paths starting from the
left and right edges of
B
(
n
), there must be an open left-right crossing of
B
(
n
).
To connect up the two infinite closed paths starting from the top and bottom
of
C
(
n
), there must be a closed top-bottom crossing. But these cannot both
happen, since this would require an open primal edge crossing a closed dual edge,
which is impossible.
To make this an actual proof, we need to show that these events do happen
with positive probability. We shall always take
p
=
1
2
, and will not keep repeating
it.
First note that since there is, in particular, an infinite cluster with probability
1, we have
P(B(n) ) 1.
So we can pick n large enough such that
P(B(n) ), P(C(n) ) 1
1
8
4
.
Let
A
`
/
A
r
/
A
t
/
A
b
be the events that the left/right/top/bottom side of
B
(
n
) is
connected to
via an open path of edges. Similarly, let
D
`
be the event that
the left of C(n) is connected to via a closed path, and same for D
r
, D
r
, D
b
.
Of course, by symmetry, for
i, j {`, r, t, b}
, we have
P
(
A
i
) =
P
(
A
j
). Using
FKG, we can bound
P(S
n
6↔ ) = P(A
c
`
A
c
r
A
c
t
A
c
b
) (P(A
c
`
))
4
= (1 P(A
`
))
4
.
Thus, by assumption on n, we have
(1 P(A
`
))
4
1
8
4
,
hence
P(A
`
)
7
8
.
Of course, the same is true for other A
i
and D
j
.
Then if G = A
`
A
r
D
t
D
b
, which is the desired event, then we have
P(G
c
) P(A
c
`
) + P(A
c
r
) + P(D
c
t
) + P(D
c
b
)
1
2
.
So it follows that
P(G)
1
2
,
So we have
p
c
1
2
. It remains to prove that
p
c
1
2
that
p
c
>
1
2
. Then
p
=
1
2
is in the subcritical regime, and we expect exponential
decay. Thus, (again with
p
=
1
2
fixed) there exists a
c >
0 such that for all
n 1,
P(0 B(n)) e
cn
.
Consider
C
n
= [0
, n
+ 1]
×
[0
, n
], and define
A
n
to be the event that there exists
a left-right crossing of C
n
by open edges.
Again consider the dual box D
n
= [0, n] × [1, n] +
1
2
,
1
2
.
C
n
D
n
Define
B
n
to be the event that there is a top-bottom crossing of
D
n
by closed
edges of the dual.
As before, it cannot be the case that
A
n
and
B
n
both occur. In fact,
A
n
and
B
n
partition the whole space, since if
A
n
does not hold, then every path from
left to right of C
n
is blocked by a closed path of the dual. Thus, we know
P(A
n
) + P(B
n
) = 1.
But also by symmetry, we have P(A
n
) = P(B
n
). So
P(A
n
) =
1
2
.
On the other hand, for any point on the left edge, the probability of it reaching
the right edge decays exponentially with n. Thus,
P(A
n
) n(n + 1)P(0 B
n
) (n + 1)e
cn
which is a contradiction. So we are done.
So we now know that p
c
=
1
2
. We now want to prove that
P
1
2
(0 B(n)) An
α
for some
A, α
. To prove this, we again consider the dual lattice. Observe that if
we have a closed dual circuit around the origin, then this prevents the existence
of an open path from the origin to a point far far away. What we are going to
do is to construct “zones” in the dual lattice like this:
The idea is to choose these zones in a way such that the probability that each
zone contains a closed circuit around the origin is
ζ
for some fixed
ζ
. Thus, if
B
(
n
) contains
m
many of these zones, then the probability that 0 is connected
to
B
(
n
) is bounded above by (1
ζ
)
m
. We would then be done if we can show
that m log n.
The main strategy to bounding these probabilities is to use FKG. For example,
if we want to bound the probability that there is a closed circuit in a region
then we cut it up into the pieces
If we can bound the probability of there being a left-to-right crossing of a
horizontal piece, and hence by symmetry the probability of there being a top-to-
bottom crossing of a vertical piece, then FKG gives us a bound on the probability
of there being a closed circuit.
For convenience of notation, we prove these bounds for open paths in the
primal lattice. We make the following definitions:
B(k`, `) = [`, (2k 1)`] × [`, `]
B(`) = B(`, `) = [`, `]
2
A(`) = B(3`) \ B(`)
LR(k`, `) = {there exists left-right crossing of B(k`, `) of open edges}
LR(`) = LR(`, `)
O(`) = {there exists open circuit in A(`) that contains 0 in its interior}.
We first note that we have already proven the following: