1Percolation

III Percolation and Random Walks on Graphs 1.2 Correlation inequalities
In this section, we are going to prove some useful inequalities and equalities,
and use them to prove some interesting results about
θ
(
p
) and the decay of
P
p
(0 B
n
).
To motivate our first inequality, suppose we have 4 points
x, y, u, v
, and we
want to ask for the conditional probability
P
p
(x y | u v).
Intuitively, we expect this to be greater than
P
p
(
x y
), since
u v
tells us
there are some open edges around, which is potentially helpful. The key property
underlying this intuition is that having more edges is beneficial to both of the
events. To quantify this, we need the notion of an increasing random variable.
Again, let
G
= (
V, E
) be a graph and Ω =
{
0
,
1
}
E
. We shall assume that
E
is countable.
Notation (). Given ω, ω
0
Ω, we write ω ω
0
if ω(e) ω
0
(e) for all e E.
This defines a partial order on Ω.
Definition
(Increasing random variable)
.
A random variable
X
is increasing if
X(ω) X(ω
0
) whenever ω ω
0
, and is decreasing if X is increasing.
Definition
(Increasing event)
.
An event
A
is increasing (resp. decreasing) if
the indicator 1(A) is increasing (resp. decreasing)
Example. {|C(0)| = ∞} is an increasing event.
An immediate consequence of the definition is that
Theorem. If N is an increasing random variable and p
1
p
2
, then
E
p
1
[N] E
p
2
[N],
and if an event A is increasing, then
P
p
1
(A) P
p
2
(A).
Proof. Immediate from coupling.
What we want to prove is the following result, which will be extremely useful.
Theorem
(Fortuin–Kasteleyn–Ginibre (FKG) inequality)
.
Let
X
and
Y
be
increasing random variables with E
p
[X
2
], E
p
[Y
2
] < . Then
E
p
[XY ] E
p
[X]E
p
[Y ].
In particular, if A and B are increasing events, then
P
p
(A B) P
p
(A)P
p
(B).
Equivalently,
P
p
(A | B) P
p
(A).
Proof.
The plan is to first prove this in the case where
X
and
Y
depend on a
finite number of edges, and we do this by induction. Afterwards, the desired
result can be obtained by an application of the martingale convergence theorem.
In fact, the “real work” happens when we prove it for
X
and
Y
depending on
a single edge. Everything else follows from messing around with conditional
probabilities.
If
X
and
Y
depend only on a single edge
e
1
, then for any
ω
1
, ω
2
{
0
,
1
}
, we
claim that
(X(ω
1
) X(ω
2
))(Y (ω
1
) Y (ω
2
)) 0.
Indeed,
X
and
Y
are both increasing. So if
ω
1
> ω
2
, then both terms are
positive; if
ω < ω
2
, then both terms are negative, and there is nothing to do if
they are equal.
In particular, we have
X
ω
1
2
∈{0,1}
(X(ω
1
) X(ω
2
))(Y (ω
1
) Y (ω
2
))P
p
(ω(e
1
) = ω
1
)P
p
(ω(e
1
) = ω
2
) 0.
Expanding this, we find that the LHS is 2(
E
p
[
XY
]
E
p
[
X
]
E
p
[
Y
]), and so we
are done.
Now suppose the claim holds for
X
and
Y
that depend on
n < k
edges. We
shall prove the result when they depend on k edges e
1
, . . . , e
k
. We have
E
p
[XY ] = E
p
[E
p
[XY | ω(e
1
), . . . , ω(e
k1
)]].
Now after conditioning on
ω
(
e
1
)
, . . . , ω
(
e
k1
), the random variables
X
and
Y
become increasing random variables of ω(e
k
). Applying the first step, we get
E
p
[XY | ω(e
1
), . . . , ω(e
k1
)]
E
p
[X | ω(e
1
), . . . , ω(e
k1
)]E
p
[Y | ω(e
1
), . . . , ω(e
k1
)]. ()
But
E
p
[
X | ω
(
e
1
)
, . . . , ω
(
e
k1
)] is a random variable depending on the edges
e
1
, . . . , e
k1
, and moreover it is increasing. So the induction hypothesis tells us
E
p
E
p
[X | ω(e
1
), . . . , ω(e
k1
)]E
p
[Y | ω(e
1
), . . . , ω(e
k1
)]
E
p
E
p
[X | ω(e
1
), . . . , ω(e
k1
)]
E
p
E
p
[Y | ω(e
1
), . . . , ω(e
k1
)]
Combining this with (the expectation of) () then gives the desired result.
Finally, suppose
X
and
Y
depend on the states of a countable set of edges
e
1
, e
2
, . . .. Let’s define
X
n
= E
p
[X | ω(e
1
), . . . , ω(e
n
)]
Y
n
= E
p
[Y | ω(e
1
), . . . , ω(e
n
)]
Then
X
n
and
Y
n
are martingales, and depend only on the states of only finitely
many edges. So we know that
E
p
[X
n
Y
n
] E
p
[X
n
]E
p
[Y
n
] = E
p
[X]E
p
[Y ].
By the
L
2
-martingale convergence theorem,
X
n
X
,
Y
n
Y
in
L
2
and almost
surely. So taking the limit n , we get
E
p
[XY ] E
p
[X]E
p
[Y ].
What we want to consider next is the notion of disjoint occurrence. For
example, we want to able to ask the probability that there exists two disjoint
paths connecting a pair of points.
To formulate this disjointness, suppose we have an event
A
, and let
ω A
.
To ask whether this occurrence of
A
depends only on some set
S E
of edges,
we can look at the set
[ω]
S
= {ω
0
Ω : ω
0
(e) = ω(e) for all e S}.
If [
ω
]
S
A
, then we can rightfully say this occurrence of
A
depends only on
the edges in
S
. Note that this depends explicitly and importantly on
ω
, i.e. the
“reason”
A
happened. For example, if
A
=
{x y}
, and
ω A
, then we can
take
S
to be the set of all edges in a chosen path from
x
to
y
in the configuration
of ω. This choice will be different for different values of ω.
Using this, we can define what it means for two events to occur disjointly.
Definition
(Disjoint occurrence)
.
Let
F
be a set and =
{
0
,
1
}
F
. If
A
and
B
are events, then the event that A and B occurs disjointly is
A B = {ω Ω : S F s.t. [ω]
S
A and [ω]
F \S
B}.
Theorem
(BK inequality)
.
Let
F
be a finite set and Ω =
{
0
,
1
}
F
. Let
A
and
B be increasing events. Then
P
p
(A B) P
p
(A)P
p
(B).
This says if
A
and
B
are both events that “needs” edges to occur, then requir-
ing that they occur disjointly is more difficult than them occurring individually.
The proof is completely magical. There exist saner proofs of the inequality,
but they are rather longer.
We prove by induction on the size
n
of the set
F
.
For n = 0, it is trivial.
Suppose it holds for
n
1. We want to show it holds for
n
. For
D {
0
,
1
}
F
and i = 0, 1, set
D
i
= {(ω
1
, . . . , ω
n1
) : (ω
1
, . . . , ω
n1
, i) D}.
Let A, B {0, 1}
F
, and C = A B. We check that
C
0
= A
0
B
0
, C
1
= (A
0
B
1
) (A
1
B
0
).
Since
A
and
B
are increasing,
A
0
A
1
and
B
0
B
1
, and
A
i
and
B
i
are also
increasing events. So
C
0
(A
0
B
1
) (A
1
B
0
)
C
1
A
1
B
1
.
By the induction hypothesis, we have
P
p
(C
0
) = P
p
(A
0
B
0
) P
p
(A
0
)P
p
(B
0
)
P
p
(C
1
) P
p
(A
1
B
1
) P
p
(A
1
)P
p
(B
1
)
P
p
(C
0
) + P
p
(C
1
) P
p
((A
0
B
1
) (A
1
B
0
)) + P
p
((A B
1
) (A
1
B
0
))
= P
p
(A
0
B
1
) + P
p
(A
1
B
0
)
P
p
(A
0
)P
p
(B
1
) + P
p
(A
1
)P
p
(B
0
).
Now note that for any D, we have
P
p
(D) = pP
p
(D
1
) + (1 p)P
p
(D
0
).
By some black magic, we multipy the first inequality by (1
p
)
2
, the second by
p
2
and the third by p(1 p). This gives
pP
p
(C
1
) + (1p)P
p
(C
0
) (pP
p
(A
1
) + (1 p)P
p
(A
0
))(pP
p
(B
1
) + (1 p)P
p
(B
0
)).
Expand and we are done.
It turns out the increasing hypothesis is not necessary:
Theorem
(Reimer)
.
For all events
A, B
depending on a finite set, we have
P
p
(A B) P
p
(A)P
p
(B).
But the proof is much harder, and in all the case where we want to apply
this, the variables are increasing.
As an application of the BK inequality, we first prove a preliminary result
P
p
(0
B
(
n
)). To prove our result, we will need a stronger
condition than p < p
c
. Recall that we defined
θ(p) = P
p
(|C(0)| = ).
We also define
χ(p) = E
p
[|C(0)|].
If χ(p) is finite, then of course θ(p) = 0. However, the converse need not hold.
Theorem.
If
χ
(
p
)
<
, then there exists a positive constant
c
such that for all
n 1,
P
p
(0 B(n)) e
cn
.
Later, we will show that in fact this holds under the assumption that
p < p
c
.
However, that requires a bit more technology, which we will develop after this
proof.
The idea of the proof is that if we want a path from, say, 0 to
B
(2
n
), then
the path must hit a point on
B
(
n
). So there is a path from 0 to a point on
B
(
n
), and a path from that point to
B
(2
n
). Moreover, these two paths are
disjoint, which allows us to apply the BK inequality.
Proof. Let
X
n
=
X
xB(n)
1(0 x).
Now consider
X
n=0
E[X
n
] =
X
n
X
xB(n)
P
p
(0 x) =
X
xZ
d
P
p
(0 x) = χ(p).
Since
χ
(
p
) is finite, we in particular have
E
p
[
X
n
]
0 as
n
. Take
m
large
enough such that E
p
[X
m
] < δ < 1.
Now we have
P
p
(0 B(m + k)) = P
p
(x B(m) : 0 x and x B(m + k) disjointly)
X
xB(m)
P
p
(0 x)P
p
(x B(m + k)) (BK)
X
xB(m)
P
p
(0 x)P
p
(0 B(k)) (trans. inv.)
P
p
(0 B(k))E
p
[X
m
].
So for any
n > m
, write
n
=
qm
+
r
, where
r
[0
, m
1]. Then iterating the
above result, we have
P
p
(0 B(n)) P
p
(0 B(mq)) δ
q
δ
1+
n
m
e
cn
.
To replace the condition with the weaker condition
p < p
c
, we need to
understand how
θ
(
p
) changes with
p
. We know
θ
(
p
) is an increasing function in
p
. It would be great if it were differentiable, and even better if we could have an
explicit formula for
dθ
dp
.
To do so, suppose we again do coupling, and increase
p
by a really tiny bit.
Then perhaps we would expect that exactly one of the edges switches from being
closed to open. Thus, we want to know if the state of this edge is pivotal to the
event |C(0)| = , and this should determine the rate of change of θ.
Definition
(Pivotal edge)
.
Let
A
be an event and
ω
a percolation configuration.
The edge e is pivotal for (A, ω) if
1(ω A) 6= 1(ω
0
A),
where ω
0
is defined by
ω
0
(f) =
(
ω(f) f 6= e
1 ω(f) f = e
.
The event that
e
is pivotal for
A
is defined to be the set of all
ω
such that
e
is
pivotal for (A, ω).
Note that whether or not e is pivotal for (A, ω) is independent of ω(e).
Theorem
(Russo’s formula)
.
Let
A
be an increasing event that depends on the
states of a finite number of edges. Then
d
dp
P
p
(A) = E
p
[N(A)],
where N(A) is the number of pivotal edges for A.
Proof.
Assume that
A
depends the states of
m
edges
e
1
, . . . , e
m
. The idea is to
let each
e
i
be open with probability
p
i
, whree the
{p
i
}
may be distinct. We then
vary the p
i
one by one and see what happens.
Writing ¯p = (p
1
, . . . , p
m
), we define
f(p
1
, . . . , p
m
) = P
¯p
(A),
Now
f
is the sum of the probability of all configurations of
{e
1
, . . . , e m}
for
which
A
happens, and is hence a finite sum of polynomials. So in particular, it
is differentaible.
We now couple all percolation process. Let (
X
(
e
) :
e L
d
) be iid
U
[0
,
1]
random variables. For a vector ¯p = (p(e) : e L
d
), we write
η
¯p
(e) = 1(X(e) p(e)).
Then we have P
¯p
(A) = P(η
¯p
A).
Fix an edge
f
and let
¯p
0
= (
p
0
(
e
)) be such that
p
0
(
e
) =
p
(
e
) for all
e 6
=
f
, and
p
0
(f) = p(f) + δ for some δ > 0. Then
P
¯p
0
(A) P
¯p
(A) = P(η
¯p
0
A) P(η
¯p
A)
= P(η
¯p
0
A, η
¯p
A) + P(η
¯p
0
A, η
¯p
A) P(η
¯p
A).
But we know
A
is increasing, so
P
(
η
¯p
0
A, η
¯p
A
) =
P
(
η
¯p
A
). So the first
and last terms cancel, and we have
P
¯p
0
(A) P
¯p
(A) = P(η
¯p
0
A, η
¯p
6∈ A).
But we observe that we simply have
P(η
¯p
0
A, η
¯p
6∈ A) = δ · P
¯p
(f is pivotal for A).
Indeed, we by definition of pivotal edges, we have
P(η
¯p
0
A, η
¯p
6∈ A) = P
¯p
(f is pivotal for A, p(f ) < X(f ) p(f ) + δ).
Since the event
{f is pivotal for A}
is independent of the state of the edge
f
,
we obtain
P
¯p
(f is pivotal, p(f ) < X(f ) p(f ) + δ) = P
¯p
(f is pivotal) · δ.
Therefore we have
p(f)
P
¯p
(A) = lim
δ0
P
¯p
0
(A) P
¯p
(A)
δ
= P
¯p
(f is pivotal for A).
The desired result then follows from the chain rule:
d
dp
P
p
(A) =
m
X
i=1
p(e
i
)
P
¯p
(A)
¯p=(p,...,p)
=
m
X
i=1
P
p
(e
i
is pivotal for A)
= E
p
[N(A)].
If
A
depends on an infinite number of edges, then the best we can say is that
lim inf
δ0
P
p+δ
(A) P
p
(A)
δ
E
p
[N(A))].
To see this, again set B(n) = [n, n]
d
Z
d
. Define ¯p
n
by
¯p
n
(e) =
(
p e 6∈ B(n)
p + δ e B(n)
.
Then since a is increasing, we know
P
p+δ
(A) P
p
(A)
δ
P
¯p
n
(A) P
p
(A)
δ
.
We can then apply the previous claim, take successive differences, and take
n .
Corollary.
Let
A
be an increasing event that depends on
m
edges. Let
p q
[0, 1]. Then P
q
(A) P
p
(A)
q
p
m
.
Proof.
We know that
{f is pivotal for A}
is independent of the state of
f
, and
so
P
p
(ω(f) = 1, f is pivotal for A) = pP
p
(f is pivotal for A).
But since
A
is increasing, if
ω
(
f
) = 1 and
f
is pivotal for
A
, then
A
occurs.
Conversely, if f is pivotal and A occurs, then ω(f) = 1.
Thus, by Russo’s formula, we have
d
dp
P
p
(A) = E
p
[N(A)]
=
X
e
P
p
(e is pivotal for A)
=
X
e
1
p
P
p
(ω(e) = 1, e is pivotal for A)
=
X
e
1
p
P
p
(e is pivotal | A)P
p
(A)
= P
p
(A)
1
p
E
p
[N(A) | A].
So we have
d
dp
P
p
(A)
P
p
(A)
=
1
p
E
p
[N(A) | A].
Integrating, we find that
log
P
q
(A)
P
p
(A)
=
Z
q
p
1
u
E
u
[N(A) | A] du.
Bounding E
u
[N(A) | A] m, we obtain the desired bound.
With Russo’s formula, we can now prove the desired theorem.
Theorem. Let d 2 and B
n
= [n, n]
d
Z
d
.
(i)
If
p < p
c
, then there exists a positive constant
c
for all
n
1,
P
p
(0
B
n
) e
cn
.
(ii) If p > p
c
, then
θ(p) = P
p
(0 )
p p
c
p(1 p
c
)
.
This was first proved by Aizenman and Barsky who looked at the more
general framework of long-range percolation. Menshikov gave an alternative
proof by analyzing the geometry of pivotal edges. The proof we will see is by
Duminil-Copin and Tassion. Recall that we defined
χ(p) = E
p
[|C(0)|].
We saw that if
χ
(
p
)
<
, then
P
p
(0
B
n
)
e
cn
. We now see that
χ
(
p
)
<
iff p < p
c
.
The strategy of the proof is to define a new percolation probability
˜p
c
, whose
definition makes it easier to prove the theorem. Once we have established the
two claims, we see that (i) forces ˜p
c
p
c
, and (ii) forces ˜p
c
p
c
. So they must
be equal.
Proof (Duminil-Copin and Tassion). If S V is finite, we write
S = {(x, y) E : x S, y 6∈ S}.
We write
x
S
y
if there exists an open path of edges from
x
to
y
all of whose
end points lie in S.
Now suppose that 0 S. We define
ϕ
p
(S) = p
X
(x,y)∂S
P
p
(0
S
x).
Define
˜p
c
= sup{p [0, 1] : exists a finite set S with 0 S and ϕ
p
(S) < 1}.
Claim. It suffices to prove (i) and (ii) with p
c
replaced by ˜p
c
.
Indeed, from (i), if
p < ˜p
c
, then
P
p
(0
B
n
)
e
cn
. So taking the limit
n
, we see
θ
(
p
) = 0. So
˜p
c
p
c
. From (ii), if
p > ˜p
c
, then
θ
(
p
)
>
0. So
p
c
˜p
c
. So p
c
= ˜p
c
.
We now prove (i) and (ii):
(i)
Let
p < ˜p
c
. Then there exists a finite set
S
containing 0 with
ϕ
p
(
S
)
<
1.
Since
S
is finite, we can pick
L
large enough so that
S B
L1
. We will
prove that P
p
(0 B
kL
) (ϕ
p
(S))
k1
for k 1.
Define C = {x S : 0
S
x}. Since S B
L1
, we know S B
kL
= .
Now if we have an open path from 0 to
B
kL
, we let
x
be the last element
on the path that lies in
C
. We can then replace the path up to
x
by a path
that lies entirely in
S
, by assumption. This is then a path that lies in
C
up
to
x
, then takes an edge on
S
, and then lies entirely outside of
C
c
. Thus,
P
p
(0 B
kL
)
X
AS
0A
X
(x,y)∂A
P
p
(0
A
x, (x, y) open, C = A, y
A
C
B
kL
).
Now observe that the events
{C
=
A,
0
S
x}
,
{
(
x, y
)
is open}
and
{y
A
c
B
kL
} are independent. So we obtain
P
p
(0 B
kL
)
X
AS,0A
X
(x,y)∂S
p P
p
(0
S
x, C = A) P
p
(y
A
c
B
kL
).
Since we know that y B
L
, we can bound
P
p
(y
A
c
B
kL
) P
p
(0 B
(k1)L
).
So we have
P
p
(0 B
kL
) p P
p
(0 B
(k1)L
)
X
AS,0A
X
(x,y)∂S
P
p
(0
S
x, C = A)
= P
p
(0 B
(k1)L
) p
X
(x,y)∂S
P
p
(0
S
x)
= P
p
(0 B
(k1)L
)ϕ
p
(S).
Iterating, we obtain the deseired result.
(ii) We want to use Russo’s formula. We claim that it suffices to prove that
d
dp
P
p
(0 B
n
)
1
p(1 p)
inf
SB
n
,0S
ϕ
p
(S)(1 P
p
(0 B
n
)).
Indeed, if
p > ˜p
c
, we integrate from
˜p
c
to
p
, use in this range
ϕ
p
(
S
)
1,
and then take the limit as n .
The event
{
0
B
n
}
is increasing and only dependson a finite number of
edges. So we can apply Russo’s formula
d
dp
P
p
(0 B
n
) =
X
eB
n
P
p
(e is pivotal for {0 B
n
})
Since being pivotal and being open/closed are independent, we can write
this as
=
X
eB
n
1
1 p
P
p
(e is pivotal for {0 B
n
}, e is closed)
=
X
eB
n
1
1 p
P
p
(e is pivotal for {0 B
n
}, 0 6↔ B
n
)
Define S = {x B
n
: x 6↔ B
n
}. Then {0 6↔ B
n
} implies 0 S. So
d
dp
P
p
(0 B
n
) =
1
1 p
X
eB
n
X
AB
n
,0A
P
p
(e is pivotal, S = A)
Given that
S
=
A
, an edge
e
= (
x, y
) is pivotal iff
e A
and 0
A
x
. So
we know
d
dp
P
p
(0 B
n
) =
1
1 p
X
AB
n
,0A
X
(x,y)∂A
P
p
(0
A
x, S = A).
Observe that
{
0
A
x}
and
{S
=
A}
are independent, since to determine if
S
=
A
, we only look at the edges on the boundary of
A
. So the above is
equal to
1
1 p
X
AB
n
,0A
X
(x,y)∂A
P
p
(0
A
x)P
p
(S = A)
=
1
p(1 p)
X
AB
n
,0A
ϕ
p
(A)P
p
(S = A)
1
p(1 p)
inf
SB
n
,0S
ϕ
P
(S)P
p
(0 6↔ B
n
),
as desired.
P
p
(0
B
n
)
e
cn
is the best convergence rate when
p < p
c
, but we cannot do better, since P
p
(0 B
n
) p
n
.
Also, if p < p
c
, then we can easily bound
P
p
(|C(0)| n) P
p
(0 B
n
1/d
) exp(cn
1/d
).
However, this is not a good bound. In fact,
n
1/d
can be replaced by
n
, but we
will not prove it here. This tells us the largest cluster in
B
n
will have size of
order log n with high probability.