5.4 Donsker’s invariance principle
To end our discussion of Brownian motion, we provide an alternative construction of Brownian motion, given by Donsker's invariance principle. Suppose we run a simple random walk on $\mathbb{Z}^d$. We can think of this as a very coarse approximation of a Brownian motion. As we zoom out, the step sizes of the simple random walk look smaller and smaller, and if we zoom out sufficiently far, then we might expect the result to look like Brownian motion, and indeed it converges to Brownian motion in the limit.
Theorem (Donsker's invariance principle). Let $(X_n)_{n \geq 1}$ be iid random variables with mean $0$ and variance $1$, and set $S_n = X_1 + \cdots + X_n$. Define
$$S_t = (1 - \{t\}) S_{\lfloor t \rfloor} + \{t\} S_{\lfloor t \rfloor + 1},$$
where $\{t\} = t - \lfloor t \rfloor$. Define
$$(S^{[N]}_t)_{t \in [0, 1]} = (N^{-1/2} S_{tN})_{t \in [0, 1]}.$$
Then $(S^{[N]}_t)_{t \in [0, 1]}$ converges in distribution to the law of standard Brownian motion on $[0, 1]$.
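Numerically, the convergence is easy to observe. The following is a minimal simulation sketch (ours, not part of the notes; the helper rescaled_walk and the choice of Rademacher steps are illustrative assumptions): it samples the rescaled walk $S^{[N]}$ and compares the law of $\sup_{t \leq 1} S^{[N]}_t$ with that of $\sup_{t \leq 1} B_t$, which by the reflection principle is the law of $|B_1|$.

    import numpy as np

    def rescaled_walk(N, rng):
        """Return S^{[N]} evaluated at the grid points k/N, k = 0, ..., N.

        The interpolated path is piecewise linear, so its supremum over
        [0, 1] is attained at these grid points.
        """
        steps = rng.choice([-1.0, 1.0], size=N)  # mean 0, variance 1
        S = np.concatenate([[0.0], np.cumsum(steps)])
        return S / np.sqrt(N)

    rng = np.random.default_rng(0)
    N, samples = 10_000, 2_000
    maxima = np.array([rescaled_walk(N, rng).max() for _ in range(samples)])
    # P(sup_{t <= 1} B_t > 1) = P(|B_1| > 1) ~ 0.3173 by the reflection principle.
    print("P(max S^[N] > 1) ~", np.mean(maxima > 1.0))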
The reader might wonder why we didn’t construct our Brownian motion
this way instead of using Wiener’s theorem. The answer is that our proof of
Donsker’s invariance principle relies on the existence of a Brownian motion! The
key ingredient is the following theorem:
Theorem (Skorokhod embedding theorem). Let $\mu$ be a probability measure on $\mathbb{R}$ with mean $0$ and variance $\sigma^2$. Then there exists a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ with a filtration $(\mathcal{F}_t)_{t \geq 0}$ on which there is a standard Brownian motion $(B_t)_{t \geq 0}$ and a sequence of stopping times $(T_n)_{n \geq 0}$ such that, setting $S_n = B_{T_n}$,
(i) $(T_n)_{n \geq 0}$ is a random walk with steps of mean $\sigma^2$;
(ii) $(S_n)_{n \geq 0}$ is a random walk with step distribution $\mu$.
So in some sense, Brownian motion contains all random walks with finite
variance.
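For example (a sanity check of ours, not part of the original argument): in the simplest case $\mu = \frac{1}{2}(\delta_{-1} + \delta_1)$, so that $\sigma^2 = 1$, we can take $T_{n+1}$ to be the first time after $T_n$ at which $B$ has moved by $1$ in either direction from $B_{T_n}$. By symmetry, the steps $B_{T_{n+1}} - B_{T_n}$ are uniform on $\{\pm 1\}$, and the lemma below (with $x = y = 1$) shows that the steps of $(T_n)$ have mean $1 = \sigma^2$. The content of the theorem is that this can be achieved for an arbitrary $\mu$ with mean $0$ and variance $\sigma^2$.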
The only stopping times we know about are the hitting times of some value. However, if we take $T_n$ to be the hitting time of some fixed value, then $B_{T_n}$ would be a pretty poor attempt at constructing a random walk. Thus, we may try to come up with the following strategy: construct a probability space with a Brownian motion $(B_t)_{t \geq 0}$, and an independent iid sequence $(X_n)_{n \in \mathbb{N}}$ of random variables with distribution $\mu$. We then take $T_n$ to be the first hitting time of $X_1 + \cdots + X_n$. Then, setting $S_n = B_{T_n}$, property (ii) is satisfied by definition. However, (i) will not be satisfied in general. In fact, for any $y \neq 0$, the expected first hitting time of $y$ is infinite! The problem is that if, say, $y > 0$ and we accidentally stray off to the negative side, then it can take a long time to return.
The solution is to “split” $\mu$ into two parts, and construct two random variables $(X, Y) \in [0, \infty)^2$, such that if $T$ is the first hitting time of $\{-X, Y\}$, then $B_T$ has law $\mu$.
Since we are interested in the stopping times $T_{-x}$ and $T_y$, the following computation will come in handy:

Lemma. Let $x, y > 0$. Then
$$\mathbb{P}_0(T_{-x} < T_y) = \frac{y}{x + y}, \quad \mathbb{E}_0[T_{-x} \wedge T_y] = xy.$$

Proof sketch. Use optional stopping with $(B_t)_{t \geq 0}$ for the first identity, and with $(B_t^2 - t)_{t \geq 0}$ for the second.
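In more detail (a routine verification; one should also justify optional stopping, e.g. by stopping at the bounded times $T \wedge t$ and letting $t \to \infty$): write $T = T_{-x} \wedge T_y$ and $p = \mathbb{P}_0(T_{-x} < T_y)$. Then
$$0 = \mathbb{E}_0[B_T] = -xp + y(1 - p) \quad\Longrightarrow\quad p = \frac{y}{x + y},$$
$$\mathbb{E}_0[T] = \mathbb{E}_0[B_T^2] = x^2 \cdot \frac{y}{x + y} + y^2 \cdot \frac{x}{x + y} = xy.$$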
Proof of Skorokhod embedding theorem. Define Borel measures $\mu^{\pm}$ on $[0, \infty)$ by
$$\mu^{\pm}(A) = \mu(\pm A).$$
Note that these are not probability measures, but we can define a probability measure $\nu$ on $[0, \infty)^2$ given by
$$\mathrm{d}\nu(x, y) = C(x + y) \,\mathrm{d}\mu^-(x) \,\mathrm{d}\mu^+(y)$$
for some normalizing constant $C$ (this is possible since $\mu$ is integrable). This $(x + y)$ is the same $(x + y)$ appearing in the denominator of $\mathbb{P}_0(T_{-x} < T_y) = \frac{y}{x + y}$.
Then we claim that any $(X, Y)$ with this distribution will do the job.
We first figure out the value of $C$. Note that since $\mu$ has mean $0$, we have
$$C \int_{[0,\infty)} x \,\mathrm{d}\mu^-(x) = C \int_{[0,\infty)} y \,\mathrm{d}\mu^+(y).$$
Thus, we have
$$\begin{aligned}
1 &= \int C(x + y) \,\mathrm{d}\mu^-(x) \,\mathrm{d}\mu^+(y)\\
&= C \int x \,\mathrm{d}\mu^-(x) \int \mathrm{d}\mu^+(y) + C \int y \,\mathrm{d}\mu^+(y) \int \mathrm{d}\mu^-(x)\\
&= C \int x \,\mathrm{d}\mu^-(x) \left( \int \mathrm{d}\mu^+(y) + \int \mathrm{d}\mu^-(x) \right)\\
&= C \int x \,\mathrm{d}\mu^-(x) = C \int y \,\mathrm{d}\mu^+(y).
\end{aligned}$$
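In particular, $C^{-1} = \int_{[0,\infty)} x \,\mathrm{d}\mu^-(x) = \frac{1}{2} \int_{\mathbb{R}} |x| \,\mathrm{d}\mu(x)$. As a quick sanity check (continuing our Rademacher example above): if $\mu = \frac{1}{2}(\delta_{-1} + \delta_1)$, then $\mu^- = \mu^+ = \frac{1}{2}\delta_1$, so $C = 2$ and $\nu = \delta_{(1,1)}$, i.e. $(X, Y) = (1, 1)$ almost surely, and the stopping time constructed below is just the first hitting time of $\{-1, 1\}$.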
We now set up our notation. Take a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ with a standard Brownian motion $(B_t)_{t \geq 0}$ and a sequence $((X_n, Y_n))_{n \geq 0}$, iid with distribution $\nu$ and independent of $(B_t)_{t \geq 0}$.
Define
$$\mathcal{F}_0 = \sigma((X_n, Y_n),\ n = 1, 2, \ldots), \quad \mathcal{F}_t = \sigma(\mathcal{F}_0, \mathcal{F}^B_t).$$
Define a sequence of stopping times
$$T_0 = 0, \quad T_{n+1} = \inf\{t \geq T_n : B_t - B_{T_n} \in \{-X_{n+1}, Y_{n+1}\}\}.$$
By the strong Markov property, it suffices to prove that things work in the case $n = 1$. So for convenience, let $T = T_1$, $X = X_1$, $Y = Y_1$.
To simplify notation, let $\tau : C([0, \infty), \mathbb{R}) \times [0, \infty)^2 \to [0, \infty)$ be given by
$$\tau(\omega, x, y) = \inf\{t \geq 0 : \omega(t) \in \{-x, y\}\}.$$
Then we have
$$T = \tau((B_t)_{t \geq 0}, X, Y).$$
To check that this works, i.e. that (ii) holds: if $A \subseteq [0, \infty)$, then
$$\mathbb{P}(B_T \in A) = \int_{[0,\infty)^2} \int_{C([0,\infty),\mathbb{R})} \mathbf{1}_{\omega(\tau(\omega, x, y)) \in A} \,\mathrm{d}\mu_B(\omega) \,\mathrm{d}\nu(x, y),$$
where $\mu_B$ denotes the law of standard Brownian motion on path space. Using the first part of the previous computation, this is given by
$$\int_{[0,\infty)^2} \frac{x}{x + y} \mathbf{1}_{y \in A} \, C(x + y) \,\mathrm{d}\mu^-(x) \,\mathrm{d}\mu^+(y) = \mu^+(A).$$
We can prove a similar result if $A \subseteq (-\infty, 0)$. So $B_T$ has the right law.
To see that $T$ is also well-behaved, we compute
$$\begin{aligned}
\mathbb{E} T &= \int_{[0,\infty)^2} \int_{C([0,\infty),\mathbb{R})} \tau(\omega, x, y) \,\mathrm{d}\mu_B(\omega) \,\mathrm{d}\nu(x, y)\\
&= \int_{[0,\infty)^2} xy \,\mathrm{d}\nu(x, y)\\
&= C \int_{[0,\infty)^2} (x^2 y + y^2 x) \,\mathrm{d}\mu^-(x) \,\mathrm{d}\mu^+(y)\\
&= \int_{[0,\infty)} x^2 \,\mathrm{d}\mu^-(x) + \int_{[0,\infty)} y^2 \,\mathrm{d}\mu^+(y)\\
&= \sigma^2.
\end{aligned}$$
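The construction can also be checked by simulation. The sketch below is ours and purely illustrative: Brownian motion is discretised with time step dt, so the hitting times and levels carry a small discretisation bias, and the target law $\mu$ (uniform on $\{-2, -1, 1, 2\}$, so $\sigma^2 = 5/2$) is an arbitrary choice. It samples $(X, Y) \sim \nu$ and verifies the law of $B_T$ and the mean of $T$ empirically.

    import numpy as np

    rng = np.random.default_rng(1)

    # Target law: mu = uniform on {-2, -1, 1, 2}; mean 0, variance sigma^2 = 2.5.
    # Then mu^- and mu^+ each put mass 1/4 on {1, 2}, and
    # nu{(x, y)} = C (x + y) / 16 with C = 4/3.
    pairs = np.array([(1, 1), (1, 2), (2, 1), (2, 2)], dtype=float)
    probs = (4 / 3) * (pairs[:, 0] + pairs[:, 1]) / 16
    probs /= probs.sum()  # guard against floating-point round-off

    def embed_once(dt=1e-3, chunk=4096):
        """Sample (X, Y) ~ nu, then run a discretised BM until it exits (-X, Y)."""
        x, y = pairs[rng.choice(4, p=probs)]
        b, t = 0.0, 0.0
        while True:
            path = b + np.cumsum(np.sqrt(dt) * rng.standard_normal(chunk))
            hits = np.nonzero((path <= -x) | (path >= y))[0]
            if hits.size:
                k = hits[0]
                # Snap to the barrier that was crossed (the discrete walk overshoots).
                return (y if path[k] >= y else -x), t + (k + 1) * dt
            b, t = path[-1], t + chunk * dt

    samples = [embed_once() for _ in range(2000)]
    bT = np.array([s[0] for s in samples])
    T = np.array([s[1] for s in samples])
    for v in (-2, -1, 1, 2):
        print(f"P(B_T = {v:+d}) ~ {np.mean(bT == v):.3f}   (target 0.250)")
    print(f"E[T] ~ {T.mean():.2f}   (target sigma^2 = 2.5)")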
The idea of the proof of Donsker's invariance principle is that in the limit of large $N$, the $T_n$ are roughly regularly spaced, by the law of large numbers, so this allows us to reverse the above and use the random walk to approximate the Brownian motion.
Proof of Donsker's invariance principle. Let $(B_t)_{t \geq 0}$ be a standard Brownian motion. Then by Brownian scaling,
$$(B^{(N)}_t)_{t \geq 0} = (N^{1/2} B_{t/N})_{t \geq 0}$$
is a standard Brownian motion.
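Indeed (a one-line check): $N^{1/2} B_{t/N}$ is a continuous centred Gaussian process with covariance $N \min(s/N, t/N) = \min(s, t)$, which characterises standard Brownian motion.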
For every $N > 0$, we let $(T^{(N)}_n)_{n \geq 0}$ be a sequence of stopping times as in the embedding theorem for $B^{(N)}$. We then set
$$S^{(N)}_n = B^{(N)}_{T^{(N)}_n}.$$
For $t$ not an integer, define $S^{(N)}_t$ by linear interpolation. Observe that
$$((T^{(N)}_n)_{n \geq 0}, (S^{(N)}_t)_{t \geq 0}) \sim ((T^{(1)}_n)_{n \geq 0}, (S^{(1)}_t)_{t \geq 0}),$$
where $\sim$ denotes equality in law.
We define
$$\tilde{S}^{(N)}_t = N^{-1/2} S^{(N)}_{tN}, \quad \tilde{T}^{(N)}_n = \frac{T^{(N)}_n}{N}.$$
Note that if $t = \frac{n}{N}$, then
$$\tilde{S}^{(N)}_{n/N} = N^{-1/2} S^{(N)}_n = N^{-1/2} B^{(N)}_{T^{(N)}_n} = B_{T^{(N)}_n / N} = B_{\tilde{T}^{(N)}_n}. \tag{$*$}$$
Note that $(\tilde{S}^{(N)}_t)_{t \geq 0} \sim (S^{[N]}_t)_{t \geq 0}$. We will prove that we have convergence in probability, i.e. for any $\delta > 0$,
$$\mathbb{P}\left( \sup_{0 \leq t < 1} |\tilde{S}^{(N)}_t - B_t| > \delta \right) = \mathbb{P}(\|\tilde{S}^{(N)} - B\|_\infty > \delta) \to 0 \quad \text{as } N \to \infty.$$
We already know that $\tilde{S}$ and $B$ agree at some times, but the time on $\tilde{S}$ is fixed while that on $B$ is random. So what we want to apply is the law of large numbers.
By the strong law of large numbers,
$$\frac{1}{n} |T^{(1)}_n - n| \to 0 \quad \text{as } n \to \infty.$$
This implies that
$$\sup_{1 \leq n \leq N} \frac{1}{N} |T^{(1)}_n - n| \to 0 \quad \text{as } N \to \infty.$$
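Indeed, given $\varepsilon > 0$, almost surely there is $M$ such that $|T^{(1)}_n - n| \leq \varepsilon n$ for all $n \geq M$, and then for $N \geq M$,
$$\sup_{1 \leq n \leq N} \frac{1}{N} |T^{(1)}_n - n| \leq \frac{1}{N} \max_{n < M} |T^{(1)}_n - n| + \varepsilon,$$
where the first term tends to $0$ as $N \to \infty$.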
Since $(T^{(1)}_n)_{n \geq 0} \sim (T^{(N)}_n)_{n \geq 0}$, it follows that for any $\delta > 0$,
$$\mathbb{P}\left( \sup_{1 \leq n \leq N} \left| \frac{T^{(N)}_n}{N} - \frac{n}{N} \right| \geq \delta \right) \to 0 \quad \text{as } N \to \infty.$$
Using $(*)$ and continuity, for any $t \in [\frac{n}{N}, \frac{n+1}{N}]$, there exists $u \in [\tilde{T}^{(N)}_n, \tilde{T}^{(N)}_{n+1}]$ such that $\tilde{S}^{(N)}_t = B_u$.
Note that if the times are approximated well up to $\delta$, then $|t - u| \leq \delta + \frac{1}{N}$.
Hence we have
$$\{\|\tilde{S}^{(N)} - B\|_\infty > \varepsilon\} \subseteq \left\{ \left|\tilde{T}^{(N)}_n - \frac{n}{N}\right| > \delta \text{ for some } n \leq N \right\} \cup \left\{ |B_t - B_u| > \varepsilon \text{ for some } t \in [0, 1],\ |t - u| < \delta + \frac{1}{N} \right\}.$$
The probability of the first event tends to $0$ as $N \to \infty$. For the second, we observe that $(B_t)_{t \in [0,1]}$ has uniformly continuous paths, so for any $\varepsilon > 0$, we can find $\delta > 0$ such that the probability of the second event is less than $\varepsilon$ whenever $N > \frac{1}{\delta}$ (exercise!).
So $\tilde{S}^{(N)} \to B$ uniformly in probability. Since $(\tilde{S}^{(N)}_t)_{t \in [0,1]} \sim (S^{[N]}_t)_{t \in [0,1]}$, it follows that $S^{[N]}$ converges uniformly in distribution to standard Brownian motion on $[0, 1]$.