5.4 Donsker’s invariance principle
To end our discussion of Brownian motion, we provide an alternative construction of Brownian motion, given by Donsker's invariance principle. Suppose we run a simple random walk on $\mathbb{Z}^d$. We can think of this as a very coarse approximation of a Brownian motion. As we zoom out, the step sizes of the simple random walk look smaller and smaller, and if we zoom out sufficiently far, then we might expect the result to look like Brownian motion, and indeed it converges to Brownian motion in the limit.
Theorem (Donsker's invariance principle). Let $(X_n)_{n \geq 1}$ be iid random variables with mean $0$ and variance $1$, and set $S_n = X_1 + \cdots + X_n$. Define
$$S_t = (1 - \{t\}) S_{\lfloor t \rfloor} + \{t\} S_{\lfloor t \rfloor + 1},$$
where $\{t\} = t - \lfloor t \rfloor$. Define
$$(S^{[N]}_t)_{t \in [0, 1]} = (N^{-1/2} S_{tN})_{t \in [0, 1]}.$$
Then $(S^{[N]}_t)_{t \in [0, 1]}$ converges in distribution to the law of standard Brownian motion on $[0, 1]$.
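Numerically, the convergence is easy to observe. The following is a minimal simulation sketch (ours, not part of the notes; the helper rescaled_walk and the choice of Rademacher steps are illustrative assumptions): it samples the rescaled walk $S^{[N]}$ and compares the law of $\sup_{t \leq 1} S^{[N]}_t$ with that of $\sup_{t \leq 1} B_t$, which by the reflection principle is the law of $|B_1|$.

    import numpy as np

    def rescaled_walk(N, rng):
        """Return S^{[N]} evaluated at the grid points k/N, k = 0, ..., N.

        The interpolated path is piecewise linear, so its supremum over
        [0, 1] is attained at these grid points.
        """
        steps = rng.choice([-1.0, 1.0], size=N)  # mean 0, variance 1
        S = np.concatenate([[0.0], np.cumsum(steps)])
        return S / np.sqrt(N)

    rng = np.random.default_rng(0)
    N, samples = 10_000, 2_000
    maxima = np.array([rescaled_walk(N, rng).max() for _ in range(samples)])
    # P(sup_{t <= 1} B_t > 1) = P(|B_1| > 1) ~ 0.3173 by the reflection principle.
    print("P(max S^[N] > 1) ~", np.mean(maxima > 1.0))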
The reader might wonder why we didn’t construct our Brownian motion
this way instead of using Wiener’s theorem. The answer is that our proof of
Donsker’s invariance principle relies on the existence of a Brownian motion! The
key ingredient is the following theorem:
Theorem (Skorokhod embedding theorem). Let $\mu$ be a probability measure on $\mathbb{R}$ with mean $0$ and variance $\sigma^2$. Then there exists a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ with a filtration $(\mathcal{F}_t)_{t \geq 0}$ on which there is a standard Brownian motion $(B_t)_{t \geq 0}$ and a sequence of stopping times $(T_n)_{n \geq 0}$ such that, setting $S_n = B_{T_n}$,
(i) $(T_n)_{n \geq 0}$ is a random walk with steps of mean $\sigma^2$;
(ii) $(S_n)_{n \geq 0}$ is a random walk with step distribution $\mu$.
So in some sense, Brownian motion contains all random walks with finite
variance.
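For example (a sanity check of ours, not part of the original argument): in the simplest case $\mu = \frac{1}{2}(\delta_{-1} + \delta_1)$, so that $\sigma^2 = 1$, we can take $T_{n+1}$ to be the first time after $T_n$ at which $B$ has moved by $1$ in either direction from $B_{T_n}$. By symmetry, the steps $B_{T_{n+1}} - B_{T_n}$ are uniform on $\{\pm 1\}$, and the lemma below (with $x = y = 1$) shows that the steps of $(T_n)$ have mean $1 = \sigma^2$. The content of the theorem is that this can be achieved for an arbitrary $\mu$ with mean $0$ and variance $\sigma^2$.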
The only stopping times we know about are the hitting times of some value. However, if we take $T_n$ to be the hitting time of some fixed value, then $B_{T_n}$ would be a pretty poor attempt at constructing a random walk. Thus, we may try to come up with the following strategy: construct a probability space with a Brownian motion $(B_t)_{t \geq 0}$, and an independent iid sequence $(X_n)_{n \in \mathbb{N}}$ of random variables with distribution $\mu$. We then take $T_n$ to be the first hitting time of $X_1 + \cdots + X_n$. Then, setting $S_n = B_{T_n}$, property (ii) is satisfied by definition. However, (i) will not be satisfied in general. In fact, for any $y \neq 0$, the expected first hitting time of $y$ is infinite! The problem is that if, say, $y > 0$ and we accidentally stray off to the negative side, then it can take a long time to return.
The solution is to “split” $\mu$ into two parts, and construct two random variables $(X, Y) \in [0, \infty)^2$, such that if $T$ is the first hitting time of $\{-X, Y\}$, then $B_T$ has law $\mu$.
Since we are interested in the stopping times $T_{-x}$ and $T_y$, the following computation will come in handy:

Lemma. Let $x, y > 0$. Then
$$\mathbb{P}_0(T_{-x} < T_y) = \frac{y}{x + y}, \quad \mathbb{E}_0[T_{-x} \wedge T_y] = xy.$$

Proof sketch. Use optional stopping with $(B_t)_{t \geq 0}$ for the first identity, and with $(B_t^2 - t)_{t \geq 0}$ for the second.
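In more detail (a routine verification; one should also justify optional stopping, e.g. by stopping at the bounded times $T \wedge t$ and letting $t \to \infty$): write $T = T_{-x} \wedge T_y$ and $p = \mathbb{P}_0(T_{-x} < T_y)$. Then
$$0 = \mathbb{E}_0[B_T] = -xp + y(1 - p) \quad\Longrightarrow\quad p = \frac{y}{x + y},$$
$$\mathbb{E}_0[T] = \mathbb{E}_0[B_T^2] = x^2 \cdot \frac{y}{x + y} + y^2 \cdot \frac{x}{x + y} = xy.$$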
Proof of Skorokhod embedding theorem. Define Borel measures $\mu^{\pm}$ on $[0, \infty)$ by
$$\mu^{\pm}(A) = \mu(\pm A).$$
Note that these are not probability measures, but we can define a probability measure $\nu$ on $[0, \infty)^2$ given by
$$\mathrm{d}\nu(x, y) = C(x + y) \,\mathrm{d}\mu^-(x) \,\mathrm{d}\mu^+(y)$$
for some normalizing constant $C$ (this is possible since $\mu$ is integrable). This $(x + y)$ is the same $(x + y)$ appearing in the denominator of $\mathbb{P}_0(T_{-x} < T_y) = \frac{y}{x + y}$.
Then we claim that any $(X, Y)$ with this distribution will do the job.
We first figure out the value of $C$. Note that since $\mu$ has mean $0$, we have
$$C \int_{[0,\infty)} x \,\mathrm{d}\mu^-(x) = C \int_{[0,\infty)} y \,\mathrm{d}\mu^+(y).$$
Thus, we have
$$\begin{aligned}
1 &= \int C(x + y) \,\mathrm{d}\mu^-(x) \,\mathrm{d}\mu^+(y)\\
&= C \int x \,\mathrm{d}\mu^-(x) \int \mathrm{d}\mu^+(y) + C \int y \,\mathrm{d}\mu^+(y) \int \mathrm{d}\mu^-(x)\\
&= C \int x \,\mathrm{d}\mu^-(x) \left( \int \mathrm{d}\mu^+(y) + \int \mathrm{d}\mu^-(x) \right)\\
&= C \int x \,\mathrm{d}\mu^-(x) = C \int y \,\mathrm{d}\mu^+(y).
\end{aligned}$$
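In particular, $C^{-1} = \int_{[0,\infty)} x \,\mathrm{d}\mu^-(x) = \frac{1}{2} \int_{\mathbb{R}} |x| \,\mathrm{d}\mu(x)$. As a quick sanity check (continuing our Rademacher example above): if $\mu = \frac{1}{2}(\delta_{-1} + \delta_1)$, then $\mu^- = \mu^+ = \frac{1}{2}\delta_1$, so $C = 2$ and $\nu = \delta_{(1,1)}$, i.e. $(X, Y) = (1, 1)$ almost surely, and the stopping time constructed below is just the first hitting time of $\{-1, 1\}$.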
We now set up our notation. Take a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ with a standard Brownian motion $(B_t)_{t \geq 0}$ and a sequence $((X_n, Y_n))_{n \geq 0}$, iid with distribution $\nu$ and independent of $(B_t)_{t \geq 0}$.
Define
$$\mathcal{F}_0 = \sigma((X_n, Y_n),\ n = 1, 2, \ldots), \quad \mathcal{F}_t = \sigma(\mathcal{F}_0, \mathcal{F}^B_t).$$
Define a sequence of stopping times
$$T_0 = 0, \quad T_{n+1} = \inf\{t \geq T_n : B_t - B_{T_n} \in \{-X_{n+1}, Y_{n+1}\}\}.$$
By the strong Markov property, it suffices to prove that things work in the case $n = 1$. So for convenience, let $T = T_1$, $X = X_1$, $Y = Y_1$.
To simplify notation, let $\tau : C([0, \infty), \mathbb{R}) \times [0, \infty)^2 \to [0, \infty)$ be given by
$$\tau(\omega, x, y) = \inf\{t \geq 0 : \omega(t) \in \{-x, y\}\}.$$
Then we have
$$T = \tau((B_t)_{t \geq 0}, X, Y).$$
To check that this works, i.e. that (ii) holds: if $A \subseteq [0, \infty)$, then
$$\mathbb{P}(B_T \in A) = \int_{[0,\infty)^2} \int_{C([0,\infty),\mathbb{R})} \mathbf{1}_{\omega(\tau(\omega, x, y)) \in A} \,\mathrm{d}\mu_B(\omega) \,\mathrm{d}\nu(x, y),$$
where $\mu_B$ denotes the law of standard Brownian motion on path space. Using the first part of the previous computation, this is given by
$$\int_{[0,\infty)^2} \frac{x}{x + y} \mathbf{1}_{y \in A} \, C(x + y) \,\mathrm{d}\mu^-(x) \,\mathrm{d}\mu^+(y) = \mu^+(A).$$
We can prove a similar result if $A \subseteq (-\infty, 0)$. So $B_T$ has the right law.
To see that $T$ is also well-behaved, we compute
$$\begin{aligned}
\mathbb{E} T &= \int_{[0,\infty)^2} \int_{C([0,\infty),\mathbb{R})} \tau(\omega, x, y) \,\mathrm{d}\mu_B(\omega) \,\mathrm{d}\nu(x, y)\\
&= \int_{[0,\infty)^2} xy \,\mathrm{d}\nu(x, y)\\
&= C \int_{[0,\infty)^2} (x^2 y + y^2 x) \,\mathrm{d}\mu^-(x) \,\mathrm{d}\mu^+(y)\\
&= \int_{[0,\infty)} x^2 \,\mathrm{d}\mu^-(x) + \int_{[0,\infty)} y^2 \,\mathrm{d}\mu^+(y)\\
&= \sigma^2.
\end{aligned}$$
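The construction can also be checked by simulation. The sketch below is ours and purely illustrative: Brownian motion is discretised with time step dt, so the hitting times and levels carry a small discretisation bias, and the target law $\mu$ (uniform on $\{-2, -1, 1, 2\}$, so $\sigma^2 = 5/2$) is an arbitrary choice. It samples $(X, Y) \sim \nu$ and verifies the law of $B_T$ and the mean of $T$ empirically.

    import numpy as np

    rng = np.random.default_rng(1)

    # Target law: mu = uniform on {-2, -1, 1, 2}; mean 0, variance sigma^2 = 2.5.
    # Then mu^- and mu^+ each put mass 1/4 on {1, 2}, and
    # nu{(x, y)} = C (x + y) / 16 with C = 4/3.
    pairs = np.array([(1, 1), (1, 2), (2, 1), (2, 2)], dtype=float)
    probs = (4 / 3) * (pairs[:, 0] + pairs[:, 1]) / 16
    probs /= probs.sum()  # guard against floating-point round-off

    def embed_once(dt=1e-3, chunk=4096):
        """Sample (X, Y) ~ nu, then run a discretised BM until it exits (-X, Y)."""
        x, y = pairs[rng.choice(4, p=probs)]
        b, t = 0.0, 0.0
        while True:
            path = b + np.cumsum(np.sqrt(dt) * rng.standard_normal(chunk))
            hits = np.nonzero((path <= -x) | (path >= y))[0]
            if hits.size:
                k = hits[0]
                # Snap to the barrier that was crossed (the discrete walk overshoots).
                return (y if path[k] >= y else -x), t + (k + 1) * dt
            b, t = path[-1], t + chunk * dt

    samples = [embed_once() for _ in range(2000)]
    bT = np.array([s[0] for s in samples])
    T = np.array([s[1] for s in samples])
    for v in (-2, -1, 1, 2):
        print(f"P(B_T = {v:+d}) ~ {np.mean(bT == v):.3f}   (target 0.250)")
    print(f"E[T] ~ {T.mean():.2f}   (target sigma^2 = 2.5)")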
The idea of the proof of Donsker's invariance principle is that in the limit of large $N$, the $T_n$ are roughly regularly spaced, by the law of large numbers, so this allows us to reverse the above and use the random walk to approximate the Brownian motion.
Proof of Donsker's invariance principle. Let $(B_t)_{t \geq 0}$ be a standard Brownian motion. Then by Brownian scaling,
$$(B^{(N)}_t)_{t \geq 0} = (N^{1/2} B_{t/N})_{t \geq 0}$$
is a standard Brownian motion.
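Indeed (a one-line check): $N^{1/2} B_{t/N}$ is a continuous centred Gaussian process with covariance $N \min(s/N, t/N) = \min(s, t)$, which characterises standard Brownian motion.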
For every $N > 0$, we let $(T^{(N)}_n)_{n \geq 0}$ be a sequence of stopping times as in the embedding theorem for $B^{(N)}$. We then set
$$S^{(N)}_n = B^{(N)}_{T^{(N)}_n}.$$
For $t$ not an integer, define $S^{(N)}_t$ by linear interpolation. Observe that
$$((T^{(N)}_n)_{n \geq 0}, (S^{(N)}_t)_{t \geq 0}) \sim ((T^{(1)}_n)_{n \geq 0}, (S^{(1)}_t)_{t \geq 0}),$$
where $\sim$ denotes equality in law.
We define
$$\tilde{S}^{(N)}_t = N^{-1/2} S^{(N)}_{tN}, \quad \tilde{T}^{(N)}_n = \frac{T^{(N)}_n}{N}.$$
Note that if $t = \frac{n}{N}$, then
$$\tilde{S}^{(N)}_{n/N} = N^{-1/2} S^{(N)}_n = N^{-1/2} B^{(N)}_{T^{(N)}_n} = B_{T^{(N)}_n / N} = B_{\tilde{T}^{(N)}_n}. \tag{$*$}$$
Note that $(\tilde{S}^{(N)}_t)_{t \geq 0} \sim (S^{[N]}_t)_{t \geq 0}$. We will prove that we have convergence in probability, i.e. for any $\delta > 0$,
$$\mathbb{P}\left( \sup_{0 \leq t < 1} |\tilde{S}^{(N)}_t - B_t| > \delta \right) = \mathbb{P}(\|\tilde{S}^{(N)} - B\|_\infty > \delta) \to 0 \quad \text{as } N \to \infty.$$
We already know that $\tilde{S}$ and $B$ agree at some times, but the time on $\tilde{S}$ is fixed while that on $B$ is random. So what we want to apply is the law of large numbers.
By the strong law of large numbers,
$$\frac{1}{n} |T^{(1)}_n - n| \to 0 \quad \text{as } n \to \infty.$$
This implies that
$$\sup_{1 \leq n \leq N} \frac{1}{N} |T^{(1)}_n - n| \to 0 \quad \text{as } N \to \infty.$$
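Indeed, given $\varepsilon > 0$, almost surely there is $M$ such that $|T^{(1)}_n - n| \leq \varepsilon n$ for all $n \geq M$, and then for $N \geq M$,
$$\sup_{1 \leq n \leq N} \frac{1}{N} |T^{(1)}_n - n| \leq \frac{1}{N} \max_{n < M} |T^{(1)}_n - n| + \varepsilon,$$
where the first term tends to $0$ as $N \to \infty$.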
Since $(T^{(1)}_n)_{n \geq 0} \sim (T^{(N)}_n)_{n \geq 0}$, it follows that for any $\delta > 0$,
$$\mathbb{P}\left( \sup_{1 \leq n \leq N} \left| \frac{T^{(N)}_n}{N} - \frac{n}{N} \right| \geq \delta \right) \to 0 \quad \text{as } N \to \infty.$$
Using $(*)$ and continuity, for any $t \in [\frac{n}{N}, \frac{n+1}{N}]$, there exists $u \in [\tilde{T}^{(N)}_n, \tilde{T}^{(N)}_{n+1}]$ such that $\tilde{S}^{(N)}_t = B_u$.
Note that if the times are approximated well up to $\delta$, then $|t - u| \leq \delta + \frac{1}{N}$.
Hence we have
$$\{\|\tilde{S}^{(N)} - B\|_\infty > \varepsilon\} \subseteq \left\{ \left|\tilde{T}^{(N)}_n - \frac{n}{N}\right| > \delta \text{ for some } n \leq N \right\} \cup \left\{ |B_t - B_u| > \varepsilon \text{ for some } t \in [0, 1],\ |t - u| < \delta + \frac{1}{N} \right\}.$$
The probability of the first event tends to $0$ as $N \to \infty$. For the second, we observe that $(B_t)_{t \in [0,1]}$ has uniformly continuous paths, so for any $\varepsilon > 0$, we can find $\delta > 0$ such that the probability of the second event is less than $\varepsilon$ whenever $N > \frac{1}{\delta}$ (exercise!).
So $\tilde{S}^{(N)} \to B$ uniformly in probability. Since $(\tilde{S}^{(N)}_t)_{t \in [0,1]} \sim (S^{[N]}_t)_{t \in [0,1]}$, it follows that $S^{[N]}$ converges uniformly in distribution to standard Brownian motion on $[0, 1]$.