5 Fourier transform

II Probability and Measure

5.6 Gaussian random variables

Recall that in the proof of the Fourier inversion theorem, we used these things called Gaussians, but didn't really say much about them. These will be useful later on when we want to prove the central limit theorem, because the central limit theorem says that in the long run, things look like Gaussians. So here we lay out some of the basic definitions and properties of Gaussians.

Definition (Gaussian random variable). Let $X$ be a random variable on $\mathbb{R}$. This is said to be Gaussian if there exist $\mu \in \mathbb{R}$ and $\sigma \in (0, \infty)$ such that the density of $X$ is
\[
  f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right).
\]
A constant random variable $X = \mu$ corresponds to $\sigma = 0$. We say this has mean $\mu$ and variance $\sigma^2$. When this happens, we write $X \sim N(\mu, \sigma^2)$.
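The notes contain no code, but the density above is easy to sanity-check numerically. The following sketch (function names and grid choices are ours, not the notes') verifies that $f_X$ integrates to $1$:

```python
import numpy as np

# Density of N(mu, sigma^2), as in the definition above.
def gaussian_density(x, mu, sigma):
    return np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / np.sqrt(2 * np.pi * sigma ** 2)

# Sanity check: integrate the density over a wide interval; the tail mass
# beyond 10 standard deviations is negligible.
mu, sigma = 1.5, 2.0
x = np.linspace(mu - 10 * sigma, mu + 10 * sigma, 200001)
dx = x[1] - x[0]
total_mass = np.sum(gaussian_density(x, mu, sigma)) * dx  # approximately 1
```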

For completeness, we record some properties of Gaussian random variables.

Proposition. Let $X \sim N(\mu, \sigma^2)$. Then
\[
  \mathbb{E}[X] = \mu, \quad \operatorname{var}(X) = \sigma^2.
\]
Also, for any $a, b \in \mathbb{R}$, we have
\[
  aX + b \sim N(a\mu + b, a^2 \sigma^2).
\]
Lastly, we have
\[
  \varphi_X(u) = e^{i\mu u - u^2 \sigma^2/2}.
\]

Proof. All but the last of these follow from direct calculation, and can be found in IA Probability.

For the last part, if $X \sim N(\mu, \sigma^2)$, then we can write $X = \sigma Z + \mu$, where $Z \sim N(0, 1)$. Recall that we have previously found that the characteristic function of a $N(0, 1)$ random variable is
\[
  \varphi_Z(u) = e^{-|u|^2/2}.
\]
So we have
\[
  \varphi_X(u) = \mathbb{E}[e^{iu(\sigma Z + \mu)}]
  = e^{iu\mu}\, \mathbb{E}[e^{iu\sigma Z}]
  = e^{iu\mu}\, \varphi_Z(u\sigma)
  = e^{iu\mu - u^2\sigma^2/2}.
\]
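As a quick numerical check of the formula just derived (not part of the notes; all names are ours), we can integrate $e^{iux}$ against the $N(\mu, \sigma^2)$ density and compare with the closed form:

```python
import numpy as np

# Numerically compute phi_X(u) = integral of e^{iux} f_X(x) dx and compare
# with the closed form exp(i*mu*u - u^2*sigma^2/2) from the proposition.
mu, sigma, u = 0.7, 1.3, 2.0
x = np.linspace(mu - 12 * sigma, mu + 12 * sigma, 400001)
dx = x[1] - x[0]
density = np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / np.sqrt(2 * np.pi * sigma ** 2)
phi_numeric = np.sum(np.exp(1j * u * x) * density) * dx
phi_closed = np.exp(1j * u * mu - u ** 2 * sigma ** 2 / 2)
# phi_numeric and phi_closed agree to high precision
```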

What we are next going to do is to talk about the corresponding facts for the Gaussian in higher dimensions. Before that, we need to come up with the definition of a higher-dimensional Gaussian distribution. This might be different from the one you've seen previously, because we want to allow some degeneracy in our random variable, e.g. some of the dimensions can be constant.

Definition (Gaussian random variable). Let $X$ be a random variable. We say that $X$ is a Gaussian on $\mathbb{R}^n$ if $(u, X)$ is Gaussian on $\mathbb{R}$ for all $u \in \mathbb{R}^n$.

We are now going to prove a version of our previous theorem for higher-dimensional Gaussians.

Theorem. Let $X$ be Gaussian on $\mathbb{R}^n$, and let $A$ be an $m \times n$ matrix and $b \in \mathbb{R}^m$. Then

(i) $AX + b$ is Gaussian on $\mathbb{R}^m$.

(ii) $X \in L^2$ and its law $\mu_X$ is determined by $\mu = \mathbb{E}[X]$ and $V = \operatorname{var}(X)$, the covariance matrix.

(iii) We have
\[
  \varphi_X(u) = e^{i(u, \mu) - (u, Vu)/2}.
\]

(iv) If $V$ is invertible, then $X$ has a density of
\[
  f_X(x) = (2\pi)^{-n/2} (\det V)^{-1/2} \exp\left(-\frac{1}{2}\left(x - \mu, V^{-1}(x - \mu)\right)\right).
\]

(v) If $X = (X_1, X_2)$ where $X_i \in \mathbb{R}^{n_i}$, then $\operatorname{cov}(X_1, X_2) = 0$ iff $X_1$ and $X_2$ are independent.
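Before the proof, one concrete check of the density in (iv) (a sketch of ours, not from the notes): when $V$ is diagonal, the $n$-dimensional density should factor into a product of one-dimensional Gaussian densities, and we can verify this at a test point:

```python
import numpy as np

# For diagonal V, the multivariate density in (iv) should equal the product
# of the one-dimensional N(mu_i, V_ii) densities. Check at one point.
mu = np.array([1.0, -2.0])
V = np.diag([4.0, 0.25])
x = np.array([0.5, -1.5])

n = len(mu)
quad = (x - mu) @ np.linalg.inv(V) @ (x - mu)
f_multivariate = (2 * np.pi) ** (-n / 2) * np.linalg.det(V) ** (-0.5) * np.exp(-quad / 2)

v = np.diag(V)  # the marginal variances
f_product = np.prod(np.exp(-(x - mu) ** 2 / (2 * v)) / np.sqrt(2 * np.pi * v))
# f_multivariate == f_product up to floating-point error
```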

Proof.

(i) If $u \in \mathbb{R}^m$, then we have
\[
  (AX + b, u) = (AX, u) + (b, u) = (X, A^T u) + (b, u).
\]
Since $(X, A^T u)$ is Gaussian and $(b, u)$ is constant, it follows that $(AX + b, u)$ is Gaussian.

(ii) We know in particular that each component of $X$ is a Gaussian random variable, and Gaussian random variables are in $L^2$. So $X \in L^2$. We will prove the second part of (ii) with (iii).

(iii) If $\mu = \mathbb{E}[X]$ and $V = \operatorname{var}(X)$, then for $u \in \mathbb{R}^n$, we have
\[
  \mathbb{E}[(u, X)] = (u, \mu), \quad \operatorname{var}((u, X)) = (u, Vu).
\]
So we know
\[
  (u, X) \sim N((u, \mu), (u, Vu)).
\]
So it follows that
\[
  \varphi_X(u) = \mathbb{E}[e^{i(u, X)}] = e^{i(u, \mu) - (u, Vu)/2}.
\]
So $\mu$ and $V$ determine the characteristic function of $X$, which in turn determines the law of $X$.

(iv) We start off with a boring Gaussian vector $Y = (Y_1, \cdots, Y_n)$, where the $Y_i \sim N(0, 1)$ are independent. Then the density of $Y$ is
\[
  f_Y(y) = (2\pi)^{-n/2} e^{-|y|^2/2}.
\]
We are now going to construct $X$ from $Y$. We define
\[
  \tilde{X} = V^{1/2} Y + \mu.
\]
This makes sense because $V$ is always non-negative definite. Then $\tilde{X}$ is Gaussian with $\mathbb{E}[\tilde{X}] = \mu$ and $\operatorname{var}(\tilde{X}) = V$. Therefore $X$ has the same distribution as $\tilde{X}$. Since $V$ is assumed to be invertible, we can compute the density of $\tilde{X}$ using the change of variables formula.
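The construction $\tilde{X} = V^{1/2} Y + \mu$ is also how multivariate Gaussians are sampled in practice. A minimal sketch (our own; the symmetric square root is taken via an eigendecomposition, and the tolerances are arbitrary):

```python
import numpy as np

# Sample X~ = V^{1/2} Y + mu from i.i.d. standard normals Y, as in step (iv).
rng = np.random.default_rng(0)
mu = np.array([1.0, -1.0])
V = np.array([[2.0, 0.6],
              [0.6, 1.0]])

# Symmetric square root: V = Q diag(w) Q^T with w >= 0, so V^{1/2} = Q diag(sqrt(w)) Q^T.
w, Q = np.linalg.eigh(V)
V_half = Q @ np.diag(np.sqrt(w)) @ Q.T

Y = rng.standard_normal((200_000, 2))
X = Y @ V_half + mu  # V_half is symmetric, so this applies V^{1/2} to each row

sample_mean = X.mean(axis=0)               # approximately mu
sample_cov = np.cov(X, rowvar=False)       # approximately V
```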

(v) It is clear that if $X_1$ and $X_2$ are independent, then $\operatorname{cov}(X_1, X_2) = 0$.

Conversely, let $X = (X_1, X_2)$, where $\operatorname{cov}(X_1, X_2) = 0$. Then we have
\[
  V = \operatorname{var}(X) =
  \begin{pmatrix}
    V_{11} & 0 \\
    0 & V_{22}
  \end{pmatrix}.
\]
Then for $u = (u_1, u_2)$, we have
\[
  (u, Vu) = (u_1, V_{11} u_1) + (u_2, V_{22} u_2),
\]
where $V_{11} = \operatorname{var}(X_1)$ and $V_{22} = \operatorname{var}(X_2)$. Writing $\mu = (\mu_1, \mu_2)$, we then have
\[
  \varphi_X(u) = e^{i(u, \mu) - (u, Vu)/2}
  = e^{i(u_1, \mu_1) - (u_1, V_{11} u_1)/2}\, e^{i(u_2, \mu_2) - (u_2, V_{22} u_2)/2}
  = \varphi_{X_1}(u_1)\, \varphi_{X_2}(u_2).
\]
So it follows that $X_1$ and $X_2$ are independent.
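Note that the joint Gaussianity of $X$ in (v) is essential. A standard counterexample (not from the notes; our construction): take $X_1 = Z$ and $X_2 = SZ$ with $Z \sim N(0,1)$ and $S$ an independent random sign. Both marginals are $N(0,1)$ and $\operatorname{cov}(X_1, X_2) = \mathbb{E}[S Z^2] = 0$, yet $|X_2| = |X_1|$, so they are far from independent:

```python
import numpy as np

# Gaussian marginals, zero covariance, but NOT independent (and not jointly
# Gaussian): X1 = Z, X2 = S*Z with S an independent random sign.
rng = np.random.default_rng(2)
n = 400_000
Z = rng.standard_normal(n)
S = rng.choice([-1.0, 1.0], size=n)
X1, X2 = Z, S * Z  # X2 is also N(0, 1); cov(X1, X2) = E[S Z^2] = 0

cov = np.mean(X1 * X2)  # approximately 0
p_joint = np.mean((np.abs(X1) > 1) & (np.abs(X2) > 1))
p_product = np.mean(np.abs(X1) > 1) * np.mean(np.abs(X2) > 1)
# |X2| = |X1| forces p_joint = P(|X1| > 1) ~ 0.32, while p_product ~ 0.10,
# so the events are dependent despite zero covariance.
```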