5.6 Gaussian random variables
Recall that in the proof of the Fourier inversion theorem, we used these things
called Gaussians, but didn’t really say much about them. These will be useful
later on when we want to prove the central limit theorem, because the central
limit theorem says that in the long run, things look like Gaussians. So here we
lay out some of the basic definitions and properties of Gaussians.
Definition (Gaussian random variable). Let X be a random variable on R. This is said to be Gaussian if there exist µ ∈ R and σ ∈ (0, ∞) such that the density of X is

f_X(x) = (1/√(2πσ²)) exp(−(x − µ)²/(2σ²)).

A constant random variable X = µ corresponds to the degenerate case σ = 0. We say this has mean µ and variance σ². When this happens, we write X ∼ N(µ, σ²).
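To make the density formula concrete, here is a minimal numerical sketch (not part of the notes; the parameter values are arbitrary illustrations and numpy is assumed): sample from N(µ, σ²) and compare the empirical mean, variance, and density peak against the formulas above.

```python
import numpy as np

# Arbitrary illustrative parameters (not from the notes).
mu, sigma = 1.0, 2.0
rng = np.random.default_rng(0)
x = rng.normal(mu, sigma, size=1_000_000)

# Empirical mean and variance should be close to mu and sigma^2.
print(x.mean())   # ~ 1.0
print(x.var())    # ~ 4.0

# The density f_X at a point, straight from the formula above.
def f_X(t):
    return np.exp(-(t - mu)**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

print(f_X(mu))    # peak value 1/sqrt(2*pi*sigma^2) ~ 0.1995
```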
For completeness, we record some properties of Gaussian random variables.
Proposition. Let X ∼ N(µ, σ²). Then

E[X] = µ,   var(X) = σ².

Also, for any a, b ∈ R, we have

aX + b ∼ N(aµ + b, a²σ²).

Lastly, we have

φ_X(u) = e^{iµu − u²σ²/2}.
Proof.
All but the last follow from direct calculation, and can be found in IA Probability.
For the last part, if X ∼ N(µ, σ²), then we can write X = σZ + µ, where Z ∼ N(0, 1). Recall that we have previously found that the characteristic function of a N(0, 1) random variable is

φ_Z(u) = e^{−u²/2}.
So we have

φ_X(u) = E[e^{iu(σZ+µ)}] = e^{iuµ} E[e^{iuσZ}] = e^{iuµ} φ_Z(uσ) = e^{iuµ − u²σ²/2}.
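The closed form for φ_X is easy to test by Monte Carlo. A hedged sketch (again with arbitrary illustrative parameters, numpy assumed), which also checks the affine property aX + b ∼ N(aµ + b, a²σ²):

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma = 0.5, 1.5                      # arbitrary illustrative parameters
X = rng.normal(mu, sigma, size=500_000)

# Empirical characteristic function E[e^{iuX}] vs the closed form above.
for u in (0.3, 1.0, 2.0):
    empirical = np.mean(np.exp(1j * u * X))
    exact = np.exp(1j * u * mu - u**2 * sigma**2 / 2)
    print(u, abs(empirical - exact))      # small Monte Carlo error

# The affine property: aX + b should match N(a*mu + b, a^2 * sigma^2).
a, b = 2.0, -1.0
Y = a * X + b
print(Y.mean(), a * mu + b)               # both ~ 0.0
print(Y.var(), a**2 * sigma**2)           # both ~ 9.0
```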
What we are next going to do is to talk about the corresponding facts for
the Gaussian in higher dimensions. Before that, we need to come up with the
definition of a higher-dimensional Gaussian distribution. This might be different
from the one you’ve seen previously, because we want to allow some degeneracy
in our random variable, e.g. some of the dimensions can be constant.
Definition (Gaussian random variable). Let X be a random variable. We say that X is Gaussian on Rⁿ if (u, X) is Gaussian on R for all u ∈ Rⁿ.
We are now going to prove a version of our previous theorem for higher-dimensional Gaussians.
Theorem. Let X be Gaussian on Rⁿ, and let A be an m × n matrix and b ∈ Rᵐ. Then
(i) AX + b is Gaussian on Rᵐ.
(ii) X ∈ L² and its law µ_X is determined by µ = E[X] and V = var(X), the covariance matrix.
(iii) We have

φ_X(u) = e^{i(u,µ) − (u,V u)/2}.
(iv) If V is invertible, then X has a density of

f_X(x) = (2π)^{−n/2} (det V)^{−1/2} exp(−(1/2)(x − µ, V⁻¹(x − µ))).
(v) If X = (X₁, X₂), where Xᵢ ∈ R^{nᵢ}, then cov(X₁, X₂) = 0 iff X₁ and X₂ are independent.
Proof.
(i) If u ∈ Rᵐ, then we have

(AX + b, u) = (AX, u) + (b, u) = (X, Aᵀu) + (b, u).

Since (X, Aᵀu) is Gaussian and (b, u) is constant, it follows that (AX + b, u) is Gaussian.
(ii) We know in particular that each component of X is a Gaussian random variable, and Gaussian random variables are in L². So X ∈ L². We will prove the second part of (ii) together with (iii).
(iii) If µ = E[X] and V = var(X), then for any u ∈ Rⁿ we have

E[(u, X)] = (u, µ),   var((u, X)) = (u, V u).

So we know (u, X) ∼ N((u, µ), (u, V u)). It follows that

φ_X(u) = E[e^{i(u,X)}] = e^{i(u,µ) − (u,V u)/2}.
So µ and V determine the characteristic function of X, which in turn determines the law of X.
(iv) We start off with a boring Gaussian vector Y = (Y₁, ··· , Yₙ), where the Yᵢ ∼ N(0, 1) are independent. Then the density of Y is

f_Y(y) = (2π)^{−n/2} e^{−|y|²/2}.
We are now going to construct X from Y. We define

X̃ = V^{1/2}Y + µ.

This makes sense because V is always non-negative definite. Then X̃ is Gaussian with E[X̃] = µ and var(X̃) = V. Therefore X has the same distribution as X̃. Since V is assumed to be invertible, we can compute the density of X̃ using the change of variables formula. (This construction is demonstrated numerically in the sketch after the proof.)
(v) It is clear that if X₁ and X₂ are independent, then cov(X₁, X₂) = 0.
Conversely, let X = (X₁, X₂), where cov(X₁, X₂) = 0. Then V = var(X) is block diagonal:

V = var(X) = ( V₁₁   0  )
             (  0   V₂₂ ).
Then for u = (u₁, u₂), we have

(u, V u) = (u₁, V₁₁u₁) + (u₂, V₂₂u₂),

where V₁₁ = var(X₁) and V₂₂ = var(X₂). Then we have
φ_X(u) = e^{i(u,µ) − (u,V u)/2} = e^{i(u₁,µ₁) − (u₁,V₁₁u₁)/2} e^{i(u₂,µ₂) − (u₂,V₂₂u₂)/2} = φ_X₁(u₁) φ_X₂(u₂),

where we write µ = (µ₁, µ₂). So it follows that X₁ and X₂ are independent.
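The proof of (iv) is constructive: X̃ = V^{1/2}Y + µ is exactly how one simulates a Gaussian vector in practice. A minimal numerical sketch (not from the notes; all parameter values are arbitrary illustrations, numpy assumed) that builds X̃ this way, then checks the mean and covariance, the characteristic function from (iii), and the factorisation from (v):

```python
import numpy as np

rng = np.random.default_rng(3)

# Arbitrary illustrative mean and positive definite covariance matrix.
mu = np.array([1.0, -2.0])
V = np.array([[2.0, 0.6],
              [0.6, 1.0]])

# Symmetric square root V^(1/2) via the spectral decomposition V = U diag(w) U^T.
w, U = np.linalg.eigh(V)
V_half = U @ np.diag(np.sqrt(w)) @ U.T

# The construction from (iv): X~ = V^(1/2) Y + mu, Y a vector of independent N(0,1)s.
Y = rng.standard_normal((500_000, 2))
X = Y @ V_half + mu             # V_half is symmetric, so no transpose is needed

print(X.mean(axis=0))           # ~ mu
print(np.cov(X.T))              # ~ V

# (iii): empirical characteristic function vs e^{i(u,mu) - (u,Vu)/2} at one point.
u = np.array([0.4, -0.9])
empirical = np.mean(np.exp(1j * (X @ u)))
exact = np.exp(1j * (u @ mu) - (u @ V @ u) / 2)
print(abs(empirical - exact))   # small Monte Carlo error

# (v): with zero covariance (here: independent components, so V is block
# diagonal), the characteristic function factorises.
Z = rng.standard_normal((500_000, 2)) + mu
u1, u2 = 0.8, -0.5
joint = np.mean(np.exp(1j * (u1 * Z[:, 0] + u2 * Z[:, 1])))
product = np.mean(np.exp(1j * u1 * Z[:, 0])) * np.mean(np.exp(1j * u2 * Z[:, 1]))
print(abs(joint - product))     # small: phi_Z = phi_Z1 * phi_Z2
```

In practice a Cholesky factor L with LLᵀ = V would work just as well as the symmetric square root here, since LY then has the same covariance V; the spectral form is used above only because it matches the V^{1/2} of the proof.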