2.5 Least-squares polynomial approximation
If we want to approximate a function with a polynomial, polynomial interpolation might not be the best idea, since all we do is make sure the polynomial agrees with $f$ at certain points, and then hope it is a good approximation elsewhere. Instead, the idea is to choose a polynomial $p \in P_n[x]$ that “minimizes the error”.
What exactly do we mean by minimizing the error? The error is defined as the function $f - p$. So given an appropriate inner product on the vector space of continuous functions, we want to minimize
\[
  \|f - p\|^2 = \langle f - p, f - p\rangle.
\]
This is usually of the form
\[
  \langle f - p, f - p\rangle = \int_a^b w(x)[f(x) - p(x)]^2 \,\mathrm{d}x,
\]
but we can also use alternative inner products such as
\[
  \langle f - p, f - p\rangle = \sum_{j=1}^m w_j [f(\xi_j) - p(\xi_j)]^2.
\]
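As a quick illustration of the discrete case, the following sketch evaluates $\langle f - p, f - p\rangle$ for the summation inner product; the nodes $\xi_j$, weights $w_j$, the choice $f = \exp$, and the polynomial $p$ are all made up purely for this example.

import numpy as np

# Illustrative (made-up) nodes xi_j and weights w_j for the discrete
# inner product <g, h> = sum_j w_j g(xi_j) h(xi_j).
xi = np.linspace(0.0, 1.0, 11)
w = np.full_like(xi, 1.0 / len(xi))

def inner_disc(g, h):
    return np.sum(w * g(xi) * h(xi))

f = np.exp                      # function to approximate (illustrative choice)
p = np.poly1d([1.0, 1.0])       # some polynomial, p(x) = x + 1

# The squared error ||f - p||^2 = <f - p, f - p>.
err2 = inner_disc(lambda x: f(x) - p(x), lambda x: f(x) - p(x))
print(err2)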
Unlike polynomial interpolation, there is no guarantee that the approximation
agrees with the function anywhere. Unlike polynomial interpolation, there is
some guarantee that the total error is small (or as small as we can get, by
definition). In particular, if $f$ is continuous, then the Weierstrass approximation theorem tells us that the total error tends to $0$ as $n \to \infty$.
Unsurprisingly, the solution involves the use of the orthogonal polynomials
with respect to the corresponding inner products.
Theorem. If $\{p_k\}_{k=0}^n$ are orthogonal polynomials with respect to $\langle \cdot, \cdot\rangle$, then the choice of $c_k$ such that
\[
  p = \sum_{k=0}^n c_k p_k
\]
minimizes $\|f - p\|^2$ is given by
\[
  c_k = \frac{\langle f, p_k\rangle}{\|p_k\|^2},
\]
and the formula for the error is
\[
  \|f - p\|^2 = \|f\|^2 - \sum_{k=0}^n \frac{\langle f, p_k\rangle^2}{\|p_k\|^2}.
\]
Note that the solution decouples, in the sense that $c_k$ depends only on $f$ and $p_k$. If we want to take one more term, we just need to compute an extra term, and not redo our previous work.
Also, we notice that the formula for the error is the positive term $\|f\|^2$ minus a sum of squares. As we increase $n$, we subtract more squares, and the error decreases. If we are lucky, the error tends to $0$ as we take $n \to \infty$. Even though we might not know how many terms we need in order to get the error to be sufficiently small, we can just keep adding terms until the computed error is small enough (which is something we have to do anyway, even if we knew what $n$ to take).
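To make the theorem concrete, here is a small sketch (an illustration, not part of the notes) for the inner product $\langle g, h\rangle = \int_{-1}^1 g(x)h(x)\,\mathrm{d}x$, whose orthogonal polynomials are the Legendre polynomials; the choices $f = \exp$ and $n = 4$ are arbitrary.

import numpy as np
from numpy.polynomial import legendre
from scipy.integrate import quad

f = np.exp                               # function to approximate (illustrative)
n = 4

def p_k(k):
    # The k-th Legendre polynomial, orthogonal for w(x) = 1 on [-1, 1].
    return legendre.Legendre.basis(k)

def inner(g, h):
    # <g, h> = integral over [-1, 1] of g(x) h(x) dx.
    return quad(lambda x: g(x) * h(x), -1.0, 1.0)[0]

# c_k = <f, p_k> / ||p_k||^2: each coefficient depends only on f and p_k,
# so taking one more term never changes the coefficients already computed.
c = [inner(f, p_k(k)) / inner(p_k(k), p_k(k)) for k in range(n + 1)]

def p(x):
    # The least-squares approximation p = sum_k c_k p_k.
    return sum(ck * p_k(k)(x) for k, ck in enumerate(c))

# Error formula: ||f - p||^2 = ||f||^2 - sum_k <f, p_k>^2 / ||p_k||^2.
err2 = inner(f, f) - sum(inner(f, p_k(k))**2 / inner(p_k(k), p_k(k))
                         for k in range(n + 1))
print(err2)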
Proof. We consider a general polynomial
\[
  p = \sum_{k=0}^n c_k p_k.
\]
We substitute this in to obtain
\[
  \langle f - p, f - p\rangle = \langle f, f\rangle - 2\sum_{k=0}^n c_k \langle f, p_k\rangle + \sum_{k=0}^n c_k^2 \|p_k\|^2.
\]
Note that there are no cross terms between the different coefficients, since $\langle p_j, p_k\rangle = 0$ for $j \neq k$ by orthogonality. We minimize this quadratic by setting the partial derivatives to zero:
\[
  0 = \frac{\partial}{\partial c_k}\langle f - p, f - p\rangle = -2\langle f, p_k\rangle + 2 c_k \|p_k\|^2.
\]
To check this is indeed a minimum, note that the Hessian matrix is diagonal with positive entries $2\|p_k\|^2$, and hence positive definite. So this is really a minimum. So we get the formula for the $c_k$'s as claimed, and substituting the formula for $c_k$ into the expansion gives the error formula.
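Continuing the illustrative Legendre sketch above, one can check numerically that the stationarity condition from the proof holds at the computed coefficients, and that the directly measured error agrees with the error formula.

# At the computed c_k the gradient -2<f, p_k> + 2 c_k ||p_k||^2 vanishes
# (up to quadrature error), and the measured error matches the formula.
for k in range(n + 1):
    grad_k = -2 * inner(f, p_k(k)) + 2 * c[k] * inner(p_k(k), p_k(k))
    assert abs(grad_k) < 1e-8

direct_err2 = inner(lambda x: f(x) - p(x), lambda x: f(x) - p(x))
assert abs(direct_err2 - err2) < 1e-8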
Note that our constructed $p \in P_n[x]$ has a nice property: for $k \leq n$, we have
\[
  \langle f - p, p_k\rangle = \langle f, p_k\rangle - \langle p, p_k\rangle = \langle f, p_k\rangle - \frac{\langle f, p_k\rangle}{\|p_k\|^2}\langle p_k, p_k\rangle = 0.
\]
Thus for all $q \in P_n[x]$, we have
\[
  \langle f - p, q\rangle = 0.
\]
In particular, this is true when $q = p$, and tells us $\langle f, p\rangle = \langle p, p\rangle$. Using this to expand $\langle f - p, f - p\rangle$ gives
\[
  \|f - p\|^2 + \|p\|^2 = \|f\|^2,
\]
which is just a glorified Pythagoras theorem.
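Still within the illustrative Legendre sketch, both facts can be checked numerically: the residual $f - p$ is orthogonal to every basis polynomial $p_k$ (hence to all of $P_n[x]$), and the Pythagoras identity holds.

# The residual f - p is orthogonal to each p_k, and
# ||f - p||^2 + ||p||^2 = ||f||^2 up to quadrature error.
residual = lambda x: f(x) - p(x)
for k in range(n + 1):
    assert abs(inner(residual, p_k(k))) < 1e-8

assert abs(inner(residual, residual) + inner(p, p) - inner(f, f)) < 1e-8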