1.2 Review of unconstrained optimization
Let f : R^n → R and x∗ ∈ R^n. A necessary condition for x∗ to minimize f over R^n is ∇f(x∗) = 0, where

    ∇f = (∂f/∂x_1, · · · , ∂f/∂x_n)^T

is the gradient of f.
However, this is obviously not a sufficient condition: any such point can be a maximum, a minimum or a saddle point. For example, f(x_1, x_2) = x_1^2 − x_2^2 has ∇f = (2x_1, −2x_2)^T = 0 at the origin, yet the origin is a saddle: f increases along the x_1-axis and decreases along the x_2-axis. Here we need a notion of convexity:
Definition (Convex region). A region S ⊆ R^n is convex iff for all δ ∈ [0, 1] and x, y ∈ S, we have δx + (1 − δ)y ∈ S. Alternatively, if you take any two points in S, the line segment joining them lies completely within the region.
(Figure: a non-convex region and a convex region.)
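For example, any ball {x ∈ R^n : ‖x − a‖ ≤ r} is convex, while the annulus {x ∈ R^2 : 1 ≤ ‖x‖ ≤ 2} is not: it contains (1.5, 0) and (−1.5, 0) but not their midpoint (0, 0).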
Definition (Convex function). A function f : S → R is convex if S is convex, and for all x, y ∈ S and δ ∈ [0, 1], we have δf(x) + (1 − δ)f(y) ≥ f(δx + (1 − δ)y).

(Figure: the chord joining (x, f(x)) and (y, f(y)) lies above the graph: at δx + (1 − δ)y, the chord height δf(x) + (1 − δ)f(y) is at least f(δx + (1 − δ)y).)
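For example, f(x) = x^2 is convex on R: for any x, y ∈ R and δ ∈ [0, 1], we have δx^2 + (1 − δ)y^2 − (δx + (1 − δ)y)^2 = δ(1 − δ)(x − y)^2 ≥ 0.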
A function is concave if −f is convex. Note that a function can be neither concave nor convex.
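For example, f(x) = x^3 is neither: taking x = −1, y = 0, δ = 1/2 violates convexity (−1/2 < f(−1/2) = −1/8), while taking x = 0, y = 1, δ = 1/2 violates concavity (1/2 > f(1/2) = 1/8).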
We have the following lemma:
Lemma. Let f be twice differentiable. Then f is convex on a convex set S if the Hessian matrix Hf, with entries

    (Hf)_ij = ∂^2 f/∂x_i ∂x_j,

is positive semidefinite for all x ∈ S, where positive semidefiniteness means the following:
Definition (Positive semidefinite). A matrix H is positive semidefinite if v^T Hv ≥ 0 for all v ∈ R^n.
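For a symmetric matrix, this is equivalent to all of its eigenvalues being non-negative, which gives a practical test. Here is a minimal numerical sketch in Python with NumPy (illustrative only, not part of the course; the function name and tolerance are my own):

    import numpy as np

    def is_positive_semidefinite(H, tol=1e-10):
        # For symmetric H, v^T H v >= 0 for all v holds exactly
        # when every eigenvalue of H is non-negative.
        return np.all(np.linalg.eigvalsh(H) >= -tol)

    # Hessian of f(x) = x_1^2 + x_1 x_2 + x_2^2, constant in x:
    H = np.array([[2.0, 1.0],
                  [1.0, 2.0]])
    print(is_positive_semidefinite(H))  # True: f is convex on R^2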
This leads to the following theorem:
Theorem. Let X ⊆ R^n be convex and let f : R^n → R be twice differentiable on X. If x∗ ∈ X satisfies ∇f(x∗) = 0 and Hf(x) is positive semidefinite for all x ∈ X, then x∗ minimizes f on X.
We will not prove these.
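As an illustration, take f(x_1, x_2) = x_1^2 + x_1 x_2 + x_2^2 on X = R^2, whose Hessian is the matrix checked in the sketch above. Then ∇f = (2x_1 + x_2, x_1 + 2x_2)^T vanishes only at x∗ = 0, and Hf is positive semidefinite everywhere, so by the theorem x∗ = 0 minimizes f on R^2.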
Note that this is helpful, since linear functions are convex (and concave): for f(x) = c^T x we have δf(x) + (1 − δ)f(y) = f(δx + (1 − δ)y), so both defining inequalities hold with equality. The problem is that our problems are constrained, not unconstrained, so we will have to convert constrained problems into unconstrained ones.