Nonlinear Programming

min (max) f(x)    (1a)

subject to hi(x) = 0,  i = 1,...,p    (1b)

gi(x) ≤ 0,  i = 1,...,m    (1c)

where x = (x1,...,xn) and f, h1,...,hp, g1,...,gm are continuous functions.

Feasible Solution:Any point x satisfying (1b) – (1c).

Optimal Solution: A feasible point x* such that f(x*) ≤ f(x) for all feasible x.

Box Size Problem

Let x, y and z be respectively the length, width and height of a box.

Problem: Select the box dimensions to minimize the surface area subject to a constraint on volume.

minimize f(x,y,z) = 2xy + 2xz + 2yz

subject to xyz = V

x ≥ 0, y ≥ 0, z ≥ 0
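As a quick numerical check of this model, the sketch below solves it with scipy's SLSQP method (the volume V = 8 is an arbitrary illustrative choice, not part of the original problem) and recovers the familiar result that the optimal box is a cube:

```python
import numpy as np
from scipy.optimize import minimize

V = 8.0  # assumed volume; any positive value gives the same qualitative answer

def surface(d):
    x, y, z = d                       # length, width, height
    return 2*x*y + 2*x*z + 2*y*z      # total surface area

volume = {'type': 'eq', 'fun': lambda d: d[0]*d[1]*d[2] - V}

res = minimize(surface, x0=[1.0, 2.0, 4.0], method='SLSQP',
               bounds=[(1e-6, None)] * 3, constraints=[volume])
print(np.round(res.x, 4))   # -> [2. 2. 2.], a cube with side V**(1/3)
```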

Reliability Problem

Letri = probability that component i will not fail

ci = cost of component i

wi = weight of component i

xi = number of redundant (parallel) components i

Reliability of component i: Ri = 1 – (1 – ri)^(1+xi)

The reliability of a series system is the product of the component type reliabilities.

Problem: Find the optimal number of redundant components of each type, subject to constraints on maximum total cost, C, and weight, W.

Maximize R = ∏i [1 – (1 – ri)^(1+xi)]

subject to Σi ci xi ≤ C

Σi wi xi ≤ W

xi ≥ 0 and integer, i = 1,…,n
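Since the xi count redundant units, small instances can be solved by direct enumeration. A minimal sketch with hypothetical data (the r, c, w, C, W values below are illustrative assumptions, not from the original problem):

```python
from itertools import product

r = [0.80, 0.90, 0.70]   # hypothetical component reliabilities
c = [4, 3, 5]            # hypothetical unit costs
w = [2, 4, 3]            # hypothetical unit weights
C, W = 15, 12            # hypothetical cost and weight limits

def system_reliability(x):
    R = 1.0
    for ri, xi in zip(r, x):
        R *= 1.0 - (1.0 - ri)**(1 + xi)   # Ri = 1 - (1 - ri)^(1 + xi)
    return R

feasible = (x for x in product(range(5), repeat=len(r))
            if sum(ci*xi for ci, xi in zip(c, x)) <= C
            and sum(wi*xi for wi, xi in zip(w, x)) <= W)
best = max(feasible, key=system_reliability)
print(best, round(system_reliability(best), 4))
```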

Example 1

min f(x1,x2) = (x1 – 3)^2 + (x2 – 2)^2

subject to g1(x1,x2) = x1^2 – x2 – 3 ≤ 0

g2(x1,x2) = x2 – 1 ≤ 0

g3(x1,x2) = x1 ≥ 0

Isovalue Contour: (x1 – 3)^2 + (x2 – 2)^2 = c

(circle with radius √c and center at (3,2))

Geometric Solution for Example 1

Example 2

min f(x1,x2) = |x1 – 2| + |x2 – 2|

subject to h1(x1,x2) = x1^2 + x2^2 – 1 = 0

g1(x1,x2) = x1 – x2^2 ≥ 0

Convex Functions

Let x = (x1,...,xn). A function f(x) is convex if and only if

f (x1 + (1 – )x2) ≤ f (x1) + (1 – )f (x2)

for all  such that 0 ≤  ≤ 1.

A function f(x) is strictly convex if the inequality is strict except when  = 0 or 1.

A function f (x) is concave if and only if

f (x1 + (1 – )x2) ≥ f (x1) + (1 – )f (x2).

If f (x) is convex then –f (x) is concave.
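The definition can be tested numerically by sampling; the sketch below checks it for f(x) = x^2. (Sampling can refute convexity by finding a counterexample, but it can never prove it.)

```python
import numpy as np

f = lambda x: x**2                       # a function we expect to be convex
rng = np.random.default_rng(0)

for _ in range(10_000):
    x1, x2 = rng.uniform(-10, 10, size=2)
    lam = rng.uniform(0.0, 1.0)
    lhs = f(lam*x1 + (1 - lam)*x2)       # f at the interpolated point
    rhs = lam*f(x1) + (1 - lam)*f(x2)    # value of the chord
    assert lhs <= rhs + 1e-9             # the convexity inequality
print("no counterexample found")
```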

Convex Sets

A feasible region, C, is a convex set if any point on a straight line connecting any two feasible points, x1, x2 ∈ C, is also feasible.

Mathematically, this property is expressed as

x = λx1 + (1 – λ)x2 ∈ C

for all λ such that 0 ≤ λ ≤ 1.

When g(x) is convex, the set

{x ∈ Rn : g(x) ≤ b}

defines a convex region.

Convex vs. Nonconvex Programming

minimize f(x)

subject to gi(x) ≤ bi,  i = 1,…,m

  • When f(x) is convex and each gi(x) is convex, any local optimum is a global optimum.
  • When either f(x) is nonconvex or any gi(x) is nonconvex, it is possible that a local optimum may not be a global optimum.
  • Most nonlinear programming algorithms are search techniques that find local optima.

Properties of Linear Programs

(a) The set of feasible solutions {x ∈ Rn : Ax = b, x ≥ 0} is convex.

(b) The set of all n-tuples (x1,...,xn) that yield a specified value of the objective function is a hyperplane.

(c)A local extremum is also a global extremum.

(d)If the optimal value of the objective function is finite, at least one extreme point in the feasible region will be optimal.

Difficulties with Nonlinearities

When nonlinearities are present in the model, none of the above properties is guaranteed to hold.

Illustration of Local Optima

max f(x1,x2) = 25(x1 – 2)^2 + (x2 – 2)^2

subject to x1 + x2 ≥ 2

–x1 + x2 ≤ 2

x1 + x2 ≤ 6

x1 – 3x2 ≤ 2

x1, x2 ≥ 0
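A multistart experiment makes the difficulty concrete. The sketch below minimizes –f with scipy (whose 'ineq' convention is fun(x) ≥ 0) and shows different starting points being attracted to different local maxima:

```python
import numpy as np
from scipy.optimize import minimize

neg_f = lambda x: -(25*(x[0] - 2)**2 + (x[1] - 2)**2)

cons = [{'type': 'ineq', 'fun': lambda x: x[0] + x[1] - 2},    # x1 + x2 >= 2
        {'type': 'ineq', 'fun': lambda x: 2 + x[0] - x[1]},    # -x1 + x2 <= 2
        {'type': 'ineq', 'fun': lambda x: 6 - x[0] - x[1]},    # x1 + x2 <= 6
        {'type': 'ineq', 'fun': lambda x: 2 - x[0] + 3*x[1]}]  # x1 - 3x2 <= 2

for x0 in ([0.5, 2.0], [4.0, 1.0], [2.0, 3.0]):
    res = minimize(neg_f, x0, method='SLSQP', constraints=cons,
                   bounds=[(0, None), (0, None)])
    print(x0, '->', np.round(res.x, 3), ' f =', round(-res.fun, 2))
# Different starts converge to different vertices, e.g. (0,2) with f = 100
# and (5,1) with f = 226; only the latter is the global maximum.
```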

More Local Extrema

Determining Convexity

Single Dimension

• f(x) is convex if and only if

d^2f/dx^2 ≥ 0 for all x.

• A function f(x) is concave if and only if

d^2f/dx^2 ≤ 0 for all x.

Separable Functions

•A function f(x1, x2, . . . , xn) is separable if

f(x1, x2, . . . , xn) = f1(x1) + f2(x2) + ··· + fn(xn)

•The separable function f(x1, x2, . . . , xn) is convex (concave) if and only if each term is convex (concave).

  • A linear function is both convex and concave.
  • If the function f(x1, x2, . . . , xn) has some terms strictly convex and other terms strictly concave, the function is neither convex nor concave.

Examples

a. f(x) = 4x^2 – 20x + 10

d^2f/dx^2 = 8, so the function is strictly convex everywhere.

Unique global minimum at x = 2.5.

b. f(x) = –4x^2 – 20x

d^2f/dx^2 = –8, so the function is strictly concave everywhere.

Unique global maximum at x = –2.5.

c. f(x) = 8x^3 + 15x^2 + 9x + 6

d^2f/dx^2 = 48x + 30 changes sign at x = –5/8, so the function is neither convex nor concave.

Local minimum at x = –0.5 (and a local maximum at x = –0.75).

d. f(x) = 10x1 + 5x2 – 3x3 – 4x4 (linear, so both convex and concave)

e. f(x) = 10x1^2 + 5x2^2 – 3x3 – 4x4 (every term convex, so convex)

f. f(x) = 10x1^2 – 5x2^2 – 3x3 – 4x4 (a strictly convex and a strictly concave term, so neither)

g. f(x) = –10x1^2 – 5x2^2 – 3x3 – 4x4 (every term concave, so concave)

Multiple Dimensions

To determine convexity in multiple dimensions we define the following:

• The gradient of f(x), denoted by ∇f(x), is the n-dimensional column vector formed by the first partial derivatives of f(x) with respect to each individual element of x:

∇f(x) = (∂f/∂x1, ∂f/∂x2, ..., ∂f/∂xn)^T

• The Hessian matrix, H(x), associated with f(x) is the n×n symmetric matrix of second partial derivatives of f(x) with respect to the elements of x:

H(x) = [∂^2f/∂xi∂xj],  i, j = 1,...,n
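When the derivatives are tedious to write out, both objects can be approximated by finite differences. A sketch (the step sizes are conventional defaults, not tuned):

```python
import numpy as np

def grad(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2*h)
    return g

def hessian(f, x, h=1e-4):
    """Finite-difference approximation of the Hessian of f at x."""
    x = np.asarray(x, dtype=float)
    H = np.array([(grad(f, x + e) - grad(f, x - e)) / (2*h)
                  for e in h*np.eye(x.size)])
    return (H + H.T) / 2   # symmetrize to remove rounding noise

# Example i below: f(x) = 2x1^2 + x2^2 + 3x3^2 + x1x2 + 2x1x3 + 3x2x3
f = lambda x: (2*x[0]**2 + x[1]**2 + 3*x[2]**2
               + x[0]*x[1] + 2*x[0]*x[2] + 3*x[1]*x[2])
print(hessian(f, [0.0, 0.0, 0.0]).round(3))   # -> [[4 1 2], [1 2 3], [2 3 6]]
```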

Examples

h. f(x) = x1^2 + x2^2

i. f(x) = 2x1^2 + x2^2 + 3x3^2 + x1x2 + 2x1x3 + 3x2x3

Quadratic Forms

f(x) = k + c^T x + (1/2) x^T Q x

For example, f(x) = 3x1x2 + x1^2 + 3x2^2 gives

Q = H = [ 2  3 ]
        [ 3  6 ]

(When written in polynomial form, the element qii is twice the coefficient of xi^2, and the elements qij and qji are both equal to the coefficient of xixj.)
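The parenthetical rule is easy to verify numerically; a minimal sketch for the example above (k = 0 and c = 0 here):

```python
import numpy as np

# f(x) = 3*x1*x2 + x1^2 + 3*x2^2: q11 = 2(1), q22 = 2(3), q12 = q21 = 3
Q = np.array([[2.0, 3.0],
              [3.0, 6.0]])

f_poly = lambda x: 3*x[0]*x[1] + x[0]**2 + 3*x[1]**2
f_quad = lambda x: 0.5 * x @ Q @ x    # k + c^T x + (1/2) x^T Q x with k = 0, c = 0

x = np.array([1.7, -0.3])
print(f_poly(x), f_quad(x))           # the two forms agree
```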

j. f(x) = 2x1^2 + 4x2^2

k. f(x) = –2x1^2 + 4x1x2 – 4x2^2 + 4x1 + 4x2 + 10

l. f(x) = –0.5x1^2 – 2x1x2 – 2x2^2

Determining Convexity or Concavity from Hessian

Convex

•f(x) is convex (strictly convex) if its associated Hessian matrix is positive semi-definite (definite) for all x.

• H(x) is positive semi-definite if and only if

x^T H x ≥ 0 for all x.

• H(x) is positive definite if and only if

x^T H x > 0 for every x ≠ 0.

Concave

•f(x) is concave (strictly concave) if its associated Hessian matrix is negative semi-definite (definite) for all x.

• H(x) is negative semi-definite if and only if

x^T H x ≤ 0 for all x.

• H(x) is negative definite if and only if

x^T H x < 0 for every x ≠ 0.

Neither Convex nor Concave

•f(x) is neither convex nor concave if its associated Hessian matrix, H(x), is indefinite for some point x.

• H(x) is indefinite if and only if

x^T H x > 0 for some x,

and y^T H y < 0 for some other y.

Both Convex and Concave

•f(x) is both convex and concave only if it is linear.

Testing for Definiteness with Determinants

The i-th leading principal determinant of H is the determinant of the matrix formed by taking the intersection of the first i rows and the first i columns. Let Hi be the value of this determinant. Then

H1 = h11

H2 = det [ h11  h12 ] = h11h22 – h12h21
         [ h21  h22 ]

and so on until Hn is defined.

  • H is positive definite if and only if:

Hi > 0, i = 1,... ,n.

  • H is negative definite if and only if:

H1 < 0, H2 > 0, H3 < 0, H4 > 0, …

  • H is indefinite if the leading principal determinants are all nonzero and neither of the foregoing conditions holds.
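These rules translate directly into a short test; a sketch (cases with a zero leading determinant are flagged for an eigenvalue check, since the leading-determinant test cannot distinguish the semidefinite cases):

```python
import numpy as np

def classify(H, tol=1e-10):
    """Classify a symmetric matrix by its leading principal determinants."""
    d = [np.linalg.det(H[:i, :i]) for i in range(1, H.shape[0] + 1)]
    if all(di > tol for di in d):
        return 'positive definite'
    if all((-1)**(i + 1)*di > tol for i, di in enumerate(d)):
        return 'negative definite'
    if all(abs(di) > tol for di in d):
        return 'indefinite'
    return 'inconclusive: check eigenvalues for the semidefinite cases'

print(classify(np.array([[2.0, 3.0], [3.0, 6.0]])))      # positive definite
print(classify(np.array([[-4.0, 4.0], [4.0, -8.0]])))    # negative definite
print(classify(np.array([[-1.0, -2.0], [-2.0, -4.0]])))  # inconclusive (H2 = 0)
```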

Examples

f(x) = x1^2 + x2^2

H(x) = [ 2  0 ]
       [ 0  2 ]

H1 = 2 > 0, H2 = 4 > 0: positive definite, so f is strictly convex.

f(x) = 3x1x2 + x1^2 + 3x2^2

H = [ 2  3 ]
    [ 3  6 ]

H1 = 2 > 0, H2 = 12 – 9 = 3 > 0: positive definite, so f is strictly convex.

f(x) = 2x1^2 + x2^2 + 3x3^2 + x1x2 + 2x1x3 + 3x2x3

H = [ 4  1  2 ]
    [ 1  2  3 ]
    [ 2  3  6 ]

H1 = 4 > 0, H2 = 7 > 0, H3 = 10 > 0: positive definite, so f is strictly convex.

f(x) = –2x1^2 + 4x1x2 – 4x2^2 + 4x1 + 4x2 + 10

H = [ –4   4 ]
    [  4  –8 ]

H1 = –4 < 0, H2 = 16 > 0: negative definite, so f is strictly concave.

f(x) = –0.5x1^2 – 2x1x2 – 2x2^2

H = [ –1  –2 ]
    [ –2  –4 ]

H1 = –1 < 0, H2 = 0: the determinant test is inconclusive; H is negative semidefinite (eigenvalues 0 and –5), so f is concave but not strictly concave.

When does a Constraint Define a Convex Region?

• gi(x) ≤ bi defines a convex region if gi(x) is a convex function.

• gi(x) ≥ bi defines a convex region if gi(x) is a concave function.

• gi(x) = bi defines a convex region only if gi(x) is a linear function.

When all the constraints individually define convex regions, they collectively define a convex region, because the intersection of convex sets is convex.

When is a Local Optimum a Global Optimum?

minimize f(x) subject to gi(x) {≤, ≥, =} bi, i = 1,…,m

  • A local minimum is guaranteed to be a global minimum when f(x) is convex and the set of constraints defines a convex region.

maximize f(x) subject to gi(x) {≤, ≥, =} bi, i = 1,…,m

  • A local maximum is guaranteed to be a global maximum when f(x) is concave and the set of constraints defines a convex region.

• Otherwise, there is no guarantee that a specific local optimum is a global optimum.

Problem Categories

1. Unconstrained Problems (classical optimization):

min{f(x) : x Rn}

2. Nonnegative Variables:

min{f(x) : x ≥ 0}

3. Equality Constraints (classical optimization):

min{f(x) : hi(x) = 0, i = 1,...,p}

4. Inequality Constraints:

min{f(x) : hi(x) = 0, i = 1,...,p; gi(x) ≤ 0, i = 1,...,m}

Optimality Conditions

Unconstrained Problems: min{f(x) : x ∈ Rn}

Necessary Conditions: Let f : Rn → R1 be twice continuously differentiable throughout a neighborhood of x*. If f has a relative minimum at x*, then it necessarily follows that

(i) ∇f(x*) = 0

(ii) The Hessian F(x*) is positive semidefinite.

Sufficient Condition: ∇f(x*) = 0 and the Hessian F(x*) is positive definite.

Example

Find the relative extreme values of

f(x) = 3x1^3 + x2^2 – 9x1 + 4x2

Solution: ∇f(x) = [9x1^2 – 9, 2x2 + 4] = [0, 0]

implies that x1 = ±1 and x2 = –2.

Checking x = (1,–2), we have

F(1,–2) = [ 18  0 ]
          [  0  2 ]

which is positive definite since v^T F(1,–2) v = 18v1^2 + 2v2^2 > 0

when v ≠ 0. Thus (1,–2) gives a strict relative minimum.

Next consider x = (–1,–2) with Hessian

F(–1,–2) = [ –18  0 ]
           [   0  2 ]

Now we have v^T F(–1,–2) v = –18v1^2 + 2v2^2, which is positive for some v and negative for others, so F(–1,–2) is indefinite. The point (–1,–2) fails the second order necessary condition for either a relative minimum or maximum; it is a saddle point.
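The same classification can be confirmed by inspecting the eigenvalues of the Hessian at each stationary point; a sketch:

```python
import numpy as np

# Hessian of f(x) = 3*x1**3 + x2**2 - 9*x1 + 4*x2
hess = lambda x: np.array([[18*x[0], 0.0],
                           [0.0,     2.0]])

for pt in [(1.0, -2.0), (-1.0, -2.0)]:
    print(pt, np.linalg.eigvalsh(hess(pt)))
# (1, -2):  eigenvalues [2, 18] -> positive definite: strict relative minimum
# (-1, -2): eigenvalues [-18, 2] -> indefinite: saddle point
```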

Optimality Conditions

Nonnegative Variables: min{f(x) : x ≥ 0}

Necessary Conditions (relative minimum of f(x) at x*):

∂f(x*)/∂xj = 0, if xj* > 0

∂f(x*)/∂xj ≥ 0, if xj* = 0

Alternatively:

∇f(x*) ≥ 0, ∇f(x*)^T x* = 0, x* ≥ 0

Example

Minimize f(x) = 3x1^2 + x2^2 + x3^2 – 2x1x2 – 2x1x3 – 2x1

subject to x1 ≥ 0, x2 ≥ 0, x3 ≥ 0

Solution: Necessary conditions for a relative minimum:

(1) 0 ≤ ∂f/∂x1 = 6x1 – 2x2 – 2x3 – 2

(2) 0 = x1 ∂f/∂x1 = x1(6x1 – 2x2 – 2x3 – 2)

(3) 0 ≤ ∂f/∂x2 = 2x2 – 2x1

(4) 0 = x2 ∂f/∂x2 = x2(2x2 – 2x1)

(5) 0 ≤ ∂f/∂x3 = 2x3 – 2x1

(6) 0 = x3 ∂f/∂x3 = x3(2x3 – 2x1)

(7) x1 ≥ 0, x2 ≥ 0, x3 ≥ 0

From (4), either x2 = 0 or x1 = x2.

When x2 = 0, (3) and (7) imply x1 = 0.

From (6), x3 = 0.

But this contradicts (1), so x2 ≠ 0 and x1 = x2.

Example (Continued)

Condition (6) implies either x3 = 0 or x1 = x3.

If x3 = 0, then (4), (5) and (7) imply x1 = x2 = x3 = 0.

But this situation has been ruled out.

Thus, x1 = x2 = x3, and from (2) either x1 = 0 or x1 = 1.

Since x1 ≠ 0, the only possible relative minimum of f occurs when x1 = x2 = x3 = 1.

Must evaluate the Hessian at x* = (1,1,1):

F(1,1,1) = [  6  –2  –2 ]
           [ –2   2   0 ]
           [ –2   0   2 ]

Leading principal determinants: H1 = 6, H2 = 8, H3 = 8.

All leading principal determinants are positive, so F is positive definite; f is strictly convex and has a relative minimum at x*.

Also, convexity implies that f(x*) = –1 is a global minimum.
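A quick numerical check of the necessary conditions at x* = (1,1,1); a sketch:

```python
import numpy as np

f = lambda x: (3*x[0]**2 + x[1]**2 + x[2]**2
               - 2*x[0]*x[1] - 2*x[0]*x[2] - 2*x[0])
grad = lambda x: np.array([6*x[0] - 2*x[1] - 2*x[2] - 2,
                           2*x[1] - 2*x[0],
                           2*x[2] - 2*x[0]])

x_star = np.array([1.0, 1.0, 1.0])
g = grad(x_star)
print(g, g @ x_star)   # -> [0. 0. 0.] 0.0 : grad f(x*) >= 0 and grad f(x*)^T x* = 0
print(f(x_star))       # -> -1.0, the global minimum value
```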

Optimality Conditions

Equality Constraints: min{f(x) : hi(x) = 0, i = 1,...,p}

Jacobian matrix at x*:

∇h(x*) = [∂hi(x*)/∂xj], a p×n matrix

Lagrangian: L(x,λ) ≡ f(x) + λ^T h(x), λ ∈ Rp

Necessary Conditions (relative minimum of f at x*):

If f and each component hi of h are twice continuously differentiable throughout a neighborhood of x*, and if the Jacobian matrix ∇h(x*) has full row rank p, then there exists a λ* ∈ Rp such that

∇x L(x*,λ*) = ∇f(x*) + (λ*)^T ∇h(x*) = 0

∇λ L(x*,λ*) = h(x*) = 0

Example

min (max) f(x1,x2) = 2x1x2

subject to x1^2 + x2^2 = 1

Solution: Let L(x1,x2,λ) = 2x1x2 + λ(x1^2 + x2^2 – 1).

∂L/∂x1 = 2x2 + 2λx1 = 0    (1)

∂L/∂x2 = 2x1 + 2λx2 = 0    (2)

∂L/∂λ = x1^2 + x2^2 – 1 = 0    (3)

Three equations in three unknowns:

Solving (1) for λ we get λ = –x2/x1 as long as x1 ≠ 0.

Substituting into (2) gives x1^2 = x2^2.

From (3), we get 2x1^2 = 1, so

x1 = ±√2/2, x2 = ±√2/2 and λ = +1 or –1

depending on the signs of x1 and x2.

Example (continued)

Four possible solutions for (x1,x2,λ):

(√2/2, √2/2, –1)

(–√2/2, –√2/2, –1)

(√2/2, –√2/2, 1)

(–√2/2, √2/2, 1)

By inspection:

(i) minimum value of f = –1 at both

x = (–√2/2, √2/2) and (√2/2, –√2/2).

(ii) maximum value of f = 1 at both

x = (√2/2, √2/2) and (–√2/2, –√2/2).
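The three stationarity equations can also be handed to a computer algebra system; a sketch using sympy reproduces all four solutions:

```python
import sympy as sp

x1, x2, lam = sp.symbols('x1 x2 lam', real=True)
L = 2*x1*x2 + lam*(x1**2 + x2**2 - 1)           # the Lagrangian

eqs = [sp.diff(L, v) for v in (x1, x2, lam)]    # conditions (1)-(3)
for s in sp.solve(eqs, [x1, x2, lam], dict=True):
    print(s, ' f =', (2*x1*x2).subs(s))
# f = -1 at (sqrt(2)/2, -sqrt(2)/2) and (-sqrt(2)/2, sqrt(2)/2), lam = 1;
# f =  1 at (sqrt(2)/2,  sqrt(2)/2) and (-sqrt(2)/2, -sqrt(2)/2), lam = -1.
```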

Second Order Optimality Conditions

Tangent Plane at x*: T = {y : ∇h(x*)y = 0}

Second Order Necessary Conditions: Suppose that x* is a local minimum of f(x) subject to h(x) = 0 and the Jacobian ∇h(x*) has rank p. Then there exists a vector λ ∈ Rp such that

∇f(x*) + λ^T ∇h(x*) = 0

and the Hessian of the Lagrangian

L(x*) ≡ F(x*) + λ^T H(x*)

is positive semidefinite on T; that is,

y^T L(x*) y ≥ 0 for all y ∈ T

Second Order Sufficient Conditions (for a strict local minimum): the first order conditions hold and

L(x*) is positive definite on T.

Example (continued)

Hessian: L(x*) = F + λH = [ 2λ   2 ]   (independent of x*)
                          [  2  2λ ]

Tangent Plane: T = {(y1,y2) : x1*y1 + x2*y2 = 0}

At either minimizing point, λ = 1, so

L = [ 2  2 ]
    [ 2  2 ]

T = {(y1,y2) : y1 = y2}, and y^T L y = 8y1^2 > 0 for all y ∈ T, y ≠ 0.

Thus the second order necessary and sufficient conditions for a relative minimum are satisfied.

At either maximizing point, λ = –1, so

L = [ –2   2 ]
    [  2  –2 ]

T = {(y1,y2) : y1 = –y2}, and y^T L y = –8y1^2 < 0 for all y ∈ T, y ≠ 0.

L is negative definite on T ⇒ relative maximum.

Optimality Conditions

Inequality Constraints:

min{f(x) : hi(x) = 0, i = 1,...,p; gi(x) ≤ 0, i = 1,...,m}

Regular Point: Let x* be a point satisfying the constraints h(x*) = 0, g(x*) ≤ 0, and let I be the set of indices i for which gi(x*) = 0. Then x* is said to be a regular point of these constraints if the gradient vectors ∇hi(x*) (1 ≤ i ≤ p) and ∇gi(x*) (i ∈ I) are linearly independent.

Necessary Optimality Conditions (Karush-Kuhn-Tucker Conditions): Let x* be a relative minimum and suppose that x* is a regular point for the constraints. Then there exist a vector λ ∈ Rp and a vector μ ∈ Rm such that

∇f(x*) + λ^T ∇h(x*) + μ^T ∇g(x*) = 0

μ^T g(x*) = 0

μ ≥ 0

h(x*) = 0, g(x*) ≤ 0
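Once a candidate point and multipliers are in hand, the conditions can be checked mechanically. A sketch of a generic checker, applied to the illustrative instance min 2x1x2 subject to x1^2 + x2^2 – 1 ≤ 0 at x = (√2/2, –√2/2) with μ = 1 (this instance is an assumption for demonstration, adapted from the earlier equality-constrained example):

```python
import numpy as np

def kkt_check(grad_f, grads_h, grads_g, h_vals, g_vals, lam, mu, tol=1e-8):
    """Verify the KKT conditions at a candidate point, given multipliers."""
    stat = np.asarray(grad_f, dtype=float)
    for li, gh in zip(lam, grads_h):
        stat = stat + li*np.asarray(gh)          # equality-constraint terms
    for mi, gg in zip(mu, grads_g):
        stat = stat + mi*np.asarray(gg)          # inequality-constraint terms
    return {'stationarity':    np.linalg.norm(stat) < tol,
            'primal h = 0':    all(abs(v) < tol for v in h_vals),
            'primal g <= 0':   all(v <= tol for v in g_vals),
            'complementarity': abs(sum(m*v for m, v in zip(mu, g_vals))) < tol,
            'dual mu >= 0':    all(m >= -tol for m in mu)}

s = np.sqrt(2)/2
x = np.array([s, -s])
print(kkt_check(grad_f=[2*x[1], 2*x[0]],        # gradient of 2*x1*x2
                grads_h=[], h_vals=[],          # no equality constraints here
                grads_g=[[2*x[0], 2*x[1]]],     # gradient of x1^2 + x2^2 - 1
                g_vals=[float(x @ x - 1)],
                lam=[], mu=[1.0]))              # every check returns True
```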

Second Order Optimality Conditions

Necessary Conditions: Suppose that the functions f, h, g ∈ C2 and that x* is a regular point of the constraints. If x* is a relative minimum, then there exist a vector λ ∈ Rp and a vector μ ∈ Rm, μ ≥ 0, such that the KKT conditions hold and

L(x*) ≡ F(x*) + λ^T H(x*) + μ^T G(x*)

is positive semidefinite on the tangent subspace of the active constraints at x*.

Sufficient Conditions: the KKT conditions hold and the Hessian matrix

L(x*) = F(x*) + λ^T H(x*) + μ^T G(x*)

is positive definite on the subspace

T' = {y : ∇h(x*)y = 0, ∇gi(x*)y = 0 for all i ∈ I'}

where I' = {i : gi(x*) = 0, μi > 0}.

Example

min f(x) = 2x1^2 + x2^2 + x3^2 + x1x3 – x1x2 + x1 – x3

subject to h1 = x1 + x2 = 3

g1 = x1^2 + x2^2 – x3 ≤ 0

g2 = x1 + x2 + 2x3 ≤ 16

g3 = x1 ≥ 0

g4 = x2 ≥ 0

g5 = x3 ≥ 0

Stationarity Conditions (∇f + λ1∇h1 + μ1∇g1 + μ2∇g2 – μ3e1 – μ4e2 – μ5e3 = 0, where the nonnegativity constraints g3–g5 are written in standard form as –xi ≤ 0 with unit vectors ei):

4x1 – x2 + x3 + 1 + λ1 + 2μ1x1 + μ2 – μ3 = 0

2x2 – x1 + λ1 + 2μ1x2 + μ2 – μ4 = 0

2x3 + x1 – 1 – μ1 + 2μ2 – μ5 = 0

Complementarity Conditions:

μ1(x1^2 + x2^2 – x3) = 0

μ2(x1 + x2 + 2x3 – 16) = 0

μ3x1 = 0, μ4x2 = 0, μ5x3 = 0

μi ≥ 0, i = 1,...,5

  • The 5 complementarity equalities imply that 2^5 = 32 combinations may have to be examined (most of these are infeasible).

• Assume that only g2 is active, so μi = 0 for all i ≠ 2.

Then g2 holds with equality, x1 + x2 + 2x3 = 16, and h1: x1 + x2 = 3 gives x3 = 13/2.

The third stationarity condition then yields

2x3 + x1 – 1 + 2μ2 = 12 + x1 + 2μ2 = 0 ⇒ μ2 < 0, which is not feasible.

• Now consider the case where only g1 is active.

One possible solution is x* = (1, 2, 5).

The third stationarity condition determines μ1, and either the first or second stationarity condition then gives λ1 = –13.

Thus x*, with the corresponding λ and μ, satisfies the first order necessary conditions for a relative minimum.

Check second order conditions –– examine the Hessian of the Lagrangian (the linear constraints h1 and g2–g5 contribute nothing to it):

L(x*) = F(x*) + μ1G1(x*)

Matrix L(x*) is positive definite on R3, so it must be positive definite on the required tangent subspace.

Therefore, x* = (1, 2, 5) is a strict local minimum.