37242 Optimisation in Quantitative Management
(part 2)
Let A be a square n × n matrix. Then the following statements are
equivalent.
- A is an invertible matrix.
- Using elementary row operations, A can be transformed into In.
- The equation Ax = 0 has only the trivial solution.
- The columns of A are linearly independent.
- A is a product of elementary matrices.
Consider arbitrary y ∈ E2 and z ∈ E2, i.e. y = (y1, y2) and
z = (z1, z2).
According to the Law of Cosines (also known as the cosine formula
or cosine rule)
‖y − z‖² = ‖y‖² + ‖z‖² − 2‖y‖ ‖z‖ cos θ,    (1)
where θ is the angle between z and y.
On the other hand,
‖y − z‖² = (y1 − z1)² + (y2 − z2)²
         = (y1² + y2²) + (z1² + z2²) − 2y1z1 − 2y2z2
         = ‖y‖² + ‖z‖² − 2y1z1 − 2y2z2,
where the first bracket is ‖y‖² and the second is ‖z‖². By
substituting into (1),
y1z1 + y2z2 = ‖y‖ ‖z‖ cos θ,
and the left-hand side is exactly y · z.
Hence, for any z ∈ E2 and y ∈ E2 such that y ≠ 0 and z ≠ 0,
z · y = ‖z‖ ‖y‖ cos θ,
and θ = 90◦ if and only if z · y = 0.
In general, two vectors z ∈ En and y ∈ En are orthogonal
(perpendicular) if
z · y = 0.
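As a quick numeric illustration of the identity z · y = ‖z‖ ‖y‖ cos θ and of the orthogonality test (a sketch with arbitrarily chosen example vectors, not part of the lecture material):

```python
import math

def dot(u, v):
    # dot product u . v
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(u):
    # Euclidean norm ||u||
    return math.sqrt(dot(u, u))

y = [3.0, 0.0]
z = [2.0, 2.0]

# angle between z and y recovered from z . y = ||z|| ||y|| cos(theta)
cos_theta = dot(y, z) / (norm(y) * norm(z))
theta_deg = math.degrees(math.acos(cos_theta))
print(theta_deg)  # approximately 45 degrees for these two vectors

# orthogonality: (1, 2) and (2, -1) have zero dot product
print(dot([1.0, 2.0], [2.0, -1.0]))  # 0.0
```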
Let b be an arbitrary number and a = (a1, ..., an) be an arbitrary
vector such that a ≠ 0. Consider the linear function
f(x1, ..., xn) = a1x1 + ... + anxn = a · x    (2)
and two arbitrary vectors z ∈ En and y ∈ En such that
f(z) = b and f(y) = b.    (3)
Then, (2) and (3) imply that
a · (z − y) = 0,
i.e. the vectors a and z − y are orthogonal.
The set of all points satisfying an equality of the form
a1x1 + ... + anxn = b
is called a hyperplane. In the two-dimensional case, all points
satisfying an equality of the form
a1x1 + a2x2 = b    (4)
form a straight line. In the three-dimensional case, all points
satisfying an equality of the form
a1x1 + a2x2 + a3x3 = b
form a plane.
The set of all points satisfying an inequality of the form
a1x1 + ... + anxn ≤ b    (or ≥ b)
is called a half-space.
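Classifying a point relative to a hyperplane is a direct application of these definitions; the following Python sketch (with an arbitrarily chosen hyperplane in E3) locates a point on the hyperplane or in one of the two half-spaces:

```python
def dot(u, v):
    # dot product u . v
    return sum(ui * vi for ui, vi in zip(u, v))

def classify(a, b, x):
    """Locate x relative to the hyperplane a . x = b."""
    value = dot(a, x)
    if value == b:
        return "on the hyperplane"
    return "in the half-space a.x <= b" if value < b else "in the half-space a.x >= b"

# hyperplane x1 + 2*x2 - x3 = 3 in E3 (illustrative numbers)
a, b = [1.0, 2.0, -1.0], 3.0
print(classify(a, b, [1.0, 1.0, 0.0]))  # on the hyperplane
print(classify(a, b, [0.0, 0.0, 0.0]))  # in the half-space a.x <= b
```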
Consider the two-dimensional case.
EXERCISE 1
Show that, for each v ∈ E2,
v = y + αa    (5)
for some scalar α and some y ∈ E2 such that a · y = b.
EXERCISE 2
Prove that y and α in (5) are unique.
The straight line (4) divides the plane into two half-planes (two
half-spaces).
EXERCISE 3
Show that for all points in one of these half-planes the scalar α in
(5) is nonnegative, whereas for all points in the other this scalar is
nonpositive.
EXERCISE 4
Prove that if, for some v, the scalar α in (5) is nonnegative, then
a · v ≥ b; and if this scalar is nonpositive, then a · v ≤ b.
According to calculus, a function increases in the direction of
its gradient and decreases in the opposite direction. So, the above
result also follows from the observation that a is the gradient of
f(x1, ..., xn), i.e.
(a1, ..., an) = (∂f/∂x1, ..., ∂f/∂xn).
The feasible region of a linear programming problem can be viewed
as the intersection of half-spaces. The intersection of a finite
number of half-spaces is called a polyhedral set.
Thus, the feasible region {x : Ax ≤ b, x ≥ 0} of a linear
programming problem
minimise c · x
subject to
Ax ≤ b
x ≥ 0
is a polyhedral set.
For example, consider the following linear programming problem
maximise 2x1 + 5x2    (6)
subject to
x1 + x2 ≤ 4    (7)
x1 ≤ 2    (8)
x2 ≤ 3    (9)
x1 ≥ 0, x2 ≥ 0    (10)
The constraint x1 ≥ 0 specifies a half-plane (half-space). The two
nonnegativity constraints (10) restrict the set of considered
points to the first quadrant.
In order to find which half-space is specified by the constraint
x1 + x2 ≤ 4
it suffices to choose any point that is not on the straight line
x1 + x2 = 4
and check whether this point satisfies the inequality (7). If so, then
all points that belong to the same half-space as the chosen point
satisfy the inequality (7). Otherwise, all points on the opposite side
satisfy this inequality.
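The test just described can be sketched in a couple of lines of Python, using the origin as the test point for constraint (7):

```python
def satisfies(point):
    """Check the constraint x1 + x2 <= 4 at a given point."""
    x1, x2 = point
    return x1 + x2 <= 4

# The origin is not on the line x1 + x2 = 4, so it can serve as a test point.
print(satisfies((0.0, 0.0)))  # True: the origin's side is the feasible half-plane
print(satisfies((5.0, 0.0)))  # False: this point lies on the opposite side
```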
The intersection of this region with the half-plane, specified by (8),
and then with the half-plane, specified by (9), results in the
following feasible region:
All points that give the same value of the objective function, say z,
form the straight line
2x1 + 5x2 = z.    (11)
For any two different values of the objective function, the
corresponding straight lines are parallel because they have the
same slope.
Then, it remains to find, among all straight lines (11) that contain
at least one feasible point, the straight line with the largest z.
Observe that the optimal solution is a corner point of the feasible
region. The corner points will also be referred to as extreme points.
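This observation can be checked numerically for problem (6)-(10). The following Python sketch (a brute-force illustration, not the method used in the lectures) enumerates the intersection points of pairs of constraint lines, keeps the feasible ones, and evaluates the objective at each corner point:

```python
from itertools import combinations

# Constraints a . x <= b for problem (6)-(10), written as (a1, a2, b)
constraints = [
    (1.0, 1.0, 4.0),   # x1 + x2 <= 4   (7)
    (1.0, 0.0, 2.0),   # x1      <= 2   (8)
    (0.0, 1.0, 3.0),   # x2      <= 3   (9)
    (-1.0, 0.0, 0.0),  # x1 >= 0        (10)
    (0.0, -1.0, 0.0),  # x2 >= 0        (10)
]

def objective(x1, x2):
    return 2 * x1 + 5 * x2  # objective (6)

def feasible(x1, x2, eps=1e-9):
    return all(a1 * x1 + a2 * x2 <= b + eps for a1, a2, b in constraints)

# A corner point is the intersection of two binding constraint lines.
best = None
for (a1, a2, b), (c1, c2, d) in combinations(constraints, 2):
    det = a1 * c2 - a2 * c1
    if abs(det) < 1e-12:
        continue  # parallel lines, no intersection point
    x1 = (b * c2 - a2 * d) / det  # Cramer's rule for the 2x2 system
    x2 = (a1 * d - b * c1) / det
    if feasible(x1, x2):
        if best is None or objective(x1, x2) > best[0]:
            best = (objective(x1, x2), x1, x2)

print(best)  # (17.0, 1.0, 3.0): the optimum 17 at the corner point (1, 3)
```

The best value, 17, is attained at the corner point (1, 3), in agreement with the geometric solution.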
Summary of the Geometric Solution Procedure
- Find the feasible region as the intersection of all regions
specified by the constraints.
- Identify the gradient of the objective function (the gradient
points in the direction in which the function increases).
- Identify the optimal solution by sliding the straight line
associated with the objective function in the direction of the
gradient for the maximisation problem, or in the opposite
direction (the direction of the minus gradient) for the
minimisation problem.
EXERCISE 5
Construct a linear programming problem with two decision
variables that has infinitely many optimal solutions. Solve this
problem geometrically.
EXERCISE 6
Construct a linear programming problem in the standard form with
three equality constraints and five decision variables that has no
feasible solutions.
EXERCISE 7
Consider the following optimisation problem
minimise c · x
subject to
Ax = b
with c ∈ En, b ∈ Em, and an m × n matrix A. Let there exist
y ∈ En and z ∈ En such that
c · z ≠ 0, Az = 0, and Ay = b.
Prove that this problem has no optimal solutions.
EXERCISE 8
Construct a linear programming problem with two decision
variables whose feasible region is not empty but which has no
optimal solutions.
EXERCISE 9
Solve the following linear programming problem geometrically
minimise −7x1 − 5x2
subject to
−x1 + x2 ≤ 0
x1 + 2x2 ≤ 0
x1 ≤ 1    (12)
x1 ≥ −1    (13)
x2 ≤ 1    (14)
x2 ≥ −1    (15)
Prove that without the constraints (12), (13), (14), (15) this
problem has no optimal solutions.
Let x ∈ En and y ∈ En be two arbitrary points. The line segment
joining x and y is the set of all points
αx + (1 − α)y    (16)
for all 0 ≤ α ≤ 1.
The expression (16) is called the convex combination of x and y if
0 ≤ α ≤ 1 and is called the strict convex combination if 0 < α < 1.
A convex combination of any finite collection of points x1 ∈ En, …,
xk ∈ En is
α1x1 + ... + αkxk
where all αi are nonnegative and α1 + ... + αk = 1.
The set X ⊂ En is convex if, for any x ∈ X, y ∈ X, and 0 ≤ α ≤ 1,
αx + (1 − α)y ∈ X.
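These definitions are easy to experiment with; the following Python sketch (with arbitrarily chosen example points) forms convex combinations of two and of three points:

```python
def convex_combination(points, weights):
    """Return sum_i weights[i] * points[i]; weights must be nonnegative and sum to 1."""
    assert all(w >= 0 for w in weights)
    assert abs(sum(weights) - 1.0) < 1e-12
    dim = len(points[0])
    return tuple(sum(w * p[j] for w, p in zip(weights, points)) for j in range(dim))

# Points of the segment (16) joining x = (0, 0) and y = (4, 2)
x, y = (0.0, 0.0), (4.0, 2.0)
mid = convex_combination([x, y], [0.5, 0.5])
print(mid)  # (2.0, 1.0), the midpoint of the segment

# A convex combination of three points
p = convex_combination([(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)], [0.25, 0.25, 0.5])
print(p)    # (1.0, 2.0)
```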
Figures: examples of convex sets; an example of a non-convex set.
A point x is an extreme point of a convex set S ⊂ En if it cannot
be written as a strict convex combination of two other points of
this set. In other words, if
x = αy + (1 − α)z
for some y ∈ S, z ∈ S, and 0 < α < 1, then x = y = z.
EXERCISE 10
Let A be an m × n matrix and b be an arbitrary m-vector. Prove
that
- the set {x : Ax = b, x ∈ En} is convex;
- the set {x : Ax ≤ b, x ∈ En} is convex;
- the set {x : Ax ≥ b, x ∈ En} is convex.
EXERCISE 11
Let X ⊂ En and Y ⊂ En be two arbitrary convex sets. Prove that
the set X ∩ Y is convex.
EXERCISE 12
Give an example of two convex sets X ⊂ En and Y ⊂ En such
that the set X ∪ Y is not a convex set.
Theorem 1
For any positive integers m and n and any m × n matrix A, the
maximum number of linearly independent rows is equal to the
maximum number of linearly independent columns.
This common number is called the rank of A.
In what follows, it is assumed that any linear program in standard
form
minimise c · x    (17)
subject to
Ax = b    (18)
x ≥ 0    (19)
with the parameters c ∈ En, an m × n matrix A, and b ∈ Em,
satisfies two conditions:
(LP1) m < n;
(LP2) the rank of matrix A is m.
Since the rank of A is m, matrix A contains m linearly independent
columns. Let B be a matrix, formed by these columns, and let N
be a matrix formed by all remaining columns.
Denote by xB the vector that is formed by the variables, associated
with B (the variables in xB are listed in the same order as the
corresponding columns in B).
Denote by xN the vector that is formed by the variables, associated
with N (the variables in xN are listed in the same order as the
corresponding columns in N).
Matrix B is called a basis matrix and the variables constituting xB
are called basic variables.
The variables constituting xN are called nonbasic variables.
Using B, xB, N, and xN, (18) can be rewritten as
BxB + NxN = b
and consequently
xB = B⁻¹b − B⁻¹NxN.
If B⁻¹b ≥ 0, then the solution
xB = B⁻¹b and xN = 0
is called a basic feasible solution.
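For the example problem (6)-(10) in standard form (with slack variables s1, s2, s3 added to (7)-(9)), a basic feasible solution can be computed directly. The following Python sketch uses a small hand-rolled Gaussian elimination in place of B⁻¹, and the basis (x1, x2, s2) is an assumed choice made for illustration:

```python
def solve(B, b):
    """Solve B x = b by Gauss-Jordan elimination with partial pivoting (B invertible)."""
    n = len(B)
    M = [row[:] + [rhs] for row, rhs in zip(B, b)]  # augmented matrix [B | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col] / M[col][col]
                M[r] = [mr - factor * mc for mr, mc in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

# Standard form of (6)-(10) with slacks s1, s2, s3:
#   x1 + x2 + s1 = 4,  x1 + s2 = 2,  x2 + s3 = 3
# Basis matrix B for the basic variables (x1, x2, s2):
B = [[1.0, 1.0, 0.0],
     [1.0, 0.0, 1.0],
     [0.0, 1.0, 0.0]]
b = [4.0, 2.0, 3.0]

xB = solve(B, b)
print(xB)  # [1.0, 3.0, 1.0]: x1 = 1, x2 = 3, s2 = 1, all nonnegative
```

Since B⁻¹b ≥ 0, setting the nonbasic variables s1 and s3 to zero gives a basic feasible solution, which corresponds to the corner point (1, 3) of the feasible region.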
Lemma 1
Let A be an m × n matrix with columns a1, …, an which has rank
m. Then for any k < m and any set {ar1, ..., ark} of k linearly
independent columns, there exists a column ae ∉ {ar1, ..., ark} such
that the columns
ar1, ..., ark, ae
are linearly independent.
Proof
Suppose that the required column ae does not exist. Then, for any
set {ag1, ..., agm} of m columns, there exist scalars ηi,j, 1 ≤ i ≤ k
and 1 ≤ j ≤ m, such that
ag1 = η1,1ar1 + ... + ηk,1ark
· · ·
agm = η1,mar1 + ... + ηk,mark
Any scalars α1, …, αm that satisfy the vector equation
α1ag1 + ... + αmagm = 0    (20)
also satisfy the vector equation
α1(η1,1ar1 + ... + ηk,1ark) + ... + αm(η1,mar1 + ... + ηk,mark) = 0
and vice versa.
The latter vector equation is equivalent to
(α1η1,1 + ... + αmη1,m)ar1 + ... + (α1ηk,1 + ... + αmηk,m)ark = 0.
Since ar1, …, ark are linearly independent, all coefficients in this
linear combination must be zero. In other words,
η1,1α1 + ... + η1,mαm = 0
· · ·
ηk,1α1 + ... + ηk,mαm = 0
This system of linear equations with the unknowns α1, …, αm has
infinitely many solutions because k < m.
Summarising the above, for any m columns
ag1, ..., agm,
there are infinitely many values α1, …, αm that satisfy (20).
This contradicts the assumption that the rank of A is m. Indeed,
because the rank of A is m, there are m linearly independent
columns
ag1, ..., agm,
and since these columns are linearly independent, (20) has only the
unique solution
α1 = ... = αm = 0.
The obtained contradiction is the consequence of the assumption
that the required column ae does not exist.
Theorem 2
For any linear program (17)-(19), a point x is an extreme point
of {x : Ax = b, x ≥ 0} if and only if x is a basic feasible solution.
Proof
Let x be a basic feasible solution that corresponds to some
invertible matrix B and let N be the matrix of all columns of A
which are not in B.
Suppose that x is not an extreme point, i.e. there exist 0 < α < 1
and feasible solutions y and z such that y ≠ x, z ≠ x, and
x = αy + (1 − α)z.    (21)
Let yN and zN be the vectors, formed by the variables in y and z,
respectively, which are associated with the columns in N and are
listed in the same order as the corresponding columns in N. Then,
xN = αyN + (1 − α)zN,
which, by virtue of xN = 0, yN ≥ 0, zN ≥ 0, and 0 < α < 1,
implies
yN = 0 and zN = 0.    (22)
Let yB and zB be the vectors, formed by the variables in y and z,
respectively, which are associated with the columns in B and are
listed in the same order as the corresponding columns in B.
Since y and z are feasible solutions,
ByB + NyN = b and BzB + NzN = b:
This, by virtue of (22), gives
ByB = b and BzB = b
and consequently
yB = B⁻¹b and zB = B⁻¹b.
This together with (22) leads to y = x and z = x, which
contradicts that y ≠ x and z ≠ x.
Now assume that x is an extreme point. If x = 0, then x is a basic
feasible solution.
Suppose that x ≠ 0. Let C be a matrix that is formed by the
columns of A that correspond to the variables in x that are greater
than zero.
Observe that C may not be a square matrix.
Denote by xC the vector that is formed by the variables in x,
associated with matrix C (the variables in xC are listed in the same
order as the corresponding columns in C).
Let c1, …, ck be the columns of C.
If the columns in C are linearly independent, then k ≤ m < n.
If k = m, then C is a square invertible matrix. If k < m, then by
Lemma 1, there exists a square invertible matrix that contains all
columns of C.
In both cases, x is a basic feasible solution because all columns of
A which are not in this square invertible matrix are associated
with variables that are zero.
Suppose that c1, …, ck are linearly dependent, i.e. there exist
scalars α1, …, αk, not all zero, such that
α1c1 + ... + αkck = 0.    (23)
Denote α = (α1, ..., αk). Then, (23) can be written as
Cα = 0.    (24)
Since xi > 0 for each variable xi in xC, there exists a (sufficiently
small) ε > 0 such that
xC + εα > 0
xC − εα > 0    (25)
If C ≠ A, then denote by D a matrix formed by the columns of A
which are not in C.
Denote by xD the vector formed by the variables in x, associated
with matrix D (the variables in xD are listed in the same order as
the corresponding columns in D).
Consider two vectors, y and z, defined as follows:
- If C = A, then
y = xC + εα
z = xC − εα
- If C ≠ A, then vectors y and z are defined by specifying the
values of the variables associated with C and the values of the
variables associated with D:
yC = xC + εα and yD = xD
zC = xC − εα and zD = xD
Taking into account (24), (25) and that xD = 0,
Ay = b and y ≥ 0
Az = b and z ≥ 0.
On the other hand,
x = (1/2)y + (1/2)z.
This contradicts the assumption that x is an extreme point.
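Theorem 2 can be illustrated numerically on the example problem (6)-(10) in standard form with slacks s1, s2, s3. The following Python sketch (a brute-force illustration, not part of the proof) enumerates all candidate basis matrices, keeps those that give basic feasible solutions, and collects the corresponding (x1, x2) values:

```python
from itertools import combinations

def solve(B, b):
    """Solve B x = b by Gauss-Jordan elimination; return None if B is singular."""
    n = len(B)
    M = [row[:] + [rhs] for row, rhs in zip(B, b)]  # augmented matrix [B | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            return None  # no usable pivot: B is singular
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [mr - factor * mc for mr, mc in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

# Columns of A for the standard form of (6)-(10), variables (x1, x2, s1, s2, s3)
A = [[1.0, 1.0, 1.0, 0.0, 0.0],
     [1.0, 0.0, 0.0, 1.0, 0.0],
     [0.0, 1.0, 0.0, 0.0, 1.0]]
b = [4.0, 2.0, 3.0]
m, n = 3, 5

corners = set()
for basis in combinations(range(n), m):
    B = [[A[i][j] for j in basis] for i in range(m)]
    xB = solve(B, b)
    if xB is None or any(v < -1e-9 for v in xB):
        continue  # singular basis matrix, or basic solution not feasible
    x = [0.0] * n  # nonbasic variables are zero
    for j, v in zip(basis, xB):
        x[j] = v
    corners.add((round(x[0], 6), round(x[1], 6)))

print(sorted(corners))
# [(0.0, 0.0), (0.0, 3.0), (1.0, 3.0), (2.0, 0.0), (2.0, 2.0)]
```

The five points produced are exactly the corner (extreme) points of the feasible region of (6)-(10) found geometrically above, as Theorem 2 asserts.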