Watching the computer do the Simplex algorithm
Some time ago, I needed a refresher on linear programming. I observed, among other things, the sparsity of solutions. It felt like there was some nice intuition to be had about the solution, involving the cost vector projected onto the active constraints. In the back of my mind, I knew that these things are “obvious” from the simplex algorithm, but I had forgotten the details. So I decided to implement the simplex algorithm in Python and watch it do its thing.
I took the presentation of the algorithm from the book *Linear programming: methods and applications* [1] and from various online sources I have since forgotten. Here comes my implementation!
```python
import numpy as np  # for linear algebra
import pandas as pd  # for pretty printing

np.seterr(all='raise')


class SimplexTableau:
    def __init__(self, A, b, c, B, names):
        """Initialize the simplex tableau.

        It accepts the A, b and c needed to solve the linear program

            min  c @ x
            s.t. A @ x = b
                 x >= 0

        The argument `names` is a list of names for the variables in x.

        At initialization, one needs a Basic Feasible Solution (BFS) to the
        problem. Such a BFS is provided by the argument B, which is a list of
        indices of the components of x you want to use as a basis. In this
        basis, we can always assume that A[:,B] is the identity matrix. If
        not, one can compute a new basis in which A[:,B] is the identity, and
        the solution to the problem is the same. The cost vector `c` and the
        constraint vector `b` must be updated accordingly.
        """
        n = len(c)
        m = len(b)
        assert c.ndim == 1
        assert b.ndim == 1
        assert A.ndim == 2
        assert A.shape == (m, n)
        assert len(names) == n
        self.names = np.array(names)
        self.A = A
        self.b = b
        self.c = c
        self.B = B
        self.z_c = np.full(n, np.nan)
        np.testing.assert_allclose(
            A[:, B], np.eye(m),
            err_msg="The constraint matrix `A` is not expressed in the basis.")
        self.xhat_ = None  # becomes non-None when the problem has been solved

    def compute_z_c(self):
        """Update self.z_c, the reduced-cost row z_j - c_j."""
        cB = self.c[self.B]
        z = cB @ self.A
        self.z_c = z - self.c

    def is_optimal(self):
        self.compute_z_c()
        return np.all(self.z_c <= 0)

    def step(self):
        self.compute_z_c()
        # Entering variable: the column with the largest z_j - c_j.
        pivot_column = np.argmax(self.z_c)
        # Leaving variable: the row with the smallest positive ratio theta.
        with np.errstate(divide='ignore'):
            theta = self.b / self.A[:, pivot_column]
        theta_pos = theta.copy()
        theta_pos[theta_pos <= 0] = np.inf
        pivot_row = np.argmin(theta_pos)
        self.B[pivot_row] = pivot_column
        # Pivot: re-express A and b in the new basis, so that A[:,B] is
        # again the identity matrix.
        subA = self.A[:, self.B]
        foo = np.linalg.inv(subA)
        self.A = foo @ self.A
        self.b = foo @ self.b

    def display(self):
        """Display the current simplex tableau state.

        Side effects:
            if the current tableau deems the problem solved, self.xhat_
            is populated
        """
        basis_col = self.names[self.B]
        cB = self.c[self.B]
        z = cB @ self.A
        z_c = z - self.c
        pivot_column = np.argmax(z_c)
        with np.errstate(divide='ignore'):
            theta = self.b / self.A[:, pivot_column]
        theta_pos = theta.copy()
        theta_pos[theta_pos <= 0] = np.inf
        pivot_row = np.argmin(theta_pos)
        r1 = np.concatenate([[' ', ' '], self.names, [' ', ' ']])
        r2 = np.concatenate([['B', 'c_B'], self.c.T, ['b', '\\theta']])
        r3 = np.column_stack([basis_col, cB, self.A.round(2),
                              self.b.round(2), theta.round(2)])
        r4 = np.concatenate([[' ', 'z_j'], z.round(2),
                             [(cB @ self.b).round(2), ' ']])
        r5 = np.concatenate([[' ', 'z_j - c_j'], z_c.round(2), [' ', ' ']])
        composite = np.vstack([r1, r2, r3, r4, r5])
        print(pd.DataFrame(composite).to_string(index=False, header=False))
        if self.is_optimal():
            print("Optimal")
            xhat = np.zeros_like(self.c)
            xhat[self.B] = self.b
            self.xhat_ = xhat
            print("Solution: x = ", xhat)
        else:
            print(f"Pivot row: {basis_col[pivot_row]}")
            print(f"Pivot col: {self.names[pivot_column]}")

    def solve(self, MAXITER=10):
        for k in range(MAXITER):
            print(f"\nIteration {k}")
            self.display()
            if self.is_optimal():
                assert self.xhat_ is not None
                return self.xhat_
            else:
                self.step()
        else:  # ran out of iterations without reaching optimality
            print("Failed the solve")
```
Next, I needed a problem to solve. The reason I looked into this was to generalize the proof in a paper on quantile estimation [2], so I cooked up an artificial problem related to that. The full application is below.
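Written out, my reading of the instance that the code below constructs is the following (with miscoverage(k) as defined in the helper function, and n = 7, alpha = 0.4, p = 0.6):

$$
\begin{aligned}
\min_{x \ge 0} \quad & \sum_{k=1}^{n+1} k \, x_k \\
\text{s.t.} \quad & \sum_{k=1}^{n+1} x_k = 1, \\
& \sum_{k=1}^{n+1} \bigl(1 - \mathrm{miscoverage}(k)\bigr)\, x_k = 1 - \alpha .
\end{aligned}
$$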
```python
import scipy.stats
import scipy.optimize


def miscoverage(k, n, p):
    """
    How often will the k-th order statistic NOT be an upper bound on the
    F^{-1}(p) quantile?

    Returns:
        P[X[k-1] < F^{-1}(p)]
    assuming X is a vector of n iid samples in sorted order.
    The indexing is a little messy because it mixes 0-based and
    1-based indexing!
    """
    return 1 - scipy.stats.binom(n=n, p=p).cdf(k - 1)
```
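As a quick sanity check of `miscoverage` (my own addition, not part of the original derivation): for Uniform(0,1) samples we have F^{-1}(p) = p, so the empirical frequency of the event X_(k) < p should match the formula.

```python
# Monte Carlo check with hypothetical values n = 7, p = 0.6, k = 5.
rng = np.random.default_rng(0)
samples = np.sort(rng.uniform(size=(100_000, 7)), axis=1)
print(np.mean(samples[:, 4] < 0.6))  # empirical P[X_(5) < F^{-1}(0.6)]
print(miscoverage(5, 7, 0.6))        # exact value, about 0.42
```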
```python
n = 7         # number of data points
alpha = 0.40  # miscoverage rate
p = 0.6       # the percentile to estimate

names = [f"x{k}" for k in range(1, 2 + n)]
c = np.arange(1, 2 + n, dtype=float)
b = np.array([1, 1 - alpha])
A = np.array([
    np.ones(n + 1),
    1 - miscoverage(np.arange(1, 2 + n), n, p),
])

#
# The augmented basis is a way to find a Basic Feasible Solution
#
A_ = np.block([
    A, np.eye(len(A))
])
b_ = b
c_ = np.concatenate([c, np.full((len(A),), c.max() * 100)])
names_ = names + [f"w{k}" for k in range(1, 1 + len(A))]
B_ = np.arange(n + 1, len(c_))

#
# Solve!
#
xhat_mine = SimplexTableau(A_, b_, c_, B_, names=names_).solve()
xhat_mine = xhat_mine[:-2]  # drop the auxiliary variables

# Verify the solution using scipy
res = scipy.optimize.linprog(c=c, A_eq=A, b_eq=b)
assert res.success
np.testing.assert_array_almost_equal(xhat_mine, res.x)
```
Finally, this generates the following output:
```
Iteration 0
x1 x2 x3 x4 x5 x6 x7 x8 w1 w2
B c_B 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 800.0 800.0 b \theta
w1 800.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 0.0 1.0 1.0
w2 800.0 0.0 0.02 0.1 0.29 0.58 0.84 0.97 1.0 0.0 1.0 0.6 0.6
z_j 801.31 815.07 877.0 1031.83 1264.08 1473.1 1577.61 1600.0 800.0 800.0 1280.0
z_j - c_j 800.31 813.07 874.0 1027.83 1259.08 1467.1 1570.61 1592.0 0.0 0.0
Pivot row: w2
Pivot col: x8

Iteration 1
x1 x2 x3 x4 x5 x6 x7 x8 w1 w2
B c_B 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 800.0 800.0 b \theta
w1 800.0 1.0 0.98 0.9 0.71 0.42 0.16 0.03 0.0 1.0 -1.0 0.4 0.4
x8 8.0 0.0 0.02 0.1 0.29 0.58 0.84 0.97 1.0 0.0 1.0 0.6 366.21
z_j 798.7 785.08 723.77 570.48 340.56 133.64 30.17 8.0 800.0 -792.0 324.8
z_j - c_j 797.7 783.08 720.77 566.48 335.56 127.64 23.17 0.0 0.0 -1592.0
Pivot row: w1
Pivot col: x1

Iteration 2
x1 x2 x3 x4 x5 x6 x7 x8 w1 w2
B c_B 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 800.0 800.0 b \theta
x1 1.0 1.0 0.98 0.91 0.71 0.42 0.16 0.03 0.0 1.0 -1.0 0.4 2.52
x8 8.0 0.0 0.02 0.09 0.29 0.58 0.84 0.97 1.0 -0.0 1.0 0.6 0.71
z_j 1.0 1.12 1.66 3.02 5.06 6.89 7.8 8.0 0.99 7.01 5.2
z_j - c_j 0.0 -0.88 -1.34 -0.98 0.06 0.89 0.8 0.0 -799.01 -792.99
Pivot row: x8
Pivot col: x6

Iteration 3
x1 x2 x3 x4 x5 x6 x7 x8 w1 w2
B c_B 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 800.0 800.0 b \theta
x1 1.0 1.0 0.98 0.89 0.66 0.31 0.0 -0.16 -0.19 1.0 -1.19 0.29 0.29
x6 6.0 0.0 0.02 0.11 0.34 0.69 1.0 1.16 1.19 -0.0 1.19 0.71 inf
z_j 1.0 1.1 1.56 2.72 4.44 6.0 6.78 6.94 0.99 5.95 4.56
z_j - c_j 0.0 -0.9 -1.44 -1.28 -0.56 -0.0 -0.22 -1.06 -799.01 -794.05
Optimal
Solution: x = [0.28743674 0. 0. 0. 0. 0.71256326
0. 0. 0. 0. ]
```
Nice! The solution is the same as the one found by scipy.
You can follow along in the output. We start in the augmented basis w1, w2; I introduced these two variables simply so that we start from a BFS. Then we replace one of them in the basis with one of the variables of the original problem. We pick the entering variable for which the gain in the objective function is largest (this is the quantity z_j - c_j in the tableau). Then we must decide which variable should leave the basis. This is governed by the quantity theta in the tableau: the smallest positive theta goes. In the code, these quantities are computed in the `step` method.
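To make the two rules concrete, here is the selection logic in isolation, on made-up numbers (same formulas as in `step`):

```python
import numpy as np

# A hypothetical tableau state with basis B = (0, 1), so A[:, B] = I.
A = np.array([[1.0, 0.0, 2.0, 1.0],
              [0.0, 1.0, 1.0, 3.0]])
b = np.array([4.0, 6.0])
c = np.array([0.0, 0.0, -3.0, -5.0])
B = np.array([0, 1])

z_c = c[B] @ A - c         # z_j - c_j = (0, 0, 3, 5)
j = np.argmax(z_c)         # entering variable: column 3 (largest gain)
theta = b / A[:, j]        # ratios (4/1, 6/3) = (4, 2)
theta[theta <= 0] = np.inf
i = np.argmin(theta)       # leaving variable: row 1 (smallest positive ratio)
```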
After selecting the new basis direction, we pivot the tableau. That means expressing A, b and c in the new basis, i.e. the one in which the entering variable is a basis vector. The code verifies this by checking that A[:,B] is the identity matrix. After the pivot, we are in the new basis, and we can compute the new z_j, z_j - c_j and theta quantities and display the tableau.
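And the pivot itself, continuing the toy numbers from the previous sketch (entering column j = 3, leaving row i = 1):

```python
import numpy as np

A = np.array([[1.0, 0.0, 2.0, 1.0],
              [0.0, 1.0, 1.0, 3.0]])
b = np.array([4.0, 6.0])
B = np.array([0, 3])          # basis after the swap B[1] = 3

T = np.linalg.inv(A[:, B])    # change-of-basis matrix, as in `step`
A, b = T @ A, T @ b
np.testing.assert_allclose(A[:, B], np.eye(2))  # the identity again
```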
When there is no positive z_j - c_j left, we are at the optimal solution. The solution is then exactly b in this basis, since A[:,B] is the identity.
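A last toy snippet (again with made-up numbers) for reading off the solution once no positive z_j - c_j remains:

```python
import numpy as np

A = np.array([[1.0, 0.5, 0.0],
              [0.0, 0.5, 1.0]])
b = np.array([2.0, 3.0])
c = np.array([1.0, 3.0, 2.0])
B = np.array([0, 2])          # A[:, B] = I

z_c = c[B] @ A - c            # = (0, -1.5, 0): nothing positive left
assert np.all(z_c <= 0)       # optimality criterion
x = np.zeros_like(c)
x[B] = b                      # basic variables take the values in b
# x = (2, 0, 3), and indeed A @ x == b
```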
Please read more in a linear programming book if you want a better explanation. This was mostly a reminder for myself to save and share the code. :) Happy programming!
1. S. I. Gass, *Linear programming: methods and applications*. Decision Making and Operations Management Series. Danvers, Mass.: Boyd & Fraser, 1995. ↩
2. R. Zieliński and W. Zieliński, ‘Best exact nonparametric confidence intervals for quantiles’, *Statistics*, vol. 39, no. 1, pp. 67–71, Feb. 2005, doi: 10.1080/02331880412331329854. ↩