Expository Demonstration
Consider the following matrix:
import numpy as np
A = np.array([[40, 50, 60, 65],
              [30, 38, 46, 48],
              [25, 33, 41, 43],
              [39, 45, 51, 59]])
print(A)
[[40 50 60 65]
[30 38 46 48]
[25 33 41 43]
[39 45 51 59]]
We wish to solve the linear sum assignment problem on A; namely, minimize \(\operatorname{trace}(P^T A)\) over \(P \in \mathcal{P}\), where \(\mathcal{P}\) is the set of permutation matrices.
from scipy.optimize import linear_sum_assignment
row, col = linear_sum_assignment(A, maximize=False)
P1 = np.eye(4)[col,:]
print("assignment: ", col)
print("cost: ", A[row,col].sum())
print('permutation matrix: ')
print(P1)
assignment: [0 1 3 2]
cost: 172
permutation matrix:
[[1. 0. 0. 0.]
[0. 1. 0. 0.]
[0. 0. 0. 1.]
[0. 0. 1. 0.]]
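As a quick sanity check, the index-based cost above agrees with the trace formulation of the objective. A small sketch recomputing both:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

A = np.array([[40, 50, 60, 65],
              [30, 38, 46, 48],
              [25, 33, 41, 43],
              [39, 45, 51, 59]])
row, col = linear_sum_assignment(A, maximize=False)
P1 = np.eye(4)[col, :]
# trace(P1^T A) sums exactly the entries A[i, col[i]] selected by the assignment
assert np.trace(P1.T @ A) == A[row, col].sum() == 172
```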
However, another optimal solution for A also exists: [0, 3, 1, 2].
col = [0, 3, 1, 2]
P2 = np.eye(4)[col,:]
print("assignment: ", col)
print("cost: ", A[row,col].sum())
print('permutation matrix: ')
print(P2)
assignment: [0, 3, 1, 2]
cost: 172
permutation matrix:
[[1. 0. 0. 0.]
[0. 0. 0. 1.]
[0. 1. 0. 0.]
[0. 0. 1. 0.]]
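Because the matrix is only 4 × 4, we can verify by brute force over all 4! = 24 permutations that these two assignments are in fact the only minimizers. A quick sketch using `itertools`:

```python
import itertools

import numpy as np

A = np.array([[40, 50, 60, 65],
              [30, 38, 46, 48],
              [25, 33, 41, 43],
              [39, 45, 51, 59]])

# Exhaustively evaluate the assignment cost of every column permutation
costs = {perm: A[range(4), perm].sum()
         for perm in itertools.permutations(range(4))}
best = min(costs.values())
optima = sorted(p for p, c in costs.items() if c == best)
print("minimum cost:", best)            # 172
print("optimal assignments:", optima)   # [(0, 1, 3, 2), (0, 3, 1, 2)]
```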
Trivially, the average of these two permutation matrices yields the same objective value as each does individually. Writing OOFV for the optimal objective function value: \(\operatorname{trace}\left(\left(\frac{P_1 + P_2}{2}\right)^T A\right) = \frac{1}{2}\operatorname{trace}\left((P_1 + P_2)^T A\right) = \frac{1}{2}\operatorname{trace}\left(P_1^T A + P_2^T A\right) = \frac{1}{2}\operatorname{trace}(P_1^T A) + \frac{1}{2}\operatorname{trace}(P_2^T A) = \frac{1}{2}\,\mathrm{OOFV} + \frac{1}{2}\,\mathrm{OOFV} = \mathrm{OOFV}\)
Therefore, for instances with multiple optimal solutions, we relax the feasible region to the set of doubly stochastic matrices to better represent the space of optimal solutions: minimize \(\operatorname{trace}(P^T A)\) over \(P \in \mathcal{D}\), where \(\mathcal{D}\) is the set of doubly stochastic matrices.
To do so, we use the lightspeed optimal transport method to obtain a doubly stochastic matrix that closely approximates this solution. Namely, we apply the Sinkhorn-Knopp algorithm to \(\exp(-\lambda A)\).
def alap(P, n, maximize, reg, tol):
    power = 1 if maximize else -1
    lamb = reg / np.max(np.abs(P))
    P = np.exp(lamb * power * P)
    P_eps = _doubly_stochastic(P, tol)
    return P_eps
def _doubly_stochastic(P, tol=1e-3, max_iter=1000):
    # Adapted from @btaba implementation
    # https://github.com/btaba/sinkhorn_knopp
    # of Sinkhorn-Knopp algorithm
    # https://projecteuclid.org/euclid.pjm/1102992505
    c = 1 / P.sum(axis=0)
    r = 1 / (P @ c)
    P_eps = P
    for it in range(max_iter):
        if it % 100 == 0:  # only check every so often to speed up
            if (np.abs(P_eps.sum(axis=1) - 1) < tol).all() and (
                np.abs(P_eps.sum(axis=0) - 1) < tol
            ).all():
                # All column/row sums ~= 1 within threshold
                break
        c = 1 / (r @ P)
        r = 1 / (P @ c)
        P_eps = r[:, None] * P * c
    return P_eps
Q = np.around(alap(A, 4, False, 400, 1e-3), 3)
# print(Q.sum(axis=1))
print("cost: ", np.trace(Q.T@A))
print('doubly stochastic matrix (rounding to three decimal points): ')
print(Q)
print("average of other two optimal matrices: ")
print((P1+P2)/2)
cost: 172.00799999999998
doubly stochastic matrix (rounding to three decimal points):
[[0.998 0.002 0. 0. ]
[0.001 0.498 0.001 0.5 ]
[0.001 0.498 0.001 0.5 ]
[0. 0.002 0.998 0. ]]
average of other two optimal matrices:
[[1. 0. 0. 0. ]
[0. 0.5 0. 0.5]
[0. 0.5 0. 0.5]
[0. 0. 1. 0. ]]
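As the regularization strength grows, the entropic solution should concentrate on the optimal face and approach the average of the two optimal permutation matrices. A rough sketch of this behavior, using a stripped-down Sinkhorn loop (simple alternating row/column normalization, not the exact `_doubly_stochastic` above):

```python
import numpy as np

A = np.array([[40, 50, 60, 65],
              [30, 38, 46, 48],
              [25, 33, 41, 43],
              [39, 45, 51, 59]], dtype=float)

# Average of the two optimal permutation matrices found above
target = np.array([[1.0, 0.0, 0.0, 0.0],
                   [0.0, 0.5, 0.0, 0.5],
                   [0.0, 0.5, 0.0, 0.5],
                   [0.0, 0.0, 1.0, 0.0]])

def sinkhorn(K, n_iter=1000):
    # Alternately normalize rows and columns of a positive matrix
    for _ in range(n_iter):
        K = K / K.sum(axis=1, keepdims=True)
        K = K / K.sum(axis=0, keepdims=True)
    return K

devs = {}
for reg in (100, 400):
    lamb = reg / np.abs(A).max()
    Q = sinkhorn(np.exp(-lamb * A))
    devs[reg] = np.abs(Q - target).max()
    print(f"reg={reg}: max deviation from average = {devs[reg]:.4f}")
```

With larger `reg`, mass on entries belonging only to suboptimal permutations is suppressed more strongly, so the deviation from the average shrinks; the 50/50 split between the two tied assignments persists at any regularization because their costs are exactly equal.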