Expository Demonstration

Consider the following cost matrix \(A\):

import numpy as np
A = np.array([[40, 50, 60, 65],
              [30, 38, 46, 48],
              [25, 33, 41, 43],
              [39, 45, 51, 59]])
print(A)
[[40 50 60 65]
 [30 38 46 48]
 [25 33 41 43]
 [39 45 51 59]]

We wish to solve the linear sum assignment problem on A, namely minimize \(\operatorname{trace}(P^T A)\) over \(P \in \mathcal{P}\), where \(\mathcal{P}\) is the set of permutation matrices.

from scipy.optimize import linear_sum_assignment
row, col = linear_sum_assignment(A, maximize=False)
P1 = np.eye(4)[col,:]
print("assignment: ", col)
print("cost: ", A[row,col].sum())
print('permutation matrix: ')
print(P1)
assignment:  [0 1 3 2]
cost:  172
permutation matrix: 
[[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 0. 1.]
 [0. 0. 1. 0.]]

However, another optimal solution for A also exists: [0, 3, 1, 2].

col = [0, 3, 1, 2]
P2 = np.eye(4)[col,:]
print("assignment: ", col)
print("cost: ", A[row,col].sum())
print('permutation matrix: ')
print(P2)
assignment:  [0, 3, 1, 2]
cost:  172
permutation matrix: 
[[1. 0. 0. 0.]
 [0. 0. 0. 1.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]]
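Because A is only \(4 \times 4\), we can brute-force all 24 assignments to confirm that these two are in fact the only optima (a standalone sanity check, not part of the original workflow):

```python
import itertools

import numpy as np

A = np.array([[40, 50, 60, 65],
              [30, 38, 46, 48],
              [25, 33, 41, 43],
              [39, 45, 51, 59]])

# Evaluate every one of the 4! = 24 assignments and keep the minimizers.
costs = {p: A[np.arange(4), list(p)].sum()
         for p in itertools.permutations(range(4))}
best = min(costs.values())
minimizers = sorted(p for p, c in costs.items() if c == best)
print("optimal cost:", best)               # 172
print("optimal assignments:", minimizers)  # [(0, 1, 3, 2), (0, 3, 1, 2)]
```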

Trivially, the average of these two permutation matrices yields the same objective value as either one individually. Writing OOFV for the optimal objective function value, \(\operatorname{trace}\big(\big(\tfrac{P_1 + P_2}{2}\big)^T A\big) = \frac{1}{2}\operatorname{trace}((P_1 + P_2)^T A) = \frac{1}{2}\operatorname{trace}(P_1^T A + P_2^T A) = \frac{1}{2}\operatorname{trace}(P_1^T A) + \frac{1}{2}\operatorname{trace}(P_2^T A) = \frac{1}{2}\mathrm{OOFV} + \frac{1}{2}\mathrm{OOFV} = \mathrm{OOFV}\).
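A quick numerical check of this identity (A, P1, and P2 redefined so the snippet runs on its own):

```python
import numpy as np

A = np.array([[40, 50, 60, 65],
              [30, 38, 46, 48],
              [25, 33, 41, 43],
              [39, 45, 51, 59]])

# The two optimal permutation matrices found above.
P1 = np.eye(4)[[0, 1, 3, 2], :]
P2 = np.eye(4)[[0, 3, 1, 2], :]

# Their average is doubly stochastic but not a permutation matrix,
# yet it attains the same objective value.
P_avg = (P1 + P2) / 2
print(np.trace(P1.T @ A))     # 172.0
print(np.trace(P2.T @ A))     # 172.0
print(np.trace(P_avg.T @ A))  # 172.0
```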

Therefore, for instances where there are multiple optimal solutions, we relax the feasible region to the set of doubly stochastic matrices to get a better representation of the space of solutions: minimize \(\operatorname{trace}(P^T A)\) over \(P \in \mathcal{D}\), where \(\mathcal{D}\) is the set of doubly stochastic matrices.

To do so, we utilize the lightspeed optimal transport method to obtain a doubly stochastic matrix that closely approximates this solution. Namely, we run the Sinkhorn-Knopp algorithm on the entrywise exponential \(\exp(-\lambda A)\) (the sign of the exponent is flipped when maximizing).

def alap(P, n, maximize, reg, tol):
    # Exponentiate the scaled cost matrix; the sign flips the problem
    # between minimization and maximization.  The regularization `reg`
    # is scaled by the largest entry for numerical stability.
    # `n` is kept for API compatibility but is not used here.
    power = 1 if maximize else -1
    lamb = reg / np.max(np.abs(P))
    P = np.exp(lamb * power * P)

    # Balance rows and columns with Sinkhorn-Knopp.
    P_eps = _doubly_stochastic(P, tol)

    return P_eps

def _doubly_stochastic(P, tol=1e-3, max_iter=1000):
    # Adapted from @btaba implementation
    # https://github.com/btaba/sinkhorn_knopp
    # of Sinkhorn-Knopp algorithm
    # https://projecteuclid.org/euclid.pjm/1102992505

    c = 1 / P.sum(axis=0)
    r = 1 / (P @ c)
    P_eps = P

    for it in range(max_iter):
        if it % 100 == 0:  # only check every so often to speed up
            if (np.abs(P_eps.sum(axis=1) - 1) < tol).all() and (
                np.abs(P_eps.sum(axis=0) - 1) < tol
            ).all():
                # All column/row sums ~= 1 within threshold
                break

        c = 1 / (r @ P)
        r = 1 / (P @ c)
        P_eps = r[:, None] * P * c
    return P_eps
Q = np.around(alap(A, 4, False, 400, 1e-3), 3)

print("cost: ", np.trace(Q.T@A))
print('doubly stochastic matrix (rounding to three decimal points): ')
print(Q)
print("average of other two optimal matrices: ")
print((P1+P2)/2)
cost:  172.00799999999998
doubly stochastic matrix (rounding to three decimal points): 
[[0.998 0.002 0.    0.   ]
 [0.001 0.498 0.001 0.5  ]
 [0.001 0.498 0.001 0.5  ]
 [0.    0.002 0.998 0.   ]]
average of other two optimal matrices: 
[[1.  0.  0.  0. ]
 [0.  0.5 0.  0.5]
 [0.  0.5 0.  0.5]
 [0.  0.  1.  0. ]]
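As a final sanity check, we can re-run the balancing with a plain Sinkhorn loop (a simplified stand-in for _doubly_stochastic, redefined here so the snippet is self-contained) and confirm the row and column sums:

```python
import numpy as np

A = np.array([[40, 50, 60, 65],
              [30, 38, 46, 48],
              [25, 33, 41, 43],
              [39, 45, 51, 59]], dtype=float)

# Same scaling as alap(A, 4, False, 400, ...): lambda = reg / max|A|.
lamb = 400 / np.abs(A).max()
K = np.exp(-lamb * A)

# Plain Sinkhorn-Knopp: alternately normalize rows and columns.
Q = K.copy()
for _ in range(1000):
    Q /= Q.sum(axis=1, keepdims=True)  # drive row sums toward 1
    Q /= Q.sum(axis=0, keepdims=True)  # drive column sums toward 1

print("max row-sum deviation:", np.abs(Q.sum(axis=1) - 1).max())
print("max col-sum deviation:", np.abs(Q.sum(axis=0) - 1).max())
```

Both deviations fall within the 1e-3 tolerance used above, and the balanced matrix agrees with \((P_1 + P_2)/2\) to roughly the same accuracy seen in the rounded printout.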