COIN: Coincidence Algorithm

Introduction

Combinatorial optimisation is optimisation where the domain of feasible solutions is discrete.  Examples of this domain are the traveling salesman problem, the minimum spanning tree problem, the set-covering problem, the knapsack problem, etc. It is also related to constraint satisfaction problems, such as the N-Queens puzzle.  For any reasonable problem size, exhaustive search is not feasible, and no search method can guarantee finding an optimal solution.  Combinatorial optimisation has many applications in operational research.

COIN belongs to a subgroup of Evolutionary Algorithms that make use of models to generate solutions. The emphasis is on using some form of model as a repository of “traits” or knowledge extracted from previous candidate solutions. Instead of using genetic operations to create the next generation of candidate solutions from the current ones, the algorithm samples new candidates directly from this model, hence eliminating many of the difficulties involved in designing and performing those genetic operations.

The model in COIN is a joint probability matrix, H. This matrix represents a kind of Markov chain: an entry Hxy is the probability of a transition from a state x to a state y. We call xy a coincidence of the event x and the event y. This matrix, H, is well suited to representing combinatorial problems. Let’s illustrate this representation using the Traveling Salesman Problem (TSP).  A solution of a TSP instance is a tour, a combination of cities which can be represented by a string of numbers.  Each number denotes a city, so the following string is a tour of a ten-city TSP instance.

        1346785290

A coincidence is an adjacent pair of cities in the tour. There are ten coincidences in this tour: 1-3, 3-4, 4-6, 6-7, 7-8, 8-5, 5-2, 2-9, 9-0 and 0-1 (the tour wraps around, so the last city links back to the first).  The joint probability matrix, H, is a square matrix of size n × n.  Each entry Hxy has a value between 0 and 1.0 and denotes the probability that the pair xy occurs in a tour; the sum of each row, Σy Hxy, equals 1.0.  The diagonal entries Hxx are zero.
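To make the representation concrete, here is a minimal sketch in Python (our illustration, not the released COIN code) that extracts the coincidence pairs from the example tour above:

    # A tour is a permutation of city labels; coincidences are the adjacent
    # pairs, wrapping around so the last city links back to the first.
    tour = [1, 3, 4, 6, 7, 8, 5, 2, 9, 0]

    def coincidences(tour):
        n = len(tour)
        return [(tour[i], tour[(i + 1) % n]) for i in range(n)]

    print(coincidences(tour))
    # [(1, 3), (3, 4), (4, 6), (6, 7), (7, 8), (8, 5), (5, 2), (2, 9), (9, 0), (0, 1)]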

Steps of COIN

The Coincidence Algorithm searches for solutions in the same way as any Evolutionary Algorithm: starting from a random population of candidates, it selects some candidates and uses them to update H; from H, a new population of candidates is sampled; the selected candidates are again used to update H; and this process is repeated until satisfactory solutions are found.

Steps of the algorithm
1  Initialise H to a uniform distribution.
2  Sample a population from H.
3  Evaluate the population.
4  Select two groups of candidates: better and worse.
5  Use these two groups to update H.
6  Repeat steps 2-5 until satisfactory solutions are found.

These steps are quite standard and are similar to any Estimation of Distribution Algorithm except for steps 4 and 5.  The precise reason for these steps will be discussed later. For the moment, let's walk through each step in turn; a minimal driver-loop sketch follows below.  The joint probability matrix, H, is central to this algorithm: it is maintained and updated properly throughout the search cycle.
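As a roadmap, here is a hedged end-to-end sketch of the loop in Python with NumPy. The helper names (init_h, sample_population, select_groups, update_h) are illustrative and are fleshed out in the step-by-step sketches below; they are not part of any released COIN code:

    import numpy as np

    def coin(n, pop_size, generations, objective, k=0.1, frac=0.1, seed=0):
        rng = np.random.default_rng(seed)
        H = init_h(n)                                      # step 1
        best = None
        for _ in range(generations):
            pop = sample_population(H, pop_size, rng)      # step 2
            fit = [objective(c) for c in pop]              # step 3
            better, worse = select_groups(pop, fit, frac)  # step 4
            H = update_h(H, better, worse, k)              # step 5
            i = int(np.argmin(fit))                        # assumes minimisation
            if best is None or fit[i] < best[0]:
                best = (fit[i], pop[i])
        return best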

1  Initialise H

H is initially filled with the value 1/(n−1), where n is the size of the problem, except for the diagonal entries Hxx, which are zero.
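A minimal sketch of this step, assuming NumPy:

    import numpy as np

    def init_h(n):
        # Uniform off-diagonal probabilities; each row sums to 1.0.
        H = np.full((n, n), 1.0 / (n - 1))
        np.fill_diagonal(H, 0.0)  # Hxx = 0: a state never follows itself
        return H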

2  Sample a population

If the problem does not constrain the starting point, a random x is chosen; then xy is sampled according to Hxy.  The next step then starts at y, and the next pair is again sampled from H. Any element that repeats an element occurring earlier in the candidate has to be thrown away and resampled. This process is repeated until a combination of length n is reached.  Each candidate is sampled this way; sample a population of the required size.
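A sketch of this step. Rather than literally throwing away repeated elements and resampling, this version masks the already-visited elements and renormalises row x of H over the rest, which draws from the same conditional distribution without an unbounded rejection loop (a design choice of ours, not prescribed by COIN):

    import numpy as np

    def sample_candidate(H, rng):
        n = H.shape[0]
        start = int(rng.integers(n))    # unconstrained starting point
        tour, visited = [start], {start}
        while len(tour) < n:
            x = tour[-1]
            probs = H[x].copy()
            probs[list(visited)] = 0.0  # mask elements already in the tour
            probs /= probs.sum()        # renormalise instead of rejecting
            y = int(rng.choice(n, p=probs))
            tour.append(y)
            visited.add(y)
        return tour

    def sample_population(H, pop_size, rng):
        return [sample_candidate(H, rng) for _ in range(pop_size)]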

3  Evaluate the population

Each candidate in the population is evaluated for its fitness according to some objective function.
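For the TSP example, the fitness of a candidate is simply its tour length; a sketch assuming a symmetric distance matrix D (hypothetical input, for illustration only):

    def tour_length(tour, D):
        # Sum of the distances over all coincidences, including the
        # wrap-around edge from the last city back to the first.
        n = len(tour)
        return sum(D[tour[i], tour[(i + 1) % n]] for i in range(n))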

4  Selection of candidates

The whole population is ranked.  For a single-objective problem this is simple: candidates are ranked by their fitness.  For a multi-objective problem, the most popular choice of ranking is Pareto ranking. The unique characteristic of COIN is that it selects two groups of candidates: a better group and a worse group. This notion of better/worse is relative to the average fitness of the population. The exact selection method can vary according to the problem, for example the best 10% and worst 10%, or some other criterion normalised to the population size and/or the deviation of the fitness.
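A sketch of the best-10%/worst-10% variant mentioned above, assuming that lower fitness is better (as in TSP):

    import numpy as np

    def select_groups(pop, fit, frac=0.1):
        order = np.argsort(fit)            # ascending: best candidates first
        g = max(1, int(frac * len(pop)))   # group size, at least one candidate
        better = [pop[i] for i in order[:g]]
        worse = [pop[i] for i in order[-g:]]
        return better, worse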

5  Updating the joint probability matrix

The update of H is separated into two components: reward and punishment.  The reward is the increase of Hxy for each occurrence of the pair xy found in the better-group candidates; the incremental step is k/(n−1), where k denotes the step size and n the length of a candidate.  The punishment is the decrease of Hxy for each occurrence of the pair xy found in the worse-group candidates, calculated similarly to the reward. Here is the equation:

        Hxy(t+1) = Hxy(t) + k/(n−1) (rxy − pxy) + k/(n−1)² (Σz pxz − Σz rxz)        (eq 1)

where k denotes the step size, n the length of a candidate, rxy the number of occurrences of xy in the better-group candidates, and pxy the number of occurrences of xy in the worse-group candidates.  Hxx is always zero.  The term k/(n−1) (rxy − pxy) is the reward and punishment for the occurrences of a pair xy found in the two groups.  The last term, k/(n−1)² (Σz pxz − Σz rxz), represents the adjustment step for all “other” entries Hxz (z ≠ y, z ≠ x) in the opposite direction, hence keeping the sum of all probabilities in a row constant at one.
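A sketch of eq 1: count the pair occurrences in each group, apply the reward/punishment, and shift the rest of each row in the opposite direction so that row sums stay at one. The final clip is our own safeguard; eq 1 assumes the entries stay within [0, 1] for a small step size k:

    import numpy as np

    def pair_counts(group, n):
        # rxy / pxy: how often the coincidence xy occurs in a group of tours.
        C = np.zeros((n, n))
        for tour in group:
            for i in range(len(tour)):
                C[tour[i], tour[(i + 1) % len(tour)]] += 1
        return C

    def update_h(H, better, worse, k=0.1):
        n = H.shape[0]
        r = pair_counts(better, n)  # reward counts from the better group
        p = pair_counts(worse, n)   # punishment counts from the worse group
        delta = (k / (n - 1)) * (r - p) \
              + (k / (n - 1) ** 2) * (p.sum(1, keepdims=True)
                                      - r.sum(1, keepdims=True))
        np.fill_diagonal(delta, 0.0)         # Hxx stays zero
        return np.clip(H + delta, 0.0, 1.0)  # safeguard (our assumption)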

Computational Cost and Space

For a problem of size n and m candidates in each generation, the computational cost and space complexity are as follows:
1. Generating the population requires time O(mn²) and space O(mn)
2. Sorting the population requires time O(m log m)
3. The generator requires space O(n²)
4. Updating the joint probability matrix requires time O(mn²)

Code Release

<coming soon>

References 

publications directly related to COIN

last update 17 Aug 2017