I have done a little project in the past involving Genetic Algorithms (GAs), written in VB6 (really poorly). Now, after I understand the whole concept of programming much better, I decided to give it another try.
This time I've decided to try tackling the Traveling Salesman Problem. For those not familiar with it, it goes a little like this:
Imagine you have N cities scattered across a country. A salesman has to travel to each city exactly once, but he doesn't like traveling very much. So, he wants to find the minimum traveling distance while still visiting each city once. The path the salesman takes is often called a tour.
The amount of different tours is N!, which can grow very very large for as little as 20 cities. So simply trying each one and choosing the shortest is not an option. That's where the GA comes in.
The GA starts off with a random sample (population) of tours (let's say 50 of them). Each tour has to be encoded with some kind of string, representing a chromosome. This is often done as a string of bits. Then, a fitness function determines how fit each tour is (in better words: how well the tour fits). A number of the fittest tours are then recombined to (hopefully) form better (shorter) tours. The same number of the worst (longest) tours are removed. This recombination basically takes two chromosomes and switches them around after a certain point, like this:
010 | 01100110
110 | 11010010
010 | 11010010
110 | 01100110
This way, we might end up with even better tours. Then, the same happens for this next generation. The fittest tours are again recombined, etc, until we have a good enough solution, or until a predefined time as run out.
In order to randomize the populations a bit, once every few generations a mutation can occur. This means basically changing one or a few 'bits' (in this case) from 0 to 1 or 1 to 0. This might make the tour in question much better, or much worse. If it's better, it will be recombined with another good tour in the next generation. If it's worse, it will vanish over time as other bad tours.
Well, seems easy enough doesn't it? But the real problem is in determining how to encode the chromosomes, and how to do the recombinations/mutations.
I thought I had it figured out pretty well, by simply taking the chromosome to be an array of the number of each city, in the order which the salesman visits them. For example, if we have cities 1, 2, 3 and 4, a chromosome might look like:
2, 3, 1, 4
indicating that the salesman starts at 2, goes to 3, to 1, and finally ends in 4.
I thought I had this all figured out, but I'm afraid I was half asleep when I thought of this, because it has a fatal flaw:
- After recombination, we cannot be sure that each city is visited exactly once.
For example, suppose we have tours [2,3,1,4] and [3,1,4,2] and recombine them (switching after 2nd city), we get:
[2,3,4,2] and [3,1,1,4]
Now, the first tour visits city 2 twice and never visits city 1. The second tour visits city 1 twice and never city 2...
I cannot think of any way to fix this?! I suppose I could check if the new recombined tour is valid before accepting it, and if not, try it again, but that will probably GREATLY increase the execution time of the algorithm, and GAs are already notoriously slow... I'm doing this mainly to see how fast I can get it, and using a validity check is not a good start at all...
Can anyone think of a better way to encode the tour, and how to recombine two tours while ensuring that the new tour remains valid?
Thanks a lot!