A GA to Solve the TSP
SIGCSE 2008 nifty assignment
Raja Sooriamurthi / raja@cmu.edu

# The Assignment

In this assignment students will write a genetic algorithm (GA) to solve instances of the Traveling Salesman Problem (TSP). I normally give this assignment as a warm up exercise in the beginning of an advanced class on object orientation, or as an intermediate exercise in an introductory class on object-oriented programming. The main pedagogical objectives of the assignment are:
• Working with multiple classes: In my introductory class this is the first significant assignment where students work with multiple classes that together form a solution to a problem. (My complete solution consists of 400+ lines spread over 8 classes.)

• Reading and extending code: I also use this as an exercise in reading, understanding, and extending an existing application. As we discuss in my class, most of the time in the real world one will be working on an existing application. One needs to hone the skill of reading and understanding code written by another person before making extensions to it.

# SIGCSE description

The description of the assignment in SIGCSE nifty assignment terms is as follows:

 Summary The program will implement a simple genetic algorithm (GA) to solve instances of the Traveling Salesman Problem (TSP). Topics Intermediate object-orientation; responsibility driven design; unit testing; principles of genetic algorithms Audience CS1 or CS2 Difficulty Intermediate. Typically takes 1 to 1.5 weeks. (Depends on how much code students are given a priori and how much they are expected to write on their own.) Strengths Students have enjoyed the assignment and have found the problem to be an interesting programming exercise. They particularly like the visual feedback of how their program is performing and the evolutionary improvement in the computed solution. I use this as an exercise as part of my introduction to responsibility driven design. The sample code I provide deliberately has some design flaws. Part of the learning goal of the assignment is for students to read my code and improve upon it. For example, the program needs to draw cities (the green dots) on a map (the black background). An interesting discussion question is: Whose responsibility is it to do the drawing? Should the cities draw them selves on the map? Leading to city.draw(map) or should the map draw the cities? Leading to map.draw(city). Is one choice better than the other? I also use this exercise to introduce unit testing with jUnit. Students write some simple tests for the methods in the Util class. I've also used this problem to provide an informal and intuitive introduction to the notion of NP-completeness. Weaknesses How much background code to give depends on the prior preparation of the students and the focus of the class. On occasion, some students have felt that I give them too much help code thereby making the assignment less challenging. But one of the points of this exercise is to read and understand another persons code and then to extend it. Dependencies Requires a brief discussion of the general idea GAs and how they work. I hand out a chapter on genetic algorithms from A.K. Dewdney's The New Turing Omnibus. Variants There is an itneresting trade-off between the complexity of the encoding method and cross-over method. Students need to write methods to encode a tour of cities. A simple representation is to use a linear sequence of cities. But the process of genetic cross-over may lead to an invalid tour which then needs to be patched. Alternatively a more sophisticated tour representation could be used wherein any cross-over will result in a valid tour. Once the general GA framework has been created it could be applied to other problems. I normally give this assignment as a Java application but with minor modifications it could also be written as an applet.

# Short Background on Genetic algorithms

For a nice concise description of the essential ideas behind a genetic algorithm read the chapter from A.K.Dewdney's The New Turing Omnibus. The essential high-level structure of applying a genetic algorithm to solve a problem consists of:
1. Determine a way to encode a representation for a solution.
2. Determine a fitness function (how to compare two candidate solutions).
3. Create an initial population of random individuals.

4. REPEAT
1. SELECT two parents from the population pool
2. CROSSOVER (reproduce) the parents to produce two children
3. probabilistically MUTATE the children
4. collect the children in a new pool
UNTIL
the fitness of the best individual doesn't improve (or the average fitness of the pool as a whole doesn't improve)

# The Traveling Salesman Problem (TSP)

In this assignment you will study and complete a set of classes that implement a GA solution to the Traveling Salesman Problem. In the figures below you have 20 cities (numbered in yellow from 0 -- 19) laid out on the sides of a square. Your program will start off with an initial collection of random tours. The sequence of cities within a sample tour is given in red in the graphic below left. Your GA will determine the optimal way to traverse these 20 cities which is along the sides of the square (below right figure). The sequence of cities within a tour and the distance of a tour is also given in the title bar of each of the windows below. The square is of 600 pixles width and hence the optimal tour distance is 2400. Clicking on each graphic will open it in a window of its own.

# The Provided Code

The starting code-framework consists of the following files:
File name Given Code

My Solution

LOC SLOC LOC SLOC
City.java 27 12 38 27
GenerateCities.java 57 34 57 34
Map.java 59 39 44 29
SplitInterval.java 62 36 62 36
TSP.java 126 77 126 77
Tour.java 180 120 204 155
Util.java 29 12 33 20
UtilTest.java 0 0 37 26
Total 540 330 639 426

LOC: Lines of code (source + comments + blank lines)
SLOC: Source Lines of Code

The files marked in purple don't need to be changed at all. The files marked with green will need modifications. In your final version of your code you will want to use the TSP.java code I've provided as is. But during the process of developing your solution you may want to edit TSP.java to experiment with your evolving code.

# The assignment

The following would be one way of approaching this assignment:
1. Read the chapter on Genetic algorithms from A. K.Dewdney's The Turing Omnibus (handout in class) and get an overview of the whole process. Note that in the Dewdney article he talks about a novel encoding of tours that will ensure that we always get valid tours with crossover. We are not using that encoding. Rather ours is just a simple encoding of the actual sequence of cities visited in the tour.

2. Download, print and study the giving code-framework Being able to comprehence and make changes to an existing code-base is one of the objectives of this assignment.

3. In Util.java write the methods for:

• rndInt(int i, int j)
returns an random integer r in the interval i <= r < j

• count(int[] a, int key, int start, int end)
returns the number of times key occurs in the array segment of a[start..end-1] i.e., upto end but not including it. This method will be used in Tour.java when writing the crossover method. It will be used to ensure that the tours we get from the cross-over process are indeed valid tours.

4. Concurrently write UtilTest.java to develop a suite of units tests for both these methods

5. In City.java complete the method distance(City) which calculates the distance between two cities

6. In City.java move the definition of display(Graphics,int) from Map.java to City.java

7. In Tour.java write the method distance to find out the distance of an entire tour

8. Finally write the method crossover(Tour, int, int)
The method interface is: Tour crossOver(Tour p2, int x1, int x2)
this ---------------------------------
^           ^
x1          x2
p2   +++++++++++++++++++++++++++++++++
^           ^
x1          x2

c1    ------+++++++++++++--------------
|           |
|           |
c2    ++++++-------------++++++++++++++

At a high level the array copying takes place in three stages:
c1[    0 .. x1-1    ]  = this[    0 .. x1-1     ]
c1[   x1 .. x2      ]  = p2  [   x1 .. x2       ]
c1[ x2+1 .. lenth-1 ]  = this[ x2+1 .. length-1 ]

The potential problem is that after we copy the array segments we may have an invalid tour with duplicate cities as shown below:

this  a b c d e f g h i
^^^^^
p2    d h i c b e a g f
^^^^^

c1    a b c c b e g h i
^^^^^

We can resolve this by: (here is where you will need to use count)

1. scan the crossover segment of 'this' for elements that don't occur in the crossover segment of p2:
uniq: d f

2. scan the beginning and end segments of the child looking for duplicate elements in the crossed-over segment. When we find a duplicate replace it with an element of the 'uniq' array.
e.g.,  replace 'b' with 'd'
'c' with 'f'

giving the new child:
a d f c b e g h i