Introduction to Genetic Algorithms



In this assignment, you will study the performance of a genetic algorithm at solving three problems of increasing difficulty: the pattern problem, the traveling salesman problem, and the nurse scheduling problem. I have provided the bulk of the code for you, but there are a few methods remaining for you to implement.

About the code

This code demonstrates a Pythonic technique known as "mixins". This should be familiar to you if you've used Lisp or Scheme; Java's interface system is somewhat similar.

Mixins take advantage of Python's support for multiple inheritance to "mix in" behavior from several unrelated classes. In our case, we'll use it to deal with one of the common frustrations involving GAs: all the different choices for implementing the algorithm. We will mix in behavior from four different classes: We also need a class to contain the problem knowledge, which includes a fitness function, a method to generate an initial population, and a method to mutate chromosomes. I have provided three of these, along with an abstract base class. This architecture may seem complicated, but it allows us to easily try out different combinations of elitism, crossover and selection mechanisms without editing, recompiling, or generating config files. Instead, we create a mixin that inherits the specific classes whose behavior we want to incorporate. For example, let's suppose we want to solve the "all-ones" problem: find a bitstring with all ones in it. This is easy, but will help us get going.

We start by making a fitness function:
>>> def allOnes(chr) :
     return len([x for x in chr.bitstring if x == '1'])
This is not the only possible fitness function; it's just as easy one to get us started. The key is that it should return larger values for solutions that are closer to the goal. Try writing fitness functions to match particular patterns, or compute parity.

Now, we create a BitStringProblem. Let's suppose we want each individual to be of length 20.
import BitStringGAProblem
b = BitStringGAProblem.BitStringGAProblem(allOnes,20)
Lastly, we build a solver for this problem. Let's suppose we want to use deterministic elitism, single-point crossover, and roulette selection. We would do:
>>> class mySolver(GA.GA, Elitism.DeterministicElites, Crossover.OnePointStringCrossover, Selector.RouletteSelector):
...     pass
This creates a class called mySolver that "mixes in" the behavior we want. We then create an instance of this class, pass it the problem we want to solve, and run it.
>>> m=mySolver(b)
>>> m.run()
(output omitted)
Run contains a number of optional arguments:
  • I've left a few methods for you to implement: TournamentSelection and GreedyCrossover. TournamentSelection is pretty straightforward; GreedyCrossover is more challenging. (There are some hints in Crossover.py). You might want to get comfortable running the code first and start working on the next question, and then return to this at the end.


  • The bitstring problem is a good place to get started, because it's easy to see what the solution should be, but it's pretty boring. We don't really need all this infrastructure just to write a program that generates a string of all ones.

    This assignment also contains two "real-world" problem: the nurse scheduling problem and the traveling salesman problem. This will let you experiment with using a genetic algorithm to solve a more complex problem.

    The nurse scheduling problem can be stated as this: given a set of nurses n1 - nk, a set of shifts s1-sj, and a set of constraints c1-cm, find an assignment of nurses to shifts such that all constraints are satisfied. (Or, if this is impossible, such that the constraints are minimally violated.)

    We can represent the nurse scheduling problem as a matrix, with nurses on the rows and shifts on the columns. A one in a cell in the matrix indicates that the nurse is scheduled for that shift, and a 0 indicates he/she is not. for example:
        s1    s2    s3
    n1  0      1     1
    n2  1      0     0
    n3  0      1     0
    
    We can encode this as the bitstring 011100010.

    NurseSchedulingProblem.py creates a NurseSchedulingProblem by providing a set of contraints to a NurseFactory, which is repsonsible for creating these objects. By creating different constraints, you can specify different problems.

    To begin, run the GA with the nursing problem on a simple example to get the feel of it. I've provided you with two sample constraints: oneNursePerShift, which says that each shift should have exactly one nurse working it, and oneShiftEach, which says that each nurse should work exactly one shift. (for there to be a solution, there must be the same number of nurses as shifts).
    We can build a nurse scheduling problem like so
    constraintList = [oneShiftEach, oneNursePerShift]
    nurseProblem = nurseFactory(constraintList)
    b = BitStringGAProblem.BitStringGAProblem(nurseProblem,25)
    
    This will create a nurseProblem with 5 nurses and 5 shifts, and attach the two constraints. We can then run this as any other bitstring GA problem.

    You should add the following constraints: You should write constraint functions for each of these constraints. (In other words, write four separate functions.) These functions should return (0-number of constraints violated). For example, You will want to test each of these constraints separately.
  • The traveling salesman problem requires a different approach. The problem can be stated as follows: Given a set of cities c1-cn, and a set of distances dij between each pair of cities, find the shortest route that visits each city exactly once and returns to the starting city.

    This problem is typically represented as a graph, where cities are vertices and distances are edges. In using a GA to find a solution, we must decide how individuals are represented and altered.

    For this assignment, a TSP solution (implemented as TSPChromosome) is a list of vertices indicating the order they are visited, and the fitness is the corresponding tour. (an aside: this is a minimization problem, so you will want to design your fitness function to return MAXLENGTH - tour length). The number of possible solutions to this problem is the number of permutations of our list of vertices.

    The biggest challenge with using a GA to solve a graph problem is developing appropriate crossover and mutation operators. We can't just use single-point crossover, as the resulting individuals will not be proper tours. Instead, we use two graph-specific crossover operators: PermutationCrossover and GreedyCrossover. PermutationCrossover is implemented for you; GreedyCrossover is left for you to implement (pseudocode is provided.)
  • A challenge with GAs is knowing what parameters to use and what methodologies to choose. To address this, you should perform a set of experiments and prepare a report that summarizes your results.

    There are lots of potential experiments to perform; rather than explicitly list them all, I have provided some broad guidelines. Within these guidelines, you may choose how to explore the effect of different choices. You may want to focus on the time needed to discover a solution, the quality of the solution found, or both. You are free to modify or extend the code however you like, including adding functionality to capture or log partial results, print out answers, or add additional features. Please include a README that describes any changes you have made.

    Suggested Experiments: You may choose to do some or all of these experiments, or others that you find interesting. Quality is more important than quantity; a smaller number of carefully-crafted experiments with clear conclusions is more interesting than a large number of shallow experiments.

    What to turn in: You should turn in a report (in Word or PDF) that describes your experiments. Make sure you include enough detail that your experiments can be replicated (e.g. all relevant paramter settings). I would strongly recommend including a graph for each experiment. You may use whatever graphing software you like, but please be sure that your graphs are clear and legible. You should also make sure that you explain the results of your experiments. Don't just say "graph 3 shows compares roulette selection and tournament selection" - explain what your conclusions are.

    You will be graded on the thoroughness, quality, and presentation of your experiments. Typical ways students are marked down are unclear or incomplete graphs, little or no explanation of results, and a lack of depth in experiments. Perfect English is not expected, but we do need to be able to understand what you are trying to say.


  • Files

    Below are links to the files needed for this assignment: