The algorithm used to solve Mastermind is essentially a stepwise optimal algorithm: at each step it tries to minimize the distance to an optimal combination, that is, one that meets all the rules.
First of all, this Genetic Mastermind evolves combinations directly, without any intermediate encoding: each combination is a string of valid colors, and all genetic operators produce other valid combinations. Genetic Mastermind is actually built on EO (Evolving Objects), which can evolve any object, provided you can assign it a fitness and change and/or combine the objects with each other.
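As a minimal sketch of this direct representation (in Python for illustration, not the EO/C++ code the paper actually uses; the constants `N_COLORS` and `LENGTH` are assumptions, not values from the text):

```python
import random

N_COLORS = 6   # assumed number of available colors
LENGTH = 4     # assumed combination length

def random_combination():
    """A candidate is simply a string of valid colors, with no extra encoding;
    every genetic operator must map such a string to another valid one."""
    return [random.randrange(N_COLORS) for _ in range(LENGTH)]
```

Because candidates are evolved directly, validity is guaranteed by construction rather than by a repair step.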
Several operators that increase and decrease diversity are used: transpose, which applies a permutation to the combination, with priority 2; mutation, which substitutes one of the colors by the next, wrapping back to the first color after the last, with priority 1; and crossover, which interchanges the central segment of two combinations, with priority 2. These priorities are normalized to one, so that at each step transposition and crossover are each applied to 40% of the new combinations, and mutation to the remaining 20%. Optionally, a zapping operator, which substitutes a color by a different one chosen at random, can also be applied; it was not used because the circular mutation searches in a more systematic way.
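The three operators can be sketched as follows (a Python illustration under assumptions: the transposition swaps one random pair of positions, and the "central segment" is taken to be the middle half; the text does not fix either detail):

```python
import random

N_COLORS = 6  # assumed number of colors

def transpose(comb):
    """Apply a permutation: here, swap two randomly chosen positions."""
    c = comb[:]
    i, j = random.sample(range(len(c)), 2)
    c[i], c[j] = c[j], c[i]
    return c

def circular_mutation(comb):
    """Substitute one color by the next, circling back to the first color."""
    c = comb[:]
    i = random.randrange(len(c))
    c[i] = (c[i] + 1) % N_COLORS
    return c

def central_crossover(a, b):
    """Interchange the central segment (here, the middle half) of two combinations."""
    lo, hi = len(a) // 4, len(a) - len(a) // 4
    return (a[:lo] + b[lo:hi] + a[hi:],
            b[:lo] + a[lo:hi] + b[hi:])
```

All three map valid color strings to valid color strings, so no repair is ever needed.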
A steady-state, rank-based selection scheme is used. Each generation, the worst 50% of the combinations are eliminated and substituted by the offspring resulting from applying the genetic operators to the remainder. The score of each combination, as shown in section 1, was used as fitness; the basic idea was to score each combination by how far it was from meeting all the rules. As will be shown in the results, this makes the fitness landscape smoother than the fitness used before in , and thus easier to search; on the other hand, it makes the landscape highly dynamic. As soon as a combination is played, its fitness becomes negative: it gets ``all blacks'' with respect to itself, and thus its score becomes minus the length of the combination plus the number of black pegs less the number of white pegs. This need not be bad in itself: it immediately changes the fitness of the whole population, increasing diversity, so that a reinitialization of the population is seldom necessary. Besides, a crowding scheme was also used: each new combination only substituted the combination most similar to it within the pool of combinations about to be eliminated.
The default population size was 400, which is also an improvement over our previous work, where 500 and even 10000 individuals were used. This is due to the new fitness mentioned before: it made many different fitness values possible, so that the population was more diverse in fitness space. Besides, if a certain number of generations, set to 15, passed without finding a consistent combination, the whole population was killed and new individuals were generated. This is evidently a drastic measure, but a valid way of increasing diversity; besides, it was rarely triggered in actual runs. The evolution of maximum and minimum fitness, together with average and online fitness, is shown in figure 3.
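The restart policy can be sketched as a skeleton loop (a simplified Python illustration: it uses a single mutation operator passed as a parameter instead of the full operator set, and the toy replacement rule stands in for the rank-based scheme described above):

```python
import random

POP_SIZE = 400     # default population size from the text
MAX_STAGNANT = 15  # generations allowed without a consistent combination

def search(random_candidate, mutate, fitness):
    """Evolve until some candidate reaches fitness 0 (consistent with all
    answers); if MAX_STAGNANT generations pass without one, kill the whole
    population and generate new individuals."""
    pop = [random_candidate() for _ in range(POP_SIZE)]
    stagnant = 0
    while True:
        pop.sort(key=fitness, reverse=True)
        if fitness(pop[0]) == 0:
            return pop[0]
        # steady state: the worst half is replaced by offspring of the best half
        pop[POP_SIZE // 2:] = [mutate(random.choice(pop[:POP_SIZE // 2]))
                               for _ in range(POP_SIZE - POP_SIZE // 2)]
        stagnant += 1
        if stagnant >= MAX_STAGNANT:
            pop = [random_candidate() for _ in range(POP_SIZE)]
            stagnant = 0
```

The full restart is the last resort; in practice the dynamic fitness keeps enough diversity that this branch fires rarely.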
In general, the main problem of this algorithm is maintaining diversity: if many generations pass without finding a consistent combination, the population pool stagnates and is likely to need a reset. The problem was solved in part by the introduction of the new fitness, and could also be tackled by increasing the mutation rate; however, that would make the algorithm similar to random search, which is why we preferred to apply a single hypermutation step when it was found necessary.
Two kinds of initial combinations were tested: the first contained all possible colors (that is, colors from 1 to L, N being the number of colors), and the second half the available colors (as proposed by Knuth in ). The justification for the first combination is intuitive: the answer gives, at least, the number of colors present in the secret code; the justification for the second is theoretical: it gives at least as much information as the former. The results with both will be shown in the next section.
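The two openings can be built as follows (a Python illustration; the length/color parameters and the exact layouts, distinct colors in order versus Knuth-style doubled colors, are assumptions for concreteness):

```python
def all_colors_opening(length, n_colors):
    """First kind: as many distinct colors as fit, e.g. length 4 -> [0, 1, 2, 3]."""
    return [i % n_colors for i in range(length)]

def half_colors_opening(length, n_colors):
    """Second kind: half as many colors, each repeated twice,
    e.g. length 4 -> [0, 0, 1, 1] (the AABB-style guess Knuth proposed)."""
    return [(i // 2) % n_colors for i in range(length)]
```

For the classic 4-peg, 6-color game these produce `[0, 1, 2, 3]` and `[0, 0, 1, 1]` respectively.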