Sunday, May 26, 2013

Chess programmers and harpists

Harpists spend 90 percent of their lives tuning their harps and 10 percent playing out of tune.
Igor Stravinsky

My computer just finished a second tuning run trying to optimize the search control parameters in my engine. This time it run for another 364 hours. I optimized the parameter set a bit, removed dead parameters (parameters belonging to a feature that the first run decided to disable) and added a few new ones. I also lowered the learning rate and modified the fitness function a bit.













Sunday, May 12, 2013

Population Based Incremental Learning and Chess - Pt. II

Population Based Incremental Learning (PBIL) is a variant of  a genetic algorithm. In the recent past I've used it a lot to improve my chess engine iCE.

The two main factors responsible for personality and strength within a chess engine are evaluation and search.

Search is responsible to analyze the game tree of possible future chess positions and selecting the most promising path. As the game tree is huge and exponentially growing with increasing depth it has to decide which parts of the tree should get more focus than others.

Evaluation is responsible for assigning a numerical value to a chess position that is expresses the winning probabilities of both sides. Search will select a path that has the best winning probability as determined by the evaluation.

Both features (especially evaluation) are heavily parametrized and selecting the right parameters is the one million dollar question. Intuition, game play performance, chess theory knowledge, crystal balls and sometimes voodoo are used to select the parameters. Every change (especially the small ones) requires extensive testing. Often 30.000 games and more are played just to see whether a small change in one of the parameters made the engine stronger or weaker. To make things worse most of the parameters are connected somehow. Changing one parameter might require to change other parameters as well to keep the whole thing in balance.

As I consider playing millions of games boring and I really don't have the resources to do that anyway I came up with the idea to just hand my engine over to the evolution and let nature take its course. PBIL is excellent in optimizing large parameter sets. Instead of maintaining a population it maintains a probability vector of population properties. This is much simpler but at least as effective as classic GAs.

Here are the results of my first approach to optimize the parameter sets used by Search.

 My search framework is controlled by 35 parameters that require 176 bits. Those parameters control the search extensions (e.g. check extension), late move reductions, null move, internal iterative deepening, lazy evaluation and futility pruning.

The current search parameter set of iCE that should be optimized through PBIL looks like

 ID | PARAMETER NAME                     | VALUE |   MAX
----+------------------------------------+-------+-------
  0 | EXTEND_BY_CAPTURE_TO_PAWN_EG       |    30 |     31
  1 | EXTEND_IN_CHECK                    |    10 |     15
  2 | EXTEND_BY_CHECK_SNGL_RESPONSE      |     5 |     15
  3 | EXTEND_BY_CHECK_DBL_RESPONSE       |     1 |     15
  4 | EXTEND_BY_PAWN_PUSH_7              |    10 |     15
  5 | S0MP_DEPTH                         |     3 |      7
  6 | S0MP_MARGIN_1                      |     0 |  1.023
  7 | S0MP_MARGIN_2                      |     0 |  1.023
  8 | NULL_MOVE_DEPTH                    |     1 |      3
  9 | NULL_MOVE_IN_PV                    |     0 |      1
 10 | NULL_MOVE_MARGIN                   |    63 |    127
 11 | NULL_MOVE_THRESHOLD_R3             |     7 |     15
 12 | F_PRUNING_DEPTH                    |     5 |      7
 13 | F_PRUNING_1                        |   120 |    511
 14 | F_PRUNING_2                        |     0 |    511
 15 | F_PRUNING_3                        |   255 |    511
 16 | F_PRUNING_4                        |     0 |    511
 17 | F_PRUNING_5                        |     0 |    511
 18 | F_PRUNING_6                        |     0 |    511
 19 | LMR_DEPTH                          |     2 |      7
 20 | LMR_IN_PV                          |     0 |      1
 21 | LMR_GOOD_MOVE_REDUCTION            |     1 |      3
 22 | LMR_BAD_MOVE_REDUCTION             |     2 |      3
 23 | LMR_ADDITIONAL_REDUCTION_NT        |     1 |      3
 24 | LMR_MOVES_SEARCHED_PV              |     0 |     31
 25 | LMR_MOVES_SEARCHED_NON_PV          |     0 |     31
 26 | LMR_QUIET_MOVES_SEARCHED_PV        |     0 |     31
 27 | LMR_QUIET_MOVES_SEARCHED_NON_PV    |     0 |     31
 28 | IID_DEPTH                          |     5 |      7
 29 | IID_ONLY_PV                        |     1 |      1
 30 | IID_REDUCTION_PV                   |     2 |      3
 31 | IID_REDUCTION_NON_PV               |     2 |      7
 32 | QS_DELTA_CUTOFF_MARGIN             |   200 |  1.023
 33 | LE_USE                             |     0 |      1
 34 | LE_MARGIN                          | 2.047 |  2.047
----+------------------------------------+-------+-------


PBIL settings:

Population Size:    32
Generations:        1100
Learning Rate:      0.018 - 0.040 (depending on the progress)
Neg. Learning Rate: 0.000
Elitism:            Yes
Mutational Prob.:   0.01 - 0.05 (depending on the progress)
Fitness Function:   Knock-Out Tournament (0.25 sec per move)
Run Time:           365 hrs

The starting entropy with 176 bits is 61. Here all bits have a 50/50 chance to flip either towards 0 or 1. In the course of the evolution the entropy dropped. Bits increased their chances to flip either towards 0 or 1. At the end of the evolution about half of the bits were fully converged. So their chance of being 0 or 1 was close to 100%.


To monitor the progress of the evolution I measured from time to time the behavior and performance of the generated population . One interesting but not surprising fact is that the generated engines in later generations reached higher search depths given a constant time control.


Average Search Depths reached for Mid Game positions for moves 10. - 25.

Another fact is that the diversity within the population decreased. This could be seen with an increasing probabilities for draws and also that the games played in the tournament took longer.


Non-Draw Percentage of Games

Finally I measured the real playing strength of the engines after each 100 generations. Here I played an engine with the current state of the evolution against the base version with the original weights. Both engines used an identical evaluation and only differed in the values of the search parameters. The evolution improved the playing strength very rapidly at first. In later generations it went in and out of local optima.


It looks like the generation #1100 were I stopped the evolution is not the strongest one. The global peak was probably reached around generation #900. ("probably" because those data points have an error bar of about -20/+20 ELO).

Next steps: I will optimize the parameter set using the results from the first run. Some features are over determined and some others are under determined, so this requires some adjustments. I will further try to remove some randomness by modifying the fitness function a bit.

And then give it another run. 

Saturday, May 4, 2013

To Null Move or not to Null Move

 
After tuning the evaluation weights I'm currently using a PBIL based evolution to tune the parameters within the search framework of iCE. It includes control parameters for Extensions, Futility Pruning, LMR, IID, Delta Pruning, Lazy Eval and of course Null Move.

Part of the genom is a bit for most features that allows the evolution to disable a feature or restrict it to PV or None PV nodes.




The evolution is still running but is already pretty converged. During the course of the run I realized that the evolution is unsure whether the Null Move should be performed always or restricted to Non PV nodes. This comes a bit as a surprise because I never considered null moving in a PV node even theoretical sound (I just included the bit in the genom for completeness).

I checked a few other engines including stockfish and crafty and although crafty is using null move extensively it excludes it in PV nodes. The only engine that seems to do it is Texel


In a PVS framework PV nodes are rather rare, so it might make little difference whether Null Moves are used here or not. Maybe this is the reason the evolution has a hard time to converge this bit. But I'm still wondering whether it breaks something if used. A PV that contains illegal moves is surely questionable.

If the final genom has it on, I will probably have to spend some additional asserts in the framework.

How are others approaching that ?

Followup: 

I performed a test of the final genom with and without null moving in PV and it made almost no difference. Therefore the evolution was not able to converge that. The one that used null move in PV nodes performed 4 ELO better but well within the error bar. So for the moment I will disable it.