Yinsh
An advanced YINSH player
7.1 Heuristics
Some simple AI examples were shown during the previous chapters. To create a more advanced AI player which could perform well on the tournament,
a good evaluation function was necessary.
In an attempt to create an interesting AI, the following factors were concidered when giving an
heuristic value to a gameboard:
- Won ring difference: the most important feature calculates the difference in won rings
- Consecutive markers connected to a ring: for both white and black, the number of 4 consecutive coloured markers connected to an owned ring is counted. No further check is done whether the ring can actually turn these markers into a row, these advanced checks are done at another measure point. These values can then be weighted separately for owned and enemy color. (p1 & p2)
- Average distance between the center point and the owned rings: the average distance between every owned ring and the center is counted and weighted. This measures the central dominance on the gameboard, which seems to be an important feature according to professional players. (p3)
- Average distance between the center point and the owned markers (p4)
- Difference in amount of owned and enemy markers: although having more markers in your color than the enemy does not necessarily makes you more powerfull on the board (since it raises the risk of markers being flipped into a row for the enemy), it certainly is an important feature in the game. (p5)
- Opportunity row scoring:
- Having a ring right in between a total of 4 markers is an oppurtunity since when the ring would jump away, a row of 5 markers is achieved. This opportunity is calculated and weighted for both white and black. (p6 & p7)
- Having a ring next to a row of 4 enemy markers is an opportunity since the ring can turn them into an owned row and score a point. The difference with the 2nd feature is that the ring doesn't need to be connected immediately to the row. This opportunity is calculated and weighted for both white and black. (p8 & p9)
- Having a ring in line of a row of 4 owned markers is an opportunity since it can be extended to a 5-marker-row. This opportunity is calculated and weighted for both white and black. (p10 & p11)
- Having a ring in line of a marker which can be switched and would therefore create a row and score a point is interesting, since human players often have difficulties detecting these risks. This opportunity is calculated and weighted for both white and black. (p12 & p13)
This returns in a total of 14 parameters which were trained using the evoluation algorithm. To speed up training, the first parameter (difference in won rings) was set to a very high value and not being adjustable for training, meaning that having one ring more than the opponent would always be more important than any other feature. All the other parameters were initialized on 1.0, and trained to a value between [0.0,10.0[.
Also, a distance calculator was created to calculate distances between several locations on the gameboard (in order to determine p3 & p4).
Since we are dealing with points in a space where only movement along axes is possible, the manhattan distance was used. However, due to the choice of representation of the gameboard a special implementation was needed for correctly calculating the distance between two datapoints.
The Manhattan distance calculates the distance between two points measured along axes at right angles. In a plane with p1 at (x1 , y1) and p2 at (x 2, y2), it is |x1 - x2| + |y1 - y2|. Since our axes are actually not equally set up as in the YINSH gameboard, we get the following errors:

Figure 7.2 illustrates the technique used to calculate the Manhattan distance.

Figure 7.2 - : when you want to know the distance between 2 locations, gradually move towards the other location first along the X-axis, and when that is equal, then along the Y-axis. Every move increases the total distance with 1
This results into the following AI:
7.2 Evolution
To train the weights of the heuristics, an evolution was started using 30 chromosomes (5 default, 25 randomized) during 30 generations which lasted around 6 hours. Fitness was calculated simulating 50 games on white and 50 games on black. The fitness score is calculated as follows: every win provides 3 points and every draw 1 point. The points are then added up and squared to strengthen small differences in win results. During this evolution, the tree search was set off and the move that lead to the board with the highest heuristic value was returned. Figure 7.3 illustrates the perceived evolution.

Figure 7.3 - The perceived evolution
It is clear that there is an evolution in the weights and the fitness, although the evolution of the fittest chromosome goes quite slowly. The variance in the fittest chromosome is due to the randomized play of the opponent in combination with the fitness function which strenghtens the difference in the amount of won games (since only 100 games are played for calculating the fitness). As an estimation, training for at least 400 generations is needed to evolve towards an optimal solution.
The learned weights however prove that the difference in amount of markers is indeed not an important feature and prove the importance of having rings close to enemy markers or markers that can be switched to form a row.

When matching up the advanced AI (playing the move leading into the board with the highest heuristics) against the basic AI, 500 games lead into 405 wins (and 3 draws).
7.3 AI Behaviour
The advanced AI's behaviour has 2 different stages:
- ring placement
- ring movement
Since the placement of the rings takes in total 10 turns, it is very hard to use
tree search to define the best moves. Therefore, a ruleset was written to place the 5 rings on the board which uses the following tactic:
Since the center of the board is very important, the AI will try to put as many rings in the center, avoiding putting markers on the same line (since that might block his movement). This is achieved by following a spiral web around the center. When such a move is no longer possible, the AI will try to place his rings as close to an opponents ring and still close to the center.
During the ring movement stage, the AI performs a tree search with a fixed depth of 3 plies, combined with the trained heuristics for defining the best move.
Although the AI proved to be very powerful during the tournament (esp since the low depth search compared to other competitors), some flaws in the behiaviour showed up:
- the AI has difficulties with scoring some specific types of rows. However, one should concider that the AI does not always score a row as long as the opponent could score afterwards. In such cases the AI will first try to block the opponents scoring, as long as his own row remains or his own situation after both removals is more powerful.
Since the heuristics are used to evaluate both own boards and opponents boards, this also implies the risk of not blocking (of even helping) an enemy row formation. - The AI rarely scores an opponent row while that had no benefit at all
When this advanced YINSH player is matched up against the random player for 20 games, 2 conclusions can be made. Firstly, the advanced AI beats the random AI very quickly: on average, 28.9 turns are needed. Secondly, the number of expanded nodes decreases very fast and its averages rawly take the shape of a reciprocal function (cfr figure 7.5). Note that measuring was started at turn 10, since the placement of the rings doesn't use tree search for deciding the move.

Figure 7.5