Connect-K Final Report

Team Name: JJJunior

Team Member: Xi Huang ID : 93727753, Junkyu Lee ID : 78525009

The heuristic scoring method (evaluation function) we adopted is based on the number of connected lines of shorter lengths than K. Initially, we formulate heuristic scores for all types of combinations in a line, and whether lines are blocked by opponent or not (see attached heuristic scoring table). We used 7 features to evaluate the board state, and the weight vector of each features were selected to address the importance of its role. 6 features counts number of connected line in length of K-4, K-3, K-2, and K-1. The last feature counts potential number of ways to win given board state. Also, the length of K-1, and K-2 lines are so special that we treat them as an indicator of winning or losing. By experiment, we found several rules that lead a player to win based on such K-1 and K-2 length lines (see attached action generation and evaluation rules table). In short, we tried to generate the heuristic evaluation scores as the best estimate of current board state as possible. Reflecting the rules we clarified, the evaluation score do provide defensive and offensive strategy of players. Furthermore we used heuristic scores to prune forward before expanding node, and check whether the state is quiet or not in quiescence search. For example, if current play have an action that generates two simultaneous K-1 length lines and opponent does not have any way to finish the game at the next turn, we don’t generate other actions since the player can win the game in next turn. Such observations are summarized as a rule and consistently reused for score evaluation, forward pruning, and quiescence search.

Alpha-Beta pruning is implemented in a standard way using recursive calls. We didn’t gathered statistics, but as a rule of thumb, 10 ~ 25% of total node counts were pruning nodes. For the Iterative Deepening Search, we utilized two methods. To speed up the checking winners during Alpha-Beta pruning, which is required when the K value is in the range of 4 to 6, we pre-calculated all possible ways to win and compare that with the current board to determine the winner in nearly constant time. So our AI has less dependency on board size. Secondly, we used hash tables (Java HashMap) to save evaluation scores, and important board statistics not to repeat same routine for repeated states during search. This is especially helpful for iterative deepening search, because the evaluation scores of previous layer can be reused at next layer. Since the number of possible states grows exponentially, we maintained the size of hash table to be 10,000. As the size of hash table grows, table accessing price was higher than re-calculating and we selected the value 10,000 based on experiments.

Dead line enforces our search depth to be shallow. In most cases the depth couldn’t go deeper than depth 4 in 9x11x6 board configuration. Quiescence search addresses this issue, by checking current cut off state is reliable or not. To check whether the current board is quiet or not, heuristic scores are reused in different manner (See attached Quiescence search criteria). In essence, if current state is middle of battle or the next state the opponent can raise a dangerous move, for example, two simultaneous K-2 lines, the offset of depth limit is increased to one, and the maximum offset was set to two for computational efficiency. Quiescence search, combined with powerful heuristic function provides better situational awareness to AI player, since it helps AI to avoid over estimating current situation.

To improve the project, we suggest extending current naïve game rule to address Renju rules, which eliminated perfect winning sequences of the 1st player. Also we are interested in improving the current AI architecture to encompass the knowledge based agents to factor out reusable rules that are hard coded as heuristic functions.

Heuristic Scoring Table

Name / Pattern / Scoring Scheme
*(K-4) / One side blocked length K-4 line / Reserved
(K-4) / Both side opened length K-4 line / Reserved
*(K-3) / One side blocked length K-3 line / +10 per each lines
(K-3) / Both side opened length K-3 line / +10 per each lines
p.t.w. / “promotions to win”
2 actions required to connect K / (p.t.w).^2 * 100
e.g.) if (p.t.w.)=2, then 400
P.T.W. / “Positions To Win”
1 action required to connect K / (P.T.W).^2*100
e.g.) if (P.T.W.)=2, then 400
Alive / Number of all potentially alive lines to connect K / +1 per alive line in the board

Action generation and evaluation rules

Action Gen. Rules / Evaluation Schemes [player’s perspective]
Valid actions around opponent pieces / -
Valid actions within manhattan distance 2 from player’s pieces / -
If there’s Best actions, the generate Best Actions only. / Case1 : (pPTW > 0)
Case2: (pPTW ==0) ^ (oPTW >0)
If there’s Better actions, the generate Better Actions only / Case1: (nextpPTW>=2)
Case2: (nextPTW>=1) ^(nextpptw > pptw)
Case3: (pptw==0)^(optw>0)^(nextoptwoptw)
Case4: (pptw==0)^(optw==0)^(nextpptw>=pptw+2)
If there’s Good actions, the generate Good Actions only / Case1: (nextpPTW>0)
Case2: (netpptw>0)
Case3 :opponent’s Best actions, Better actions
If there’s no Best, Better, and Good actions, sort valid actions using evaluation scores / Sort actions w.r.t. evaluation score of the board
l  pPTW : player’s Positions To Win, oPTW: opponent’s Positions To Win
l  pptw : player’s promotions to win, optw : opponent’s promotions to win
l  nextpPTW : player’s Positions To Win after having this action
l  Best Actions : A player can finish the game if put the stone at particular position, or lose the game if don’t put the stone at particular position
l  Better Actions : A player can make *(K-1)^*(K-1), or (K-1), or (K-2)^(K-2). If a player don’t put the stone at particular position, opponent will have (K-1) at the next turn.
l  Good Actions : A player can make *(k-1) or (K-2), which leads continuous offense

Quiescence Search Criteria

Criteria to Enter Quiescence Search / Evaluation Schemes [player’s perspective]
Opponent’s Board has threat positions,
Length (K-3) and (K-4) lines / 8x (K-3) + 4x (*K-3) + 2x(K-4) + (*K-4)
>= Threshold (18)
Opponent will have Best or Better Actions at next turn / Case 1: Opponent has Bet Actions at next turn
Case 2: Opponent has Better Actions at next turn