Neural Networks for Opponent Modeling in Texas Hold’em Poker

John Pym

PSY/ORF 322

5/9/2005

Abstract

In this project, a neural network was used to model opponent behavior in the pre-flop stage of a game of Texas Hold’em poker. A neural network predictor was used to calculate the probabilities of each of a player’s possible actions (raise, call, or fold) based on context information such as the size of the pot, the number of players in the game, and the player’s hand. Next, Bayesian logic was used to determine a probability distribution of the player’s likely hands. This probability distribution was calculated by iterating through each of the possible two-card hands and adding the predicted probability of the opponent’s action to a weight table. This weight table was then normalized to obtain a probability distribution of the opponent’s possible hands. The predictor was trained on a set of 5000 training data points consisting of pre-flop contexts and actions, and its prediction accuracy was tested on a separate set of 5000 test data points. This resulted in an overall prediction accuracy of 62.0%, almost twice the accuracy that one would expect to achieve using a random predictor.

Introduction

The study of stochastic games of imperfect information is a rapidly growing area of artificial intelligence research. Unlike most of the games traditionally studied by artificial intelligence researchers, part of the information required to determine an optimal strategy is hidden from the player, requiring the player to infer unknown information from the context of the game. Additionally, stochastic games of imperfect information involve uncertainty about the outcome of a player’s actions, which adds many complexities not present in other games. Texas Hold’em poker is a classic example of such a game, involving both uncertainty and imperfect information, and its study is applicable to a broad range of artificial intelligence problems.

A game of limit Texas Hold’em poker is played with a standard deck of 52 cards, usually with between two and ten players. In most versions of the game, the two players to the left of the dealer begin the game by making forced bets called blinds. Each player is then dealt two cards face down, and a round of betting follows, in which each player can call the blind, raise, or fold. Next, three community cards are dealt, and all players can use these cards to make the best possible hand. This is known as the flop, and is followed by another round of betting. Two more community cards, known as the turn and the river, are then dealt, each followed by another round of betting. After the last round of betting, all of the players who have not folded show their cards, and the player with the best five-card hand, consisting of any combination of the community cards and the player’s hole cards, wins the money in the pot.

Correct strategy for Texas Hold’em depends greatly on knowledge of opponents’ behavior. The ability to guess what action another player is likely to take in a particular situation is very important in making decisions such as whether to bluff, slow-play, or value-bet a hand. Even more important is the ability to accurately guess what hands an opponent is likely to be holding. This allows the player to determine how far ahead or behind he is in a hand, which is a key factor in choosing the appropriate action. The prediction of an opponent’s action in different situations is closely related to the task of inferring his potential hole cards. While both of these are extremely challenging tasks for human players and artificial intelligence programs alike, many expert players have mastered these skills. The ability to accurately infer another player’s potential hole cards is known as “putting people on hands,” and the skill is highly dependent on knowledge of how a player acts in various situations. Analyses of poker situations frequently involve discussion about what hands an opponent would have played in the manner observed, and this is used to make guesses about what hands the opponent is actually holding.

This project attempts to simulate this type of reasoning in order to obtain a mathematical probability distribution of the likely hole cards that an opponent may hold during the pre-flop stage of a game of Texas Hold’em poker. A predictor was implemented using an artificial neural network, which provided the probabilities of the opponent raising, calling, or folding. Given the opponent’s actual action, the predictor was used to create a weight table containing the probabilities of this action for each of the opponent’s possible hole card combinations. This table was then normalized so that all of the entries in the table summed to 1.0, resulting in a probability distribution over the possible hands that the opponent may hold.
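
The normalization step can be written as a single formula. The symbols below are notation introduced here for illustration, not taken from the project: h is a two-card hand, a is the observed action, c is the context, and H is the set of all possible hole card combinations. The sketch assumes, as the procedure above implies, that every hand is treated as equally likely before the action is observed, so the prior cancels out of Bayes’ rule.

```latex
P(h \mid a, c) \;=\; \frac{P(a \mid h, c)}{\sum_{h' \in H} P(a \mid h', c)},
\qquad |H| = \binom{52}{2} = 1326
```

Here P(a | h, c) is the value the predictor returns for the observed action, and the denominator is the sum of the unnormalized weight table entries.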

Previous work

The study of poker in artificial intelligence research is a rapidly growing field, led primarily by the Computer Poker Research Group at the University of Alberta. This group has implemented several highly sophisticated poker bots, which are capable of winning against most human players at Texas Hold’em poker. These include Pokibot, a bot that uses a game-theoretic approach to calculate a pseudo-optimal strategy, as well as Vexbot, a bot that uses an adaptive algorithm to determine its strategy.[1] Additionally, this group has published several papers dealing specifically with opponent modeling. In his master’s thesis, Aaron Davidson used several statistical and adaptive approaches to model opponent behavior, including a simple neural network.[2]

Methods

This project involved the implementation of two components: a neural network predictor of opponent behavior and a method for creating a weight table with a distribution of likely hands held by an opponent. Additionally, a parser was required to read hand history data from files for analysis. All of these components made heavy use of several open-source Java packages, which included the University of Alberta’s Computer Poker Research Group’s Java repository and an artificial neural network package written by Joseph A. Huwaldt, both available online. Additionally, data from the IRC poker server, collected by Michael Maurer between 1995 and 2001 and also available online, was used to train and test the predictor.

The University of Alberta group’s Java packages provide many of the basic poker evaluation functions. These include the classes PlayerInfo and GameInfo, which store information about the players and the game of poker, as well as hand evaluator and dealer classes. The Java packages also include several classes for storing AI-related data: a weight table for storing the probabilities associated with every pair of cards, a probability triple class for storing the probabilities of a raise, call, or fold action, and a context class containing information about the context in which an action is taken. Additionally, an interface for defining a predictor object is provided. It outlines the functions that a predictor class must implement, allowing user-defined predictor objects to interact with other classes in a standardized way.

The neural network predictor was initially implemented using the Fast Artificial Neural Network (FANN) library, which is available online. Because this library is implemented in C, however, it required calls to System.loadLibrary() to load a native library. Because such usage is highly system dependent, and because of the limited functionality of the FANN library, the neural network was replaced with the jahuwaldt.tools.NeuralNets package, also available online. This package provides several object-oriented tools for implementing neural networks, including several types of neuron classes with sigmoidal and hyperbolic tangent functions, and feed-forward networks with backpropagation and scaled conjugate gradient training algorithms. For this project, a simple feed-forward network with backpropagation and basic neurons was used. Because of the highly object-oriented nature of the jahuwaldt.tools.NeuralNets package, however, the predictor could be implemented using various other types of neural networks.
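
To make the idea of a feed-forward network trained with backpropagation concrete, the following is a minimal, self-contained sketch. It does not use the jahuwaldt.tools.NeuralNets API or reproduce the project’s network. The class name TinyNet, the single hidden layer of sigmoid neurons, the softmax output with cross-entropy loss, the learning rate, and the one-input toy data are all assumptions chosen for illustration.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical illustration only; not the network used in the project.
class TinyNet {
    final int nIn, nHid, nOut;
    final double[][] w1; // [nHid][nIn + 1], last column holds the bias
    final double[][] w2; // [nOut][nHid + 1], last column holds the bias

    TinyNet(int nIn, int nHid, int nOut, long seed) {
        this.nIn = nIn; this.nHid = nHid; this.nOut = nOut;
        Random rng = new Random(seed);
        w1 = randomMatrix(nHid, nIn + 1, rng);
        w2 = randomMatrix(nOut, nHid + 1, rng);
    }

    static double[][] randomMatrix(int rows, int cols, Random rng) {
        double[][] m = new double[rows][cols];
        for (double[] row : m)
            for (int j = 0; j < cols; j++) row[j] = rng.nextDouble() - 0.5;
        return m;
    }

    // Hidden layer: sigmoid of a weighted sum of the inputs.
    double[] hidden(double[] x) {
        double[] h = new double[nHid];
        for (int k = 0; k < nHid; k++) {
            double s = w1[k][nIn];
            for (int i = 0; i < nIn; i++) s += w1[k][i] * x[i];
            h[k] = 1.0 / (1.0 + Math.exp(-s));
        }
        return h;
    }

    // Output layer: softmax, so the outputs form a probability distribution.
    double[] output(double[] h) {
        double[] z = new double[nOut];
        double max = Double.NEGATIVE_INFINITY;
        for (int o = 0; o < nOut; o++) {
            z[o] = w2[o][nHid];
            for (int k = 0; k < nHid; k++) z[o] += w2[o][k] * h[k];
            max = Math.max(max, z[o]);
        }
        double sum = 0.0;
        for (int o = 0; o < nOut; o++) { z[o] = Math.exp(z[o] - max); sum += z[o]; }
        for (int o = 0; o < nOut; o++) z[o] /= sum;
        return z;
    }

    double[] predict(double[] x) { return output(hidden(x)); }

    // One backpropagation step; returns the cross-entropy loss before the update.
    double train(double[] x, int target, double rate) {
        double[] h = hidden(x);
        double[] p = output(h);
        double[] dOut = new double[nOut];
        for (int o = 0; o < nOut; o++) dOut[o] = p[o] - (o == target ? 1.0 : 0.0);
        // Propagate the error back to the hidden layer before w2 changes.
        double[] dHid = new double[nHid];
        for (int k = 0; k < nHid; k++) {
            double s = 0.0;
            for (int o = 0; o < nOut; o++) s += dOut[o] * w2[o][k];
            dHid[k] = s * h[k] * (1.0 - h[k]);
        }
        for (int o = 0; o < nOut; o++) {
            for (int k = 0; k < nHid; k++) w2[o][k] -= rate * dOut[o] * h[k];
            w2[o][nHid] -= rate * dOut[o];
        }
        for (int k = 0; k < nHid; k++) {
            for (int i = 0; i < nIn; i++) w1[k][i] -= rate * dHid[k] * x[i];
            w1[k][nIn] -= rate * dHid[k];
        }
        return -Math.log(p[target]);
    }

    public static void main(String[] args) {
        // Toy data (assumed): one "hand strength" input; classes 0=raise, 1=call, 2=fold.
        TinyNet net = new TinyNet(1, 4, 3, 42L);
        double[][] xs = {{0.1}, {0.5}, {0.9}};
        int[] ys = {2, 1, 0};
        for (int epoch = 0; epoch < 5000; epoch++)
            for (int n = 0; n < xs.length; n++) net.train(xs[n], ys[n], 0.5);
        System.out.println(Arrays.toString(net.predict(new double[] {0.9})));
    }
}
```

The three outputs play the role of the raise, call, and fold probabilities. A real predictor would take several context features as input rather than the single toy value used here.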

The Bayesian calculation of opponent hand probability distributions was implemented as a public method in the weight table class. This method takes as input the Predictor object to be used to generate the weight table, as well as a Context object containing all of the relevant context information for the player except the two hole cards. For each entry in the weight table, the Predictor is evaluated for the context and the hole cards associated with that entry. The predictor returns a probability triple object, and the value associated with the opponent’s actual action is selected and inserted into the weight table. After the iteration through the table is finished, the table is normalized so that the probabilities sum to 1.0. This yields a probability distribution of the possible hands held by the opponent.
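
The procedure described above can be sketched roughly as follows. This is an illustrative sketch, not the project’s code and not the University of Alberta API: the names WeightTableSketch, HandPredictor, Action, and TOY are hypothetical, cards are assumed to be indexed 0 to 51 with rank equal to index / 4, and the context is assumed to be fixed inside the predictor rather than passed as a separate object.

```java
// Hypothetical stand-ins; none of these names come from the Alberta packages.
class WeightTableSketch {
    enum Action { RAISE, CALL, FOLD }

    // Returns {P(raise), P(call), P(fold)} for one hole card pair in a fixed context.
    interface HandPredictor {
        double[] predict(int card1, int card2);
    }

    // Toy predictor (assumed): raises more and folds less as the card ranks get higher.
    static final HandPredictor TOY = (a, b) -> {
        double strength = (a / 4 + b / 4) / 24.0; // integer division gives ranks 0..12
        double raise = 0.1 + 0.6 * strength;
        double fold = 0.6 * (1.0 - strength);
        return new double[] {raise, 1.0 - raise - fold, fold};
    };

    // Fills table[i][j] for i < j with the probability of holding cards i and j.
    static double[][] handDistribution(HandPredictor predictor, Action observed) {
        double[][] table = new double[52][52];
        double total = 0.0;
        for (int i = 0; i < 52; i++) {
            for (int j = i + 1; j < 52; j++) {
                // Keep only the probability of the action the opponent actually took.
                table[i][j] = predictor.predict(i, j)[observed.ordinal()];
                total += table[i][j];
            }
        }
        if (total <= 0.0) {
            throw new IllegalStateException("observed action has zero probability for every hand");
        }
        // Normalize so that the entries sum to 1.0.
        for (int i = 0; i < 52; i++)
            for (int j = i + 1; j < 52; j++) table[i][j] /= total;
        return table;
    }

    public static void main(String[] args) {
        double[][] afterRaise = handDistribution(TOY, Action.RAISE);
        System.out.println("two highest cards: " + afterRaise[48][49]);
        System.out.println("two lowest cards:  " + afterRaise[0][1]);
    }
}
```

With the toy predictor, an observed raise shifts probability toward high-ranked pairs. A fuller version would also zero out hands containing cards already known to the observer.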

Results

Because of the lack of an effective method for evaluating the accuracy of the weight table given an opponent’s actual cards, the prediction accuracy was assessed instead in terms of the predictor. The predictor was trained using a set of 5000 training points, which were passed to the predictor class in the form of context objects. These training points represented pre-flop betting decisions, and the information used to train the neural network included several factors from the context, including hand strength, pot size, and the player’s position. The neural network predictor achieved an overall accuracy of 62.0%, correctly predicting 3099 of the 5000 test data points. This is a significant improvement over the 33.3% accuracy that would be expected from a random predictor choosing one of the three possible actions at random. Prediction accuracy varied considerably across individual actions: the neural network predictor correctly classified 39.3% of the 1205 raises, 59.3% of the 2646 calls, and 91.8% of the 1150 folds.
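
As a rough consistency check, the overall accuracy can be recovered as the count-weighted average of the per-action accuracies. The class and method names below are hypothetical, and the inputs are the figures as reported above.

```java
// Hypothetical helper for recombining per-action accuracies into an overall figure.
class AccuracyCheck {
    // Weighted average of per-class accuracies, weighted by the class counts.
    static double overall(int[] counts, double[] accuracies) {
        double correct = 0.0;
        int total = 0;
        for (int i = 0; i < counts.length; i++) {
            correct += counts[i] * accuracies[i];
            total += counts[i];
        }
        return correct / total;
    }

    public static void main(String[] args) {
        // Raises, calls, folds: counts as reported in the text (they sum to 5001).
        int[] counts = {1205, 2646, 1150};
        double[] accuracies = {0.393, 0.593, 0.918};
        System.out.printf("overall accuracy = %.3f%n", overall(counts, accuracies));
    }
}
```

The result is approximately 0.62, in line with the reported overall accuracy.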

Further Work

The calculation of a weight table containing a probability distribution of an opponent’s hands is an important element of poker strategy. One of the most obvious applications of such a table is the implementation of an artificial intelligence strategy for full-scale poker. Most artificial intelligence strategies involve the calculation of the expected value of a particular action, and a weight table of hand probabilities is an important input to such a calculation. Existing strategies that use this technique include the University of Alberta’s Pokibot and Lokibot poker bots, which decide on a move based on a pseudo-optimal calculation of the expected value of a move.[3]

Additionally, a weight table of likely hands could be used to create a decision tool for online poker players. Human players frequently have difficulty estimating the probabilities of events, especially when Bayesian logic is involved. Such a tool would display the probabilities of hands during an actual game of online poker. While such a tool would not be a complete artificial intelligence system, it would handle a very large part of the player’s decision, allowing the player to focus on other aspects of the game.

Conclusion

The high accuracy of the neural network predictor showed that accurate prediction of opponent behavior in poker is possible using neural networks. The predictor has several applications in artificial intelligence, both as a component in a complete artificial intelligence system and as a decision tool for aiding online players. Because an opponent’s behavior is closely related to the hands he is likely to be holding, a function for calculating this distribution of hands was also implemented.

Works Cited

Davidson, Aaron. “Opponent Modeling in Poker: Learning and Acting in a Hostile Environment.” M.Sc. thesis, 2002.

[1] See for more information.

[2] Davidson, Aaron. “Opponent Modeling in Poker: Learning and Acting in a Hostile Environment.” M.Sc. thesis, 2002.

[3] see for more information.