The Evolution of Cooperation in a Hierarchically Structured,

Two-Issue Iterated Prisoner’s Dilemma Game:

An Agent-Based Simulation of Trade and War

1 October 2004

Abstract:

The study of the emergence of cooperation in anarchy has neglected to address the implications of multiple issues linked within a hierarchical structure. In many social situations, actors simultaneously (or sequentially) interact in several issue areas. For example, countries typically interact with their neighbors on both security issues and economic issues. In this situation, there is a hierarchy among the issues in that the costs and benefits in one issue (war) surpass the costs and benefits in the other issue area (trade). This article explores the impact of multiple, hierarchically structured issue areas on the level of cooperation in an iterated prisoner’s dilemma game using a computer simulation. Three findings emerge from the analysis. First, while altering the payoffs in only one issue area has a marginal impact on overall cooperation, sustaining high levels of cooperation across the population requires improving incentives to cooperate in both issue areas. Second, cooperation can be sustained even when the payoffs are based on relative power. Third, the rate of learning is critical for the emergence and spread of cooperative strategies.

David L. Rousseau

Assistant Professor

Department of Political Science

235 Stiteler Hall

University of Pennsylvania

Philadelphia, PA 19104

E-mail:

Phone: (215) 898-6187

Fax: (215) 573-2073

and

Max Cantor

Department of Political Science

and the School of Engineering and Applied Science

University of Pennsylvania

Philadelphia, PA 19104

E-mail:

INTRODUCTION

Since Axelrod’s (1984) landmark study, we have learned a lot about the evolution of cooperation and the utility of agent-based computer simulations to explore the evolutionary process.[1] One issue that has been neglected in this growing literature is the impact of multiple issue areas on the emergence of cooperation. In many social situations, actors simultaneously (or sequentially) interact in several issue areas. For example, countries typically interact with their neighbors on both security issues and economic issues. In this example, there is a hierarchy among the issues in that the costs and benefits in one issue (war) surpass the costs and benefits in the other issue area (trade). This article explores the impact of multiple, hierarchically structured issue areas on the level of cooperation in an iterated prisoner’s dilemma game using a computer simulation. Three findings emerge from the analysis. First, while altering the payoffs in only one issue area has a marginal impact on overall cooperation, sustaining high levels of cooperation across the population requires improving incentives to cooperate in both issue areas. Second, cooperation can be sustained even when the payoffs are based on relative power. Third, the rate of learning is critical for the emergence and spread of cooperative strategies.

While our findings are generalizable to any hierarchically linked issue areas, we focus on the implications of multiple issue areas in international relations. Over the last several centuries the sovereign state has emerged as the dominant organizational unit in the international system (Spruyt 1994). During this evolutionary period, sovereign states and their competitors have struggled to identify an optimal strategy for maximizing economic growth and prosperity. In general, states have pursued some combination of three general strategies: (1) war, (2) trade, or (3) isolation. For example, realists such as Machiavelli (1950) argued that military force is an effective instrument for extracting wealth and adding productive territory. In contrast, liberals such as Cobden (1850, 518) argued that international trade was the optimal strategy for maximizing economic efficiency and national wealth. Finally, economic nationalists such as Hamilton (Earle 1986) rejected this liberal hypothesis and argued that isolation from the leading trading states rather than integration with them would enhance economic development. The three policies are not mutually exclusive. For example, List promoted both military expansion and economic isolation for Germany in the 19th century (Earle 1986). However, in general, trade and war are viewed as hierarchical because states rarely trade with enemies during military conflict. While some instances have been identified (e.g., Barbieri and Levy 1999), many systematic studies indicate that military conflict reduces international trade (e.g., Kim and Rousseau 2004).

The importance of issue linkage has long been recognized. Axelrod and Keohane (1986, 239) argued that issue linkage could be used to alter incentive structures by increasing rewards or punishments. Although this idea of issue linkage has been explored with rational choice models (e.g., Morgan 1990; Lacy and Niou 2004) and case studies (e.g., Oye 1986), modeling issue linkage as a two-stage game within an N-person setting has largely been neglected.[2]

Conceptualizing trade and war as a generic two-stage game raises a number of interesting questions that will be explored in this article. If trade and war are the two most important issues in international relations, how does the hierarchical linkage between the two issues influence the amount of cooperation in the international system? What factors encourage (or discourage) the rise of a cooperative order? What strategies might evolve to facilitate the maximization of state wealth in a two-issue world? Finally, how stable are systems characterized by hierarchical linkages?

COOPERATION IN ANARCHY

Many interactions among states are structured as “Prisoner’s Dilemmas” due to the order of preferences and the anarchical environment of the international system. The Prisoner’s Dilemma is a non-zero-sum game in which an actor has two choices: cooperate (C) with the other actor or defect (D) against it. The 2x2 game yields four possible outcomes that can be ranked from best to worst: 1) I defect and you cooperate (DC or the Temptation Payoff "T"); 2) we both cooperate (CC or the Reward Payoff "R"); 3) we both defect (DD or the Punishment Payoff "P"); and 4) I cooperate and you defect (CD or the Sucker's Payoff "S"). The preference order coupled with the symmetrical structure of the game implies that defection is a dominant strategy for both players in a single play game because defecting always yields a higher payoff regardless of the strategy selected by the opponent. Therefore, the equilibrium or expected outcome for the single play prisoner’s dilemma game is “defect-defect” (i.e., no cooperation). This collectively inferior equilibrium is stable despite the fact that both actors would be better off under mutual cooperation. The problem is that neither actor has an incentive to unilaterally alter its selected strategy because each fears exploitation.[3]
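The dominance argument above can be checked mechanically. The following is a minimal sketch, assuming the conventional illustrative payoff values T=5, R=3, P=1, S=0; these numbers and the helper names (`best_reply`, `dominant_strategy`) are our assumptions for illustration and are not the payoffs used in the simulation.

```python
# Illustrative payoffs only (assumed values, not taken from the model).
T, R, P, S = 5, 3, 1, 0  # Temptation > Reward > Punishment > Sucker

# PAYOFF[(my_move, your_move)] is the payoff that I receive.
PAYOFF = {
    ("D", "C"): T,  # I defect, you cooperate
    ("C", "C"): R,  # we both cooperate
    ("D", "D"): P,  # we both defect
    ("C", "D"): S,  # I cooperate, you defect
}


def is_prisoners_dilemma(t, r, p, s):
    """Check the preference order T > R > P > S described in the text."""
    return t > r > p > s


def best_reply(opponent_move):
    """Return my payoff-maximizing move against a fixed opponent move."""
    return max("CD", key=lambda my_move: PAYOFF[(my_move, opponent_move)])


def dominant_strategy():
    """Return a move that is the best reply to every opponent move, if any."""
    replies = {best_reply(move) for move in "CD"}
    return replies.pop() if len(replies) == 1 else None
```

Under these assumed values, defection is the best reply to both moves, yet mutual defection pays each player less than mutual cooperation, which is the dilemma.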

Many situations in international economic and security affairs have been modeled as iterated Prisoner’s Dilemmas. In the trade arena, researchers tend to assume states have the following preference order from “best” to “worst”: 1) I raise my trade barriers and you keep yours low; 2) we both keep our trade barriers low; 3) we both raise our trade barriers; and 4) I keep my trade barriers low and you keep yours high. In the security sphere, scholars typically assume that states possess the following preference order: 1) you make security concessions, but I do not; 2) neither of us makes security concessions; 3) both of us make security concessions; and 4) I make security concessions but you do not. While scholars have identified particular instances in which preference orders may deviate from these norms (e.g., Conybeare 1986 and Martin 1992), the Prisoner’s Dilemma preference order is typical of many, if not most, dyadic interactions in the two issue areas.[4]

Students of game theory have long known that iteration offers a possible solution to the dilemma (Axelrod 1984, 12). If two agents can establish a cooperative relationship, the sum of a series of small "Reward Payoffs" (CC) can be larger than a single "Temptation Payoff" followed by a series of "Punishment Payoffs" (DD). The most common solution for cooperation within an iterated game involves rewards for cooperative behavior and punishments for non-cooperative behavior: I will only cooperate if you cooperate. The strategy of Tit-For-Tat nicely captures the idea of conditional cooperation. A player using a Tit-For-Tat strategy cooperates on the first move and reciprocates on all subsequent moves. Axelrod (1984) argues that the strategy is superior to others because it is nice (i.e., cooperates on first move allowing a CC relationship to emerge), firm (i.e., punishes the agent's opponent for defecting), forgiving (i.e., if a defector returns to cooperation, the actor will reciprocate), and clear (i.e., simple enough for the agent's opponent to quickly discover the strategy). Leng (1993) and Huth (1988) provide empirical evidence from international relations to support the claim that states have historically used reciprocity oriented strategies to increase cooperation. Although subsequent work has demonstrated that Tit-For-Tat is not a panacea for a variety of reasons (e.g., noise, spirals of defection, sensitivity to the composition of the initial population), reciprocity oriented strategies are still viewed as viable mechanisms for encouraging cooperation under anarchy.[5]
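The comparison between a stream of Reward Payoffs and a single Temptation Payoff followed by Punishment Payoffs can be made concrete with a short sketch. The payoff values (5, 3, 1, 0), the ten-round horizon, and the function names are assumptions chosen for illustration; they are not parameters of the simulation reported in this article.

```python
# Illustrative payoffs only (assumed values): T > R > P > S.
T, R, P, S = 5, 3, 1, 0
PAYOFF = {("D", "C"): T, ("C", "C"): R, ("D", "D"): P, ("C", "D"): S}


def tit_for_tat(opponent_history):
    """Cooperate on the first move, then repeat the opponent's last move."""
    return "C" if not opponent_history else opponent_history[-1]


def always_defect(opponent_history):
    """Defect unconditionally."""
    return "D"


def play(strategy_a, strategy_b, rounds):
    """Play an iterated game and return the total payoffs (a, b)."""
    moves_a, moves_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        # Each strategy sees only the opponent's past moves.
        a = strategy_a(moves_b)
        b = strategy_b(moves_a)
        moves_a.append(a)
        moves_b.append(b)
        score_a += PAYOFF[(a, b)]
        score_b += PAYOFF[(b, a)]
    return score_a, score_b


print(play(tit_for_tat, tit_for_tat, 10))    # (30, 30)
print(play(always_defect, tit_for_tat, 10))  # (14, 9)
```

With these assumed numbers, two Tit-For-Tat players earn 30 each over ten rounds, while an unconditional defector facing Tit-For-Tat earns one Temptation Payoff and nine Punishment Payoffs, for a total of 14.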

Although the emergence of cooperation can be studied using a variety of techniques (e.g., case studies, regression analysis, laboratory experiments), in this article we employ an approach that has been widely used in this literature: an agent-based computer simulation (Axelrod 1997; Macy and Skvoretz 1998; Rousseau 2005). Agent-based simulations are “bottom-up” models that probe the micro-foundations of observable macro-patterns. Agent-based models make four important assumptions (Macy and Willer 2002, 146). First, agents are autonomous in that no hierarchical structure exists in the environment. Second, agents are interdependent in that their welfare depends on interactions with other agents. Third, agent choices are based on simple rules utilizing limited information rather than complex calculations requiring large amounts of data. Finally, agents are adaptive or backward looking (e.g., if doing poorly in the last few rounds, change behavior) as they alter characteristics or strategies in order to improve performance. Agent-based models are ideally suited for complex, non-linear, self-organizing situations involving many actors. Macy and Willer claim that agent-based models are “most appropriate for studying processes that lack central coordination, including the emergence of organizations that, once established, impose order from the top down” (2002, 148).

As with all methods of investigation, computer simulations have strengths and weaknesses.[6] On the positive side of the ledger, five strengths stand out. First, as with formal mathematical models, simulations compel the researcher to be very explicit about assumptions and decision rules. Second, simulations allow us to explore extremely complex systems that often have no analytical solution. Third, simulations resemble controlled experiments in that the researcher can precisely vary a single independent variable (or isolate a particular interaction between two or more variables). Fourth, while other methods of inquiry primarily focus on outcomes (e.g., do democratic dyads engage in war?), simulations allow us to explore the processes underlying the broader causal claim (e.g., how does joint democracy decrease the likelihood of war?). Fifth, simulations provide a nice balance between induction and deduction. While the developer must construct a logically consistent model based on theory and history, the output of the model is explored inductively by assessing the impact of varying assumptions and decision rules.

On the negative side of the ledger, two important weaknesses stand out. First, simulations have been criticized because they often employ arbitrary assumptions and decision rules (Johnson 1999, 1512). In part, this situation stems from the need to explicitly operationalize each assumption and decision rule. However, it is also due to the reluctance of many simulation modelers to empirically test assumptions using alternative methods of inquiry. Second, critics often question the external validity of computer simulations. While one of the strengths of the method is its internal consistency, it is often unclear if the simulation captures enough of the external world to allow us to generalize from the artificial system we have created to the real world we inhabit.

Given that we are primarily interested in testing the logical consistency and completeness of arguments, the weaknesses are less problematic in our application of agent-based modeling. That is, we are interested in probing whether traditional claims (e.g., increasing the reward payoff increases cooperation or relative payoffs decrease cooperation) produce expected patterns when confronting multiple, hierarchically structured issues. While we use the trade/war setting to illustrate the applicability of the model, we do not claim to be precisely modeling international trade or interstate conflict. This is not to say that using agent-based simulations to model real world interactions is impossible or unimportant. We simply wish to use the technique to evaluate general claims about factors promoting or inhibiting cooperation.

OVERVIEW OF THE MODEL

In our agent-based model, the world or "landscape" is populated with agents possessing strategies that are encoded on a string or “trait set” (e.g., 00010010100110011001). Over time the individual traits in the trait set (e.g., the “0” at the start of the string) change as less successful agents emulate more successful agents. The relatively simple trait set with twenty individual traits employed in the model allows for over 1 million possible strategies. Presumably, only a small subset of these possible combinations produces coherent strategies that maximize agent wealth.
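The size of the strategy space follows directly from the encoding: twenty binary traits yield 2^20 = 1,048,576 distinct trait sets. A minimal sketch of this encoding is given below; the names `TRAIT_COUNT`, `random_trait_set`, and `strategy_space_size` are hypothetical, and the uniform random initialization is an assumption made only for illustration.

```python
import random

TRAIT_COUNT = 20  # number of binary traits in a trait set, as stated in the text


def random_trait_set(rng):
    """Draw a trait set as a string of 0s and 1s (assumed uniform at random)."""
    return "".join(rng.choice("01") for _ in range(TRAIT_COUNT))


def strategy_space_size(trait_count=TRAIT_COUNT):
    """Count the distinct strategies a binary trait set can encode."""
    return 2 ** trait_count


print(strategy_space_size())  # 1048576
```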

The structure of our model was inspired by the agent-based model developed by Macy and Skvoretz (1998). They use a genetic algorithm to model the evolution of trust and strategies of interaction in a prisoner’s dilemma game with an exit option. Like us, they are interested in the relative payoff of the exit option, the location of interaction, and the conditionality of strategies. Our model, however, differs from theirs in many important respects. First, unlike their single stage game, our model is a two-stage prisoner’s dilemma game that links two issues in a hierarchical fashion (i.e., both a “war” game and a “trade” game). Second, our trait set differs from theirs because we have tailored it to conform to standard assumptions about trade and war. In contrast, their trait set is more akin to “first encounter” situations (e.g., do you greet partner? do you display marker? do you distrust those that display marker?). Third, our model allows for complex strategies such as Tit-For-Tat to evolve across time.

Our simulation model consists of a population of agents that interact with each other in one of three ways: 1) trade with each other; 2) fight with each other; or 3) ignore each other. Figure 1 illustrates the logic of each of these encounters. The game is symmetrical so each actor has the same decision tree and payoff matrices (i.e., the right side of the figure is the mirror image of the left side of the figure). Each agent begins by assessing the geographic dimension of the relationship: if the agents are not immediate neighbors, then the agent skips directly to the trade portion of the decision tree. If the two agents are neighbors, the agent must ask a series of questions in order to determine if it should enter the war game. If it chooses not to fight, it asks a similar series of questions to determine if it should enter the trade game. If it chooses neither war nor trade, it simply exits or "ignores" the other agent. Both the war and the trade games are structured as prisoner's dilemma games.
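The hierarchical ordering of the encounter (war is considered first and only between neighbors, then trade, then exit) can be sketched as follows. This is a simplified illustration of one agent's choice, not the model's implementation: the `Agent` class and the boolean fields `wants_war` and `wants_trade` are hypothetical stand-ins for the "series of questions" an agent asks at each stage, which are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    # Hypothetical summaries of the agent's answers to the war and trade
    # questions; the actual model derives these from the trait set.
    wants_war: bool
    wants_trade: bool


def choose_interaction(agent, are_neighbors):
    """Return "war", "trade", or "ignore" for one agent in one encounter."""
    # The war game is only available to immediate neighbors and is
    # considered before trade, reflecting the hierarchy of issues.
    if are_neighbors and agent.wants_war:
        return "war"
    # Non-neighbors skip directly to the trade portion of the tree.
    if agent.wants_trade:
        return "trade"
    # Choosing neither war nor trade means exiting the encounter.
    return "ignore"
```

For example, an agent inclined toward both war and trade would fight a neighbor but trade with a distant agent, because the war branch is never reached for non-neighbors.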

**** insert Figure 1 about here ****

The model focuses on learning from one's environment. In many agent-based simulations, agents change over time through birth, reproduction, and death (Epstein and Axtell 1996). In such simulations, unsuccessful agents die as their wealth or power declines to zero. These agents are replaced by the offspring of successful agents that mate with other successful agents. In contrast, our model focuses on social learning.[7] Unsuccessful agents compare themselves to agents in their neighborhood. If they are falling behind, they look around for an agent to emulate. Given that agents lack insight into why other agents are successful, they simply imitate decision rules (e.g., don't initiate war against stronger agents) selected at random from more successful agents. Over time repeatedly unsuccessful agents are likely to copy more and more of the strategies of their more successful counterparts. Thus, the agents are “boundedly rational” in that they use short cuts in situations of imperfect information in order to improve their welfare (Simon 1982).[8]
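The social learning step described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the model's actual update rule: the function name `emulate`, the uniform choice among wealthier neighbors, and the `copies` parameter (the number of traits imitated per update, a rough proxy for the rate of learning) are all hypothetical.

```python
import random


def emulate(agent_traits, agent_wealth, neighbors, rng, copies=1):
    """Copy randomly selected traits from a more successful neighbor.

    neighbors is a list of (trait_set, wealth) pairs for the agents in
    the neighborhood. All choices below are assumptions for illustration.
    """
    # Only neighbors that are doing better are candidates for emulation.
    better = [traits for traits, wealth in neighbors if wealth > agent_wealth]
    if not better:
        return agent_traits  # the agent is not falling behind; no change
    model = rng.choice(better)
    new_traits = list(agent_traits)
    # The agent lacks insight into why the model succeeds, so it imitates
    # trait positions picked at random rather than the "best" ones.
    for index in rng.sample(range(len(new_traits)), copies):
        new_traits[index] = model[index]
    return "".join(new_traits)
```

Repeated calls to a function of this kind move a persistently unsuccessful agent's trait set closer and closer to those of its wealthier neighbors, which is the convergence dynamic the paragraph describes.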

The fitness of a particular strategy is not absolute because its effectiveness depends on the environment it inhabits. Unconditional defection in trade and war is a very effective strategy in a world populated by unconditional cooperators. However, such an easily exploited environment begins to disappear as more and more agents emulate the more successful (and more coercive) unconditional defectors. While this implies that cycling is possible, it does not mean it is inevitable. As the simulation results demonstrate, some populations are more stable across time because they are not easily invaded by new strategies.