Game theory is a framework for analyzing the outcome of strategic interactions between decision makers^{1}. The fundamental concept is that of a Nash equilibrium, where no player can improve his payoff by a unilateral strategy change. Typically, the Nash equilibrium is considered to be the optimal outcome of a game; in social dilemmas, however, the individually optimal outcome is at odds with the collectively optimal outcome^{2}. This means that one player can improve his payoff at the expense of the other by unilaterally deviating, but if both deviate, they end up with lower payoffs. In this type of game, the mutually beneficial but non-Nash equilibrium strategy is called cooperation. However, in this context cooperation should not be interpreted as an interest in the welfare of others, as players only aim to secure a high payoff for themselves.
In this framework, payoff maximization is considered to be rational, but when such rational players then seize every opportunity to gain at the opponent’s expense, they may counterintuitively both end up with low payoffs. A game that clearly exhibits this contradiction is the Traveler’s Dilemma. Since its formulation in 1994 by the economist Kaushik Basu^{3}, the game has become one of the most studied in the economics literature. Additionally, it has been discussed in theoretical biology in the context of evolutionary game theory.
In general, the dilemma relies on the individuals’ incentive to undercut the opponent. To be more specific, players are motivated to claim a lower value than their opponent to reach a higher payoff at the opponent’s expense. This incentive leads players to systematic mutual undercutting until the lowest possible payoff is reached, which is the unique Nash equilibrium. It seems paradoxical that players defined as rational in a game-theoretical sense end up with such a poor outcome. Therefore, the question that naturally arises is how this poor outcome can be prevented and how cooperation can be achieved.
Addressing these questions can help to better understand price wars, which consist of the mutual undercutting of prices to gain market share. In addition, it can provide information about human behavior, because economic experiments have shown that individuals prefer to choose the cooperative high-payoff action instead of the Nash equilibrium^{4}.
Our analysis focuses on showing that the Traveler’s Dilemma can be decomposed into a local and a global game. If the payoff optimization is constrained to the local game, then players will inevitably end up in the Nash equilibrium. However, if players escape the local maximization and optimize their payoff for the global game, they can reach the cooperative high payoff equilibrium.
Here, we show that the cooperative equilibrium can be reached in a game like the Traveler’s Dilemma due to diversity, which we define as the presence of suboptimal strategies. The appearance of strategies far from those of the residents allows the dynamics to escape the local maximization process, such that optimization takes place at a global level. Overall, this can lead to cooperation because “suboptimal strategies” that play against each other can reach higher payoffs, both collectively and individually.
Game
The Traveler’s Dilemma is a two-player game. Player i has to choose a claim, \(n_i\), from the action space, consisting of all integers on the interval \([L, U]\), where \(0 \le L < U\). The payoffs are determined as follows:

If both players, i and j, choose the same value (\(n_i = n_j\)), both get paid that value.

There is a reward parameter \(R>1\) such that if \(n_i < n_j\), then i receives \(n_i + R\) and j gets \(n_i - R\).
Thus, the payoff of player i playing against player j is
$$\begin{aligned} \pi_{ij} = {\left\{ \begin{array}{ll} n_i & \text{if } n_i = n_j \\ n_i + R & \text{if } n_i < n_j \\ n_j - R & \text{if } n_i > n_j \end{array}\right. } \end{aligned}$$
(1)
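As a minimal sketch, Eq. (1) can be written as a small Python function. The function name `payoff` and the example values (R = 2 and claims up to 100, taken from the original formulation of the game) are illustrative choices, not part of the model definition.

```python
def payoff(n_i: int, n_j: int, R: int = 2) -> int:
    """Payoff to player i claiming n_i against player j claiming n_j (Eq. 1)."""
    if n_i == n_j:
        return n_i          # equal claims: both are paid the claim
    if n_i < n_j:
        return n_i + R      # lower claimant: own claim plus the reward R
    return n_j - R          # higher claimant: the lower claim minus R


print(payoff(100, 100))  # 100
print(payoff(99, 100))   # 101
print(payoff(100, 99))   # 97
```

Undercutting a claim of 100 by one unit yields 101 instead of 100, while the undercut player drops to 97.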
Thus, a player is better off by choosing a slightly lower value than the opponent: when j plays \(n_j\), then it is best for i to play \(n_j - 1\). The iteration of this reasoning, which we will call the stairway to hell, leads to the only Nash equilibrium of the game, \(\{L,L\}\), where both players choose the lowest possible claim. The classical game theory method to reach this equilibrium is called iterative elimination of dominated strategies^{5}.
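The undercutting argument can be illustrated with a short sketch. Note that it uses iterated best responses, a simpler stand-in for the iterative elimination of dominated strategies mentioned above, and that the helper names `best_response` and `stairway` are hypothetical. The defaults assume the values of the original formulation (L = 2, U = 100, R = 2).

```python
def payoff(n_i, n_j, R=2):
    """Payoff to player i claiming n_i against claim n_j (Eq. 1)."""
    if n_i == n_j:
        return n_i
    return n_i + R if n_i < n_j else n_j - R


def best_response(n_j, L=2, U=100, R=2):
    """Claim in [L, U] that maximizes the payoff against n_j.

    max() returns the first maximizer, so ties go to the lowest claim.
    """
    return max(range(L, U + 1), key=lambda n_i: payoff(n_i, n_j, R))


def stairway(start, L=2, U=100, R=2):
    """Repeat best responses until a claim is a best response to itself."""
    path = [start]
    while True:
        nxt = best_response(path[-1], L, U, R)
        if nxt == path[-1]:
            return path
        path.append(nxt)


path = stairway(100)
print(path[:3], path[-1], len(path))  # [100, 99, 98] 2 99
```

Starting from the highest claim, each step undercuts the previous claim by one unit until the lower bound L is reached, where no further deviation pays off.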
The game can be visualized through its payoff matrix (Fig. 1). For simplicity, we use the values from the original formulation: \(L=2\), \(U=100\) and \(R=2\). The payoff matrix shows that the Traveler’s Dilemma can be decomposed into a local and a global game. Let us begin with the local game. When the action space of the game is reduced to two adjacent actions n and \(n+1\) (black boxes in Fig. 1), the Traveler’s Dilemma with \(R=2\) is equivalent to the Prisoner’s Dilemma^{6}. In general, for any value of R, the Traveler’s Dilemma becomes a Prisoner’s Dilemma for any pair of actions n and \(n+s\), where \(1 \le s \le R-1\). For example, for \(R=4\) the pairs of actions n and \(n+1\), n and \(n+2\), and n and \(n+3\) follow the same game structure as the Prisoner’s Dilemma. Therefore, the Traveler’s Dilemma consists of many embedded Prisoner’s Dilemmas. This means that at a local level the game is a Prisoner’s Dilemma.
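The local structure can be checked numerically. The sketch below restricts the game to the two claims n and n+s and tests the standard Prisoner’s Dilemma ordering of payoffs (temptation > mutual cooperation > mutual defection > sucker’s payoff). The function name `is_prisoners_dilemma` and the labels T, C, P and S are our own illustrative notation, with the high claim treated as cooperation.

```python
def payoff(n_i, n_j, R):
    """Payoff to player i claiming n_i against claim n_j (Eq. 1)."""
    if n_i == n_j:
        return n_i
    return n_i + R if n_i < n_j else n_j - R


def is_prisoners_dilemma(n, s, R):
    """Check T > C > P > S for the game restricted to claims n and n+s."""
    T = payoff(n, n + s, R)      # temptation: undercut a high claimant
    C = payoff(n + s, n + s, R)  # mutual cooperation: both claim high
    P = payoff(n, n, R)          # mutual defection: both claim low
    S = payoff(n + s, n, R)      # sucker's payoff: being undercut
    return T > C > P > S


# For R = 4, the ordering holds for s = 1, 2, 3 and fails from s = 4 onwards.
print([is_prisoners_dilemma(10, s, 4) for s in range(1, 6)])
# [True, True, True, False, False]
```

The ordering reduces to n+R > n+s > n > n-R, which holds exactly when s is at most R-1, in agreement with the condition stated in the text.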
If we now consider actions that are distant from each other in the action space, e.g., 2 and 100, we can observe a coordination game structure (gray boxes in Fig. 1), where \(\{100,100\}\) is payoff and risk dominant^{7,8}. In general, any pair of actions n and \(n+s\), where \(R \le s \le U-n\), forms a coordination game. As a result, the Traveler’s Dilemma becomes a coordination game at a global level, which has different equilibria than the local game.
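The global structure can be verified in the same way. This sketch assumes the usual definitions for symmetric two-action games: a coordination game has both symmetric profiles as Nash equilibria, and the risk-dominant equilibrium is the one with the larger loss from unilateral deviation. The function names are hypothetical.

```python
def payoff(n_i, n_j, R):
    """Payoff to player i claiming n_i against claim n_j (Eq. 1)."""
    if n_i == n_j:
        return n_i
    return n_i + R if n_i < n_j else n_j - R


def is_coordination_game(n, s, R):
    """Both (n, n) and (n+s, n+s) are Nash equilibria of the restricted game.

    For s == R the high equilibrium is only weak (deviating gives a tie).
    """
    low_is_ne = payoff(n, n, R) >= payoff(n + s, n, R)
    high_is_ne = payoff(n + s, n + s, R) >= payoff(n, n + s, R)
    return low_is_ne and high_is_ne


def high_risk_dominates(n, s, R):
    """Compare the deviation losses at the high and the low equilibrium."""
    loss_high = payoff(n + s, n + s, R) - payoff(n, n + s, R)
    loss_low = payoff(n, n, R) - payoff(n + s, n, R)
    return loss_high > loss_low


print(is_coordination_game(2, 98, 2))   # True  (claims 2 and 100)
print(high_risk_dominates(2, 98, 2))    # True
print(is_coordination_game(2, 1, 2))    # False (adjacent claims)
```

For the claims 2 and 100 with R = 2, deviating from the high equilibrium costs 96, whereas deviating from the low equilibrium costs only 2.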
Paradox
Social dilemmas appear paradoxical in the sense that self-interested competing players, when rationally playing the Nash equilibrium, end up with a payoff that clearly goes against their self-interest. But with the Traveler’s Dilemma, the paradox goes further, as suggested in its original formulation^{3}. Classical game theory proposes \(\{L,L\}\) as the Nash equilibrium of the game. However, it seems implausible that individuals would play the Nash equilibrium when R is moderately low, say \(R=2\). This has been confirmed in economic experiments, where individuals tend to choose values close to the upper bound of the interval. Such experiments have also shown that the chosen value depends on the reward parameter R, where an increasing value of R shifts players’ decisions towards \(\{L,L\}\)^{4}. Nevertheless, classical game theory states that the Nash equilibrium of the game is independent of R.
Consequently, the aim of this paper is to seek and explore simple mechanisms through which the apparently non-rational cooperative behavior can come about. We also examine the effect of the reward parameter on the game’s outcome. Given that the Traveler’s Dilemma paradox emerges in the classical game theory framework, we analyze the game using evolutionary game theory tools^{5,9,10}. This dynamical approach allows us to explore adaptive behavior outside of the stationary classical game theory framework. To be more precise, in this approach individuals dynamically adjust their actions according to their payoffs.
The key point, of course, is to understand how the system can converge to high claims. We show that this behavior is possible because the Traveler’s Dilemma can be decomposed into a local and a global game. If the payoff maximization is constrained to the local level, then the stairway to hell leads the system to the Nash equilibrium, given that locally the game is a Prisoner’s Dilemma. On the other hand, at a global level the game follows a coordination game structure, where the high claim actions are payoff dominant. Thus, for the system to reach a high claim equilibrium, the maximization process needs to jump from the local to the global level.
Our analysis led us to identify the mechanism of diversity as responsible for enabling this jump and preventing players from going down the stairway to hell. This mechanism works on the idea that to reach a high claim equilibrium, players have to benefit from playing a high claim. For a population setting, it means that players need to have the chance to encounter opponents who are also playing high. From a learning model point of view, it refers to the belief that the opponent will also play high, at least with a certain probability. If the belief is shared by both players, they should both play high and reach the cooperative equilibrium. Here, we explore these two types of models to unveil the mechanism leading to cooperation.
Population-based models reveal diversity as the cooperative mechanism via the effect of mutations on the game’s outcome. This is shown for the replicator-mutator equation and the Wright–Fisher model. Similarly, a two-player learning model approach, more in line with human reasoning, shows that if players are free to adopt a higher payoff action from a diverse action set during their introspection process, they can reach the cooperative equilibrium. This result is obtained using introspection dynamics.
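To make the role of mutations concrete, the following is a minimal sketch of one Euler step of a replicator-mutator equation. This section does not specify the equation or the mutation kernel, so everything here is an assumption: offspring keep their parent’s claim with probability 1 - mu and otherwise switch to a claim drawn uniformly from the action space, and the names `replicator_mutator_step`, `mu` and `dt` are hypothetical. It is not necessarily the parametrization used in the analysis.

```python
def payoff(n_i, n_j, R):
    """Payoff to player i claiming n_i against claim n_j (Eq. 1)."""
    if n_i == n_j:
        return n_i
    return n_i + R if n_i < n_j else n_j - R


def replicator_mutator_step(x, claims, R, mu, dt):
    """One Euler step of dx_i/dt = sum_j x_j f_j Q_ji - phi x_i.

    Assumed kernel: Q_ji = (1 - mu) if i == j, plus mu / m for every i.
    """
    m = len(claims)
    # Fitness of each claim: expected payoff against the current population.
    f = [sum(payoff(a, b, R) * x_b for b, x_b in zip(claims, x)) for a in claims]
    phi = sum(x_i * f_i for x_i, f_i in zip(x, f))  # mean fitness
    new_x = []
    for i in range(m):
        # Faithful reproduction plus a uniform share of all mutants.
        inflow = (1 - mu) * x[i] * f[i] + (mu / m) * phi
        new_x.append(x[i] + dt * (inflow - phi * x[i]))
    return new_x


# Small illustrative action space (assumed values), uniform initial frequencies.
claims = list(range(2, 21))
x = [1 / len(claims)] * len(claims)
for _ in range(100):
    x = replicator_mutator_step(x, claims, R=2, mu=0.05, dt=0.01)
print(round(sum(x), 9))  # 1.0
```

Under this kernel, mutation continually reintroduces claims far from the resident ones, which is the sense in which mutations generate diversity. The step conserves total frequency, so `x` remains a distribution over claims.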
Finally, we explain how diversity is the underlying mechanism that enables the convergence to high claims in previously proposed models. To be more precise, we show that diversity is required because it allows for the maximization process to jump from the local to the global level.