Application of Game Theory to Neuronal Networks



Introduction
Individual neurons are the building blocks for more complex neural circuits. In natural systems these more complex neural circuits interact with other components in manifold ways, thereby generating the compelling, richly sensory world of behavior around us. Although tireless efforts in various disciplines have culminated in fundamental insights in the field, there are still many unknowns about individual neurons and the processes by which individual neurons interact and organize themselves in neural circuits (e.g., [1]).
Recently, game theory has attracted some attention in the field of neuroscience. The field of neuroeconomics, for instance, combines the two fields in experiments with human and nonhuman players in order to better understand human decision-making (e.g., [2]). This paper has a different motivation and proposes a neural network model based on game theory in which individual neurons are assumed to behave optimally with respect to a given payoff matrix. The paper theoretically analyzes a paired neuron system and critically assesses the value game theory may have as an organizing principle for such a system (in the sense of a guiding principle or mechanism involved in neural communication, organization, and synchronization). The paper also specifies a learning algorithm based on game theory for a paired neuron system, which is a major contribution of this text.
In the remainder of this text, Section 2 summarizes the motivation for this paper and validates an intuitively appealing (though not unproblematic) relationship between game theory and biological/artificial neurons. Sections 3 and 4 investigate this relationship, the theory, and the major concepts and challenges involved in more detail, concentrating, among other things, on static and dynamic games of complete/perfect information. Section 5 applies game theoretic constructs to artificial neural networks and presents a learning algorithm based on game theory for network learning. The discussion in Section 6 revolves around related work and Section 7 ends the paper with a summary.

Game Theory, Biological Neurons, and Artificial Neural Networks
Our previous work in various areas (e.g., artificial intelligence, soft computing, reasoning under uncertainty, and neuroscience) identified that many cooperations between two agents (artificial or natural) can be interpreted as a game or bear some of the characteristic features of a game. For example, the main concepts in a game are the players, a set of rules by which the game is played, and an outcome in the form of a reward or a punishment (more generally referred to as a payoff) for the players. In addition, a so-called payoff matrix is a common scheme to represent the dynamic behavior of a game. Figure 1 applies these key concepts to a coupled neuron system where the neurons are modeled to calculate their strategies according to their individual payoff matrix. (The scope of both game theory and neural networks is extremely wide. The paper therefore uses several abstractions and simplifications; e.g., the neuronal circuit models presented in this text are relatively basic, and in terms of game theory this paper concentrates on static games and dynamic games of complete/perfect information. By and large, the paper does not suffer from this reductionism, as its findings are relevant in a wider sense. London and Häusser [1], for instance, emphasize that the contribution of single neurons to computation in the brain has long been underestimated and that there is a need to investigate novel mechanisms that allow individual neurons to implement elementary computations.) Imagine that the two neurons in Figure 1(a) are to generate the following global behavior: if Neuron-1 fires, then Neuron-2 fires, and if Neuron-1 is at rest (not firing), then Neuron-2 is at rest (it is possible to assume an information exchange, unidirectional or bidirectional, via biochemical substances or electrical signals between Neuron-1 and Neuron-2). Figure 1(b) presents this behavior in a payoff matrix.
The payoff matrix assigns a payoff (illustrated as a reward R or a punishment P) to each neuron for each combination of strategies (Fire, Rest). For instance, if Neuron-1 fires and Neuron-2 also fires, then each neuron obtains a rewarding payoff. (Traditionally, the payoff for Neuron-1 would be the left value in a matrix cell, and the payoff for Neuron-2 would be the right value in a cell. Note also that the payoffs in a cell need not be identical.) If the two neurons respond with different strategies (e.g., Neuron-1 fires and Neuron-2 remains at rest or vice versa), then each neuron receives a punishment payoff P. Thus, if the goal for the two neurons in Figure 1(a) is to eventually demonstrate the global behavior Fire/Fire, Rest/Rest, then it is possible to assume the following: (i) if the two neurons demonstrate the desired behavior (Fire/Fire, Rest/Rest), then no action is required, and (ii) if the two neurons do not demonstrate this desired way of interaction, then some corrective action has to be taken to achieve the desired global behavior. Again, this paper is not interested in the exact description of the biochemical processes (which are not known in their entirety anyway) that may achieve this mode of operation in biological neurons; the motivation here is to describe this global interaction via game theoretic concepts, perhaps involving additional models and abstractions for the two neurons in Figure 1(a). (The book by Purves et al. [3] provides a comprehensive account of the state of the art of neuroscience, and Chapter 1 of this book, which is dedicated to neural signaling, is particularly informative about many of the issues mentioned in this text.) On the other hand, it is crucial to understand that the payoff matrix in Figure 1 is a crude generalization.
In reality, it is very difficult to find and specify exactly a payoff function for a game, which is a critical task in game theory (i.e., approximations are the norm rather than the exception). Laying this issue aside, it is possible to provide a rather straightforward mathematical description for the modeling of the global behavior desired for the two neurons in Figure 1(a). To begin with, it is necessary to understand that the communication between the two neurons in Figure 1(a) is a relatively simple, one-dimensional, linearly separable, supervised classification task. Neuron-1 can either fire or be at rest, and Neuron-2 has to respond accordingly. It is possible to imagine a function f(x) where a value x ∈ ℝ above a certain threshold value t ∈ ℝ represents the firing state for Neuron-1, and a value x ≤ t represents the resting state for this neuron:
$$f(x) = \begin{cases} \text{Fire} & \text{if } x > t, \\ \text{Rest} & \text{otherwise.} \end{cases} \tag{1}$$
Collectively, it is possible to think of Neuron-1 and Neuron-2 as a simple input-output unit that behaves similarly to a switch. In terms of its global behavior, a perceptron can be interpreted in exactly the same way. (It is not necessary to elaborate on the perceptron learning algorithm in great detail as this information is widely available in the neural network literature (e.g., in [4, pages 43-54]).) This does not mean, however, that the payoff matrix in Figure 1(b) can be implemented by a traditional perceptron. Figure 1(c) illustrates a model that is similar to a perceptron but incorporates elements from game theory that may allow this model to demonstrate the behavior illustrated by the payoff matrix in Figure 1(b). It is clear from Figure 1(c) that the decision-making process for this model involves some form of an input, an output, a transfer function, and a reward/punishment mechanism, all based on concepts from game theory.
The forthcoming Section 5 provides a more detailed description for this model and the relationship illustrated in Figure 1 at large. The current focus is to describe the intuitive relationship between game theory, biological neurons, and artificial neural networks just mentioned in more detail and to elaborate on the various (fundamental) challenges involved in this relationship.
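The threshold function (1) and the payoff matrix of Figure 1(b) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `neuron_state`, `PAYOFF`, and `needs_correction` are hypothetical, and the payoffs are kept as the symbolic labels R and P because the text assigns no numeric values at this point.

```python
FIRE, REST = "Fire", "Rest"


def neuron_state(x: float, t: float) -> str:
    """Equation (1): Fire if the input x exceeds the threshold t, Rest otherwise."""
    return FIRE if x > t else REST


# Payoff matrix of Figure 1(b), kept symbolic (R = reward, P = punishment).
# Key: (Neuron-1 strategy, Neuron-2 strategy); value: (payoff-1, payoff-2).
PAYOFF = {
    (FIRE, FIRE): ("R", "R"),
    (FIRE, REST): ("P", "P"),
    (REST, FIRE): ("P", "P"),
    (REST, REST): ("R", "R"),
}


def needs_correction(s1: str, s2: str) -> bool:
    """Case (ii) in the text: a punished strategy pair calls for corrective action."""
    return "P" in PAYOFF[(s1, s2)]


if __name__ == "__main__":
    s1 = neuron_state(0.7, t=0.5)   # Neuron-1 input above threshold
    s2 = neuron_state(0.2, t=0.5)   # Neuron-2 input below threshold
    print(s1, s2, PAYOFF[(s1, s2)], needs_correction(s1, s2))
```

The dictionary form mirrors the matrix cell by cell, so unequal payoffs within a cell (which the text allows) would only require changing the tuple values.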

Game Theoretic Interpretations
In order to appreciate the forthcoming sections and to avoid unnecessary confusion, it is helpful to understand that game theory distinguishes between different types of games. Broadly, there are static games and dynamic games, each with complete or incomplete information. If the payoffs and strategies available to other players are common knowledge among the players, as in Figure 1(b), then a game has complete information; otherwise, the game is classified as a game of incomplete information. Crucially, in a static game, players take their decisions simultaneously (individually and independently), they then move (not necessarily simultaneously but bound to the decisions they took), and then they receive their payoffs. That is, the players in a static game are unaware of the strategies the other players in the game may choose, but any player may hypothesize about them. (The marriage vows a couple exchanges during a wedding ceremony may be a good example; the decisions are taken independently and the further proceedings of the ceremony unfold upon these decisions.) In a dynamic game, decisions are taken sequentially. In such a game, a player A may choose and act on a particular strategy, and another player B who has observed player A may use this information for an appropriate response. (Chess is a typical example of such a game.) It is tempting now to immediately treat Figure 1 as a dynamic game with complete information where the payoffs in the matrix are common knowledge between the players, and Neuron-2 reacts (sequentially) to the signal arriving from Neuron-1 (perhaps with other processes going on bidirectionally). There are several reasons, however, to initially treat Figure 1 as a static game with complete information. For one thing, Figure 1 is a rather extreme reduction and it is relatively easy to envisage more complex scenarios.
The two neurons in the figure could be exchanged with the brains of two humans or, for that matter, with the complete computer simulation of two such brains, which is the dream of the Blue Brain Project at EPFL (École Polytechnique Fédérale de Lausanne). Another reason involves understanding and learning; it is better to begin with (somewhat simpler) games of complete information and then to move on to more challenging games (in terms of the theory involved). In any case, the forthcoming text benefits from this bottom-up approach as it helps to specify, more clearly, some of the subtleties involved in this investigation.
In terms of these subtleties, it is important to understand that several of the fundamental assumptions in game theory can be challenged intellectually with relative ease. Some of the reasons for this not only relate to the current example but also reach out deeper into the heart of game theory. These more sensitive (interrelated) concepts include rationality, simultaneity, equilibrium, and mixed strategies.
Rationality. Many of the formalisms in traditional game theory imply a degree of rationality by the players/agents involved in a game. As crucial as the notion of rationality is for the theory, the term rationality is not without problems. For one thing, the term rationality is not universally defined, and for another thing, human agents are often not the hyperrational agents the theory requires them to be. Many applications of game theory therefore involve abstractions and simplifications to various degrees. For instance, this happens when game theory is applied to the modeling of interactions in genes, viruses, or cells, as is the case in evolutionary game theory [5]. (Evolutionary game theory is an extension to classical game theory motivated by some of the more problematic issues discussed in this section. Though very interesting and with some relevance to this work, evolutionary game theory has not been dealt with in this text mainly for the sake of brevity.) Another interesting contribution to this discussion may come from the observation that people usually associate biological brains with higher cognitive functions such as learning or rational decision-making. As true as this may be, many people also carry the common misconception that such a task can only be achieved by organisms with highly developed nervous systems, i.e., with brains, which is incorrect. For example, there are instances of predictive behavior within microbial genetic networks where bacteria anticipate changing environments [6]. Bacteria, however, have no brains or nervous systems. Instead, these microbes experience and learn through evolutionary changes in their complex networks of interacting genes and proteins (i.e., the problem-solving potential is encoded, in part, in the architectural configuration of the system) [7]. Although the specific mechanisms for this problem-solving ability are largely unknown today, many would agree that such tasks should involve some form of memory. 
The recent enthusiasm for so-called memristors (memory resistors) may shed some light on this topic in the future. In electronics, a memristor is a fundamental circuit element [8]. Importantly, through this element, nature seems to provide a form of memory for free. Naturally, the value memristors have for neural networks has already been identified in some of the aforementioned and other works (e.g., [9]).
Simultaneity and Equilibrium. These terms are problematic too and can quickly lead into a deep philosophical discussion. A root problem in Figure 1(a) seems to relate to the larger problem of existence and timing. The typical development process for artificial neural networks relates to this problem quite well too. The learning process for such networks usually starts with a network configuration and a random weight assignment. But how does nature determine the configuration for a network or the degree of connectivity?

Advances in Artificial Intelligence
And how does the network know about the point in time when operation begins? Are these tasks performed by a monitoring supervisory unit or do the neurons involved act with a degree of autonomy (and rationality)? A more distant view magnifies this point even more. An outside observer looking at the complete neural activity of a human being, or a human being in its entirety for that matter, witnesses a multitude of processes running in parallel/simultaneously, and it is not clear at all to this observer how these processes may relate to each other or how they are coordinated in detail. A full discussion of this problem is beyond the scope of this preliminary investigation but it is worthwhile to describe how the concept of equilibrium emerges in this context. It is difficult to imagine an observer that is able to grasp a human being in its entirety. It is possible, however, to imagine an observer witnessing the object under observation in a particular higher-level, abstract global state. Assume a state of equilibrium (e.g., defined by an energy minimum or some other form of optimization or stabilization). In nature, a system may naturally strive for or converge to such an equilibrium. Game theory provides the concept of equilibrium too; the agents in a game reach this equilibrium through rational thought. Whether such an equilibrium is a law in nature (e.g., similar to the concept of entropy in physics) is only a thought that shall be laid aside here.
Mixed Strategies. Imagine that for some reason Neuron-1 and Neuron-2 in Figure 1 have cooperated well over time. In this case the likelihood that Neuron-2 fires when Neuron-1 fires could be rather high. On the other hand, if for some reason their cooperation was relatively poor in the past, then the likelihood of a correct response may be low. It is important to understand that in both cases positive as well as negative responses are still possible (e.g., a relatively good cooperation over time may not entirely prevent undesired responses). Game theory uses mixed strategies for the modeling of such likelihoods, and from a purely theoretical point of view, they are rather important in game theory. For example, in any game where a player has to outguess the behavior (strategy) of any other player involved in the game (e.g., in poker or in the childhood game rock-paper-scissors), there is no Nash equilibrium in pure strategies [10, pages 29-33]. In such a game a player may select a strategy according to some likelihood (e.g., motivated by a hint, a tipoff, or some other piece of information that may be difficult to quantify). Game theory expresses a mixed strategy p_i for a player i as a probability distribution over some or all strategies available to that player in a game. It is clear that in many cases probability distributions may not be available and that the exact quantification of likelihoods is a point of weakness in game theory. In such cases the term uncertainty is often more appropriate. This term, however, opens the door for various theories dedicated to the management of uncertainty and ultimately adds a touch of vagueness to the rigorous formal underpinnings game theory provides. Hampton et al. [11], for instance, present several update rules for mixed strategies in a neuroscience-related study with human players, and that paper mentions several other sources where this has been done in the past.
Figure 2 illustrates a case with mixed strategies (r, 1 − r) for Player-1 and (q, 1 − q) for Player-2. The hypothetical mixed strategy p_2 = (q, 1 − q) = (0.8, 0.2) for Player-2 may then be interpreted as Player-1's uncertainty about Player-2's choice: Player-1 believes that Player-2 plays strategy Fire with probability/likelihood 0.8 and strategy Rest with probability/likelihood 0.2. (Note that the terms player and neuron can be used interchangeably in the figure. In addition, the payoff matrix in Figure 2 with its numeric values is less general than that of Figure 1(b). This is for demonstration purposes only and does not impair the general conclusions presented in the forthcoming sections.) The remaining text in Section 3 analyzes the static game with complete information illustrated in Figure 2 in more detail and starts with Player-1's point of view of the game. (Gibbons's [10] book on game theory is a major resource in this work, and readers wishing to get further information about the game theoretic elements mentioned in this text are referred to that book.) Figure 3 illustrates Player-1's (Neuron-1's) view only. Player-2's payoff is irrelevant in this view; that is why it is omitted in Figure 3.
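The remark about outguessing games can be checked by brute force. The following minimal sketch enumerates all pure-strategy profiles of rock-paper-scissors and tests whether either player could gain by deviating. The payoff convention (+1 for a win, 0 for a tie, −1 for a loss) and all function names are assumptions made for illustration; they do not come from the paper.

```python
from itertools import product

MOVES = ("rock", "paper", "scissors")
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def payoff(a: str, b: str) -> int:
    """Payoff to the player choosing a against b (assumed: +1 win, 0 tie, -1 loss)."""
    if a == b:
        return 0
    return 1 if BEATS[a] == b else -1


def pure_nash_equilibria():
    """Return every pure profile (a, b) in which neither player gains by deviating."""
    equilibria = []
    for a, b in product(MOVES, repeat=2):
        p1_ok = all(payoff(a, b) >= payoff(alt, b) for alt in MOVES)
        p2_ok = all(payoff(b, a) >= payoff(alt, a) for alt in MOVES)
        if p1_ok and p2_ok:
            equilibria.append((a, b))
    return equilibria


def payoff_vs_uniform(a: str) -> float:
    """Expected payoff of the pure move a against the mixed strategy (1/3, 1/3, 1/3)."""
    return sum(payoff(a, b) for b in MOVES) / len(MOVES)


if __name__ == "__main__":
    print(pure_nash_equilibria())                    # no pure profile survives
    print([payoff_vs_uniform(m) for m in MOVES])     # every move earns the same
```

The empty list reflects that one player always prefers to switch. Against the uniform mixed strategy every move earns the same expected payoff, which is the indifference property that makes a mixed-strategy equilibrium possible.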

Player-1's (Neuron-1's) Point of View.
Figure 3: Viewpoint of Player-1 (Neuron-1).
According to Figure 3, given that Player-1 believes that Player-2 will play the mixed strategy (q, 1 − q), the expected payoff for Player-1 for playing the pure strategy Fire is
$$f^*(q) = q \cdot 1 + (1 - q) \cdot 0 = q. \tag{2}$$
Similarly, the expected payoff for Player-1 for playing the pure strategy Rest is
$$g^*(q) = q \cdot 0 + (1 - q) \cdot 1 = 1 - q. \tag{3}$$
Figure 4 illustrates (2) and (3) in a single diagram. In order to understand the forthcoming arguments, it is important to always bear in mind that the main goal for each player is to obtain a maximum payoff in a game. Figure 4 illustrates that if q > 1/2, then f*(q) > g*(q), in which case Player-1 should play strategy Fire (see also Figure 3). On the other hand, if q < 1/2, then g*(q) > f*(q), in which case Player-1 should adopt strategy Rest. A special case exists for q = 1/2, which is the point where the two straight lines f*(q) and g*(q) intersect. In this case Player-1 is indifferent about which strategy to play.
Figure 4: Decision-making support for Player-1 if Player-1 believes that Player-2 plays the mixed strategy (q, 1 − q).
It is also possible to consider mixed strategy responses by Player-1. Player-1's expected payoff r*(q) from playing the mixed strategy (r, 1 − r) when Player-2 plays the mixed strategy (q, 1 − q) is the weighted sum of the expected payoff for each of the pure strategies (Fire, Rest) where the weights are the probabilities (r, 1 − r). According to Figure 3 this payoff amounts to
$$r^*(q) = r \cdot f^*(q) + (1 - r) \cdot g^*(q) = rq + (1 - r)(1 - q). \tag{4}$$
What exactly is at stake here? At stake is the goal to maximize the payoff for Player-1 expressed by (4). The mixed strategy (r, 1 − r) is the parameter that provides Player-1 with a handle to work towards this maximum. Consider three cases: q = 0, q = 1, and q = 1/2 (i.e., the problem is to determine which values for r maximize r*(q = 0), r*(q = 1), or r*(q = 1/2) for Player-1). For q = 0, (4) gives r*(q = 0) = 1 − r. In this case r = 0 maximizes the term 1 − r.
For q = 1, (4) gives r*(q = 1) = r, in which case r = 1 provides the maximum. Finally, for q = 1/2, (4) yields r*(q = 1/2) = 1/2. This term is independent of r and indicates that any response by Player-1 is a best response to Player-2's assumed strategy. Figure 5 summarizes all best responses by Player-1 if Player-2 plays the mixed strategy (q, 1 − q) and the mixed strategy (r, 1 − r) is available to Player-1.
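The three cases above can be reproduced numerically. The sketch below assumes a payoff of 1 for matching strategies (Fire/Fire, Rest/Rest) and 0 for mismatching ones; these values are inferred from the results quoted for (4) (1 − r at q = 0, r at q = 1, and 1/2 at q = 1/2) rather than read from Figure 2 itself, and the function names are hypothetical.

```python
def fire_payoff(q: float) -> float:
    """Expected payoff to Player-1 for pure strategy Fire, cf. (2)."""
    return q * 1 + (1 - q) * 0


def rest_payoff(q: float) -> float:
    """Expected payoff to Player-1 for pure strategy Rest, cf. (3)."""
    return q * 0 + (1 - q) * 1


def mixed_payoff(r: float, q: float) -> float:
    """Expected payoff to Player-1 for the mixed strategy (r, 1 - r), cf. (4)."""
    return r * fire_payoff(q) + (1 - r) * rest_payoff(q)


def best_response(q: float) -> tuple:
    """Interval (lo, hi) of probabilities r that maximize mixed_payoff(r, q)."""
    if q > 0.5:
        return (1.0, 1.0)   # always Fire
    if q < 0.5:
        return (0.0, 0.0)   # always Rest
    return (0.0, 1.0)       # indifferent: every r is a best response


if __name__ == "__main__":
    for q in (0.0, 0.5, 0.8, 1.0):
        print(q, fire_payoff(q), rest_payoff(q), best_response(q))
```

Returning an interval instead of a single number keeps the indifference case q = 1/2 explicit, which is the vertical segment one would expect in a best-response diagram such as Figure 5.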

Player-2's (Neuron-2's) Point of View.
This section describes Player-2's view from Figure 2. Overall, the steps are similar to those performed in the previous section. Given that Player-2 believes that Player-1 will play the mixed strategy (r, 1 − r), the expected payoff for Player-2 when playing the pure strategy Fire is
$$f^*(r) = r \cdot 1 + (1 - r) \cdot 0 = r. \tag{5}$$
The expected payoff for Player-2 for playing the pure strategy Rest is
$$g^*(r) = r \cdot 0 + (1 - r) \cdot 1 = 1 - r. \tag{6}$$
Figure 6 illustrates (5) and (6) in a single diagram. The interpretation of Figure 6 is similar to that of Figure 4. It is, however, important to look carefully at the labeling of the coordinate axes. In Figure 6 the two straight lines f*(r) and g*(r) intersect at r = 1/2, indicating that for r = 1/2, Player-2 is indifferent about which strategy to play. Figure 6 then illustrates that Player-2 should play strategy Fire for r > 1/2 (because f*(r) > g*(r)) and strategy Rest for r < 1/2 (because g*(r) > f*(r)). Further, Player-2's expected payoff r*(r) from playing the mixed strategy (q, 1 − q) when Player-1 plays the mixed strategy (r, 1 − r) is (see Figure 2)
$$r^*(r) = q \cdot f^*(r) + (1 - q) \cdot g^*(r) = qr + (1 - q)(1 - r). \tag{7}$$
The interpretation of (7) is similar to that for (4). Here, Player-2 has the mixed strategy (q, 1 − q) at his disposal in order to maximize (7). Consider the following three cases: r = 0, r = 1, and r = 1/2. For r = 0, (7) gives r*(r = 0) = 1 − q, and q = 0 generates the maximum for this term. Next, r = 1 gives r*(r = 1) = q, and q = 1 provides the maximum. Finally, r = 1/2 establishes r*(r = 1/2) = 1/2. This term is independent of q and so any response by Player-2 is a best response to Player-1's proposal. Figure 7 summarizes all best responses by Player-2 if Player-1 plays the mixed strategy (r, 1 − r). Figure 7 illustrates that if Player-1 plays the mixed strategy (r, 1 − r), then Player-2's best response is to play (i) strategy Fire if r > 1/2, (ii) strategy Rest if r < 1/2, and (iii) any strategy if r = 1/2.

Nash Equilibrium for Player-1 (Neuron-1) and Player-2 (Neuron-2).
Figures 5 and 7 are quite similar and it is possible to combine both figures in a single diagram. Figure 8 emerges if Figure 7 is flipped, rotated, and put on top of Figure 5.
The interesting features in Figure 8 include those points where r*(q) and r*(r) intersect (i.e., points (0, 0), (1/2, 1/2), and (1, 1)). What makes these three points important is that at each of them the strategy chosen by either of the two players is a best response to the strategy chosen by the other player, and this is the definition of a Nash equilibrium. Roughly, in a game played by n players, the strategies (s_1, ..., s_n) are in a Nash equilibrium if, for each player i, strategy s_i is a best response to the strategies (s_1, ..., s_{i−1}, s_{i+1}, ..., s_n) specified for the n − 1 other players in the game (e.g., see [10, pages 8-12 and 33-48]).
In the communicating neuron context of Figure 1(a), this means that if Neuron-1 fires then Neuron-2's best response is to fire too. If Neuron-1 is at rest, then Neuron-2's best response is to be at rest too. An interesting situation exists for point (1/2, 1/2). This situation may be interpreted as follows: if Neuron-2 is unaware of the state (strategy) of Neuron-1, then Neuron-2 may play either strategy, and vice versa (i.e., the situation for each neuron/player is similar to the tossing of a coin).
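The intersection argument can be sketched as a membership test: a profile (r, q) is a Nash equilibrium exactly when r lies in Player-1's best-response set for q and q lies in Player-2's best-response set for r. The sketch below assumes the symmetric payoffs implied by the results above (1 for matching strategies, 0 otherwise), so one best-response function serves both players. The function names and the grid resolution are hypothetical choices.

```python
def best_response_set(p: float) -> tuple:
    """Interval (lo, hi) of own Fire-probabilities that are best responses
    when the opponent plays Fire with probability p (symmetric game assumed)."""
    if p > 0.5:
        return (1.0, 1.0)
    if p < 0.5:
        return (0.0, 0.0)
    return (0.0, 1.0)


def is_nash(r: float, q: float) -> bool:
    """True if r is a best response to q and q is a best response to r."""
    lo1, hi1 = best_response_set(q)   # Player-1 reacts to q
    lo2, hi2 = best_response_set(r)   # Player-2 reacts to r
    return lo1 <= r <= hi1 and lo2 <= q <= hi2


# Scan a coarse grid of mixed-strategy profiles (r, q) in steps of 0.1.
GRID = [i / 10 for i in range(11)]
EQUILIBRIA = [(r, q) for r in GRID for q in GRID if is_nash(r, q)]

if __name__ == "__main__":
    print(EQUILIBRIA)
```

On this grid only the three profiles named in the text pass the test. A grid scan is merely a convenience here; it cannot find equilibria that fall between grid points in other games.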
At this moment, it may be useful to take a step back and to evaluate the results mentioned before a bit more carefully. The results are derived from a purely formal investigation of the (arbitrary) game illustrated in Figure 2. As discussed above, a neural behavior can be modeled under these game theoretic concepts. Whether these concepts can be theoretically generalized to neural systems with other arbitrary payoff matrices is a matter of debate. For example, the assumption that natural systems organize themselves according to the predictions of game theory (e.g., converge to or exploit Nash equilibria) rather quickly leads back to the problems mentioned earlier in Section 3 (simultaneity, rationality, etc.). Consider a newly created or evolving biological neural network where new neurons emerge frequently (e.g., thousands of new neurons arise in the adult brain every day [12]). Some of these new neurons may be required to establish a way of communication with other neurons and it is difficult to imagine how this may work if there is no previous history between these neurons. Theoretically, for artificial neural networks, the situation is similar. Imagine a supervised learning scenario and an untrained network just provided with an initial random weight assignment. How does such a network know about a correct/incorrect classification outcome in the first place? The simple answer is that it knows from its supervisor (the network designer, developer, programmer, etc.). But who is the supervisor in nature? In nature, scientists often search for a guiding principle or law. This text does not suggest at all that game theory provides such a guiding principle, but it is necessary to create an awareness of the wider issues this work touches upon. Forthcoming sections relate back to some of the problems mentioned in this section but for the moment this text moves on to dynamic games.

Dynamic Games and Neural Circuit Dynamics
This section concentrates on dynamic games with complete and perfect information. Such games have three distinctive features: (i) the moves in the game occur sequentially, (ii) a sort of move history exists (i.e., all previous moves are observed before a next move is chosen), and (iii) the payoffs in the payoff matrix are known to all players in the game. Remember, a game has complete information if the content of the payoff matrix is common knowledge to all players in the game. A game has perfect information if every player has a record of the complete history of the game so far; otherwise, the game has imperfect information. Backwards induction is a general problem-solving strategy for such games and in many situations a game tree is a useful representation for a dynamic game. The game tree in Figure 9 represents an arbitrary dynamic two-move game played by two players (indicated as 1 and 2 in the figure).
Figure 9: A game tree for a simple two-move game. There are two players (1 and 2) and the numbers at the bottom of the tree represent the payoff for each player traversing a particular path.
The strategies for the players in Figure 9 are Left (L) and Right (R) for player one and Up (U) and Down (D) for player two. The numbers at the leaf nodes at the bottom of the tree represent the payoffs for the players after traversing a particular route through the tree. The top number represents the payoff for player one, and the bottom number represents the payoff for player two. The game follows three rules; taken together, these rules are referred to as the extensive-form representation of the game.
(1) Player one decides on one of the available strategies (here, L or R).
(2) Player two observes this decision and decides on an appropriate strategy response (here, U or D).
(3) The players receive their payoffs.
Backwards induction works its way up from the bottom of the tree. Assume the position at the bottom of Path-1 where player one has decided to play strategy L and player two, who has observed this decision, is contemplating a response. The best response for player two is to play strategy U in which case player two receives the payoff 2 (instead of payoff 1), and player one recieves the payoff 1 (instead of payoff 2). Per definition, all information in the tree is available to all players (i.e., player one is aware that the response of player two is U if player one decides to play strategy L). Now assume the position of player two at the bottom of Path-2. In this case the best response for player two is again to play strategy U in which case player two receives the payoff 3 and player one the payoff 0. Player one can do some reasoning too. Between the two paths, and expecting best response decisions by player two, player one can expect a payoff of 1 for Path-1 (L) and a payoff of 0 for Path-2 (R). Each player aims for a maximum payoff and so player one decides to play strategy L. For player two, who is rational and aware of this thinking, the best response for this choice is to play strategy U. The pair (L, U) of best responses for player one and player two is referred to as the backwards-induction outcome of the game. This text mentioned earlier that there are different types and definitions for Nash equilibrium. In the type of dynamic game that just investigated the backwards-induction outcome of the game is the Nash equilibrium for the game (note that a game may have more than one Nash equilibrium). Figure 10 applies these notions to the neuron communication example (see Figure 1(a) and the payoff matrix in Figure 2). The number 1 in the figure represents Neuron-1 and the number 2 stands for Neuron-2. The strategies for both neurons are Fire (F) and Rest (R).
For the game tree in Figure 10, backwards induction produces two backwards-induction outcome pairs, namely, the pair (F, F) and the pair (R, R). Both pairs are Nash equilibria of the game. This result is not surprising and agrees with the results produced in Section 3. If Neuron-1 fires, then the best response for Neuron-2 is to fire too, and if Neuron-1 is at rest, then the best response for Neuron-2 is to be at rest too. It is necessary now to mention that game theory provides several possible extensions to the type of games presented in this section. A simple extension is games with longer sequences (perhaps an infinite number) of moves and more than two players. A complete treatment of all these features is well beyond the scope of this paper, and the reference section in this paper may direct the interested reader to further relevant information on these topics. Overall, however, the section provides two important insights. First, the findings in this section suggest that game theory and neural network dynamics fit together intuitively well, and second, the issue of repetitive, longer sequences involving updates leads naturally to the issue of learning.
Figure 10: Game tree for the communicating neuron example (Figures 1 and 2). Two neurons (1 and 2) and their payoffs for traversing a particular path.
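The backwards-induction reasoning used in this section can be sketched in a few lines of Python. The sketch is illustrative only. In the first tree, the payoffs for (L, U), (L, D), and (R, U) follow the text, whereas the payoff for (R, D) is an assumption. The second tree uses hypothetical coordination payoffs for the two neurons, which are not the values of Figure 2; they merely show how a tie for the first mover yields two outcome pairs. The names `Tree`, `backwards_induction`, `lr_tree`, and `neuron_tree` are introduced here for illustration.

```python
from typing import Dict, List, Tuple

# Two-stage game tree:
# first mover's action -> second mover's action -> (payoff_1, payoff_2).
Tree = Dict[str, Dict[str, Tuple[float, float]]]


def backwards_induction(tree: Tree) -> List[Tuple[str, str]]:
    """Return every backwards-induction outcome (ties are all kept)."""
    # Step 1: at the bottom of each path, the second mover picks a best response.
    replies: Dict[str, List[str]] = {}
    for first, branch in tree.items():
        best = max(p2 for _, p2 in branch.values())
        replies[first] = [a for a, (_, p2) in branch.items() if p2 == best]

    # Step 2: the first mover anticipates these replies and maximises its payoff.
    candidates = [(first, second, tree[first][second][0])
                  for first, seconds in replies.items() for second in seconds]
    best_first = max(p1 for _, _, p1 in candidates)
    return [(f, s) for f, s, p1 in candidates if p1 == best_first]


# Payoffs for (L, U), (L, D) and (R, U) follow the text; (R, D) is assumed.
lr_tree: Tree = {
    "L": {"U": (1, 2), "D": (2, 1)},
    "R": {"U": (0, 3), "D": (3, 0)},
}

# Hypothetical coordination payoffs for the two neurons (Fire/Rest);
# these are NOT the values of the payoff matrix in Figure 2.
neuron_tree: Tree = {
    "F": {"F": (1, 1), "R": (0, 0)},
    "R": {"F": (0, 0), "R": (1, 1)},
}

print(backwards_induction(lr_tree))      # [('L', 'U')]
print(backwards_induction(neuron_tree))  # [('F', 'F'), ('R', 'R')]
```

Keeping ties in both steps is a design choice made so that a tree with equally good options for the first mover reports all outcome pairs rather than an arbitrary one.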

Game Theory and Neural Network Learning
In order to acquire a capacity for decision-making, a network has to evolve from an unorganized state to an organized (synchronized) state with the latter state demonstrating the desired problem-solving potential. The mechanism that drives artificial neural networks from an unorganized state to an organized state is typically realized by a learning algorithm. This section describes a learning algorithm based on game theory for artificial neural networks. The question marks in Figure 11 indicate that game theory provides two possible access points for a learning algorithm: (i) the payoffs in the payoff matrix (i.e., the payoff function), and (ii) the values for the mixed strategies.
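The two access points can be made concrete with a small sketch. It assumes that q is the probability with which Neuron-2 fires, and it uses placeholder payoffs that are not the entries of the payoff matrix in Figure 2. The names `PAYOFF_1`, `expected_payoff`, and `best_response` are hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical payoffs for Neuron-1: (own action, other's action) -> payoff.
# These numbers are placeholders, not the entries of Figure 2.
PAYOFF_1: Dict[Tuple[str, str], float] = {
    ("F", "F"): 1.0, ("F", "R"): 0.0,
    ("R", "F"): 0.0, ("R", "R"): 1.0,
}


def expected_payoff(action: str, q: float, payoff=PAYOFF_1) -> float:
    """Expected payoff of `action` when the other neuron fires with probability q."""
    return q * payoff[(action, "F")] + (1.0 - q) * payoff[(action, "R")]


def best_response(q: float, payoff=PAYOFF_1) -> str:
    """Pure strategy with the larger expected payoff (ties resolved as Rest)."""
    fire = expected_payoff("F", q, payoff)
    rest = expected_payoff("R", q, payoff)
    return "F" if fire > rest else "R"


print(best_response(0.75), best_response(0.25))
```

Changing an entry of `PAYOFF_1` alters the payoff function (access point (i)), whereas changing `q` alters the mixed strategy (access point (ii)).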

Algorithm.
For the algorithm, imagine a one-dimensional, linearly separable, supervised learning classification task. Figure 12 illustrates such a task. The classification scenario in Figure 12 takes place in an arbitrary real-valued x, y coordinate system. The scenario involves n objects, and together these objects represent the training set for the learning algorithm (e.g., an object may represent a measurement of membrane potential in a neuroscience experiment and indicate whether a neuron is firing or in a resting state). The values measured for these objects have been normalized such that x_i ∈ [0, 1] for every object i. Let the black dots in Figure 12 represent objects of Class 1, and let the lined circles represent objects of Class 2; let Class 1 indicate the resting state of a neuron and Class 2 the firing state. The two points P and P′ in the figure are division points. In their current positions, P′ correctly separates all objects into their corresponding classes, whereas P incorrectly classifies three Class 2 objects. At the start of a learning scenario, P may have been positioned randomly, and in successive steps the learning algorithm may have moved this starting point (through various other points) until it finished in location P′, which is a solution to the problem.
Figure 11: A learning algorithm for an artificial neural network based on game theory may exploit the payoffs in the matrix (i.e., the payoff function) and the mixed strategies.
Figure 12: A one-dimensional, linearly separable, and supervised learning classification task.
Figure 13 projects these ideas into a game theoretic context. The panels in Figure 13 are similar to Figure 4 and represent Neuron-1's point of view. Recall that the mixed strategy (q, 1 − q) represents Neuron-1's uncertainty about Neuron-2, and the task for Neuron-1 is to establish (in a learning process) a model of the expected behavior (mixed strategy (q, 1 − q), payoff function) of Neuron-2. Further, every panel in Figure 13 includes two lines, f_0 and either f_Q or f_R, which are all payoff functions. (Note that the following discussion focuses on Figure 13(a) to 13(c).) Line f_0 is fixed and remains unaltered during the learning process. In addition, f_0 represents the payoff function for Class 1 and so, by definition, the resting state for Neuron-1. The second line f_Q is determined by the angle Q, where 0° ≤ Q ≤ 90°. This line represents the payoff function for Class 2 (i.e., the firing state for Neuron-1). The angle Q is derived by the function m : q ∈ [0, 1] → Q ∈ [0°, 90°] (e.g., the value q = 0.5 corresponds to an angle Q = 45°). The learning process for Figure 13 is similar to the scenario described for Figure 12. Figure 13(a) represents an initial random assignment for Q.
Point P in this figure is at the intersection of f_0 and f_Q. The learning algorithm will find out in the training phase that this point does not separate the two classes correctly and take appropriate action. In this case, the algorithm will increase the angle Q, which moves the intersection point further to the left. There may be several such steps until the algorithm arrives at point P′ in Figure 13(c). Consider any object x_l to the left of P′. Any such point yields f_0(x_l) > f_Q(x_l). That is, the payoff f_0(x_l) (rest) is always larger than the payoff f_Q(x_l) (fire). Therefore, Neuron-1 chooses to stay at rest for any such value. For similar reasons, for any object x_r to the right of P′, Neuron-1 chooses to fire, because for any such value the payoff f_Q(x_r) > f_0(x_r). Equation (8) formalizes this outcome as follows:
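Written out as a case distinction, the outcome takes a form such as the one below. This is a sketch consistent with the preceding description only; the symbol s_1(x) for the strategy that Neuron-1 selects at input x is an assumption introduced here.

```latex
s_1(x) =
\begin{cases}
  \text{Rest (R)}, & \text{if } x < x_P \quad \bigl(f_0(x) > f_Q(x)\bigr), \\
  \text{Fire (F)}, & \text{if } x > x_P \quad \bigl(f_Q(x) > f_0(x)\bigr),
\end{cases}
```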

where x_P is the x coordinate of intersection point P and in general the separation point determined by the learning algorithm. For the sake of completeness, Figure 13(d) illustrates a possible scenario from the viewpoint of Neuron-2. This scenario is similar to Figure 13(a), but in this figure it is Neuron-2 that has just received an initial random assignment for the angle R. The task for the learning algorithm now is to establish a model for Neuron-2 of the expected behavior (mixed strategy (r, 1 − r), payoff function) of Neuron-1. It is not necessary to provide a detailed description of these processes for Neuron-2 because of the general symmetry of the system. (Note that this does not necessarily mean that Neuron-1 and Neuron-2 learn on the same data. Many of the examples in this text are high-level abstractions of natural systems where (i) information exchange between two neurons can be unidirectional or bidirectional, inhibitory or excitatory, and can affect neuronal differentiation, (ii) unconventional neurotransmitters can provide signaling from postsynaptic cells back to presynaptic cells, or (iii) chemical signaling is not limited to synapses (e.g., signaling may involve the secretion of chemical signals onto a group of nearby target cells). Thus, a measurement of data (e.g., a particular molecular concentration or a particular biochemical or electrical signal) at Neuron-1 related to q/Q or r/R may correspond to a related event involving the same or different components at Neuron-2. (For more detail, see [3, Unit 1, Neural Signalling].)) It is important, however, to understand what Section 5 achieved. The section formalized a learning algorithm in the game theoretic framework such that a paired neuron system can establish a synchronized way of communication. The learning algorithm determines the payoff functions for the payoff matrix as well as the mixed strategies for the neurons involved.
To our current understanding of the field, this is an interesting and novel outcome. (It is clear that the presented algorithm shares many similarities with traditional neural network algorithms (e.g., the perceptron learning algorithm). The presented model, however, goes beyond traditional models where a neuron is modeled as an accumulator of multiple inputs (e.g., the McCulloch-Pitts neuron, which has been a basis of neural networks for some time). The paper mentioned already that, in reality, there are still many unknowns about individual neurons communicating with other neurons (e.g., an individual neuron is not just a neuronal membrane; for instance, it includes complex molecular circuits and well-organized structures, such as dendritic trees [1]). It is necessary, therefore, to develop the theoretical concept of individual neurons beyond the accumulative neuron model (e.g., by proposing a neural network model where individual neurons are assumed to behave optimally according to concepts from game theory). In addition, although the proposed model is for a paired neuron system only, the model has the potential to be extended to more complex networks. For example, the angles Q, R, etc. in Figure 13 lead to trigonometric functions (e.g., the division point x_P in Figure 13(c) can be determined from cos(Q′) and P′). Learning algorithms for more complex multilayer networks (e.g., the backpropagation algorithm) rely heavily on derivatives (e.g., those of a transfer function). The derivatives of trigonometric functions are easy to obtain, and this is certainly beneficial for potential extensions of the proposed approach to more complex network structures. A treatment of such potential extensions, however, is outside the scope of this paper.)
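The learning procedure described in this section can be sketched as follows. The sketch rests on several assumptions that are not taken from the text: the mapping m is taken to be linear (Q = 90° · q, which agrees with the example q = 0.5, Q = 45°), the fixed line is taken to be f_0(x) = 1 − x, the line f_Q is taken to pass through the origin at angle Q to the x axis, and the angle is updated by a fixed step in the manner of the perceptron rule. The training set is made up for illustration, and all function names are hypothetical.

```python
import math
from typing import List, Tuple

REST, FIRE = 1, 2  # Class 1 = resting state, Class 2 = firing state


def angle_from_q(q: float) -> float:
    """Map q in [0, 1] to an angle Q in degrees (assumed linear: 0.5 -> 45)."""
    return 90.0 * q


def f_0(x: float) -> float:
    """Fixed payoff line for Rest. The form 1 - x is an assumption."""
    return 1.0 - x


def f_Q(x: float, Q: float) -> float:
    """Payoff line for Fire: through the origin at angle Q (degrees)."""
    return math.tan(math.radians(Q)) * x


def classify(x: float, Q: float) -> int:
    """Neuron-1 fires where f_Q exceeds f_0 and rests otherwise."""
    return FIRE if f_Q(x, Q) > f_0(x) else REST


def division_point(Q: float) -> float:
    """x coordinate of the intersection of f_0 and f_Q (assumed forms)."""
    return 1.0 / (1.0 + math.tan(math.radians(Q)))


def train(samples: List[Tuple[float, int]], q_init: float,
          step: float = 1.0, max_epochs: int = 100) -> float:
    """Adjust Q by a fixed step until every sample is classified correctly."""
    Q = angle_from_q(q_init)
    for _ in range(max_epochs):
        errors = 0
        for x, label in samples:
            if classify(x, Q) == label:
                continue
            errors += 1
            # A missed firing object lies left of the intersection: increase Q
            # to move the intersection left; otherwise decrease Q.
            Q += step if label == FIRE else -step
            Q = min(90.0, max(0.0, Q))
        if errors == 0:
            break
    return Q


# Made-up, normalised training set (x in [0, 1]) for illustration only.
training_set = [(0.1, REST), (0.2, REST), (0.3, REST),
                (0.6, FIRE), (0.7, FIRE), (0.9, FIRE)]
Q_final = train(training_set, q_init=0.2)
print(Q_final, division_point(Q_final))
```

With a fixed step, convergence depends on the step being small relative to the gap between the two classes; a smaller `step` may be needed for training sets with a narrow gap.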

Related Work
This section first repeats an important point made several times in this text already, namely, that neuroscience and game theory are each rich and complex fields in their own right, and that this paper can consequently present only a condensed view of the many challenges involved in the wider context of this investigation. A second important point is that, although there is work combining game theory and neuroscience, to our understanding the two fields have not been combined in the way presented in this paper. For example, the relatively young field of neuroeconomics combines the two fields in experiments with human and nonhuman players (e.g., see Sanfey et al. [13] for a somewhat briefer introduction to neuroeconomics, or Krüger et al. [2], who review this topic quite well). One assumption in the field is that one of the tasks of the human nervous system is to facilitate successful interaction in complex environments and that this process is in essence a decision-making process. Körding [14] describes the value that decision theory, which is formally well defined, may have for generating a better understanding of the processes going on in the nervous system during these interactions. That paper introduces the basic concepts of decision theory and emphasizes Bayesian decision theory because this theory, according to Körding, provides a compact and elegant formalism and has other properties (e.g., its ability to handle uncertainty) that may suit studies in neuroscience well. Works by Sanfey [13, 15] or Hampton et al. [11] indicate other interesting research directions in neuroeconomics.
A common feature of these papers is studies in which decision-making is based on game theoretic models that are mathematically well understood (e.g., Prisoner's Dilemma, Trust Game, or Ultimatum Game) and in which the neural activity of participating players is recorded via established methods (e.g., functional magnetic resonance imaging). A major goal in these studies is to relate brain areas and fundamental brain mechanisms to decision-making tasks. Interesting results include findings where outcomes disagree with theoretical predictions, as is the case when emotions such as anger, frustration, or greed come into play. Such emotions are generally difficult to quantify and to describe mathematically, and findings of this kind may challenge basic game theoretic assumptions and definitions. For example, one study [16] measuring activation in the anterior insula (a brain region involved in emotional processing) of players participating in the so-called Ultimatum Game contradicts the concept of rationality mentioned in Section 3. The results from this study indicate that players may act irrationally (in a game theoretic sense) if other players act in an antisocial or unacceptable way (e.g., a player may not accept an indecent, unfair, or greedy offer). An outcome may also deviate from a predicted outcome if nonhuman players are involved (e.g., a program running on a desktop PC or a robot-like device), which may be interesting for people working in human-computer interaction.
The possible application of game theory to fields such as human-computer interaction indicates that game theory has long left its traditional environment of economics and human decision-making (the famous mathematician John Forbes Nash was awarded, jointly, the Nobel Prize in Economics in 1994 for his work in game theory). Today, the theory is widely applied in the natural sciences for the modeling of a rich variety of biological games involving agents of various types. Indeed, the principles of the theory are general enough to attract cutting-edge research in artificial intelligence or systems biology, in applications where web-based intelligent agents or robots may have to wrestle with complex decision-making problems [17] or where evolutionary game theory investigates the interplay between evolutionary dynamics and biological games [5]. For this work it is important to understand that the term rational cannot be used with ease in these domains and that the term uncertainty often softens stricter demands (e.g., those coming from probability theory). Applications in artificial intelligence and evolutionary game theory are therefore permeated by techniques from soft computing (genetic algorithms, fuzzy logic, etc.), which makes it tempting to foresee the inclusion of some of these techniques in the model proposed in this work.
Although several other interesting studies could be mentioned here, this review section closes by commenting briefly on the timing of games. This paper dealt with static and dynamic games separately, and this treatment may have given the impression that a system, over time, always sticks to one type of game, which is questionable. Consider the timing of games in a different context. Take a tournament in which teams A and B are two among several teams. Imagine that team A and team B meet in the early qualifying stages of the tournament, that team A beats team B during these stages, and that both teams survive qualifying and later meet again in the final, which is won by team B (e.g., in the 2008 Olympic Games this was the case for the women's softball teams of Japan and the US: Japan lost to the US in the early stages but won the gold medal in the final against the US). If the team coaches work out their strategies in the qualifying stages, then this analysis may have the form of a static game, whereas in the final both teams have met before, and so the coaches find themselves analyzing a dynamic game. How does this relate to neural networks? Take the case of an untrained neural network (natural or artificial) again. If the network is untrained (without history), then preliminary assumptions may come from a static game perspective. At a later point in time, some neurons in the network may have cooperated in the past in some way, and for their further interaction dynamic game concepts may be applicable. A further treatment of this line of thought is beyond the scope of this paper, but we feel that the information accumulated in this review section provides several pointers for further research.

Summary
The paper presented a novel concept for describing individual neurons within a game theoretic framework. The paper discussed some of the fundamental problems in game theory and emphasized that these problems are not unique to the domain of neural systems but reach more deeply into game theory, science, and the world around us. The paper demonstrated that various strategic game theoretic concepts and calculations seem naturally suited to modeling the behavior of a paired neuron system (and possibly more complex networks). This finding was further supported by the specification of a novel learning algorithm based on game theory for the purpose of neural learning.