Publication

Reaching Correlated Equilibria Through Multi-agent Learning

Related concepts (24)

In game theory, the Nash equilibrium, named after the mathematician John Nash, is the most common way to define the solution of a non-cooperative game involving two or more players. In a Nash equilibrium, each player is assumed to know the equilibrium strategies of the other players, and no one has anything to gain by changing only one's own strategy. The principle of Nash equilibrium dates back to the time of Cournot, who in 1838 applied it to competing firms choosing outputs.

Intelligent agent

In artificial intelligence, an intelligent agent (IA) is an agent acting in an intelligent manner; It perceives its environment, takes actions autonomously in order to achieve goals, and may improve its performance with learning or acquiring knowledge. An intelligent agent may be simple or complex: A thermostat or other control system is considered an example of an intelligent agent, as is a human being, as is any system that meets the definition, such as a firm, a state, or a biome.

Multi-agent system

A multi-agent system (MAS or "self-organized system") is a computerized system composed of multiple interacting intelligent agents. Multi-agent systems can solve problems that are difficult or impossible for an individual agent or a monolithic system to solve. Intelligence may include methodic, functional, procedural approaches, algorithmic search or reinforcement learning. Despite considerable overlap, a multi-agent system is not always the same as an agent-based model (ABM).

Coordination game

A coordination game is a type of simultaneous game found in game theory. It describes the situation where a player will earn a higher payoff when they select the same course of action as another player. The game is not one of pure conflict, which results in multiple pure strategy Nash equilibria in which players choose matching strategies. Figure 1 shows a 2-player example.

Game theory

Game theory is the study of mathematical models of strategic interactions among rational agents. It has applications in all fields of social science, as well as in logic, systems science and computer science. The concepts of game theory are used extensively in economics as well. The traditional methods of game theory addressed two-person zero-sum games, in which each participant's gains or losses are exactly balanced by the losses and gains of other participants.

Reinforcement learning

Reinforcement learning (RL) is an area of machine learning concerned with how intelligent agents ought to take actions in an environment in order to maximize the notion of cumulative reward. Reinforcement learning is one of three basic machine learning paradigms, alongside supervised learning and unsupervised learning. Reinforcement learning differs from supervised learning in not needing labelled input/output pairs to be presented, and in not needing sub-optimal actions to be explicitly corrected.

Strategy (game theory)

In game theory, a player's strategy is any of the options which they choose in a setting where the outcome depends not only on their own actions but on the actions of others. The discipline mainly concerns the action of a player in a game affecting the behavior or actions of other players. Some examples of "games" include chess, bridge, poker, monopoly, diplomacy or battleship. A player's strategy will determine the action which the player will take at any stage of the game.

Q-learning

Q-learning is a model-free reinforcement learning algorithm to learn the value of an action in a particular state. It does not require a model of the environment (hence "model-free"), and it can handle problems with stochastic transitions and rewards without requiring adaptations. For any finite Markov decision process (FMDP), Q-learning finds an optimal policy in the sense of maximizing the expected value of the total reward over any and all successive steps, starting from the current state.

Deep reinforcement learning

Deep reinforcement learning (deep RL) is a subfield of machine learning that combines reinforcement learning (RL) and deep learning. RL considers the problem of a computational agent learning to make decisions by trial and error. Deep RL incorporates deep learning into the solution, allowing agents to make decisions from unstructured input data without manual engineering of the state space. Deep RL algorithms are able to take in very large inputs (e.g.

Evolutionary game theory

Evolutionary game theory (EGT) is the application of game theory to evolving populations in biology. It defines a framework of contests, strategies, and analytics into which Darwinian competition can be modelled. It originated in 1973 with John Maynard Smith and George R. Price's formalisation of contests, analysed as strategies, and the mathematical criteria that can be used to predict the results of competing strategies. Evolutionary game theory differs from classical game theory in focusing more on the dynamics of strategy change.

Bayesian game

In game theory, a Bayesian game is a strategic decision-making model which assumes players have incomplete information. Players hold private information relevant to the game, meaning that the payoffs are not common knowledge. Bayesian games model the outcome of player interactions using aspects of Bayesian probability. They are notable because they allowed, for the first time in game theory, for the specification of the solutions to games with incomplete information. Hungarian economist John C.

Multi-armed bandit

In probability theory and machine learning, the multi-armed bandit problem (sometimes called the K- or N-armed bandit problem) is a problem in which a fixed limited set of resources must be allocated between competing (alternative) choices in a way that maximizes their expected gain, when each choice's properties are only partially known at the time of allocation, and may become better understood as time passes or by allocating resources to the choice. This is a classic reinforcement learning problem that exemplifies the exploration–exploitation tradeoff dilemma.

Non-credible threat

A non-credible threat is a term used in game theory and economics to describe a threat in a sequential game that a rational player would not actually carry out, because it would not be in his best interest to do so. A threat, and its counterpart - a commitment, are both defined by American economist and Nobel prize winner, T.C. Schelling, who stated that: "A announces that B's behaviour will lead to a response from A. If this response is a reward, then the announcement is a commitment; if this response is a penalty, then the announcement is a threat.

Evolutionarily stable strategy

An evolutionarily stable strategy (ESS) is a strategy (or set of strategies) that is impermeable when adopted by a population in adaptation to a specific environment, that is to say it cannot be displaced by an alternative strategy (or set of strategies) which may be novel or initially rare. Introduced by John Maynard Smith and George R. Price in 1972/3, it is an important concept in behavioural ecology, evolutionary psychology, mathematical game theory and economics, with applications in other fields such as anthropology, philosophy and political science.

Polynomial

In mathematics, a polynomial is an expression consisting of indeterminates (also called variables) and coefficients, that involves only the operations of addition, subtraction, multiplication, and positive-integer powers of variables. An example of a polynomial of a single indeterminate x is x2 − 4x + 7. An example with three indeterminates is x3 + 2xyz2 − yz + 1. Polynomials appear in many areas of mathematics and science.

Polynomial ring

In mathematics, especially in the field of algebra, a polynomial ring or polynomial algebra is a ring (which is also a commutative algebra) formed from the set of polynomials in one or more indeterminates (traditionally also called variables) with coefficients in another ring, often a field. Often, the term "polynomial ring" refers implicitly to the special case of a polynomial ring in one indeterminate over a field. The importance of such polynomial rings relies on the high number of properties that they have in common with the ring of the integers.

Machine learning

Machine learning (ML) is an umbrella term for solving problems for which development of algorithms by human programmers would be cost-prohibitive, and instead the problems are solved by helping machines 'discover' their 'own' algorithms, without needing to be explicitly told what to do by any human-developed algorithms. Recently, generative artificial neural networks have been able to surpass results of many previous approaches.

Software agent

In computer science, a software agent or software AI is a computer program that acts for a user or other program in a relationship of agency, which derives from the Latin agere (to do): an agreement to act on one's behalf. Such "action on behalf of" implies the authority to decide which, if any, action is appropriate. Some agents are colloquially known as bots, from robot. They may be embodied, as when execution is paired with a robot body, or as software such as a chatbot executing on a phone (e.g.

Agent-based model

An agent-based model (ABM) is a computational model for simulating the actions and interactions of autonomous agents (both individual or collective entities such as organizations or groups) in order to understand the behavior of a system and what governs its outcomes. It combines elements of game theory, complex systems, emergence, computational sociology, multi-agent systems, and evolutionary programming. Monte Carlo methods are used to understand the stochasticity of these models.

Signaling game

In game theory, a signaling game is a simple type of a dynamic Bayesian game. The essence of a signalling game is that one player takes an action, the signal, to convey information to another player, where sending the signal is more costly if they are conveying false information. A manufacturer, for example, might provide a warranty for its product in order to signal to consumers that its product is unlikely to break down. The classic example is of a worker who acquires a college degree not because it increases their skill, but because it conveys their ability to employers.