Publication

Team Policy Learning For Multi-Agent Reinforcement Learning

Related concepts (25)

Reinforcement learning (RL) is an area of machine learning concerned with how intelligent agents ought to take actions in an environment in order to maximize the notion of cumulative reward. Reinforcement learning is one of three basic machine learning paradigms, alongside supervised learning and unsupervised learning. Reinforcement learning differs from supervised learning in not needing labelled input/output pairs to be presented, and in not needing sub-optimal actions to be explicitly corrected.

Deep reinforcement learning

Deep reinforcement learning (deep RL) is a subfield of machine learning that combines reinforcement learning (RL) and deep learning. RL considers the problem of a computational agent learning to make decisions by trial and error. Deep RL incorporates deep learning into the solution, allowing agents to make decisions from unstructured input data without manual engineering of the state space. Deep RL algorithms are able to take in very large inputs (e.g.

Multi-agent system

A multi-agent system (MAS or "self-organized system") is a computerized system composed of multiple interacting intelligent agents. Multi-agent systems can solve problems that are difficult or impossible for an individual agent or a monolithic system to solve. Intelligence may include methodic, functional, procedural approaches, algorithmic search or reinforcement learning. Despite considerable overlap, a multi-agent system is not always the same as an agent-based model (ABM).

Q-learning

Q-learning is a model-free reinforcement learning algorithm to learn the value of an action in a particular state. It does not require a model of the environment (hence "model-free"), and it can handle problems with stochastic transitions and rewards without requiring adaptations. For any finite Markov decision process (FMDP), Q-learning finds an optimal policy in the sense of maximizing the expected value of the total reward over any and all successive steps, starting from the current state.

Intelligent agent

In artificial intelligence, an intelligent agent (IA) is an agent acting in an intelligent manner; It perceives its environment, takes actions autonomously in order to achieve goals, and may improve its performance with learning or acquiring knowledge. An intelligent agent may be simple or complex: A thermostat or other control system is considered an example of an intelligent agent, as is a human being, as is any system that meets the definition, such as a firm, a state, or a biome.

Interactive film

An interactive film is a video game or other interactive media that has characteristics of a cinematic film. In the video game industry, the term refers to a movie game, a video game that presents its gameplay in a cinematic, scripted manner, often through the use of full-motion video of either animated or live-action footage. In the film industry, the term "interactive film" refers to interactive cinema, a film where one or more viewers can interact with the film and influence the events that unfold in the film.

Multiplayer video game

A multiplayer video game is a video game in which more than one person can play in the same game environment at the same time, either locally on the same computing system (couch co-op), on different computing systems via a local area network, or via a wide area network, most commonly the Internet (e.g. World of Warcraft, Call of Duty, DayZ). Multiplayer games usually require players to share a single game system or use networking technology to play together over a greater distance; players may compete against one or more human contestants, work cooperatively with a human partner to achieve a common goal, or supervise other players' activity.

Monetary policy

Monetary policy is the policy adopted by the monetary authority of a nation to affect monetary and other financial conditions to accomplish broader objectives like high employment and price stability (normally interpreted as a low and stable rate of inflation). Further purposes of a monetary policy may be to contribute to economic stability or to maintain predictable exchange rates with other currencies.

Massively multiplayer online role-playing game

A massively multiplayer online role-playing game (MMORPG) is a video game that combines aspects of a role-playing video game and a massively multiplayer online game. As in role-playing games (RPGs), the player assumes the role of a character (often in a fantasy world or science-fiction world) and takes control over many of that character's actions. MMORPGs are distinguished from single-player or small multi-player online RPGs by the number of players able to interact together, and by the game's persistent world (usually hosted by the game's publisher), which continues to exist and evolve while the player is offline and away from the game.

Massively multiplayer online game

A massively multiplayer online game (MMOG or more commonly MMO) is an online video game with a large number of players on the same server. MMOs usually feature a huge, persistent open world, although there are games that differ. These games can be found for most network-capable platforms, including the personal computer, video game console, or smartphones and other mobile devices. MMOs can enable players to cooperate and compete with each other on a large scale, and sometimes to interact meaningfully with people around the world.

Optimal control

Optimal control theory is a branch of mathematical optimization that deals with finding a control for a dynamical system over a period of time such that an objective function is optimized. It has numerous applications in science, engineering and operations research. For example, the dynamical system might be a spacecraft with controls corresponding to rocket thrusters, and the objective might be to reach the moon with minimum fuel expenditure.

Interactive fiction

Interactive fiction, often abbreviated IF, is software simulating environments in which players use text commands to control characters and influence the environment. Works in this form can be understood as literary narratives, either in the form of interactive narratives or interactive narrations. These works can also be understood as a form of video game, either in the form of an adventure game or role-playing game.

Support vector machine

In machine learning, support vector machines (SVMs, also support vector networks) are supervised learning models with associated learning algorithms that analyze data for classification and regression analysis. Developed at AT&T Bell Laboratories by Vladimir Vapnik with colleagues (Boser et al., 1992, Guyon et al., 1993, Cortes and Vapnik, 1995, Vapnik et al., 1997) SVMs are one of the most robust prediction methods, being based on statistical learning frameworks or VC theory proposed by Vapnik (1982, 1995) and Chervonenkis (1974).

Multi-armed bandit

In probability theory and machine learning, the multi-armed bandit problem (sometimes called the K- or N-armed bandit problem) is a problem in which a fixed limited set of resources must be allocated between competing (alternative) choices in a way that maximizes their expected gain, when each choice's properties are only partially known at the time of allocation, and may become better understood as time passes or by allocating resources to the choice. This is a classic reinforcement learning problem that exemplifies the exploration–exploitation tradeoff dilemma.

Goal

A goal or objective is an idea of the future or desired result that a person or a group of people envision, plan and commit to achieve. People endeavour to reach goals within a finite time by setting deadlines. A goal is roughly similar to a purpose or aim, the anticipated result which guides reaction, or an end, which is an object, either a physical object or an abstract object, that has intrinsic value. Goal setting Goal-setting theory was formulated based on empirical research and has been called one of the most important theories in organizational psychology.

Error function

In mathematics, the error function (also called the Gauss error function), often denoted by erf, is a complex function of a complex variable defined as: Some authors define without the factor of . This nonelementary integral is a sigmoid function that occurs often in probability, statistics, and partial differential equations. In many of these applications, the function argument is a real number. If the function argument is real, then the function value is also real.

Genetic algorithm

In computer science and operations research, a genetic algorithm (GA) is a metaheuristic inspired by the process of natural selection that belongs to the larger class of evolutionary algorithms (EA). Genetic algorithms are commonly used to generate high-quality solutions to optimization and search problems by relying on biologically inspired operators such as mutation, crossover and selection. Some examples of GA applications include optimizing decision trees for better performance, solving sudoku puzzles, hyperparameter optimization, causal inference, etc.

Team

A team is a group of individuals (human or non-human) working together to achieve their goal. As defined by Professor Leigh Thompson of the Kellogg School of Management, "[a] team is a group of people who are interdependent with respect to information, resources, knowledge and skills and who seek to combine their efforts to achieve a common goal". A group does not necessarily constitute a team. Teams normally have members with complementary skills and generate synergy through a coordinated effort which allows each member to maximize their strengths and minimize their weaknesses.

Algorithm

In mathematics and computer science, an algorithm (ˈælɡərɪðəm) is a finite sequence of rigorous instructions, typically used to solve a class of specific problems or to perform a computation. Algorithms are used as specifications for performing calculations and data processing. More advanced algorithms can use conditionals to divert the code execution through various routes (referred to as automated decision-making) and deduce valid inferences (referred to as automated reasoning), achieving automation eventually.

Gamma function

In mathematics, the gamma function (represented by Γ, the capital letter gamma from the Greek alphabet) is one commonly used extension of the factorial function to complex numbers. The gamma function is defined for all complex numbers except the non-positive integers. For every positive integer n, Derived by Daniel Bernoulli, for complex numbers with a positive real part, the gamma function is defined via a convergent improper integral: The gamma function then is defined as the analytic continuation of this integral function to a meromorphic function that is holomorphic in the whole complex plane except zero and the negative integers, where the function has simple poles.