Publication

Reinforcement learning: the effect of environment

Related concepts (32)

Reinforcement learning (RL) is an area of machine learning concerned with how intelligent agents ought to take actions in an environment in order to maximize the notion of cumulative reward. Reinforcement learning is one of three basic machine learning paradigms, alongside supervised learning and unsupervised learning. Reinforcement learning differs from supervised learning in not needing labelled input/output pairs to be presented, and in not needing sub-optimal actions to be explicitly corrected.

Deep reinforcement learning

Deep reinforcement learning (deep RL) is a subfield of machine learning that combines reinforcement learning (RL) and deep learning. RL considers the problem of a computational agent learning to make decisions by trial and error. Deep RL incorporates deep learning into the solution, allowing agents to make decisions from unstructured input data without manual engineering of the state space. Deep RL algorithms are able to take in very large inputs (e.g.

Machine learning

Machine learning (ML) is an umbrella term for solving problems for which development of algorithms by human programmers would be cost-prohibitive, and instead the problems are solved by helping machines 'discover' their 'own' algorithms, without needing to be explicitly told what to do by any human-developed algorithms. Recently, generative artificial neural networks have been able to surpass results of many previous approaches.

Supervised learning

Supervised learning (SL) is a paradigm in machine learning where input objects (for example, a vector of predictor variables) and a desired output value (also known as human-labeled supervisory signal) train a model. The training data is processed, building a function that maps new data on expected output values. An optimal scenario will allow for the algorithm to correctly determine output values for unseen instances. This requires the learning algorithm to generalize from the training data to unseen situations in a "reasonable" way (see inductive bias).

Human

Humans, or modern humans (Homo sapiens), are the most common and widespread species of primate. A great ape characterized by their hairlessness, bipedalism, and high intelligence, humans have a large brain and resulting cognitive skills that enable them to thrive in varied environments and develop complex societies and civilizations. Humans are highly social and tend to live in complex social structures composed of many cooperating and competing groups, from families and kinship networks to political states.

Learning

Learning is the process of acquiring new understanding, knowledge, behaviors, skills, values, attitudes, and preferences. The ability to learn is possessed by humans, animals, and some machines; there is also evidence for some kind of learning in certain plants. Some learning is immediate, induced by a single event (e.g. being burned by a hot stove), but much skill and knowledge accumulate from repeated experiences. The changes induced by learning often last a lifetime, and it is hard to distinguish learned material that seems to be "lost" from that which cannot be retrieved.

Type system

In computer programming, a type system is a logical system comprising a set of rules that assigns a property called a type (for example, integer, floating point, string) to every "term" (a word, phrase, or other set of symbols). Usually the terms are various constructs of a computer program, such as variables, expressions, functions, or modules. A type system dictates the operations that can be performed on a term. For variables, the type system determines the allowed values of that term.

Chess

Chess is a board game for two players, called White and Black, each controlling an army of chess pieces in their color, with the objective to checkmate the opponent's king. It is sometimes called international chess or Western chess to distinguish it from related games, such as xiangqi (Chinese chess) and shogi (Japanese chess). The recorded history of chess goes back at least to the emergence of a similar game, chaturanga, in seventh century India.

Chess.com

Chess.com is an internet chess server and social networking website. The site has a freemium model in which some features are available for free, and others are available for accounts with subscriptions. Live online chess can be played against other users in daily, rapid, blitz or bullet time controls, with a number of chess variants also available. Chess versus a chess engine, computer analysis, chess puzzles and teaching resources are also offered.

Q-learning

Q-learning is a model-free reinforcement learning algorithm to learn the value of an action in a particular state. It does not require a model of the environment (hence "model-free"), and it can handle problems with stochastic transitions and rewards without requiring adaptations. For any finite Markov decision process (FMDP), Q-learning finds an optimal policy in the sense of maximizing the expected value of the total reward over any and all successive steps, starting from the current state.

Type theory

In mathematics, logic, and computer science, a type theory is the formal presentation of a specific type system, and in general, type theory is the academic study of type systems. Some type theories serve as alternatives to set theory as a foundation of mathematics. Two influential type theories that were proposed as foundations are Alonzo Church's typed λ-calculus and Per Martin-Löf's intuitionistic type theory. Most computerized proof-writing systems use a type theory for their foundation, a common one is Thierry Coquand's Calculus of Inductive Constructions.

Computer chess

Computer chess includes both hardware (dedicated computers) and software capable of playing chess. Computer chess provides opportunities for players to practice even in the absence of human opponents, and also provides opportunities for analysis, entertainment and training. Computer chess applications that play at the level of a chess master or higher are available on hardware from supercomputers to smart phones. Standalone chess-playing machines are also available.

Type safety

In computer science, type safety and type soundness are the extent to which a programming language discourages or prevents type errors. Type safety is sometimes alternatively considered to be a property of facilities of a computer language; that is, some facilities are type-safe and their usage will not result in type errors, while other facilities in the same language may be type-unsafe and a program using them may encounter type errors.

Chess variant

A chess variant is a game related to, derived from, or inspired by chess. Such variants can differ from chess in many different ways. "International" or "Western" chess itself is one of a family of games which have related origins and could be considered variants of each other. Chess developed from chaturanga, from which other members of this family, such as ouk chatrang, shatranj, Tamerlane chess, shogi, and xiangqi also evolved. Many chess variants are designed to be played with the equipment of regular chess.

Chess engine

In computer chess, a chess engine is a computer program that analyzes chess or chess variant positions, and generates a move or list of moves that it regards as strongest. A chess engine is usually a back end with a command-line interface with no graphics or windowing. Engines are usually used with a front end, a windowed graphical user interface such as Chessbase or WinBoard that the user can interact with via a keyboard, mouse or touchscreen.

Evaluation strategy

In a programming language, an evaluation strategy is a set of rules for evaluating expressions. The term is often used to refer to the more specific notion of a parameter-passing strategy that defines the kind of value that is passed to the function for each parameter (the binding strategy) and whether to evaluate the parameters of a function call, and if so in what order (the evaluation order). The notion of reduction strategy is distinct, although some authors conflate the two terms and the definition of each term is not widely agreed upon.

Feature learning

In machine learning, feature learning or representation learning is a set of techniques that allows a system to automatically discover the representations needed for feature detection or classification from raw data. This replaces manual feature engineering and allows a machine to both learn the features and use them to perform a specific task. Feature learning is motivated by the fact that machine learning tasks such as classification often require input that is mathematically and computationally convenient to process.

Type inference

Type inference refers to the automatic detection of the type of an expression in a formal language. These include programming languages and mathematical type systems, but also natural languages in some branches of computer science and linguistics. Types in a most general view can be associated to a designated use suggesting and restricting the activities possible for an object of that type. Many nouns in language specify such uses. For instance, the word leash indicates a different use than the word line.

Image segmentation

In and computer vision, image segmentation is the process of partitioning a into multiple image segments, also known as image regions or image objects (sets of pixels). The goal of segmentation is to simplify and/or change the representation of an image into something that is more meaningful and easier to analyze. Image segmentation is typically used to locate objects and boundaries (lines, curves, etc.) in images. More precisely, image segmentation is the process of assigning a label to every pixel in an image such that pixels with the same label share certain characteristics.

Unsupervised learning

Unsupervised learning, is paradigm in machine learning where, in contrast to supervised learning and semi-supervised learning, algorithms learn patterns exclusively from unlabeled data. Neural network tasks are often categorized as discriminative (recognition) or generative (imagination). Often but not always, discriminative tasks use supervised methods and generative tasks use unsupervised (see Venn diagram); however, the separation is very hazy. For example, object recognition favors supervised learning but unsupervised learning can also cluster objects into groups.