Publication

CPG-RL: Learning Central Pattern Generators for Quadruped Locomotion

Related concepts (27)

Central pattern generator

Central pattern generators (CPGs) are self-organizing biological neural circuits that produce rhythmic outputs in the absence of rhythmic input. They are the source of the tightly coupled patterns of neural activity that drive rhythmic and stereotyped motor behaviors like walking, swimming, breathing, or chewing. Although CPGs can function without input from higher brain areas, they still require modulatory inputs, and their outputs are not fixed. Flexibility in response to sensory input is a fundamental quality of CPG-driven behavior.
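
The idea of rhythmic output without rhythmic input can be illustrated with a toy model. The sketch below is an assumption for illustration only, not the method of the publication: two coupled phase oscillators receive a constant (tonic) drive and settle into an alternating rhythm. The function name `simulate_cpg` and all parameters (`coupling`, `phase_offset`, `drive`) are hypothetical.

```python
import math

def simulate_cpg(steps=5000, dt=0.001, freq_hz=2.0, coupling=5.0,
                 phase_offset=math.pi, drive=1.0):
    """Integrate two coupled phase oscillators with forward Euler.

    `drive` is a hypothetical constant modulatory input that scales the
    oscillation frequency; there is no rhythmic input anywhere.
    """
    theta = [0.0, 0.3]  # arbitrary initial phases in radians
    omega = 2.0 * math.pi * freq_hz * drive
    outputs = []
    for _ in range(steps):
        # Each oscillator is pulled toward a fixed phase lag to its partner.
        d0 = omega + coupling * math.sin(theta[1] - theta[0] - phase_offset)
        d1 = omega + coupling * math.sin(theta[0] - theta[1] + phase_offset)
        theta[0] += d0 * dt
        theta[1] += d1 * dt
        # The cosine of each phase serves as the rhythmic output signal.
        outputs.append((math.cos(theta[0]), math.cos(theta[1])))
    return theta, outputs

theta, outputs = simulate_cpg()
phase_difference = (theta[1] - theta[0]) % (2.0 * math.pi)
print(f"final phase difference: {phase_difference:.4f} rad")
```

With the default `phase_offset` of pi, the two outputs converge to anti-phase, loosely resembling the alternation of a pair of limbs. Changing `drive` changes the rhythm's frequency without adding any rhythmic signal.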

Multi-agent reinforcement learning

Multi-agent reinforcement learning (MARL) is a sub-field of reinforcement learning. It focuses on studying the behavior of multiple learning agents that coexist in a shared environment. Each agent is motivated by its own rewards and takes actions to advance its own interests; in some environments these interests are opposed to the interests of other agents, resulting in complex group dynamics. Multi-agent reinforcement learning is closely related to game theory and especially repeated games, as well as multi-agent systems.

Reinforcement learning

Reinforcement learning (RL) is an area of machine learning concerned with how intelligent agents ought to take actions in an environment in order to maximize the notion of cumulative reward. Reinforcement learning is one of three basic machine learning paradigms, alongside supervised learning and unsupervised learning. Reinforcement learning differs from supervised learning in not needing labelled input/output pairs to be presented, and in not needing sub-optimal actions to be explicitly corrected.
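
One common way to make "cumulative reward" concrete is the discounted return, in which later rewards are weighted by powers of a discount factor. The summary above does not specify a discounting scheme, so the factor `gamma` and the function name `discounted_return` below are assumptions for illustration.

```python
def discounted_return(rewards, gamma=0.99):
    """Return sum over t of gamma**t * rewards[t], accumulated backwards."""
    total = 0.0
    for reward in reversed(rewards):
        # Each step back in time discounts everything that came after it.
        total = reward + gamma * total
    return total

print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))
```

Setting `gamma=1.0` recovers the plain undiscounted sum of rewards.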

Proprioception

Proprioception (/ˌproʊpri.oʊˈsɛpʃən, -ə-/), also called kinaesthesia (or kinesthesia), is the sense of self-movement, force, and body position. Proprioception is mediated by proprioceptors, mechanosensory neurons located within muscles, tendons, and joints. Most animals possess multiple subtypes of proprioceptors, which detect distinct kinematic parameters, such as joint position, movement, and load. Although all mobile animals possess proprioceptors, the structure of the sensory organs can vary across species.

Deep reinforcement learning

Deep reinforcement learning (deep RL) is a subfield of machine learning that combines reinforcement learning (RL) and deep learning. RL considers the problem of a computational agent learning to make decisions by trial and error. Deep RL incorporates deep learning into the solution, allowing agents to make decisions from unstructured input data without manual engineering of the state space. Deep RL algorithms are able to take in very large inputs (e.g.

Reinforcement learning from human feedback

In machine learning, reinforcement learning from human feedback (RLHF), or reinforcement learning from human preferences, is a technique that trains a "reward model" directly from human feedback and uses that model as a reward function to optimize an agent's policy with reinforcement learning (RL), through an optimization algorithm such as Proximal Policy Optimization. The reward model is trained before the policy is optimized, so that it can predict whether a given output is good (high reward) or bad (low reward).
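
The reward-model step can be sketched in a few lines. The example below assumes a pairwise logistic (Bradley-Terry style) preference loss and a linear reward model over hand-made feature vectors. Both are common illustrative choices, not details given in the summary above, and the policy-optimization step is not shown. The names `reward` and `train_reward_model` are hypothetical.

```python
import math

def reward(weights, features):
    """Linear reward model: a weighted sum of the output's features."""
    return sum(w * x for w, x in zip(weights, features))

def train_reward_model(pairs, dim, lr=0.5, epochs=200):
    """Fit weights so that preferred outputs score above rejected ones.

    `pairs` is a list of (preferred_features, rejected_features) tuples
    standing in for human preference labels.
    """
    weights = [0.0] * dim
    for _ in range(epochs):
        for preferred, rejected in pairs:
            margin = reward(weights, preferred) - reward(weights, rejected)
            # Loss is -log(sigmoid(margin)); its negative gradient with
            # respect to the margin is 1 - sigmoid(margin).
            scale = 1.0 - 1.0 / (1.0 + math.exp(-margin))
            for i in range(dim):
                weights[i] += lr * scale * (preferred[i] - rejected[i])
    return weights

# Toy preference data: the first feature is what the labeller likes.
pairs = [([1.0, 0.0], [0.0, 1.0]), ([0.9, 0.2], [0.1, 0.8])]
weights = train_reward_model(pairs, dim=2)
print(weights)
```

After training, `reward(weights, features)` can be used as the scalar reward signal for a separate policy-optimization loop.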

Oscillation

Oscillation is the repetitive or periodic variation, typically in time, of some measure about a central value (often a point of equilibrium) or between two or more different states. Familiar examples of oscillation include a swinging pendulum and alternating current. Oscillations can be used in physics to approximate complex interactions, such as those between atoms.
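
As a minimal sketch of periodic variation about an equilibrium, the code below integrates an idealized harmonic oscillator, x'' = -omega^2 x, which is also the small-angle approximation of a pendulum. The function name `simulate_oscillator`, the semi-implicit Euler integrator, and the parameter values are assumptions chosen for illustration.

```python
import math

def simulate_oscillator(omega=2.0 * math.pi, dt=1e-4, steps=10000,
                        x0=1.0, v0=0.0):
    """Integrate x'' = -omega**2 * x with semi-implicit (symplectic) Euler."""
    x, v = x0, v0
    trajectory = [x]
    for _ in range(steps):
        # Update velocity from the restoring force, then position from velocity.
        v += -omega * omega * x * dt
        x += v * dt
        trajectory.append(x)
    return trajectory

trajectory = simulate_oscillator()
print(f"start: {trajectory[0]:.3f}, half period: {trajectory[5000]:.3f}, "
      f"full period: {trajectory[-1]:.3f}")
```

With the defaults, the period is one second and the run covers exactly one period, so the displacement swings through the equilibrium to the opposite extreme and back close to its starting value.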

Deep learning

Deep learning is part of a broader family of machine learning methods, which is based on artificial neural networks with representation learning. The adjective "deep" in deep learning refers to the use of multiple layers in the network. Methods used can be either supervised, semi-supervised or unsupervised.

Q-learning

Q-learning is a model-free reinforcement learning algorithm to learn the value of an action in a particular state. It does not require a model of the environment (hence "model-free"), and it can handle problems with stochastic transitions and rewards without requiring adaptations. For any finite Markov decision process (FMDP), Q-learning finds an optimal policy in the sense of maximizing the expected value of the total reward over any and all successive steps, starting from the current state.
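
The update rule can be sketched on a tiny example. The code below is a minimal tabular Q-learning sketch on a hypothetical five-state chain, where moving right from the second-to-last state reaches the goal and earns a reward of 1. The environment, the function name `q_learning`, and the hyperparameters `alpha` and `gamma` are assumptions for illustration. Actions are chosen uniformly at random, which is permissible because Q-learning is off-policy.

```python
import random

def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning on a deterministic chain; action 0=left, 1=right."""
    rng = random.Random(seed)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    for _ in range(episodes):
        state = 0
        while state != goal:
            action = rng.randrange(2)  # uniformly random behaviour policy
            next_state = max(state - 1, 0) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            # The terminal state contributes no future value.
            future = 0.0 if next_state == goal else max(q[next_state])
            # Move the estimate toward the one-step bootstrapped target.
            q[state][action] += alpha * (reward + gamma * future - q[state][action])
            state = next_state
    return q

q = q_learning()
greedy_policy = [row.index(max(row)) for row in q[:-1]]
print(greedy_policy)
```

No transition model is used anywhere: the table is updated only from sampled (state, action, reward, next state) tuples, which is what "model-free" refers to.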

Sense of balance

The sense of balance or equilibrioception is the perception of balance and spatial orientation. It helps prevent humans and nonhuman animals from falling over when standing or moving. Equilibrioception is the result of a number of sensory systems working together; the eyes (visual system), the inner ears (vestibular system), and the body's sense of where it is in space (proprioception) ideally need to be intact. The vestibular system, the region of the inner ear where three semicircular canals converge, works with the visual system to keep objects in focus when the head is moving.

Intelligent agent

In artificial intelligence, an intelligent agent (IA) is an agent acting in an intelligent manner: it perceives its environment, takes actions autonomously in order to achieve goals, and may improve its performance by learning or acquiring knowledge. An intelligent agent may be simple or complex: a thermostat or other control system is considered an example of an intelligent agent, as is a human being, as is any system that meets the definition, such as a firm, a state, or a biome.

Terrestrial locomotion

Terrestrial locomotion has evolved as animals adapted from aquatic to terrestrial environments. Locomotion on land raises different problems than locomotion in water, with reduced friction being replaced by the increased effects of gravity. As viewed from evolutionary taxonomy, there are three basic forms of animal locomotion in the terrestrial environment, including legged locomotion (moving by using appendages) and limbless locomotion (moving without legs, primarily using the body itself as a propulsive structure).

Machine learning

Machine learning (ML) is an umbrella term for solving problems for which development of algorithms by human programmers would be cost-prohibitive, and instead the problems are solved by helping machines 'discover' their 'own' algorithms, without needing to be explicitly told what to do by any human-developed algorithms. Recently, generative artificial neural networks have been able to surpass results of many previous approaches.

Google DeepMind

DeepMind Technologies Limited, doing business as Google DeepMind, is a British-American artificial intelligence research laboratory which serves as a subsidiary of Google. Founded in the UK in 2010, it was acquired by Google in 2014, becoming a wholly owned subsidiary of Google parent company Alphabet Inc. after Google's corporate restructuring in 2015. The company is based in London, with research centres in Canada, France, and the United States.

Self-play

Self-play is a technique for improving the performance of reinforcement learning agents. Intuitively, agents learn to improve their performance by playing "against themselves". In multi-agent reinforcement learning experiments, researchers try to optimize the performance of a learning agent on a given task, in cooperation or competition with one or more agents. These agents learn by trial-and-error, and researchers may choose to have the learning algorithm play the role of two or more of the different agents.

Robot locomotion

Robot locomotion is the collective name for the various methods that robots use to transport themselves from place to place. Wheeled robots are typically quite energy efficient and simple to control. However, other forms of locomotion may be more appropriate for a number of reasons, for example when traversing rough terrain or when moving and interacting in human environments. Furthermore, studying bipedal and insect-like robots may benefit biomechanics.

Agent-based social simulation

Agent-based social simulation (or ABSS) consists of social simulations that are based on agent-based modeling and implemented using artificial agent technologies. Agent-based social simulation is a scientific discipline concerned with the simulation of social phenomena using computer-based multiagent models. In these simulations, persons or groups of persons are represented by agents. MABSS is a combination of social science, multiagent simulation and computer simulation.

Animal locomotion

Animal locomotion, in ethology, is any of a variety of methods that animals use to move from one place to another. Some modes of locomotion are (initially) self-propelled, e.g., running, swimming, jumping, flying, hopping, soaring and gliding. There are also many animal species that depend on their environment for transportation, a type of mobility called passive locomotion, e.g., sailing (some jellyfish), kiting (spiders), rolling (some beetles and spiders) or riding other animals (phoresis).

Reticular formation

The reticular formation is a set of interconnected nuclei that are located throughout the brainstem. It is not anatomically well defined, because it includes neurons located in different parts of the brain. The neurons of the reticular formation make up a complex set of networks in the core of the brainstem that extend from the upper part of the midbrain to the lower part of the medulla oblongata. The reticular formation includes ascending pathways to the cortex in the ascending reticular activating system (ARAS) and descending pathways to the spinal cord via the reticulospinal tracts.

Electronic oscillator

An electronic oscillator is an electronic circuit that produces a periodic, oscillating or alternating current (AC) signal, usually a sine wave, square wave or a triangle wave, powered by a direct current (DC) source. Oscillators are found in many electronic devices, such as radio receivers, television sets, radio and television broadcast transmitters, computers, computer peripherals, cellphones, radar, and many other devices.