Publication

Multi-agent reinforcement learning for adaptive demand response in smart cities

Related concepts (28)

Reinforcement learning (RL) is an area of machine learning concerned with how intelligent agents ought to take actions in an environment in order to maximize the notion of cumulative reward. Reinforcement learning is one of three basic machine learning paradigms, alongside supervised learning and unsupervised learning. Reinforcement learning differs from supervised learning in not needing labelled input/output pairs to be presented, and in not needing sub-optimal actions to be explicitly corrected.

Deep reinforcement learning

Deep reinforcement learning (deep RL) is a subfield of machine learning that combines reinforcement learning (RL) and deep learning. RL considers the problem of a computational agent learning to make decisions by trial and error. Deep RL incorporates deep learning into the solution, allowing agents to make decisions from unstructured input data without manual engineering of the state space. Deep RL algorithms are able to take in very large inputs (e.g.

Q-learning

Q-learning is a model-free reinforcement learning algorithm to learn the value of an action in a particular state. It does not require a model of the environment (hence "model-free"), and it can handle problems with stochastic transitions and rewards without requiring adaptations. For any finite Markov decision process (FMDP), Q-learning finds an optimal policy in the sense of maximizing the expected value of the total reward over any and all successive steps, starting from the current state.

Supply and demand

In microeconomics, supply and demand is an economic model of price determination in a market. It postulates that, holding all else equal, in a competitive market, the unit price for a particular good, or other traded item such as labor or liquid financial assets, will vary until it settles at a point where the quantity demanded (at the current price) will equal the quantity supplied (at the current price), resulting in an economic equilibrium for price and quantity transacted.

Dynamic demand (electric power)

Dynamic Demand is the name of a semi-passive technology to support demand response by adjusting the load demand on an electrical power grid. It is also the name of an independent not-for-profit organization in the UK supported by a charitable grant from the Esmée Fairbairn Foundation, dedicated to promoting this technology. The concept is that by monitoring the frequency of the power grid, as well as their own controls, intermittent domestic and industrial loads switch themselves on/off at optimal moments to balance the overall grid load with generation, reducing critical power mismatches.

Demand response

Demand response is a change in the power consumption of an electric utility customer to better match the demand for power with the supply. Until the 21st century decrease in the cost of pumped storage and batteries electric energy could not be easily stored, so utilities have traditionally matched demand and supply by throttling the production rate of their power plants, taking generating units on or off line, or importing power from other utilities.

Price discrimination

Price discrimination is a microeconomic pricing strategy where identical or largely similar goods or services are sold at different prices by the same provider in different market segments. Price discrimination is distinguished from product differentiation by the more substantial difference in production cost for the differently priced products involved in the latter strategy. Price differentiation essentially relies on the variation in the customers' willingness to pay and in the elasticity of their demand.

Multi-agent system

A multi-agent system (MAS or "self-organized system") is a computerized system composed of multiple interacting intelligent agents. Multi-agent systems can solve problems that are difficult or impossible for an individual agent or a monolithic system to solve. Intelligence may include methodic, functional, procedural approaches, algorithmic search or reinforcement learning. Despite considerable overlap, a multi-agent system is not always the same as an agent-based model (ABM).

Pricing

Pricing is the process whereby a business sets the price at which it will sell its products and services, and may be part of the business's marketing plan. In setting prices, the business will take into account the price at which it could acquire the goods, the manufacturing cost, the marketplace, competition, market condition, brand, and quality of product. Pricing is a fundamental aspect of product management and is one of the four Ps of the marketing mix, the other three aspects being product, promotion, and place.

Energy demand management

Energy demand management, also known as demand-side management (DSM) or demand-side response (DSR), is the modification of consumer demand for energy through various methods such as financial incentives and behavioral change through education. Usually, the goal of demand-side management is to encourage the consumer to use less energy during peak hours, or to move the time of energy use to off-peak times such as nighttime and weekends.

Intelligent agent

In artificial intelligence, an intelligent agent (IA) is an agent acting in an intelligent manner; It perceives its environment, takes actions autonomously in order to achieve goals, and may improve its performance with learning or acquiring knowledge. An intelligent agent may be simple or complex: A thermostat or other control system is considered an example of an intelligent agent, as is a human being, as is any system that meets the definition, such as a firm, a state, or a biome.

Machine learning

Machine learning (ML) is an umbrella term for solving problems for which development of algorithms by human programmers would be cost-prohibitive, and instead the problems are solved by helping machines 'discover' their 'own' algorithms, without needing to be explicitly told what to do by any human-developed algorithms. Recently, generative artificial neural networks have been able to surpass results of many previous approaches.

Demand

In economics, demand is the quantity of a good that consumers are willing and able to purchase at various prices during a given time. The relationship between price and quantity demand is also called the demand curve. Demand for a specific item is a function of an item's perceived necessity, price, perceived quality, convenience, available alternatives, purchasers' disposable income and tastes, and many other options. Innumerable factors and circumstances affect a consumer's willingness or to buy a good.

Demand curve

In a demand schedule, a demand curve is a graph depicting the relationship between the price of a certain commodity (the y-axis) and the quantity of that commodity that is demanded at that price (the x-axis). Demand curves can be used either for the price-quantity relationship for an individual consumer (an individual demand curve), or for all consumers in a particular market (a market demand curve). It is generally assumed that demand curves slope down, as shown in the adjacent image.

Artificial neural network

Artificial neural networks (ANNs, also shortened to neural networks (NNs) or neural nets) are a branch of machine learning models that are built using principles of neuronal organization discovered by connectionism in the biological neural networks constituting animal brains. An ANN is based on a collection of connected units or nodes called artificial neurons, which loosely model the neurons in a biological brain. Each connection, like the synapses in a biological brain, can transmit a signal to other neurons.

Dynamic pricing

Dynamic pricing, also referred to as surge pricing, demand pricing, or time-based pricing, is a revenue management pricing strategy in which businesses set flexible prices for products or services based on current market demands. Businesses are able to change prices based on algorithms that take into account competitor pricing, supply and demand, and other external factors in the market. Dynamic pricing is a common practice in several industries such as hospitality, tourism, entertainment, retail, electricity, and public transport.

Price signal

A price signal is information conveyed to consumers and producers, via the prices offered or requested for, and the amount requested or offered of a product or service, which provides a signal to increase or decrease quantity supplied or quantity demanded. It also provides potential business opportunities. When a certain kind of product is in shortage supply and the price rises, people will pay more attention to and produce this kind of product.

Electrical grid

An electrical grid is an interconnected network for electricity delivery from producers to consumers. Electrical grids vary in size and can cover whole countries or continents. It consists of: power stations: often located near energy and away from heavily populated areas electrical substations to step voltage up or down electric power transmission to carry power long distances electric power distribution to individual customers, where voltage is stepped down again to the required service voltage(s).

Price elasticity of demand

A good's price elasticity of demand (, PED) is a measure of how sensitive the quantity demanded is to its price. When the price rises, quantity demanded falls for almost any good, but it falls more for some than for others. The price elasticity gives the percentage change in quantity demanded when there is a one percent increase in price, holding everything else constant. If the elasticity is −2, that means a one percent price rise leads to a two percent decline in quantity demanded.

Generative adversarial network

A generative adversarial network (GAN) is a class of machine learning framework and a prominent framework for approaching generative AI. The concept was initially developed by Ian Goodfellow and his colleagues in June 2014. In a GAN, two neural networks contest with each other in the form of a zero-sum game, where one agent's gain is another agent's loss. Given a training set, this technique learns to generate new data with the same statistics as the training set.