Publication

On the linear convergence of the stochastic gradient method with constant step-size

Related concepts (18)

Stochastic gradient descent (often abbreviated SGD) is an iterative method for optimizing an objective function with suitable smoothness properties (e.g. differentiable or subdifferentiable). It can be regarded as a stochastic approximation of gradient descent optimization, since it replaces the actual gradient (calculated from the entire data set) by an estimate thereof (calculated from a randomly selected subset of the data).

Gradient descent

In mathematics, gradient descent (also often called steepest descent) is a iterative optimization algorithm for finding a local minimum of a differentiable function. The idea is to take repeated steps in the opposite direction of the gradient (or approximate gradient) of the function at the current point, because this is the direction of steepest descent. Conversely, stepping in the direction of the gradient will lead to a local maximum of that function; the procedure is then known as gradient ascent.

Neumann boundary condition

In mathematics, the Neumann (or second-type) boundary condition is a type of boundary condition, named after Carl Neumann. When imposed on an ordinary or a partial differential equation, the condition specifies the values of the derivative applied at the boundary of the domain. It is possible to describe the problem using other boundary conditions: a Dirichlet boundary condition specifies the values of the solution itself (as opposed to its derivative) on the boundary, whereas the Cauchy boundary condition, mixed boundary condition and Robin boundary condition are all different types of combinations of the Neumann and Dirichlet boundary conditions.

Online machine learning

In computer science, online machine learning is a method of machine learning in which data becomes available in a sequential order and is used to update the best predictor for future data at each step, as opposed to batch learning techniques which generate the best predictor by learning on the entire training data set at once. Online learning is a common technique used in areas of machine learning where it is computationally infeasible to train over the entire dataset, requiring the need of out-of-core algorithms.

Dirichlet boundary condition

In the mathematical study of differential equations, the Dirichlet (or first-type) boundary condition is a type of boundary condition, named after Peter Gustav Lejeune Dirichlet (1805–1859). When imposed on an ordinary or a partial differential equation, it specifies the values that a solution needs to take along the boundary of the domain. In finite element method (FEM) analysis, essential or Dirichlet boundary condition is defined by weighted-integral form of a differential equation.

Robin boundary condition

In mathematics, the Robin boundary condition (ˈrɒbɪn; properly ʁɔbɛ̃), or third type boundary condition, is a type of boundary condition, named after Victor Gustave Robin (1855–1897). When imposed on an ordinary or a partial differential equation, it is a specification of a linear combination of the values of a function and the values of its derivative on the boundary of the domain. Other equivalent names in use are Fourier-type condition and radiation condition.

Reinforcement learning

Reinforcement learning (RL) is an area of machine learning concerned with how intelligent agents ought to take actions in an environment in order to maximize the notion of cumulative reward. Reinforcement learning is one of three basic machine learning paradigms, alongside supervised learning and unsupervised learning. Reinforcement learning differs from supervised learning in not needing labelled input/output pairs to be presented, and in not needing sub-optimal actions to be explicitly corrected.

Topologies on spaces of linear maps

In mathematics, particularly functional analysis, spaces of linear maps between two vector spaces can be endowed with a variety of topologies. Studying space of linear maps and these topologies can give insight into the spaces themselves. The article operator topologies discusses topologies on spaces of linear maps between normed spaces, whereas this article discusses topologies on such spaces in the more general setting of topological vector spaces (TVSs).

Learning rate

In machine learning and statistics, the learning rate is a tuning parameter in an optimization algorithm that determines the step size at each iteration while moving toward a minimum of a loss function. Since it influences to what extent newly acquired information overrides old information, it metaphorically represents the speed at which a machine learning model "learns". In the adaptive control literature, the learning rate is commonly referred to as gain. In setting a learning rate, there is a trade-off between the rate of convergence and overshooting.

Regularization (mathematics)

In mathematics, statistics, finance, computer science, particularly in machine learning and inverse problems, regularization is a process that changes the result answer to be "simpler". It is often used to obtain results for ill-posed problems or to prevent overfitting. Although regularization procedures can be divided in many ways, the following delineation is particularly helpful: Explicit regularization is regularization whenever one explicitly adds a term to the optimization problem.

Stochastic optimization

Stochastic optimization (SO) methods are optimization methods that generate and use random variables. For stochastic problems, the random variables appear in the formulation of the optimization problem itself, which involves random objective functions or random constraints. Stochastic optimization methods also include methods with random iterates. Some stochastic optimization methods use random iterates to solve stochastic problems, combining both meanings of stochastic optimization.

Convex function

In mathematics, a real-valued function is called convex if the line segment between any two distinct points on the graph of the function lies above the graph between the two points. Equivalently, a function is convex if its epigraph (the set of points on or above the graph of the function) is a convex set. A twice-differentiable function of a single variable is convex if and only if its second derivative is nonnegative on its entire domain.

Artificial neural network

Artificial neural networks (ANNs, also shortened to neural networks (NNs) or neural nets) are a branch of machine learning models that are built using principles of neuronal organization discovered by connectionism in the biological neural networks constituting animal brains. An ANN is based on a collection of connected units or nodes called artificial neurons, which loosely model the neurons in a biological brain. Each connection, like the synapses in a biological brain, can transmit a signal to other neurons.

Economic growth

Economic growth can be defined as the increase or improvement in the inflation-adjusted market value of the goods and services produced by an economy in a financial year. Statisticians conventionally measure such growth as the percent rate of increase in the real and nominal gross domestic product (GDP). Growth is usually calculated in real terms – i.e., inflation-adjusted terms – to eliminate the distorting effect of inflation on the prices of goods produced. Measurement of economic growth uses national income accounting.

Finite element method

The finite element method (FEM) is a popular method for numerically solving differential equations arising in engineering and mathematical modeling. Typical problem areas of interest include the traditional fields of structural analysis, heat transfer, fluid flow, mass transport, and electromagnetic potential. The FEM is a general numerical method for solving partial differential equations in two or three space variables (i.e., some boundary value problems).

Lasso (statistics)

In statistics and machine learning, lasso (least absolute shrinkage and selection operator; also Lasso or LASSO) is a regression analysis method that performs both variable selection and regularization in order to enhance the prediction accuracy and interpretability of the resulting statistical model. It was originally introduced in geophysics, and later by Robert Tibshirani, who coined the term. Lasso was originally formulated for linear regression models. This simple case reveals a substantial amount about the estimator.

Modes of convergence

In mathematics, there are many senses in which a sequence or a series is said to be convergent. This article describes various modes (senses or species) of convergence in the settings where they are defined. For a list of modes of convergence, see Modes of convergence (annotated index) Note that each of the following objects is a special case of the types preceding it: sets, topological spaces, uniform spaces, TAGs (topological abelian groups), normed spaces, Euclidean spaces, and the real/complex numbers.

Discontinuous linear map

In mathematics, linear maps form an important class of "simple" functions which preserve the algebraic structure of linear spaces and are often used as approximations to more general functions (see linear approximation). If the spaces involved are also topological spaces (that is, topological vector spaces), then it makes sense to ask whether all linear maps are continuous. It turns out that for maps defined on infinite-dimensional topological vector spaces (e.g.