Publication

Complex Dynamics in Simple Neural Networks: Understanding Gradient Flow in Phase Retrieval

Related concepts (26)

In mathematics, gradient descent (also often called steepest descent) is a iterative optimization algorithm for finding a local minimum of a differentiable function. The idea is to take repeated steps in the opposite direction of the gradient (or approximate gradient) of the function at the current point, because this is the direction of steepest descent. Conversely, stepping in the direction of the gradient will lead to a local maximum of that function; the procedure is then known as gradient ascent.

Stochastic gradient descent

Stochastic gradient descent (often abbreviated SGD) is an iterative method for optimizing an objective function with suitable smoothness properties (e.g. differentiable or subdifferentiable). It can be regarded as a stochastic approximation of gradient descent optimization, since it replaces the actual gradient (calculated from the entire data set) by an estimate thereof (calculated from a randomly selected subset of the data).

Global optimization

Global optimization is a branch of applied mathematics and numerical analysis that attempts to find the global minima or maxima of a function or a set of functions on a given set. It is usually described as a minimization problem because the maximization of the real-valued function is equivalent to the minimization of the function . Given a possibly nonlinear and non-convex continuous function with the global minima and the set of all global minimizers in , the standard minimization problem can be given as that is, finding and a global minimizer in ; where is a (not necessarily convex) compact set defined by inequalities .

Newton's method in optimization

In calculus, Newton's method (also called Newton–Raphson) is an iterative method for finding the roots of a differentiable function F, which are solutions to the equation F (x) = 0. As such, Newton's method can be applied to the derivative f ′ of a twice-differentiable function f to find the roots of the derivative (solutions to f ′(x) = 0), also known as the critical points of f. These solutions may be minima, maxima, or saddle points; see section "Several variables" in Critical point (mathematics) and also section "Geometric interpretation" in this article.

Maximum and minimum

In mathematical analysis, the maximum and minimum of a function are, respectively, the largest and smallest value taken by the function. Known generically as extremum, they may be defined either within a given range (the local or relative extrema) or on the entire domain (the global or absolute extrema) of a function. Pierre de Fermat was one of the first mathematicians to propose a general technique, adequality, for finding the maxima and minima of functions.

Reinforcement learning

Reinforcement learning (RL) is an area of machine learning concerned with how intelligent agents ought to take actions in an environment in order to maximize the notion of cumulative reward. Reinforcement learning is one of three basic machine learning paradigms, alongside supervised learning and unsupervised learning. Reinforcement learning differs from supervised learning in not needing labelled input/output pairs to be presented, and in not needing sub-optimal actions to be explicitly corrected.

Statistical mechanics

In physics, statistical mechanics is a mathematical framework that applies statistical methods and probability theory to large assemblies of microscopic entities. It does not assume or postulate any natural laws, but explains the macroscopic behavior of nature from the behavior of such ensembles. Sometimes called statistical physics or statistical thermodynamics, its applications include many problems in the fields of physics, biology, chemistry, and neuroscience.

Mathematical optimization

Mathematical optimization (alternatively spelled optimisation) or mathematical programming is the selection of a best element, with regard to some criterion, from some set of available alternatives. It is generally divided into two subfields: discrete optimization and continuous optimization. Optimization problems arise in all quantitative disciplines from computer science and engineering to operations research and economics, and the development of solution methods has been of interest in mathematics for centuries.

Critical point (mathematics)

Critical point is a wide term used in many branches of mathematics. When dealing with functions of a real variable, a critical point is a point in the domain of the function where the function is either not differentiable or the derivative is equal to zero. When dealing with complex variables, a critical point is, similarly, a point in the function's domain where it is either not holomorphic or the derivative is equal to zero. Likewise, for a function of several real variables, a critical point is a value in its domain where the gradient is undefined or is equal to zero.

Endorheic basin

An endorheic basin (ˌɛndoʊˈriː.ɪk; also endoreic basin and endorreic basin) is a drainage basin that normally retains water and allows no outflow to other, external bodies of water (e.g. rivers and oceans), instead, the water drainage flows into permanent and seasonal lakes and swamps that equilibrate through evaporation. Endorheic basins also are called closed basins, terminal basins, and internal drainage systems. Endorheic regions contrast with open lakes (exorheic regions), where surface waters eventually drain into the ocean.

Dimensionality reduction

Dimensionality reduction, or dimension reduction, is the transformation of data from a high-dimensional space into a low-dimensional space so that the low-dimensional representation retains some meaningful properties of the original data, ideally close to its intrinsic dimension. Working in high-dimensional spaces can be undesirable for many reasons; raw data are often sparse as a consequence of the curse of dimensionality, and analyzing the data is usually computationally intractable (hard to control or deal with).

Drainage basin

A drainage basin is an area of land where all flowing surface water converges to a single point, such as a river mouth, or flows into another body of water, such as a lake or ocean. A basin is separated from adjacent basins by a perimeter, the drainage divide, made up of a succession of elevated features, such as ridges and hills. A basin may consist of smaller basins that merge at river confluences, forming a hierarchical pattern. Other terms for a drainage basin are catchment area, catchment basin, drainage area, river basin, water basin, and impluvium.

Fluid dynamics

In physics, physical chemistry and engineering, fluid dynamics is a subdiscipline of fluid mechanics that describes the flow of fluids—liquids and gases. It has several subdisciplines, including aerodynamics (the study of air and other gases in motion) and hydrodynamics (the study of liquids in motion). Fluid dynamics has a wide range of applications, including calculating forces and moments on aircraft, determining the mass flow rate of petroleum through pipelines, predicting weather patterns, understanding nebulae in interstellar space and modelling fission weapon detonation.

Critical exponent

Critical exponents describe the behavior of physical quantities near continuous phase transitions. It is believed, though not proven, that they are universal, i.e. they do not depend on the details of the physical system, but only on some of its general features. For instance, for ferromagnetic systems, the critical exponents depend only on: the dimension of the system the range of the interaction the spin dimension These properties of critical exponents are supported by experimental data.

Nonlinear dimensionality reduction

Nonlinear dimensionality reduction, also known as manifold learning, refers to various related techniques that aim to project high-dimensional data onto lower-dimensional latent manifolds, with the goal of either visualizing the data in the low-dimensional space, or learning the mapping (either from the high-dimensional space to the low-dimensional embedding or vice versa) itself. The techniques described below can be understood as generalizations of linear decomposition methods used for dimensionality reduction, such as singular value decomposition and principal component analysis.

Multi-objective optimization

Multi-objective optimization or Pareto optimization (also known as multi-objective programming, vector optimization, multicriteria optimization, or multiattribute optimization) is an area of multiple-criteria decision making that is concerned with mathematical optimization problems involving more than one objective function to be optimized simultaneously. Multi-objective is a type of vector optimization that has been applied in many fields of science, including engineering, economics and logistics where optimal decisions need to be taken in the presence of trade-offs between two or more conflicting objectives.

Calculus of variations

The calculus of variations (or variational calculus) is a field of mathematical analysis that uses variations, which are small changes in functions and functionals, to find maxima and minima of functionals: mappings from a set of functions to the real numbers. Functionals are often expressed as definite integrals involving functions and their derivatives. Functions that maximize or minimize functionals may be found using the Euler–Lagrange equation of the calculus of variations.

Critical phenomena

In physics, critical phenomena is the collective name associated with the physics of critical points. Most of them stem from the divergence of the correlation length, but also the dynamics slows down. Critical phenomena include scaling relations among different quantities, power-law divergences of some quantities (such as the magnetic susceptibility in the ferromagnetic phase transition) described by critical exponents, universality, fractal behaviour, and ergodicity breaking.

Curse of dimensionality

The curse of dimensionality refers to various phenomena that arise when analyzing and organizing data in high-dimensional spaces that do not occur in low-dimensional settings such as the three-dimensional physical space of everyday experience. The expression was coined by Richard E. Bellman when considering problems in dynamic programming. Dimensionally cursed phenomena occur in domains such as numerical analysis, sampling, combinatorics, machine learning, data mining and databases.

Non-linear least squares

Non-linear least squares is the form of least squares analysis used to fit a set of m observations with a model that is non-linear in n unknown parameters (m ≥ n). It is used in some forms of nonlinear regression. The basis of the method is to approximate the model by a linear one and to refine the parameters by successive iterations. There are many similarities to linear least squares, but also some significant differences.