Publication

Learning Ridge Functions With Randomized Sampling In High Dimensions

Related concepts (27)

In statistics, sampling refers to the methods for selecting a subset of individuals (a sample) from within a population in order to estimate the characteristics of the whole population. This approach has several advantages: the study is restricted to part of the population, it costs less, data collection is faster than if the study had been carried out on the whole population, and destructive testing becomes feasible. The results obtained constitute a sample.

Simple random sample

In statistics, a simple random sample (or SRS) is a subset of individuals (a sample) chosen at random from a larger set (a population), with every individual having the same probability of being chosen. In SRS, each subset of k individuals has the same probability of being chosen for the sample as any other subset of k individuals. A simple random sample is an unbiased sampling technique. Simple random sampling is a basic type of sampling and can be a component of other, more complex sampling methods.
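The defining property above (every subset of k individuals is equally likely) can be illustrated with a minimal sketch. The function name `simple_random_sample`, the toy population, and the seed are assumptions made for illustration; the sketch relies on the standard-library `random.Random.sample`, which draws without replacement.

```python
import random

def simple_random_sample(population, k, seed=None):
    """Draw k distinct individuals; every size-k subset is equally likely."""
    rng = random.Random(seed)          # seeded only to make the sketch reproducible
    return rng.sample(list(population), k)

# Hypothetical population of 100 labelled individuals.
population = list(range(100))
sample = simple_random_sample(population, k=10, seed=0)
```

Sampling without replacement is what makes this an SRS of size k; drawing with replacement would allow repeated individuals.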

Stratified sampling

A stratified random sample is taken by first dividing the population into homogeneous groups (strata), each internally similar and distinct from the others; that is, group 1 differs from group 2. A simple random sample (SRS) is then drawn separately from each stratum, and these SRSs are combined to form the full sample. Stratified random sampling is used to produce unbiased samples.
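The two steps described above (split into strata, then draw a separate simple random sample in each) might be sketched as follows. The names `stratified_sample`, `stratum_of`, and `k_per_stratum`, as well as the toy data, are hypothetical; taking an equal number from each stratum is one possible allocation and is assumed here only for simplicity.

```python
import random
from collections import defaultdict

def stratified_sample(items, stratum_of, k_per_stratum, seed=None):
    """Group items into strata, then draw an SRS from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for item in items:
        strata[stratum_of(item)].append(item)
    sample = []
    for key in sorted(strata):         # fixed order keeps the result reproducible
        members = strata[key]
        sample.extend(rng.sample(members, min(k_per_stratum, len(members))))
    return sample

# Hypothetical population: 50 individuals in "north", 30 in "south".
people = [("north", i) for i in range(50)] + [("south", i) for i in range(30)]
chosen = stratified_sample(people, stratum_of=lambda p: p[0], k_per_stratum=5, seed=1)
```

In practice the per-stratum sample size is often chosen proportionally to the stratum size; that variant is not shown.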

Cluster sampling

In statistics, cluster sampling is a sampling plan used when mutually homogeneous yet internally heterogeneous groupings are evident in a statistical population. It is often used in marketing research. In this sampling plan, the total population is divided into these groups (known as clusters) and a simple random sample of the groups is selected. The elements in each cluster are then sampled. If all elements in each sampled cluster are sampled, then this is referred to as a "one-stage" cluster sampling plan.
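A one-stage cluster sampling plan, as described above, could look like the sketch below: a simple random sample of clusters is drawn, and every element of each selected cluster is kept. The function name `one_stage_cluster_sample` and the school/pupil data are illustrative assumptions.

```python
import random

def one_stage_cluster_sample(clusters, n_clusters, seed=None):
    """Select clusters by SRS, then keep all elements of each selected cluster."""
    rng = random.Random(seed)
    chosen_ids = rng.sample(sorted(clusters), n_clusters)
    return {cid: list(clusters[cid]) for cid in chosen_ids}

# Hypothetical population: 8 schools (clusters) of 20 pupils each.
clusters = {f"school_{i}": [f"pupil_{i}_{j}" for j in range(20)] for i in range(8)}
picked = one_stage_cluster_sample(clusters, n_clusters=3, seed=2)
```

A two-stage plan would additionally subsample the elements inside each selected cluster.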

Survey sampling

In statistics, survey sampling describes the process of selecting a sample of elements from a target population to conduct a survey. The term "survey" may refer to many different types or techniques of observation. In survey sampling it most often involves a questionnaire used to measure the characteristics and/or attitudes of people. The different ways of contacting members of a sample once they have been selected are the subject of survey data collection.

Differentiability

A real function of a real variable is differentiable at a point a when it has a finite derivative at a, that is, intuitively, when it can be approximated sufficiently closely by an affine function in a neighbourhood of a. It is differentiable on a nonempty open real interval if it is differentiable at every point of that interval. It is differentiable on a closed and bounded real interval (that is, a real segment) not reduced to a single point if it is differentiable on the interior of that interval, right-differentiable at its left endpoint, and left-differentiable at its right endpoint.

Nonprobability sampling

Sampling is the use of a subset of the population to represent the whole population or to inform about (social) processes that are meaningful beyond the particular cases, individuals or sites studied. Probability sampling, or random sampling, is a sampling technique in which the probability of getting any particular sample may be calculated. In cases where external validity is not of critical importance to the study's goals or purpose, researchers might prefer to use nonprobability sampling.

Randomization

Randomization is the process of making something random. Randomization is not haphazard; instead, a random process is a sequence of random variables describing a process whose outcomes do not follow a deterministic pattern, but follow an evolution described by probability distributions. For example, a random sample of individuals from a population refers to a sample where every individual has a known probability of being sampled. This would be contrasted with nonprobability sampling where arbitrary individuals are selected.

Non-uniform random variate generation

Non-uniform random variate generation or pseudo-random number sampling is the numerical practice of generating pseudo-random numbers (PRN) that follow a given probability distribution. Methods are typically based on the availability of a uniformly distributed PRN generator. Computational algorithms are then used to manipulate a single random variate, X, or often several such variates, into a new random variate Y such that these values have the required distribution.
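One common way to turn a uniform variate into a variate with a required distribution is inverse transform sampling; it is shown here as one example technique, not as the only method the summary covers. The sketch assumes an exponential target distribution, and the name `exponential_variate` and the rate value are illustrative.

```python
import math
import random

def exponential_variate(rate, rng):
    """Map a uniform variate U in [0, 1) to an Exponential(rate) variate.

    Uses the inverse CDF: F^{-1}(u) = -ln(1 - u) / rate.
    """
    u = rng.random()
    return -math.log(1.0 - u) / rate   # 1 - u lies in (0, 1], so the log is defined

rng = random.Random(3)
draws = [exponential_variate(2.0, rng) for _ in range(20000)]
mean = sum(draws) / len(draws)         # should be close to 1 / rate = 0.5
```

The same pattern applies to any distribution whose inverse CDF is available in closed form; other distributions typically need methods such as rejection sampling.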

Neighbourhood system

In topology and related areas of mathematics, the neighbourhood system, complete system of neighbourhoods, or neighbourhood filter for a point x in a topological space is the collection of all neighbourhoods of x. An open neighbourhood of a point (or subset) x in a topological space X is any open subset U of X that contains x. A neighbourhood of x in X is any subset N of X that contains some open neighbourhood of x; explicitly, N is a neighbourhood of x in X if and only if there exists some open subset U with x ∈ U ⊆ N. Equivalently, a neighbourhood of x is any set that contains x in its topological interior.

Neighbourhood (mathematics)

In mathematics, in a topological space, a neighbourhood of a point is a subset of the space that contains an open set containing that point. It is a central notion in the description of a topological space. In contrast to neighbourhoods, open sets make it possible to define global properties elegantly, such as continuity at every point. For local properties, however, such as continuity at a given point or limits, the notion of neighbourhood (and the corresponding formalism) is better suited.

Continuous function

In mathematics, a continuous function is a function such that a continuous variation (that is, a change without jump) of the argument induces a continuous variation of the value of the function. This means that there are no abrupt changes in value, known as discontinuities. More precisely, a function is continuous if arbitrarily small changes in its value can be assured by restricting to sufficiently small changes of its argument. A discontinuous function is a function that is not continuous.

Nonlinear dimensionality reduction

Nonlinear dimensionality reduction, also known as manifold learning, refers to various related techniques that aim to project high-dimensional data onto lower-dimensional latent manifolds, with the goal of either visualizing the data in the low-dimensional space, or learning the mapping (either from the high-dimensional space to the low-dimensional embedding or vice versa) itself. The techniques described below can be understood as generalizations of linear decomposition methods used for dimensionality reduction, such as singular value decomposition and principal component analysis.

Randomized experiment

In science, randomized experiments are the experiments that allow the greatest reliability and validity of statistical estimates of treatment effects. Randomization-based inference is especially important in experimental design and in survey sampling. In the statistical theory of design of experiments, randomization involves randomly allocating the experimental units across the treatment groups. For example, if an experiment compares a new drug against a standard drug, then the patients should be allocated to either the new drug or to the standard drug control using randomization.
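The drug-comparison example above can be sketched as a random allocation of experimental units to treatment groups. The function name `randomize_allocation`, the group labels, and the patient identifiers are hypothetical; shuffling and then alternating groups is one simple scheme that yields balanced group sizes, assumed here for illustration.

```python
import random

def randomize_allocation(units, groups=("new_drug", "standard_drug"), seed=None):
    """Randomly allocate units to groups with (near-)equal group sizes."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)              # random order removes any systematic pattern
    return {unit: groups[i % len(groups)] for i, unit in enumerate(shuffled)}

# Hypothetical trial with 20 patients.
patients = [f"patient_{i}" for i in range(20)]
allocation = randomize_allocation(patients, seed=4)
```

Because the assignment depends only on the random shuffle, neither the experimenter nor the patients' characteristics influence who receives which treatment.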

Smoothness

In mathematical analysis, the smoothness of a function is a property measured by the number of continuous derivatives it has over some domain, called differentiability class. At the very minimum, a function could be considered smooth if it is differentiable everywhere (hence continuous). At the other end, it might also possess derivatives of all orders in its domain, in which case it is said to be infinitely differentiable and referred to as a C-infinity function (or C^∞ function).

Curse of dimensionality

The curse of dimensionality is a term coined by Richard Bellman in 1961 to describe various phenomena that arise when analysing or organising data in high-dimensional spaces but do not occur in lower-dimensional spaces. Several fields are affected, notably machine learning, data mining, databases, numerical analysis, and sampling.

Semi-differentiability

In calculus, a branch of mathematics, the notions of one-sided differentiability and semi-differentiability of a real-valued function f of a real variable are weaker than differentiability. Specifically, the function f is said to be right differentiable at a point a if, roughly speaking, a derivative can be defined as the function's argument x moves to a from the right, and left differentiable at a if the derivative can be defined as x moves to a from the left.
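A numerical illustration of left and right derivatives, using one-sided difference quotients, is sketched below. The helper `one_sided_derivatives` and the step size `h` are assumptions made for illustration; finite differences only approximate the limits in the definition and do not prove that they exist. The absolute value function at 0 is used because it is left and right differentiable there but not differentiable.

```python
def one_sided_derivatives(f, a, h=1e-6):
    """Approximate the left and right derivatives of f at a by difference quotients."""
    right = (f(a + h) - f(a)) / h      # x approaches a from the right
    left = (f(a) - f(a - h)) / h       # x approaches a from the left
    return left, right

# |x| at 0: the two one-sided slopes differ, so |x| is not differentiable at 0.
left, right = one_sided_derivatives(abs, 0.0)
```

When the two values agree (up to numerical error), the function is a candidate for being differentiable at the point; when they differ, as here, it is only semi-differentiable.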

Machine learning

Machine learning, also called artificial learning or statistical learning, is a field of study in artificial intelligence that relies on mathematical and statistical approaches to give computers the ability to "learn" from data, that is, to improve their performance at solving tasks without being explicitly programmed for each one. More broadly, it concerns the design, analysis, optimization, development, and implementation of such methods.

Random number generator

A random number generator (RNG) is a device capable of producing a sequence of numbers in which there is no computable link between a number and its predecessors, so that the sequence can be called a "sequence of random numbers". By extension, the term is also used for pseudorandom number generators, for which such a computable link exists but cannot "easily" be deduced.

Non-analytic smooth function

In mathematics, smooth functions (i.e. infinitely differentiable functions) and analytic functions are two common and important types of functions. While one can prove that every real analytic function is smooth, the converse is false. One application of smooth functions with compact support is the construction of mollifiers, which are used in the theory of generalized functions, such as Laurent Schwartz's theory of distributions.