Kullback–Leibler divergenceIn mathematical statistics, the Kullback–Leibler divergence (also called relative entropy and I-divergence), denoted , is a type of statistical distance: a measure of how one probability distribution P is different from a second, reference probability distribution Q. A simple interpretation of the KL divergence of P from Q is the expected excess surprise from using Q as a model when the actual distribution is P.
Rényi entropyIn information theory, the Rényi entropy is a quantity that generalizes various notions of entropy, including Hartley entropy, Shannon entropy, collision entropy, and min-entropy. The Rényi entropy is named after Alfréd Rényi, who looked for the most general way to quantify information while preserving additivity for independent events. In the context of fractal dimension estimation, the Rényi entropy forms the basis of the concept of generalized dimensions. The Rényi entropy is important in ecology and statistics as index of diversity.
ParameterA parameter (), generally, is any characteristic that can help in defining or classifying a particular system (meaning an event, project, object, situation, etc.). That is, a parameter is an element of a system that is useful, or critical, when identifying the system, or when evaluating its performance, status, condition, etc. Parameter has more specific meanings within various disciplines, including mathematics, computer programming, engineering, statistics, logic, linguistics, and electronic musical composition.
Scale parameterIn probability theory and statistics, a scale parameter is a special kind of numerical parameter of a parametric family of probability distributions. The larger the scale parameter, the more spread out the distribution. If a family of probability distributions is such that there is a parameter s (and other parameters θ) for which the cumulative distribution function satisfies then s is called a scale parameter, since its value determines the "scale" or statistical dispersion of the probability distribution.
Divergence (statistics)In information geometry, a divergence is a kind of statistical distance: a binary function which establishes the separation from one probability distribution to another on a statistical manifold. The simplest divergence is squared Euclidean distance (SED), and divergences can be viewed as generalizations of SED. The other most important divergence is relative entropy (also called Kullback–Leibler divergence), which is central to information theory.
Library and information scienceLibrary and information science(s) or studies (LIS) is an interdisciplinary field of study that deals generally with organization, access, collection, and protection/regulation of information, whether in physical or digital forms. In spite of various trends to merge the two fields, some consider the two original disciplines, library science and information science, to be separate. However, it is common today to use the terms synonymously or to drop the term "library" and to speak about information departments or I-schools.
Bregman divergenceIn mathematics, specifically statistics and information geometry, a Bregman divergence or Bregman distance is a measure of difference between two points, defined in terms of a strictly convex function; they form an important class of divergences. When the points are interpreted as probability distributions – notably as either values of the parameter of a parametric model or as a data set of observed values – the resulting distance is a statistical distance. The most basic Bregman divergence is the squared Euclidean distance.
Mandelbrot setThe Mandelbrot set (ˈmændəlbroʊt,_-brɒt) is a two dimensional set with a relatively simple definition that exhibits great complexity, especially as it is magnified. It is popular for its aesthetic appeal and fractal structures. The set is defined in the complex plane as the complex numbers for which the function does not diverge to infinity when iterated starting at , i.e., for which the sequence , , etc., remains bounded in absolute value. This set was first defined and drawn by Robert W.
Public libraryA public library is a library, most often a lending library, that is accessible by the general public and is usually funded from public sources, such as taxes. It is operated by librarians and library paraprofessionals, who are also civil servants. There are five fundamental characteristics shared by public libraries: they are generally supported by taxes (usually local, though any level of government can and may contribute); they are governed by a board to serve the public interest; they are open to all, and every community member can access the collection; they are entirely voluntary, no one is ever forced to use the services provided and they provide library and information services without charge.
Sufficient statisticIn statistics, a statistic is sufficient with respect to a statistical model and its associated unknown parameter if "no other statistic that can be calculated from the same sample provides any additional information as to the value of the parameter". In particular, a statistic is sufficient for a family of probability distributions if the sample from which it is calculated gives no additional information than the statistic, as to which of those probability distributions is the sampling distribution.
SequenceIn mathematics, a sequence is an enumerated collection of objects in which repetitions are allowed and order matters. Like a set, it contains members (also called elements, or terms). The number of elements (possibly infinite) is called the length of the sequence. Unlike a set, the same elements can appear multiple times at different positions in a sequence, and unlike a set, the order does matter. Formally, a sequence can be defined as a function from natural numbers (the positions of elements in the sequence) to the elements at each position.
Statistical parameterIn statistics, as opposed to its general use in mathematics, a parameter is any measured quantity of a statistical population that summarises or describes an aspect of the population, such as a mean or a standard deviation. If a population exactly follows a known and defined distribution, for example the normal distribution, then a small set of parameters can be measured which completely describes the population, and can be considered to define a probability distribution for the purposes of extracting samples from this population.
Efficiency (statistics)In statistics, efficiency is a measure of quality of an estimator, of an experimental design, or of a hypothesis testing procedure. Essentially, a more efficient estimator needs fewer input data or observations than a less efficient one to achieve the Cramér–Rao bound. An efficient estimator is characterized by having the smallest possible variance, indicating that there is a small deviance between the estimated value and the "true" value in the L2 norm sense.
Probability distributionIn probability theory and statistics, a probability distribution is the mathematical function that gives the probabilities of occurrence of different possible outcomes for an experiment. It is a mathematical description of a random phenomenon in terms of its sample space and the probabilities of events (subsets of the sample space). For instance, if X is used to denote the outcome of a coin toss ("the experiment"), then the probability distribution of X would take the value 0.5 (1 in 2 or 1/2) for X = heads, and 0.
Characteristic function (probability theory)In probability theory and statistics, the characteristic function of any real-valued random variable completely defines its probability distribution. If a random variable admits a probability density function, then the characteristic function is the Fourier transform of the probability density function. Thus it provides an alternative route to analytical results compared with working directly with probability density functions or cumulative distribution functions.
LibraryA library is a collection of books, and possibly other materials and media, that is accessible for use by its members and members of allied institutions. Libraries provide physical (hard copies) or digital access (soft copies) materials, and may be a physical location or a virtual space, or both. A library's collection normally includes printed materials which can be borrowed, and a reference section of publications which are not permitted to leave the library and can only be viewed inside the premises.
Sequence alignmentIn bioinformatics, a sequence alignment is a way of arranging the sequences of DNA, RNA, or protein to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences. Aligned sequences of nucleotide or amino acid residues are typically represented as rows within a matrix. Gaps are inserted between the residues so that identical or similar characters are aligned in successive columns.
Entropy (information theory)In information theory, the entropy of a random variable is the average level of "information", "surprise", or "uncertainty" inherent to the variable's possible outcomes. Given a discrete random variable , which takes values in the alphabet and is distributed according to : where denotes the sum over the variable's possible values. The choice of base for , the logarithm, varies for different applications. Base 2 gives the unit of bits (or "shannons"), while base e gives "natural units" nat, and base 10 gives units of "dits", "bans", or "hartleys".
Library technicianA library technician is a skilled library and information professional to perform the day-to-day functions of a library, and assists libraries in the acquisition, preparation, and organization of information. They also assist patrons in finding information. The widespread use of computerized information storage and retrieval systems has resulted in library technicians assisting in the handling of technical services (such as cataloguing). Especially in small village libraries, a library technician may be the only person (or one of only a few) staffing the library.
Cumulative distribution functionIn probability theory and statistics, the cumulative distribution function (CDF) of a real-valued random variable , or just distribution function of , evaluated at , is the probability that will take a value less than or equal to . Every probability distribution supported on the real numbers, discrete or "mixed" as well as continuous, is uniquely identified by a right-continuous monotone increasing function (a càdlàg function) satisfying and .