Linear classifierIn the field of machine learning, the goal of statistical classification is to use an object's characteristics to identify which class (or group) it belongs to. A linear classifier achieves this by making a classification decision based on the value of a linear combination of the characteristics. An object's characteristics are also known as feature values and are typically presented to the machine in a vector called a feature vector.
Probabilistic classificationIn machine learning, a probabilistic classifier is a classifier that is able to predict, given an observation of an input, a probability distribution over a set of classes, rather than only outputting the most likely class that the observation should belong to. Probabilistic classifiers provide classification that can be useful in its own right or when combining classifiers into ensembles. Formally, an "ordinary" classifier is some rule, or function, that assigns to a sample x a class label ŷ: The samples come from some set X (e.
Support vector machineIn machine learning, support vector machines (SVMs, also support vector networks) are supervised learning models with associated learning algorithms that analyze data for classification and regression analysis. Developed at AT&T Bell Laboratories by Vladimir Vapnik with colleagues (Boser et al., 1992, Guyon et al., 1993, Cortes and Vapnik, 1995, Vapnik et al., 1997) SVMs are one of the most robust prediction methods, being based on statistical learning frameworks or VC theory proposed by Vapnik (1982, 1995) and Chervonenkis (1974).
Naive Bayes classifierIn statistics, naive Bayes classifiers are a family of simple "probabilistic classifiers" based on applying Bayes' theorem with strong (naive) independence assumptions between the features (see Bayes classifier). They are among the simplest Bayesian network models, but coupled with kernel density estimation, they can achieve high accuracy levels. Naive Bayes classifiers are highly scalable, requiring a number of parameters linear in the number of variables (features/predictors) in a learning problem.
Machine learningMachine learning (ML) is an umbrella term for solving problems for which development of algorithms by human programmers would be cost-prohibitive, and instead the problems are solved by helping machines 'discover' their 'own' algorithms, without needing to be explicitly told what to do by any human-developed algorithms. Recently, generative artificial neural networks have been able to surpass results of many previous approaches.
Multiclass classificationIn machine learning and statistical classification, multiclass classification or multinomial classification is the problem of classifying instances into one of three or more classes (classifying instances into one of two classes is called binary classification). While many classification algorithms (notably multinomial logistic regression) naturally permit the use of more than two classes, some are by nature binary algorithms; these can, however, be turned into multinomial classifiers by a variety of strategies.
Statistical classificationIn statistics, classification is the problem of identifying which of a set of categories (sub-populations) an observation (or observations) belongs to. Examples are assigning a given email to the "spam" or "non-spam" class, and assigning a diagnosis to a given patient based on observed characteristics of the patient (sex, blood pressure, presence or absence of certain symptoms, etc.). Often, the individual observations are analyzed into a set of quantifiable properties, known variously as explanatory variables or features.
Online machine learningIn computer science, online machine learning is a method of machine learning in which data becomes available in a sequential order and is used to update the best predictor for future data at each step, as opposed to batch learning techniques which generate the best predictor by learning on the entire training data set at once. Online learning is a common technique used in areas of machine learning where it is computationally infeasible to train over the entire dataset, requiring the need of out-of-core algorithms.
Platt scalingIn machine learning, Platt scaling or Platt calibration is a way of transforming the outputs of a classification model into a probability distribution over classes. The method was invented by John Platt in the context of support vector machines, replacing an earlier method by Vapnik, but can be applied to other classification models. Platt scaling works by fitting a logistic regression model to a classifier's scores. Consider the problem of binary classification: for inputs x, we want to determine whether they belong to one of two classes, arbitrarily labeled +1 and −1.
Therapeutic indexThe therapeutic index (TI; also referred to as therapeutic ratio) is a quantitative measurement of the relative safety of a drug. It is a comparison of the amount of a therapeutic agent that causes the therapeutic effect to the amount that causes toxicity. The related terms therapeutic window or safety window refer to a range of doses optimized between efficacy and toxicity, achieving the greatest therapeutic benefit without resulting in unacceptable side-effects or toxicity.
Rule-based machine learningRule-based machine learning (RBML) is a term in computer science intended to encompass any machine learning method that identifies, learns, or evolves 'rules' to store, manipulate or apply. The defining characteristic of a rule-based machine learner is the identification and utilization of a set of relational rules that collectively represent the knowledge captured by the system. This is in contrast to other machine learners that commonly identify a singular model that can be universally applied to any instance in order to make a prediction.
ProvenceProvence (prəˈvɒ̃s, USalsoprəʊˈ-, UKalsoprɒˈ-, pʁɔvɑ̃s) is a geographical region and historical province of southeastern France, which extends from the left bank of the lower Rhône to the west to the Italian border to the east; it is bordered by the Mediterranean Sea to the south. It largely corresponds with the modern administrative region of Provence-Alpes-Côte d'Azur and includes the departments of Var, Bouches-du-Rhône, Alpes-de-Haute-Provence, as well as parts of Alpes-Maritimes and Vaucluse.
VoltammetryVoltammetry is a category of electroanalytical methods used in analytical chemistry and various industrial processes. In voltammetry, information about an analyte is obtained by measuring the current as the potential is varied. The analytical data for a voltammetric experiment comes in the form of a voltammogram which plots the current produced by the analyte versus the potential of the working electrode. Voltammetry is the study of current as a function of applied potential.
Human serum albuminHuman serum albumin is the serum albumin found in human blood. It is the most abundant protein in human blood plasma; it constitutes about half of serum protein. It is produced in the liver. It is soluble in water, and it is monomeric. Albumin transports hormones, fatty acids, and other compounds, buffers pH, and maintains oncotic pressure, among other functions. Albumin is synthesized in the liver as preproalbumin, which has an N-terminal peptide that is removed before the nascent protein is released from the rough endoplasmic reticulum.