Publication

Variably Scaled Kernels Improve Classification of Hormonally-Treated Patient-Derived Xenografts

Related concepts (28)

Random forests or random decision forests is an ensemble learning method for classification, regression and other tasks that operates by constructing a multitude of decision trees at training time. For classification tasks, the output of the random forest is the class selected by most trees. For regression tasks, the mean or average prediction of the individual trees is returned. Random decision forests correct for decision trees' habit of overfitting to their training set.

Machine learning

Machine learning (ML) is an umbrella term for solving problems for which development of algorithms by human programmers would be cost-prohibitive, and instead the problems are solved by helping machines 'discover' their 'own' algorithms, without needing to be explicitly told what to do by any human-developed algorithms. Recently, generative artificial neural networks have been able to surpass results of many previous approaches.

Peptide hormone

Peptide hormones are hormones whose molecules are peptides. Peptide hormones have shorter amino acid chain lengths than protein hormones. These hormones have an effect on the endocrine system of animals, including humans. Most hormones can be classified as either amino acid–based hormones (amine, peptide, or protein) or steroid hormones. The former are water-soluble and act on the surface of target cells via second messengers; the latter, being lipid-soluble, move through the plasma membranes of target cells (both cytoplasmic and nuclear) to act within their nuclei.

Supervised learning

Supervised learning (SL) is a paradigm in machine learning where input objects (for example, a vector of predictor variables) and a desired output value (also known as human-labeled supervisory signal) train a model. The training data is processed, building a function that maps new data on expected output values. An optimal scenario will allow for the algorithm to correctly determine output values for unseen instances. This requires the learning algorithm to generalize from the training data to unseen situations in a "reasonable" way (see inductive bias).

Support vector machine

In machine learning, support vector machines (SVMs, also support vector networks) are supervised learning models with associated learning algorithms that analyze data for classification and regression analysis. Developed at AT&T Bell Laboratories by Vladimir Vapnik with colleagues (Boser et al., 1992, Guyon et al., 1993, Cortes and Vapnik, 1995, Vapnik et al., 1997) SVMs are one of the most robust prediction methods, being based on statistical learning frameworks or VC theory proposed by Vapnik (1982, 1995) and Chervonenkis (1974).

Androgen

An androgen (from Greek andr-, the stem of the word meaning "man") is any natural or synthetic steroid hormone that regulates the development and maintenance of male characteristics in vertebrates by binding to androgen receptors. This includes the embryological development of the primary male sex organs, and the development of male secondary sex characteristics at puberty. Androgens are synthesized in the testes, the ovaries, and the adrenal glands. Androgens increase in both males and females during puberty.

Online machine learning

In computer science, online machine learning is a method of machine learning in which data becomes available in a sequential order and is used to update the best predictor for future data at each step, as opposed to batch learning techniques which generate the best predictor by learning on the entire training data set at once. Online learning is a common technique used in areas of machine learning where it is computationally infeasible to train over the entire dataset, requiring the need of out-of-core algorithms.

Bootstrap aggregating

Bootstrap aggregating, also called bagging (from bootstrap aggregating), is a machine learning ensemble meta-algorithm designed to improve the stability and accuracy of machine learning algorithms used in statistical classification and regression. It also reduces variance and helps to avoid overfitting. Although it is usually applied to decision tree methods, it can be used with any type of method. Bagging is a special case of the model averaging approach.

Hypothalamus

The hypothalamus () is a part of the brain that contains a number of small nuclei with a variety of functions. One of the most important functions is to link the nervous system to the endocrine system via the pituitary gland. The hypothalamus is located below the thalamus and is part of the limbic system. In the terminology of neuroanatomy, it forms the ventral part of the diencephalon. All vertebrate brains contain a hypothalamus. In humans, it is the size of an almond.

Multiclass classification

In machine learning and statistical classification, multiclass classification or multinomial classification is the problem of classifying instances into one of three or more classes (classifying instances into one of two classes is called binary classification). While many classification algorithms (notably multinomial logistic regression) naturally permit the use of more than two classes, some are by nature binary algorithms; these can, however, be turned into multinomial classifiers by a variety of strategies.

Estrogen

Estrogen or oestrogen (see spelling differences) is a category of sex hormone responsible for the development and regulation of the female reproductive system and secondary sex characteristics. There are three major endogenous estrogens that have estrogenic hormonal activity: estrone (E1), estradiol (E2), and estriol (E3). Estradiol, an estrane, is the most potent and prevalent. Another estrogen called estetrol (E4) is produced only during pregnancy. Estrogens are synthesized in all vertebrates and some insects.

Feature selection

Feature selection is the process of selecting a subset of relevant features (variables, predictors) for use in model construction. Stylometry and DNA microarray analysis are two cases where feature selection is used. It should be distinguished from feature extraction. Feature selection techniques are used for several reasons: simplification of models to make them easier to interpret by researchers/users, shorter training times, to avoid the curse of dimensionality, improve data's compatibility with a learning model class, encode inherent symmetries present in the input space.

K-nearest neighbors algorithm

In statistics, the k-nearest neighbors algorithm (k-NN) is a non-parametric supervised learning method first developed by Evelyn Fix and Joseph Hodges in 1951, and later expanded by Thomas Cover. It is used for classification and regression. In both cases, the input consists of the k closest training examples in a data set. The output depends on whether k-NN is used for classification or regression: In k-NN classification, the output is a class membership.

Binary classification

Binary classification is the task of classifying the elements of a set into two groups (each called class) on the basis of a classification rule. Typical binary classification problems include: Medical testing to determine if a patient has certain disease or not; Quality control in industry, deciding whether a specification has been met; In information retrieval, deciding whether a page should be in the result set of a search or not. Binary classification is dichotomization applied to a practical situation.

Neural architecture search

Neural architecture search (NAS) is a technique for automating the design of artificial neural networks (ANN), a widely used model in the field of machine learning. NAS has been used to design networks that are on par or outperform hand-designed architectures. Methods for NAS can be categorized according to the search space, search strategy and performance estimation strategy used: The search space defines the type(s) of ANN that can be designed and optimized. The search strategy defines the approach used to explore the search space.

G protein-coupled receptor

G protein-coupled receptors (GPCRs), also known as seven-(pass)-transmembrane domain receptors, 7TM receptors, heptahelical receptors, serpentine receptors, and G protein-linked receptors (GPLR), form a large group of evolutionarily related proteins that are cell surface receptors that detect molecules outside the cell and activate cellular responses. They are coupled with G proteins.

Linear discriminant analysis

Linear discriminant analysis (LDA), normal discriminant analysis (NDA), or discriminant function analysis is a generalization of Fisher's linear discriminant, a method used in statistics and other fields, to find a linear combination of features that characterizes or separates two or more classes of objects or events. The resulting combination may be used as a linear classifier, or, more commonly, for dimensionality reduction before later classification.

Decision tree

A decision tree is a decision support hierarchical model that uses a tree-like model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility. It is one way to display an algorithm that only contains conditional control statements. Decision trees are commonly used in operations research, specifically in decision analysis, to help identify a strategy most likely to reach a goal, but are also a popular tool in machine learning.

Thyroid hormone receptor

The thyroid hormone receptor (TR) is a type of nuclear receptor that is activated by binding thyroid hormone. TRs act as transcription factors, ultimately affecting the regulation of gene transcription and translation. These receptors also have non-genomic effects that lead to second messenger activation, and corresponding cellular response. There are four domains that are present in all TRs. Two of these, the DNA-binding (DBD) and hinge domains, are involved in the ability of the receptor to bind hormone response elements( HREs).

Automated machine learning

Automated machine learning (AutoML) is the process of automating the tasks of applying machine learning to real-world problems. AutoML potentially includes every stage from beginning with a raw dataset to building a machine learning model ready for deployment. AutoML was proposed as an artificial intelligence-based solution to the growing challenge of applying machine learning. The high degree of automation in AutoML aims to allow non-experts to make use of machine learning models and techniques without requiring them to become experts in machine learning.