Ridge regressionRidge regression is a method of estimating the coefficients of multiple-regression models in scenarios where the independent variables are highly correlated. It has been used in many fields including econometrics, chemistry, and engineering. Also known as Tikhonov regularization, named for Andrey Tikhonov, it is a method of regularization of ill-posed problems. It is particularly useful to mitigate the problem of multicollinearity in linear regression, which commonly occurs in models with large numbers of parameters.
Linear regressionIn statistics, linear regression is a linear approach for modelling the relationship between a scalar response and one or more explanatory variables (also known as dependent and independent variables). The case of one explanatory variable is called simple linear regression; for more than one, the process is called multiple linear regression. This term is distinct from multivariate linear regression, where multiple correlated dependent variables are predicted, rather than a single scalar variable.
Regularized least squaresRegularized least squares (RLS) is a family of methods for solving the least-squares problem while using regularization to further constrain the resulting solution. RLS is used for two main reasons. The first comes up when the number of variables in the linear system exceeds the number of observations. In such settings, the ordinary least-squares problem is ill-posed and is therefore impossible to fit because the associated optimization problem has infinitely many solutions.
Elastic net regularizationIn statistics and, in particular, in the fitting of linear or logistic regression models, the elastic net is a regularized regression method that linearly combines the L1 and L2 penalties of the lasso and ridge methods. The elastic net method overcomes the limitations of the LASSO (least absolute shrinkage and selection operator) method which uses a penalty function based on Use of this penalty function has several limitations. For example, in the "large p, small n" case (high-dimensional data with few examples), the LASSO selects at most n variables before it saturates.
Density functional theoryDensity-functional theory (DFT) is a computational quantum mechanical modelling method used in physics, chemistry and materials science to investigate the electronic structure (or nuclear structure) (principally the ground state) of many-body systems, in particular atoms, molecules, and the condensed phases. Using this theory, the properties of a many-electron system can be determined by using functionals, i.e. functions of another function. In the case of DFT, these are functionals of the spatially dependent electron density.
Docking (molecular)In the field of molecular modeling, docking is a method which predicts the preferred orientation of one molecule to a second when a ligand and a target are bound to each other to form a stable complex. Knowledge of the preferred orientation in turn may be used to predict the strength of association or binding affinity between two molecules using, for example, scoring functions. The associations between biologically relevant molecules such as proteins, peptides, nucleic acids, carbohydrates, and lipids play a central role in signal transduction.
Computational chemistryComputational chemistry is a branch of chemistry that uses computer simulation to assist in solving chemical problems. It uses methods of theoretical chemistry, incorporated into computer programs, to calculate the structures and properties of molecules, groups of molecules, and solids. It is essential because, apart from relatively recent results concerning the hydrogen molecular ion (dihydrogen cation, see references therein for more details), the quantum many-body problem cannot be solved analytically, much less in closed form.
Regularization (mathematics)In mathematics, statistics, finance, computer science, particularly in machine learning and inverse problems, regularization is a process that changes the result answer to be "simpler". It is often used to obtain results for ill-posed problems or to prevent overfitting. Although regularization procedures can be divided in many ways, the following delineation is particularly helpful: Explicit regularization is regularization whenever one explicitly adds a term to the optimization problem.
Bayesian linear regressionBayesian linear regression is a type of conditional modeling in which the mean of one variable is described by a linear combination of other variables, with the goal of obtaining the posterior probability of the regression coefficients (as well as other parameters describing the distribution of the regressand) and ultimately allowing the out-of-sample prediction of the regressand (often labelled ) conditional on observed values of the regressors (usually ).
Potential energyIn physics, potential energy is the energy held by an object because of its position relative to other objects, stresses within itself, its electric charge, or other factors. The term potential energy was introduced by the 19th-century Scottish engineer and physicist William Rankine, although it has links to the ancient Greek philosopher Aristotle's concept of potentiality. Common types of potential energy include the gravitational potential energy of an object, the elastic potential energy of an extended spring, and the electric potential energy of an electric charge in an electric field.
Force field (chemistry)In the context of chemistry and molecular modelling, a force field is a computational method that is used to estimate the forces between atoms within molecules and also between molecules. More precisely, the force field refers to the functional form and parameter sets used to calculate the potential energy of a system of atoms or coarse-grained particles in molecular mechanics, molecular dynamics, or Monte Carlo simulations. The parameters for a chosen energy function may be derived from experiments in physics and chemistry, calculations in quantum mechanics, or both.
Lasso (statistics)In statistics and machine learning, lasso (least absolute shrinkage and selection operator; also Lasso or LASSO) is a regression analysis method that performs both variable selection and regularization in order to enhance the prediction accuracy and interpretability of the resulting statistical model. It was originally introduced in geophysics, and later by Robert Tibshirani, who coined the term. Lasso was originally formulated for linear regression models. This simple case reveals a substantial amount about the estimator.
Alpha helixAn alpha helix (or α-helix) is a sequence of amino acids in a protein that are twisted into a coil (a helix). The alpha helix is the most common structural arrangement in the secondary structure of proteins. It is also the most extreme type of local structure, and it is the local structure that is most easily predicted from a sequence of amino acids. The alpha helix has a right hand-helix conformation in which every backbone N−H group hydrogen bonds to the backbone C=O group of the amino acid that is four residues earlier in the protein sequence.
Local-density approximationLocal-density approximations (LDA) are a class of approximations to the exchange–correlation (XC) energy functional in density functional theory (DFT) that depend solely upon the value of the electronic density at each point in space (and not, for example, derivatives of the density or the Kohn–Sham orbitals). Many approaches can yield local approximations to the XC energy. However, overwhelmingly successful local approximations are those that have been derived from the homogeneous electron gas (HEG) model.
Molecular dynamicsMolecular dynamics (MD) is a computer simulation method for analyzing the physical movements of atoms and molecules. The atoms and molecules are allowed to interact for a fixed period of time, giving a view of the dynamic "evolution" of the system. In the most common version, the trajectories of atoms and molecules are determined by numerically solving Newton's equations of motion for a system of interacting particles, where forces between the particles and their potential energies are often calculated using interatomic potentials or molecular mechanical force fields.
Grand potentialThe grand potential or Landau potential or Landau free energy is a quantity used in statistical mechanics, especially for irreversible processes in open systems. The grand potential is the characteristic state function for the grand canonical ensemble. Grand potential is defined by where U is the internal energy, T is the temperature of the system, S is the entropy, μ is the chemical potential, and N is the number of particles in the system.
Interatomic potentialInteratomic potentials are mathematical functions to calculate the potential energy of a system of atoms with given positions in space. Interatomic potentials are widely used as the physical basis of molecular mechanics and molecular dynamics simulations in computational chemistry, computational physics and computational materials science to explain and predict materials properties.
Electronic band structureIn solid-state physics, the electronic band structure (or simply band structure) of a solid describes the range of energy levels that electrons may have within it, as well as the ranges of energy that they may not have (called band gaps or forbidden bands). Band theory derives these bands and band gaps by examining the allowed quantum mechanical wave functions for an electron in a large, periodic lattice of atoms or molecules.
Electric potential energyElectric potential energy is a potential energy (measured in joules) that results from conservative Coulomb forces and is associated with the configuration of a particular set of point charges within a defined system. An object may be said to have electric potential energy by virtue of either its own electric charge or its relative position to other electrically charged objects. The term "electric potential energy" is used to describe the potential energy in systems with time-variant electric fields, while the term "electrostatic potential energy" is used to describe the potential energy in systems with time-invariant electric fields.
Logistic regressionIn statistics, the logistic model (or logit model) is a statistical model that models the probability of an event taking place by having the log-odds for the event be a linear combination of one or more independent variables. In regression analysis, logistic regression (or logit regression) is estimating the parameters of a logistic model (the coefficients in the linear combination).