Dimensionality reductionDimensionality reduction, or dimension reduction, is the transformation of data from a high-dimensional space into a low-dimensional space so that the low-dimensional representation retains some meaningful properties of the original data, ideally close to its intrinsic dimension. Working in high-dimensional spaces can be undesirable for many reasons; raw data are often sparse as a consequence of the curse of dimensionality, and analyzing the data is usually computationally intractable (hard to control or deal with).
Curse of dimensionalityThe curse of dimensionality refers to various phenomena that arise when analyzing and organizing data in high-dimensional spaces that do not occur in low-dimensional settings such as the three-dimensional physical space of everyday experience. The expression was coined by Richard E. Bellman when considering problems in dynamic programming. Dimensionally cursed phenomena occur in domains such as numerical analysis, sampling, combinatorics, machine learning, data mining and databases.
Social network analysisSocial network analysis (SNA) is the process of investigating social structures through the use of networks and graph theory. It characterizes networked structures in terms of nodes (individual actors, people, or things within the network) and the ties, edges, or links (relationships or interactions) that connect them. Examples of social structures commonly visualized through social network analysis include social media networks, meme spread, information circulation, friendship and acquaintance networks, peer learner networks, business networks, knowledge networks, difficult working relationships, collaboration graphs, kinship, disease transmission, and sexual relationships.
Complex networkIn the context of network theory, a complex network is a graph (network) with non-trivial topological features—features that do not occur in simple networks such as lattices or random graphs but often occur in networks representing real systems. The study of complex networks is a young and active area of scientific research (since 2000) inspired largely by empirical findings of real-world networks such as computer networks, biological networks, technological networks, brain networks, climate networks and social networks.
Dynamic network analysisDynamic network analysis (DNA) is an emergent scientific field that brings together traditional social network analysis (SNA), link analysis (LA), social simulation and multi-agent systems (MAS) within network science and network theory. Dynamic networks are a function of time (modeled as a subset of the real numbers) to a set of graphs; for each time point there is a graph. This is akin to the definition of dynamical systems, in which the function is from time to an ambient space, where instead of ambient space time is translated to relationships between pairs of vertices.
Network scienceNetwork science is an academic field which studies complex networks such as telecommunication networks, computer networks, biological networks, cognitive and semantic networks, and social networks, considering distinct elements or actors represented by nodes (or vertices) and the connections between the elements or actors as links (or edges). The field draws on theories and methods including graph theory from mathematics, statistical mechanics from physics, data mining and information visualization from computer science, inferential modeling from statistics, and social structure from sociology.
Molecular dynamicsMolecular dynamics (MD) is a computer simulation method for analyzing the physical movements of atoms and molecules. The atoms and molecules are allowed to interact for a fixed period of time, giving a view of the dynamic "evolution" of the system. In the most common version, the trajectories of atoms and molecules are determined by numerically solving Newton's equations of motion for a system of interacting particles, where forces between the particles and their potential energies are often calculated using interatomic potentials or molecular mechanical force fields.
Phase transitionIn chemistry, thermodynamics, and other related fields, a phase transition (or phase change) is the physical process of transition between one state of a medium and another. Commonly the term is used to refer to changes among the basic states of matter: solid, liquid, and gas, and in rare cases, plasma. A phase of a thermodynamic system and the states of matter have uniform physical properties. During a phase transition of a given medium, certain properties of the medium change as a result of the change of external conditions, such as temperature or pressure.
Nonlinear dimensionality reductionNonlinear dimensionality reduction, also known as manifold learning, refers to various related techniques that aim to project high-dimensional data onto lower-dimensional latent manifolds, with the goal of either visualizing the data in the low-dimensional space, or learning the mapping (either from the high-dimensional space to the low-dimensional embedding or vice versa) itself. The techniques described below can be understood as generalizations of linear decomposition methods used for dimensionality reduction, such as singular value decomposition and principal component analysis.
Biological networkA biological network is a method of representing systems as complex sets of binary interactions or relations between various biological entities. In general, networks or graphs are used to capture relationships between entities or objects. A typical graphing representation consists of a set of nodes connected by edges. As early as 1736 Leonhard Euler analyzed a real-world issue known as the Seven Bridges of Königsberg, which established the foundation of graph theory. From the 1930's-1950's the study of random graphs were developed.
Molecular modellingMolecular modelling encompasses all methods, theoretical and computational, used to model or mimic the behaviour of molecules. The methods are used in the fields of computational chemistry, drug design, computational biology and materials science to study molecular systems ranging from small chemical systems to large biological molecules and material assemblies. The simplest calculations can be performed by hand, but inevitably computers are required to perform molecular modelling of any reasonably sized system.
Molecular design softwareMolecular design software is notable software for molecular modeling, that provides special support for developing molecular models de novo. In contrast to the usual molecular modeling programs, such as for molecular dynamics and quantum chemistry, such software directly supports the aspects related to constructing molecular models, including: Molecular graphics interactive molecular drawing and conformational editing building polymeric molecules, crystals, and solvated systems partial charges development g
Protein foldingProtein folding is the physical process where a protein chain is translated into its native three-dimensional structure, typically a "folded" conformation, by which the protein becomes biologically functional. Via an expeditious and reproducible process, a polypeptide folds into its characteristic three-dimensional structure from a random coil. Each protein exists first as an unfolded polypeptide or random coil after being translated from a sequence of mRNA into a linear chain of amino acids.
Gibbs free energyIn thermodynamics, the Gibbs free energy (or Gibbs energy as the recommended name; symbol ) is a thermodynamic potential that can be used to calculate the maximum amount of work, other than pressure-volume work, that may be performed by a thermodynamically closed system at constant temperature and pressure. It also provides a necessary condition for processes such as chemical reactions that may occur under these conditions. The Gibbs free energy is expressed as where p is pressure, T is the temperature, U is the internal energy, V is volume, H is the enthalpy, and S is the entropy.
Thermodynamic free energyIn thermodynamics, the thermodynamic free energy is one of the state functions of a thermodynamic system (the others being internal energy, enthalpy, entropy, etc.). The change in the free energy is the maximum amount of work that the system can perform in a process at constant temperature, and its sign indicates whether the process is thermodynamically favorable or forbidden. Since free energy usually contains potential energy, it is not absolute but depends on the choice of a zero point.
Surface energyIn surface science, surface free energy (also interfacial free energy or surface energy) quantifies the disruption of intermolecular bonds that occurs when a surface is created. In solid-state physics, surfaces must be intrinsically less energetically favorable than the bulk of the material (the atoms on the surface have more energy compared with the atoms in the bulk), otherwise there would be a driving force for surfaces to be created, removing the bulk of the material (see sublimation).
Surface tensionSurface tension is the tendency of liquid surfaces at rest to shrink into the minimum surface area possible. Surface tension is what allows objects with a higher density than water such as razor blades and insects (e.g. water striders) to float on a water surface without becoming even partly submerged. At liquid–air interfaces, surface tension results from the greater attraction of liquid molecules to each other (due to cohesion) than to the molecules in the air (due to adhesion). There are two primary mechanisms in play.
Protein domainIn molecular biology, a protein domain is a region of a protein's polypeptide chain that is self-stabilizing and that folds independently from the rest. Each domain forms a compact folded three-dimensional structure. Many proteins consist of several domains, and a domain may appear in a variety of different proteins. Molecular evolution uses domains as building blocks and these may be recombined in different arrangements to create proteins with different functions.
Protein fold classIn molecular biology, protein fold classes are broad categories of protein tertiary structure topology. They describe groups of proteins that share similar amino acid and secondary structure proportions. Each class contains multiple, independent protein superfamilies (i.e. are not necessarily evolutionarily related to one another). Four large classes of protein that are generally agreed upon by the two main structure classification databases (SCOP and CATH).
Clustering high-dimensional dataClustering high-dimensional data is the cluster analysis of data with anywhere from a few dozen to many thousands of dimensions. Such high-dimensional spaces of data are often encountered in areas such as medicine, where DNA microarray technology can produce many measurements at once, and the clustering of text documents, where, if a word-frequency vector is used, the number of dimensions equals the size of the vocabulary.