Gene expression: Gene expression is the process by which information from a gene is used to synthesize a functional gene product, a protein or non-coding RNA, which ultimately affects a phenotype. These products are often proteins, but for non-protein-coding genes, such as those encoding transfer RNA (tRNA) and small nuclear RNA (snRNA), the product is a functional non-coding RNA.
Gene expression profiling: In the field of molecular biology, gene expression profiling is the measurement of the activity (the expression) of thousands of genes at once, to create a global picture of cellular function. These profiles can, for example, distinguish between cells that are actively dividing or show how the cells react to a particular treatment. Many experiments of this sort measure an entire genome simultaneously, that is, every gene present in a particular cell. Several transcriptomics technologies can be used to generate the data needed for analysis.
Cluster analysis: Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense) to each other than to those in other groups (clusters). It is a main task of exploratory data analysis, and a common technique for statistical data analysis, used in many fields, including pattern recognition, information retrieval, bioinformatics, data compression, computer graphics and machine learning.
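As a rough illustration of "more similar to each other than to those in other groups", the following minimal sketch compares the mean pairwise distance within clusters to the mean distance between clusters. It assumes Euclidean distance as the similarity measure, and the points, labels and function name are hypothetical, chosen only for this example.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def mean_pairwise_distances(points, labels):
    """Return (mean within-cluster distance, mean between-cluster distance)."""
    within, between = [], []
    for (p, lp), (q, lq) in combinations(zip(points, labels), 2):
        # A pair counts as "within" when both points carry the same label.
        (within if lp == lq else between).append(dist(p, q))
    return sum(within) / len(within), sum(between) / len(between)


# Hypothetical toy data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels = [0, 0, 0, 1, 1, 1]

within, between = mean_pairwise_distances(points, labels)
print(f"within: {within:.2f}, between: {between:.2f}")
```

A grouping that satisfies the definition should give a within-cluster value clearly smaller than the between-cluster value, as it does for this toy data.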
Open cluster: An open cluster is a type of star cluster made of tens to a few thousand stars that were formed from the same giant molecular cloud and have roughly the same age. More than 1,100 open clusters have been discovered within the Milky Way galaxy, and many more are thought to exist. They are loosely bound by mutual gravitational attraction and become disrupted by close encounters with other clusters and clouds of gas as they orbit the Galactic Center.
Gene: In biology, the word gene (from γένος, génos; meaning generation or birth or gender) can have several different meanings. The Mendelian gene is a basic unit of heredity and the molecular gene is a sequence of nucleotides in DNA that is transcribed to produce a functional RNA. There are two types of molecular genes: protein-coding genes and noncoding genes. During gene expression, the DNA is first copied into RNA. The RNA can be directly functional or be the intermediate template for a protein that performs a function.
K-means clustering: k-means clustering is a method of vector quantization, originally from signal processing, that aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean (cluster center or cluster centroid), serving as a prototype of the cluster. This results in a partitioning of the data space into Voronoi cells. k-means clustering minimizes within-cluster variances (squared Euclidean distances), but not regular Euclidean distances, which would be the more difficult Weber problem: the mean optimizes squared errors, whereas only the geometric median minimizes Euclidean distances.
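The partitioning described above can be sketched with the common alternating heuristic (often called Lloyd's algorithm): assign each observation to its nearest mean, then recompute each mean. The entry itself does not name a particular algorithm, so this choice is an assumption, as are the function name, the toy data and the hand-picked starting centroids.

```python
from math import dist  # Euclidean distance, Python 3.8+


def kmeans(points, centroids, max_iter=100):
    """Alternate assignment and mean-update steps until the centroids stop moving."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster with the nearest mean.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid becomes the mean of its members
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            tuple(sum(coord) / len(members) for coord in zip(*members)) if members else c
            for members, c in zip(clusters, centroids)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical toy data: two square-shaped groups of four points each.
points = [(0.0, 0.0), (0.0, 2.0), (2.0, 0.0), (2.0, 2.0),
          (10.0, 10.0), (10.0, 12.0), (12.0, 10.0), (12.0, 12.0)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (12.0, 12.0)])
print(centroids)  # [(1.0, 1.0), (11.0, 11.0)]
```

The result depends on the starting centroids; practical implementations usually run several random initialisations and keep the partition with the lowest within-cluster sum of squares.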
Serial analysis of gene expression: Serial Analysis of Gene Expression (SAGE) is a transcriptomic technique used by molecular biologists to produce a snapshot of the messenger RNA population in a sample of interest in the form of small tags that correspond to fragments of those transcripts. Several variants have since been developed, most notably LongSAGE (a more robust version), RL-SAGE, and the most recent, SuperSAGE. Many of these have improved the technique by capturing longer tags, enabling more confident identification of a source gene.
Molecular genetics: Molecular genetics is a sub-field of biology that addresses how differences in the structures or expression of DNA molecules manifest as variation among organisms. Molecular genetics often applies an "investigative approach" to determine the structure and/or function of genes in an organism's genome using genetic screens. The field of study is based on the merging of several sub-fields in biology: classical Mendelian inheritance, cellular biology, molecular biology, biochemistry, and biotechnology.
Patch-sequencing: Patch-sequencing (patch-seq) is a method designed for tackling specific problems involved in characterizing neurons. As neural tissues are among the most transcriptomically diverse populations of cells, classifying neurons into cell types in order to understand the circuits they form is a major challenge for neuroscientists. Combining classical classification methods with single cell RNA-sequencing post-hoc has proved to be difficult and slow.
Molecular biology: Molecular biology (/məˈlɛkjʊlər/) is the study of the chemical and physical structure of biological macromolecules. It is a branch of biology that seeks to understand the molecular basis of biological activity in and between cells, including biomolecular synthesis, modification, mechanisms, and interactions. Molecular biology was first described as an approach focused on the underpinnings of biological phenomena: uncovering the structures of biological molecules as well as their interactions, and how these interactions explain observations of classical biology.
Expressed sequence tag: In genetics, an expressed sequence tag (EST) is a short sub-sequence of a cDNA sequence. ESTs may be used to identify gene transcripts, and were instrumental in gene discovery and in gene-sequence determination. The identification of ESTs proceeded rapidly, with approximately 74.2 million ESTs available in public databases (e.g. GenBank as of 1 January 2013, all species). EST approaches have largely been superseded by whole genome and transcriptome sequencing and metagenome sequencing.
Regulation of gene expression: Regulation of gene expression, or gene regulation, includes a wide range of mechanisms that are used by cells to increase or decrease the production of specific gene products (protein or RNA). Sophisticated programs of gene expression are widely observed in biology, for example to trigger developmental pathways, respond to environmental stimuli, or adapt to new food sources. Virtually any step of gene expression can be modulated, from transcriptional initiation, to RNA processing, and to the post-translational modification of a protein.
Patch clamp: The patch clamp technique is a laboratory technique in electrophysiology used to study ionic currents in individual isolated living cells, tissue sections, or patches of cell membrane. The technique is especially useful in the study of excitable cells such as neurons, cardiomyocytes, muscle fibers, and pancreatic beta cells, and can also be applied to the study of bacterial ion channels in specially prepared giant spheroplasts. Patch clamping can be performed using the voltage clamp technique.
Housekeeping gene: In molecular biology, housekeeping genes are typically constitutive genes that are required for the maintenance of basic cellular function, and are expressed in all cells of an organism under normal and patho-physiological conditions. Although some housekeeping genes are expressed at relatively constant rates in most non-pathological situations, the expression of other housekeeping genes may vary depending on experimental conditions. The origin of the term "housekeeping gene" remains obscure.
Hierarchical clustering: In data mining and statistics, hierarchical clustering (also called hierarchical cluster analysis or HCA) is a method of cluster analysis that seeks to build a hierarchy of clusters. Strategies for hierarchical clustering generally fall into two categories. Agglomerative: a "bottom-up" approach in which each observation starts in its own cluster, and pairs of clusters are merged as one moves up the hierarchy. Divisive: a "top-down" approach in which all observations start in one cluster, and splits are performed recursively as one moves down the hierarchy.
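The agglomerative ("bottom-up") strategy can be sketched as follows. The entry does not specify how the distance between two clusters is measured, so this example assumes single linkage (the smallest distance between any two members); the function name and the one-dimensional toy data are likewise hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def single_linkage(points, num_clusters):
    """Start with one cluster per point and repeatedly merge the closest pair."""
    clusters = [[p] for p in points]
    merges = []  # record of (distance, left cluster, right cluster) per merge
    while len(clusters) > num_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the two closest members.
                d = min(dist(p, q) for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((d, clusters[i], clusters[j]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters, merges


# Hypothetical one-dimensional observations.
points = [(1.0,), (2.0,), (6.0,), (7.0,), (20.0,)]
clusters, merges = single_linkage(points, num_clusters=2)
print(clusters)                  # [[(1.0,), (2.0,), (6.0,), (7.0,)], [(20.0,)]]
print([d for d, _, _ in merges])  # [1.0, 1.0, 4.0]
```

The list of merges, ordered by distance, is the information a dendrogram would display; stopping at a different `num_clusters` corresponds to cutting the hierarchy at a different height.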
Unsupervised learning: Unsupervised learning is a paradigm in machine learning where, in contrast to supervised learning and semi-supervised learning, algorithms learn patterns exclusively from unlabeled data. Neural network tasks are often categorized as discriminative (recognition) or generative (imagination). Often but not always, discriminative tasks use supervised methods and generative tasks use unsupervised methods; however, the separation is very hazy. For example, object recognition favors supervised learning but unsupervised learning can also cluster objects into groups.
DBSCAN: Density-based spatial clustering of applications with noise (DBSCAN) is a data clustering algorithm proposed by Martin Ester, Hans-Peter Kriegel, Jörg Sander and Xiaowei Xu in 1996. It is a non-parametric, density-based clustering algorithm: given a set of points in some space, it groups together points that are closely packed together (points with many nearby neighbors), marking as outliers points that lie alone in low-density regions (whose nearest neighbors are too far away).
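A minimal sketch of the idea, assuming Euclidean distance and the two usual parameters: a neighborhood radius (`eps`) and a minimum neighbor count (`min_pts`). The names, the brute-force neighbor search and the toy data are illustrative assumptions, not the original authors' implementation.

```python
from math import dist  # Euclidean distance, Python 3.8+

NOISE = -1  # label used here for outliers


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)

    def neighbors(i):
        # Brute-force search; the point itself counts as its own neighbor.
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster_id += 1
        labels[i] = cluster_id
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a dense region
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is dense too, so keep expanding
    return labels


# Hypothetical toy data: two dense squares and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

Unlike k-means, the number of clusters is not supplied up front; it emerges from the density parameters, and isolated points are left unassigned.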
Electrophysiology: Electrophysiology (from Greek ἤλεκτρον, ēlektron, "amber" [see the etymology of "electron"]; φύσις, physis, "nature, origin"; and -λογία, -logia) is the branch of physiology that studies the electrical properties of biological cells and tissues. It involves measurements of voltage changes or electric current or manipulations on a wide variety of scales from single ion channel proteins to whole organs like the heart. In neuroscience, it includes measurements of the electrical activity of neurons, and, in particular, action potential activity.
Decision tree learning: Decision tree learning is a supervised learning approach used in statistics, data mining and machine learning. In this formalism, a classification or regression decision tree is used as a predictive model to draw conclusions about a set of observations. Tree models where the target variable can take a discrete set of values are called classification trees; in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels.
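To make the "leaves are class labels, branches are conjunctions of feature tests" picture concrete, here is a small classification-tree sketch. It assumes numeric features, threshold tests of the form `feature <= value`, and Gini impurity as the split criterion; none of these choices comes from the entry itself, and the data and labels are hypothetical (labels must not be tuples, since tuples mark internal nodes here).

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 0.0 when all labels agree."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Return the (feature, threshold) that lowers weighted impurity most, or None."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best


def build_tree(rows, labels):
    """Internal nodes are (feature, threshold, left, right); leaves are class labels."""
    split = best_split(rows, labels)
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [y for _, y in left]),
            build_tree([r for r, _ in right], [y for _, y in right]))


def predict(tree, row):
    """Follow the tests from the root until a leaf (class label) is reached."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


# Hypothetical observations with two numeric features and three classes.
rows = [(1.0, 5.0), (2.0, 4.0), (3.0, 1.0), (6.0, 5.0), (7.0, 4.0), (8.0, 1.0)]
labels = ["A", "A", "A", "B", "B", "C"]
tree = build_tree(rows, labels)
print(tree)                       # (0, 3.0, 'A', (0, 7.0, 'B', 'C'))
print(predict(tree, (6.5, 9.0)))  # B
```

Reading the printed tree from the root, the path to leaf "C" is the conjunction "feature 0 > 3.0 and feature 0 > 7.0".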
Voltage clamp: The voltage clamp is an experimental method used by electrophysiologists to measure the ion currents through the membranes of excitable cells, such as neurons, while holding the membrane voltage at a set level. A basic voltage clamp will iteratively measure the membrane potential, and then change the membrane potential (voltage) to a desired value by adding the necessary current. This "clamps" the cell membrane at a desired constant voltage, allowing the voltage clamp to record what currents are delivered.
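The measure-then-inject loop can be illustrated with a toy simulation. Everything in it is an assumption made for the sketch: the cell is modeled as a passive membrane (one capacitance and one leak conductance), the clamp is a simple proportional feedback controller, and all parameter values are hypothetical. Real amplifiers are analog circuits and real cells have voltage-dependent conductances.

```python
def simulate_voltage_clamp(v_command, steps=2000, dt=0.01,
                           capacitance=100.0, g_leak=10.0, e_leak=-70.0, gain=1000.0):
    """Simulate a proportional clamp on a passive membrane.

    Units (chosen so they combine consistently): mV, ms, pF, nS, pA.
    Returns the final membrane potential and the injected current at each step.
    """
    v = e_leak  # the membrane starts at its resting (leak reversal) potential
    injected = []
    for _ in range(steps):
        # 1. Measure the membrane potential and compute the error to the set level.
        i_clamp = gain * (v_command - v)
        # 2. The membrane leaks current back toward its resting potential.
        i_leak = g_leak * (v - e_leak)
        # 3. The net current charges the membrane capacitance (explicit Euler step).
        v += dt * (i_clamp - i_leak) / capacitance
        # 4. Record the current the clamp delivered.
        injected.append(i_clamp)
    return v, injected


v_final, injected = simulate_voltage_clamp(v_command=-20.0)
print(f"membrane potential: {v_final:.2f} mV")
print(f"holding current:    {injected[-1]:.1f} pA")
```

Once the voltage has settled, the recorded holding current mirrors the current flowing through the membrane (here only the leak), which is why the delivered current can be read as a measurement of the membrane's ion currents. With a purely proportional controller the potential settles slightly short of the command value; a higher gain narrows that gap.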