AntibodyAn antibody (Ab), also known as an immunoglobulin (Ig), is a large, Y-shaped protein used by the immune system to identify and neutralize foreign objects such as pathogenic bacteria and viruses. The antibody recognizes a unique molecule of the pathogen, called an antigen. Each tip of the "Y" of an antibody contains a paratope (analogous to a lock) that is specific for one particular epitope (analogous to a key) on an antigen, allowing these two structures to bind together with precision.
Single-nucleotide polymorphismIn genetics and bioinformatics, a single-nucleotide polymorphism (SNP snɪp; plural SNPs snɪps) is a germline substitution of a single nucleotide at a specific position in the genome that is present in a sufficiently large fraction of considered population (generally regarded as 1% or more). For example, a G nucleotide present at a specific location in a reference genome may be replaced by an A in a minority of individuals. The two possible nucleotide variations of this SNP – G or A – are called alleles.
Fc receptorIn immunology, a Fc receptor is a protein found on the surface of certain cells – including, among others, B lymphocytes, follicular dendritic cells, natural killer cells, macrophages, neutrophils, eosinophils, basophils, human platelets, and mast cells – that contribute to the protective functions of the immune system. Its name is derived from its binding specificity for a part of an antibody known as the Fc (fragment crystallizable) region. Fc receptors bind to antibodies that are attached to infected cells or invading pathogens.
Gene polymorphismA gene is said to be polymorphic if more than one allele occupies that gene's locus within a population. In addition to having more than one allele at a specific locus, each allele must also occur in the population at a rate of at least 1% to generally be considered polymorphic. Gene polymorphisms can occur in any region of the genome. The majority of polymorphisms are silent, meaning they do not alter the function or expression of a gene. Some polymorphisms are visible.
GeneIn biology, the word gene (from γένος, génos; meaning generation or birth or gender) can have several different meanings. The Mendelian gene is a basic unit of heredity and the molecular gene is a sequence of nucleotides in DNA that is transcribed to produce a functional RNA. There are two types of molecular genes: protein-coding genes and noncoding genes. During gene expression, the DNA is first copied into RNA. The RNA can be directly functional or be the intermediate template for a protein that performs a function.
DNA sequencingDNA sequencing is the process of determining the nucleic acid sequence – the order of nucleotides in DNA. It includes any method or technology that is used to determine the order of the four bases: adenine, guanine, cytosine, and thymine. The advent of rapid DNA sequencing methods has greatly accelerated biological and medical research and discovery. Knowledge of DNA sequences has become indispensable for basic biological research, DNA Genographic Projects and in numerous applied fields such as medical diagnosis, biotechnology, forensic biology, virology and biological systematics.
Massive parallel sequencingMassive parallel sequencing or massively parallel sequencing is any of several high-throughput approaches to DNA sequencing using the concept of massively parallel processing; it is also called next-generation sequencing (NGS) or second-generation sequencing. Some of these technologies emerged between 1993 and 1998 and have been commercially available since 2005. These technologies use miniaturized and parallelized platforms for sequencing of 1 million to 43 billion short reads (50 to 400 bases each) per instrument run.
Third-generation sequencingThird-generation sequencing (also known as long-read sequencing) is a class of DNA sequencing methods currently under active development. Third generation sequencing technologies have the capability to produce substantially longer reads than second generation sequencing, also known as next-generation sequencing. Such an advantage has critical implications for both genome science and the study of biology in general. However, third generation sequencing data have much higher error rates than previous technologies, which can complicate downstream genome assembly and analysis of the resulting data.
Sanger sequencingSanger sequencing is a method of DNA sequencing that involves electrophoresis and is based on the random incorporation of chain-terminating dideoxynucleotides by DNA polymerase during in vitro DNA replication. After first being developed by Frederick Sanger and colleagues in 1977, it became the most widely used sequencing method for approximately 40 years. It was first commercialized by Applied Biosystems in 1986. More recently, higher volume Sanger sequencing has been replaced by next generation sequencing methods, especially for large-scale, automated genome analyses.
Restriction fragment length polymorphismIn molecular biology, restriction fragment length polymorphism (RFLP) is a technique that exploits variations in homologous DNA sequences, known as polymorphisms, populations, or species or to pinpoint the locations of genes within a sequence. The term may refer to a polymorphism itself, as detected through the differing locations of restriction enzyme sites, or to a related laboratory technique by which such differences can be illustrated.
Autoimmune diseaseAn autoimmune disease is a condition that results from an anomalous response of the immune system, wherein it mistakenly targets and attacks healthy, functioning parts of the body as if they were foreign organisms. It is estimated that there are more than 80 recognized autoimmune diseases, with recent scientific evidence suggesting the existence of potentially more than 100 distinct conditions. Nearly any body part can be involved. Symptoms of autoimmune diseases can significantly vary, primarily based on the specific type of the disease and the body part that it affects.
Immunoglobulin GImmunoglobulin G (IgG) is a type of antibody. Representing approximately 75% of serum antibodies in humans, IgG is the most common type of antibody found in blood circulation. IgG molecules are created and released by plasma B cells. Each IgG antibody has two paratopes. It is the most common antibody. Antibodies are major components of humoral immunity. IgG is the main type of antibody found in blood and extracellular fluid, allowing it to control infection of body tissues.
Clinical metagenomic sequencingClinical metagenomic next-generation sequencing (mNGS) is the comprehensive analysis of microbial and host genetic material (DNA or RNA) in clinical samples from patients by next-generation sequencing. It uses the techniques of metagenomics to identify and characterize the genome of bacteria, fungi, parasites, and viruses without the need for a prior knowledge of a specific pathogen directly from clinical specimens.
Exome sequencingExome sequencing, also known as whole exome sequencing (WES), is a genomic technique for sequencing all of the protein-coding regions of genes in a genome (known as the exome). It consists of two steps: the first step is to select only the subset of DNA that encodes proteins. These regions are known as exons—humans have about 180,000 exons, constituting about 1% of the human genome, or approximately 30 million base pairs. The second step is to sequence the exonic DNA using any high-throughput DNA sequencing technology.
Genome-wide association studyIn genomics, a genome-wide association study (GWA study, or GWAS), is an observational study of a genome-wide set of genetic variants in different individuals to see if any variant is associated with a trait. GWA studies typically focus on associations between single-nucleotide polymorphisms (SNPs) and traits like major human diseases, but can equally be applied to any other genetic variants and any other organisms. When applied to human data, GWA studies compare the DNA of participants having varying phenotypes for a particular trait or disease.
Whole genome sequencingWhole genome sequencing (WGS), also known as full genome sequencing, complete genome sequencing, or entire genome sequencing, is the process of determining the entirety, or nearly the entirety, of the DNA sequence of an organism's genome at a single time. This entails sequencing all of an organism's chromosomal DNA as well as DNA contained in the mitochondria and, for plants, in the chloroplast. Whole genome sequencing has largely been used as a research tool, but was being introduced to clinics in 2014.
HIV/AIDSHuman immunodeficiency virus infection and acquired immunodeficiency syndrome (HIV/AIDS) is a spectrum of conditions caused by infection with the human immunodeficiency virus (HIV), a retrovirus. Following initial infection an individual may not notice any symptoms, or may experience a brief period of influenza-like illness. Typically, this is followed by a prolonged incubation period with no symptoms.
GenomeIn the fields of molecular biology and genetics, a genome is all the genetic information of an organism. It consists of nucleotide sequences of DNA (or RNA in RNA viruses). The nuclear genome includes protein-coding genes and non-coding genes, other functional regions of the genome such as regulatory sequences (see non-coding DNA), and often a substantial fraction of junk DNA with no evident function. Almost all eukaryotes have mitochondria and a small mitochondrial genome.
Chromosomal inversionAn inversion is a chromosome rearrangement in which a segment of a chromosome becomes inverted within its original position. An inversion occurs when a chromosome undergoes a two breaks within the chromosomal arm, and the segment between the two breaks inserts itself in the opposite direction in the same chromosome arm. The breakpoints of inversions often happen in regions of repetitive nucleotides, and the regions may be reused in other inversions. Chromosomal segments in inversions can be as small as 100 kilobases or as large as 100 megabases.
Copy number variationCopy number variation (CNV) is a phenomenon in which sections of the genome are repeated and the number of repeats in the genome varies between individuals. Copy number variation is a type of structural variation: specifically, it is a type of duplication or deletion event that affects a considerable number of base pairs. Approximately two-thirds of the entire human genome may be composed of repeats and 4.8–9.5% of the human genome can be classified as copy number variations.