Genome: In the fields of molecular biology and genetics, a genome is all the genetic information of an organism. It consists of nucleotide sequences of DNA (or RNA in RNA viruses). The nuclear genome includes protein-coding genes and non-coding genes, other functional regions of the genome such as regulatory sequences (see non-coding DNA), and often a substantial fraction of junk DNA with no evident function. Almost all eukaryotes have mitochondria and a small mitochondrial genome.
Chromosome: A chromosome is a long DNA molecule with part or all of the genetic material of an organism. In most chromosomes the very long thin DNA fibers are coated with packaging proteins; in eukaryotic cells the most important of these proteins are the histones. These proteins, aided by chaperone proteins, bind to and condense the DNA molecule to maintain its integrity. These chromosomes display a complex three-dimensional structure, which plays a significant role in transcriptional regulation.
Genome size: Genome size is the total amount of DNA contained within one copy of a single complete genome. It is typically measured in terms of mass in picograms (trillionths, or 10⁻¹², of a gram, abbreviated pg) or less frequently in daltons, or as the total number of nucleotide base pairs, usually in megabases (millions of base pairs, abbreviated Mb or Mbp). One picogram is equal to 978 megabases. In diploid organisms, genome size is often used interchangeably with the term C-value.
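The conversion between the two units can be sketched in a few lines of Python. The factor of 978 megabases per picogram comes from the text above; the function names and the 3.5 pg sample value are hypothetical and chosen only for illustration.

```python
# Conversion factor stated in the text: 1 pg of DNA corresponds to 978 Mb.
MB_PER_PG = 978.0


def picograms_to_megabases(pg: float) -> float:
    """Convert a genome size from picograms to megabases."""
    return pg * MB_PER_PG


def megabases_to_picograms(mb: float) -> float:
    """Convert a genome size from megabases to picograms."""
    return mb / MB_PER_PG


if __name__ == "__main__":
    # 3.5 pg is an arbitrary illustrative value, not a measured genome size.
    print(picograms_to_megabases(3.5))
```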
Chromosome 22: Chromosome 22 is one of the 23 pairs of chromosomes in human cells. Humans normally have two copies of chromosome 22 in each cell. Chromosome 22 is the second smallest human chromosome, spanning about 51 million DNA base pairs and representing between 1.5 and 2% of the total DNA in cells. In 1999, researchers working on the Human Genome Project announced they had determined the sequence of base pairs that make up this chromosome. Chromosome 22 was the first human chromosome to be fully sequenced.
Human genome: The human genome is a complete set of nucleic acid sequences for humans, encoded as DNA within the 23 chromosome pairs in cell nuclei and in a small DNA molecule found within individual mitochondria. These are usually treated separately as the nuclear genome and the mitochondrial genome. Human genomes include both protein-coding DNA sequences and various types of DNA that does not encode proteins. The latter is a diverse category that includes DNA coding for non-translated RNA, such as that for ribosomal RNA, transfer RNA, ribozymes, small nuclear RNAs, and several types of regulatory RNAs.
Chromosome 21: Chromosome 21 is one of the 23 pairs of chromosomes in humans. Chromosome 21 is both the smallest human autosome and chromosome, with 45 million base pairs (the building material of DNA) representing about 1.5 percent of the total DNA in cells. Most people have two copies of chromosome 21, while those with three copies of chromosome 21 have Down syndrome, also called "trisomy 21". Researchers working on the Human Genome Project announced in May 2000 that they had determined the sequence of base pairs that make up this chromosome.
Chromosome 5: Chromosome 5 is one of the 23 pairs of chromosomes in humans. People normally have two copies of this chromosome. Chromosome 5 spans about 182 million base pairs (the building blocks of DNA) and represents almost 6% of the total DNA in cells. Chromosome 5 is the 5th largest human chromosome, yet has one of the lowest gene densities. This is partially explained by numerous gene-poor regions that display a remarkable degree of non-coding and syntenic conservation with non-mammalian vertebrates, suggesting they are functionally constrained.
Median: In statistics and probability theory, the median is the value separating the higher half from the lower half of a data sample, a population, or a probability distribution. For a data set, it may be thought of as "the middle" value. The basic feature of the median in describing data compared to the mean (often simply described as the "average") is that it is not skewed by a small proportion of extremely large or small values, and therefore provides a better representation of the center.
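The robustness described above can be illustrated with a minimal Python sketch. The `median` helper and the sample data are assumptions made for this example; the convention of averaging the two middle values for an even-sized sample is a common choice that the text itself does not specify.

```python
from statistics import mean


def median(values):
    """Return the middle value of the data (mean of the two middle values if even-sized)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


if __name__ == "__main__":
    # Hypothetical sample with one extreme value.
    data = [1, 2, 3, 4, 100]
    print("median:", median(data))  # stays at the middle of the bulk of the data
    print("mean:", mean(data))      # pulled upward by the single large value
```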
Geometric median: In geometry, the geometric median of a discrete set of sample points in a Euclidean space is the point minimizing the sum of distances to the sample points. This generalizes the median, which has the property of minimizing the sum of distances for one-dimensional data, and provides a central tendency in higher dimensions. It is also known as the 1-median, spatial median, Euclidean minisum point, or Torricelli point. The geometric median is an important estimator of location in statistics, where it is also known as the L1 estimator (after the L1 norm).
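The text above defines the geometric median but gives no procedure for computing it. One commonly used approach is Weiszfeld's algorithm, an iteratively re-weighted average; the sketch below is a simplified version of it, not a method taken from the text. The function name, iteration limit, and tolerance are assumptions, and sample points that coincide with the current estimate are simply skipped to avoid division by zero, which is a simplification. It requires Python 3.8 or later for `math.dist`.

```python
import math


def geometric_median(points, iterations=200, eps=1e-12):
    """Approximate the point minimizing the sum of Euclidean distances to `points`."""
    dim = len(points[0])
    # Start from the centroid (coordinate-wise mean) of the sample points.
    current = [sum(p[i] for p in points) / len(points) for i in range(dim)]
    for _ in range(iterations):
        weight_sum = 0.0
        weighted = [0.0] * dim
        for p in points:
            d = math.dist(p, current)
            if d < eps:
                # Estimate sits on a sample point; skip it to avoid dividing by zero.
                continue
            w = 1.0 / d
            weight_sum += w
            for i in range(dim):
                weighted[i] += w * p[i]
        if weight_sum == 0.0:
            break  # every point coincides with the estimate
        new = [x / weight_sum for x in weighted]
        converged = math.dist(new, current) < eps
        current = new
        if converged:
            break
    return current


if __name__ == "__main__":
    # Corners of a unit square: by symmetry the result should be close to the center.
    print(geometric_median([(0, 0), (1, 0), (0, 1), (1, 1)]))
```

For one-dimensional input the result should approach the ordinary median, consistent with the generalization described above.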
Circular chromosome: A circular chromosome is a chromosome in bacteria, archaea, mitochondria, and chloroplasts, in the form of a molecule of circular DNA, unlike the linear chromosome of most eukaryotes. Most prokaryote chromosomes contain a circular DNA molecule – there are no free ends to the DNA. Free ends would otherwise create significant challenges to cells with respect to DNA replication and stability. Cells that do contain chromosomes with DNA ends, or telomeres (most eukaryotes), have acquired elaborate mechanisms to overcome these challenges.
Chromosome 7: Chromosome 7 is one of the 23 pairs of chromosomes in humans, who normally have two copies of this chromosome. Chromosome 7 spans about 160 million base pairs (the building material of DNA) and represents between 5 and 5.5 percent of the total DNA in cells. The following are some of the gene count estimates of human chromosome 7. Because researchers use different approaches to genome annotation, their predictions of the number of genes on each chromosome vary (for technical details, see gene prediction).
Mitochondrial DNA: Mitochondrial DNA (mtDNA or mDNA) is the DNA located in mitochondria, cellular organelles within eukaryotic cells that convert chemical energy from food into a form that cells can use, such as adenosine triphosphate (ATP). Mitochondrial DNA is only a small portion of the DNA in a eukaryotic cell; most of the DNA can be found in the cell nucleus and, in plants and algae, also in plastids such as chloroplasts. Human mitochondrial DNA was the first significant part of the human genome to be sequenced.
NP-hardness: In computational complexity theory, NP-hardness (non-deterministic polynomial-time hardness) is the defining property of a class of problems that are informally "at least as hard as the hardest problems in NP". A simple example of an NP-hard problem is the subset sum problem. A more precise specification is: a problem H is NP-hard when every problem L in NP can be reduced in polynomial time to H; that is, assuming a solution for H takes 1 unit time, H's solution can be used to solve L in polynomial time.
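To make the subset sum example concrete, the following Python sketch searches for a non-empty subset of the given numbers that adds up to a target. It is a brute-force illustration only: it may examine every one of the exponentially many subsets, and it says nothing about the reductions that define NP-hardness. The function name and the sample numbers are assumptions made for this example.

```python
from itertools import combinations


def subset_sum(numbers, target):
    """Return a non-empty subset of `numbers` summing to `target`, or None if none exists."""
    # Try subsets in order of increasing size; in the worst case this
    # visits all 2**n - 1 non-empty subsets.
    for size in range(1, len(numbers) + 1):
        for combo in combinations(numbers, size):
            if sum(combo) == target:
                return list(combo)
    return None


if __name__ == "__main__":
    # Hypothetical instance of the problem.
    print(subset_sum([3, 34, 4, 12, 5, 2], 9))
```

Checking a proposed subset takes only a single summation, whereas finding one by this method can take time exponential in the number of inputs.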
Yeast artificial chromosome: Yeast artificial chromosomes (YACs) are genetically engineered chromosomes derived from the DNA of the yeast Saccharomyces cerevisiae, which is then ligated into a bacterial plasmid. By inserting large fragments of DNA, from 100–1000 kb, the inserted sequences can be cloned and physically mapped using a process called chromosome walking. This is the process that was initially used for the Human Genome Project; however, due to stability issues, YACs were abandoned in favor of bacterial artificial chromosomes. The bakers' yeast S.
Y chromosome: The Y chromosome is one of two sex chromosomes in therian mammals and other organisms. The other sex chromosome is the X chromosome. Y is normally the sex-determining chromosome in many species, since it is the presence or absence of Y that determines the male or female sex of offspring produced in sexual reproduction. In mammals, the Y chromosome contains the gene SRY, which triggers male development. The DNA in the human Y chromosome is composed of about 62 million base pairs, making it similar in size to chromosome 19.
Gene: In biology, the word gene (from γένος, génos; meaning generation or birth or gender) can have several different meanings. The Mendelian gene is a basic unit of heredity and the molecular gene is a sequence of nucleotides in DNA that is transcribed to produce a functional RNA. There are two types of molecular genes: protein-coding genes and noncoding genes. During gene expression, the DNA is first copied into RNA. The RNA can be directly functional or be the intermediate template for a protein that performs a function.
Genome project: Genome projects are scientific endeavours that ultimately aim to determine the complete genome sequence of an organism (be it an animal, a plant, a fungus, a bacterium, an archaean, a protist or a virus) and to annotate protein-coding genes and other important genome-encoded features. The genome sequence of an organism includes the collective DNA sequences of each chromosome in the organism. For a bacterium containing a single chromosome, a genome project will aim to map the sequence of that chromosome.
Computational complexity theory: In theoretical computer science and mathematics, computational complexity theory focuses on classifying computational problems according to their resource usage, and relating these classes to each other. A computational problem is a task solved by a computer. A computational problem is solvable by mechanical application of mathematical steps, such as an algorithm. A problem is regarded as inherently difficult if its solution requires significant resources, whatever the algorithm used.
Homologous chromosome: A pair of homologous chromosomes, or homologs, is a set of one maternal and one paternal chromosome that pair up with each other inside a cell during fertilization. Homologs have the same genes in the same loci, which provide points along each chromosome that enable a pair of chromosomes to align correctly with each other before separating during meiosis. This is the basis for Mendelian inheritance, which characterizes inheritance patterns of genetic material from an organism to its offspring.
Human Genome Project: The Human Genome Project (HGP) was an international scientific research project with the goal of determining the base pairs that make up human DNA, and of identifying, mapping and sequencing all of the genes of the human genome from both a physical and a functional standpoint. It started in 1990 and was completed in 2003. It remains the world's largest collaborative biological project. Planning for the project started after it was adopted in 1984 by the US government, and it officially launched in 1990.