Zinc fingerA zinc finger is a small protein structural motif that is characterized by the coordination of one or more zinc ions (Zn2+) which stabilizes the fold. It was originally coined to describe the finger-like appearance of a hypothesized structure from the African clawed frog (Xenopus laevis) transcription factor IIIA. However, it has been found to encompass a wide variety of differing protein structures in eukaryotic cells.
Conserved sequenceIn evolutionary biology, conserved sequences are identical or similar sequences in nucleic acids (DNA and RNA) or proteins across species (orthologous sequences), or within a genome (paralogous sequences), or between donor and receptor taxa (xenologous sequences). Conservation indicates that a sequence has been maintained by natural selection. A highly conserved sequence is one that has remained relatively unchanged far back up the phylogenetic tree, and hence far back in geological time.
Protein tyrosine phosphataseProtein tyrosine phosphatases (EC 3.1.3.48, systematic name protein-tyrosine-phosphate phosphohydrolase) are a group of enzymes that remove phosphate groups from phosphorylated tyrosine residues on proteins: [a protein]-tyrosine phosphate + H2O = [a protein]-tyrosine + phosphate Protein tyrosine (pTyr) phosphorylation is a common post-translational modification that can create novel recognition motifs for protein interactions and cellular localization, affect protein stability, and regulate enzyme activity.
PfamPfam is a database of protein families that includes their annotations and multiple sequence alignments generated using hidden Markov models. The most recent version, Pfam 35.0, was released in November 2021 and contains 19,632 families. The general purpose of the Pfam database is to provide a complete and accurate classification of protein families and domains. Originally, the rationale behind creating the database was to have a semi-automated method of curating information on known protein families to improve the efficiency of annotating genomes.
Multiple sequence alignmentMultiple sequence alignment (MSA) may refer to the process or the result of sequence alignment of three or more biological sequences, generally protein, DNA, or RNA. In many cases, the input set of query sequences are assumed to have an evolutionary relationship by which they share a linkage and are descended from a common ancestor. From the resulting MSA, sequence homology can be inferred and phylogenetic analysis can be conducted to assess the sequences' shared evolutionary origins.
Protein superfamilyA protein superfamily is the largest grouping (clade) of proteins for which common ancestry can be inferred (see homology). Usually this common ancestry is inferred from structural alignment and mechanistic similarity, even if no sequence similarity is evident. Sequence homology can then be deduced even if not apparent (due to low sequence similarity). Superfamilies typically contain several protein families which show sequence similarity within each family.
InterProInterPro is a database of protein families, protein domains and functional sites in which identifiable features found in known proteins can be applied to new protein sequences in order to functionally characterise them. The contents of InterPro consist of diagnostic signatures and the proteins that they significantly match. The signatures consist of models (simple types, such as regular expressions or more complex ones, such as Hidden Markov models) which describe protein families, domains or sites.
CASPCritical Assessment of Structure Prediction (CASP), sometimes called Critical Assessment of Protein Structure Prediction, is a community-wide, worldwide experiment for protein structure prediction taking place every two years since 1994. CASP provides research groups with an opportunity to objectively test their structure prediction methods and delivers an independent assessment of the state of the art in protein structure modeling to the research community and software users.
Protein familyA protein family is a group of evolutionarily related proteins. In many cases, a protein family has a corresponding gene family, in which each gene encodes a corresponding protein with a 1:1 relationship. The term "protein family" should not be confused with family as it is used in taxonomy. Proteins in a family descend from a common ancestor and typically have similar three-dimensional structures, functions, and significant sequence similarity.
Receptor tyrosine kinaseReceptor tyrosine kinases (RTKs) are the high-affinity cell surface receptors for many polypeptide growth factors, cytokines, and hormones. Of the 90 unique tyrosine kinase genes identified in the human genome, 58 encode receptor tyrosine kinase proteins. Receptor tyrosine kinases have been shown not only to be key regulators of normal cellular processes but also to have a critical role in the development and progression of many types of cancer.
Immunoglobulin superfamilyThe immunoglobulin superfamily (IgSF) is a large protein superfamily of cell surface and soluble proteins that are involved in the recognition, binding, or adhesion processes of cells. Molecules are categorized as members of this superfamily based on shared structural features with immunoglobulins (also known as antibodies); they all possess a domain known as an immunoglobulin domain or fold.
Helix bundleA helix bundle is a small protein fold composed of several alpha helices that are usually nearly parallel or antiparallel to each other. Three-helix bundles are among the smallest and fastest known cooperatively folding structural domains. The three-helix bundle in the villin headpiece domain is only 36 amino acids long and is a common subject of study in molecular dynamics simulations because its microsecond-scale folding time is within the timescales accessible to simulation.
Fragment crystallizable regionThe fragment crystallizable region (Fc region) is the tail region of an antibody that interacts with cell surface receptors called Fc receptors and some proteins of the complement system. This region allows antibodies to activate the immune system, for example, through binding to Fc receptors. In IgG, IgA and IgD antibody isotypes, the Fc region is composed of two identical protein fragments, derived from the second and third constant domains of the antibody's two heavy chains; IgM and IgE Fc regions contain three heavy chain constant domains (CH domains 2–4) in each polypeptide chain.
Fusion proteinFusion proteins or chimeric (kī-ˈmir-ik) proteins (literally, made of parts from different sources) are proteins created through the joining of two or more genes that originally coded for separate proteins. Translation of this fusion gene results in a single or multiple polypeptides with functional properties derived from each of the original proteins. Recombinant fusion proteins are created artificially by recombinant DNA technology for use in biological research or therapeutics.
Protein structureProtein structure is the three-dimensional arrangement of atoms in an amino acid-chain molecule. Proteins are polymers - specifically polypeptides - formed from sequences of amino acids, which are the monomers of the polymer. A single amino acid monomer may also be called a residue, which indicates a repeating unit of a polymer. Proteins form by amino acids undergoing condensation reactions, in which the amino acids lose one water molecule per reaction in order to attach to one another with a peptide bond.
Intrinsically disordered proteinsIn molecular biology, an intrinsically disordered protein (IDP) is a protein that lacks a fixed or ordered three-dimensional structure, typically in the absence of its macromolecular interaction partners, such as other proteins or RNA. IDPs range from fully unstructured to partially structured and include random coil, molten globule-like aggregates, or flexible linkers in large multi-domain proteins. They are sometimes considered as a separate class of proteins along with globular, fibrous and membrane proteins.
ArchaeaArchaea (ɑrˈkiːə ; : archaeon ɑrˈkiːən ) is a domain of single-celled organisms. These microorganisms lack cell nuclei and are therefore prokaryotes. Archaea were initially classified as bacteria, receiving the name archaebacteria (in the Archaebacteria kingdom), but this term has fallen out of use. Archaeal cells have unique properties separating them from the other two domains, Bacteria and Eukaryota. Archaea are further divided into multiple recognized phyla.
Cell membraneThe cell membrane (also known as the plasma membrane or cytoplasmic membrane, and historically referred to as the plasmalemma) is a biological membrane that separates and protects the interior of a cell from the outside environment (the extracellular space). The cell membrane consists of a lipid bilayer, made up of two layers of phospholipids with cholesterols (a lipid component) interspersed between them, maintaining appropriate membrane fluidity at various temperatures.
Protein quaternary structureProtein quaternary structure is the fourth (and highest) classification level of protein structure. Protein quaternary structure refers to the structure of proteins which are themselves composed of two or more smaller protein chains (also referred to as subunits). Protein quaternary structure describes the number and arrangement of multiple folded protein subunits in a multi-subunit complex. It includes organizations from simple dimers to large homooligomers and complexes with defined or variable numbers of subunits.
Protein dimerIn biochemistry, a protein dimer is a macromolecular complex or multimer formed by two protein monomers, or single proteins, which are usually non-covalently bound. Many macromolecules, such as proteins or nucleic acids, form dimers. The word dimer has roots meaning "two parts", di- + -mer. A protein dimer is a type of protein quaternary structure. A protein homodimer is formed by two identical proteins. A protein heterodimer is formed by two different proteins. Most protein dimers in biochemistry are not connected by covalent bonds.