Proteins glossary
Evolving terminology for emerging technologies
Suggestions? Comments? Questions? mchitty@healthtech.com
Last revised December 27, 2001 
Proteins are the main catalysts, structural elements, signaling messengers and molecular machines of biological tissues. [David Eisenberg et al. “Protein function in the post-genomic era” Nature 405: 823-826, 15 June 2000]

Related glossaries include Applications: Proteomics, Sequencing, Structural genomics; Technologies: Chromatography & electrophoresis, Mass Spectrometry; Biology: Biomolecules, Expression, Protein Structure, Sequences, DNA & beyond. Classes of proteins appear (with definitions) in the In-depth glossary, after the Bibliography.

amino acids: Organic compounds that generally contain an amino (-NH2) and a carboxyl (-COOH) group. Twenty alpha-amino acids are the subunits which are polymerized to form proteins. [MeSH]

A building block of proteins. There are 20 different kinds of amino acids; a protein consists of a specific sequence of amino acids. [NIGMS]

amino acid sequence:  The order of amino acids as they occur in a polypeptide chain. This is referred to as the primary structure of proteins. It is of fundamental importance in determining protein conformation. [MeSH] Related term Sequencing sequence homology

amino terminus: See N-terminus. Compare carboxyl terminus.

amino terminus domain: In protein-protein interactions, the N-terminal domain binds to specific DNA sequences. [S. Fields and O. Song “Novel genetic system to detect protein-protein interactions” Nature 340: 245-246, July 20 1989]

apoptosis: Cell biology glossary

characterization - protein: Information that can be gathered from a protein chip-based characterization includes post-translational modifications (phosphorylation, glycosylation, biotinylation, ADP-ribosylation) (Cardone et al. 1998), C-terminal sequencing, and epitope (binding site) mapping (Hinshelwood et al. 1999). [CHI Summit Proteomics] Broader term characterization Biomolecules glossary

C-terminus: The residue that has a free carboxyl group, or at least does not acylate another amino acid residue (it may, for example, acylate ammonia to give -NH-CHR-CO-NH2), is called C-terminal. [IUPAC Bioinorganic “amino acid residue in a polypeptide”] Also called the carboxyl terminus.

carboxyl terminus: See C-terminus. Contrast with N-terminus.

carboxyl terminus domain: In protein-protein interactions, the C-terminal domain contains acidic regions necessary to activate transcription. [S. Fields and O. Song “Novel genetic system” Nature 340: 245-246, July 20 1989]

co-precipitation: Method to identify interacting proteins by using antibodies to bind to the protein; if immunoprecipitated under non-denaturing conditions … (allow any other proteins bound to the protein known to be involved in a process) to precipitate rather than be washed away. [CHI Proteomics]

cross-annotation: Comparing gels of similar samples and the information attached to them for verification based solely upon mobility [CHI Proteomics]

denaturation: The process of partial or total alteration of the native structure of a macromolecule resulting from the loss of tertiary or tertiary and secondary structure that is a consequence of the disruption of stabilizing weak bonds. Denaturation can occur when proteins and nucleic acids are subjected to elevated temperature or to extremes of pH, or to non-physiological concentrations of salt, organic solvents, urea or other chemical agents. [IUPAC Biotech]

depletion: Method of sample preparation which removes high-abundance proteins (not of interest) from the sample. [CHI Proteomics] Related term low-abundance proteins.

enzymes: Pharmaceutical biology glossary

glycosylation: The adding of a polysaccharide (chain of sugars) to a polypeptide (chain of amino acids) in order to make a glycoprotein. This is done within the endoplasmic reticulum while the polypeptide is being made. [PhytoProtein Biotech Glossary 2000] http://www.phytoprotein.com/glossary.html

The chemical or mechanical addition of carbohydrate or glycosyl groups to other chemicals, especially peptides or proteins. [Metathesaurus] Broader term post-translational modifications.

hypothetical proteins: All predicted protein sequences lacking any significant sequence similarity to characterised proteins are labeled as ‘hypothetical proteins'. The majority of these cases come from the genome sequencing projects. ["SWISS-PROT" in Introduction to Molecular Biology Databases, R. Apweiler, R. Lopez, B. Marx, 1999] http://www.ebi.ac.uk/swissprot/Publications/mbd1.html

Laser Capture Microdissection LCM: Cell biology glossary. Can be used to help isolate low-abundance proteins.

methylation: Attachment of methyl groups (-CH3) to DNA most commonly at cytosine residues. May be involved in regulation of gene expression. Also may prevent some restriction endonucleases from cutting DNA at their recognition sites. [Schwindlein] 

DNA Methylation Society http://www.methdb.de/

Related terms Functional genomics Post Translational Gene Silencing; Gene definitions epigenetics; Omes & omics epigenomics, epigenotype 

molecular motors: Miniaturization glossary

N-terminus: The residue in a peptide that has an amino group that is free, or at least not acylated by another amino acid residue (it may, for example, be acylated or formylated), is called N-terminal; it is the N-terminus. [IUPAC Bioinorganic] Also called the amino terminus.

peptide aptamers: Functional genomics glossary

Peptide Mass Fingerprinting PMF: A means of protein identification, using mass spectrometry.
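As a rough illustration of how peptide mass fingerprinting is commonly carried out (this goes beyond the one-line definition above, so treat it as an assumption): a protein is cleaved with a protease such as trypsin, the masses of the resulting peptides are measured, and those masses are compared with masses predicted from candidate sequences. The Python sketch below uses standard monoisotopic residue masses; the function names, the simplified trypsin rule (cut after K or R, but not before P), the toy sequences and the 0.01 Da tolerance are all hypothetical choices made for illustration.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water per free peptide (H on the N-terminus, OH on the C-terminus)


def tryptic_peptides(sequence):
    """Simplified trypsin rule (assumption): cut after K or R unless the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        last = i == len(sequence) - 1
        if last or (residue in "KR" and sequence[i + 1] != "P"):
            peptides.append(sequence[start:i + 1])
            start = i + 1
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass: sum of residue masses plus one water."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER


def count_matches(observed_masses, sequence, tolerance=0.01):
    """Count observed masses that fall within `tolerance` Da of a predicted peptide mass."""
    predicted = [peptide_mass(p) for p in tryptic_peptides(sequence)]
    return sum(
        any(abs(obs - pred) <= tolerance for pred in predicted)
        for obs in observed_masses
    )


def best_candidate(observed_masses, candidates):
    """Return the name of the candidate sequence with the most matching peptide masses."""
    return max(candidates, key=lambda name: count_matches(observed_masses, candidates[name]))
```

With toy candidates such as {"toy_A": "GAKSLR", "toy_B": "MKWVTFR"}, a peak list taken from the peptides of toy_A would be expected to select toy_A. Real tools also handle missed cleavages, modifications and charge states, which this sketch ignores.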

peptide sequencing: How is this different from protein sequencing (except that peptides are shorter than proteins)?

peptides: Amides derived from two or more amino carboxylic acid molecules (the same or different) by formation of a covalent bond from the carbonyl carbon of one to the nitrogen atom of another with formal loss of water. The term is usually applied to structures formed from α-amino acids, but it includes any amino carboxylic acid. [IUPAC Compendium]

phosphorylation: A process involving the transfer of a phosphate group (catalyzed by enzymes) from a donor to a suitable acceptor. [IUPAC Bioinorganic] Broader term post-translational modifications.

polyadenylation:

post-translational modifications: Proteins, once synthesized on the ribosomes, are subject to a multitude of modification steps. They are cleaved (thus eliminating signal sequences, transit or pro-peptides and initiator methionines); many simple chemical groups can be attached to them … as well as some more complex molecules, such as sugars and lipids. Finally they can be internally or externally cross-linked. More than a hundred different types of post-translational modifications are currently known (Aug. 1999) and many more are yet to be discovered. The complexity due to all these modifications is compounded by the high level of diversity that alternative splicing can produce at the level of sequence. Thus the number of different protein molecules expressed by the human genome is probably closer to a million than to the hundred thousand generally considered by genome scientists. [Human Proteomics Initiative, A Bairoch] http://www.expasy.ch/sprot/hpi/ Narrower terms biotinylation, glycosylation, phosphorylation, prenylation; Related terms post-translational protein processing, proteolytic processing.

post-translational protein processing: Any of various enzymically catalyzed post-translational modifications of peptides or proteins in the cell of origin. These modifications include carboxylation, hydroxylation, acetylation, glycosylation, methylation, phosphorylation, oxidation-reduction, degradation and lysis, peptide bond formation, and changes in molecular weight and electrophoretic motility. [MeSH] Related terms post-translational modifications; Cell biology Golgi apparatus

prefractionation: Sample preparation method, capable of being automated.

prenylation: Attachment of an isoprenoid to the C-terminal cysteine residue. [SWISS-PROT keywords] http://www.expasy.ch/cgi-bin/get-entries?KW=Prenylation

primary structure: In the context of macromolecules such as proteins, the constitutional formula, usually abbreviated to a statement of the sequence and if appropriate cross-linking of chains. [IUPAC Compendium] See also amino acid sequence.

probable protein (similarity): When a protein exhibits extensive sequence similarity to a characterised protein and/or has the same conserved regions then the label ‘probable' is used in the DE line. ["SWISS-PROT" in Introduction to Molecular Biology Databases, R. Apweiler, R. Lopez, B. Marx, 1999] http://www.ebi.ac.uk/swissprot/Publications/mbd1.html Related term In-depth putative protein. SWISS-PROT DNA Humor [cartoon] http://au.expasy.org/sprot/terminology.html

protein: See proteins

protein arrays: Microarrays glossary

protein chips: Microarrays glossary

protein detection: New fluorescent stains (such as Sypro) have improved both the dynamic range of protein detection and protein quantification in 2D gels. ["Current State of Proteomic Technology" CHI's GenomeLink 3, 2001] http://www.chiresource.com/newsarticles/issue3_1.ASP

protein engineering: A technique used to produce proteins with altered or novel amino acid sequences. The methods used are: 1. Transcription and translation systems from synthesized lengths of DNA or RNA with novel sequences. 2. Chemical modification of 'normal' proteins. 3. Solid-state polypeptide synthesis to form proteins. [IUPAC Compendium]

protein expression: Expression glossary

protein identification: Has traditionally been done by electrophoresis.  Mass spectrometry is increasingly used.

protein informatics: Proteomics glossary

protein inhibition: An alternative approach to [gene expression] downregulation, but in this case, the protein, not the gene, is the target. As with downregulation of gene expression, protein inhibition is a powerful target validation tool. The major approach to protein inhibition is based on phage libraries, which are used to select antibodies against targets of interest. [CHI Breaking Bottlenecks]

protein localization: There is an ever-increasing number of genes that have been sequenced but are of completely unknown function. The ability to determine the location of such gene products within the cell, either by the use of antibodies or by the production of chimeras with green fluorescent protein, is a vital step towards understanding what they do. This is one major reason why fluorescence microscopy is enjoying a revival. [Protein Localization by Fluorescence Microscopy: A Practical Approach, edited by Victoria J. Allan, Oxford University Press, 2000] http://www.oup-usa.org/isbn/0199637415.html

To a large degree, the function of a protein can be inferred from its cellular compartmentalization and its interactions with other cellular components. Therefore, information on the subcellular distribution of a protein is crucial to determine its overall role in the cell. [Gene Expression and Protein Localization in Arabidopsis, Cold Spring Harbor Arabidopsis Molecular Genetics Course, July 1996, Dr. Albrecht von Arnim] http://stein.cshl.org/atir/biology/protocols/cshl-course/7-gene_expression.html Related terms Cell biology subcellular fractionation; gene localization

protein microarrays: Microarrays glossary

protein purification: John Wagner's Logic of Molecular Approaches to Biological Problems (Cornell Univ. Graduate School of Medical Science, US ) has a section on the value of protein purification. http://www-users.med.cornell.edu/~jawagne/proteins_%26_purification.html

protein regulation:

protein sequence: See amino acid sequence.

protein synthesis: See transcription, translation. Sequences, DNA & beyond glossary

proteins: Naturally occurring and synthetic polypeptides having molecular weights greater than about 10,000 (the limit is not precise). [IUPAC Compendium]

Polymers of amino acids linked by peptide bonds. The specific sequence of amino acids determines the shape and function of the protein. [MeSH]

Narrower terms In-depth: basic proteins, checkpoint control proteins, factitious proteins, fusion proteins, gatekeeper proteins, heat shock proteins, housekeeping proteins, hydrophobic proteins, hypothetical proteins, low-abundance proteins, luxury proteins, membrane proteins, orphan proteins, probable proteins, putative proteins, secreted proteins, zinc finger proteins

Protein databases see Databases & software directory.

proteolysis: The breaking down of proteins by enzymes called proteases. [Life Sciences]

proteolytic processing: Related term proteolysis. Broader term post-translational modifications.

residue:  When two or more amino acids combine to form a peptide, the elements of water are removed, and what remains of each amino acid is called an amino acid residue. [IUPAC Bioinorganic]   Related terms C-terminus, N-terminus In-depth.
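The water bookkeeping in this definition can be sketched in Python. Joining n amino acids forms n - 1 peptide bonds and removes n - 1 water molecules, so each residue weighs one water less than the free amino acid. The monoisotopic masses below are standard values for glycine, alanine and serine; the function names and the example tripeptide are illustrative assumptions, not part of the cited definition.

```python
WATER = 18.01056  # monoisotopic mass of H2O, in daltons

# Monoisotopic masses (Da) of three free amino acids, chosen only for illustration.
FREE_AMINO_ACID_MASS = {"G": 75.03203, "A": 89.04768, "S": 105.04259}


def residue_mass(code):
    """Mass of what remains of an amino acid once the elements of water are removed."""
    return FREE_AMINO_ACID_MASS[code] - WATER


def peptide_mass(sequence):
    """Mass of a linear peptide: free amino acids minus one water per peptide bond."""
    if not sequence:
        raise ValueError("sequence must contain at least one amino acid")
    bonds = len(sequence) - 1
    return sum(FREE_AMINO_ACID_MASS[c] for c in sequence) - bonds * WATER
```

For the hypothetical tripeptide "GAS", two peptide bonds are formed, so two waters are subtracted. Equivalently, the peptide mass is the sum of the three residue masses plus one water for the free N- and C-termini.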

sequence homology: Sequencing glossary

subcellular localization: A key functional characteristic of proteins. To co-operate for a common physiological function (metabolic pathway, signal transduction cascade, structural associate etc.), proteins must be localized in the same cellular compartment. [Frank Eisenhaber and Peer Bork, "Subcellular Localization: Automatic analysis of SWISS-PROT annotations with Meta_A(nnotator)", 2000] http://mendel.imp.univie.ac.at/CELL_LOC/

transcription: Sequences, DNA & beyond glossary.

transcription factors: Sequences, DNA & beyond glossary.

translation:  Sequences, DNA & beyond glossary

Another approach to downregulating gene expression (see also antisense, ribozymes, RNAi). 

wild-type proteins: Native proteins, as found in the wild. This seems analogous to wild-type [genes] (Genetic variations glossary). Are there any other implications?

Bibliography

"Proteins" Kimball's Biology Pages, John W. Kimball, 1999 http://www.ultranet.com/~jkimball/BiologyPages/P/Proteins.html

SWISS-PROT keywords, Swiss Institute of Bioinformatics, Geneva, Switzerland and European Bioinformatics Institute, Hinxton, UK, 2001, 800+ definitions. http://www.expasy.ch/cgi-bin/keywlist.pl

IUPAC definitions are reprinted with the permission of the International Union of Pure and Applied Chemistry.

In-depth Proteins glossary

basic proteins: Alkaline proteins, pI approximately above 7.0-7.5 pH.

checkpoint control proteins: Proteins that control passage through critical stages of the cell cycle; these might, for example, halt passage through the cell cycle in the case of DNA damage. [CHI Functional Genomics]

chromosomal proteins, non-histone: Nucleoproteins which in contrast to histones are acid insoluble. They are involved in chromosomal functions; e.g. they bind selectively to DNA, stimulate transcription resulting in tissue-specific RNA synthesis and undergo specific changes in response to various hormones or phytomitogens. [MeSH]

co-immunoprecipitation: Used to determine protein-protein interactions. An antibody is used to precipitate a protein along with bound proteins. [John Yates, “Mass spectrometry and the Age of the Proteome” Journal of Mass Spectrometry 33: 16, 1998]

complex proteins: Complex proteins usually have more than one folding domain, each involving a sequence of 100 to 300 amino acids. The entire folding architecture of a complex protein must be precisely constructed in order for protein functionality to exist. [Science Week 1998] http://scienceweek.com/search/wobetio.htm

Proteins may consist of a single polypeptide chain, as myoglobin does, or of multiple chains linked by disulfide bonds; the two chains of insulin are joined by two disulfide bonds. More complex proteins may consist of multiple chains held together by noncovalent forces. Some protein molecules contain organic structures which are not polypeptide chains. Hemoglobin, for example, includes the additional iron-containing heme group which is essential for its transport of oxygen. [JA Plambeck, Dept. of Chemistry, Univ. of Alberta] http://www.chem.ualberta.ca/~plambeck/che/p265/p06232.htm

Is there a precise definition of complex proteins? Ones with more than two disulfide bonds? More than one folding domain? The Nature issue with the human genome sequence noted that "Humans have an unusually high number of complex proteins that fit into more than one functional category". Various sources describe categories of "simple proteins", "conjugated proteins" and "derived proteins". Enzymes are identified as complex proteins.

cytokines: Non-antibody proteins secreted by inflammatory leukocytes and some non-leukocytic cells, that act as intercellular mediators. They differ from classical hormones in that they are produced by a number of tissue or cell types rather than by specialized glands. They generally act locally in a paracrine or autocrine rather than endocrine manner. [MeSH]

Not really different from hormones, but the term tends to be used as a convenient generic shorthand for interleukins, lymphokines and several related signalling molecules such as TNF [Tumor Necrosis Factor] and interferons … Rather an imprecise term, though in very common usage.  [Lackie] 

Compare growth factors

Horst Ibelgaufts' Cytokines Online Pathfinder Encyclopaedia, 1999 http://www.copewithcytokines.de/

directed protein evolution: Directed evolution was developed in the early 1980s by Stuart Kauffman and Marc Ballivet, using recombinant DNA techniques to create enormously diverse libraries of DNA, RNA, proteins and peptides. Ixys was issued a patent (March 7, 1998), exclusively licensed from Kauffman and Ballivet, on this approach of generating libraries of random proteins that can be produced in cells. [Nature Biotechnology 16:411, May 1998] Related term phage display Functional genomics glossary

factitious protein: A product of genetic engineering; a protein designed for a specific purpose or for its expected properties. [Glick] Factitious implies contrived rather than natural.

full-length proteins: 

fusion proteins, recombinant: Proteins that are the result of genetic engineering. A regulatory part or promoter of one or more genes is combined with a structural gene. The fusion protein is formed after transcription and translation of the fused gene. This type of fusion protein is used in the study of gene regulation or structure-activity relationships. They might also be used clinically as targeted toxins (immunotoxins). [MeSH] Related term cell fusion.

gatekeeper protein: A protein that monitors transfer of a protein from the endoplasmic reticulum to the Golgi apparatus and prevents transfer of newly synthesized proteins with inappropriate conformations or with unpaired thiol groups. [Glick]

growth factors: This collective term originally referred to substances that promote cell growth. It is used rather loosely now, comprising molecules that function as growth stimulators (mitogens) but also as growth inhibitors (sometimes referred to as negative growth factors), factors that stimulate cell migration (see: Motogenic cytokines) or function as chemotactic agents (see also: Chemotaxis) or inhibit cell migration or invasion of tumor cells, factors that modulate differentiated functions of cells, factors involved in apoptosis, or factors that promote survival of cells without influencing growth and differentiation. ... In many instances the term is used as a synonym for cytokines. [Horst Ibelgaufts' Cytokines Online Pathfinder Encyclopaedia, 1999] http://www.copewithcytokines.de/

Narrower term IGF-1 (Insulin-like Growth Factor 1). Compare cytokines.

heat shock proteins: Proteins which are synthesized in eukaryotic organisms and bacteria in response to hyperthermia and other environmental stresses. They increase thermal tolerance and perform functions essential to cell survival under these conditions. [MeSH]

housekeeping proteins: Highly expressed proteins >10,000 copies per cell. [Blackstock & Weir “Proteomics” Trends in Biotechnology: 121 Mar 1999] Universal proteins.  Not the proteins of greatest interest, which are often of low abundance.

hydrophobic proteins: Repel water. Related term membrane proteins. Protein structure glossary

hypothetical proteins: Many of the gene products of completely sequenced organisms are “hypothetical” – they cannot be related to any previously characterized proteins – and so are of completely unknown function. … As each [completely sequenced] organism’s genome is analyzed about one third of the observed open reading frames (ORFs), although conserved among several organisms, encode for “hypothetical” proteins that cannot be related to other proteins of known function or structure. Understanding the physiological function of the protein products of these so-called ‘orphan’ genes has emerged as a major challenge. [E Eisenstein et al “Biological function made crystal clear – annotation of hypothetical proteins via structural genomics” Current Opinion in Biotechnology 11(1): 25-30 Feb. 2000]

Predicted protein for which there is no experimental evidence that it is expressed in vivo. These predictions are from computer programs only. [SWISS-PROT keywords] http://www.expasy.ch/cgi-bin/get-entries?KW=Hypothetical%20protein

Insulin-like Growth Factor 1 IGF-1: An “intermediate” hormone in growth. IGF-I transmits messages from the sender (growth hormone) to the receiver (the tissues). When the body’s tissues receive the IGF-I message, the cells grow. Since IGF-I is produced by the tissues when instructed by growth hormone, the amount of IGF-I in the blood can act as a clue to the amount of GH [growth hormone] the body is making. [Humatrope Glossary, Eli Lilly & Co. 2000] http://www.humatrope.com/Glossary.htm

interferons: A class of glycoproteins (with sugar groups attached at specific locations) important in immune function. They are able to inhibit the multiplication of viruses in cells. [IUPAC Biotech, IUPAC Compendium] 

low-abundance proteins: Often the proteins of greatest interest, but difficult to detect because more abundant proteins predominate. See related terms depletion In-depth Proteins; Laser Capture Microdissection Cell biology; sample prep Technologies overview and Assays, labels, signaling & detection glossary.

luxury proteins: I haven't found any dictionary entries for “luxury proteins” (cf. “housekeeping genes” and “luxury genes” in Gene definitions) but did find a few web references in which luxury proteins are clearly those produced by specific cells.

membrane proteins: Protein Structure glossary

nuclear proteins: Proteins found in the nucleus of a cell. Do not confuse with NUCLEOPROTEINS which are proteins conjugated with nucleic acids, that are not necessarily present in the nucleus. [MeSH]

Any other way of characterizing?

nucleoproteins: Proteins conjugated with nucleic acids. [MeSH]

orphan proteins: Receptors which have not been matched with the hormones which activate them. [Graeme Milligan, Institute of Biomedical and Life Sciences, Univ. of Glasgow, Scotland]  http://www.gla.ac.uk/ibls/about/le72.htm

Proteins without sequence (and/or structural?) similarity to previously characterized proteins.

proteases: Enzymes that catalyse the hydrolysis of proteins. Usually several proteolytic enzymes are necessary for the complete breakdown of polypeptides to their amino acids. [IUPAC Biotech, IUPAC Compendium]

putative proteins: There is evidence within the sequence data but we do not want to classify indefinitely until experimental proof is available [V Junker "SWISS-PROT annotation: how is biochemical information assigned to sequence entries" Feb. 2000]   http://www.expasy.ch/cgi-bin/lists?annbioch.txt

The label ‘putative' is used in the DE [descriptor] line of proteins that exhibit limited sequence similarity to characterised proteins. These proteins often have a conserved site, e.g. an ATP-binding site, but no other significant similarity to a characterised protein. It is most frequently used for sequences from genome projects. The assignment of the labels ‘probable' and ‘putative' is dependent primarily on the results of sequence similarity searches against SWISS-PROT. It is important to point out here that no specific cut-off point is used to assign a protein as ‘putative' or ‘probable'. ["SWISS-PROT" in Introduction to Molecular Biology Databases, R. Apweiler, R. Lopez, B. Marx, 1999] http://www.ebi.ac.uk/swissprot/Publications/mbd1.html Related term probable protein (similarity)

secreted proteins: Encoded (usually) by genes with signal sequences, and such proteins include potential therapeutic proteins such as hormones, cytokines, and growth factors. [CHI Functional Genomics]

[Are] of particular interest [as biomarkers] since it is possible they would be detectable in body fluids, highly advantageous for a diagnostic marker. [Proteome Sciences, UK "Cancer" 2001] http://www.proteome.co.uk/research/cancer.html

Zinc Finger Proteins ZFP: A domain, found in certain DNA-binding proteins, comprising a helix-loop structure in which a zinc ion is coordinated to 2-4 cysteine sulfurs, the remaining ligands being histidines. In many proteins of this type the domain is repeated several times. [IUPAC Bioinorganic]

ZFPs are naturally occurring, zinc-containing DNA-binding proteins that serve as transcription factors. Researchers have discovered the rules by which ZFPs recognize specific DNA sequences. This knowledge allows them to rapidly generate proteins that selectively regulate target genes of interest. Genetic constructs that code for these ZFPs can be transfected into cells in culture or into animals, resulting in the downregulation or upregulation of target genes. [CHI Breaking Bottlenecks]
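Because the zinc ligands in these domains are cysteines and histidines at fairly regular spacings, candidate zinc finger domains are often located by scanning a sequence for a spacing pattern. The Python sketch below uses a regular expression modeled on the widely used C2H2 signature (two cysteines, two histidines). The exact spacings, the function name and the test sequence are assumptions for illustration and should be checked against a motif database before any real use.

```python
import re

# C2H2-style pattern (assumed spacings): Cys, 2-4 residues, Cys, 3 residues,
# one hydrophobic residue, 8 residues, His, 3-5 residues, His.
C2H2_PATTERN = re.compile(r"C.{2,4}C.{3}[LIVMFYWC].{8}H.{3,5}H")


def find_zinc_finger_candidates(sequence):
    """Return (start_index, matched_text) for each non-overlapping pattern match."""
    return [(m.start(), m.group()) for m in C2H2_PATTERN.finditer(sequence)]
```

On the synthetic sequence "MKCAECGKAFSRSDELTRHIRIHTG" the pattern matches a 21-residue stretch starting at index 2. A match only flags a candidate: it says nothing about whether zinc is actually bound, and overlapping matches are not reported.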


Cambridge Healthtech Institute
1037 Chestnut Street
Newton Upper Falls, MA 02464
Phone: 617-630-1300
Fax: 617-630-1325
Email: chi@healthtech.com