Structural Genomics Glossary
Evolving Terminology for Emerging Technologies
Comments? Revisions? Suggestions? mchitty@healthtech.com
Last revised December 26, 2001 

The recent determination that the human genome comprises only approximately 35,000 genes - not 60,000 to 100,000 as previously thought - has  directed even more attention to the role of proteins and, therefore, to the field of structural genomics.  One goal of this field is to reveal the structures of all the key “functional” sites of any human protein, information that should make it much easier to develop highly specific drugs, thus leading to more effective, and safer, pharmaceuticals. [CHI Structural proteomics]

Almost inextricably related glossaries include Applications: Proteomics; Technologies: NMR & X-Ray Crystallography; and Biology: Protein Structures. Other related glossaries are Applications: Functional Genomics; Informatics: Algorithms & data management and Molecular Modeling; and Biology: Proteins. Additional definitions appear in the In-depth glossary, after the Bibliography.

ab initio: From the beginning (Latin).

ab initio modeling: Molecular Modeling glossary

ab initio protein modeling: Predict 3D structure from sequence without using a homologous model/ template; this technology is not at the stage of being broadly applicable to drug discovery. [CHI Structural proteomics]

Ab initio methods use the physicochemical properties of the amino acid sequence of a protein to literally calculate a 3D structure (lowest energy model) based on protein folding. As opposed to determining the structure of an entire protein, ab initio methods are typically used to predict and model protein folds (domains). This approach is gaining considerable ground, in part due to the development of novel mathematical approaches, a boost in available computational resources (for example, tera- and petaFLOPS supercomputers), and considerable interest from researchers investigating protein-ligand (or drug) interactions. [Christopher Smith "Bioinformatics, Genomics, and Proteomics" Scientist 14[23]:26, Nov. 27, 2000] Related term: protein structure prediction. http://the-scientist.com/yr2000/nov/profile_001127.html
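The idea of calculating a lowest energy model can be illustrated with the two-dimensional HP lattice model, a deliberately simplified textbook abstraction that is not taken from the sources cited in this entry. In this sketch every residue is either hydrophobic (H) or polar (P), the chain is laid out on a square grid, and each pair of H residues that touch on the grid without being chain neighbours lowers the energy by one. The function names, the move alphabet and the energy rule are assumptions made for illustration only.

```python
from itertools import product

# Hypothetical move alphabet: each letter is one step on a square lattice.
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def fold_coordinates(moves):
    """Return lattice coordinates for a chain, or None if the chain overlaps itself."""
    coords = [(0, 0)]
    for move in moves:
        dx, dy = MOVES[move]
        x, y = coords[-1]
        nxt = (x + dx, y + dy)
        if nxt in coords:  # self-avoiding walk only
            return None
        coords.append(nxt)
    return coords


def hp_energy(sequence, coords):
    """Toy energy: -1 per pair of H residues adjacent on the grid but not in the chain."""
    energy = 0
    for i in range(len(sequence)):
        for j in range(i + 2, len(sequence)):
            if sequence[i] == "H" and sequence[j] == "H":
                (xi, yi), (xj, yj) = coords[i], coords[j]
                if abs(xi - xj) + abs(yi - yj) == 1:
                    energy -= 1
    return energy


def lowest_energy_fold(sequence):
    """Enumerate every self-avoiding conformation and keep the lowest-energy one."""
    best_energy, best_moves = None, None
    for moves in product(MOVES, repeat=len(sequence) - 1):
        coords = fold_coordinates(moves)
        if coords is None:
            continue
        energy = hp_energy(sequence, coords)
        if best_energy is None or energy < best_energy:
            best_energy, best_moves = energy, "".join(moves)
    return best_energy, best_moves


if __name__ == "__main__":
    print(lowest_energy_fold("HPPH"))
```

Exhaustive enumeration visits 4^(n-1) move strings for a chain of n residues, so it is only practical for very short toy chains; that growth is one reason real ab initio methods depend on sampling heuristics and large computational resources.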

ab initio structure prediction: Prediction of a protein’s structure based on amino acid sequence alone, that is, without mapping the structure to structures of known sequences. [CHI Structural Proteomics]  Broader term: protein structure prediction (ab initio structure prediction is the narrower term by comparison).

atomic resolution data: NMR & X-ray crystallography

biological function: Functional genomics glossary

comparative modeling: See homology modeling.

evolutionary homology:  Functional genomics glossary

fold alignment: A critical step in homology modeling, because it provides the key structures for the model.  If suitably matched folds cannot be identified, a type of fold assignment known as protein threading can be used.  [CHI Structural proteomics]

fold recognition: Methods of protein fold recognition attempt to detect similarities between protein 3D structures that are not accompanied by any significant sequence similarity. There are many approaches, but the unifying theme is to try to find folds that are compatible with a particular sequence. Unlike sequence-only comparison, these methods take advantage of the extra information made available by 3D structure information. In effect, they turn the protein folding problem on its head: rather than predicting how a sequence will fold, they predict how well a fold will fit a sequence. [Robert B. Russell, Guide to Structure Prediction "Fold recognition methods and links" Sept. 1999] http://www.bmm.icnet.uk/people/rob/CCP11BBS/foldrec.html

Related terms threading; protein folding, protein folds Protein structure glossary.

foldedness: Methods for analyzing "foldedness" of expressed proteins include NMR and circular dichroism spectroscopies.

granularity: Molecular modeling glossary

Hidden Markov Models (HMM): Molecular modeling glossary

homology model: A model of a protein, whose three-dimensional structure is unknown, built from, e.g., the X-ray coordinate data of similar proteins or using alignment techniques and homology arguments.  [IUPAC Computational]  Related terms: homology Functional genomics glossary, alignment Sequencing glossary

homology modeling: A computational method for determining the structure of a protein based on its similarity to known structures. The accuracy of structures determined by homology modeling depends largely on the amount of homology between the unknown and the known protein sequence. [CHI Breaking Bottlenecks] 

The most successful tool for prediction of protein structure from sequence, but with significant room for improvement.  Related terms structural homology; sequence homology Sequencing glossary; Proteins glossary hypothetical protein; Molecular Modeling
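Because the accuracy of a homology model depends on how similar the unknown and known sequences are, a common first check is the percent identity of their alignment. The sketch below assumes the two sequences are already aligned to equal length with "-" marking gaps, and it counts identity over columns where neither sequence has a gap; that denominator is only one of several conventions in use, and the function name is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over alignment columns that contain no gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns (an assumed convention)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment of a target against a template of known structure.
    target = "ACDE-FG"
    template = "ACDQKFG"
    print(round(percent_identity(target, template), 1))
```

The score says nothing about where the differences fall; mismatches concentrated in a binding site can matter more for a model than the same number spread across surface loops.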

pharmacophore: Pharmaceutical biology glossary

protein folding problem: See protein structure prediction.

protein production: A major bottleneck and challenge in structural genomics.

protein sequence space: [J.] Maynard-Smith's (1970. Natural Selection and the concept of a protein space. Nature 225: 563-564) concept of a "protein sequence space" in which each site in an alignment is represented on its own axis and the number of axes required to represent all conceivable variants for a protein is equal to the number of sites in its sequence. Each sequence occupies a unique point in this space; variants differing at one site are adjacent (Hamming) neighbours. The collection of all viable sequence variants for a particular protein forms a localized interconnected "neighbourhood" of points within the space. This representation has proved conceptually intuitive and analytically powerful ... In protein sequence space, constraints are reflected in the multidimensional shape of the cluster of points that make up the "neighbourhood" of variants viable for a specific protein. The boundary defining the edge of this neighbourhood is characteristic of the protein's function and can be thought of as its functional "signature". [Gavin JP Naylor, "Measuring Shifts In Function and Evolutionary Opportunity Using Variability Profiles: A Case Study of the Globins"] http://bioinfo.mbb.yale.edu/e-print/protspace-jme/text.pdf
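The notion of adjacent (Hamming) neighbours in sequence space can be made concrete with a short sketch. It assumes the standard 20-letter amino acid alphabet and equal-length sequences; the function names are illustrative and do not come from the cited paper.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def hamming_distance(a, b):
    """Number of sites at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def neighbours(sequence, alphabet=AMINO_ACIDS):
    """All variants that differ from the sequence at exactly one site."""
    variants = []
    for site, original in enumerate(sequence):
        for residue in alphabet:
            if residue != original:
                variants.append(sequence[:site] + residue + sequence[site + 1:])
    return variants


if __name__ == "__main__":
    seq = "ACD"
    print(len(neighbours(seq)))            # one-step neighbours of a 3-site sequence
    print(len(AMINO_ACIDS) ** len(seq))    # total points in the 3-site space
```

A sequence with n sites has 19n one-step neighbours out of 20^n points in total, so the viable neighbourhood of any real protein occupies a vanishingly small part of the space.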

protein structure prediction: Involves primary sequence alignment, secondary and tertiary structure prediction and homology modelling.

Protein 3D structures are encoded by a linear sequence of amino acid residues. To predict 3D structure from sequence is a task challenging enough to have occupied a generation of researchers. Have we finally succeeded? The bad news is: we still cannot predict structure for any sequence. The good news is: we have come closer, and growing databases facilitate the task. A solution of the structure prediction problem would supposedly change experimental molecular biology more than any other theoretical method. We may witness such a breakthrough in the near future. However, the lessons from the Asilomar prediction contests were that we may need a common framework to co-ordinate the efforts of the researchers in the field. ["Neural networks for protein structure prediction: hype or hit?" Burkhard Rost, Dec. 1999] http://www.embl-heidelberg.de/~rost/Papers/pre1999_tics/paper.html

Narrower term: ab initio protein structure prediction. Related terms: Molecular Modeling glossary

protein structure, primary, secondary, tertiary and quaternary: Protein Structure glossary.

protein threading: See threading.

RNA structural genomics: Structural genomics, the systematic determination of all macromolecular structures represented in a genome, is focused at present exclusively on proteins. It is clear, however, that RNA molecules play a variety of significant roles in cells, including protein synthesis and targeting, many forms of RNA processing and splicing, RNA editing and modification, and chromosome end maintenance. To comprehensively understand the biology of a cell, it will ultimately be necessary to know the identity of all encoded RNAs, the molecules with which they interact and the molecular structures of these complexes. This report focuses on the feasibility of structural genomics of RNA, approaches to determining RNA structures and the potential usefulness of an RNA structural database for both predicting folds and deciphering biological functions of RNA molecules. [Jennifer A. Doudna "Structural Genomics of RNA" Nature Structural Biology 7 (11) supp: 954-956 (Nov. 2000)] http://www.euchromatin.org/Doudna1.htm

signal transduction: Functional genomics glossary

structural bioinformatics: Involves the process of determining a protein's three-dimensional structure using comparative primary sequence alignment, secondary and tertiary structure prediction methods, homology modeling, and crystallographic diffraction pattern analyses. Currently, there is no reliable de novo predictive method for protein 3D-structure determination. Over the past half-century, protein structure has been determined by purifying a protein, crystallizing it, then bombarding it with X-rays. The X-ray diffraction pattern from the bombardment is recorded electronically and analyzed using software that creates a rough draft of the 3D structure. Biological scientists and crystallographers then tweak and manipulate the rough draft considerably. The resulting spatial coordinate file can be examined using structure-modeling software to study the gross and subtle features of the protein's structure. [Christopher Smith "Bioinformatics, Genomics, and Proteomics" Scientist 14[23]:26, Nov. 27, 2000]  http://the-scientist.com/yr2000/nov/profile_001127.html  Related terms: Algorithms & data management; Molecular Modeling.

structural genomics: Involves quickly determining the 3D structures of large numbers of proteins (or other complex biological molecules, such as nucleic acids), ultimately accounting for an organism’s entire proteome. 

Footnote: As traditionally defined, the term structural genomics referred to the use of sequencing and mapping technologies, with bioinformatic support, to develop complete genome maps (genetic, physical, and transcript maps) and to elucidate genomic sequences for different organisms, particularly humans. Now, however, the term is increasingly used to refer to high-throughput methods for determining protein structures. [CHI Structural proteomics]

Indeed, many of the criticisms leveled at the Human Genome Project in the mid-1980s have been redirected toward structural genomics. Unlike high-throughput genome sequencing, it is not a simple matter to decide when a structural genomics effort has reached completion. [SK Burley et al “Structural genomics: beyond the Human Genome Project” Nature Genetics 23: 151 Oct. 1999]

Related term structural proteomics

Structural genomics project links
Human Proteome/Structural Genomics Pilot Project, Brookhaven National Laboratory, US  http://www.proteome.bnl.gov/   A pilot project to examine the feasibility of  high-throughput determination of 3-dimensional structures of proteins by x-ray crystallography, starting from genome sequences.

Human Proteomics Initiative, Swiss Institute of Bioinformatics, European Bioinformatics Institute http://www.expasy.ch/sprot/hpi/     Effort to annotate, describe and distribute to the life science community a large amount of highly curated information concerning human protein sequences.

Structural genomics databases see Databases & software directory.

structural genomics technologies: NMR & X-Ray Crystallography

structural homology: Identify 3D structures of proteins or domains in the same family as a sequence of interest. [CHI Structural proteomics] Related terms: homology (Functional genomics glossary); homology modeling (Molecular modeling glossary)

structural proteomics: Often referred to as structural genomics, this discipline involves determining the 3D structures of large numbers of proteins, ultimately accounting for an organism's entire proteome. It adds critical information in at least two points in the drug discovery pathway: (1) target identification, or selecting a pathway in which a drug might function, and (2) medicinal chemistry, or the actual design of compounds to modulate this pathway. [CHI Structural Proteomics]

A high-throughput, system-wide means of determining gene function. It typically involves using high-throughput X-ray diffraction methods to determine the structure of proteins encoded by at least one member of each gene family in the genome. This approach is coupled with the use of bioinformatics as a tool in structural proteomics and computational modeling to determine structures of other proteins in the same family. Conversely, an important goal of structural proteomics is the creation of databases of structures. [CHI Target Validation]

structure from sequence: See protein structure prediction, structural homology

structure prediction problem: See protein structure prediction.

target identification: Drug discovery & development glossary

threading: In this approach, a target sequence is “threaded” through a library of 3D folds to try to find a match.  This method is used when no sequence is clearly related to the target sequence.  [CHI Structural proteomics]

Threading Home Page, NCBI  http://www.ncbi.nlm.nih.gov/Structure/RESEARCH/threading.html
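A toy version of threading can show what it means to pass a sequence through a library of folds and look for a match. In this sketch each fold is reduced to a string of burial states ("b" for buried, "e" for exposed), and a placement scores well when hydrophobic residues land in buried positions. The fold names, the hydrophobic residue set and the scoring rule are all assumptions made for illustration; real threading programs use empirical energy functions and allow gaps.

```python
# Assumed, simplified set of hydrophobic residues for the toy score.
HYDROPHOBIC = set("AVILMFWC")


def position_score(residue, environment):
    """+1 when hydrophobicity agrees with the burial state, otherwise -1."""
    buried = environment == "b"
    return 1 if (residue in HYDROPHOBIC) == buried else -1


def thread_score(sequence, fold_env):
    """Best ungapped score over every offset of the sequence along the fold."""
    if len(sequence) > len(fold_env):
        return None  # the sequence does not fit on this fold
    best = None
    for offset in range(len(fold_env) - len(sequence) + 1):
        score = sum(
            position_score(residue, fold_env[offset + i])
            for i, residue in enumerate(sequence)
        )
        if best is None or score > best:
            best = score
    return best


def best_fold(sequence, fold_library):
    """Score the sequence against each fold and return the top fold with all scores."""
    scores = {}
    for name, env in fold_library.items():
        score = thread_score(sequence, env)
        if score is not None:
            scores[name] = score
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    library = {"fold_A": "bebebe", "fold_B": "bbbeee"}  # hypothetical folds
    print(best_fold("VKVKVK", library))
```

The alternating pattern of the sequence fits the alternating burial pattern of the first hypothetical fold, which is the kind of compatibility signal threading looks for when no related sequence is available.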

toxicoproteomics: Proteomics glossary

Bibliography

[CHI, Structural Proteomics] Structural Proteomics: High-Throughput Approaches Fuel Drug Discovery and Development, Cambridge Healthtech Institute, Malorye Branca, Allan Haberman, Deidre Lockwood  2001 http://www.chireports.com/content/reports/struc_gen.asp

Nature Structural Biology Structural genomics supplement, Nov. 2000 http://www.nature.com/cgi-taf/dynapage.taf?file=/nsb/journal/v7/n11s/index.html

In-depth Structural genomics glossary

CASP Critical Assessment of Techniques for Protein Structure Prediction [Protein Structure Prediction Center, Lawrence Livermore National Lab, US]  http://predictioncenter.llnl.gov/  Links to CASP meeting results and information on the "Ten most wanted" proteins solicitation.

NIGMS National Institute of General Medical Sciences: Part of NIH, supports biomedical research not targeted to specific diseases or disorders. Divisions of Cell Biology and Biophysics; Genetics and Developmental Biology; and Pharmacology, Physiology, and Biological Chemistry support research   http://www.nigms.nih.gov/  NIGMS Structural Genomics Initiatives http://www.nigms.nih.gov/funding/psi.html

Protein Structure Factory: A common initiative of the German Human Genome Project (DHGP) and structural biologists from the Berlin [Germany] area aimed at the broad-scale analysis of proteins. Established to characterize proteins encoded by the genes or cDNAs available at the Berlin Resource Center of DHGP. At a later stage, it may analyze various sets of input proteins selected by criteria of potential structural novelty or medical or biotechnological usefulness. http://userpage.chemie.fu-berlin.de/~psf/

Protein Structure Initiative: Aims at determination of the 3D structure of all proteins. This aim can be achieved in four steps: (1) organize known protein sequences into families; (2) select family representatives as targets; (3) solve the 3D structure of targets by X-ray crystallography or NMR spectroscopy; (4) build models for other proteins by homology to solved 3D structures.  http://www.structuralgenomics.org/
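The first two steps, grouping sequences into families and choosing a representative of each as an experimental target, can be sketched with a simple greedy pass. This is an illustrative assumption, not the initiative's actual procedure: the identity measure works only on equal-length toy sequences, the 0.5 threshold is arbitrary, and the first member seen is taken as the family representative.

```python
def identity(a, b):
    """Fraction of identical positions; unequal or empty sequences score 0."""
    if not a or len(a) != len(b):
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def plan_targets(sequences, threshold=0.5):
    """Greedy grouping: map each representative name to the members of its family."""
    families = {}
    for name, seq in sequences.items():
        for representative in families:
            if identity(seq, sequences[representative]) >= threshold:
                families[representative].append(name)
                break
        else:
            # No existing family is similar enough, so start a new one.
            families[name] = [name]
    return families


if __name__ == "__main__":
    toy = {  # hypothetical protein names and sequences
        "p1": "ACDEFG",
        "p2": "ACDEFA",
        "p3": "WWWWWW",
        "p4": "WWWWYY",
    }
    for target, members in plan_targets(toy).items():
        to_model = [m for m in members if m != target]
        print(target, "-> solve experimentally; model by homology:", to_model)
```

Each representative would go to X-ray crystallography or NMR, and the remaining family members would be modeled by homology to the solved structure, so the number of families sets the experimental workload.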

Structural Biology Industrial Platform: Fifteen companies, including representatives of some of Europe's largest pharmaceutical industries, have formed the Structural Biology Industrial Platform to work with each other, the European Commission and Research Centres in Europe to promote structural biology research, training and development. http://www.sbip.org/

Structural Genomics Initiative, NIGMS, US  http://www.nigms.nih.gov/funding/psi.html


Cambridge Healthtech Institute
1037 Chestnut Street
Newton Upper Falls, MA 02464
Phone: 617-630-1300
Fax: 617-630-1325
Email: chi@healthtech.com