Molecular modeling glossary
Evolving terminology for emerging technologies
Suggestions? Comments? Questions? mchitty@healthtech.com
Last revised December 26, 2001

Biomedical modeling is still an art form that cannot be
applied well outside of specialized research groups. Yet, the successes,
in particular in the case of molecular simulation and structure prediction,
have dramatically increased, as demonstrated by the routine use of modeling
programs for the interpretation of many types of experiments in crystallography,
and by the advance in the accuracy of predicted structures. In fact, the
technology developed by the computational biology community is being used
by experimental biomedical researchers. [Opportunities in Molecular Biomedicine
in the Era of Teraflop Computing, March 3 & 4, 1999, Rockville,
MD]
http://www.ks.uiuc.edu/Publications/Reports/teraflop/node4.html

Related glossaries include
Applications: Drug discovery & development, Sequencing, Structural genomics.
Informatics: Algorithms & data management, Bioinformatics, Chemoinformatics, Computers & computing, Databases & software directory.
Biology: Protein Structure.

Additional definitions appear in the In-depth glossary, after the Bibliography.

ab initio: From the Latin: from the beginning. In modeling refers to models devised without experimental data?

ab initio calculations: Quantum chemical calculations using exact equations with no approximations which involve the whole electronic population of the molecule. [IUPAC Computational]

ab initio gene prediction: Traditionally, gene prediction
programs that rely only on the statistical qualities of exons have
been referred to as performing ab initio predictions. Ab initio
prediction of coding sequences is an undeniable success by the standards
of the machine-learning algorithm field, and most of the widely used gene
prediction programs belong to this class of algorithms. It is impressive
that the statistical analysis of raw genomic sequence can detect around 77-98% of the genes present ... This is, however, little consolation
to the bench biologist, who wants the complete sequences of all genes present,
with some certainty about the accuracy of the predictions involved. As
Ewan Birney (European Bioinformatics Institute, UK) put it, what looks
impressive to the computer scientist is often simply wrong to the biologist.
[Meeting report "Gene prediction: the end of the beginning" Colin Semple,
Genome Biology 2000 1(2): reports 4012.1-4012.3] Broader term gene prediction. http://www.genomebiology.com/2000/1/2/reports/4012/

All ab initio gene prediction programs have to balance sensitivity against accuracy.

ab initio protein structure prediction: See Structural genomics glossary

ab initio quantum chemistry: Involves the calculation of chemical properties directly from the molecular Schrödinger equation. The only empirical data used is the mass and charge of the nuclear particles. In some sense ab initio quantum chemistry can be viewed as a form of alchemy, in which computer cycles are transformed into chemical properties. [Michael Colvin "What is ab initio quantum chemistry?" Lawrence Livermore National Lab, US] http://gutenberg.llnl.gov/~colvin/

alignment: Sequencing glossary

binding site: Drug discovery & development glossary

CADD: See Computer Assisted Drug Design

CAMD: See Computer Aided Molecular Design, Computer Assisted Molecular Design

CAMM: See Computer Assisted Molecular Modeling

computational chemistry: Chemoinformatics glossary Related terms binding site, molecular graphics, Van der Waals

computational gene recognition: Interpreting nucleotide sequences
by computer, in order to provide tentative annotation on the location,
structure and functional class of protein-coding genes. [JW Fickett 1996]
Related terms gene recognition, molecular recognition.

Gene recognition is much more difficult in higher eukaryotes than in prokaryotes, as coding regions (exons) are often interrupted by non-coding regions (introns) and genes are highly variable in size. This is particularly so for human genes. As someone remarked recently, people have non-coding regions occasionally interrupted by genes.

computational genomics: Computers & computing glossary

computational modeling: See ab initio modeling, homology modeling.

Computer Aided Molecular Design (CAMD): Involves all computer-assisted techniques used to discover, design and optimize compounds with desired structure and properties. [IUPAC Combinatorial]

Also known as molecular modeling or computational chemistry, uses computers to analyze and model the physicochemical properties of a molecule. CAMD programs allow integrated molecular design to take drug discovery to a new level by using a more cross-functional team approach to drug research and development. [Oxford Molecular]

Computer-Assisted Drug Design (CADD): Involves all computer-assisted techniques used to discover, design and optimize biologically active compounds with a putative use as drugs. Broader term drug design. Drug discovery & development glossary [IUPAC Computational]

Computer-Assisted Molecular Design (CAMD): Involves all computer-assisted techniques used to discover, design and optimize compounds with desired structure and properties. [IUPAC Computational]

Computer-Assisted Molecular Modeling (CAMM): The investigation of molecular structures and properties using computational chemistry and graphical visualization techniques. [IUPAC Computational]

docking: Three-dimensional molecular structure is one of the
foundations of structure-based drug design. Often, data are available
for the shape of a protein and a drug separately, but not for the two together.
The program AutoDock was originally written in FORTRAN-77 in 1990 by David
S. Goodsell here in Arthur J. Olson's laboratory. It performs automated
docking of ligands (small molecules like a candidate drug) to their macromolecular
targets (usually proteins, sometimes DNA) [Garrett B. Morris, “Molecular
docking web”, Scripps, Dec. 2000] http://www.scripps.edu/pub/olson-web/people/gmm/index.html

docking programs: Programs for evaluating lead compounds against
target proteins; these programs are “informed” by structure data. [CHI
Structural genomics]

Traditional ligand-docking programs - such as DOCK, developed by Irwin
Kuntz at the University of California, San Francisco; MacroModel, developed
by Clark Still at Columbia University; and GOLD from MSI (now part of
Pharmacopeia) - give information about potential ligands for a known protein structure.
These programs select molecules predicted to be highly complementary to
the receptor structure and can screen many of these ligands against the
protein. This type of virtual screening technology has already been incorporated into many
major pharmaceutical companies’ discovery programs and offers the ability
to screen many more compounds at once than the traditional laboratory-based
method. [CHI Structural genomics]

docking studies: Computational techniques for the exploration
of the possible binding modes of a substrate to a given receptor, enzyme
or other binding site. [IUPAC Computational] Related terms drug design, QSAR
Pharmaceutical biology glossary.

drug design: See structure-based drug design Drug discovery & development glossary Related terms 3D QSAR, QSAR Algorithms and data management glossary.

dynamic programming methods: Sequencing glossary

exon parsing: Identifying precisely the 5' and 3' boundaries of genes
(the transcription unit) in metazoan genomes, as well as the correct sequences
of the resulting mRNA ("exon parsing") has been a major challenge of
bioinformatics for years. Yet, the current program performances are still
totally insufficient for a reliable automated annotation (Claverie 1997;
Ashburner 2000). It is interesting to recapitulate quickly the research in this
area to illustrate the essential limitation plaguing modern bioinformatics.
Encoding a protein imposes a variety of constraints on nucleotide sequences,
which do not apply to noncoding regions of the genome. These constraints induce
statistical biases of various kinds, the most discriminant of which was soon
recognized to be the distribution of six nucleotide-long "words" or
hexamers (Claverie and Bougueleret 1986; Fickett and Tung 1992). [JM Claverie "From Bioinformatics to Computational Biology" Genome Res 10 (9): 1277-1279, Sept. 2000] http://igs-server.cnrs-mrs.fr/igs/abstract/an2000/abstract13.html

exon prediction: Since prokaryotes don't have introns, exon prediction implies working with eukaryotes. Is exon prediction equivalent to gene prediction in prokaryotes? Related terms ab initio gene prediction; GRAIL Sequencing glossary

gene identification: Using marker SNPs to home in on otherwise hard to
find genes. [CHI SNPs]

The effectiveness of finding genes by similarity
to a given sequence segment is determined by a much simpler statistic,
the total coverage of the genome by the collective set of sequence
contigs. As the overall coverage of the genome is virtually complete (>90%),
there is a strong likelihood that every gene is represented, at least
in part, in the data. Thus, finding any gene by sequence similarity
searches using sufficient sequence to ensure significance is almost always
possible using the data published this week. Caution must be exercised,
however, as the identification of the gene may still be ambiguous. This
is because a highly similar sequence from a receptor gene from Drosophila,
for example, could be found in several different, homologous genes,
which may have similar or entirely different functions or are nonfunctioning pseudogenes. In other words, common domains or motifs can be present
in many different genes. The use of the approximate similarity search tool BLAST is probably still the best way to find similar sequences. [David Galas "Making Sense of the Sequence" Science 291: 1257-1260 Feb. 16, 2001]

Genes (and their corresponding mRNAs and proteins) are identified by aligning reference sequences
(RefSeq), GenBank, mRNAs, and ESTs to the genome sequence using a program called Acembly.
Acembly takes advantage of paired EST reads, measured clone lengths, and polyA tails. Transcript
models are reconstructed by attempting to settle disagreements between individual sequence
alignments without using an a priori model (such as codon usage, initiation, or polyA signals). In
practice, there is an initial low stringency analysis followed by a clean up procedure which keeps the best hits. ...

An obvious challenge in using alignments to annotate genes is the treatment of sequence differences
between the mRNA and genomic sequence. These differences could represent sequencing errors,
assembly errors, naturally occurring polymorphisms, or paralogs. It is difficult to resolve these
differences automatically; therefore the default treatment is to provide the mRNA and protein sequence
that corresponds to the genomic sequence. The only exception is where a sequence difference changes
the reading frame relative to the supporting mRNA and EST data; then the genomic sequence is
frameshifted to provide the protein product that corresponds to the mRNA data.
[NCBI Contig Assembly and Annotation Process, 2001] http://www.ncbi.nlm.nih.gov/genome/guide/build.html#contig

There are two basic approaches to gene identification: by homology and ab initio.

gene parsing: Initial gene parsing methods were then simply
based on word frequency computation, eventually combined with the detection of
splicing consensus motifs. The next generation of software implemented the same
basic principles into a simulated neural network architecture (Uberbacher and
Mural 1991). Finally, the last generation of software, based on Hidden Markov
Models, added an additional refinement by computing the likelihood of the
predicted gene architectures (e.g., favoring human genes with an average of
seven coding exons, each 150 nucleotides long) (Kulp et al. 1996; Burge
and Karlin 1997). These ab initio methods are used in conjunction with a
search for sequence similarity with previously characterized genes or expressed
sequence tags (EST). [JM Claverie "From Bioinformatics to Computational Biology" Genome Res 10 (9): 1277-1279, Sept. 2000] http://igs-server.cnrs-mrs.fr/igs/abstract/an2000/abstract13.html

gene prediction: Many methods for predicting genes are based
on compositional signals that are found in the DNA sequence. These methods
detect characteristics that are expected to be associated with genes, such
as splice sites and coding regions, and then piece this information together
to determine the complete or partial sequence of a gene. Unfortunately,
these ab initio methods tend to produce false positives, leading
to overestimates of gene numbers, which means that we cannot confidently
use them for annotation. They also do not work well with unfinished sequence
that has gaps and errors, which may give rise to frameshifts, when the
reading frame of the gene is disrupted by the addition or removal of bases.
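
As a rough illustration of why gaps and errors in unfinished sequence cause trouble, the following Python sketch scans the three forward reading frames for the longest ATG-to-stop open reading frame and shows how a single inserted base destroys it. The function name longest_orf and the toy sequences are hypothetical, invented for this example; real gene finders use far richer statistical models than this.

```python
# Minimal sketch (not a real gene finder): longest forward-strand ORF.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq: str) -> str:
    """Return the longest ATG...stop ORF in the three forward frames, or ""."""
    seq = seq.upper()
    best = ""
    for frame in range(3):
        start = None  # index of the ATG that opened the current ORF, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                orf = seq[start:i + 3]
                if len(orf) > len(best):
                    best = orf
                start = None
    return best


# Toy "gene" (hypothetical): ATG GCT GAA CGT CTG AAA GGT TAA
gene = "ATGGCTGAACGTCTGAAAGGTTAA"

# Simulated sequencing error: one extra base inserted after the second codon.
mutant = gene[:6] + "A" + gene[6:]

print(len(longest_orf(gene)))    # the full 24-base ORF is recovered
print(len(longest_orf(mutant)))  # the frameshift leaves no complete ORF
```

In this toy case the inserted base pushes the stop codon out of frame with the start codon, so the scan finds no complete ORF at all. This is one reason purely compositional methods degrade on draft sequence.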
... The most effective algorithms integrate gene-prediction methods with
similarity comparisons.... The
most powerful tool for finding genes may be other vertebrate genomes. Comparing
conserved sequence regions between two closely related organisms will enable
us to find genes and other important regions in both genomes with no previous
knowledge of the gene content of either. [Ewan Birney et al. "Mining
the draft human genome" Nature 409: 827-828 15 Feb. 2001] Narrower term ab initio gene prediction.

Sadly, it is often claimed that matching back cDNA to genomic sequences is
the best gene identification protocol; hence, admitting that the best way to
find genes is to look them up in a previously established catalog! Thus, the two
main principles behind state-of-the-art gene prediction software are (1) common
statistical regularities and (2) plain sequence similarity. From an
epistemological point of view, those concepts are quite primitive. [JM Claverie "From Bioinformatics to Computational Biology" Genome Res 10 (9): 1277-1279, Sept. 2000] http://igs-server.cnrs-mrs.fr/igs/abstract/an2000/abstract13.html

Algorithms have been developed and are combined to recognise gene structural components.

gene recognition: Principally used for finding open reading
frames, tools of this type also recognize a number of features of
genes, such as regulatory regions, splice junctions, transcription and
translation stops and starts, GC islands, and polyadenylation sites.
[Laura De Francesco "Some things considered" Scientist 12[20]:18, Oct. 12, 1998] http://www.the-scientist.com/yr1998/oct/profile1_981012.html

granularity: Computers & computing glossary

Hidden Markov Models (HMM): Searching a protein sequence database
for homologues is a powerful tool for discovering the structure and function
of a sequence. Amongst the algorithms and tools available for this task,
Hidden Markov model (HMM)-based search methods improve both the sensitivity
and selectivity of database searches by employing position-dependent scores
to characterize and build a model for an entire family of sequences.

HMMs have been used to analyze proteins using two complementary strategies.
In the first, a sequence is used to search a collection of protein families,
such as Pfam, to find which of the families it matches. In the second approach
an HMM for a family is used to search a primary sequence database to identify
additional members of the family. The latter approach has yielded insights
into proteins involved in both normal and abnormal human pathology. [Lawrence Berkeley Lab, US "Advanced
Computational Structural Genomics"] http://cbcg.lbl.gov/ssi-csb/Meso.html

homology model, homology modeling: Structural genomics glossary

in silico: Literally "in the computer". Can be
used to screen out compounds which are not druggable. Narrower
terms: in silico biology, in silico modeling, in silico
proteomics, in silico screening; Cell biology virtual cells in silico;
Related terms rules of five Chemoinformatics glossary

in silico biology: Advances in genomics and proteomics have greatly improved our knowledge of the components of biological systems at the molecular level. The next logical step is to try to understand how these components interact well enough to model those biological systems
in silico. This conference will showcase examples and applications of computational modeling of cells, tissues, and disease. Faced with an overabundance of potential targets such models offer the promise of improved target prioritization compared with relying on empirical research alone. While such models are far from being a complete representation of a biological system, examples are already emerging where this method has aided in a greater understanding of a disease state as well as target prioritization and ultimately
drug development. Anyone interested in utilizing in silico methods as a valuable tool for development of therapeutic strategies should attend this event.
[In Silico Biology: Modeling Systems Biology for Research and Target Prioritization, June 2-3, 2002, San Diego, CA]

The
considerable "algorithmic complexity" of biological systems requires a
huge amount of detailed information for their complete description. Although far
from being complete, the overwhelming quantity of small pieces of information
gathered for all kind of biological systems at the molecular and cellular level
requires computational tools to be adequately stored and interpreted.
Interpretation of data means to abstract them as much as allowed to provide a
systematic, an integrative view of biology.
Most of the presently available scientific journals focus either on accumulating
more data from elaborate experimental approaches, or on presenting new
algorithms for the interpretation of these data. Both approaches are
meritorious. However, since both communities do not interact much with each
other, neither the experimental nor the computational biologists really apply
the theoretical tools to that extent which would be possible and desirable to
achieve that progress of research which is already feasible. ["Aims and
Scope" In Silico Biology: An international journal of computational
biology] http://www.bioinfo.de/isb/aims.html Related term: virtual cells in silico

in silico modeling: Modeling of biological pathways and other biological
processes for drug discovery and development. Given the enormous increase in
genetic and molecular data, such models will continue to improve and are
predicted to become an essential tool for evaluating hypotheses, with only the
more promising ones being subjected to empirical testing. [CHI Breaking
Bottlenecks]

in silico proteomics: Prediction of protein
structure and function. [Gareth W. Roberts and Jonathan Swinton "In Silico
Proteomics: Playing by the rules" Current Drug Discovery 5: Aug. 1, 2001] http://www.current-drugs.com/CDD/CDD/CDDPDF/issue%205/Roberts.pdf

in silico screening: See virtual screening Chemoinformatics glossary

ligand docking: See under docking.

molecular graphics: A technique for the visualization
and manipulation of molecules on a graphical display device. [IUPAC Computational]

molecular mimicry: Drug discovery & development glossary

molecular modeling, molecular modelling: A technique for the investigation of molecular
structures and properties using computational chemistry and graphical visualization techniques in order to provide a plausible three-dimensional
representation under a given set of circumstances. [IUPAC Medicinal
Chemistry, IUPAC Computational]

The scope note for the Journal of Molecular Modeling
includes the following subjects: computer-aided molecular design, rational
drug design, de novo ligand design and receptor modeling,
application of computational and modeling methods in the field of medical
chemistry, protein and peptide modeling, quantum chemistry, application of semiempirical,
DFT and ab initio calculations, prediction of biological
activities (QSAR) and physico-chemical properties (QSPR), molecular
mechanics/dynamics simulation of polymers and biopolymers, genetic
algorithms and neural nets, modeling of catalysts, advanced
materials, and stationary phases in separation science, enhanced desktop
computational tools for the life sciences, visualisation, classification and
handling of chemical data. http://link.springer.de/link/service/journals/00894/aims.htm

Molecular modeling cannot be better than the forces underlying simulations.
A modeler seeking to describe, for example, a protein for the first time
often needs to complement the force fields provided in existing programs
through quantum-chemical calculations. [Opportunities
in Molecular Biomedicine in the Era of Teraflop Computing: March
3 & 4, 1999, Rockville, MD, NIH Resource for Macromolecular Modeling
and Bioinformatics, Beckman Institute for Advanced Science and Technology,
University of Illinois at Urbana-Champaign] http://www.ks.uiuc.edu/Publications/Reports/teraflop/node4.html

Molecular modeling applications use falls into two broad categories:
interactive visualization and computational analyses. ... Three of the most prominent uses of modern molecular
modeling applications are structure analysis, homology modeling,
and docking ... in essence, objective modeling revolves around three
different approaches (each based on different underlying physical and chemical
theories): molecular dynamics, molecular mechanics, and quantum
mechanics (In-depth). All of these are concerned with developing a
unique solution to what is referred to as the "protein folding" problem
- designing and testing algorithms and applications that will reliably
predict 3-D structure from primary sequence. [Christopher Smith "Molecular
Modeling - Seeing the Whole Picture with Modeling Software Packages" Scientist
12[17]:0, Aug. 31, 1998] http://www.the-scientist.com/yr1998/august/profile2_980831.html Related terms computational chemistry, Computer Assisted Drug Design;
molecular graphics, In-depth molecular dynamics, molecular mechanics.

Molecular modeling software includes AMBER, DOCK, MODELLER, RasMol and
many other programs.

molecular models: Models used experimentally or theoretically
to study molecular shape, electronic properties, or interactions; includes
analogous molecules, computer generated graphics, and mechanical structures. [MeSH]

molecular recognition: Drug discovery and development glossary

ORF prediction: Related terms exon prediction, gene prediction, gene recognition.

peptidomimetic: Drug discovery & development glossary

phenomics: Omes & omics glossary

protein structure prediction: Structural genomics glossary

quantum mechanics:

receptor mapping: The technique used to describe the geometric and/or
electronic features of a binding site when insufficient structural data for this
receptor or enzyme
are available. Generally the active site cavity is defined by comparing the
superposition of active to that of inactive molecules. [IUPAC Medicinal
Chemistry, IUPAC Compendium]

Over the past ten to fifteen years [before
1987], receptor mapping has expanded from a very minor technique, besieged
by problems and limited in its approach, to one that is widespread, extended
beyond receptors and applied to clinical problems and populations with
modern imaging and scanning techniques. [MJ Kuhar "Imaging receptors for
drugs in neural tissue" Neuropharmacology 1987 Jul. 26 (7B): 911-6]

recognition site: Drug discovery and development glossary

scoring methods: Related term Sequencing glossary

simulated annealing (SA): A procedure used in molecular dynamics
simulations, in which the system is allowed to equilibrate at high temperatures,
and then cooled down slowly to remove kinetic energy and to permit trajectories
to settle into local minimum energy conformations. [IUPAC Computational]

simulations: Up until now, biomolecular simulations in drug design
have been of limited use because of the short time scales, long turnaround
times (implying poor sampling), the limited accuracy of simulations alluded
to above, and the relatively small size of systems simulated when one wishes
to account for proper inclusion of the physiological environment like membranes
and solvent. Developing a new drug goes beyond finding binding compounds
and must rely on good properties from the outset: activity, absorption,
distribution, metabolism, excretion. Pharmacological researchers would
like to predict these properties first, before one optimizes activity as
conventionally done, and before analogs are made. ... When sufficient resources
are available, simulations can determine the relative free energy values
of drugs passing through membranes. These values are required to estimate
the bioavailability of drugs. [Opportunities in Molecular Biomedicine
in the Era of Teraflop Computing March 3 & 4, 1999, Rockville,
MD, NIH Resource for Macromolecular Modeling and Bioinformatics Beckman
Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign]
http://www.ks.uiuc.edu/Publications/Reports/teraflop/node4.html

structural homology: Structural genomics glossary

structure analysis: The integration of gene identification and
promoter recognition programs will be a very important point for a complete
gene structure analysis. [HGMP training course notes: "Gene Structure Prediction"
Luciano Milanesi, I.T.B.A-CNR, Italy, 1998] http://www.hgmp.mrc.ac.uk/Courses/GeneProteinID/milanesi/milanesi.htm

structure prediction problem: Structural genomics

VRML (Virtual Reality Modeling Language): An open language under
development. [Web3D Consortium] http://www.web3d.org/vrml/vrml.htm

VRML was supposed to be the standard language for V[irtual]
R[eality], but VRML browsers and plug-ins tend to be large.
XML (Extensible Markup Language) is emerging as the most likely
alternative to or fix for VRML. [Mike Hurwicz "Virtual Reality in
VRML or XML?" Web Developer's Journal June 21, 2000] http://www.webdevelopersjournal.com/articles/virtual_reality.html

van der Waals forces: The attractive or repulsive forces
between molecular entities (or between groups within the same molecular
entity) other than those due to bond formation or to the electrostatic
interaction of ions or of ionic groups with one another or with neutral
molecules. ... The term is sometimes used loosely for the totality of nonspecific
attractive or repulsive forces. [IUPAC Compendium]

virtual cells in silico: Rapid accumulation of biological data from genome, proteome, transcriptome and metabolome projects can bring us to the point where it is no longer purely speculative to discuss how to construct virtual cells in silico. This article describes attempts to construct whole cell models. The E-CELL project has completed a couple of virtual cell models, and computer simulations have revealed some biological surprises.
[M. Tomita, "Whole-cell simulation: a grand challenge of the 21st century" Trends in Biotechnology 19 (6): 205-210, June 2001]
Related terms Omes & omics glossary metabolome, transcriptome
Virtual Cell, Dept of Plant Biology, Univ. of Illinois-Urbana Champaign, US http://www.life.uiuc.edu/plantbio/cell/

virtual library: Chemoinformatics glossary

virtual proteomics: See in silico proteomics

virtual screening: Selection of compounds by evaluating their
desirability in a computational model. Also termed in silico
screening.
[IUPAC Combinatorial Chemistry]

visualization: Algorithms & data management glossary

Bibliography
Tollenaere JP, EE Moret, Hyperglossary of [Molecular Modelling in Drug Design] Terminology, Utrecht University, 1996. 150+ definitions. http://wwwcmc.pharm.uu.nl/webcmc/glossary.html

IUPAC definitions are reprinted with the permission of the International Union of Pure and Applied Chemistry.

In-depth Molecular Modeling glossary

ab initio quantum mechanical methods (synonymous with
nonempirical quantum mechanical methods): Methods of quantum mechanical
calculations independent of any experiment other than the determination
of fundamental observables. The methods are based on the use of the full
Schrödinger equation to treat all the electrons of a chemical system.
In practice, approximations are necessary to restrict the complexity of the electronic wavefunction and to
make its calculation possible. In this way methods of density functional
theory are usually considered as ab initio quantum mechanical methods.
[IUPAC Theoretical]

Methods of quantum
mechanical calculations independent of any experiment other than the determination
of fundamental constants. The methods are based on the use of the
full Schrödinger equation to treat all the electrons of a chemical
system. In practice, approximations are necessary to restrict the complexity
of the electronic wave function and to make its calculation possible. (Synonymous
with non-empirical quantum mechanical methods.) [IUPAC Computational]

ab initio quantum mechanical modeling: The application
of ab initio modelling across diverse fields such as condensed matter
physics, materials science and chemistry has been demonstrated over the past 10 years. It has become clear that
high quality simulations require a proper quantum mechanical treatment
of the bonding and other interatomic forces, and techniques for achieving
this are well established [1]. However, it is only recently that computational
techniques have provided the means to directly solve the quantum mechanical
equations for systems of sufficient complexity to provide useful information
in a biological context. The recent completion of the Human Genome Project will offer an unprecedented
number of protein receptors and enzymes as targets for pharmacological
intervention in disease processes. However, before this wealth of information
can be used to develop pharmaceuticals, an understanding of the biochemistry
of the newly identified proteins and their interactions must be obtained.
First principles quantum mechanical modelling will play an important role
in this process. It is important to achieve a mutual understanding, between
scientists applying ab initio modelling and those working in the biological
sciences, of the capabilities of ab initio modelling and the important
biological problems to which they may be applied. [Matthew
Segall, Ursula Röthlisberger, Paolo Carloni, CECAM/Psi-k Workshop: Ab Initio
Modelling in the Biological Sciences, Lyon, France, 11-13 June 2001] http://www.tcm.phy.cam.ac.uk/~mds21/Workshop2001/Scientific/node1.html#SECTION00010000000000000000

conformational analysis: Consists of the exploration of energetically
favorable spatial arrangements (shapes) of a molecule (conformations) using
molecular mechanics, molecular dynamics, quantum chemical
calculations or analysis of experimentally-determined structural
data, e.g., NMR or crystal structures. Molecular mechanics and quantum chemical methods are employed to compute
conformational energies, whereas systematic and random searches,
Monte Carlo, molecular dynamics, and distance geometry are methods
(often combined with energy minimization procedures) used to explore the
conformational space. [IUPAC Computational]

decoys: Potential energy functions to fold proteins are usually
designed by a learning approach. A learning algorithm is presented with
a large set of wrong shapes [decoys] and a few native sequences. The energy
function is trained on the set to recognize the few correct folds and is
used and tested on other proteins that were not included in the training
set. Clearly the quality of the design will be improved significantly if
more decoys will be presented to the learning algorithm. The mechanism
of the learning also makes a difference. It is useful to make the native
fold the lowest energy state exactly. An exact solution makes it possible to increase the number of decoy structures
without limit, constantly improving the energy function. [Opportunities in Molecular Biomedicine in the Era of Teraflop Computing:
March 3 & 4, 1999, Rockville, MD, NIH Resource for Macromolecular Modeling
and Bioinformatics Beckman Institute for Advanced Science and Technology,
University of Illinois at Urbana-Champaign] http://www.ks.uiuc.edu/Publications/Reports/teraflop/node4.html

energy function: Computationally, a shape is assigned to a protein
sequence based on an empirical energy function. The lower the energy of
a given structure, the more likely it is to be the correct fold. The structure
prediction challenge is therefore divided into two: (1) The first challenge
is the creation of many plausible folds or a set of structures that will
include the native shape. The creation of the appropriate set depends on
existing databases (such as the Protein Data Bank) or on the design of
automated algorithms (using physical or statistical information) to generate
plausible folds. Once the set is available, a selection procedure is used
to "fish" out the correct fold. (2) The "fishing" of the plausible
native shapes critically depends on the quality of the energy function.
The value of the energy function must be the lowest for the native structure.
Otherwise, a wrong structure is predicted. Therefore, the design of
appropriate empirical energy functions has attracted considerable attention
and much research, using a variety of techniques and algorithms. Both challenges
were addressed extensively in the last few years and while significant
progress has been made we still do not have satisfactory solutions. The
search of plausible structures is far from complete and native folds are
missed. Moreover, current scoring (energy) functions assign energy values
that are too high for many native shapes. Here we discuss the design of
new folding energies and why this task requires significant enhancement
in computer power. [Opportunities in Molecular Biomedicine in the Era of
Teraflop Computing: March 3 & 4, 1999, Rockville, MD, NIH Resource
for Macromolecular Modeling and Bioinformatics Beckman Institute
for Advanced Science and Technology, University of Illinois at Urbana-Champaign]
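
The two steps described in this entry and under decoys (training an energy function on native folds and decoys, then "fishing" out the lowest-energy candidate) can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the cited report: the linear form of the energy, the perceptron-style update rule, the function names, and the two-component feature vectors (imagined here as counts of two kinds of residue contacts) are all hypothetical.

```python
def energy(weights, features):
    """Linear empirical energy: weighted sum of structural features."""
    return sum(w * f for w, f in zip(weights, features))


def train_energy_function(training_set, n_features, epochs=100, rate=1.0):
    """Perceptron-style fit (an assumed learning rule, for illustration).

    training_set is a list of (native_features, list_of_decoy_features).
    Whenever a decoy does not score strictly higher than its native fold,
    the weights are shifted to raise the decoy relative to the native.
    """
    weights = [0.0] * n_features
    for _ in range(epochs):
        violations = 0
        for native, decoys in training_set:
            for decoy in decoys:
                if energy(weights, decoy) - energy(weights, native) <= 0.0:
                    weights = [w + rate * (d - n)
                               for w, d, n in zip(weights, decoy, native)]
                    violations += 1
        if violations == 0:  # every native is now the lowest-energy state
            break
    return weights


def predict_fold(weights, candidates):
    """Return the index of the lowest-energy candidate structure."""
    return min(range(len(candidates)),
               key=lambda i: energy(weights, candidates[i]))


# Hypothetical toy data: one training protein with three decoys.
training_set = [([3.0, 1.0], [[1.0, 3.0], [2.0, 2.0], [0.0, 2.0]])]
weights = train_energy_function(training_set, n_features=2)

# A protein not in the training set; its native fold is candidate 1.
held_out = [[2.0, 2.0], [4.0, 1.0], [1.0, 4.0]]
predicted = predict_fold(weights, held_out)
```

With real proteins the feature vectors would have many more components and the training set many more decoys, which is where the computational cost discussed in the report arises.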
http://www.ks.uiuc.edu/Publications/Reports/teraflop/node4.html

force field: A set of functions and parametrization used in molecular
mechanics calculations. [IUPAC Computational]

Long-time simulations will pose a challenging benchmark for the force
fields employed in molecular modeling. One question is, how will proteins
and DNA that were described by the available force fields (and remained
stable over nanosecond periods) behave in microsecond simulations? The
high cost of long-time simulations will require that the issue is addressed
in a systematic way by providing standard cases against which simulations
can be tested ... Much effort has been spent over the past two decades
to establish more and more faithful force fields. The primary concept behind
most present day force fields is to cast them into analytical functions
that are convenient and fast to evaluate. The price is that force fields
are largely based on simple heuristics that stem from knowledge of the
forces of atoms in various constellations of chemical bonds that have been
used in physical chemistry for a long time, e.g., the Lennard-Jones potential.
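
As an illustration of the kind of simple analytical function meant here, the common 12-6 form of the Lennard-Jones potential, V(r) = 4ε[(σ/r)^12 - (σ/r)^6], can be written in a few lines. This is a minimal sketch, not taken from the cited report: the function names are hypothetical, and the argon-like values chosen for ε and σ are assumptions used only to exercise the code.

```python
def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones pair energy at separation r (same unit as sigma)."""
    if r <= 0:
        raise ValueError("separation must be positive")
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def lennard_jones_force(r, epsilon, sigma):
    """Radial force -dV/dr; positive values push the two atoms apart."""
    if r <= 0:
        raise ValueError("separation must be positive")
    sr6 = (sigma / r) ** 6
    return 24.0 * epsilon * (2.0 * sr6 * sr6 - sr6) / r


# Illustrative, argon-like parameters (assumed values, not from the source).
EPSILON = 0.238  # well depth, kcal/mol
SIGMA = 3.4      # separation at which the energy crosses zero, angstrom

# The energy minimum lies at r = 2^(1/6) * sigma, where V = -epsilon.
r_min = 2.0 ** (1.0 / 6.0) * SIGMA
well_depth = lennard_jones(r_min, EPSILON, SIGMA)
```

The function costs only a handful of multiplications per atom pair, which is the "convenient and fast to evaluate" property the passage refers to.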
The difficulty in formulating force fields arises from accuracy as well
as from variety since all constellations of chemical bonds in biopolymers
need to be covered, requiring simplicity to keep the huge complexity in
check and requiring consistency when force fields are amended. The latter
is often achieved not solely on the basis of experimental information,
but increasingly by use of quantum-chemical calculations. Presently, computational
biologists hope to account also for the fact that the partial charges in
biomolecules themselves can be altered through the presence of electric
fields. The resulting polarization of biomolecules requires a new generation
of empirical force fields that is under development. ... The force fields do not cover the description of chemical
reactions in which bonds are altered, nor do they apply to the behavior
of biomolecules in electronically excited states or to novel constituents.
To repair this deficiency systematically requires a combination of quantum-chemical
calculations for the electronic degrees of freedom and classical simulations
for the motion of the atomic nuclei ... Teraflop computing speeds promise
a dramatic improvement in this regard, permitting modelers to make more
routine use of quantum-chemically improved force fields and ultimately
computing [Opportunities in Molecular Biomedicine in the Era of Teraflop
Computing: March 3 & 4, 1999, Rockville, MD, NIH Resource for Macromolecular
Modeling and Bioinformatics Beckman Institute for Advanced Science
and Technology, University of Illinois at Urbana-Champaign] http://www.ks.uiuc.edu/Publications/Reports/teraflop/node4.html Related term van der Waals.

molecular dynamics: A simulation procedure consisting of the
computation of the motion of atoms in a molecule or of individual
atoms or molecules in solids, liquids and gases, according to Newton's
laws of motion. The forces acting on the atoms, required to simulate their
motions, are generally calculated using molecular mechanics force
fields. [IUPAC Computational]

The Parrinello group has applied ab initio Molecular Dynamics
(MD), in which all forces were computed quantum-chemically, to chemical reactions
in general and to biological systems in particular, with results that compared
favorably with experiment and older force field methods. The ab initio
method was found to be of "useful accuracy" for simulations of biomolecules
... With a 1000 times faster computer (relative
to 32 processors on a Cray T3E) the dynamics of a quantum-chemical system
consisting of up to 10 atoms could be simulated for 10 μs. For 100
quantum-chemically treated atoms 10 ns would be possible. Assuming linear scaling,
1000 quantum-chemically treated atoms could be followed for 1 ns. New possibilities
which open up by using a teraflop computer and ab initio MD methods
are the study of enzymatic reactions at the heart of many important biomolecules
and the complete quantum-chemical description of small peptides. Problems
such as RNA and DNA drug interactions, the determination of complex enzymatic
reaction pathways, and the realistic study of electron transfer can be
tackled. [Opportunities in Molecular Biomedicine in the Era of Teraflop
Computing: Report on a Meeting Held March 3 & 4, 1999 in Rockville,
MD, Organized by the NIH Resource for Macromolecular Modeling and Bioinformatics
Beckman Institute for Advanced Science and Technology, University of Illinois
at Urbana-Champaign]
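
To make the definition of molecular dynamics in this entry concrete, the sketch below propagates a single coordinate according to Newton's laws of motion. It is an illustration under stated assumptions: the velocity Verlet integrator is a common choice but is not named in the sources quoted here, and the harmonic "bond" with unit mass and unit force constant is a hypothetical toy system standing in for a real force field.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate m * a = F(x) for one coordinate.

    Returns a list of (position, velocity) pairs, starting with the
    initial state, so it has n_steps + 1 entries.
    """
    trajectory = [(x, v)]
    a = force(x) / mass
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * a * dt * dt  # advance the position
        a_new = force(x) / mass             # force at the new position
        v = v + 0.5 * (a + a_new) * dt      # advance velocity with averaged acceleration
        a = a_new
        trajectory.append((x, v))
    return trajectory


# Hypothetical toy system: a harmonic "bond" with F = -K * x.
K = 1.0     # force constant (arbitrary units)
MASS = 1.0  # particle mass (arbitrary units)


def spring_force(x):
    return -K * x


def total_energy(x, v):
    """Kinetic plus potential energy of the toy oscillator."""
    return 0.5 * MASS * v * v + 0.5 * K * x * x


traj = velocity_verlet(x=1.0, v=0.0, force=spring_force,
                       mass=MASS, dt=0.01, n_steps=1000)
x_end, v_end = traj[-1]
```

In a force-field simulation the `force` callable would be replaced by the gradient of the molecular mechanics energy; in ab initio MD it would come from a quantum-chemical calculation at every step, which is why the accessible simulation times quoted above are so much shorter.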
http://www.ks.uiuc.edu/Publications/Reports/teraflop/node4.html

molecular mechanics: The calculation of molecular conformational
geometries and energies using a combination of empirical force fields (Burkert
and Allinger, 1982). Method of calculation of geometrical and energy characteristics
of molecular entities on the basis of empirical potential functions
(see force field) the form of which is taken from classical
mechanics. The method implies transferability of the potential functions
within a network of similar molecules. An assumption is made on "natural"
bond lengths and angles, deviations from which result in bond and angle
strain respectively. Repulsive or attractive van der Waals and electrostatic
forces between nonbonded atoms are also taken into account. Synonymous
with force field method. [IUPAC Computational]

Monte Carlo technique: A simulation procedure consisting of randomly
sampling the conformational space of a molecule. [IUPAC Computational] Broader
term simulation

NIH Guide to Molecular Modeling Gateway: http://cmm.info.nih.gov/modeling/gateway.html

quantum chemical calculations: Molecular property calculations
based on the Schrödinger equation, which take into account the interactions
between electrons in the molecule. [IUPAC Computational]

semi-empirical methods: Molecular orbital calculations using
various degrees of approximation and using only valence electrons. [IUPAC Computational]

semi-empirical quantum mechanical methods: Use parameters derived
from experimental data to simplify computations.
The simplification may occur at various levels: simplification of the Hamiltonian
(e.g. as in the Extended Hückel method), approximate evaluation of
certain molecular integrals (see, for example, zero differential
overlap), simplification of the wave function (for example, use of π electron
approximation as in Pariser-Parr-Pople). [IUPAC Computational] |