Sequencing Glossary
Evolving Terminology for Emerging Technologies
Comments? Suggestions? Revisions? mchitty@healthtech.com Last revised December 06, 2001.

The "race" to sequence the Human Genome is not a 50 yard
dash, but a marathon. Although the Human Genome Project is well ahead of schedule, and a number of genes have been identified, we have
just begun to get a glimpse of what specific genes do and how we might be
able to better use this knowledge for therapeutic interventions. Teasing
apart the interactions of genes and proteins, delineating changes throughout
the cell cycle, and correlating changes with health and disease will take even more time.
But with complete sequences, and the possibility of cross-species comparisons,
we can expect new insights and faster progress over time. And sequencing DNA is only a first step towards finding what functions are
connected with specific sequences. Sequencing proteins
(and then determining the structure of proteins – and the function of proteins)
comes next. Progress is being made with proteins, but sequencing of carbohydrates
is even more difficult.

Related glossaries include: Applications: Functional genomics, Proteomics; Informatics: Algorithms, Bioinformatics, Molecular modeling; Technologies: Chromatography & electrophoresis, Mass spectrometry; Biology: Genetic variations, Proteins, Protein Structures, Sequences - DNA & beyond.

Additional definitions appear in the In-depth glossary, after the Bibliography.

alignment: The process of lining up two or more sequences to achieve maximal levels of
identity (and conservation, in the case of amino acid sequences) for the purpose of assessing the degree of similarity and the possibility of homology.
[NCBI Bioinformatics] Narrower terms In-depth global alignment, local alignment, optimal alignment, pairwise alignment. Related terms BLAST In-depth BEAUTY, BLAST2, FASTA, gapped BLAST, Needleman-Wunsch, Smith-Waterman alignment.

assembled: The term used to describe the process of using a computer
to join up bits of sequence into a larger whole. [Peer Bork, Richard Copley "Filling in the gaps" Nature 409: 818-820, 15 Feb. 2001] This is different from assembly language, and a source of confusion between biologists and computer scientists. Related term contig assembly

clone, cloning: Cell biology glossary

contig: Group of cloned (copied) pieces of DNA representing overlapping regions of a particular chromosome. [DOE] Narrower terms In-depth initial sequence contigs, merged sequence contigs. Related terms clone, contig assembly, scaffolds.
Published genome sequence has many gaps and interruptions. The concept of "contig" is crucial to our understanding of current limitations. [David Galas "Making sense of the sequence" Science 291 (5507): 1257, Feb. 16, 2001]

contig assembly: One of the most difficult and critical functions
in DNA sequence analysis is putting together fragments from sets of overlapping
segments. Some programs do this better than others, particularly when dealing
with sequences containing gaps. [Laura De Francesco "Some things considered" Scientist 12[20]:18, Oct. 12, 1998] http://www.the-scientist.com/yr1998/oct/profile1_981012.html
National Center for Biotechnology Information, US, NCBI Contig Assembly and Annotation Process, Feb. 2001 http://www.ncbi.nlm.nih.gov/genome/guide/build.html

coverage (or depth): The average number of times that a nucleotide is represented by a high-quality base in a collection of random raw sequence. Operationally, a 'high-quality base' is defined as one with an accuracy of at least 99% (corresponding to a Phred score of at least 20).
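
The link between 99% accuracy and a Phred score of 20 follows from the usual Phred definition, Q = -10 log10(error probability). The sketch below is illustrative only: the function names and the input layout (one list of per-base quality scores per read, plus a known region length) are assumptions, not part of the cited terminology.

```python
import math

def phred_from_accuracy(accuracy):
    """Phred score for a base-call accuracy strictly between 0 and 1."""
    if not 0.0 < accuracy < 1.0:
        raise ValueError("accuracy must be strictly between 0 and 1")
    return -10.0 * math.log10(1.0 - accuracy)

def accuracy_from_phred(q):
    """Base-call accuracy implied by a Phred score q."""
    return 1.0 - 10.0 ** (-q / 10.0)

def coverage(read_qualities, region_length, min_q=20):
    """Average number of high-quality bases per nucleotide of the region.

    read_qualities: hypothetical input, one list of Phred scores per read.
    """
    high_quality = sum(1 for quals in read_qualities for q in quals if q >= min_q)
    return high_quality / region_length
```

Under this definition each 10-point step in Phred score corresponds to a tenfold drop in error rate.
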
[UC-Santa Cruz, US, Human Genome Project Working Draft Terminology, 2001] http://genome.ucsc.edu/goldenPath/term.html

depth: See under coverage

directed sequencing: See under shotgun sequencing

draft genome sequence [human]: The sequence produced by combining the information from individual sequenced clones (by creating merged sequenced contigs and then employing linking information to create scaffolds) and positioning the sequence along the physical map of the chromosomes. (Nickname "golden path".) [Univ. of California Santa Cruz Human Genome Project Working Draft terminology] http://genome.ucsc.edu/goldenPath/term.html
Sequence with lower accuracy than a finished sequence; some segments are missing or in the wrong order or orientation. [History of the Human Genome Project, "A Genome Glossary" Science 291: pullout chart Feb. 16, 2001] See also working draft, human sequence

dynamic programming methods: Assure the optimal global (Needleman and
Wunsch 1970; Sankoff and Kruskal 1983) or local (Smith, et al. 1981)
alignment by simply exploring all possible alignments and choosing the best. ["Pedestrian guide to analysing sequence databases" Burkhard Rost, Reinhard Schneider, 1999] http://cubic.bioc.columbia.edu/papers/1999_pedestrian/paper.html
These methods allow the introduction of artificial gaps in aligned sequences to create an optimal alignment. Related terms alignment, In-depth gap penalties, global alignment, local alignment, Needleman-Wunsch, sequencing algorithms

EST assembly:

finished sequence - human: Sequence in which bases are identified to
an accuracy of no more than 1 error in 10,000 and are placed in the right
order and orientation along a chromosome with almost no gaps. [History of the Human Genome Project, "A Genome Glossary" Science 291: pullout chart Feb. 16, 2001]
Each base pair has been sequenced 8-10 times, with the remaining gaps
limited by present technology. ... No eukaryotic genome sequenced so
far has been totally sequenced - current technology isn't up to it. Highly
repetitive regions (not expected to contain many protein- coding genes)
can be impossible (or very difficult) to clone. One definition of "finished" is that fewer than one base in 10,000 is incorrectly assigned. [Peer Bork, Richard Copley "Filling in the gaps" Nature 409: 818-820, 15 Feb. 2001]
"At some level it’s a little arbitrary when you declare a sequence essentially complete," says NHGRI Director Francis Collins… "The definition of finished is evolving. Our definition today is different from 10 years ago. Ten years ago we didn’t even think at the level of genomes," says Laurie Goodman, editor of Genome Research. "I think the community at large should define done. Not everyone is going to agree, but when you’re using the word you should define what it means." Francis Collins says "You’re done when you’ve exhausted the standard methods for closing the gaps. There should be some biological reason why those last bits of sequence eluded you – not because you just didn’t bother." ["Are we there yet?" The Scientist :12 July 19, 1999] http://www.the-scientist.com/yr1999/july/hopkin_p12_990719.html
Related terms finished clone, Human Genome Project, post-genomic. Genomics glossary

finishing standards - Human Genome Project: The International
Human Genome Consortium recognizes the need to maximize the likelihood
that the finished human genome sequence meets consistent standards of quality
across all participating genome centers, and to adopt uniform practices
and annotation for regions that present problems for current sequencing
technology. At the Seventh International Meeting, the Consortium approved
a detailed set of consensus standards for what should be considered as
finished sequence, a set of rules for dealing with regions that are difficult
to resolve, and a set of finishing annotation tags to be submitted with
accessions. [Washington Univ. School of Medicine "Finishing Standards"
Dec. 12, 2000] http://genome.wustl.edu/gsc/Overview/finrules/hgfinrules.html

full shotgun coverage: The coverage in random raw sequence needed from
a large-insert clone to ensure that it is ready for finishing; this varies among
centers but is typically 8-10 fold. Clones with full shotgun coverage can
usually be assembled with only a handful of gaps per 100 kb. [Univ. of
California Santa Cruz Human Genome Project Working Draft terminology] http://genome.ucsc.edu/goldenPath/term.html

gap: A space introduced into an alignment to compensate for insertions and deletions in one sequence relative to another. To prevent the accumulation of too many gaps in an alignment, introduction of a gap causes the deduction of a fixed amount (the gap score) from the alignment score. Extension of the gap to encompass additional nucleotides or amino acids is also penalized in the scoring of an alignment.
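
As a rough illustration of the two deductions described above (one for opening a gap, a smaller one for each position that extends it), the sketch below scores an already-aligned pair of sequences. The function names and all numeric values (match, mismatch, gap_open, gap_extend) are hypothetical choices for the example, not values taken from NCBI.

```python
def gap_cost(length, gap_open=5, gap_extend=1):
    """Total deduction for a single gap spanning `length` positions."""
    if length <= 0:
        return 0
    # The first gapped position pays gap_open; each further one pays gap_extend.
    return gap_open + gap_extend * (length - 1)

def score_alignment(row_a, row_b, match=2, mismatch=-1, gap_open=5, gap_extend=1):
    """Score two equal-length aligned rows in which '-' marks a gap."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    score = 0
    prev_gap = None  # row that held the gap in the previous column, if any
    for x, y in zip(row_a, row_b):
        if x == "-" or y == "-":
            gap_row = "a" if x == "-" else "b"
            # Continuing a gap in the same row is cheaper than opening a new one.
            score -= gap_extend if prev_gap == gap_row else gap_open
            prev_gap = gap_row
        else:
            score += match if x == y else mismatch
            prev_gap = None
    return score
```

With these example values a three-position gap costs 7 rather than 15, so one long gap is preferred over several short ones.
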
[NCBI Bioinformatics] Narrower term In-depth gap penalties

genotyping: The determination of relevant nucleotide-base sequences in each of the two parental chromosomes. [CHI SNPs Update] Broader term sequencing; Narrower term haplotyping

genotyping technologies: Genotyping technologies have proliferated rapidly in recent years, and at least one hundred methods are currently available for detecting the genotypes of individual SNPs. In diploid organisms, such as humans, the linkage of particular SNP genotypes on each chromosome in a homologous pair (the haplotype) may provide additional information not available from SNP genotyping alone. [Lawrence Berkeley Lab "High Throughput Haplotyping of Diploid Organisms", 2001] http://www.lbl.gov/Tech-Transfer/collaboration/techs/lbnl1748.html
Related terms Genetic variations glossary haplotyping, Sequencing glossary

global alignment: The alignment of two nucleic acid or protein sequences over their entire length.
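
A minimal sketch of a global alignment computed by dynamic programming, in the style of Needleman-Wunsch. The scoring values (match, mismatch, and a single linear gap penalty) are assumptions chosen for illustration; real tools typically use substitution matrices and separate gap-open and gap-extension penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for a global alignment of a and b."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

Because every position of both sequences must appear in the result, unmatched ends are paid for with gaps; a local alignment drops that requirement.
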
[NCBI Bioinformatics] Related term dynamic programming methods, Broader term alignment

haplotyping: Broader terms genotyping, sequencing

haplotyping technologies:

Hidden Markov Models HMM: Algorithms & data management

homology: Narrower terms sequence homology, sequence homology - nucleic acid; Functional genomics glossary homology. Related terms homolog (homologue), similarity; Molecular modeling homology modeling

human sequence: See draft sequence, finished sequence, published sequence, working draft

library; library, genomic: Cell biology glossary

protein sequence: Proteins glossary

published working drafts - human genome: International Human Genome Sequencing Consortium special issue: Nature 409 (6822) 15 Feb 2001 http://www.nature.com/nature/journal/v409/n6822/ http://www.nature.com/genomics/human/papers/analysis.html
Human Genome [Celera Genomics sequence] special issue: Science 291 (5507) Feb. 16, 2001 http://www.sciencemag.org/content/vol291/issue5507/index.shtml

random sequencing: See under shotgun sequencing

resequencing: Previously sequenced site is resequenced for SNP
discovery or other purposes. [CHI SNPs]
Eric Lander, director of the Whitehead Institute's Center for Genome Research and professor of biology at MIT, notes "The human genome will need to be sequenced only once, but it will be resequenced thousands of times, in order, for example to unravel the polygenic factors underlying human susceptibilities and predispositions … Re-sequencing will also provide the ultimate tool for genotyping studies" [E. Lander "The New Genomics" Science 274: 536, 25 Oct. 1996]

rough drafts - human genome: Related terms finished sequence, finishing standards, published working drafts, working drafts

scaffolds: Ordered set of contigs placed on the chromosome. [NCBI, Human Genome Home "Contig Assembly Process" Glossary, Feb. 2001] http://www.ncbi.nlm.nih.gov/genome/guide/build.html#glossary
A series of contigs that are in the right order but are not necessarily
connected in one continuous stretch of sequence. [History of the Human Genome Project, "A Genome Glossary" Science 291: pullout chart Feb. 16, 2001]
The definition of a scaffold appears to be quite different in the Science and Nature draft published sequences. [David Galas "Making sense of the sequence" Science 291: 1257-1260, Feb. 16, 2001] This is also different from the scaffold defined in Drug discovery and development glossary.
The result of connecting contigs by linking information, such as paired-end reads from plasmids, paired-end reads from BACs, known mRNAs, or other sources. The contigs in a scaffold are ordered and oriented with respect to one another. [Univ. of California Santa Cruz Human Genome Project Working Draft terminology] http://genome.ucsc.edu/goldenPath/term.html
Narrower terms In-depth sequence-contig scaffold, sequenced-clone-contig scaffold. Related term In-depth contig assembly.

scoring methods: Many choices; the best choice is often problem dependent. Nice review: ["Sequence Analysis: Which scoring method should I use?" Pittsburgh Supercomputing Center, Carnegie Mellon Univ. 1999] http://www.psc.edu/research/biomed/homologous/scoring_primer.html Related terms In-depth filtering, masking, SNP scoring. Molecular modeling glossary homology modeling

sequence alignments: See alignments.

sequence analysis: Sequence analysis is a robust field,
and mining sequence data using bioinformatics
is one of the main activities of genomics- based drug discovery. Using
sequence analysis to understand whole genomes may provide an important
advantage for groups looking for new drug targets among genes, or trying to pick
the best among targets they already have. Sequence analysis is one of the most widely used techniques in genomics. A great deal of sequence work will continue to be done, as researchers
fill in the gaps left in the genome maps of humans and other important
organisms. Studies to confirm sequence, and to identify SNPs, will also need to
continue. [CHI Bioinformatics]

sequence homology: <molecular biology> Strictly, refers
to the situation where nucleic acid or protein sequences are similar because
they have a common evolutionary origin. Often used loosely to indicate
that sequences are very similar. Sequence similarity is observable, homology
is an hypothesis based on observation. (18 Nov. 1997) [OMD] Broader term Functional genomics glossary homology; Narrower term sequence homology - nucleic acid; Related terms Functional genomics glossary evolutionary homology; Proteomics glossary regulatory homology; Molecular modeling glossary homology modeling; Structural genomics glossary structural homology

sequence homology - nucleic acid: The sequential correspondence
of nucleotide triplets in a nucleic acid molecule which permits nucleic
acid hybridization. Sequence homology is important in the study of mechanisms
of oncogenesis and also as an indication of the evolutionary relatedness
of different organisms. The concept includes viral homology. [MeSH] Broader term sequence homology

sequencing: (proteins, nucleic acids) Analytical procedures for
the determination of the order of amino acids in a polypeptide chain or
of nucleotides in a DNA or RNA molecule. [IUPAC Compendium] Largely automated.
Sequencing of biomolecules began with the insulin B-chain - a thirty residue peptide - which Sanger and Tuppy deduced through a combination of limited proteolysis and chemical analysis in 1951. It was a full 14 years before Holley et al. determined the sequence of alanine tRNA from yeast. And it took another 12 years until "real" DNA sequencing was developed by Maxam & Gilbert and Sanger et al. in 1977. [Introduction to bioinformatics, Univ. of Munich Gene Center, Germany, Summer 2000] http://www.lmb.uni-muenchen.de/groups/bioinformatics/01/ch_01_1.html
Narrower terms resequencing, sequencing - algorithms, sequencing - cost of,
sequencing - high-throughput, sequencing - throughput, shotgun sequence, single DNA molecule sequencing, whole genome shotgun sequencing In-depth chain termination sequencing, chemical cleavage sequencing, chemical degradation sequencing, de novo sequencing, dideoxy sequencing, microsequencing, minisequencing, multiplex sequencing, Sanger sequencing

sequencing algorithms: See In-depth BLAST, FASTA, Needleman-Wunsch, Smith-Waterman

sequencing - cost of: The cost of sequencing a single DNA base was about $10 [when the Human Genome Project was initiated]; today, sequencing costs have fallen about 100-fold to $.10 to $.20 a base and still are dropping rapidly. [Human Genome News 11 (1-2) Nov. 2000] http://www.ornl.gov/hgmis/publicat/hgn/v11n1/01giants.html

sequencing - high-throughput: Uses robotics, automated DNA-sequencing machines and computers.

sequencing - throughput: Production of genome sequence has skyrocketed over the past year, with
more than 90 percent of the sequence having been produced in the past 15
months alone. Because of this increased capacity, the next phase is expected
to move much more rapidly than previously expected. [NHGRI, "International
Human Genome Sequencing Consortium Publishes Sequence and Analysis of the Human Genome" Washington, D.C., February 12, 2001]
http://www.nhgri.nih.gov/NEWS/initial_sequencePR.html

shotgun sequencing method: Sequencing method which involves randomly sequencing
tiny cloned pieces of the genome, with no foreknowledge of where on a chromosome
the piece originally came from. This can be contrasted with "directed"
[sequencing] strategies, in which pieces of DNA from adjacent stretches of a chromosome
are sequenced. Directed strategies eliminate the need for complex reassembly
techniques. Because there are advantages to both strategies, researchers
expect to use both random (or shotgun) and directed strategies in combination
to sequence the human genome. [DOE] Uses dynamic programming methods. Narrower term whole genome shotgun sequencing.

similarity: Functional genomics glossary

similarity search: BLAST, FASTA and Smith-Waterman (see In-depth) are examples of similarity search algorithms.

single DNA molecule sequencing: The evolution of technology for single DNA molecule sequencing will ultimately permit
whole genome analysis of populations of cells at high resolution and will obviate current
PCR-based approaches, particularly important for sequencing diploid or polyploid cells. This is the ultimate in sensitivity, and perhaps difficulty. Further in the future, it might be possible to utilize the protein synthesis machinery of the cell as a "sequencing engine." [National Center for Research Resources "Integrated Genomics Technologies Workshop Report" Jan 1999] http://www.ncrr.nih.gov/newspub/genomic.pdf

viral homology: See under sequence homology - nucleic acid

whole genome shotgun sequencing: Celera’s whole genome shotgun sequencing technique involves sequencing from both ends of the double stranded cloned DNA. Celera’s accurately paired clone end sequences are a key tool for assembling the genome much more completely than single stranded sequencing methods allow at comparable levels of sequence coverage. Celera’s paired end-sequencing strategy, as part of the whole genome shotgun sequencing technique, has now produced sequence pairs from clones that cover the human genome 11 times. The company believes that 99% of the human genome is represented in the cloned DNA.
[press release "Celera Genomics completes sequencing phase of the genome from one human being" Rockville, MD, April 6, 2000] http://www.pecorporation.com/press/prccorp040600.html
Broader term shotgun sequencing method.

working draft, human genome sequence: This milestone was announced at the White House (Washington DC, US) on June 26, 2000. President Bill Clinton was joined by Francis Collins (National Human Genome Research Institute) and Craig Venter (Celera Genomics) and heads of the major US genome sequencing centers. Work continues to be done on annotating the sequence, but further celebration ensued with publication of two versions of the sequence in Feb. 2001. Related terms draft sequence, finished sequence - human, published working drafts. Genomics glossary Human Genome Project

Bibliography
NCBI (US) BLAST Glossary, 2000. 40+ definitions http://www.ncbi.nlm.nih.gov/Education/BLASTinfo/glossary2.html

IUPAC definitions are reprinted with the permission of the International Union of Pure and Applied Chemistry.

In-depth Sequencing glossary

BEAUTY BLAST Enhanced Alignment Utility: An enhanced version
of the NCBI's BLAST database search tool. BEAUTY, when used to search three
new custom sequence databases that we have developed, incorporates information
on sequence family membership, the location of the conserved domains, and
the locations of any annotated domains and sites directly into BLAST search
results. These enhancements make it much easier to detect weak, but functionally
significant, matches in BLAST database searches. http://searchlauncher.bcm.tmc.edu:9331/seq-search/Help/beauty.html

BLAST (Basic Local Alignment Search Tool): Software program from
NCBI for searching public databases for homologous sequences or proteins.
Designed to explore all available sequence databases regardless of whether
query is protein or DNA. http://www.ncbi.nlm.nih.gov/BLAST/ Faster but less rigorous than FASTA or Smith-Waterman In-depth

BLAST2: A newer release of BLAST that allows insertions or deletions in the aligned sequences. Gapped alignments may be more biologically significant. Synonymous with gapped BLAST [labvelocity.com]

chain termination sequencing method: See Sanger sequencing (under Maxam-Gilbert & Sanger).

chemical cleavage sequencing: See Maxam-Gilbert sequencing.

chemical degradation sequencing: See Maxam-Gilbert sequencing.

consensus sequence: A theoretical representative nucleotide or
amino acid sequence in which each nucleotide or amino acid is the one,
which occurs most frequently at that site in the different forms which
occur in nature. The phrase also refers to an actual sequence, which approximates
the theoretical consensus. [MeSH]
A sequence of DNA, RNA, protein or carbohydrate derived from a number
of similar molecules, which comprises the essential features for a particular
function. [IUPAC Bioinorganic]

conserved sequence: A base sequence in a DNA molecule (or an amino acid sequence in a protein) that has remained largely unchanged throughout evolution. [DOE]
A "highly conserved sequence" is a DNA sequence that is very similar
in several different kinds of organisms. Scientists regard these cross
species similarities as evidence that a specific gene performs some basic
function essential to many forms of life and that evolution has therefore
conserved its structure by permitting few mutations to accumulate in it.
[NHGRI]

de novo sequencing: Determination of sequences (of genes or amino acids) whose sequence is not yet known. Can be done with LC/MS/MS or nanoelectrospray MS/MS. [CHI Proteomics] From the Latin "de novo", from the beginning. See also Mass spectrometry glossary.

dideoxy sequencing: See Sanger sequencing under Maxam-Gilbert & Sanger.

FACS: Fluorescence activated cell sorter. Related terms flow cytometry, flow sorting.

FASTA: The first widely used algorithm for database similarity searching. The program looks for optimal local alignments by scanning the sequence for small matches called "words". Initially, the scores of segments in which there are multiple word hits are calculated ("init1"). Later the scores of several segments may be summed to generate an "initn" score. An optimized alignment that includes gaps is shown in the output as "opt". The sensitivity and speed of the search are inversely related and controlled by the "k-tup" variable which specifies the size of a "word". (Pearson and Lipman)
[NCBI Bioinformatics] More rigorous and slower than BLAST. http://fasta.bioch.virginia.edu/

filtering: Also known as masking. The process of hiding regions of (nucleic acid or amino acid) sequence having characteristics that frequently lead to spurious high scores. [NCBI Bioinformatics]

flow cytometry: Technique for characterizing or separating particles
such as beads or cells, usually on the basis of their relative fluorescence.
[IUPAC Combinatorial Chemistry]
Technique using an instrument system for making, processing, and displaying one or more measurements on individual cells obtained from a cell suspension. Cells are usually stained with one or more fluorescent dyes specific to cell components of interest, e.g., DNA, and fluorescence of each cell is measured as it rapidly traverses the excitation beam (laser or mercury arc lamp). Fluorescence provides a quantitative measure of various biochemical and biophysical properties of the cell, as well as a basis for cell sorting. Other measurable optical parameters include light absorption and light scattering, the latter being applicable to the measurement of cell size, shape, density, granularity, and stain uptake. [MeSH] Related terms FACS, flow sorting

flow sorting: Employs flow cytometry to separate, according to
size, chromosomes isolated from cells during cell division when they
are condensed and stable. As the chromosomes flow singly past a laser beam,
they are differentiated by analyzing the amount of DNA present, and individual
chromosomes are directed to specific collection tubes. [Primer on Molecular
Genetics, ORNL, US] http://www.ornl.gov/hgmis/publicat/primer/intro.html

GRAIL: Gene Recognition and Assembly Internet Link software http://compbio.ornl.gov/Grail-1.3/
A suite of tools designed to provide analysis and putative annotation of DNA sequences both interactively and through the use of automated computation. [Grail overview, Oak Ridge National Lab, US] http://compbio.ornl.gov/manuals/grail1.3-genquest.9605.shtml#GrailOverview
Does this name refer in some way to Walter Gilbert's description of the Human Genome Project as the "Holy Grail" of molecular biology?

gap penalties: An important problem is the treatment of gaps,
i.e., residues inserted (or deleted) to optimise the objective function.
Usually, gap penalties (cost of inserting and extending gaps) are chosen
to be length dependent. Typically, the cost of extending a gap (gap elongation)
is 5-10 times lower than is the cost for introducing a gap (gap open).
The optimal choice of gap penalties depends on the particular method and,
in detail, on the particular sequence family ["Pedestrian guide to analysing
sequence databases" Burkhard Rost, Reinhard Schneider, 1999] http://cubic.bioc.columbia.edu/papers/1999_pedestrian/paper.html
Related terms alignment, dynamic programming methods. Broader term gaps

gapped BLAST: A version of the BLAST algorithm that allows
gaps (deletions and insertions) to be introduced into aligned sequences.
The scoring of these gapped alignments tends to reflect biological relationships
more closely. Synonymous with BLAST2. [labvelocity.com]

initial sequence contigs: Derived from sequenced clones [David Galas "Making sense of the sequence" Science 291: 1257-1260, 16 Feb. 2001]

local alignment: The alignment of some portion of two nucleic acid or protein sequences. [NCBI Bioinformatics] Best alignment method for sequences for which no evolutionary relatedness is known. See Smith-Waterman alignment. Compare global alignment.

masking: Also known as filtering. The removal of repeated or low complexity regions from a sequence in order to improve the sensitivity of sequence similarity searches performed with that sequence.
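
As a deliberately simplified illustration of masking, the sketch below hides only one kind of low-complexity region: long runs of a single repeated letter. The function name, the run-length threshold, and the mask character are assumptions made for this example; production filters use more general complexity measures than this.

```python
import re

def mask_homopolymer_runs(sequence, min_run=6, mask_char="N"):
    """Replace every run of one repeated letter of length >= min_run with mask_char.

    min_run is assumed to be at least 2.
    """
    # (.) captures one letter; \1{k,} requires at least k further copies of it.
    pattern = re.compile(r"(.)\1{%d,}" % (min_run - 1))
    return pattern.sub(lambda m: mask_char * len(m.group(0)), sequence)
```

Masked positions keep their length, so coordinates in the masked sequence still line up with the original. For protein sequences a caller might pass mask_char="X" instead.
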
[NCBI Bioinformatics]

Maxam-Gilbert sequencing & Sanger sequencing: The two basic
sequencing approaches, Maxam-Gilbert and Sanger, differ primarily in the way the nested DNA fragments are produced. Both methods work because gel electrophoresis produces very high resolution separations of DNA molecules; even fragments that differ in size by only a single nucleotide can be resolved. Almost all steps in these sequencing methods are now automated.
Maxam-Gilbert sequencing (also called the chemical degradation method) uses chemicals to cleave DNA at specific bases, resulting in fragments of different lengths. A refinement to the Maxam-Gilbert method known as multiplex sequencing enables investigators to analyze about 40 clones on a single DNA sequencing gel.
Sanger sequencing (also called the chain termination or dideoxy method) involves using an enzymatic procedure to synthesize DNA chains of varying length in four different reactions, stopping the DNA replication at positions occupied by one of the four bases, and then determining the resulting fragment lengths. [Primer on Molecular Genetics, Oak Ridge National Lab, US] http://www.ornl.gov/hgmis/publicat/primer/intro.html

merged sequence contigs: Derived by merging sequence contigs from overlapping sequenced clones. [David Galas "Making sense of the sequence" Science 291: 1257-1260, 16 Feb. 2001]

microsequencing: Sequencing of proteins or peptides in very small amounts (sub microgram), sometimes for use as probes.

minisequencing: A solid-phase method for the detection of any known point mutation or allelic variation of DNA. In the method, amplified, biotinylated DNA sequences containing the mutation site are immobilized
onto a streptavidin-coated microplate and primer extension reactions are carried out using labeled nucleotides. Incorporation of the labeled nucleotide is dependent on the genotype and is analyzed using an ELISA technique. The assay method allows automation. [Labsystems Oy, Finland] http://www.labsystems.fi/applications/photometry/an104.htm Single base sequencing. Related terms Genetic variations glossary

multiple sequence alignment: An alignment of three or more sequences with gaps inserted in the sequences such that residues with common structural positions and/or ancestral residues are aligned in the same column. ClustalW is one of the most widely used multiple sequence alignment programs. [NCBI Bioinformatics]
The concept of dynamic programming cannot be extended to align more than three sequences optimally (Murata 1990). A way around this problem is to first find optimal pairwise alignments and to then merge the pairs. ["Pedestrian guide to analysing sequence databases" Burkhard Rost, Reinhard Schneider, 1999] http://cubic.bioc.columbia.edu/papers/1999_pedestrian/paper.html Related term Hidden Markov Models HMM

multiplex sequencing: See under Maxam-Gilbert sequencing.

Needleman-Wunsch: Global sequence alignment algorithm. [Needleman,
S. B., Wunsch, C. D., "A general method applicable to the search for similarities in the amino acid sequence of two proteins" J. Mol. Biol. 48: 443-453, Mar. 1970] Related terms dynamic programming; Algorithms & data management glossary, Molecular modeling glossary

optimal alignment: An alignment of two sequences with the highest possible score.
[NCBI Bioinformatics]
Alignments are intended to unravel evolutionary pathways and/or structural homology between two proteins. These two objectives (functional/structural) may be mutually contradictory, i.e., the 'optimal alignment' may differ according to the objective. Yet another perspective is the 'mathematical' optimal alignment. This is the alignment that optimises a given objective function, e.g., to find the alignment with the highest number of pairwise identical residues. FASTA and BLAST are not guaranteed to find such a mathematically optimal alignment. ["Pedestrian guide to analysing sequence databases" Burkhard Rost, Reinhard Schneider, 1999] http://cubic.bioc.columbia.edu/papers/1999_pedestrian/paper.html

pairwise alignment:

protein datasets: Available from Ensembl and NCBI.

Involves finding new SNPs. ... tools are just beginning to emerge and many more robust technologies are needed. [NIH, Methods for Discovering and Scoring Single Nucleotide Polymorphisms, Request for Applications Jan. 9, 1998] http://grants.nih.gov/grants/guide/rfa-files/RFA-HG-98-001.html

SNP scoring: Involves methods to determine the genotypes of many individuals for particular SNPs that have already been discovered. ... tools are just beginning to emerge and many more robust technologies are needed. http://grants.nih.gov/grants/guide/rfa-files/RFA-HG-98-001.html

Sanger sequencing: See under Maxam-Gilbert sequencing.

sequence tags: Sequence bits 2-4 contig residues in length.
Used to determine the mass of a particular sequence. [CHI Proteomics] Can be used to search protein and EST databases with high specificity. [Blackstock & Weir “Proteomics” Trends in Biotechnology 17:121 Mar 1999]

sequence-contig scaffold: Scaffold produced by connecting a maximal set of sequence contigs joined by bridged gaps. [Univ. of California Santa Cruz Human Genome Project Working Draft terminology] http://genome.ucsc.edu/goldenPath/term.html

sequenced-clone-contig scaffold: Scaffold produced by joining sequenced clone contigs by bridged SCC gaps. [Univ. of California Santa Cruz Human Genome Project Working Draft terminology] http://genome.ucsc.edu/goldenPath/term.html

Smith-Waterman alignment: An amino acid sequence alignment that illustrates sequence similarity. The alignment is generated using the Smith-Waterman algorithm (Temple Smith and MS Waterman, J Mol Biol. 147: 195-197, 1981; WR Pearson Genomics 11:635-650, 1991) [SGD Saccharomyces Genome Database glossary, Stanford Univ.] http://genome-www.stanford.edu/Saccharomyces/help/glossary.htm Related terms dynamic programming; Algorithms & data management glossary, Molecular modeling glossary
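
A minimal sketch of a Smith-Waterman style local alignment, for illustration only. The scoring values (match, mismatch, and a single linear gap penalty) are assumptions; protein comparisons in practice use a substitution matrix and separate gap-open and gap-extension penalties.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for the best local alignment."""
    n, m = len(a), len(b)
    # score[i][j] is the best score of any local alignment ending at a[i-1], b[j-1].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # Flooring at zero lets an alignment restart anywhere.
            score[i][j] = max(0,
                              score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)
    # Trace back from the highest-scoring cell until a zero is reached.
    out_a, out_b = [], []
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))
```

Unlike a global alignment, only the best-matching portion of each sequence is returned, and two unrelated sequences simply yield a score of zero and empty strings.
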