Sequences – DNA & beyond
Evolving terminology for emerging technologies
Comments? Suggestions? Revisions? mchitty@healthtech.com
Last revised December 27, 2001
Gene definitions is inextricably linked to this glossary. Other related glossaries include
Applications: Genomics, Proteomics, Sequencing
Informatics: Algorithms, Molecular Modeling
Biology: Biomolecules, Expression, Proteins, Protein Structures.
Additional definitions appear in the In-depth glossary, after the Bibliography.
alternative splicing: Gene
definitions Broader term splicing Related terms pre-mRNA splicing,
protein splicing, RNA splicing, trans-splicing
alternative transcripts: Expression, genes & beyond
antisense DNA: Pharmaceutical biology glossary
antisense RNA: Pharmaceutical biology glossary
cDNA complementary DNA: Gene definitions
carbohydrate sequence: The sequence of carbohydrates within POLYSACCHARIDES,
GLYCOPROTEINS, and GLYCOLIPIDS. [MeSH] Biomolecules
glossary
central dogma: Horace Freeland Judson quotes Francis Crick talking about the central dogma: "Nobody tried
to go from protein sequence back to nucleic acid, because that just wasn't on.
You see. But I don't think it was ever discussed. ... Jim [Watson], you
might say, had it first. DNA makes RNA makes protein. That became then the
general idea. ... what are all the possible information flows?" [Judson
asked why he had called it the central dogma?] "It was because, I think, of
my curious religious upbringing. Because Jacques [Monod] has since told me that
a dogma is something which a true believer cannot doubt!" Crick laughed.
... "But that wasn't what was in my mind. My mind was, that a dogma was an
idea for which there was no reasonable evidence. You see?!" And Crick gave
a roar of delight. "I just didn't know what dogma meant. And I could just
as well have called it the "Central Hypothesis" - you know. Which is
what I meant to say. Dogma was just a catch phrase. ... And it's a negative
hypothesis, so it's very very difficult to prove.... The central dogma is much
more powerful [than Crick's sequence hypothesis], and therefore in principle you
might have to say it could never be proved. But its utility - there was
no doubt about that. Because if you didn't believe that, you could invent
theories, unlimited theories, whereas if you just put in that one assumption,
... then, essentially you were on the right track you see." ... "In
looking back I am struck not only by the brashness which allowed us to venture
powerful statements of a very general nature, but also by the rather delicate
discrimination used in selecting what statements to make. Time has shown that
not everybody appreciated our restraint" [HF Judson, Eighth Day of
Creation, Cold Spring Harbor Laboratory Press, 1996, pp. 333-334]
F. Crick "Central dogma of molecular biology" Nature 227 (5258): 561-563, Aug. 8,
1970 [historical article clarifying original explanation]
The Oxford
English Dictionary makes clear the duality of dogma,
particularly in the context of dogmatic, defined as "accepted as
true instead of being based upon experience, particularly if done in an
imperious, arrogant manner". Dogma is defined as
"systematised beliefs" (sometimes deprecating). Dogmatic physicians
are cited as "an ancient sect" which "endeavoured to discover by
reasoning the essence and occult causes" of disease. Central dogma
chapter MIT Biology Hypertextbook http://esg-www.mit.edu:8001/esgbio/dogma/dogmadir.html
Related terms transcription, translation In-depth central dogma exceptions
cis-splicing: The joining together, after removal of the intron, of
two segments of the same RNA molecule separated by an intron. Related
terms: intron, RNA splicing, trans-splicing
[California Space Institute, Glossary, 2000] http://calspace.ucsd.edu/origins/Glossary/C.htm
clone, cloning: Cell biology glossary
coding region(s): Gene definitions
codon: The sequence of three consecutive nucleotides that occurs
in mRNA which directs the incorporation of a specific amino acid
into a protein or represents the starting or termination signals of protein
synthesis. [IUPAC Biotech, IUPAC Medicinal Chemistry]
A set of three nucleotides in DNA or RNA that codes for a specific amino
acid. The term is also used for the corresponding (and complementary) sequences
of three nucleotides in messenger RNA into which the original DNA
sequence is transcribed. [MeSH] Related terms transcription, translation.
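As an illustration of the definitions above, a minimal Python sketch of reading an mRNA three nucleotides at a time might look like the following. It assumes the standard genetic code and that reading begins at the first base of the string; the names CODON_TABLE and translate are illustrative and are not taken from any of the cited sources.

```python
from itertools import product

# Standard genetic code, with codons ordered U, C, A, G at each position.
# "*" marks the three stop (termination) codons.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate an mRNA string (written 5' to 3') until a stop codon is met."""
    mrna = mrna.upper().replace("T", "U")  # accept DNA-style letters as well
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# AUG codes for methionine (and start), GCC for alanine, UAA is a stop codon.
print(translate("AUGGCCUAA"))
```

The table is built from a compact 64-letter string rather than typed out codon by codon; a hand-written dictionary would work equally well.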
Narrower terms start codon, stop codon.
Coined by Sydney Brenner "for a triplet of bases that specifies an
amino acid, introduced partly in satirical reference to Seymour Benzer's 'cistron', 'recon', and 'muton'; Brenner's
'codon' is the one that survives in universal biological use." [HF
Judson, Eighth Day of Creation, Cold Spring Harbor Laboratory Press, 1996,
p. 469]
DNA: Biomolecules glossary
DNA - RNA - protein: See central dogma
DNA synthesis: DNA replication, the process of making copies
of strands of DNA. Existing DNA is used as a template for synthesizing
the new strands. [PhRMA] Related term protein synthesis
ds: Double-stranded (DNA or RNA).
downstream: Identifies sequences proceeding farther in the
direction of expression; for example, the coding region is downstream
from the initiation codon, toward the 3' end of an mRNA molecule.
Sometimes used to refer to a position within a protein sequence, in which case
downstream is toward the carboxyl end which is synthesized after the amino
end during translation. [Lemon]
EST Expressed Sequence Tag: Partial gene sequence data of a cDNA
clone, which provide a sequence tag for a gene. In order to achieve a very
high throughput, these sequences are usually only subjected to a single
pass of sequencing so the error rate in these sequences can be high, perhaps
approaching 5%. [NCBI]
Developed by Craig Venter and colleagues and further established by
the Merck Gene Index. Clones from cDNA libraries are sequenced (single
read) from the 3’ end. [R Strausberg et al "The Cancer Genome Anatomy Project"
Trends in Genetics 16(3): 103-106 March 2000]
Often, but not necessarily, represent genes; generated through rapid but error-prone sequencing methods.
[CHI SNPs Update]
Related terms Gene definitions cDNA, transcript clusters; EST maps Maps genomic & genetic
exons: Gene definitions
gene expression: Expression glossary
genetic code: Gene definitions
genomic DNA: Genomics glossary
intron: An intervening section of DNA which occurs almost exclusively
within a eukaryotic gene, but which is not translated to amino acid sequences
in the gene product. The introns are removed from the pre-mature
mRNA through a process called splicing, which leaves the exons untouched,
to form an active mRNA. [IUPAC Bioinorganic, IUPAC Compendium]
A segment of DNA that is transcribed, but removed from within the transcript
by splicing together the sequences (exons) on either side of it. [DDBJ/ EMBL/ GenBank
Feature Table] http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html
Related terms exon, "junk DNA", non-coding, In-depth untranslated regions UTR.
"junk DNA": A general term that encompasses many different types
of DNA sequences. These sequences run the gamut from introns, the parts
of genes that are edited out during protein synthesis; transposable elements,
repeated DNA sequences that, like parasites, duplicate themselves, adding
nothing to the genome except more redundant sequence; and pseudogenes,
fossils of one-time genes … all of the regulatory elements – promoters and
inhibitors – required for gene transcription are spelled out somewhere between
the genes. The same is true of other elements deemed junk, such as introns
and RNA genes, which clearly hold important clues to understanding alternative
splicing … the term junk DNA is frequently used incorrectly. Numerous articles
in the medical literature use junk and non-coding DNA interchangeably.
[B. Kuska "Bring in Da Noise, Bring in Da Junk" JNCI 90(15): 1125-1127
Aug. 5, 1998]
Dr. Susumu Ohno, writing in the Brookhaven Symposium on Biology
in 1972 in the article "So Much 'Junk' DNA in our Genome", is credited with
originating the term. But his paper was focused "mainly on the fossilized
genes, called pseudogenes, that are strewn like tombstones throughout
our DNA. But as the term caught on in the 1980s, its meaning was extended
to all non-coding sequences, the vast stretches of DNA that are not genes
and do not produce proteins" (about 95% of the genome) … some [scientists]
have begun to scrap the notion that all non-coding DNA is junk …
"I don't think people take the term very seriously anymore" says Eric Green [NHGRI] whose group is mapping chromosome 7.
[B. Kuska "Should Scientists
Scrap the Notion of Junk DNA?" JNCI 90(14): 1032-1033 July 15 1998]
Narrower terms intron, non-coding, repetitive sequences.
mRNA messenger RNA: An RNA molecule that transfers the
coding information for protein synthesis from the chromosomes to the
ribosomes.
mRNA is formed from a DNA template by transcription. It may be a copy of
a single gene or of several adjacent genes (polycistronic mRNA). On the ribosome, the sequence is converted into the programmed amino acid
sequence through translation. [IUPAC Biotech]
Messenger RNA, an intermediate between DNA sequences and the production of
protein. The coding strand of DNA is transcribed as an mRNA (complementary to
the coding strand), which is then translated by transfer RNA (tRNA) and
building-block amino acids to produce a protein. [CHI Breaking Bottlenecks]
Includes 5' untranslated region (5' UTR), coding sequences (CDS, exon)
and 3' untranslated region (3' UTR) [DDBJ/ EMBL/ GenBank Feature Table]
http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html
Narrower terms antisense RNA, sense RNA, UTR Broader term RNA Related term reverse transcription
methylation: Proteins glossary
mtDNA: See mitochondrial DNA.
messenger RNA: See mRNA.
mitochondrial DNA: The genetic material of the mitochondria,
the organelles that generate energy for the cell. [NHGRI] Related terms mitochondrial
genes Gene Definitions; Cell biology glossary
mitochondria, organelles
non-coding DNA: Introns, spliced out of the messenger RNA following transcription. [NHLBI]
Non-coding DNA (also known as selfish,
ignorant, parasitic and incidental DNA) includes introns,
transposable elements, pseudogenes, repeat elements, satellites, UTRs, hnRNAs,
LINEs, SINEs, as well as unidentified junk and makes up approximately 97% of the human genome.
Some scientists were so overwhelmed by the amount of non-coding DNA that they
referred to the genome as “a
collection of non-coding regions interrupted by small coding regions.”
[Dov S. Greenbaum "Junk?" Genomics & Bioinformatics MBB 452a,
Yale Univ.] http://bioinfo.mbb.yale.edu/mbb452a/projects/Dov-S-Greenbaum.html
Related terms "junk DNA", non-coding regions, repetitive sequences; pseudogenes Gene
Definitions Narrower terms In-depth LINEs, non-coding first exons, SINEs, UTRs, others?
non-coding region(s): The part of a gene that does not specify
the structure of a protein. Non-coding regions of DNA often contain elements
that regulate when a protein will be made, and how much of that
protein will be produced. [SNP]
Related terms introns, "junk DNA", repetitive sequences; pseudogenes Gene
Definitions
nucleic acids: DNA or RNA Biomolecules
ORESTES open reading frame expressed sequence tags: Approach provides
sequence information along the whole length of each transcript, rather than just
the ends. The method involves low-stringency PCR to produce cDNA libraries,
samples of which are then sequenced. Camargo et al ["The contribution of 700,000 ORF sequence tags to the
definition of the human transcriptome" PNAS 2001, 98:12103-12108] generated
almost 700,000 ORESTES from 24 types of normal or malignant tissue using 3,540
mini-libraries. They predict that their ORESTES dataset may represent as many
as 60% of all human genes (including abundant and rare transcripts). The ORESTES
approach generates a larger coverage and a greater number of contigs per gene
than standard EST methods, offering the possibility to complete the closure
of most sequences using RT-PCR. http://www.biomedcentral.com/news/20011011/01
Related terms EST, ORF
ORF Open Reading Frame: Corresponds to a stretch of DNA that
could potentially be translated into a polypeptide; i.e., it begins
with an ATG "start" codon and terminates with one of the 3 "stop" codons.
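The start-to-stop definition above can be sketched in a few lines of Python. This is a minimal illustration, not the method of any sequencing group: it scans only the three frames of the strand it is given, takes the first ATG in each frame as the start, and the function name find_orfs and the min_codons parameter are hypothetical. The default of 100 codons is an adjustable assumption, chosen because a minimum size is commonly required before an ORF is treated as a candidate gene.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=100):
    """Return (start, end) index pairs of ATG...stop stretches on one strand.

    `end` is exclusive and includes the stop codon. `min_codons` counts the
    coding codons (ATG included, stop codon excluded).
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame ATG opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ATG in this frame
    return orfs


toy = "CCATGAAATTTTAGGG"
for start, end in find_orfs(toy, min_codons=3):
    print(start, end, toy[start:end])
```

With the default cutoff the toy sequence yields no hits, since its only ORF has just three coding codons. Scanning the opposite strand as well would require running the same function on the reverse complement.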
For an ORF to be considered as a good candidate for coding a bona fide
cellular protein, a minimum size requirement is often set, e.g., many of
the systematic sequencing groups define an ORF as a stretch of DNA that
would code for a protein of 100 amino acids or more. An ORF is not usually
considered equivalent to a gene or locus until there has been shown to
be a phenotype associated with a mutation in the ORF, and/ or an mRNA transcript
or a gene product generated from the ORF's DNA has been detected. [SGD glossary, Stanford Univ.
US] http://genome-www.stanford.edu/Saccharomyces/help/glossary.html#fasta
Sequences of structural genes
devoid of termination codons and therefore continuously "readable"
by RNA polymerase. [Metathesaurus]
Reading frames where successive nucleotide triplets can be read as codons specifying amino acids and where the sequence of these triplets is not interrupted by stop codons.
[MeSH] Broader term reading frame, Narrower term URF Related term: Omes
& omics glossary ORFeome
open reading frame: See ORF
pre-mRNA: mRNA See under pre-mRNA splicing
pre-mRNA splicing: One of the steps at which eukaryotic gene expression can be regulated is the
processing of mRNA precursors (pre-mRNAs), which includes the removal of intervening sequences
(splicing). Regulation at this step is widely used during cell
differentiation and development to turn on or off genes or to generate protein variants
with different properties from the same primary transcript. [Juan Valcárcel
"Research 1996" EMBL Gene Expression] Broader term splicing http://www.embl-heidelberg.de/ExternalInfo/ScientificProgrammes/Valcarcel.html
protein: Proteins glossary
protein coding, protein coding regions: See coding regions.
protein expression: Expression glossary
protein splicing: Excision of in-frame internal protein sequences (inteins) of a precursor protein, coupled with ligation of the flanking
sequences (exteins). Protein splicing is an autocatalytic reaction and
results in the production of two proteins from a single primary translation
product: the intein and the mature protein. [MeSH]
The excision of an intervening protein sequence (the
intein) from a protein precursor and the concomitant ligation of the flanking protein
fragments (the exteins) to form a mature extein protein
and the free intein (Perler 1994). Protein splicing results in a native peptide bond between the ligated
exteins (Cooper 1993). Extein ligation differentiates protein splicing from other forms of
autoproteolysis. [InBase (Intein Database), New England Biolabs, 2001] http://www.neb.com/inteins/intein_intro.html
Related terms In-depth exteins, inteins
protein synthesis: See translation.
RNA RiboNucleic Acid: Linear polymer molecules composed of a
chain of ribose units linked between positions 3 and 5 by phosphodiester
groups to which the bases adenine or guanine or uracil or cytosine, respectively
are attached … The three most important types of RNAs in the cell are mRNA,
tRNA, and rRNA. [IUPAC Biotech]
A single stranded nucleic acid that contains the sugar ribose. There
are several forms of RNA, including messenger RNA (mRNA), transfer RNA (tRNA), and ribosomal RNA
(rRNA) (all involved in protein synthesis), as well
as several small RNAs whose functions are still being clarified. Certain
viruses have RNA, instead of DNA, as their genetic material. [NIGMS]
A DNA-like molecule. Different kinds of RNA exist that play specific
roles in the process of gene expression.
[NHLBI]
Narrower terms mRNA, In-depth hnRNA precursor RNA, rRNA, scRNA, snRNA,
tRNA Related terms ribosomes, ribozymes, RNA polymerase, RNA splicing; Microarrays
glossary Northern blotting; Omes & omics glossary
ribonome, ribonomics
RNA databases: see Databases & software
directory.
RNA-RNA interactions: http://www.chem.fsu.edu/faculty/grnbm.htm
RNA silencing: See Functional
genomics glossary RNAi
RNA splicing: The ultimate exclusion of nonsense sequences
or intervening sequences (introns) before the final RNA transcript is sent
to the cytoplasm. [MeSH] Broader term splicing.
reading frames: The sequence of codons by which
translation
may occur. A segment of mRNA 5' AUCCGA3' could be translated in three reading
frames, 5' AUC.. or 5' UCC.. or 5' CCG.., depending on the location of
the start codon. [MeSH] Narrower term ORF Open Reading Frames
reference sequences: Reference sequence standards for the naturally
occurring molecules of the central dogma, from chromosomes to mRNAs
to proteins. Toward this goal, intermediate larger genomic regions, contigs,
are also produced. RefSeq standards provide a foundation for the functional annotation
of the human genome. They provide a stable reference point for mutation
analysis, gene expression studies, and polymorphism discovery. [RefSeq,
LocusLink, NCBI, US] http://www.ncbi.nlm.nih.gov/LocusLink/refseq.html
regulatory sequence: A DNA base sequence that controls gene
expression.
[DOE]
repetitive sequences: Make up at least 50% of the genome. Repetitive
sequences are thought to have no direct functions, but they shed light
on chromosome structure and dynamics. They hold important clues about evolutionary
events, help chart mutation rates, and by seeding DNA rearrangements, they
can modify genes and create new ones. They also serve as tools for genetic
studies. The vast majority of repeated sequences in the human genome are derived
from transposable elements - sequences like those that form viral genomes
- that propagate by inserting fresh copies of themselves in random places
in the genome. A full 45% of the human genome derives from such transposons.
A major surprise of this new global analysis of the human genome is that
many components in this diverse array of repeated sequences, traditionally
considered to be "junk," appear to have played a beneficial role over the
course of human evolution. [NHGRI "Summary of the Initial Sequencing and
Analysis of the Human Genome" press release, Feb. 11, 2001] Related term "junk
DNA", non-coding DNA. http://www.nhgri.nih.gov/NEWS/summary_of_sequence.html
reverse transcription: Reverse transcription is used naturally by retroviruses
to insert themselves into an organism's genome. Artificially induced reverse
transcription is a useful technique for translating unstable mRNA molecules
into stable cDNA. [J Buhler, Washington Univ.] http://www.cs.washington.edu/homes/jbuhler/research/array/glossary.html
Related term In-depth reverse transcriptases; Gene
definitions cDNA
ribonucleic acid: See RNA.
selfish DNA: See "junk DNA", non-coding DNA.
sequence: The order of neighbouring amino acids in a protein
or the purine and pyrimidine bases [A,C,T,G, uracil] in RNA
and DNA. [IUPAC Bioinorganic]
Getting more from your sequence on the web, EA Greene & S Henikoff,
1997 http://linkage.rockefeller.edu/wli/news/henikoff.html
Automated ways to keep up to date with sequences of particular interest.
Narrower terms carbohydrate sequence; Proteins
amino acid sequence Related terms Sequencing
draft sequence - human, published sequence - human, working draft sequence - human
specific DNA: See under non-specific DNA In-depth glossary
splicing: 1. Of RNA: the procedure by which introns are removed
from eukaryotic precursor mRNA molecules and adjacent exon sequences are
joined together (spliced). 2. Of DNA: manipulation for joining together
double stranded DNA fragments with protruding single stranded "sticky
ends" by means of ligases. [IUPAC Biotech, IUPAC Compendium]
Narrower terms protein splicing,
pre-mRNA splicing, RNA splicing, trans-splicing; Gene
Definitions alternative splicing, cDNA; Related terms Cell
biology glossary spliceosomes
trans-acting factors: Trans-acting factors
functionally have two domains. One domain is required for the factor to bind to
DNA, and the second domain is required for the activation of transcription. This
was discovered by studying deletion mutants of the factors. Mutant factors were
found that could bind DNA but could not activate transcription. Other
experiments, in which a hybrid protein consisting of the non-DNA-binding segment
of one trans-acting factor fused to the DNA-binding region of a second trans-acting
factor activated transcription, defined the second function of trans-acting
factors. [Phillip McClean "Control of gene expression in eukaryotes" North
Dakota State Univ. 1997] http://www.ndsu.nodak.edu/instruct/mcclean/plsc431/geneexpress/eukaryex6.htm
transcript: Expression
glossary Related terms In-depth 3' UTR, 5' UTR, primary transcript, terminator
transcription: The process by which the genetic information encoded
in a linear sequence of nucleotides in one strand of DNA is copied into
an exactly complementary sequence of RNA. [IUPAC Biotech]
The transfer of genetic information from DNA to messenger RNA by DNA-directed
RNA polymerase. It includes reverse transcription and transcription
of early and late genes expressed early in an organism’s life cycle or
during later development. [MeSH/ Metathesaurus]
The synthesis of an RNA copy from a sequence of DNA (a gene); the first
step in gene expression. Compare translation (the process
in which the genetic code carried by mRNA directs the synthesis of proteins
from amino acids). [DOE]
"Transcription" Central dogma chapter MIT Biology Hypertextbook http://esg-www.mit.edu:8001/esgbio/dogma/trx.html
Related terms translation; In-depth attenuator, reverse
transcriptases, transcription machinery; Narrower terms: Gene
amplification & PCR reverse transcription; Microarrays
In-depth Northern blotting
transcription factors: Endogenous
substances, usually proteins, which are effective in the initiation, stimulation,
or termination of the genetic transcription process. [MeSH] Narrower term In-depth artificial transcription factors
translation: The unidirectional process that takes place on the
ribosomes whereby the genetic information present in an mRNA is converted
into a corresponding sequence of amino acids in a protein. [IUPAC Bioinorganic]
The conversion of the genetic instructions for a protein from nucleotides
of messenger RNA with amino acids. [NIGMS]
"Translation" Central dogma chapter MIT Biology Hypertextbook
http://esg-www.mit.edu:8001/esgbio/dogma/trl.html
trans-splicing: The joining of RNA from two different genes. One type
of trans-splicing is the "spliced leader" type (primarily found in
protozoans such as trypanosomes and in lower invertebrates such as nematodes)
which results in the addition of a capped, noncoding, spliced leader sequence to
the 5' end of mRNAs. Another type of trans-splicing is the "discontinuous
group II introns" type (found in plant/ algal chloroplasts and plant
mitochondria) which results in the joining of two independently transcribed
coding sequences. Both are mechanistically similar to conventional nuclear pre-mRNA
cis-splicing. Mammalian cells are also capable of trans-splicing. [MeSH]
transposons: Gene definitions
URF: Unidentified Reading Frame
upstream: Identifies sequences located in a direction opposite to
that of expression; for example, the bacterial promoter is upstream of
the initiation codon. In an mRNA molecule, upstream means toward
the 5' end of the molecule. Occasionally used to refer to a region of a polypeptide chain which is located toward the amino terminus of the molecule.
[Lemon]
Bibliography
DDBJ/ EMBL/ GenBank
Feature Table, 2001, 100 + definitions. http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html
Alpha glossary index
IUPAC definitions are reprinted with the permission of the International
Union of Pure and Applied Chemistry.
In-depth Sequences, DNA & beyond
3' UTR (three prime): The sequence at the 3' end of messenger RNA
that does not code for product. This region contains transcription and
translation regulating sequences. [MeSH]
Region at the 3' end of a mature transcript
(following the stop codon) that is not translated into a protein. [DDBJ/
EMBL/ GenBank Feature Table] http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html
A term that identifies one end of a single-stranded nucleic acid molecule. The 3' end is that end of the molecule which terminates in a 3' hydroxyl group. The 3' direction is the direction toward the 3' end. Nucleic acid sequences are written with the 5' end to the left and the 3' end to the right, in reference to the direction of DNA synthesis during replication (from 5' to 3'), RNA synthesis during
transcription (from 5' to 3'), and the reading of mRNA sequence (from 5' to 3') during
translation. Related term 5' (5-prime) [Mouse Genome Informatics]
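Because sequences are written with the 5' end on the left, the partner strand of a written DNA sequence, when it too is written 5' to 3', is the complement of each base taken in reverse order. The following is a minimal Python sketch of that convention, assuming only the unambiguous letters A, C, G and T; the function name reverse_complement is illustrative.

```python
# Translation table pairing each base with its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the opposite strand of `dna`, also written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


# 5'-ATGC-3' pairs with 3'-TACG-5', which is written 5'-GCAT-3'.
print(reverse_complement("ATGC"))
```

Applying the function twice returns the original sequence, which is a quick sanity check on any implementation.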
Broader term UTR Related terms UTR; Gene
amplification & PCR primer extension.
5' (5-prime): The sequence at the 5' end of the messenger RNA
that does not code for product. This sequence contains the ribosome binding site
and other transcription and translation regulating sequences. [MeSH]
A term that identifies one end of a single-stranded nucleic acid molecule. The 5' end is that end of the molecule which terminates in a 5' phosphate group. The 5' direction is the direction toward the 5' end. Nucleic acid sequences are written with the 5' end to the left and the 3' end to the right, in reference to the direction of DNA synthesis during replication (from 5' to 3'), RNA synthesis during
transcription (from 5' to 3'), and the reading of mRNA sequence (from 5' to 3') during
translation. Related term 3' (3-prime). [Mouse Genome
Informatics]
5' UTR (five prime): Region at the 5' end of a mature transcript
(preceding the initiation codon) that is not translated into a protein. [DDBJ/
EMBL/ GenBank Feature Table] http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html
5' Untranslated Region: That portion of an mRNA from the 5' end to the position of the first codon used in translation.
Related term 3' UTR. [Mouse Genome Informatics] Related term 3' prime; Gene
amplification glossary primer extension Broader term UTR.
amino acid sequence: The order of amino acids as they occur
in a polypeptide chain. This is referred to as the primary structure of
proteins. It is of fundamental importance in determining protein conformation.
[MeSH]
artificial transcription factors: Regulated gene expression is
critical for cellular existence, and a disruption in the regulatory network can
result in disease or death. Therefore, a goal of primary importance in the
scientific community has been to discover methods of reprogramming gene
expression in diseased cells while leaving normal cells unaffected. Our
understanding of transcription, an early step in gene expression, has now
reached a sufficiently sophisticated level to allow us to tackle this challenge
from a chemical perspective. Dendritic and polymeric structures designed to
functionally mimic the protein participants in activation and repression of
transcription will be examined through in vitro assays and cell culture
experiments. Organic synthesis will play a critical role in this effort. By
varying the synthetic approaches to the artificial transcription factors, their
overall function as activators and/or repressors can be controlled and important
characteristics such as cell membrane permeability and tissue-type specificity
can be addressed. [Anna K. Mapp, Chemistry at the Univ. of Michigan, 2001] http://www.umich.edu/~michchem/faculty/mapp/
attenuator: In prokaryotes. 1) region of DNA at which regulation
of termination of transcription occurs, which controls the expression
of some bacterial operons; 2) sequence segment located between the
promoter and the first structural gene that causes partial termination
of transcription. [DDBJ/ EMBL/ GenBank Feature Table] http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html
CpG islands: Regions of DNA rich in CpG dinucleotides, also known as
CpG islands, are often located upstream of the transcription start site in both
tissue specific and housekeeping genes. Overall, CpG dinucleotides are
observed at a density of 25% of the expected level from base composition alone,
partially due to 5-methylcytosine decay (Bird, 1993). Since CpG dinucleotides
typically occur with low frequency, CpG islands can be distinguished
statistically in the genome. [Eric C. Rouchka et al. "Computational
Detection of CpG Islands in DNA" Sept. 1997 Washington Univ. St. Louis,
US] http://stateslab.bioinformatics.med.umich.edu/~ecr/PAPERS/WUCS-97-39.pdf
catalytic RNA: Ever since HHMI investigator Thomas Cech
at the University of Colorado in Boulder uncovered the catalytic properties of
RNA in 1982, researchers have been diligently studying these ribozymes.
Scientists have since discovered more than 500 ribozymes in a diverse range of
organisms and have found that they share many similarities with their more
widespread protein cousins, enzymes. [Howard Hughes Medical Institute
News, Oct. 9, 1998] http://www.hhmi.org/news/ribozyme.html Related term ribozymes
central dogma exceptions ("busters"): 1. Reverse transcriptase and RNA genomes. DNA is not the only molecule of
heredity in nature and, as David Baltimore and Howard Temin showed, the flow of
information from DNA to RNA is not the only pathway possible. 2. Catalytic RNAs
(ribozymes). Proteins are not the only structures capable
of catalyzing a reaction. Tom Cech demonstrated the catalytic nature of certain
classes of introns (intervening sequences) that are able to
"self-splice." In addition Harry Noller has shown that the synthesis
of the peptide bond during protein synthesis is catalyzed by the 23S rRNA of the
ribosome. 3. Heritable proteins. Stanley Prusiner has given us the novel name
"prion" (proteinaceous infectious particle) to describe the agent
responsible for a number of slow, neurological infectious diseases, including
scrapie, bovine spongiform encephalopathy (mad cow disease) and Creutzfeldt-Jakob
disease. [Martinez Hewlett, Molecular Biology 411, Univ. of Arizona, Tucson US] http://www.blc.arizona.edu/marty/411/Modules/mod4.html
cis-acting sequences: The sequences just 5' of the start site of transcription are
the most important for the initiation of transcription. This is where the
transcription complex is built. In general, this region is called the promoter.
For eukaryotes, several sequences seem to be conserved among many genes. One
such sequence is the TATA box. The sequence is located about 30 bases
upstream (-30) from the transcription start site and is the one sequence
required for any significant transcription to occur. Other sequences aid in
transcription but are not always part of the promoter. The two most often found are the CCAAT
box (called the CAT box) and the GC box. Because mutants of these
three sequences only express mRNAs at low levels, these are considered the most
important sequences of the basic transcription complex. [Phillip McClean,
"Control of gene expression in eukaryotes", North Dakota State Univ.
1997] http://www.ndsu.nodak.edu/instruct/mcclean/plsc431/geneexpress/eukaryex3.htm
cis-splicing: Splicing of messenger RNA precursors (pre-mRNAs)
is a requisite step in the generation of virtually all mature mRNAs. This
process requires the coordinated interaction of several small nuclear
ribonucleoprotein particles (snRNPs) and many protein factors that assemble to
form an enzymatic complex known as the spliceosome. Components of the
spliceosome recognize the exon- intron boundaries at the 5' and 3' splice sites,
excise the intron and ligate the adjoining exons. In most cases, splicing joins
a 5' splice site and a 3' splice site within the same pre-mRNA molecule, termed cis-splicing.
[Intronn, Inc. "Background cis- and trans-splicing" 1999] http://www.intronn.com/r&t/background.htm cis-trans: Gene
definitions enhancer: A cis-acting sequence that increases the utilization
of (some) eukaryotic promoters, and can function in either orientation
and in any location (upstream or downstream) relative to the promoter.
Eukaryotes and eukaryotic viruses. [DDBJ/ EMBL/ GenBank Feature Table] http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html Related
term promoter. exteins: Flanking protein fragments. [InBase, New England Biolabs, 1999] http://www.neb.com/inteins/int_id.htm
Related terms inteins, protein splicing. hnRNA heterogeneous nuclear RNA: RNA transcripts in the nucleus, representing
precursors and processing intermediates of rRNA, mRNA, and tRNA, as well as
mature RNA transcripts not yet transported into the cytoplasm. [http://newfish.mbl.edu/Course/Glossary/] inteins: A large in-frame insertion in a sequenced gene that is absent
in other sequenced homologs suggests that this gene may contain an intein. These
intervening sequences are often found by running any of the commonly available
sequence comparison programs such as Bestfit, Gap or Blast. Significant Blast
matches are often found to the extein protein AND one or more proteins
containing similar inteins. More sophisticated searches can be performed using
intein motifs (Pietrokovski 1994, Perler 1997 and Pietrokovski 1998A) or a
Hidden Markov Model (Dalgaard 1997 and Gorbalenya 1998). The presence of an
intein in a particular gene does not necessarily
mean that a homolog from a closely related species or strain will have the same
intein. ... Many inteins are bifunctional proteins with splicing and
endonuclease activity. [InBase, New England Biolabs, 1999] http://www.neb.com/inteins/int_id.htm Inteins are parts of proteins that cut themselves out of the whole protein
entirely of their own accord. This phenomenon has become known only in the past
few years, and it is perplexing because most major alterations to a protein
require a second protein, such as a protease, and other cofactors, such as
energy in the form of ATP. Self-splicing proteins, therefore, represent a
fundamentally new way of protein modification, says [Henry] Paulus, who works at
the Boston Biomedical Research Institute. [Harvard Medical School, Focus, Oct.
31, 1997] http://www.med.harvard.edu/publications/Focus/1997/Oct31_1997/biochem.html Internal protein sequences. Related terms exteins, protein splicing. intron splicing: LINEs Long Interspersed Nuclear Elements or Long INterspersed Elements: Families
of long (average length = 6,500 bp), moderately repetitive (about
10,000 copies) elements. LINEs are cDNA copies of functional genes present in the
same genome; also known as processed pseudogenes. [FAO Glossary] Related terms non-coding, retrotransposons. LTR Long Terminal Repeat: A sequence directly repeated at both
ends of a defined sequence, of the sort typically found in retroviruses. [DDBJ/
EMBL/ GenBank Feature Table] http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html non-coding first exons: Although conventional programs detect many
parts of genes with ease, they fail when it comes to detecting two important
elements: the very first pieces of genes, and the nearby "on" switches
of genes called promoters. Researchers in the bioinformatics
group at Cold Spring Harbor Laboratory have now developed a computer program
that is especially good at finding these first segments and "on"
switches of genes. "FirstEF is the first program that can readily and
accurately detect a class of gene segments that has previously been
extraordinarily difficult to find," says [Michael] Zhang. "It's like looking
for buried treasure." The gene segments Zhang is referring to occur at the
very beginning of genes, and are called "non-coding first exons." Because
they do not encode protein segments, non-coding first exons are undetectable
by conventional computer programs that rely on protein coding patterns found
in DNA. Instead, FirstEF recognizes five other DNA "signatures" that betray
the presence and location of first exons in genes. The biological basis of
some of these telltale genetic signatures is unknown ... One such signature
is the frequency with which two building blocks of DNA, C and G, occur next
to each other. [Cold
Spring Harbor Lab press release Nov. 28, 2001] http://www.cshl.org/public/releases/zhang112801.html Related terms exons, non-coding non-specific DNA: A new discovery about how cells regulate protein synthesis helps explain the complex interactions between proteins and DNA and may have far-reaching implications for future biotechnology research. In order to inhibit gene expression, proteins need to bind to specific DNA target sites, which are often located in stretches of
non-specific DNA. The mechanism for recognition and discrimination between
non-specific and specific sites has remained a mystery. Researchers at the Institute of Molecular Biology, University of Oregon, used a new imaging technique called
scanning force microscopy (SFM) to visualize DNA and protein complexes in the process of binding. SFM fills a need for quantitative analysis of DNA not possible with
x-ray crystallography. SFM provides a topographic image of a molecular surface by scanning a surface underneath a tip modified with an electron beam. Deflections sensed by the tip can be amplified and recorded, providing a quantitative topographic map of the surface. Previous studies have shown that recognition of a specific target site is often accompanied by DNA "bending." However, the significance of this bending has not been understood. SFM studies revealed crucial differences in DNA bending induced by protein binding to
non-specific and specific sites. [Sean Henahan "DNA bends to bind",
Access Excellence, Mar. 2001] http://www.accessexcellence.org/AB/BC/DNA_Bends_to_Bind.html precursor_RNA: Any RNA species that is not yet the mature RNA
product; may include 5' clipped region (5' clip), 5' untranslated
region (5' UTR), coding sequences (CDS, exon), intervening sequences (intron),
3' untranslated region (3' UTR), and 3' clipped region (3' clip). ... used
for RNA which may be the result of post-transcriptional processing. [DDBJ/ EMBL/
GenBank Feature Table] http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html primary (initial, unprocessed) transcript: Includes 5' clipped
region (5' clip), 5' untranslated region (5' UTR), coding sequences (CDS,
exon), intervening sequences (intron), 3' untranslated region (3' UTR),
and 3' clipped region (3' clip). [DDBJ/ EMBL/ GenBank Feature Table] http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html promiscuous DNA: The occurrence of identical base sequences in
more than one cellular compartment. Evidence for gene flow between organelles,
or organelles and the nucleus. [PJ Bottino Biology 222 Univ. Maryland Fall
1996] http://www.life.umd.edu/classroom/biol222/lect33-37.html promoter: Region on a DNA molecule involved in RNA polymerase
binding to initiate transcription. [DDBJ/ EMBL/ GenBank Feature Table]
http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html
Related terms cis- acting, enhancer, promoter regions promoter regions: The DNA region, usually upstream to the
coding sequence of a gene or operon, which binds and directs RNA polymerase
to the correct transcriptional start site and thus permits the initiation
of transcription. [IUPAC Biotech] DNA sequences which are recognized (directly or indirectly) and bound
by a DNA-dependent RNA polymerase during the initiation of transcription.
Highly conserved sequences within the promoter include the Pribnow box
in bacteria and the TATA BOX in eukaryotes. [MeSH] Related term enhancer. rRNA: Ribosomal RNA, RNA molecules which are essential structural
and functional components of ribosomes, the subcellular units responsible
for protein synthesis. [IUPAC Biotech] Mature ribosomal RNA; the RNA component of the ribonucleoprotein particle
(ribosome) which assembles amino acids into proteins. [DDBJ/ EMBL/ GenBank
Feature Table] http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html RNA polymerase RNAP: The movement of RNA polymerase (RNAP) along
DNA during transcription is a complex set of different activities,
including initiation, elongation, pausing, backtracking, and arrest. A complete
understanding of how this molecular machinery works requires characterization
of the individual activities, when and why they occur, what structural
components are required in each case, and what the biochemical parameters are.
Since ensemble measurements will give only averages across a mixture of
molecules engaged in a variety of these different behaviors, single molecule measurements
may be the only way to examine the characteristics of each type of behavior
independently. [NIGMS "Single Molecule Detection and Manipulation
Workshop" Single Molecule Fluorescence of Biomolecules and Complexes, Protein Folding, April 17-18, 2000] http://www.nigms.nih.gov/news/reports/single_molecules.html#examples reverse transcriptases: Gene
amplification & PCR ribosomal RNA: See rRNA. ribosomes: Cell
Biology ribozymes: Naturally occurring RNAs with enzymatic activity that specifically bind to and
cleave, and therefore inactivate, mRNA molecules. Like the antisense
approach, ribozymes provide a means of inhibiting a gene of interest for target
validation studies. [CHI Breaking Bottlenecks] Ribozymes can be engineered to bind naturally to any RNA sequence,
resulting in the cleavage and inactivation of mRNAs containing the target
sequence. [CHI Target Validation] Related term: catalytic RNA scRNA: Small cytoplasmic RNA; any one of several small cytoplasmic
RNA molecules present in the cytoplasm and (sometimes) nucleus of a eukaryote.
[DDBJ/ EMBL/ GenBank Feature Table] http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html SINEs Short Interspersed Nuclear Elements or Short INterspersed Elements: Short
interspersed nuclear elements. Families of short (150 to 300 bp),
moderately repetitive elements of eukaryotes, occurring about 100,000 times in a
genome. SINEs appear to be DNA copies of certain tRNA molecules, created
presumably by the unintended action of reverse transcriptase during
retroviral infection. [FAO Glossary] Related terms non-coding, retrotransposons. sequence data, molecular: Descriptions of specific amino
acid, carbohydrate or nucleotide sequences which have appeared in the published
literature and/or are deposited in and maintained by databanks such as GenBank,
EMBL, NBRF or other sequence repositories [databases] [MeSH] small cytoplasmic RNA: See scRNA. small nuclear RNA: See snRNA. snRNA: Small nuclear RNA; any one of many small RNA species confined
to the nucleus; several of the snRNAs are involved in splicing or other
RNA processing reactions. [DDBJ/ EMBL/ GenBank Feature Table] http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html splice sites: Boundaries between exons and introns. There are two
varieties: the border going from exon to intron is called a donor site
or a 5' site; the border separating intron
from exon is called an acceptor site or a 3'
site. [TP Speed, S. Cawley, "Locating splice sites"
Statistics 260 Statistics in Genetics, Univ. of California- Berkeley,
1998] http://www.stat.berkeley.edu/users/terry/Classes/s260.1998/Week12/week12/node14.html splice junctions: start codon: stop codon: sticky ends: The staggered ends of complementary sequences of
DNA which result from cleavage by restriction enzymes. [IUPAC Biotech] tRNA: See transfer RNA. template: Gene amplification
& PCR terminator: A sequence of DNA lying beyond the 3’ end of the
coding segment of a gene which is recognized by RNA polymerase as a signal
to stop synthesizing mRNA. [IUPAC Biotech] Sequence of DNA located at the end of the transcript that
causes RNA polymerase to terminate transcription [DDBJ/ EMBL/ GenBank
Feature Table] http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html transcription machinery: Consists
of the RNA polymerase II holoenzyme plus two additional “general transcription
factors,” which are protein complexes and a histone acetyltransferase that
theoretically exerts its transcriptional activity by modifying chromatin. ... [Several] studies provide evidence for
the function of components of the general transcription machinery, in terms
of their role in regulation of the transcription of specific sets of genes … Apparently, the specific
transcription regulatory activities of components of the general transcription
machinery provide a layer of regulation in addition to that provided by
the gene-specific regulators … Knowledge gained concerning the coordinate
regulation of genes, how gene-specific transcription factors (which are
the targets of many existing drugs, e.g., steroids, selective estrogen
receptor modulators, thiazolidinediones) interact with general transcription
factors, and how signal transduction pathways regulate gene transcription
is expected to be important for genomics-based identification of targets
that are components of transcriptional regulation and signal transduction
networks. [CHI Functional Genomics] transfer RNA: A single-stranded RNA molecule containing
about 70-90 nucleotides, folded by intrastrand base pairing into a characteristic
secondary (“cloverleaf”) structure that carries a specific amino acid and
matches it to its corresponding codon on an mRNA during protein synthesis. [IUPAC Biotech] Mature transfer RNA, a small RNA molecule (75 - 85 bases long) that mediates
the translation of a nucleic acid sequence into an amino acid sequence [DDBJ/
EMBL/ GenBank Feature Table] http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html trans-splicing: Splicing between two independently transcribed
pre-mRNAs is termed trans-splicing. This has been described in
trypanosomes, nematodes, flatworms, and plant mitochondria. In vitro trans-splicing
has been used as a model system to examine the mechanism of splicing. Trans-splicing
of pre-mRNAs in human cells has been postulated to account for some rare
events. [Intronn, Inc. "Background cis- and
trans-splicing" 1999] http://www.intronn.com/r&t/background.htm UTR: The parts of the messenger RNA sequence that do not code
for product, i.e. the 5' UNTRANSLATED REGIONS and 3' UNTRANSLATED REGIONS.
[MeSH] UnTranslated Region: Critical for many aspects of gene regulation
and expression. Narrower terms 3' UTR, 5' UTR. Related term intron |
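As a rough illustration of the UTR entry above, the sketch below splits a mature mRNA into its 5' UTR, coding sequence, and 3' UTR once the coding region's coordinates are known. The function name `split_utrs`, the 0-based end-exclusive coordinate convention, and the toy sequence are all assumptions made for this example; they are not taken from MeSH or from the DDBJ/ EMBL/ GenBank Feature Table.

```python
from typing import NamedTuple


class TranscriptRegions(NamedTuple):
    """The three parts of a mature mRNA named in the UTR entry."""
    five_prime_utr: str
    cds: str
    three_prime_utr: str


def split_utrs(mrna: str, cds_start: int, cds_end: int) -> TranscriptRegions:
    """Split an mRNA into 5' UTR, CDS and 3' UTR.

    Coordinates are 0-based and end-exclusive (an assumption of this
    sketch; sequence databases commonly use 1-based inclusive positions).
    """
    if not 0 <= cds_start <= cds_end <= len(mrna):
        raise ValueError("CDS coordinates fall outside the mRNA")
    return TranscriptRegions(
        five_prime_utr=mrna[:cds_start],      # everything before the coding region
        cds=mrna[cds_start:cds_end],          # start codon through stop codon
        three_prime_utr=mrna[cds_end:],       # everything after the stop codon
    )


# Toy mRNA invented for illustration: a 5-nt leader, a 9-nt coding
# region (AUG ... UAA), and a 4-nt trailer.
toy_mrna = "GGACU" + "AUGGCUUAA" + "CCGA"
regions = split_utrs(toy_mrna, cds_start=5, cds_end=14)
print(regions)
```

The two UTR fields together are "the parts of the messenger RNA sequence that do not code for product"; concatenating the three fields in order reproduces the input sequence.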