Databases directory
Comments? Questions? Revisions? mchitty@healthtech.com Last revised December 27, 2001
Related glossaries include Bioinformatics
(see definitions of databases and various narrower terms),
Chemoinformatics, Computers & computing, Informatics Overview.
Life Sciences Database Directories
DATABANKS, SRS, EBI, UK http://www.ebi.ac.uk/srs5cgi/wgetz?-fun+PageLibInfo+-info+DATABANKS
450+ databases, compiled nightly from public SRS servers worldwide.
DBcat: the public catalog of databases, INFOBIOGEN, France http://www.infobiogen.fr/services/dbcat/
Directory of 500+ DNA, RNA, Protein, Genomic, Mapping, Protein structure,
Literature and miscellaneous databases.
Introduction to Molecular Biology Databases, R. Apweiler, R. Lopez, B.
Marx, 1999 http://www.ebi.ac.uk/swissprot/Publications/mbd1.html
Covers bibliographic, taxonomy, nucleotide sequence, genetic, protein sequence
databases, PIR, SWISS-PROT, TrEMBL and specialised protein sequence, protein
databases, secondary protein databases and structure databases.
Molecular Biology Database Collection http://nar.oupjournals.org/
The first issue of each year of Nucleic Acids Research has
been a database issue since 1996.

This is not a comprehensive catalog of databases. Both public and
proprietary databases are included. Many proprietary databases make
special arrangements for academic users. Please consult individual
websites for details. The dividing line between databases, software and
integrated systems gets blurrier all the time.

Databases
2D PAGE databases index http://www.expasy.ch/ch2d/2d-index.html
3Dee Database of Protein Domain
Definitions, EBI, UK http://jura.ebi.ac.uk:8080/3Dee/help/help_intro.html
Structural
domain definitions for all protein chains in the Brookhaven Protein Databank
(PDB) that have 20 or more residues and are not theoretical models.
In addition, the domains have been clustered on sequence similarity and
structural similarity. The resulting families are stored as a hierarchy.
3DinSight, RIKEN, Japan http://www.rtc.riken.go.jp/jouhou/3dinsight/3DinSight.html
An integrated database and search tool for structure, property and function of biomolecules, which will help
researchers to get insight into their relationship. The structural data, functional data (motifs, mutations, protein-nucleic acid
binding, protein-ligand binding etc.) and property data (amino acid property and thermodynamic data of proteins) of
biomolecules are implemented into a relational database, so that flexible searches can be done by a combination of queries
(SQL). The relationships among structure, function and property are visualized by 3DinSight: e.g., real-time 3D images
displaying structures with automatically mapped functional sites; and graph plots integrating amino acid property vs.
sequence, structural and functional information.
ALFRED Allele Frequency Database,
Kidd Lab, Yale University, US http://alfred.med.yale.edu/alfred/index.asp
ASDB Alternative Splicing DataBase, National Energy Research
Scientific Computing Center NERSC, Lawrence Berkeley Lab, US http://devnull.lbl.gov:8888/alt/index.html
Version 2.1 of ASDB consists of two divisions, ASDB (proteins), which
contains amino acid sequences, and ASDB (nucleotides) with genomic sequences.
Alternative pre-mRNA splicing is an important mechanism for regulating gene
expression in higher eukaryotes. By recent estimates, the primary transcripts of
~30% of human genes are subject to alternative splicing, often regulated in
specific spatial/temporal patterns during normal development. Intended to be a
central, publicly accessible site for information about alternatively spliced genes,
their products and expression patterns. The current ASDB format was established
without explicit funding for this project and should be viewed as an early
prototype rather than a completed project.
ASTRAL compendium for sequence and
structure analysis, Stanford Univ., US http://astral.stanford.edu/
Provides
databases and tools useful for analyzing protein structures and their sequences.
It is partially derived from, and augments the SCOP: Structural Classification
of Proteins database. Most of the resources provided here depend upon the
coordinate files maintained and distributed by the Protein Data Bank.
AceDB http://www.acedb.org/
A genome database system developed primarily by Jean Thierry-Mieg (CNRS,
Montpellier, France) and Richard Durbin (Sanger Centre, UK). It provides a custom database
kernel, with a non-standard data model designed specifically for handling
scientific data flexibly, and a graphical user interface with many specific
displays and tools for genomic data. AceDB is used both for managing data within
genome projects, and for making genomic data available to the wider scientific
community. AceDB was originally developed for the C. elegans genome project,
from which its name was derived (A C. elegans DataBase). However, the tools in
it have been generalized to be much more flexible and the same software is now
used for many different genomic databases from bacteria to fungi to plants to
man. It is also increasingly used for databases with non-biological content.
allgenes.org, Univ. of
Pennsylvania, USA http://www.allgenes.org
Comprehensive gene index or gene catalog that includes
genes/transcripts predicted by two largely independent methods: 1. Genes
(transcripts) predicted by clustering and assembling EST sequences. The EST
clusters on allgenes.org are those in the latest release of the Database of
Transcribed Sequences (DoTS), which was developed by the Computational Biology
and Informatics Laboratory at the University of Pennsylvania. 2. Genes predicted
by running the ab initio gene finders GRAIL-EXP and GENSCAN on all available
human and mouse genomic sequence. This data comes from the Genome Channel, an
effort of the Computational Biosciences Section at Oak Ridge National
Laboratory.
Amino Acid Index AAI, GenomeNet, Japan
An amino acid index is a set of 20 numerical values representing any of the
different physicochemical and biological properties of amino acids. The AAindex1
section of the Amino Acid Index Database is a collection of published indices
together with the result of cluster analysis using the correlation coefficient
as the distance between two indices.
ArrayExpress, EBI, UK http://www.ebi.ac.uk/arrayexpress/
A public repository for microarray based gene expression data. Currently the EBI
is establishing a pilot database containing microarray gene expression data that
are available publicly.
Axeldb (A Xenopus laevis database), DKFZ (German Cancer
Research Center), Univ. Heidelberg, Germany http://www.dkfz-heidelberg.de/abt0135/axeldb.htm
A database focussing on gene expression in the frog Xenopus laevis. It is
the web companion to our paper describing a large-scale in situ hybridization
screening in Xenopus embryos. The goals of our "large-scale in situ
screen" project are to identify genes by the characterization of their
expression pattern, to partially sequence the corresponding cDNAs and to
maintain a database collecting the results.
BBID Biological Biochemical Image Database, National
Institute on Aging, NIH, US http://bbid.grc.nia.nih.gov/
A searchable database of images of putative biological pathways, macromolecular
structures, gene families, and cellular relationships. It is
of use to those who are working with large sets of genes or proteins using cDNA
arrays, functional genomics, or proteomics.
BIND Biomolecular Interaction
Network Database, Samuel Lunenfeld Research Institute, Canada http://bioinfo.mshri.on.ca/BIND/BIND_prop/index.html
A worldwide repository of every
biomolecular interaction forming the mechanisms of cellular communication,
differentiation and growth, from model organisms and from humans. With BIND,
computer simulations of whole-cell models of disease processes spanning medicine
to agriculture will be possible.
BIOSIS Biological Abstracts, Zoological Abstracts http://www.biosis.org/
Bibliographic index to biological literature.
BLOCKS, Fred Hutchinson Cancer
Research Center, US http://www.blocks.fhcrc.org
From PROSITE
BODYMAP, Osaka Univ., Japan
http://bodymap.ims.u-tokyo.ac.jp/
Expression
information of human and mouse genes (novel or known) in various tissues or cell
types. First generation map created by random sequencing of clones in 3’-directed
cDNA libraries.
BRITE Biomolecular Relations in
Information Transmission and Expression, GenomeNet, Japan
Cell cycle controlling pathways.
Berkeley Drosophila Genome Project BDGP, UC-Berkeley, US http://www.fruitfly.org/
Curated annotated informatics database from the
Berkeley and European Drosophila genome projects, with annotations from
the literature, comparative sequence analysis and the FlyBase research
community.
Biochemical Pathways, Boehringer
Mannheim GmbH, Germany http://biochem.boehringer-mannheim.com/prodinfo_fst.htm?/techserv/metmap.htm
A
digitized version of our Biochemical Pathway Chart is available on the ExPASy
Molecular Biology Server of the Geneva University Hospital and the University of
Geneva. An electronic index allows for the quick localization of any metabolite
or enzyme on the chart. In addition most enzyme names on the chart act as links
to the extensive ENZYME database.
BioExpress See GeneExpress
Biology WorkBench, San Diego Supercomputer Center, US http://workbench.sdsc.edu/
A revolutionary web-based tool for biologists. The WorkBench allows biologists
to search many popular protein and nucleic acid sequence databases. Database
searching is integrated with access to a wide variety of analysis and modeling
tools, all within a point and click interface that eliminates file format
compatibility problems.
BioMagRes, Univ. of
Wisconsin-Madison, US http://www.bmrb.wisc.edu/
Contains
NMR chemical shifts derived from proteins and peptides, reference data, amino
acid sequence information, and data describing the source of the protein and the
conditions used to study the protein. In constructing the database, proteins and
larger peptides have been given priority. Shift assignments for hemes,
cofactors, and substrates of a protein are also included, when they are reported
as part of a complex.
BioMedCentral (UK) http://www.biomedcentral.com
Publisher of journals covering all areas of biology and medicine. We provide free access
to peer-reviewed research articles and subscription-based access to reviews,
commentaries and other information services.
BioMedNet, Elsevier Science http://journals.bmn.com/journals
Medline and other databases, journals (full-text, may involve subscription fees),
HMS Beagle, news, conference reports, weblinks directory.
Biomolecule Interaction Growth and Expression Database
(BIGED), George Church Lab, Harvard Medical School, US http://twod.med.harvard.edu/ExpressDB/
In addition to
ExpressDB, we have been working on a Biomolecule Interaction Growth and
Expression Database (BIGED) that we conceive as a general, integrated database
for functional genomics. BIGED will manage more than just RNA expression data
and will maintain strain and condition information in structured, queriable
form. The Total Biomolecule Expression and Interaction Database (TBEID) was an
earlier version of BIGED that covered more kinds of experiments but had less
sophisticated indexing of biological entities.
cDNA relational databases,
NHGRI, US http://www.nhgri.nih.gov/DIR/LCG/15K/DATA/
The current release contains two FileMaker database templates (built under FileMaker
Pro 4.0 in a Mac environment) and two flat files to be imported into a FileMaker
database (Oct 2000).
CATH Protein Structure
Classification, University College, London, UK http://www.biochem.ucl.ac.uk/bsm/cath/
Hierarchical classification of protein domain structures.
CDD Conserved Domain Database,
NCBI, US http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml
Database and search service. Currently contains domains derived from two
popular collections, Smart and Pfam,
plus contributions from colleagues at NCBI. The source databases also provide
descriptions and links to citations. Since conserved domains correspond to
compact structural units, CDs contain links to 3D-structure via Cn3D
whenever possible.
CEPH Genotype Database, Centre
d'Etude du Polymorphisme Humain (CEPH), France http://www.cephb.fr/cephdb/
Genotypes for all genetic markers that have been typed in the CEPH
reference families for linkage mapping of the human chromosomes (Genomics
6: 575-577, 1990; Science, 265: 2049-2054, 1994).
CGAP Cancer Gene Anatomy Project,
NCBI, US http://www.ncbi.nlm.nih.gov/ncicgap/
An
interdisciplinary program established and administered by the National Cancer
Institute to generate the information and technological tools needed to decipher
the molecular anatomy of the cancer cell. CGAP is divided into five
complementary Initiatives, each with its own goals, informatics tools and
resources.
COG Clusters of Orthologous Groups
of Proteins, NCBI, US. http://www.ncbi.nlm.nih.gov/COG/
Delineated
by comparing protein sequences encoded in 21 complete genomes, representing 17
major phylogenetic lineages. Each COG consists of individual proteins or groups
of paralogs from at least 3 lineages and thus corresponds to an ancient
conserved domain.
CRISP Computer Retrieval of Information on Scientific Projects, Office
of Extramural Research, NIH, US http://crisp.cit.nih.gov/
A searchable database of federally funded biomedical research projects conducted
at universities, hospitals, and other research institutions. Includes SBIR
grants.
CSNDB Cell Signaling Networks Database,
National Institute of Health Sciences NIHS, Japan http://geo.nihs.go.jp/csndb/
A data- and knowledge-base for signaling pathways of human cells. It compiles the
information on biological molecules, sequences, structures, functions, and
biological reactions that transfer the cellular signals. Signaling pathways are
compiled as binary relationships of biomolecules and represented by graphs drawn
automatically.
Caenorhabditis elegans WWW Server, University of Texas Southwestern Medical Center at Dallas, US http://elegans.swmed.edu
C. elegans Gene Knockout Consortium, University of British Columbia, Canada http://elegans.bcgsc.bc.ca/knockout.shtml
Celera DataSets, Celera Genomics, US http://publication.celera.com/cds/login.cfm
Comprehensive information across multiple organisms and data sources. Celera has a curated reference catalog of genes, mRNAs, and proteins encoded in Celera's Human, Mouse and Drosophila genomes, and in the genomes of other organisms important to understanding human biochemistry, disease and genetics.
Celera Human Reference Genome, Celera Genomics, US http://publication.celera.com/cds/login.cfm
The reference standard on which genomic sequences, ESTs, proteins, mRNA, SNPs, syntenic regions, promoter regions, pathway data and other genomic-related data may be compared. Celera uses the Celera Human Reference Genome as its focal point for data organization and analysis.
Chemical Abstracts CA http://www.cas.org/
Bibliographic index to the chemical literature.
ChipDB, Richard Young, Whitehead Institute, MIT, US http://young39.wi.mit.edu/chipdb_public/
We are dissecting genome regulatory circuitry in yeast and human cells. The
transcriptional regulatory circuitry of yeast and human cells is being deduced
through the use of high density oligonucleotide arrays. We are exploring the
role of the transcription apparatus, chromatin and signaling pathways in
regulation of genome expression. (Transcription Initiation Apparatus, Genome-Wide
Expression)
Clone Registry, NCBI, US http://www.ncbi.nlm.nih.gov/genome/clone/
A database used by genome sequencing centers to record which clones have
been selected for sequencing, which are currently in the pipeline, and which
are finished and represented by sequence entries in GenBank. Some additional
information about human RPCI-11 clones has been obtained through several
whole-genome library characterization efforts.
Conserved Domain Database CDD See CDD
CrossRef http://www.crossref.org
Publishers International Linking Association: 77
publishers of over 4,780 journals.
DATABANKS, SRS, EBI, UK http://www.ebi.ac.uk/srs5cgi/wgetz?-fun+PageLibInfo+-info+DATABANKS
450+ databases, compiled nightly from public SRS servers worldwide.
DBcat: the public catalog of databases, INFOBIOGEN, France http://www.infobiogen.fr/services/dbcat/
Directory of 500+ DNA, RNA, Protein, Genomic, Mapping, Protein structure, Literature and
miscellaneous databases.
DBGET/LinkDB, GenomeNet,
Institute for Chemical Research, Kyoto University, Japan
Integrated database retrieval system, currently supports the following databases and gene
catalogs:
nucleic acid sequences: GenBank, EMBL
protein sequences: SWISS-PROT, PIR, PRF, PDB, STR
3D structures: PDB
sequence motifs: PROSITE, EPD, TRANSFAC
enzyme reactions: LIGAND
metabolic pathways: PATHWAY
amino acid mutations: PMD
amino acid indices: AAindex
genetic diseases: OMIM
literature: LITDB, Medline
gene catalogs: E. coli, H. influenzae, M. genitalium, M. pneumoniae, M. jannaschii, Synechocystis, S. cerevisiae
cross reference EMBL and GenBank
DDBJ DNA DataBank of Japan http://www.ddbj.nig.ac.jp/
Shares information daily with EMBL and GenBank.
DHMHD Dysmorphic Human and Mouse Homology Database, Mothercare Unit
of Clinical Genetics and Fetal Medicine, Institute of Child Health,
University of London, UK http://www.hgmp.mrc.ac.uk/DHMHD/dysmorph.html
This application consists of three separate databases of human and mouse
malformation syndromes together with a database of mouse/human syntenic
regions. The mouse and human malformation databases are linked together
through the chromosome synteny database. The purpose of the system is to
allow retrieval of syndromes according to detailed phenotypic descriptions
and to be able to carry out homology searches for the purpose of gene
mapping. Databases include the London Dysmorphology Database (LDDB), Mouse
malformation database, and Human Cytogenetic Aberrations.
DIP Database of Interacting Proteins,
UCLA/DOE, US http://dip.doe-mbi.ucla.edu/
Documents experimentally determined protein-protein interactions and interactive methods.
DOGS Database of Genome Sizes, Center for Biological Sequence
Analysis, Technical University of Denmark http://www.cbs.dtu.dk/databases/DOGS/index.html
A comprehensive list of (estimated) genome sizes for different organisms.
The ultimate goal is
to compile a list of all the known organisms and their respective genome
sizes. Both the completed and estimated genomes are listed. The estimated
genome sizes are given for both the organisms currently being sequenced and
those for which no sequencing programme is in progress.
DOTS Database of Transcribed Sequences,
Univ. of Pennsylvania, US. http://www.cbil.upenn.edu/DOTS*/dotsweb
The Center for Bioinformatics aims to provide consensus sequences for human and mouse
genes from GenBank and dbEST. Has been superseded by http://www.allgenes.org/
which combines data from DOTS and the Genome Channel (ORNL).
DSSP Definition of Secondary
Structures of Proteins, http://bioweb.pasteur.fr/seqanal/interfaces/dssp-simple.html
W. Kabsch and Chris Sander (1983) Biopolymers 22, 2577-2637.
Dali, EBI European
Bioinformatics Institute http://www.embl-ebi.ac.uk/dali/
The
Dali server is a network service for comparing protein structures in 3D. You
submit the coordinates of a query protein structure and Dali compares them
against those in the Protein Data Bank. A multiple alignment of structural
neighbours is mailed back to you. In favourable cases, comparing 3D structures
may reveal biologically interesting similarities that are not detectable by
comparing sequences. If you want to know the structural neighbours of a protein
already in the Protein Data Bank, you can find them in the FSSP database. Dali
and HSSP are derived databases organizing protein space in the structurally
known regions. The structure classification by Dali and the sequence families in
HSSP can be browsed jointly from a web interface providing a rich network of
links between domains and proteins and between structures and sequences. This
results in a database of explicit multiple alignments of protein families in the
twilight zone of sequence similarity.
Database of Macromolecular
Movements, Molecular Biophysics and Biochemistry, Yale Univ., US http://bioinfo.mbb.yale.edu/MolMovDB/
This describes the motions that occur in proteins and other macromolecules,
particularly using movies. Associated with it are a variety of free software
tools and servers for structural analysis. M Gerstein & WG Krebs (1998).
Nuc. Acid. Res. 26:4280-4290.
Database of Ribosomal Crosslinks,
Max Planck Institut, Berlin, Molekulare Genetik, Germany http://www.molgen.mpg.de/~ag_ribo/ag_brimacombe/drc/
To
interpret the molecular basis of the translational process, it is essential to
have a corresponding knowledge of the higher structure of the ribosome.
dbEST, NCBI http://www.ncbi.nlm.nih.gov/dbEST/index.html
Sequence data and other information on "single-pass" cDNA sequences or ESTs,
from a number of organisms, part of GenBank.
dbSNP, NCBI http://www.ncbi.nlm.nih.gov/SNP/
Uses a "looser variation" definition for SNPs (no requirement or assumption
about minimum allele frequencies or the polymorphisms…). Short deletion and
insertion polymorphisms, and microsatellite repeats, as well as SNPs are
included. Disease causing clinical mutations, as well as neutral polymorphisms,
are also in scope. [dbSNP FAQ]
dbSTS, NCBI http://www.bjmu.edu.cn/bi/ncbihtm/345ef2f6.htm
A
subset of GenBank, with sequence and mapping data on short genomic landmark
sequences (STSs). More comprehensive annotation than in GenBank and regularly
updated with BLAST.
Dead DNA: See under Mitomap
Decoys 'R' Us, Stanford Univ.,
US http://dd.stanford.edu/
Computer generated conformations of protein sequences that possess some characteristics
of native proteins, but are not biologically real. The primary use of decoys is
to test scoring, or energy, functions.
DeltaBase, Deltagen, US
http://www.deltagen.com/products/deltabase.html
A
library of functional information about mammalian gene families thought to be
relevant to small molecule drug discovery.
DiscoverEase, Genetics
Institute, US http://www.discoverease.com
Secreted
proteins, a protein development platform.
Drosophila melanogaster
genome, NCBI, US http://www.ncbi.nlm.nih.gov/PMGifs/Genomes/7227.html
The assembled and annotated genome sequence of the euchromatic arms of the five Drosophila
melanogaster (fruit fly) chromosomes is now available in GenBank. The
sequence, determined in a collaboration between Celera and the Berkeley
Drosophila Genome Project, is described in the March 24, 2000 issue of Science.
EGAD, TIGR, US http://www.tigr.org/tdb/egad/sequence/sequence_page.html
Extraction and curation of sequences from GenBank to create a non-redundant set of
transcript (HT and ET) sequences.
EID Exon-Intron database, Walter
Gilbert Lab, Harvard University, US http://golgi.harvard.edu/gilbert/eid/
An exhaustive database of protein-coding intron-containing genes. Derived from GenBank.
EMBASE Excerpta Medica http://www.embase.com/
Bibliographic index to biomedical and pharmacological literature.
EMBL (European Molecular Biology Laboratory) http://www.embl-heidelberg.de/
Main laboratory is in Heidelberg, Germany, with outstations in Hamburg, Germany, and Grenoble, France
(access to high powered instruments for structure studies) and Hinxton,
UK (bioinformatics). Supported by 14 European countries and Israel, shares data
daily with DDBJ and GenBank.
EPD Eukaryotic Promoter Database,
Bioinformatics Group, ISREC Swiss Institute for Experimental Cancer Research
http://www.epd.isb-sib.ch/
An annotated non-redundant collection of eukaryotic POL II promoters, for which the
transcription start site has been determined experimentally. Access to promoter
sequences is provided by pointers to positions in nucleotide sequence entries.
The annotation part of an entry includes description of the initiation site
mapping data, cross-references to other databases, and bibliographic references.
EPD is structured in a way that facilitates dynamic extraction of biologically
meaningful promoter subsets for comparative sequence analysis.
Entrez, NCBI http://www.ncbi.nlm.nih.gov/Entrez/
A retrieval system for searching several linked databases. It provides access to PubMed
(Medline), Nucleotide sequence database (GenBank), Protein
sequence database, Structure: three-dimensional
macromolecular structures, Genome: complete genome assemblies, PopSet:
population study data sets, Taxonomy: organisms in GenBank, OMIM:
Online Mendelian Inheritance in Man.
Entrez Genomes, NCBI, US
http://www.ncbi.nlm.nih.gov/Entrez/Genome/org.html
The
whole genomes of over 600 organisms can be found. The genomes
represent both completely sequenced organisms and those for which sequencing is
in progress. All three main domains of life - bacteria, archaea, and eukaryota -
are represented, as well as many viruses and mitochondria.
Entrez Nucleotides, NCBI, US
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Nucleotide
A collection of sequences from several sources, including GenBank, RefSeq, and PDB. The number of bases grows at an exponential rate.
Entrez Proteins, NCBI, US http://www.ncbi.nlm.nih.gov:80/entrez/query.fcgi?db=Protein
The protein entries in the Entrez search and retrieval system have been compiled from a variety of sources, including
SwissProt, PIR, PRF, PDB, and translations from annotated coding regions in
GenBank and RefSeq.
ENZYME, ExPASy,
Switzerland http://www.expasy.ch/enzyme/
Enzyme nomenclature database.
EpoDB Erythropoiesis database, CBIL (Computational
Biology & Informatics Lab), Univ. of Pennsylvania, US http://www.cbil.upenn.edu/EpoDB/index.html
A database of genes that relate to vertebrate red blood cells. It includes DNA
sequence, structural features, protein information, gene expression information
and transcription factor binding sites.
ExInt, National University of Singapore
http://intron.bic.nus.edu.sg/exint/exint.html
Exons-introns of eukaryotic organisms.
ExPASy (Expert Protein Analysis
System), Swiss Institute of Bioinformatics, Switzerland http://www.expasy.ch/
Proteomics server.
ExpressDB, George Church Lab,
Harvard Medical School, US http://arep.med.harvard.edu/ExpressDB/
A relational database for maintaining yeast RNA expression
data. It is intended as a demonstration of how such data can be managed, and
of the benefits such management confers. As of July, 1999, over 17.5 million
pieces of information have been loaded into ExpressDB deriving from 11
source studies. The EXD web query system allows data from multiple source
studies to be retrieved to user specifications and collated by ORF name. A
manuscript on ExpressDB, the data loaded into it, and how it may be
analyzed, has been submitted for publication.
FSSP/DALI Fold classification
based on Structure-Structure alignment of Proteins, EBI, UK http://www2.embl-ebi.ac.uk/dali/fssp/fssp.html
The FSSP database is based on exhaustive all-against-all 3D structure comparison
of protein structures currently in the Protein Data Bank (PDB). The
classification and alignments are automatically maintained and continuously
updated using the Dali search engine. Reference: L. Holm and C. Sander (1996)
Mapping the protein universe. Science 273:595-602.
FlyBase See Berkeley Drosophila Genome Project
FlyView, Univ. Muenster, Germany
http://pbio07.uni-muenster.de/
Image database on Drosophila development and genetics, especially expression
patterns of genes.
GDB Genome DataBase, Hospital
for Sick Children, Toronto, Canada http://www.gdb.org
Genomic maps, genes, YACs and amplimers. 12 mirror sites http://www.gdb.org/gdb/contact.html#nodes
GENOTK, Otsuka GEN Research
Institute, University of Tokyo, Japan
GGEG Global Gene Expression Database,
MD Anderson Cancer Center http://sciencepark.mdanderson.org/ggeg/default.html
Human
mRNA sequence data specific to the RAGE and SAGE techniques, general mRNA
information.
GOBASE Organelle Genome Database,
Univ. of Montreal, Canada http://megasun.bch.umontreal.ca/gobase/
A taxonomically broad organelle genome database that organizes and integrates diverse data related to organelles. The current version focuses on the mitochondrial subset of data. In its second phase, GOBASE will also include information on chloroplasts and representative bacteria that are thought to be specifically related to the bacterial ancestors of mitochondria and chloroplasts.
GOLD Genomes Online, Integrated
Genomics, Inc UIUC/Argonne http://igweb.integratedgenomics.com/GOLD/
Complete and ongoing genome projects information.
GSDB See Genome Sequence DataBase
GSS Genome Survey
Sequences, NCBI, US http://www.ncbi.nlm.nih.gov/dbGSS/
The GSS division of GenBank is similar to the EST division, except
that its sequences are genomic in origin, rather than cDNA (mRNA). The
GSS division contains (but is not limited to) the following types of data:
random "single pass read" genome survey sequences, cosmid/BAC/YAC
end sequences, exon trapped genomic sequences, Alu PCR sequences.
GXD: Gene Expression Database,
Jackson Laboratory, US http://www.informatics.jax.org/mgihome/GXD/gxdgen.shtml#concept
Gene expression data on the laboratory mouse.
GadFly Genome Annotation Database,
Berkeley Drosophila Genome Project http://www.fruitfly.org/annot/index.html
Genome annotations.
GenAtlas, France http://bisance.citi2.fr/GENATLAS/
Compiles the information relevant to the mapping efforts of the Human Genome
Project. This information is collected from original articles in the
literature or from the proceedings of Human Gene Mapping and Single
Chromosome Workshops. It is catalogued in three interactive directories:
GENATLAS/GEN, GENATLAS/LINK, GENATLAS/REF.
GenBank, NCBI, US http://www.ncbi.nlm.nih.gov/Genbank/
NIH genetic sequence database, annotated collection of all publicly available DNA
sequences. Mirrored at EMBL and DDBJ. Currently estimated (early 2000) that over 2
million bases are deposited here each day. This growth will only accelerate in
the future. Begun in the 1980s by DOE. Cross reference DDBJ and EMBL. See
also Sequencing Glossary.
GenCarta™, Compugen, US http://www.cgen.com/products/gencarta.htm
A comprehensive database of genes and transcripts, based on Compugen's analysis of public domain genomic and expressed data using LEADS, Compugen's proprietary discovery
platform.
GeneCards, Weizmann Institute,
Israel http://bioinfo.weizmann.ac.il/cards/
Numerous
mirrored sites, database of human genes, their products and their involvement in
diseases. It offers concise information about the functions of all human genes
that have an approved symbol, as well as selected others.
Gene Census system, Yale
University, US http://bioinfo.mbb.yale.edu/genome/
Comprehensive
statistical accounting of protein structural features in genomes and sequence
databanks.
GeneExpress™, Gene Logic, US
http://www.genelogic.com/gexpress.htm
Databases containing gene expression profiles from tens of
thousands of human tissue samples, animal models, and cell and tissue cultures.
This information allows subscribers to: identify the genes and physiological
pathways associated with diseases; prioritize new drug targets for screening;
and assess the potential toxicity of new therapeutic compounds. The BioExpress
database contains gene expression measurements from a wide range of human tissue
samples in normal, diseased, and treated conditions, as well as extensive
clinical information on all of the tissue donors. We are also adding expression
measurements from selected rat and mouse tissues. The ToxExpress
predictive toxicology database consists of gene expression measurements from rat
and human primary cells and rat tissues exposed to compounds known to be toxic. ToxExpress subscribers will be able to compare these reference expression
profiles to results from expression studies performed using putative lead
compounds. This comparison will enable subscribers to determine the toxic
potential of their drug leads and select those with the best profiles for
further development. Gene Expression Omnibus (GEO), NCBI, US http://www.ncbi.nlm.nih.gov/geo/
In order to support the public use and dissemination of gene expression data,
NCBI has launched the Gene Expression Omnibus. GEO is our [NCBI’s] effort to
build a gene expression data repository and online resource for the retrieval of
gene expression data from any organism or artificial source. Many types of gene
expression data from platform types such as spotted microarray (microarray),
high-density oligonucleotide array (HDA), hybridization filter (filter) and
serial analysis of gene expression (SAGE) data, will be accepted, accessioned,
and archived as a public data set. A series of precomputed definitions and
descriptions of the data, as well as online tools for the interactive retrieval
and analysis of this expression data will follow shortly thereafter. Gene Map of the Human Genome,
International RH Mapping Consortium http://www.ncbi.nlm.nih.gov/genemap99/
Includes
locations of more than 30,000 genes and provides an early glimpse of some of the
most important pieces of the genome. Gene Scape, CuraGen, US http://portal.curagen.com/
Portal for CuraGen's proprietary software and databases, including GeneCalling,
PathCalling, SeqCalling and SNPCalling. Genetic Annotation Index (GAI)
identifies and characterizes the polymorphisms associated with cancer. GeneX Project, National Center
for Genome Resources (NCGR) http://www.ncgr.org/genex/
An
integrated database system, incorporating tools for data mining and analyzing
data. Genline http://gizmo.lbl.gov/jopmDemo/Genline.html
Lawrence Berkeley Lab, US. GenLink http://www.genlink.wustl.edu/
Washington Univ. St. Louis, US. A multimedia database resource for human
genetics and telomere research. Genome Analysis Pipeline: see
In-depth Genomics Glossary. Genome Catalog, ORNL, US: see Genome
Analysis Pipeline in the In-depth Genomics Glossary. Genome Channel, Oak Ridge
National Laboratory, US http://compbio.ornl.gov/channel/
Search by organism (including human) or chromosome. Genome MOT Genome Monitoring Table,
EBI, UK http://www.ebi.ac.uk/~sterk/genome-MOT/index.html
Status
of a number of large genome sequencing projects. Genome Sequence DataBase GSDB,
National Center for Genome Resources http://gizmo.lbl.gov/DM_TOOLS/OPM/OPM_QS/node14.html#SECTION00062000000000000000
Archival database of genome
sequence data. Genotypes DB, Washington Univ.
St. Louis, US http://www.genlink.wustl.edu/gtypes/index.html Makes
all genotypic data used in the construction of linkage
maps presented in GenLink easily accessible through the WWW. German Human cDNA Project,
Munich Information Center for Protein Sequences, Germany http://www.mips.biochem.mpg.de/proj/cDNA/
Consortium
of DNA sequencing labs, aiming to characterize a large set of cDNA clones not
yet identified in other projects, and to contribute a significant part towards
the systematic identification and functional characterization of human genome
genes. HGVbase Human Genome Variation
Database, Karolinska Institute, Sweden; Interactiva GmbH http://hgvbase.cgb.ki.se/
A central depository for mutation collection efforts undertaken in
alliance with the Human Genome Variation Society (HGVS). An attempt
to summarize all known sequence variations in the human genome, to
facilitate research into how genotypes affect common diseases, drug
responses, and other complex phenotypes. Sequence variations are presented with details of
how they are physically and functionally related to the closest neighbouring
gene. Records include SNPs, indels, simple tandem repeats, and other sequence alternatives, regardless
of location, allele frequencies, or known effect upon phenotype. All records
are highly curated and annotated, ensuring maximal utility and data accuracy. Formerly HGBASE, Human Genic Bi-Allelic
Sequence Database. HGMD Human Gene Mutation Database,
Univ. of Wales College of Medicine, UK http://uwcm.web.cf.ac.uk/uwcm/mg/hgmd0.html HOMOLOGENE, NCBI, US
http://www.ncbi.nlm.nih.gov/HomoloGene/
A homology resource which includes both curated and
calculated orthologs and homologs for genes represented in UniGene
and LocusLink for human, mouse, rat, and zebrafish. The curated
orthologs include ortholog gene pairs reported in the Mouse Genome
Database (MGD) at the Jackson Laboratory, the Zebrafish Information
Network (ZFIN) database at the University of Oregon, and in published reports. The
calculated orthologs and homologs are the result of nucleotide sequence
comparisons between all UniGene clusters for each pair of organisms. These
orthologs and homologs are considered putative since they are based only on
sequence comparisons. HOVERGEN Homologous Vertebrate Genes Database, PBIL (Pôle Bio-Informatique
Lyonnais), Univ. Lyons, France http://biom3.univ-lyon1.fr/databases/hovergen.html
A database of homologous vertebrate genes, structured under the ACNUC sequence
database management system. It allows one to select sets of homologous genes
among vertebrate species, and to visualize multiple alignments and
phylogenetic trees. Thus HOVERGEN is particularly useful for comparative
sequence analysis, phylogeny and molecular evolution studies. More generally,
HOVERGEN gives an overall view of what is known about a particular
gene family. The database itself contains all vertebrate
sequences from GenBank (except ESTs), with some data corrected, clarified or
completed (notably to address the problem of redundancy). Homologous coding
sequences have been classified in gene families and protein multiple
alignments and phylogenetic trees have been computed for each family.
Sequences and related information have been structured in an ACNUC database.
The database is updated every four months. HTGS High Throughput Genomic Sequences, NCBI, US http://www.ncbi.nlm.nih.gov/HTGS/
Created to accommodate a growing need to make 'unfinished' genomic sequence
data rapidly available to the scientific community. It was done in a
coordinated effort between the three International Nucleotide Sequence
databases: DDBJ, EMBL, and
GenBank. The HTG division contains 'unfinished' DNA sequences generated
by the high-throughput sequencing centers. Sequence data in this division
are available for BLAST homology searches against either the "htgs"
database or the "month" database, which includes all new
submissions for the prior month. The HTG division of GenBank was
described in an article by Ouellette and Boguski, Genome Research
(1997) 7(10). Human BAC Ends, TIGR, US http://www.tigr.org/tdb/humgen/bac_end_search/bac_end_intro.html
Sequences from the ends of bacterial artificial chromosome (BAC) clones provide highly specific markers. A whole genome
sequencing approach has been described in a map-as-you-go strategy. The complete sequence of a seed BAC is searched
against a BAC end database and the minimally overlapping clones in each direction are selected for sequencing. As coverage
increases, BAC end sequences provide samples for whole genome survey. Approximately 743,000 end sequences from 470,000 clones (20X clone coverage and 12% sequence coverage) have been generated by
TIGR, Univ. of Washington and CalTech, providing a sequence marker every 5 kb across the genome. Human Mouse Homology Map, NCBI, US http://www.ncbi.nlm.nih.gov/Homology/
Map is now being computed by integrating orthologs curated by the Mouse
Genome Database with putative orthologs identified by sequence homology.
This version of the Human-Mouse Homology map also differs from the previous
Davis map by including several new features: reporting representative STS
associated with the loci in the map and linked to the dbSTS pages, linking
human cytogenetic locations to NCBI's MapViewer, providing alignments of
representative sequences via BLAST2, and linking gene symbols to LocusLink. HSSP Homology-derived Secondary
Structure of Proteins, EMBL, Germany http://www.sander.embl-heidelberg.de/hssp/
A database of homology-derived secondary structure of proteins (HSSP), built by aligning
to each protein of known structure all sequences deemed homologous on the basis
of the threshold curve. For each known protein structure, the derived database
contains the aligned sequences, secondary structure, sequence variability and
sequence profile. Tertiary structures of the aligned sequences are implied, but
not modelled explicitly. HUGE Human Unidentified Gene-Encoded Large Proteins, Kazusa DNA
Research Institute, Japan http://www.kazusa.or.jp/huge/
The HUGE protein database has been created to publicize the fruits of our Human
cDNA project at the Kazusa DNA Research Institute. In this project, we plan
to sequence and analyze long (>4 kb) human cDNAs and to use the sequence data
to establish methods for predicting the primary structure of proteins
with various biological activities. Currently, we focus on the analysis of
cDNA clones encoding particularly large proteins (>50 kDa). The basic
concept underlying our project and the strategies employed have been
described elsewhere (Ohara et al., 1997). Our HUGE protein
database contains various types of information derived from the predicted
primary structure data of newly identified human proteins. HuGE Human Gene Expression Index,
Brigham & Women’s Hospital, US http://www.hugeindex.org/
A comprehensive database for understanding the expression of human genes in normal
human tissues. Currently, RNA expression of more than 6000 genes is measured
using high-density oligonucleotide array technology. HUGO Mutation Database Initiative, Human Genome Organisation,
Univ. of Melbourne, Australia http://ariel.ucs.unimelb.edu.au/~cotton/dblist.htm
Links to locus-specific mutation databases, central and general mutation
databases, national and ethnic mutation databases, complex disease
databases, clinical and patient aspects, non human mutations, artificial
mutations and other related databases. Highwire Press, Stanford Univ., US http://highwire.org
Free (and fee-based) full-text science journals. Human Gene Index, TIGR, US http://www.tigr.org/tdb/hgi/index.html
Human EST sequences from TIGR
and GenBank. Human Genome Sequencing
(finished, draft, other statistics, progress reports and access to data) http://www.ncbi.nlm.nih.gov/genome/seq/page.cgi?F=HsHome.html&ORG=Hs Human Protein Index™, Large Scale Biology Corp., US http://www.lsbc.com/wt/tert.php?page_name=databases Human SNP Database, Whitehead
Institute, US http://www-genome.wi.mit.edu/SNP/human/index.html IMAGE Consortium: Integrated
Molecular Analysis of Genomes and their Expression, Lawrence Livermore National Lab, US
http://image.llnl.gov/ Shares high quality arrayed cDNA
libraries and places sequence, map and expression data on the clones in these
arrays into the public domain. Human and mouse genomes are first to be studied.
They anticipate arraying (and sharing) cDNA libraries from other species in
time. IMGT, the international ImMunoGeneTics database http://imgt.cines.fr:8104/textes/IMGTScientificChart/3/IMGTnomenclature.html
IMGT gene name nomenclature for immunoglobulins (IG), T cell receptors (TR) and Major Histocompatibility Complex (MHC) molecules from all vertebrate species. INTERACT, Manchester Bioinformatics, Univ. of Manchester, UK
http://www.bioinf.man.ac.uk/resources/interact.shtml
Object-oriented database for protein-protein interactions. IXDB Integrated Chromosome X DataBase, Max Planck Institut,
Berlin, Germany http://www.molgen.mpg.de/~xteam/
The purpose of IXDB is to provide an integrated view of the X chromosome
mapping field. Ultimately this will allow the construction of an integrated
map that will take into account all the data generated by the community,
including physical, genetic, transcript and sequence information. This
implies acquiring, understanding and formatting an enormous amount of
experimental results and can only be accomplished progressively. We have
chosen to start the integration process with YAC maps generated by the
community. These provide the basis for future higher resolution physical
maps, as well as emerging transcript and sequence maps. The current content
of IXDB therefore reflects this situation, with the emphasis placed on
YAC mapping data. Due to their immediate value, IXDB has also started to
systematically include bacterial clone contig maps and EST data. Currently
IXDB does not store sequence data, although links to nucleic sequence
databases are provided. Induced Mutant Resource IMR, Jackson Laboratory,
US http://www.jax.org/resources/documents/imr/
Transgenic and targeted mutant mice; a national
clearinghouse for the collection and distribution of genetically engineered
mice. InBase: Intein Database, New England Biolabs, US http://www.neb.com/inteins/intein_intro.html Interactive Fly, Purdue Univ.,
US http://sdb.bio.purdue.edu/fly/aimain/1aahome.htm
A cyberspace guide to Drosophila genes and their roles in
development, including pathways. International Nucleotide Database: Composed of DDBJ, EMBL
and GenBank. Often, but inaccurately, referred to as GenBank. InterPro, EBI, UK
http://www.ebi.ac.uk/interpro/ Release 3.2 (July 2001) was
built from Pfam 6.2, PRINTS 30.0, PROSITE 16.37, ProDom 2001.1, SMART 3.1 and the
current SWISS-PROT + TrEMBL
data. This release of InterPro contains 3939 entries, representing 1009
domains, 2850 families, 65 repeats and 15 post-translational modification
sites. InterPro is a useful resource for whole genome analysis and has
already been used for the proteome analysis of a number of completely
sequenced organisms. A preliminary proteome analysis was also
produced for the human genome. KEGG Pathway Database, DBGET
Links to pathway and other databases (metabolic and regulatory). Kabat Database of Sequences of
Proteins of Immunological Interest, Northwestern Univ. US http://immuno.bme.nwu.edu/ KeyNet, Consiglio Nazionale delle Ricerche, Italy http://www.ba.cnr.it/keynet.htm
A database of keywords extracted from the EMBL and GenBank databases.
The KeyNet structure is based on biological criteria intended to assist the user in
data searching and to minimize the risk of loss of information. Klotho, Washington Univ. US
http://www.ibc.wustl.edu/klotho/
An attempt to model biological processes, beginning with
biochemistry. We call the whole project Moirai, after the three Fates of
antiquity, since fundamentally these are questions about the fates of
molecules and cells. LIGAND database, Institute for
Chemical Research, Kyoto Univ. Japan
Enzymes,
compounds and reactions. LPFC Library of Protein Family Cores,
Stanford Univ. US http://smi-web.stanford.edu/projects/helix/LPFC/
We have taken structural alignments of
protein families and computed average core structures for each family. The core
structures can be divided into residues with low spatial variation and those
with high spatial variation. Amino acids with low spatial variance occupy
essentially the same relative position in all family members. This library is
useful for building models, threading, and exploratory analysis. It is also a
useful mechanism for summarizing variability in NMR structures. LifeSeq, Incyte Genomics, US
http://www.incyte.com/ LifeSeq Public, Incyte Genomics,
US http://www.incyte.com/aug0100/lspublic.html
1.4 million public sequences obtained
from human tissue sources and human cell lines, processed through Incyte's
proprietary automated bioanalysis system. Access to 90,000 verified clone
reagents. LocusLink, NCBI, US http://www.ncbi.nlm.nih.gov/LocusLink/
A single query interface to curated
sequence and descriptive information about genetic loci. It presents information
on official nomenclature, aliases, sequence accessions, phenotypes, EC numbers,
MIM numbers, UniGene clusters, homology, map locations, and related web sites. MAGEST, GenomeNet, Japan
Expression patterns and sequence tags
for maternal mRNAs of the ascidian egg, Halocynthia roretzi. MAGPIE Multipurpose Automated Genome
Project Investigation Environment Genome Sequencing Projects (completed and
in progress) http://www-fp.mcs.anl.gov/~gaasterland/genomes.html MGD See Mouse Genome Database MIPS Munich Information Center for
Protein Sequences, Germany http://www.mips.biochem.mpg.de/
We are a bioinformatics group of the GSF (National Research Center for Environment
and Health) at the Max-Planck-Institut für Biochemie. MIPS
is a member of PIR-International (Protein Information Resource) and of
EMBnet (European Molecular Biology Network). MIRAGE (Molecular Informatics
Resource for the Analysis of Gene Expression), Institute for Transcriptional
Informatics, Pittsburgh PA, US http://www.isbi.net
Experimental web resource dedicated to
the study of gene expression. MITOMAP, Emory Univ., US
http://www.gen.emory.edu/mitomap.html
A
human mitochondrial genome database. A compendium of polymorphisms and mutations
of the human mitochondrial DNA. Coming soon: mtDNA mutations from ancient
remains will be included in a separate 'Dead DNA' section. (Currently only DNA
data from contemporary humans is included in the Mitomap database.) MKMD Mouse Knockout and Mutation
Database, BioMedNet, Current Biology http://research.bmn.com/mkmd
Phenotypic
information related to knockout and classical mutations in mice. It includes
extensive links to MEDLINE on BioMedNet. MKMD was originally created from tables
published over 3 issues of Current Biology (Brandon E.P., Idzerda R.L., McKnight
G.S., Current Biology (1995) 5: 569-694; 627-634; 873-881). The database has
been expanded to include gene insertion mutations and classical mutants whose
molecular nature has been identified. MMDB Molecular Modeling DataBase,
NCBI, US http://www.ncbi.nlm.nih.gov/Entrez/structure.html
A database of macromolecular 3D structures (as well as tools for their
visualization and comparative analysis). Contains experimentally determined
biopolymer structures obtained from the Protein Data Bank (PDB). Structures can
be anything from short oligonucleotides or peptides to very large macromolecular
complexes containing dozens of individual molecules. MODBASE, Rockefeller Univ. US http://pipe.rockefeller.edu/modbase/
Comparative protein structure models. MOT SEE Genome MOT Mammalian Gene Collection, NCBI,
US http://mgc.nci.nih.gov/ The
goal of the Mammalian Gene Collection (MGC) is to provide a complete set of
full-length (open reading frame) sequences and cDNA clones of expressed genes
for human and mouse. The MGC is an NIH initiative that supports the
production of cDNA libraries, clones and sequences. Medline See PubMed Mitelman DataBase of Chromosome Aberrations in Cancer, CGAP, NCI, US
http://cgap.nci.nih.gov/Chromosomes/Mitelman
Relates chromosomal aberrations to tumor characteristics, based either on
individual cases or associations. All the data have been manually culled from
the literature by Felix Mitelman, Bertil Johansson, and Fredrik Mertens. Molecular Anatomy and Pathology Database™,
Large Scale Biology Corp., US http://www.lsbc.com/wt/tert.php?page_name=databases Molecular Effects of Drugs Database™,
Large Scale Biology Corp., US http://www.lsbc.com/wt/tert.php?page_name=databases Molecular Probe Data Base MPDB,
Advanced Biotechnology Center of Genoa, Italy http://www.biotech.ist.unige.it/interlab/mpdb.html
Information on about 4,300 synthetic
oligonucleotides with a sequence of up to 100 nucleotides. Data are mainly taken
from the literature and are encoded on the basis of controlled vocabularies. Mouse Atlas and Gene Expression Database, Human Genetics Unit, MRC
Medical Research Council, Edinburgh, UK http://genex.hgu.mrc.ac.uk/
Not yet available as of 11/2/00. A digital atlas of mouse development and a database
intended as a resource for spatially mapped data such as in situ gene expression
and cell lineage. The project is in collaboration with the Department of
Anatomy, University of Edinburgh. The gene expression database is being
developed as part of the Mouse Gene Expression Information Resource (MGEIR)
in collaboration with the Jackson Laboratory, USA. Mouse Gene Expression Information Resource (MGEIR) http://genex.hgu.mrc.ac.uk/MouseGeneExpInfoRes/
The gene-expression resource is a collaborative project to produce a single
gene-expression resource database for the research community. This resource
will be directly linked to the Mouse Genome Database at the Jackson
Laboratory. Database design and development is centered at the MRC Human
Genetics Unit and the Jackson Laboratory, with biological and technical
support from the Department of Anatomy, the ESF Embryonic Databases Network
and other collaborating sites. For further details see Ringwald et al.,
Science 265 (30 Sept. 1994): 2033-4. Mouse Genome Database MGD http://www.informatics.jax.org/mgihome/MGD/aboutMGD.shtml
Jackson Lab, US Mouse Genome Informatics,
Jackson Laboratory, US http://www.informatics.jax.org/mgihome/
Provides integrated access to data on the genetics, genomics and biology
of the laboratory mouse. The projects contributing to this resource are: Mouse
Genome Database (MGD), Gene Expression Database (GXD) and Mouse Genome Sequence
(MGS). Mouse Genome Sequence (MGS), Jackson Lab, US http://www.informatics.jax.org/mgihome/MGS/mgs.shtml
The overall goal of the Mouse Genome Sequence (MGS) project is to integrate
emerging mouse genomic sequence data with the genetic and biological data
available in MGD and GXD. MGS is part of the informatics infrastructure needed
to support mouse-human comparative genomics. Mouse Phenome Database, Jackson Lab, US http://aretha.jax.org/pub-cgi/phenome/mpdcgi?rtn=docs/home
A collection of baseline phenotypic data on commonly used and genetically
diverse inbred mouse strains through a coordinated international effort. NIST ATP Funded Projects, National Institute of Standards and
Technology, US http://jazz.nist.gov/atpcf/prjbriefs/listmaker.cfm Nucleic Acids Database NDB, Rutgers Univ., US http://ndbserver.rutgers.edu/NDB/index.html
Assembles and distributes structural information about nucleic acids. See
also Protein Data Bank PDB. OMIA Online Mendelian Inheritance in
Animals, Univ. of Sydney, Australia http://www.angis.su.oz.au/Databases/BIRX/omia/omia_form.html
A database of the genes and phenes*
that have been documented in a wide range of animal species other than those for
which databases already exist (human, rat and mouse). It is modelled on, and is
complementary to, McKusick's Mendelian Inheritance in Man (MIM). * A phene is a word or words that
identify a familial trait. For single-locus traits, the word(s) correspond to
one of the phenotypes that arise from segregation at that locus. For example,
CITRULLINAEMIA is the phene for the ARGININOSUCCINATE SYNTHETASE locus; and
FECUNDITY, BOOROOLA is the phene for a locus that has not yet been identified at
the biochemical/molecular level. OMIA also includes multifactorial traits and
disorders. Thus, for example, HIP DYSPLASIA is a phene. OMIM, Online Mendelian Inheritance
in Man, NCBI, US http://www.ncbi.nlm.nih.gov/Omim/searchomim.html
Gene maps (cytogenetic locations of
genes described in OMIM) and morbid maps (alphabetical list of diseases
described in OMIM and their corresponding cytogenetic locations). [from the OMIM
FAQ] OMIM Locus Specific Mutation Databases, NCBI, US http://www.ncbi.nlm.nih.gov/Omim/Index/mutation.html
Links to a number of locus-specific mutation databases. OSU Human Genome Database, LabBook, Inc. US http://www.labbook.com/products/query.asp
Integrates multiple independent gene and EST collections including Ensembl and
LabBook’s own proprietary assemblage of UniGene ESTs aligned to the
draft human genome sequence. Features a gene index based on
assembled and mapped transcripts rather than gene prediction. OWL, Division of Biomedical
Information Sciences, Johns Hopkins Medical Institution http://www.bis.med.jhmi.edu/Dan/proteins/owl.html
A non-redundant protein sequence database produced from sources including SWISS-PROT, PIR and GenBank. OmniBank, Lexicon Genetics, US
http://www.lexgen.com/omnibank/omnibank.htm
A
library of tens of thousands of genetically modified mouse clones. Each OmniBank
mouse clone contains a gene trap event in a single gene that may be used to
identify the function of genes and their importance to the therapy of human
diseases such as cancer, diabetes, and heart disease. ooTFD object oriented Transcription factors and gene
expression, Institute for Transcriptional Informatics IFTI, US http://www.ifti.org/cgi-bin/ifti/ootfd.pl
A successor to TFD (Transcription Factors Database), now referred to as rTFD
(relational Transcription Factors Database). ooTFD has been implemented in a
number of object-oriented database management systems, including ROL (Rule-based
Object Language), MOOD (Materials object-oriented database), and the pure Java
object database ozone. PDB Protein Data Bank, Research
Collaboratory for Structural Bioinformatics http://www.rcsb.org/
3D macromolecular structural data.
Incorporates NDB Nucleic Acid Database Project, Rutgers. PEDB Prostate ESTs, Fred Hutchinson
Cancer Research Center, US http://www.pedb.org/
A curated relational database and suite of analysis tools designed for the
study of prostate gene expression in normal and disease states. Expressed
Sequence Tags (ESTs) and full-length cDNA sequences derived from more than
40 human prostate cDNA libraries are maintained and represent a wide
spectrum of normal and pathological conditions. PIR Protein Information Resource,
NBRF, Georgetown Univ. Medical Center, US http://www-nbrf.georgetown.edu/pirwww/pirhome.shtml
The Protein Information Resource (PIR),
in collaboration with the Munich Information Center for Protein Sequences (MIPS)
and the Japan International Protein Information Database (JIPID), maintains the
PIR-International Protein Sequence Database, a comprehensive, annotated, and
non-redundant protein sequence database in which entries are classified into
family groups and alignments of each group are available. PIR-NRL3D http://pir.georgetown.edu/pirwww/dbinfo/nrl3d.html
Sequence-Structure Database, produced by PIR-International from sequence and
annotation information extracted from three-dimensional structures in the Protein Data Bank (PDB). The PIR-NRL3D database makes the sequence
information in PDB available for similarity searches and retrieval and provides
cross-reference information for use with the other PIR Protein Sequence
Databases. PMD Protein Mutant DataBase, National Institute of Genetics,
Japan PRONET, Doubletwist, US http://pronet.doubletwist.com/
Protein interactions on the web. As part of Myriad Genetics' effort to
understand protein interactions on a global scale, we have begun to curate
the published literature for protein interaction information. In addition to
our curation effort, Myriad has also developed a high-throughput version
of the yeast two-hybrid system to identify protein-protein interactions. Proteome Analysis, European Bioinformatics Institute http://www.ebi.ac.uk/proteome/
Set up to provide comprehensive statistical and comparative analyses of the predicted proteomes of fully sequenced organisms. The analysis is compiled using
InterPro, CluSTr and GO [Gene Ontology], and is performed on the
non-redundant complete proteome sets of SWISS-PROT and TrEMBL entries. PROSITE, Swiss Institute of
Bioinformatics http://www.expasy.ch/prosite/
A database of protein families and
domains. It consists of biologically significant sites, patterns and profiles
that help to reliably identify to which known protein family (if any) a new
sequence belongs. PUMA, Phylogeny of Unicellular
organisms Metabolic pathways Alignments: see WIT, which supersedes
PUMA. PathDB, National Center for
Genome Resources, US http://www.ncgr.org/pathdb/
A functional prototype research tool for biochemistry and functional genomics.
One of the key underlying philosophies of our project is to capture discrete
metabolic steps. This allows us to build tools to construct metabolic networks
de novo from a set of defined steps. PathDB is not simply a data repository but a system around which tools
can be created for building, visualizing, and comparing metabolic networks. Pfam
(from SWISS-PROT and TrEMBL) http://pfam.wustl.edu/
and various European mirror sites including EBI, UK http://www.sanger.ac.uk/Software/Pfam/
and Sweden http://www.cgr.ki.se/Pfam/
A database of multiple alignments of
protein domains or conserved protein regions. The alignments are intended to represent
evolutionarily conserved structure, which has implications for the protein's
function. Pfam is built in two separate ways: Pfam-A consists of accurate,
human-crafted multiple alignments, whereas Pfam-B is an automatic clustering of
the rest of SWISS-PROT and TrEMBL using the program Domainer. Presage Collaborative Resource
for structural genomics, UC-Berkeley, US http://presage.berkeley.edu/
Provides a database of proteins, each
of which has a collection of annotations reflecting current experimental status,
structural assignments, models, and suggestions. A tool for scientists to keep
track of structural knowledge of their proteins of interest. Derived from
SWISS-PROT and TrEMBL. Prints, University College
London, UK http://www.biochem.ucl.ac.uk/bsm/dbbrowser/PRINTS/PRINTS.html
Compendium of protein fingerprints. ProClass, NBRF Georgetown Univ.
Medical Center, US http://www-nbrf.georgetown.edu/gfserver/proclass.html
A non-redundant protein database
organized according to family relationships as defined collectively by PROSITE
patterns and PIR superfamilies. The ProClass database can facilitate protein
family information retrieval, unveil domain and family relationships, and
classify multi-domain proteins, by combining global and motif similarities
into a single family organization scheme. ProDom, INRA, France http://protein.toulouse.inra.fr/prodom.html
Protein domain database. ProtFam, MIPS, Germany http://www.mips.biochem.mpg.de/proj/protfam/
A curated protein classification database. In a joint effort, MIPS and PIR-NBRF
classify sequences into superfamilies and families and
annotate homology domains. proWeb Project, Fred Hutchinson
Cancer Research Center, US http://www.proweb.org/
Web-based protein family documentation,
links to protein and protein families databases and links to specific protein
family websites. PubMed Central, NCBI http://www4.ncbi.nlm.nih.gov/PubMed/
Medline. PubRef: see CrossRef. REBASE, Restriction Enzyme DataBase, New England Biolabs http://rebase.neb.com/rebase/rebcit.html
A collection of information about restriction enzymes and related proteins. It
contains published and unpublished references, recognition and cleavage sites,
isoschizomers, commercial availability, methylation sensitivity, crystal and
sequence data. DNA methyltransferases, homing endonucleases, nicking enzymes,
specificity subunits and control proteins are also included. Putative DNA
methyltransferases and restriction enzymes, as predicted from analysis of
genomic sequences, are also listed. REBASE is updated daily and is constantly
expanding. RESID, National Biomedical
Research Foundation, US http://www-nbrf.georgetown.edu/pirwww/dbinfo/resid.html
A database of protein post-translational modifications with descriptive, chemical, structural and
bibliographic information. Designed to assist users in interpreting the feature
annotations for active sites, covalent binding sites, modified sites, and
cross-links in the Protein Sequence Database, and to convey chemical information
with more detail and precision than is possible in a sequence database. RGD Rat Genome Database, Medical
College of Wisconsin, US http://rgd.mcw.edu/
The goal is the establishment of a Rat Genome Database to collect, consolidate, and integrate data generated from ongoing rat genetic and genomic research efforts and make these data widely available to the scientific community. A secondary but critical goal is to provide curation of mapped positions for quantitative trait loci, known mutations and other phenotypic data. RHdb Radiation Hybrid Database, EBI, UK http://corba.ebi.ac.uk/RHdb/
Database of raw data used in constructing radiation
hybrid maps. This includes STS data, scores, experimental conditions,
and extensive cross references. RNA Abundance Database (RAD), CBIL, Univ. of
Pennsylvania, US http://www.cbil.upenn.edu/rad2/servlet
A public gene expression database designed to hold data from array-based
(microarrays, high-density oligo arrays, macroarrays) and non-array-based (SAGE)
experiments. The ultimate goal is to allow comparative analysis of experiments
performed by different laboratories using different platforms and investigating
different biological systems. To achieve this goal, RAD contains precise
descriptions of the experiments and distinguishes between raw data and processed
results. In addition, a gene index is used to integrate array elements and gene
tags. The selection of experiments to include in RAD will be directed by our
research interests and those of our collaborators, such as hematopoiesis. RNA World, IMB, Jena, Germany http://www.imb-jena.de/RNA.html
Links on RNA related topics. RNAi Database, Cornell Univ., Cold Spring Harbor Lab,
US http://formaggio.cshl.org/~marco/fabio/index.html
Results of the RNAi analysis of genes expressed in the ovary of C. elegans,
as described in "RNAi
analysis of genes expressed in the ovary of Caenorhabditis elegans" by F. Piano, A.
Schetter, M. Mangone, L. Stein, and K.J. Kemphues, published in the November
issue of Current Biology. Rat Genome Data, Jackson Lab, US
http://www.informatics.jax.org/rat/index.shtml RatMap Rat Genome Database,
Goteborg University, Sweden http://ratmap.gen.gu.se/
Locus queries, homology (mouse/rat),
nomenclature, linkage and physical maps, gene mapping data. RefSeq Reference Sequences,
NCBI, US http://www.ncbi.nlm.nih.gov/LocusLink/refseq.html
Curated sequence data and related
information for the community to use as a standard. Whereas GenBank is a
repository of sequences, the RefSeq database will be a non-redundant set of
reference sequences, including constructed genomic contigs, mRNAs, proteins,
and, in the future, entire chromosomes. RefSeq records are made available in
three 'status' levels: predicted, provisional and reviewed. Reviewed records
represent a compilation of our current knowledge of a gene and its transcripts.
During the review process we integrate additional information, when available,
such as sequence data, publications, nomenclature, and feature annotations from
multiple GenBank records, the Human Gene Nomenclature Committee, and Online
Mendelian Inheritance in Man. The initial release of RefSeq records includes
human mRNA and protein reference sequences. The current scope is limited to
human sequences but other organisms will be added in the future. Regulon DB: Transcriptional Regulation in E. coli,
Centro de Investigacion sobre Fijacion de Nitrogeno, UNAM A.P., Mexico http://www.cifn.unam.mx/Computational_Biology/regulondb/
Predictions for regulatory proteins, binding sites and operons. Research Collaboratory for Structural
Bioinformatics RCSB See Protein DataBank SAGEmap, NCBI, US http://www.ncbi.nlm.nih.gov/SAGE/
Serial Analysis of Gene Expression, or
SAGE, is an experimental technique designed to gain a quantitative measure of
gene expression. The SAGE technique itself includes several steps utilizing
molecular biological, DNA sequencing and bioinformatics techniques. These steps
have been used to produce 9- or 10-base "tags", which are then, in
some manner, assigned gene descriptions. SBASE Protein Domain Library,
ICGEB, International Centre for Genetic Engineering and Biotechnology, Italy http://www3.icgeb.trieste.it/~sbasesrv/
Annotated protein sequence segments
(structural, functional, ligand binding and topogenic). Designed to facilitate
detection of domain homologies. SBIR Small Business Innovation Research Awards See CRISP. SCOP: Structural Classification of
Proteins, University of Cambridge UK http://scop.berkeley.edu/
SCOP mirrors
Reference: Murzin A. G., Brenner S. E., Hubbard T., Chothia C. (1995). SCOP:
a structural classification of proteins database for the investigation of
sequences and structures. J. Mol. Biol. 247, 536-540. SGD Saccharomyces Genome
Database, Stanford University http://genome-www.stanford.edu/Saccharomyces/ SGD Worm-Yeast Protein Comparison,
Stanford University, US http://genome-www.stanford.edu/Saccharomyces/worm/
Entire complement of predicted proteins
from the nematode C. elegans ("worm") and budding yeast S.
cerevisiae ("yeast") genomes. SMART (a Simple Modular Architecture Research Tool) EMBL, Heidelberg,
Germany http://smart.embl-heidelberg.de/
Allows the identification and annotation of genetically mobile domains and the analysis of domain architectures. More than 500 domain families found in signalling, extracellular and
chromatin-associated proteins are detectable. These domains are extensively annotated with respect to phyletic distributions, functional class, tertiary
structures and functionally important residues. Each domain found in a
non-redundant protein database as well as search parameters and taxonomic information are stored in a relational database system. User interfaces to this database allow searches for
proteins containing specific combinations of domains in defined taxa. SNP Consortium Data Release, SNP
Consortium http://snp.cshl.org/data/
The current [9th] release consists of 1,034,034 SNPs, all of which have
been anchored to the human genome by "in silico" mapping to the
genomic working draft. SPAD Signaling Pathway Database,
Institute of Genetic Resources, Kyushu Univ., Japan http://www.grt.kyushu-u.ac.jp/eny-doc/index.html
Integrated database for genetic
information and signal transduction systems. SRS Sequence Retrieval System http://www.lionbio.co.uk/publicsrs.html
The URL has a list of public SRS servers, including EBI, DDBJ,
INFOBIOGEN, and EMBL.
SRS, developed initially as an academic system, is
probably the best biological database browsing tool available. SRS allows you to browse the contents of databases through a web interface, exploring links to other databases and launching other programs on the retrieved database records. STACK Sequence Tag Alignment and Consensus Knowledgebase,
South African National Bioinformatics Institute, SANBI, South Africa http://www.sanbi.ac.za/Dbases.html
Mirrored at MRC, UK. The project aims to generate a comprehensive representation of
the sequence of each of the expressed genes in the human genome, by extensive
processing of gene fragments to make accurate alignments, highlight diversity
and provide a carefully joined set of consensus sequences for each gene. STACK
is organised via tissue subdivisions as well as a whole body index. A cleaned up
version of the EST data is represented by Sanigene, a clean consensus entry
database. Each STACK release contains sequences obtained from public submissions
to GenBank dbEST distributed by NCBI. The comprehensive consensus includes
publicly available data and a non-redundant integration of tissue datasets. SWISS 2D PAGE, Swiss Institute
of Bioinformatics http://www.expasy.ch/ch2d/
Data
on proteins identified on various 2-D PAGE reference maps. SWISS 3D Image, ExPASy, Switzerland http://www.expasy.ch/sw3d/
An image database which strives to provide high quality pictures of
biological macromolecules with known three-dimensional structure. The
database contains mostly images of experimentally elucidated structures, but
also provides views of well accepted theoretical protein models. SWISS-PROT, ExPASy (Expert
Protein Analysis System) Swiss Institute of Bioinformatics http://www.expasy.ch/sprot/sprot-top.html
An annotated protein sequence database
maintained by the Department of Medical Biochemistry of the University of Geneva
and the EMBL Data Library. Saccharomyces Genome Deletion
Project http://sequence-www.stanford.edu/group/yeast_deletion_project/deletions3.html Stanford MicroArray Database, Stanford Univ., US
http://genome-www4.stanford.edu/MicroArray/SMD/
Stores raw and normalized data from microarray experiments, as well as their
corresponding image files. In addition, SMD provides interfaces for data
retrieval, analysis and visualization. Includes a biological ontology. TAMBIS, University of Manchester UK http://img.cs.man.ac.uk/tambis/
Transparent Access to Multiple Bioinformatics Information Sources. Aims to
provide transparent information retrieval and filtering from biological
information services by building a homogenising layer on top of the
different sources. This layer uses a mediator and many source wrappers
to create the illusion of one all encompassing data source. TBASE Transgenic/Targeted Mutation
Database, Jackson Laboratory, US http://tbase.jax.org/
Since
development of the technology to manipulate the germline of animals over a
decade ago, a large number of transgenic animals have been produced worldwide
for use in both basic and applied research. Additionally, development of gene
targeting protocols involving homologous recombination in mouse embryonic stem
cells has resulted in a considerable number of mutant lines with specific
phenotypes and well-defined DNA structural changes. TBASE is an attempt to
organize information on transgenic animals and targeted mutations generated and
analyzed worldwide. TIGR Gene Indices http://www.tigr.org/tigr-scripts/nhgi_scripts/tgi_blast.pl
TIGR Unique Gene Indices include Tentative Consensus and Singleton EST
sequences and can be searched with either a nucleotide (default) or peptide
query sequence. TIGR Microbial Database, TIGR, US http://www.tigr.org/tdb/mdb/mdbcomplete.html
A listing of published microbial
genomes and chromosomes and those in
progress. TRANSFAC, GBF - Gesellschaft
für Biotechnologische Forschung GmbH, Germany http://transfac.gbf.de/TRANSFAC/
Compiles
data about gene regulatory DNA sequences and protein factors binding to and
acting through them. On this basis, programs are developed that help to identify
putative promoter or enhancer structures and to suggest their features. TRIPLES TRansposon-Insertion
Phenotypes, Localization and Expression in Saccharomyces, Yale
Univ., US http://ycmi.med.yale.edu/YGAC/triples.htm
Defined
mutant alleles for the analysis of disruption phenotypes, protein localization
and gene expression in Saccharomyces
cerevisiae. TrEMBL, Swiss Institute of
Bioinformatics, European Bioinformatics Institute UK http://www.expasy.ch/sprot/
A
computer-annotated supplement of SWISS-PROT that contains all the translations
of EMBL nucleotide sequence entries not yet integrated in SWISS-PROT. Taxonomy, NCBI, US
See Nomenclature. ToxExpress See under GeneExpress. Transgenic and Targeted Mutant
Animal Database, Johns Hopkins Univ., US, now available through TBASE. Transpath, Frank G. Schacherer
(as a Ph.D. project) GBF Braunschweig, Germany http://193.175.244.148/
An extension module to the TRANSFAC
database on transcription factors and their binding sites. It focuses on
pathways involved in the regulation of transcription factors in different
species, mainly human, mouse and rat. Elements of the relevant signal
transduction pathways like hormones, receptors, enzymes and transcription
factors are stored together with information about their interaction and
references in an object-oriented database. TransTerm, Univ. of Otago, New
Zealand http://biochem.otago.ac.nz:800/chrisb/home_page.html
Database of sequence contexts about the
stop and start codons of many species found in GenBank. TransTerm also contains
codon usage data for these same species and summary statistics for the sequences
analysed. UM-BBD University of Minnesota Biocatalysis/Biodegradation Database,
US http://umbbd.ahc.umn.edu/index.html
Information on microbial biocatalytic reactions and biodegradation pathways for
primarily xenobiotic chemical compounds. The goal of the UM-BBD is to provide
information on microbial enzyme-catalyzed reactions that are important for
biotechnology. The reactions covered are studied for basic understanding of
nature, biocatalysis leading to specialty chemical manufacture, and
biodegradation of environmental pollutants. Individual reactions and metabolic
pathways are presented with information on the starting and intermediate
chemical compounds, the organisms that transform the compounds, the enzymes, and
the genes. UniGene, NCBI, US http://www.ncbi.nlm.nih.gov/UniGene/index.html
An experimental system for
automatically partitioning GenBank sequences into a non-redundant set of
gene-oriented clusters. Each UniGene cluster contains sequences that represent a
unique gene, as well as related information such as the tissue types in which
the gene has been expressed and map location. Well-characterized genes and ESTs. UniVec, NCBI, US http://www.ncbi.nlm.nih.gov/VecScreen/UniVec.html
A database that can be used to quickly identify segments within nucleic acid
sequences which may be of vector origin (vector contamination)
... In addition to vector sequences, UniVec also contains sequences for those
adapters, linkers and primers commonly used in the process of cloning cDNA or
genomic DNA. V Base: the database of human antibody genes, Centre for Protein
Engineering, Medical Research Council, UK http://www.mrc-cpe.cam.ac.uk/imt-doc/public/ A
comprehensive directory of all human germline variable region sequences compiled from over a thousand published
sequences, including those in the current releases of the GenBank and EMBL data libraries. Variome, Structural Bioinformatics, Inc. http://www.strubix.com/variOSP.htm A series of 3-D protein modules derived from the DNA sequences of known disease targets. Each Variome™
module is composed of variant clinical genetic sequences isolated from
tens of thousands of individual patient samples. Currently, there are two comprehensive database modules in Variome™: 1) the HIV Protease module and 2) the HIV Reverse Transcriptase module.
Provides key insights into the meaningful interactions between a drug and its polymorphic targets. VecScreen, NCBI, US http://www.ncbi.nlm.nih.gov/VecScreen/VecScreen.html
A system for quickly identifying
segments of a nucleic acid sequence that may be of vector origin. NCBI developed
VecScreen to combat the problem of vector contamination in public sequence
databases. Virgil, InfoBiogen, France http://www.infobiogen.fr/services/virgil/home.html
A database of rich links for data browsing, data analysis and database
interconnection, with a focus on human data. It contains more than 40,000 rich
links from 5 major databases: SWISS-PROT, GenBank, PDB, GDB and OMIM.
Virgil uses an object-oriented database engine: Eyedb.
The Virgil data
model was designed to comprehensively describe a link between two biological
objects. WIT2 What Is There, Argonne
National Lab, US http://wit.mcs.anl.gov/WIT2/
Attempts to produce metabolic
reconstructions (models of the metabolism of the organism derived from sequence,
biochemical, and phenotypic data) for sequenced (or partially sequenced)
genomes. For each organism, a table connecting genes (ORFs) to hypothesized
functional roles is included. Worm Chip Directory, Stanford
University, US http://cmgm.stanford.edu/~kimlab/wmdirectorybig.html Worm PD, Proteome, Inc. http://www.proteome.com/DB-demo/intro-to-WormPD.html
Worm Proteome Database: Caenorhabditis elegans. XREFdb, NCBI http://www.ncbi.nlm.nih.gov/XREFdb/index.html
Component of the XREF project devoted
to cross-referencing the genetics of model organisms with mammalian phenotypes and accelerating
the identification of genes mutated in human diseases. YPD Yeast Proteomics Database,
Proteome, US http://www.proteome.com S.
cerevisiae Proteome Database ZFIN Zebrafish Information Network,
University of Oregon, US http://zfish.uoregon.edu/ Software
includes BEAUTY, BLAST, CLUSTALW, DBGET, DB Searching, browsing and analysis
tools, Dbsolve, Entrez, ExPASy, Fasta, Gene Identification Software Sites, GRAIL,
Gapped BLAST, MedMiner, Proteomic tools, PSI-BLAST, SMART (Simple Modular
Architecture Research Tool), SWISS-Model, Yeast Tools, WWW Promoter Scan Amino Acid Analysis Server, EMBL
[PROPSEARCH] http://www.embl.org/aaa.html ArrayDB, NHGRI, US http://genome.nhgri.nih.gov/arraydb/
LIMS (Laboratory Information Management System) software for
managing and analyzing large-scale expression data. Information stored
in ArrayDB is used to provide integrated gene expression reports by linking
array target sequences with NCBI’s Entrez retrieval system, UniGene and
KEGG pathway views. Designed to store information on hybridization targets
(cDNA clones). Production of arrays begins with the selection of the ‘probes’ to be
printed on the array. In many cases, these are chosen directly from
databases including GenBank, dbEST and UniGene… Additionally, full-length
cDNAs, collections of partially sequenced cDNAs (or ESTs) or randomly chosen
cDNAs from any library of interest can be used. Arrays for higher eukaryotes
are typically based on the EST portions of these projects, whereas for yeast
and prokaryotes, probes are usually generated by amplifying genomic DNA with
gene-specific primers. [DJ Duggan et al "Expression profiling using
cDNA microarrays" Nature Genetics 21(1s): 10-14, Jan 1999] Given the expense of obtaining clones, producing DNA from them, and
printing them, it is usually preferable to produce arrays with a low
redundancy of representation, so as to survey the broadest possible set of
genes. In this regard, the human UniGene database represents an excellent
model of the kind of informational base one needs both to choose clones and
to evaluate expression profiles. … On the other hand, no other organisms
have such a well-developed EST (expressed sequence tag) database, a
limitation given that cDNA microarrays also permit the ‘assay’ of
uncharacterized cDNAs (which may represent genes with informative expression
patterns)… cDNA arrays are produced by spotting PCR products representing
specific genes onto a matrix. Printing is carried out by a robot. [DJ Duggan
et al "Expression profiling using cDNA microarrays" Nature
Genetics 21(1s): 10-14, Jan 1999] See also cDNA BEAUTY: BLAST Enhanced Alignment
Utility: An enhanced version of NCBI's BLAST database search tool.
BEAUTY, when used to search three new custom sequence databases that we have
developed, incorporates information on sequence family membership, the location
of the conserved domains, and the locations of any annotated domains and sites
directly into BLAST search results. These enhancements make it much easier to
detect weak, but functionally significant, matches in BLAST database searches.
http://searchlauncher.bcm.tmc.edu:9331/seq-search/Help/beauty.html BLAST (Basic Local Alignment Search
Tool): Software program from NCBI for searching public databases for
homologous sequences or proteins. Designed to explore all available sequence
databases regardless of whether the query is protein or DNA. http://www.ncbi.nlm.nih.gov/BLAST/ CLUSTALW at EBI, UK http://www2.ebi.ac.uk/clustalw/ Cn3D http://www.ncbi.nlm.nih.gov/Structure/CN3D/cn3d.shtml
A helper application for your web browser that allows you to view 3-dimensional
structures from NCBI's Entrez
retrieval service. Electronic PCR, NCBI http://www.ncbi.nlm.nih.gov/STS
PCR-based STSs have been used as
landmarks for construction of various types of genomic maps. Using e-PCR these
sites can be detected in DNA sequences, potentially allowing their map locations
to be determined. FASTA: Software program, from
the University of Virginia, used to scan a protein or DNA sequence library for
similar sequences. http://fasta.bioch.virginia.edu/ GeneParser http://beagle.colorado.edu/~eesnyder/GeneParser.html
A program for the identification of protein coding regions in genomic DNA
sequences. [Eric Snyder] GRAIL: Genome Recognition and
Assembly Internet Link software http://compbio.ornl.gov/Grail-1.3/ GeneSpring™, Silicon Genetics, US http://www.sigenetics.com/Products/GeneSpring/index.html
Software, an analytical workbench enabling scientists to visualize and
manipulate gene expression data. Experimental data from microarrays,
Affymetrix chips, SAGE, or any technique that associates numbers with genes
can easily be imported for rigorous analysis. MedMiner http://discover.nci.nih.gov
Text-mining tool for gene expression
profiling. ORF Finder, NCBI, US http://www.ncbi.nlm.nih.gov/gorf/gorf.html
Gene prediction. PredictProtein Server http://www.embl-heidelberg.de/predictprotein/predictprotein.html
Service for sequence analysis and
protein structure prediction. A Neural Network based prediction
server, which automatically builds a multiple sequence alignment from the most
recent version of SwissProt. Ab initio secondary structure prediction. PSA Protein Structure Predicter
Server, BMERC, Boston Univ. US http://bmerc-www.bu.edu/psa/
Predicts probable secondary structures
and folding classes for a given amino acid sequence. Protein Explorer http://www.umass.edu/microbio/chime/explorer/