Table 1. Molecular Biology Database Collection (a)


Database name    Full name and/or description


1. Nucleotide Sequence Databases
1.1. International Nucleotide Sequence Database Collaboration
1 DDBJ—DNA Data Bank of Japan All known nucleotide and protein sequences
2 EMBL Nucleotide Sequence Database All known nucleotide and protein sequences
3 GenBank® All known nucleotide and protein sequences
1.2. DNA sequences: genes, motifs and regulatory sites
1.2.1. Coding and non-coding DNA
403 ACLAME A classification of genetic mobile elements
30 CUTG Codon usage tabulated from GenBank
480 Genetic Codes Genetic codes in various organisms and organelles
668 Entrez Gene Gene-centered information at NCBI
495 HERVd Human endogenous retrovirus database
687 Hoppsigen Human and mouse homologous processed pseudogenes
294 Imprinted Gene Catalogue Imprinted genes and parent-of-origin effects in animals
512 Islander Pathogenicity islands and prophages in bacterial genomes
343 MICdb Prokaryotic microsatellites
707 NPRD Nucleosome positioning region database
47 STRBase Short tandem DNA repeats database
5 TIGR Gene Indices Organism-specific databases of EST and gene sequences
48 Transterm Codon usage, start and stop signals
6 UniGene Non-redundant set of eukaryotic gene-oriented clusters
320 UniVec Vector sequences, adapters, linkers and primers used in DNA cloning, can be used to check for vector contamination
302 VectorDB Characterization and classification of nucleic acid vectors
305 Xpro Eukaryotic protein-encoding DNA sequences, both intron-containing and intron-less genes
1.2.2. Gene structure, introns and exons, splice sites
414 ASAP Alternative spliced isoforms
28 ASD Alternative splicing database at EBI, includes three databases AltSplice, AltExtron and AEdb
10 ASDB Alternative splicing database: protein products and expression patterns of alternatively spliced genes
639 ASHESdb Alternatively spliced human genes by exon skipping database
450 EASED Extended alternatively spliced EST database
667 ECgene Genome annotation for alternative splicing
631 EDAS EST-derived alternative splicing database
34 ExInt Exon–intron structure of eukaryotic genes
36 HS3D Homo sapiens splice sites dataset
238 Intronerator Alternative splicing in C.elegans and C.briggsae
46 SpliceDB Canonical and non-canonical mammalian splice sites
746 SpliceInfo Modes of alternative splicing in human genome
580 SpliceNest A tool for visualizing splicing of genes from EST data
1.2.3. Transcriptional regulator sites and transcription factors
231 ACTIVITY Functional DNA/RNA site activity
31 DBTBS Bacillus subtilis promoters and transcription factors
663 DoOP Database of orthologous promoters: chordates and plants
106 DPInteract Binding sites for E.coli DNA-binding proteins
33 EPD Eukaryotic promoter database
494 HemoPDB Hematopoietic promoter database: transcriptional regulation in hematopoiesis
516 JASPAR PSSMs for transcription factor DNA-binding sites
700 MAPPER Putative transcription factor binding sites in various genomes
40 PLACE Plant cis-acting regulatory DNA elements
41 PlantCARE Plant promoters and cis-acting regulatory elements
563 PlantProm Plant promoter sequences for RNA polymerase II
566 PRODORIC Prokaryotic database of gene regulation networks
42 PromEC E.coli promoters with experimentally identified transcriptional start sites
246 SELEX_DB DNA and RNA binding sites for various proteins, found by systematic evolution of ligands by exponential enrichment
227 TESS Transcription element search system
756 TRACTOR db Transcription factors in gamma-proteobacteria database
345 TRANSCompel Composite regulatory elements affecting gene transcription in eukaryotes
340 TRANSFAC Transcription factors and binding sites
757 TRED Transcriptional regulatory element database
49 TRRD Transcription regulatory regions of eukaryotic genes
2. RNA sequence databases
229 16S and 23S rRNA Mutation Database 16S and 23S ribosomal RNA mutations
230 5S rRNA Database 5S rRNA sequences
411 Aptamer database Small RNA/DNA molecules binding nucleic acids, proteins
232 ARED AU-rich element-containing mRNA database
378 Mobile group II introns A database of group II introns, self-splicing catalytic RNAs
463 European rRNA database All complete or nearly complete rRNA sequences
490 GtRDB Genomic tRNA database
236 Guide RNA Database RNA editing in various kinetoplastid species
76 HIV Sequence Database HIV RNA sequences
689 HuSiDa Human siRNA database
237 HyPaLib Hybrid pattern library: structural elements in classes of RNA
379 IRESdb Internal ribosome entry site database
529 microRNA Registry Database of microRNAs (small non-coding RNAs)
380 NCIR Non-canonical interactions in RNA structures
381 ncRNAs Database Non-coding RNAs with regulatory functions
705 NONCODE A database of non-coding RNAs
240 PLANTncRNAs Plant non-coding RNAs
564 Plant snoRNA DB snoRNA genes in plant species
723 PolyA_DB A database of mammalian mRNA polyadenylation
242 PseudoBase Database of RNA pseudoknots
382 Rfam Non-coding RNA families
244 RISSC Ribosomal internal spacer sequence collection
630 RNAdb Mammalian non-coding RNA database
245 RNA Modification Database Naturally modified nucleosides in RNA
43 RRNDB rRNA operon numbers in various prokaryotes
629 siRNAdb siRNA database and search engine
247 Small RNA Database Small RNAs from prokaryotes and eukaryotes
248 SRPDB Signal recognition particle database
754 SSU rRNA Modification Database Modified nucleosides in small subunit rRNA
383 Subviral RNA Database Viroids and viroid-like RNAs
249 tmRNA Website tmRNA sequences and alignments
250 tmRDB tmRNA database
251 tRNA sequences tRNA viewer and sequence editor
252 UTRdb/UTRsite 5'- and 3'-UTRs of eukaryotic mRNAs
3. Protein sequence databases
3.1. General sequence databases
163 EXProt Sequences of proteins with experimentally verified function
542 NCBI Protein database All protein sequences: translated from GenBank and imported from other protein databases
714 PA-GOSUB Protein sequences from model organisms, GO assignment and subcellular localization
194 PIR-PSD Protein information resource protein sequence database, has been merged into the UniProt knowledgebase
370 PIR-NREF PIR's non-redundant reference protein database
565 PRF Protein research foundation database of peptides: sequences, literature and unnatural amino acids
197 Swiss-Prot Now UniProt/Swiss-Prot: expertly curated protein sequence database, section of the UniProt knowledgebase
198 TrEMBL Now UniProt/TrEMBL: computer-annotated translations of EMBL nucleotide sequence entries: section of the UniProt knowledgebase
775 UniParc UniProt archive: a repository of all protein sequences, consisting only of unique identifiers and sequence
318 UniProt Universal protein knowledgebase: merged data from Swiss-Prot, TrEMBL and PIR protein sequence databases
776 UniRef UniProt non-redundant reference database: clustered sets of related sequences (including splice variants and isoforms)
3.2. Protein properties
221 AAindex Physicochemical properties of amino acids
729 ProNIT Thermodynamic data on protein–nucleic acid interactions
280 ProTherm Thermodynamic data for wild-type and mutant proteins
772 TECRdb Thermodynamics of enzyme-catalyzed reactions
3.3. Protein localization and targeting
444 DBSubLoc Database of protein subcellular localization
375 NESbase Nuclear export signals database
376 NLSdb Nuclear localization signals
704 NMPdb Nuclear matrix associated proteins database
706 NOPdb Nucleolar proteome database
734 PSORTdb Protein subcellular localization in bacteria
745 SPD Secreted protein database
587 THGS Transmembrane helices in genome sequences
589 TMPDB Experimentally characterized transmembrane topologies
3.4. Protein sequence motifs and active sites
374 ASC Active sequence collection: biologically active peptides
203 Blocks Alignments of conserved regions in protein families
440 CSA Catalytic site atlas: active sites and catalytic residues in enzymes of known 3D structure
438 COMe Co-ordination of metals etc.: classification of bioinorganic proteins (metalloproteins and some other complex proteins)
771 CopS Comprehensive peptide signature database
666 eBLOCKS Highly conserved protein sequence blocks
206 eMOTIF Protein sequence motif determination and searches
179 Metalloprotein Site Database Metal-binding sites in metalloproteins
209 O-GlycBase O- and C-linked glycosylation sites in proteins
717 PDBSite 3D structure of protein functional sites
187 Phospho.ELM S/T/Y protein phosphorylation sites (formerly PhosphoBase)
193 PROMISE Prosthetic centers and metal ions in protein active sites
215 PROSITE Biologically significant protein patterns and profiles
732 ProTeus Signature sequences at the protein N- and C-termini
3.5. Protein domain databases; protein classification
622 ADDA A database of protein domain classification
204 CDD Conserved domain database, includes protein domains from Pfam, SMART, COG and KOG databases
205 CluSTr Clusters of Swiss-Prot+TrEMBL proteins
671 FunShift Functional divergence between the subfamilies of a protein domain family
200 Hits A database of protein domains and motifs
207 InterPro Integrated resource of protein families, domains and functional sites
208 iProClass Integrated protein classification database
561 PIRSF Family/superfamily classification of whole proteins
212 PRINTS Hierarchical gene family fingerprints
210 Pfam Protein families: multiple sequence alignments and profile hidden Markov models of protein domains
727 PRECISE Predicted and consensus interaction sites in enzymes
214 ProDom Protein domain families
216 ProtoMap Hierarchical classification of Swiss-Prot proteins
567 ProtoNet Hierarchical clustering of Swiss-Prot proteins
740 S4 Structure-based sequence alignments of SCOP superfamilies
217 SBASE Protein domain sequences and tools
218 SMART Simple modular architecture research tool: signalling, extracellular and chromatin-associated protein domains
219 SUPFAM Grouping of sequence families into superfamilies
220 SYSTERS Systematic re-searching and clustering of proteins
199 TIGRFAMs TIGR protein families adapted for functional annotation
3.6. Databases of individual protein families
156 AARSDB Aminoacyl-tRNA synthetase database
308 ASPD Artificial selected proteins/peptides database
158 BacTregulators Transcriptional regulators of AraC and TetR families
364 CSDBase Cold shock domain-containing proteins
653 CuticleDB Structural proteins of Arthropod cuticle
658 DCCP Database of copper-chelating proteins
160 DExH/D Family Database DEAD-box, DEAH-box and DExH-box proteins
161 Endogenous GPCR List G protein-coupled receptors; expression in cell lines
162 ESTHER Esterases and other alpha/beta hydrolase enzymes
464 EyeSite Families of proteins functioning in the eye
166 GPCRDB G protein-coupled receptors database
679 gpDB G-proteins and their interaction with GPCRs
167 Histone Database Histone fold sequences and structures
169 Homeobox Page Homeobox proteins, classification and evolution
293 Hox-Pro Homeobox genes database
170 Homeodomain Resource Homeodomain sequences, structures and related genetic and genomic information
366 HORDE Human olfactory receptor data exploratorium
174 InBase Inteins (protein splicing elements) database: properties, sequences, bibliography
518 KinG—Kinases in Genomes S/T/Y-specific protein kinases encoded in complete genomes
519 Knottins Database of knottins—small proteins with an unusual ‘disulfide through disulfide’ knot
176 LGICdb Ligand-gated ion channel subunit sequences database
368 Lipase Engineering Database Sequence, structure and function of lipases and esterases
524 LOX-DB Mammalian, invertebrate, plant and fungal lipoxygenases
177 MEROPS Database of proteolytic enzymes (peptidases)
369 NPD Nuclear protein database
546 NucleaRDB Nuclear receptor superfamily
182 Nuclear Receptor Resource Nuclear receptor superfamily
183 NUREBASE Nuclear hormone receptors database
184 Olfactory Receptor Database Sequences for olfactory receptor-like molecules
185 ooTFD Object-oriented transcription factors database
188 PKR Protein kinase resource: sequences, enzymology, genetics and molecular and structural properties
759 PLPMDB Pyridoxal-5'-phosphate dependent enzymes mutations
609 ProLysED A database of bacterial protease systems
192 Prolysis Proteases and natural and synthetic protease inhibitors
224 REBASE Restriction enzymes and associated methylases
195 Ribonuclease P Database RNase P sequences, alignments and structures
573 RPG Ribosomal protein gene database
575 RTKdb Receptor tyrosine kinase sequences
309 S/MARt dB Nuclear scaffold/matrix attached regions
741 Scorpion Database of scorpion toxins
372 SDAP Structural database of allergenic proteins and food allergens
196 SENTRA Sensory signal transduction proteins
373 SEVENS 7-transmembrane helix receptors (G-protein-coupled)
248 SRPDB Proteins of the signal recognition particles
314 TrSDB Transcription factor database
399 VKCDB Voltage-gated potassium channel database
202 Wnt Database Wnt proteins and phenotypes
4. Structure Databases
4.1. Small molecules
646 ChEBI Chemical entities of biological interest
261 CSD Cambridge structural database: crystal structure information for organic and metal-organic compounds
265 HIC-Up Hetero-compound Information Centre—Uppsala
402 AANT Amino acid–nucleotide interaction database
111 Klotho Collection and categorization of biological compounds
113 LIGAND Chemical compounds and reactions in biological pathways
615 PDB-Ligand 3D structures of small molecules bound to proteins and nucleic acids
735 PubChem Structures and biological activities of small organic molecules
4.2. Carbohydrates
429 CCSD Complex carbohydrate structure database (CarbBank)
652 CSS Carbohydrate structure suite: carbohydrate 3D structures derived from the PDB
486 Glycan Carbohydrate database, part of the KEGG system
292 GlycoSuiteDB N- and O-linked glycan structures and biological sources
535 Monosaccharide Browser Space-filling Fischer projections of monosaccharides
300 SWEET-DB Annotated carbohydrate structure and substance information
4.3. Nucleic acid structure
272 NDB Nucleic acid-containing structures
273 NTDB Thermodynamic data for nucleic acids
387 RNABase RNA-containing structures from PDB and NDB
283 SCOR Structural classification of RNA: RNA motifs by structure, function and tertiary interactions
4.4. Protein structure
413 ArchDB Automated classification of protein loop structures
255 ASTRAL Sequences of domains of known structure, selected subsets and sequence–structure correspondences
288 BAliBASE A database for comparison of multiple sequence alignments
257 BioMagResBank NMR spectroscopic data for proteins and nucleic acids
384 CADB Conformational angles in proteins database
258 CATH Protein domain structures database
259 CE 3D protein structure alignments
260 CKAAPs DB Structurally similar proteins with dissimilar sequences
442 Dali Protein fold classification using the Dali search engine
385 Decoys ‘R’ Us Computer-generated protein conformations
447 DisProt Database of Protein Disorder: proteins that lack fixed 3D structure in their native states
448 DomIns Domain insertions in known protein structures
264 DSDBASE Native and modeled disulfide bonds in proteins
386 DSMM Database of simulated molecular motions
452 eF-site Electrostatic surface of Functional site: electrostatic potentials and hydrophobic properties of the active sites
674 GenDiS Genomic distribution of protein structural superfamilies
472 Gene3D Precalculated structural assignments for whole genomes
489 GTD Genomic threading database: structural annotations of complete proteomes
322 GTOP Protein fold predictions from genome sequences
360 Het-PDB Navi Hetero-atoms in protein structures
498 HOMSTRAD Homologous structure alignment database: curated structure-based alignments for protein families
267 IMB Jena Image Library Visualization and analysis of 3D biopolymer structures
502 IMGT/3Dstructure-DB Sequences and 3D structures of vertebrate immunoglobulins, T cell receptors and MHC proteins
268 ISSD Integrated sequence–structure database
269 LPFC Library of protein family core structures
270 MMDB NCBI's database of 3D structures, part of NCBI Entrez
456 E-MSD EBI's macromolecular structure database
331 ModBase Annotated comparative protein structure models
262 MolMovDB Database of macromolecular movements: descriptions of protein and macromolecular motions, including movies
274 PALI Phylogeny and alignment of homologous protein structures
275 PASS2 Structural motifs of protein superfamilies
557 PepConfDB A database of peptide conformations
276 PDB Protein structure databank: all publicly available 3D structures of proteins and nucleic acids
277 PDB-REPRDB Representative protein chains, based on PDB entries
278 PDBsum Summaries and analyses of PDB structures
619 PDB_TM Transmembrane proteins with known 3D structure
719 Protein Folding Database Experimental data on protein folding
282 SCOP Structural classification of proteins
284 Sloop Classification of protein loops
583 Structure Superposition Database Pairwise superposition of TIM-barrel structures
585 SWISS-MODEL Repository Database of annotated 3D protein structure models
285 SUPERFAMILY Assignments of proteins to structural superfamilies
584 SURFACE Surface residues and functions annotated, compared and evaluated: a database of protein surface patches
764 TargetDB Target data from worldwide structural genomics projects
401 3D-GENOMICS Structural annotations for complete proteomes
310 TOPS Topology of protein structures database
5. Genomics Databases (non-human)
5.1. Genome annotation terms, ontologies and nomenclature
73 Genew Human gene nomenclature: approved gene symbols
487 GO Gene ontology consortium database
389 GOA EBI's gene ontology annotation project
513 IUBMB Nomenclature database Nomenclature of enzymes, membrane transporters, electron transport proteins and other proteins
514 IUPAC Nomenclature database Nomenclature of biochemical and organic compounds approved by the IUBMB-IUPAC Joint Commission
515 IUPHAR-RD The International Union of Pharmacology recommendations on receptor nomenclature and drug classification
552 PANTHER Gene products organized by biological function
317 UMLS Unified medical language system
5.1.1. Taxonomy and Identification
78 ICB gyrB database for identification and classification of bacteria
297 NCBI Taxonomy Names of all organisms represented in GenBank
608 PANDIT Protein and associated nucleotide domains with inferred trees
299 RIDOM rRNA-based differentiation of medical microorganisms
243 RDP-II Ribosomal database project
301 Tree of Life Information on phylogeny and biodiversity
5.2. General genomics databases
7 COG Clusters of orthologous groups of proteins
650 COGENT Complete genome tracking: predicted peptides from fully sequenced genomes
337 CORG Comparative regulatory genomics: conserved non-coding sequence blocks
445 DEG Database of essential genes from bacteria and yeast
451 EBI Genomes EBI's collection of databases for the analysis of complete and unfinished viral, pro- and eukaryotic genomes
453 EGO Eukaryotic gene orthologs: orthologous DNA sequences in the TIGR gene indices
70 EMGlib Enhanced microbial genomes library: completely sequenced genomes of unicellular organisms
458 Entrez Genomes NCBI's collection of databases for the analysis of complete and unfinished viral, pro- and eukaryotic genomes
461 ERGOLight Integrated biochemical data on nine bacterial genomes: publicly available portion of the ERGO database
470 FusionDB Database of bacterial and archaeal gene fusion events
611 Genome Atlas DNA structural properties of sequenced genomes
484 Genome Information Broker DDBJ's collection of databases for the analysis of complete and unfinished viral, pro- and eukaryotic genomes
678 Genome Reviews Integrated view of complete genomes
75 GOLD Genomes online database: a listing of completed and ongoing genome projects
352 HGT-DB Putative horizontally transferred genes in prokaryotic genomes
223 Integr8 Functional classification of proteins in whole genomes
112 KEGG Kyoto encyclopedia of genes and genomes: integrated suite of databases on genes, proteins and metabolic pathways
528 MBGD Microbial genome database for comparative analysis
549 ORFanage Database of orphan ORFs (ORFs with no homologs) in complete microbial genomes
551 PACRAT Archaeal and bacterial intergenic sequence features
715 PartiGeneDB Assembled partial genomes for ~250 eukaryotic organisms
354 PEDANT Results of an automated analysis of genomic sequences
99 TIGR Microbial Database Lists of completed and ongoing genome projects with links to complete genome sequences
66 TIGR Comprehensive Microbial Resource Various data on complete microbial genomes: uniform annotation, properties of DNA and predicted proteins
311 TransportDB Predicted membrane transporters in complete genomes, classified according to the TC classification system
118 WIT3 What is there? Metabolic reconstruction for completely sequenced microbial genomes
5.3. Organism-specific databases
5.3.1. Viruses
473 HCVDB The hepatitis C virus database
497 HIV Drug Resistance Database HIV mutations that confer resistance to anti-HIV drugs
168 HIV Molecular Immunology Database HIV epitopes
365 HIV RT and Protease Sequence Database HIV reverse transcriptase and protease sequences
602 NCBI Viral Genomes Viral genome resource at NCBI
725 Poxvirus genomic sequences and gene annotation
750 T4-like genome database Sequences of T4-like bacteriophages from various sources
201 VIDA Homologous viral protein families database
761 VIPER Virus particle explorer: virus capsid structures
303 VirOligo Virus-specific oligonucleotides for PCR and hybridization
5.3.2. Prokaryotes
641 BacMap Picture atlas of annotated bacterial genomes
614 MetaGrowth Growth requirements of bacterial pathogens
720 PGTdb Prokaryotic growth temperature database
Escherichia coli
415 ASAP A systematic annotation package for community analysis of E.coli and related genomes
428 CyberCell database A collection of data on E.coli K12 intended for mathematical modeling to simulate the bacterial cell
436 coliBase A database for E.coli, Salmonella and Shigella
437 Colibri E.coli genome database at Institut Pasteur
623 EchoBASE Post-genomic studies of Escherichia coli
462 Essential genes in E.coli First results of an E.coli gene deletion project
482 GenoBase E.coli genome database at Nara Institute
165 GenProtEC E.coli K12 genome and proteome database
555 PEC Profiling of E.coli chromosome
108 EcoCyc E.coli K12 genes, metabolic pathways, transporters and gene regulation
69 EcoGene Sequence and literature data on E.coli genes and proteins
116 RegulonDB Transcriptional regulation and operon organization in E.coli
Bacillus subtilis
424 BSORF Bacillus subtilis genome database at Kyoto U.
89 NRSub Non-redundant Bacillus subtilis database at U. Lyon
96 SubtiList Bacillus subtilis genome database at Institut Pasteur
Other bacteria
420 BioCyc Pathway/genome databases for many bacteria
426 CampyDB Database for Campylobacter genome analysis
433 ClostriDB Finished and unfinished genomes of Clostridium spp.
648 CIDB Chlamydia Interactive Database: gene expression data
68 CyanoBase Cyanobacterial genomes
521 LeptoList Leptospira interrogans genome
534 MolliGen Genomic data on mollicutes
733 PseudoCAP Pseudomonas aeruginosa genome database and community annotation project
94 RsGDB Rhodobacter sphaeroides genome
762 VirFact Bacterial virulence factors and pathogenicity islands
760 Virulence Factors Reference database for microbial virulence factors
5.3.3. Unicellular eukaryotes
409 ApiEST-DB EST sequences from various Apicomplexan parasites
439 CryptoDB Cryptosporidium parvum genome database
662 Diatom EST Database ESTs from two diatom algae, Thalassiosira pseudonana and Phaeodactylum tricornutum
446 DictyBase Universal resource for Dictyostelium discoideum
72 Full-Malaria Full-length cDNA library from erythrocytic-stage Plasmodium falciparum
328 GeneDB Curated database for various Sanger-sequenced genomes
698 LumbriBASE ESTs of the earthworm Lumbricus rubellus
91 PlasmoDB Plasmodium genome database
586 TcruziDB Trypanosoma cruzi genome database
359 ToxoDB Toxoplasma gondii genome database
5.3.4. Fungi
Yeasts
635 AGD Ashbya gossypii genome database
617 CandidaDB Candida albicans genome database
645 Candida Genome Candida albicans genome database
441 CYGD MIPS Comprehensive yeast genome database
483 Génolevures A comparison of S.cerevisiae and 14 other yeast species
730 PROPHECY Profiling of phenotypic characteristics in yeast
576 SCMD Saccharomyces cerevisiae morphological database: micrographs of budding yeast mutants
577 SCPD Saccharomyces cerevisiae promoter database
357 SGD Saccharomyces genome database
25 TRIPLES Transposon-insertion phenotypes, localization and expression in Saccharomyces
306 YDPM Yeast deletion project and mitochondria database
342 Yeast Intron Database Ares lab database of spliceosomal introns in S.cerevisiae
254 Yeast snoRNA Database Yeast small nucleolar RNAs
307 yMGV Yeast microarray global viewer
763 YRC PDR Yeast resource center public data repository
Other fungi
425 CADRE Central Aspergillus data repository
435 COGEME Phytopathogenic fungi and oomycete EST database
533 MNCDB MIPS Neurospora crassa database
708 OGD Oomycete Genomics Database: ESTs and annotation
98 Phytophthora Functional Genomics Database ESTs and expression data from P.infestans and P.sojae
5.3.5. Invertebrates
Caenorhabditis elegans
430 C.elegans Project Genome sequencing data at the Sanger Institute
238 Intronerator Introns and splicing in C.elegans and C.briggsae
570 RNAiDB RNAi phenotypic analysis of C.elegans genes
100 WILMA C.elegans annotation database
304 WorfDB C.elegans ORFeome
51 WormBase Data repository for C.elegans and C.briggsae: curated genome annotation, genetic and physical maps, pathways
Drosophila melanogaster
71 FlyBase Drosophila sequences and genomic information
767 FlyBrain Database of the Drosophila nervous system
670 FlyMine Integration of insect genomic and proteomic data
467 FlyTrap Drosophila mutants created using GFP protein trap strategy
471 GadFly Genome annotation database of Drosophila
677 GeniSys Enhancer- and promoter-inserted mutants of Drosophila
774 DPDB Drosophila polymorphism database
449 Drosophila microarray project Data and tools for Drosophila gene expression studies
509 InterActive Fly Drosophila genes and their roles in development
Other invertebrates
410 AppaDB A database on the nematode Pristionchus pacificus
643 BeetleBase Genome database of the beetle Tribolium castaneum
649 Ciliate IES-MDS Db Macro- and micronuclear genes in spirotrichous ciliates
434 CnidBase Cnidarian evolution and gene expression database
543 Parasitic nematode sequencing project
544 NEMBASE Nematode sequence and functional data database
726 PPNEMA Plant-parasitic nematode rRNAs
743 SilkDB Silkworm Bombyx mori ESTs, mutants, photographs
744 SilkSatDb A microsatellite database of the silkworm Bombyx mori
747 SpodoBase Genomics of the moth Spodoptera frugiperda
6. Metabolic Enzymes and Pathways; Signaling Pathways
6.1. Enzymes and Enzyme Nomenclature
421 BRENDA Enzyme names and biochemical properties
109 ENZYME Enzyme nomenclature and properties
459 Enzyme Nomenclature IUBMB Nomenclature Committee recommendations
613 EzCatDB Enzyme Catalytic Mechanism Database
508 IntEnz Integrated enzyme database and enzyme nomenclature
716 PDBrtf Representation of target families of enzymes in PDB
758 SCOPEC Mapping of catalytic function to domain structure
6.2. Metabolic Pathways
644 BioSilico Integrated access to various metabolic databases
112 KEGG Pathway Metabolic and regulatory pathways in complete genomes
114 MetaCyc Metabolic pathways and enzymes from various organisms
115 PathDB Biochemical pathways, compounds and metabolism
117 UM-BBD University of Minnesota biocatalysis and biodegradation database
6.3. Intermolecular Interactions and Signaling Pathways
633 3DID 3D interacting domains: domain–domain interactions in proteins with known 3D structures
405 aMAZE A system for the annotation, management, and analysis of biochemical and signalling pathway networks
103 BIND Biomolecular interaction network database
419 BioCarta Online maps of metabolic and signaling pathways
422 BRITE Biomolecular relations in information transmission and expression, part of KEGG
659 DDIB Database of domain interactions and binding
104 DIP Database of interacting proteins: experimentally determined protein–protein interactions
105 DRC Database of ribosomal crosslinks
329 GeneNet Database on gene network components
664 hp-DPI Database of protein interactions in Helicobacter pylori
688 HPID Human protein interaction database
507 IntAct project Protein–protein interaction data
770 Inter-Chain Beta-Sheets Protein–protein interactions mediated by interchain beta-sheet formation
510 InterDom Putative protein domain interactions
718 PDZBase Protein–protein interactions involving PDZ domains
749 Protein-protein interfaces Interacting residues in protein–protein interfaces in PDB
773 PINdb Proteins interacting in nucleus (human and yeast)
748 POINT Prediction of human protein–protein interactome
616 PSIbase Interaction of proteins with known 3D structures
612 Reactome A knowledgebase of biological pathways
571 ROSPath Reactive oxygen species (ROS) signaling pathway
395 STCDB Signal transductions classification database
582 STRING Predicted functional associations between proteins
341 TRANSPATH Gene regulatory networks and microarray analysis
7. Human and other Vertebrate Genomes
7.1. Model organisms, comparative genomics
63 ACeDB C.elegans, S.pombe and human genomic information
26 AllGenes Human and mouse gene, transcript and protein annotation
65 ArkDB Genome databases for farm and other animals
647 ChickVD Sequence variation in the chicken genome
286 Cre Transgenic Database Cre transgenic mouse lines with links to publications
660 DED Database of evolutionary distances
27 Ensembl Annotated information on eukaryotic genomes
465 FANTOM Functional annotation of mouse full-length cDNA clones
468 FREP Functional repeats in mouse cDNAs
673 GALA Genomic alignment, annotation and experimental results
347 GenetPig Genes controlling economic traits in pig
605 HomoloGene Automatically detected homologous genes in complete eukaryotic genomes
690 Inparanoid A database of eukaryotic orthologs
696 IPI International protein index: non-redundant sets of human, mouse and rat proteins
777 KaryotypeDB Karyotype and chromosome information for animal and plant species
400 KOG Eukaryotic orthologous groups of proteins
87 Mouse Genome Informatics Formerly mouse genome database
540 MTID Mouse transposon insertion database
703 NegProt Negative Proteome: a tool for comparison of complete proteomes
556 PEDE Pig EST data explorer: full-length cDNAs and ESTs
665 PhenomicDB Comparison of phenotypes of orthologous genes in human and model organisms
724 Polymorphix A database of sequence polymorphisms
93 Rat Genome Database Rat genetic and genomic data
625 RatMap Rat genome tools and data
751 TAED The adaptive evolution database: a phylogeny-based tool for comparative genomics
5 TIGR Gene Indices Organism-specific databases of EST and gene sequences
6 UniGene Unified clusters of ESTs and full-length mRNA sequences
319 UniSTS Unified view of sequence tagged sites with mapping data
783 VEGA Vertebrate genome annotation: a repository for manual annotation of finished vertebrate genome sequences
101 ZFIN Zebrafish information network
7.2. Human genome databases, maps and viewers
27 Ensembl Annotated information on eukaryotic genomes
404 AluGene Complete Alu map in the human genome
349 CroW 21 Human chromosome 21 database
55 GB4-RH Genebridge4 human radiation hybrid maps
56 GDB Human genes and genomic maps
57 GenAtlas Human genes, markers and phenotypes
350 GeneCards Integrated database of human genes, maps, proteins and diseases
348 GeneLoc Gene location database (formerly UDB—Unified database for human genome mapping)
327 GeneNest Gene indices of human, mouse, zebrafish, etc.
59 GenMapDB Mapped human BAC clones
35 Gene Resource Locator Alignment of ESTs with finished human sequence
324 HOWDY Human organized whole genome database
60 HuGeMap Human genome genetic and physical map data
77 Human BAC Ends Non-redundant human BAC end sequences
597 Human Genome Segmental Duplication Database Segmental duplications in the human genome
61 IXDB Physical maps of human chromosome X
697 L1Base Functional annotation and prediction of LINE-1 elements
54 Map Viewer Display of genomic information by chromosomal position
600 MGC Mammalian genome collection: full-length ORFs for human, mouse and rat genes
391 NCBI RefSeq Non-redundant collection of naturally occurring biological molecules
553 ParaDB Paralogy mapping in human genomes
62 RHdb Radiation hybrid map data
592 SKY/M-FISH and CGH Fluorescent images of chromosomes and cytogenetic data
4 STACK Sequence tag alignment and consensus knowledgebase
596 The Chromosome 7 Annotation Project A comprehensive description of human chromosome 7
684 TRBase Tandem repeats in the human genome
316 UCSC Genome Browser Genome assemblies and annotation
7.3. Human proteins
685 H-InvDB Full-length human cDNA clones
499 HPMR Human plasma membrane receptome: sequences, literature and expression data
500 HPRD Human protein reference database: domain architecture, post-translational modifications and disease association
37 HUNT Human novel transcripts: annotated full-length cDNAs
171 HUGE Human unidentified gene-encoded large (>50 kDa) protein and cDNA sequences
522 LIFEdb Localization, interaction and functions of human proteins
312 trome, trEST and trGEN Databases of predicted human protein sequences
8. Human Genes and Diseases
8.1. General Databases
661 DG-CST Disease gene conserved sequence tags
683 HCAD Human chromosome aberration database: chromosomal breakpoints and affected genes
8 Homophila Drosophila homologs of human disease genes
548 OMIA Online Mendelian inheritance in animals: a catalog of animal genetic and genomic disorders
143 OMIM Online Mendelian inheritance in man: a catalog of human genetic and genomic disorders
550 ORFDB Collection of ORFs that are sold by Invitrogen
554 PathBase European mutant mice histopathology database: images
146 PMD Compilation of protein mutant data
358 SOURCE Functional genomics resource for human, mouse and rat
8.2. Human Mutations Databases
8.2.1. General polymorphism databases
119 ALFRED Allele frequencies and DNA polymorphisms
416 BayGenomics Genes relevant to cardiovascular and pulmonary disease
654 Cypriot national mutation database Disease mutations in the Cypriot population
655 Database of Genomic Variants Human genomic variants: frequency, segmental duplications and genome assembly gaps
595 dbQSNP Quantification of SNP allele frequencies database
127 dbSNP Database of single nucleotide polymorphisms
669 FESD Functional element SNPs database: SNPs located within promoters, UTRs, etc., of human genes
496 HGVS Databases A compilation of human mutation databases
131 HGVbase Human genome variation database: curated human polymorphisms
133 HGMD Human gene mutation database
367 IPD Immuno polymorphism database
517 JSNP Japanese SNP database
45 rSNP Guide SNPs in regulatory gene regions
344 SNP Consortium database SNP Consortium data
626 SNPeffect Phenotypic effects of human coding SNPs
590 TopoSNP Topographic database of non-synonymous SNPs
755 TPMD Taiwan polymorphic microsatellite marker database
8.2.2. Cancer
122 Atlas of Genetics and Cytogenetics in Oncology and Haematology Cancer-related genes, chromosomal abnormalities in oncology and haematology, and cancer-prone diseases
593 Cancer Chromosomes Cytogenetic, clinical and reference information on cancer-related aberrations
431 CGED Cancer gene expression database
651 COSMIC Catalogue of somatic mutations in cancer: sequence data, samples and publications
126 Germline p53 Mutations Mutations in human tumor and cell line p53 gene
362 IARC TP53 Database Human TP53 somatic and germline mutations
152 MTB Mouse tumor biology database: tumor types, genes, classification, incidence, pathology
709 OncoMine Cancer microarray data by gene or cancer type
153 Oral Cancer Gene Database Cellular and molecular data for genes involved in oral cancer
148 RB1 Gene Mutation DB Mutations in the human retinoblastoma (RB1) gene
574 RTCGD Mouse retroviral tagged cancer gene database
579 SNP500Cancer Re-sequenced SNPs from 102 reference samples
149 SV40 Large T-Antigen Mutants Mutations in SV40 large tumor antigen gene
155 Tumor Gene Family Databases Cellular, molecular and biological data about genes involved in various cancers
8.2.3. Gene-, system- or disease-specific
768 ALPSbase Autoimmune lymphoproliferative syndrome database
120 Androgen Receptor Gene Mutations Database Mutations in the androgen receptor gene
123 BTKbase Mutation registry for X-linked agammaglobulinemia
594 CarpeDB Comprehensive database on the genetics of epilepsy
124 CASRDB Calcium-sensing receptor database: CASR mutations causing hypercalcemia and/or hyperparathyroidism
125 Cytokine Gene Polymorphism in Human Disease Cytokine gene polymorphism literature database
137 Collagen Mutation Database Human type I and type III collagen gene mutations
460 ERGDB Estrogen responsive genes database
164 FUNPEP Low-complexity peptides capable of forming amyloid plaque
363 GOLD.db Genomics of lipid-associated disorders database
129 GRAP Mutants of G-protein coupled receptors of family A
130 HaemB Factor IX gene mutations, insertions and deletions
491 HbVar Human hemoglobin variants and thalassemias
680 HAGR Human ageing genomic resources: genes related to ageing in humans and model organisms
134 Human p53/hprt, rodent lacI/lacZ databases Mutations at the human p53 and hprt genes; rodent transgenic lacI and lacZ mutations
135 Human PAX2 Allelic Variant Database Mutations in human PAX2 gene
136 Human PAX6 Allelic Variant Database Mutations in human PAX6 gene
506 INFEVERS Hereditary inflammatory disorder and familial Mediterranean fever mutation data
139 KinMutBase Disease-causing protein kinase mutations
523 Lowe Syndrome Mutation Database Mutations causing Lowe oculocerebrorenal syndrome
142 NCL Mutation Database Polymorphisms in neuronal ceroid lipofuscinoses genes
144 PAHdb Mutations at the phenylalanine hydroxylase locus
559 PGDB Prostate and prostatic diseases gene database
145 PHEXdb PHEX mutations causing X-linked hypophosphatemia
147 PTCH1 Mutation Database Mutations and SNPs found in PTCH1 gene
SCAdb Spinocerebellar ataxia candidate gene database
632 T1Dbase A resource for type 1 diabetes research
752 The Autism Chromosome Rearrangement Database Curated collection of genomic features related to autism
753 The Lafora Database Mutations and polymorphisms associated with Lafora progressive myoclonus epilepsy
9. Microarray Data and other Gene Expression Databases
634 5'SAGE 5'-end serial analysis of gene expression
338 ArrayExpress Public collection of microarray gene expression data
11 Axeldb Gene expression in Xenopus laevis
12 BodyMap Human and mouse gene expression data
417 BGED Brain gene expression database
432 CleanEx Expression reference database, linking heterogeneous expression data to facilitate cross-dataset comparisons
657 dbERGEII Database of experimental results on gene expression: genomic alignment, annotation and experimental data
454 EICO DB Expression-based imprint candidate organiser: a database for discovery of novel imprinted genes
455 emap Atlas Edinburgh mouse atlas: a digital atlas of mouse embryo development and spatially mapped gene expression
13 EPConDB Endocrine pancreas consortium database
110 EpoDB Genes expressed during human erythropoiesis
14 FlyView Drosophila development and genetics
326 GeneAnnot Revised annotation of Affymetrix human gene probe sets
325 GeneNote Human genes expression profiles in healthy tissues
330 GenePaint Gene expression patterns in the mouse
676 GeneTide A transcriptome-focused member of the GeneCards suite
481 GeneTrap Expression patterns in an embryonic stem library of gene trap insertions
603 GEO Gene expression omnibus: gene expression profiles
485 GermOnline Gene expression in mitotic and meiotic cell cycle
15 GXD Mouse gene expression database
681 H-ANGEL Human anatomic gene expression library
493 HemBase Genes expressed in differentiating human erythroid cells
23 HugeIndex Expression levels of human genes in normal tissues
17 Kidney Development Database Kidney development and gene expression
778 LOLA List of lists annotated: a comparison of gene sets identified in different microarray experiments
18 MAGEST Ascidian (Halocynthia roretzi) gene expression patterns
699 MAMEP Molecular anatomy of the mouse embryo project: gene expression data on mouse embryos
339 MEPD Medaka (freshwater fish Oryzias latipes) gene expression pattern database
19 MethDB DNA methylation data, patterns and profiles
537 Mouse SAGE SAGE libraries from various mouse tissues and cell lines
541 NASCarrays Nottingham Arabidopsis Stock Centre microarray database
545 NetAffx Public Affymetrix probesets and annotations
711 Osteo-Promoter Database Genes in osteogenic proliferation and differentiation
154 PEDB Prostate expression database: ESTs from prostate tissue and cell type-specific cDNA libraries
558 PEPR Public expression profiling resource: expression profiles in a variety of diseases and conditions
21 RECODE Genes using programmed translational recoding in their expression
568 RefExA Reference database for human gene expression analysis
739 rOGED Rat ovarian gene expression database
712 SAGEmap NCBI's resource for SAGE data from various organisms
742 SIEGE Smoking Induced Epithelial Gene Expression
22 Stanford Microarray Database Raw and normalized data from microarray experiments
24 Tooth Development Database Gene expression in dental tissue
10. Proteomics Resources
606 2D-PAGE Proteome database system for microbial research
731 DynaProt 2D Proteome database of Lactococcus lactis
222 GelBank 2D gel electrophoresis patterns of proteins from complete microbial genomes
710 Open Proteomics Database Mass-spectrometry-based proteomics data for human, yeast, E. coli and Mycobacterium
377 PEP Predictions for entire proteomes: summarized analyses of protein sequences
281 RESID Pre-, co- and post-translational protein modifications
225 SWISS-2DPAGE Annotated 2D gel electrophoresis database
11. Other Molecular Biology Databases
11.1. Drugs and drug design
407 ANTIMIC Database of natural antimicrobial peptides
781 AOBase Antisense oligonucleotide selection and design
408 APD Antimicrobial peptide database
423 BSD Biodegradative strain database: microorganisms that can degrade aromatic and other organic compounds
443 DART Drug adverse reaction target database
701 MetaRouter Compounds and pathways related to bioremediation
186 Peptaibol Peptaibol (antibiotic peptide) sequences
392 PharmGKB Pharmacogenomics knowledge base: effect of genetic variation on drug responses
315 TTD Therapeutic target database
11.2. Probes
505 IMGT/PRIMER-DB Immunogenetics oligonucleotide primer database
296 MPDB Synthetic oligonucleotides useful as primers or probes
728 PrimerPCR PCR primers for eukaryotic and prokaryotic genes
390 probeBase rRNA-targeted oligonucleotide probe sequences, DNA microarray layouts and associated information
736 QPPD Quantitative PCR Primer Database for human and mouse
356 RTPrimerDB Real-time PCR primer and probe sequences
11.3. Unclassified databases
298 PubMed Citations and abstracts of biomedical literature
256 BioImage Database of multidimensional biological images
12. Organelle Databases
74 GOBASE Organelle genome database
547 OGRe Organelle genome retrieval system
601 Organelle genomes NCBI's organelle genome resource
722 PLprot Arabidopsis thaliana chloroplast protein database
713 Organelle DB Organelle proteins and subcellular structures
12.1. Mitochondrial Genes and Proteins
637 AMPDB Arabidopsis mitochondrial protein database
686 HMPD Human mitochondrial protein database
38 HvrBase Primate mitochondrial DNA control region sequences
64 Mitochondriome Metazoan mitochondrial genes
83 MitoDat Mitochondrial proteins (predominantly human)
226 MitoDrome Nuclear-encoded mitochondrial proteins of Drosophila
84 MitoMap Human mitochondrial genome
85 MitoNuc Nuclear genes coding for mitochondrial proteins
86 MITOP2 Mitochondrial proteins, genes and diseases
531 MitoPD Yeast mitochondrial protein database
532 MitoProteome Experimentally described human mitochondrial proteins
538 MPIMP Mitochondrial protein import machinery of plants
241 PLMItRNA Plant mitochondrial tRNA
13. Plant Databases
13.1. General plant databases
599 BarleyBase Expression profiling of plant genomes
624 CR-EST Crop ESTs: barley, pea, wheat and potato
67 CropNet Genome mapping in crop plants
128 FLAGdb++ Integrative database about plant genomes
351 GénoPlante-Info Plant genomic data from the Génoplante consortium
488 GrainGenes Genes and phenotypes of wheat, barley, rye, triticale and oats
607 Gramene A resource for comparative grass genomics
81 Mendel Annotated plant ESTs and STSs
581 openSputnik Plant EST clustering and functional annotation
560 PhytoProt Clusters of (predicted) plant proteins
721 PlantMarkers A database of predicted molecular markers from plants
355 PlantGDB Plant genome database: actively transcribed plant genes
189 PLANT-PIs Plant protease inhibitors
371 PlantsP/PlantsT Plant proteins involved in phosphorylation and transport
588 TIGR plant repeat database Classification of repetitive sequences in plant genomes
313 TropGENE DB Genes and genomes of sugarcane, banana, cocoa
13.2. Arabidopsis thaliana
636 AGNS Arabidopsis GeneNet supplementary: gene expression and phenotypes of mutants and transgenic plants
618 AGRIS Arabidopsis gene regulatory information server: promoters, transcription factors and their target genes
780 Arabidopsis MPSS Arabidopsis gene expression detected by massively parallel signature sequencing
638 Arabidopsis Nucleolar Protein Database Comparative analysis of human and Arabidopsis nucleolar proteomes
640 ASRP Arabidopsis thaliana small RNA project
412 ARAMEMNON Arabidopsis thaliana membrane proteins and transporters
765 AthaMap Genome-wide map of putative transcription factor binding sites in Arabidopsis thaliana
427 CATMA Complete Arabidopsis transcriptome microarray
656 DATF Database of Arabidopsis transcription factors
672 GabiPD Central database of the German Plant Genome Project
675 GeneFarm Expert annotation of Arabidopsis gene and protein families
527 MAtDB MIPS Arabidopsis thaliana database
738 RARGE RIKEN Arabidopsis genome encyclopedia: cDNAs, mutants and microarray data
578 SeedGenes Genes essential for Arabidopsis development
97 TAIR The Arabidopsis information resource
627 WAtDB Wageningen Arabidopsis thaliana database: mutants, transgenic lines and natural variants
13.3. Rice
418 BGI-RISe Beijing genomics institute rice information system
79 INE Integrated rice genome explorer
353 IRIS International rice information system
536 MOsDB MIPS Oryza sativa database
90 Oryzabase Rice genetics and genomics
628 Oryza Tag Line database T-DNA insertion mutants of rice
737 RAD Rice annotation database
336 RiceGAAS Rice genome automated annotation system
569 Rice PIPELINE Unification tool for rice databases
572 Rice proteome database Rice proteome database
13.4. Other plants
610 Brassica ASTRA A database for Brassica genomic research
526 MaizeGDB Maize genetics and genomics database
80 LIS (formerly MGI) Legume information server (formerly Medicago genome initiative): ESTs, gene expression and proteomic data
539 MtDB Medicago truncatula genome database
620 PoMaMo Potato Maps and More: Potato genome data
766 SGMD Soybean genomics and microarray database
14. Immunological Databases
642 BCIpep A database of B-cell epitopes
604 dbMHC Genetic and clinical database of the human MHC
150 FIMM Functional molecular immunology data
682 Haptendb Curated database of hapten molecules
779 HLA Ligand/Motif A database and search tool for HLA sequences
501 IL2Rgbase X-linked severe combined immunodeficiency mutations
172 IMGT International immunogenetics information system: immunoglobulins, T cell receptors, MHC and RPI
503 IMGT/Gene-DB Vertebrate immunoglobulin and T cell receptor genes
173 IMGT/HLA Polymorphism of human MHC and related genes
504 IMGT/LIGM-DB Immunoglobulin, T cell receptor and MHC nucleotide sequences from human and other vertebrates
16 Interferon Stimulated Gene Database Genes induced by treatment with interferons
692 IPD-ESTDAB Immunologically characterized melanoma cell lines
693 IPD-HPA Immuno polymorphism of human platelet antigens
694 IPD-KIR Immuno polymorphism of killer-cell Ig-like receptors
691 IPD-MHC Sequences of the major histocompatibility complex
361 JenPep Quantitative binding data for immunological protein–peptide interactions
702 MHCBN A database of MHC binding and non-binding peptides
181 MHCPEP MHC-binding peptides
107 MPID MHC–peptide interaction database
621 VBASE2 Variable genes from the Ig loci of human and mouse

a Each database is shown in the list only once, often in a category that was arbitrarily chosen among two or three appropriate ones. In the online version of this list at the NAR website, a database can be listed under two categories.

b Accession number of the database in the online list; can be used to view the database summary, e.g. shows the summary for DDBJ.