LEARNING OBJECT RESOURCES
LSM2104 MODULE

POPULAR BIOINFORMATICS DATABASES

A biological database is a collection of data organized so that its contents can easily be accessed, managed, and updated. The popular biological databases are listed below with their links.

Nucleotide Databases
GenBank, EMBL and DDBJ are the three major nucleotide databases in the world. They exchange sequence data daily so that the basic sequence information stored in each database is equivalent.

 

NCBI

National Center for Biotechnology Information

  • GenBank is a DNA sequence database from the National Center for Biotechnology Information. It incorporates sequences from publicly available sources.
  • Search Engine - Entrez
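
An Entrez search can also be expressed as a URL. The following is a minimal sketch, assuming the NCBI E-utilities `esearch.fcgi` endpoint and its `db`, `term` and `retmax` parameters; the helper name `build_entrez_search_url` and the example query are illustrative only and are not part of this module. The code only builds the URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# Assumed address of the NCBI E-utilities search endpoint; check the NCBI
# documentation before relying on it.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_entrez_search_url(term, db="nucleotide", retmax=5):
    """Return an Entrez search URL for `term` in database `db` (hypothetical helper)."""
    params = {"db": db, "term": term, "retmax": retmax}
    # urlencode escapes spaces and brackets so the query is safe inside a URL.
    return ESEARCH_URL + "?" + urlencode(params)


url = build_entrez_search_url("BRCA1[Gene Name] AND Homo sapiens[Organism]")
print(url)
```

Opening the printed URL in a browser should return a list of matching record identifiers, which can then be looked up in GenBank.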

 

 

EMBL

    European Molecular Biology Laboratory

  • The EMBL Nucleotide Sequence Database is the main source of DNA and RNA sequences from the EBI (European Bioinformatics Institute).
  • Search Engine - SRS

 

DDBJ

      DNA Data Bank of Japan

  • DDBJ was established in 1986 at the National Institute of Genetics (NIG).
  • It was reorganized as the Center for Information Biology and DNA Data Bank of Japan (CIB/DDBJ) in 2001.

Protein Databases

 

SWISS-PROT

 

  • Swiss-Prot is a protein sequence database maintained by SIB and EBI/EMBL.

  • It provides a high level of annotation, a minimal level of redundancy, and a high level of integration with other databases.

     

 

 

TrEMBL

                

  • TrEMBL contains translations of all coding sequences in the EMBL nucleotide sequence database.
  • It is a computer-annotated supplement to Swiss-Prot.
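
To illustrate what "translation of a coding sequence" means, here is a minimal sketch that translates a DNA coding sequence into a protein sequence. It assumes the standard genetic code and a sequence that starts in frame; the function name `translate_cds` and the example sequence are made up for illustration, and this is not the pipeline TrEMBL itself uses.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(cds):
    """Translate a coding sequence, stopping at the first stop codon."""
    cds = cds.upper().replace("U", "T")  # accept RNA as well as DNA
    protein = []
    # Step through complete codons only; a trailing partial codon is ignored.
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")  # X marks an unknown codon
        if aa == "*":  # stop codon
            break
        protein.append(aa)
    return "".join(protein)


print(translate_cds("ATGGCCAAGTGA"))  # prints MAK
```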

 

PIR

   Protein Information Resource

 

  • PIR contains protein sequences and has been maintained since 1988. It is split into four distinct sections that differ in the quality of the data and the level of annotation.
    • PIR1 - fully classified and annotated entries.
    • PIR2 - preliminary entries, not thoroughly reviewed.
    • PIR3 - unverified entries, not reviewed.
    • PIR4 - conceptual translations.
  • PIR has recently joined forces with EBI (European Bioinformatics Institute) and SIB (Swiss Institute of Bioinformatics) to establish UniProt (Universal Protein Resource), the central resource for protein sequence and function.

 

UniProt

       Universal Protein Resource

  • UniProt is the central access point for extensive curated protein information, including function, classification, and cross-references.

Structure Databases

PDB

Protein Data Bank

 

  • PDB is the single worldwide repository for the processing and distribution of 3-D biological macromolecular structure data.

NDB

      Nucleic Acid Database

 

  • Repository of 3-D structural information about nucleic acids

CCDC/CSD

 Cambridge Crystallographic Data Centre / Cambridge Structural Database

  • The CSD is a computerised database containing comprehensive data for organic and metal-organic compounds studied by X-ray and neutron diffraction.

Protein Domain and Family Database

 

ProDom

 

  • ProDom protein domain database consists of an automatic compilation of homologous domains

 

SMART

Simple Modular Architecture Research Tool

 

  • SMART allows the identification and annotation of genetically mobile domains

 

Pfam

 Protein families database of alignments and HMMs

  • Pfam is a large collection of multiple sequence alignments and hidden Markov models covering many common protein domains and families.

 

PROSITE

 

  • The PROSITE database is based on Swiss-Prot and is therefore very well annotated. Protein families are characterized by the single most conserved motif observed in a multiple sequence alignment of known homologues. These conserved motifs usually relate to biological functions such as active sites or binding sites.
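
A conserved motif of this kind can be searched for with a regular expression. The sketch below assumes the usual PROSITE pattern notation (`x` for any residue, `[...]` for allowed residues, `{...}` for forbidden residues, `(n)` or `(n,m)` for repeats) and handles only that simplified subset. The N-glycosylation pattern is used as the example; the function names and the test sequence are invented for illustration.

```python
import re


def prosite_to_regex(pattern):
    """Convert a simplified PROSITE-style pattern into a Python regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # {P} means "any residue except P".
        element = element.replace("{", "[^").replace("}", "]")
        # x(2,4) means "two to four residues of any kind".
        element = element.replace("(", "{").replace(")", "}")
        element = element.replace("x", ".")
        # < and > anchor the pattern to the start or end of the sequence.
        element = element.replace("<", "^").replace(">", "$")
        parts.append(element)
    return "".join(parts)


def find_motif(pattern, sequence):
    """Return (start index, matched text) for each non-overlapping match."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group()) for m in regex.finditer(sequence)]


# N-glycosylation site pattern applied to a made-up sequence.
print(prosite_to_regex("N-{P}-[ST]-{P}."))
print(find_motif("N-{P}-[ST]-{P}.", "MNGTAANPSA"))
```

The second candidate site in the example sequence (NPSA) is not reported because the pattern forbids proline directly after the asparagine.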

Genome Database

 

Ensembl

Tour of Ensembl

 

  • Ensembl is a joint project between EMBL-EBI and the Sanger Institute to develop a software system that produces and maintains automatic annotation of eukaryotic genomes.

Pathway Databases

 

KEGG

Kyoto Encyclopedia of Genes and Genomes

 

 

  • KEGG is useful for understanding the higher-order functional meaning and utility of a cell or organism from its genome information.

For more information, use the following links.
© Copyright 2004-06 Department of Biochemistry, National University of Singapore. All Rights Reserved.
Last Updated : 2006