The RNA Abundance Database and its Annotation Web-Forms.

Elisabetta Manduchi¹, G.R. Grant, Hongxian He, J. Liu, M.D. Mailman, A. Pizarro, P.L. Whetzel, C.J. Stoeckert Jr.
¹ manduchi@pcbi.upenn.edu, Center for Bioinformatics, University of Pennsylvania

Gene expression array technology has become increasingly widespread among researchers who recognize its promise. At the same time, bench biologists and bioinformaticians have come to appreciate the importance of collaborating closely from the outset of a study and of using a structured mechanism to collect and exchange detailed information on the many experimental and computational procedures involved. Such information is crucial for adequate analysis of this kind of data. The RNA Abundance Database (RAD: http://www.cbil.upenn.edu/RAD3) provides a comprehensive MIAME-supportive infrastructure for gene expression data management and makes extensive use of ontologies. Specific details on protocols, biomaterials, study designs, etc., are collected through a user-friendly suite of web annotation forms. Software has been developed to generate MAGE-ML documents, enabling easy export of studies stored in RAD to any other database that accepts data in this format (e.g. the public repository ArrayExpress). RAD is part of the more general Genomics Unified Schema (www.gusdb.org), which includes a richly annotated gene index (www.allgenes.org), thus providing a platform that integrates genomic and transcriptomic data from multiple organisms. This infrastructure enables a large variety of queries that integrate visualization and analysis tools and can be tailored to serve the specific needs of projects focusing on particular organisms or systems. Examples are PlasmoDB (www.plasmodb.org), a genomic database for the malaria-causing parasite Plasmodium falciparum, and EPConDB (www.cbil.upenn.edu/EPConDB/index.shtml), the Endocrine Pancreas Consortium website.