Visualizing and Exploring Linked Functional Genomic Data Sets in YETI: Yeast Exploration Tool Integrator

Richard J. Orton1, William I. Sellers2, Dietlind L. Gerloff
1Richard.Orton@ed.ac.uk, University of Edinburgh; 2W.I.Sellers@lboro.ac.uk, University of Loughborough

We introduce the Yeast Exploration Tool Integrator (YETI), a novel bioinformatics tool for the integrated visualization and analysis of S. cerevisiae (budding yeast) functional genomic data sets. YETI consists of a relational database (MySQL) for the storage and management of data and a standalone program (Java) for visualization and analysis. One of YETI's strengths is that it combines user-friendliness for wet-laboratory biologists, who will explore the data using YETI as a "workbench", with a solid database structure that enables bioinformaticians to carry out collective analyses of the data sets at the desired level of detail. Its design also takes continuance, platform transferability, and expandability into consideration.

The prototype version, YETI 1.0, consists of three sections:

  1. A Genome Section for the informative display of the S. cerevisiae genome, its chromosomes and data on all its genes;
  2. A Proteome Section for the effective visualization of potential protein-protein interactions as inferred from yeast two-hybrid screens; and
  3. An Analysis Section for database searching using complex queries.

All sections are fully inter-linked and provide access to textual information, including Gene Ontology (GO) annotations, at any point.
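
To illustrate what inter-linked sections and complex queries could look like in practice, the following is a minimal Java sketch that joins gene records (chromosome and GO terms) with protein-protein interactions in a single query. It is an assumption-laden illustration, not YETI's actual code or schema: the class LinkedDataSketch, its methods, and the placeholder identifiers (GENE_A, GO:term1, and so on) are all hypothetical, and an in-memory map stands in for the MySQL database.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative only: a hypothetical in-memory model, not YETI's actual schema or classes. */
final class LinkedDataSketch {

    /** A gene record: identifier, chromosome number, and GO annotation terms. */
    static final class Gene {
        final String id;
        final int chromosome;
        final Set<String> goTerms;

        Gene(String id, int chromosome, Set<String> goTerms) {
            this.id = id;
            this.chromosome = chromosome;
            this.goTerms = goTerms;
        }
    }

    private final Map<String, Gene> genesById = new HashMap<>();
    private final Map<String, Set<String>> partnersById = new HashMap<>();

    /** Registers a gene with its chromosome and any number of GO terms. */
    void addGene(String id, int chromosome, String... goTerms) {
        genesById.put(id, new Gene(id, chromosome, new TreeSet<>(Arrays.asList(goTerms))));
    }

    /** Records an undirected interaction, e.g. one inferred from a two-hybrid screen. */
    void addInteraction(String idA, String idB) {
        partnersById.computeIfAbsent(idA, k -> new TreeSet<>()).add(idB);
        partnersById.computeIfAbsent(idB, k -> new TreeSet<>()).add(idA);
    }

    /** Combined query: partners of a gene that lie on a given chromosome and carry a given GO term. */
    List<String> partnersMatching(String id, int chromosome, String goTerm) {
        List<String> hits = new ArrayList<>();
        for (String partnerId : partnersById.getOrDefault(id, new TreeSet<>())) {
            Gene partner = genesById.get(partnerId);
            if (partner != null
                    && partner.chromosome == chromosome
                    && partner.goTerms.contains(goTerm)) {
                hits.add(partnerId);
            }
        }
        return hits; // ordered by identifier, because partners are kept in a TreeSet
    }

    public static void main(String[] args) {
        LinkedDataSketch db = new LinkedDataSketch();
        // Placeholder data, invented purely for the example.
        db.addGene("GENE_A", 1, "GO:term1");
        db.addGene("GENE_B", 2, "GO:term1", "GO:term2");
        db.addGene("GENE_C", 2, "GO:term2");
        db.addInteraction("GENE_A", "GENE_B");
        db.addInteraction("GENE_A", "GENE_C");
        System.out.println(db.partnersMatching("GENE_A", 2, "GO:term1")); // prints [GENE_B]
    }
}
```

In a database-backed setting the same question would more plausibly be expressed as a join across gene, annotation, and interaction tables; the sketch only shows how linking the data sets lets one query span the genome, proteome, and annotation views.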

Two new sections are currently under development: a Transcriptome Section for the visualization and analysis of gene expression microarray data, and a Cell Section providing a sub-cellular structural context and access to metabolic pathways.

YETI is freely available to academics, either for use over the WWW or for download under license at www.bru.ed.ac.uk/~orton/yeti.html.

Although YETI's design is specifically aimed at the yeast S. cerevisiae, the framework of the program could be applied to other organisms with relative ease. The only prerequisite is that similar data sets are available, i.e. a completely sequenced genome, a database of gene/protein annotations, and experimental functional genomics data sets such as protein-protein interactions.