Database of Three-Dimensional Exon Structures, SEDB

Chesley Leslin1, Valentin Ilyin2, Alex Abyzov, Grigory Makarevich
1chesley.leslin@verizon.net, Theoretical Molecular Biology and Bioinformatics Lab, Northeastern University, Boston, Massachusetts; 2valentin.ilyin@verizon.net, Theoretical Molecular Biology and Bioinformatics Lab, Northeastern University, Boston, Massachusetts

Expeditious mapping of exon borders and intron phase data onto structurally similar proteins presents a novel approach to studying ancient exons, possible alternative splice variants, and phylogeny using exon/intron structure characteristics. According to the Exon Theory of Genes (Gilbert 1997), a correlation should be found between exon/intron boundaries and protein modules, a prediction backed by recent findings (Fedorov et al. 2001). Conversely, Stoltzfus et al. (1994) reported no correlation between exon/intron boundaries and the encoded protein structure, and the question remains unresolved. Ongoing research efforts have focused on sequence analysis to determine where introns are located and whether those positions fall in regions that would divide the protein into individual modules. We present SEDB (Structural Exon Database) to analyze the exon/intron boundaries in structurally related proteins. These proteins need not be close in sequence, but have been shown to share >20% sequence identity, so finding conserved exon/intron boundaries or exons in structurally related proteins would strengthen the concept of exon shuffling, the cornerstone of the Exon Theory of Genes. SEDB uses information from the CE database (Shindyalov et al. 1998) and EID (Gilbert 1998) to dynamically map the position and phase of introns onto proteins related by structure. SEDB is a public MySQL database that generates these data sets, which are in turn analyzed by Friend (Front-end Bioinformatics Tool; Abyzov 2002) for complete structure and sequence analysis. SEDB can be found at http://glinka.bio.neu.edu/~cleslin/SEDB/SEDB.html
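The mapping of intron position and phase onto a protein sequence can be illustrated with a minimal sketch. This is not SEDB's implementation or schema. It assumes only the standard definition of intron phase (the number of nucleotides of a codon that precede the intron) and takes as input the coding length of each exon in order. The names `IntronSite` and `map_introns` are hypothetical and chosen for illustration.

```python
from typing import List, NamedTuple


class IntronSite(NamedTuple):
    """Hypothetical record: where an intron falls on the protein sequence."""
    residue: int  # 1-based residue the intron follows (phase 0) or interrupts (phase 1 or 2)
    phase: int    # 0, 1 or 2


def map_introns(exon_cds_lengths: List[int]) -> List[IntronSite]:
    """Map each intron to a residue index and phase.

    exon_cds_lengths holds the number of coding nucleotides in each exon,
    in transcript order (untranslated regions excluded; an assumption of
    this sketch).
    """
    sites = []
    coding_nt = 0
    # An intron follows every exon except the last one.
    for length in exon_cds_lengths[:-1]:
        coding_nt += length
        phase = coding_nt % 3
        # Phase 0: the intron lies after the last complete codon.
        # Phase 1 or 2: the intron splits the codon of the next residue.
        residue = coding_nt // 3 + (1 if phase else 0)
        sites.append(IntronSite(residue, phase))
    return sites


if __name__ == "__main__":
    # Three exons carrying 90, 61 and 50 coding nucleotides (67 codons in total).
    for site in map_introns([90, 61, 50]):
        print(site)
```

Once intron sites are expressed as residue indices, they can be compared across proteins through a structural alignment, which is the kind of comparison the abstract describes.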