Building a Database of Protein Structure Using a Geographic Model based on Topological Consistency

Sung-Hee Park (1), Keun Ho Ryu (2), Byeong-Jin Jeong, Hyeon S. Son
(1) shpark@dblab.chungbuk.ac.kr, Chungbuk National University; (2) khryu@dblab.chungbuk.ac.kr, Chungbuk National University

ABSTRACT. Protein structure data are difficult to analyze and manage because they are very large, extremely complex, multidimensional, and incomplete. Structural features therefore need to be stored in a database that can supply structural information to protein structure analysis applications. Storing protein structures in a database requires a data model that captures characteristics of protein structure, such as spatial arrangements and topological relationships, in order to support structure analysis. We propose modeling protein structure with a geographic model and build a structure database on this model that includes both the thematic information and the geometry of protein structure. In the model, the multilevel geometry of proteins is represented by the spatial types of a spatial network model, and the thematic information comprises physico-chemical, biological, and experimental data. We implemented the structure database using Oracle 8i Spatial and formulated queries that retrieve topological relationships among structural elements with spatial operators. The experimental results evaluate queries with geometric and topological operators, which serve as primitive operations for analyzing protein structures. This study shows that applying spatial concepts to the management of protein structure can efficiently support fast protein structure analysis through spatial operations and filter-refinement processing with a multidimensional index.
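
As an illustration of the filter-refinement processing mentioned in the abstract, the following minimal Python sketch answers a "within distance" query over structural elements in two steps: a cheap bounding-box test (filter) followed by an exact distance computation on the surviving candidates (refinement). It is not the implementation described in this paper, which uses Oracle 8i Spatial. The names Element, boxes_within, min_distance, and within_distance, as well as the coordinates, are hypothetical, and a linear scan over bounding boxes stands in for a real multidimensional index.

```python
from dataclasses import dataclass
from math import dist
from typing import List, Tuple

Point = Tuple[float, float, float]
Box = Tuple[Point, Point]


@dataclass
class Element:
    """Hypothetical structural element, e.g. a helix given by its atom coordinates."""
    name: str
    points: List[Point]

    def bbox(self) -> Box:
        # Axis-aligned minimum bounding box of the element.
        xs, ys, zs = zip(*self.points)
        return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


def boxes_within(a: Box, b: Box, cutoff: float) -> bool:
    # Filter test: the gap between the boxes along every axis is at most cutoff.
    # This is necessary, but not sufficient, for the true distance to be <= cutoff.
    (amin, amax), (bmin, bmax) = a, b
    return all(
        amin[i] - cutoff <= bmax[i] and bmin[i] - cutoff <= amax[i]
        for i in range(3)
    )


def min_distance(a: Element, b: Element) -> float:
    # Refinement test: exact minimum point-to-point distance.
    return min(dist(p, q) for p in a.points for q in b.points)


def within_distance(query: Element, elements: List[Element], cutoff: float) -> List[str]:
    qbox = query.bbox()
    # Filter step: keep only elements whose bounding boxes are close enough.
    candidates = [
        e for e in elements
        if e is not query and boxes_within(qbox, e.bbox(), cutoff)
    ]
    # Refinement step: run the exact test on the candidates only.
    return [e.name for e in candidates if min_distance(query, e) <= cutoff]


# Made-up coordinates, for illustration only.
helix1 = Element("helix1", [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)])
strand1 = Element("strand1", [(3.0, 4.0, 0.0), (4.5, 4.0, 0.0)])
loop1 = Element("loop1", [(7.0, 4.0, 0.0), (8.0, 4.0, 0.0)])
strand2 = Element("strand2", [(20.0, 20.0, 20.0), (21.0, 20.0, 20.0)])

result = within_distance(helix1, [helix1, strand1, loop1, strand2], cutoff=5.0)
print(result)  # ['strand1']
```

In this example strand2 is discarded by the filter alone, while loop1 passes the bounding-box test but is rejected by the exact distance check, which is why the refinement step is needed. In a spatial database the filter step would typically be served by an index such as an R-tree rather than a scan.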