Version Management of a Genomic Sequence Database Using Active Rules and Temporal Concepts

Sung-Hee Park1, Keun Ho Ryu2, Byeong-Jin Jeong, Hyeon S. Son
1shpark@dblab.chungbuk.ac.kr, Chungbuk National University; 2khryu@dblab.chungbuk.ac.kr, Chungbuk National University

Sequences derived from the same piece of DNA often change over time. As the volume of sequence databases has grown tremendously, such changes have arisen from biological mutation over evolutionary history, variation in experimental conditions, measurement uncertainty, and unknown flaws in the algorithms used in the genome sequencing process. It is therefore not certain that the most recently updated sequences in a database are the accurate ones. Consequently, one emerging issue in managing sequence databases is handling sequence changes over time in order to facilitate studies of mutation and evolutionary relationships. In this respect, sequence identifiers in existing sequence databases have been replaced by stable identifiers that can identify changed sequence entries. However, existing sequence databases cannot retrieve past sequence changes; they support searching only snapshots of the current sequences. In this paper we therefore propose a model of sequence versions for changes to the same piece of DNA, using a timestamp attribute from a temporal data model, together with a mechanism for managing sequence versions in a sequence database by applying trigger rules (Event-Condition-Action) in an active database system. In the model, each sequence version tuple carries timestamp attributes representing transaction time, which indicate the period during which the version exists in the sequence database. The sequence versions are represented in XML. The management mechanism comprises algorithms for detecting and creating a new sequence version and for deleting versions. The mechanism is executed automatically by active rules, implemented on the Oracle DBMS, which are triggered whenever a database update operation on a sequence occurs.
This study shows that the proposed approach can retrieve the history of sequence changes by applying a temporal model to manage valid genomic sequence versions, and that active rules facilitate version management operations.
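As a rough illustration of the idea, the following minimal sketch combines transaction-time stamping with Event-Condition-Action triggers. It is not the implementation described in this paper: SQLite (through Python's standard `sqlite3` module) stands in for the Oracle DBMS, the XML representation is omitted, and the table names, column names (`accession`, `residues`, `tt_start`, `tt_end`) and helper functions are hypothetical.

```python
import sqlite3

# SQL expression for the current transaction time (millisecond precision).
NOW = "strftime('%Y-%m-%d %H:%M:%f', 'now')"

# Hypothetical schema: `sequence` holds the current entries, while
# `sequence_version` keeps every version with its transaction-time interval.
SCHEMA = f"""
CREATE TABLE sequence (
    accession TEXT PRIMARY KEY,
    residues  TEXT NOT NULL
);
CREATE TABLE sequence_version (
    accession TEXT    NOT NULL,
    version   INTEGER NOT NULL,
    residues  TEXT    NOT NULL,
    tt_start  TEXT    NOT NULL,
    tt_end    TEXT,              -- NULL marks the current version
    PRIMARY KEY (accession, version)
);

-- Event: INSERT. Action: open a new version.
CREATE TRIGGER seq_insert AFTER INSERT ON sequence
BEGIN
    INSERT INTO sequence_version
    SELECT NEW.accession, COALESCE(MAX(version), 0) + 1, NEW.residues, {NOW}, NULL
    FROM sequence_version WHERE accession = NEW.accession;
END;

-- Event: UPDATE. Condition: the residues changed.
-- Action: close the current version, then open the next one.
CREATE TRIGGER seq_update AFTER UPDATE OF residues ON sequence
WHEN NEW.residues <> OLD.residues
BEGIN
    UPDATE sequence_version SET tt_end = {NOW}
    WHERE accession = OLD.accession AND tt_end IS NULL;
    INSERT INTO sequence_version
    SELECT NEW.accession, COALESCE(MAX(version), 0) + 1, NEW.residues, {NOW}, NULL
    FROM sequence_version WHERE accession = NEW.accession;
END;

-- Event: DELETE. Action: close the current version but keep the history.
CREATE TRIGGER seq_delete AFTER DELETE ON sequence
BEGIN
    UPDATE sequence_version SET tt_end = {NOW}
    WHERE accession = OLD.accession AND tt_end IS NULL;
END;
"""


def open_db():
    """Create an in-memory database with the versioning triggers installed."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def history(conn, accession):
    """Return (version, residues, is_current) for every stored version."""
    rows = conn.execute(
        "SELECT version, residues, tt_end IS NULL FROM sequence_version "
        "WHERE accession = ? ORDER BY version",
        (accession,),
    )
    return [(v, r, bool(cur)) for v, r, cur in rows]
```

In this sketch the application only issues ordinary `INSERT`, `UPDATE`, and `DELETE` statements against `sequence`; the triggers maintain `sequence_version`, so `history(conn, accession)` can return past versions even after the current entry has been changed or deleted.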