Efficiently finding regulatory elements using correlation with gene expression
Hideo Bannai1, Shunsuke Inenaga2, Ayumi Shinohara, Masayuki Takeda, Satoru Miyano
1bannai@ims.u-tokyo.ac.jp, Human Genome Center, Institute of Medical Science, University of Tokyo; 2s-ine@i.kyushu-u.ac.jp, Department of Informatics, Kyushu University
We present an efficient algorithm for detecting putative regulatory
elements in the upstream DNA sequences of genes, using gene expression
information obtained from microarray experiments.
Based on a generalized suffix tree, our algorithm looks for
motif patterns whose appearance in the upstream region is most
correlated with the expression levels of the genes.
We are able to find the optimal pattern in time linear in the total
length of the upstream sequences.
We implement our algorithm and apply it to publicly available
microarray gene expression data, and show that our method is able to
discover biologically significant motifs, including various motifs
that have been reported previously using the same dataset.
We further discuss applications for which the efficiency of the method
is essential.
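
To make the search problem concrete, the following is a minimal brute-force sketch, not the suffix-tree algorithm described above, and it does not run in linear time. It enumerates every substring up to a fixed length and scores each by how well its presence in an upstream sequence separates the genes by expression level. The scoring function (absolute difference in mean expression between genes with and without the pattern), the length cap `max_len`, and the names `substrings` and `best_pattern` are all assumptions made for illustration; the abstract does not specify them.

```python
from statistics import mean


def substrings(seq, max_len):
    """Return the set of all substrings of seq with length 1..max_len."""
    return {seq[i:i + k]
            for k in range(1, max_len + 1)
            for i in range(len(seq) - k + 1)}


def best_pattern(upstreams, expression, max_len=6):
    """Brute-force search for the substring whose presence best separates genes.

    upstreams  -- list of upstream DNA sequences, one per gene
    expression -- list of expression levels, aligned with upstreams
    The score is an assumed stand-in: |mean(with pattern) - mean(without)|.
    Returns (pattern, score), or (None, 0.0) if no pattern splits the genes.
    """
    per_gene = [substrings(s, max_len) for s in upstreams]
    candidates = set().union(*per_gene)
    best = (None, 0.0)
    # Sorted iteration makes tie-breaking deterministic.
    for p in sorted(candidates):
        with_p = [e for subs, e in zip(per_gene, expression) if p in subs]
        without_p = [e for subs, e in zip(per_gene, expression) if p not in subs]
        if not with_p or not without_p:
            continue  # pattern occurs in all genes or in none; no split
        score = abs(mean(with_p) - mean(without_p))
        if score > best[1]:
            best = (p, score)
    return best
```

For example, calling `best_pattern(["ACGTAA", "TTACGT", "GGGGGG", "CCTTCC"], [2.0, 2.2, -1.0, -1.2])` returns a pattern that occurs only in the first two (highly expressed) sequences. A generalized suffix tree avoids the redundant enumeration performed here, because substrings that occur in exactly the same set of sequences necessarily receive the same score.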