Module Networks: Identifying Regulatory Modules and their Condition Specific Regulators from Gene Expression Data

Eran Segal1, Aviv Regev2, Dana Pe'er, Michael Shapira, David Botstein, Daphne Koller, Nir Friedman
1eran@cs.stanford.edu, Stanford; 2ARegev@CGR.Harvard.edu, CGR

A cell is a complex system that performs multiple functions and responds to a variety of signals. This activity is coordinated by the organization of the cell into a network of interacting functional modules: sets of genes that are co-regulated to respond to different conditions. In this paper, we present the module networks procedure, a novel, fully automated, probabilistic method for discovering regulatory modules based on gene expression data. The method identifies the genes composing each module, the regulators controlling their expression, and the conditions under which regulation occurs. We applied this method to 173 gene expression arrays (Gasch et al.) measuring the response of Saccharomyces cerevisiae to various stress conditions. We validated the results using gene annotations and patterns of cis-regulatory motifs, demonstrating the method's ability to identify highly coherent modules and their correct regulators. The method also suggests testable novel hypotheses about gene regulation in the form "regulator 'X' regulates process 'Y' under conditions 'W'". To evaluate the accuracy of our predictions, we performed microarray experiments to test three of our method's hypotheses. In all cases, the experimental results support our predictions, and suggest regulatory roles for previously uncharacterized proteins, including a putative transcription factor and two putative signal transduction molecules.
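
To make the hypothesis form "regulator 'X' regulates process 'Y' under conditions 'W'" concrete, the following is a minimal sketch of how such a hypothesis might be represented and compared against expression data. It is an illustration only and not the module networks procedure itself: the names `Hypothesis` and `module_shift`, the gene and condition labels, and all expression values are hypothetical toy inputs, and the contrast computed (mean module expression inside versus outside the named conditions) is a simple stand-in for a proper statistical test.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Hypothesis:
    """Hypothetical container for "regulator X regulates process Y under conditions W"."""
    regulator: str         # X: the proposed regulator
    process: str           # Y: the process carried out by the module's genes
    conditions: frozenset  # W: the conditions under which regulation is proposed


def module_shift(expr, module_genes, hyp):
    """Mean module expression inside hyp.conditions minus the mean outside them.

    expr maps each condition name to a dict of gene -> expression level.
    """
    inside, outside = [], []
    for condition, profile in expr.items():
        level = mean(profile[g] for g in module_genes)
        (inside if condition in hyp.conditions else outside).append(level)
    return mean(inside) - mean(outside)


# Toy data (made up for illustration; not taken from any real array).
expr = {
    "stress_a":  {"g1": 2.0, "g2": 1.0},
    "stress_b":  {"g1": 1.0, "g2": 2.0},
    "control_a": {"g1": 0.0, "g2": 0.5},
    "control_b": {"g1": 0.5, "g2": 0.0},
}
module_genes = ["g1", "g2"]
hyp = Hypothesis(
    regulator="REG_X",
    process="process_Y",
    conditions=frozenset({"stress_a", "stress_b"}),
)

shift = module_shift(expr, module_genes, hyp)
print(f"{hyp.regulator} / {hyp.process}: module shift = {shift}")
```

A large shift in this toy contrast would be consistent with the module responding under the named conditions, but it says nothing by itself about whether the named regulator is responsible; establishing that requires the kind of experimental test described above.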