While confirming that this computational task would be difficult, these types of high-throughput experiments still provide valuable information that can be used to assist the computational identification of binding sites and the prediction of interactions through sequence analysis. With this goal in mind, we have devised a statistical model that incorporates experimental interaction data and sequence data and that utilizes informed priors, mixture models, and discrimination techniques. We use a Gibbs sampling algorithm to train the model on an experimentally derived yeast two-hybrid SH3 binding domain interaction network (Tong et al., 2001). Our results reveal that the observed interaction networks can, to a large degree, be explained by our model, and we directly compare the results of our algorithm to a similar network derived via combinatorial chemistry methods (Tong et al., 2001). The performance of our method as evaluated by cross-validation is similar to that obtained by the phage display experiments.
Our technique simultaneously identifies the specific peptide motifs that show high affinity to each domain, as well as the most likely binding sites on the domain ligands. Once trained on a set of sequences and interactions, the resulting model may be used to derive statistically-robust estimates of the likelihood of binding between domains and new potential interactors. Such predictions may be used to further expand the networks of interactions, to suggest potential drug targets, and to more specifically direct further experiments, such as co-immunoprecipitation or protein mutagenesis.
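As a rough illustration of what locating the most likely binding site on a ligand can look like once a motif is in hand, the sketch below scans a sequence with a fixed position weight matrix and reports the best-scoring window. It is not the model described here: it leaves out the Gibbs sampler, the mixture components, and the informed priors, and covers only the scoring step. The motif string `PxxP`, the weight of 0.8, the uniform background, the example sequence, and the helper names `make_pwm`, `window_score`, and `best_site` are all assumptions made for the example.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
BACKGROUND = 1.0 / len(AMINO_ACIDS)  # assumed uniform background frequency


def make_pwm(pattern, weight=0.8):
    """Build a toy position weight matrix from a pattern such as 'PxxP'.

    A letter gives that residue probability `weight` at its position, with
    the remainder spread evenly over the other residues; 'x' gives a
    uniform (uninformative) column.
    """
    rest = (1.0 - weight) / (len(AMINO_ACIDS) - 1)
    pwm = []
    for symbol in pattern:
        if symbol == "x":
            column = {aa: BACKGROUND for aa in AMINO_ACIDS}
        else:
            column = {aa: (weight if aa == symbol else rest) for aa in AMINO_ACIDS}
        pwm.append(column)
    return pwm


def window_score(pwm, window):
    """Log-odds of a window under the motif versus the background."""
    return sum(math.log(col[aa] / BACKGROUND) for col, aa in zip(pwm, window))


def best_site(pwm, sequence):
    """Return (start, score) of the highest-scoring window in `sequence`."""
    width = len(pwm)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the motif")
    starts = range(len(sequence) - width + 1)
    scores = [window_score(pwm, sequence[i:i + width]) for i in starts]
    start = max(starts, key=lambda i: scores[i])  # first maximum on ties
    return start, scores[start]


pwm = make_pwm("PxxP")
start, score = best_site(pwm, "MKTAPPLPARGS")
print(start, round(score, 3))
```

In a trained model of the kind described above, the matrix would be estimated from the interaction and sequence data rather than written by hand, and the window scores would feed into a probability of binding rather than being read off directly.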