Reconstructing Genome Architectures by End Sequence Profiling: Applications to Tumor Genomes

Ben Raphael1, Pavel Pevzner2, Stas Volik, Colin Collins
1braphael@ucsd.edu, University of California, San Diego; 2ppevzner@cs.ucsd.edu, University of California, San Diego

Although cancer progression is often associated with genome rearrangements, little is known about the detailed architecture of tumor genomes. Efforts to reconstruct the genomic organization of tumor genomes recently led to the development of the End Sequence Profiling (ESP) technique and its application to human MCF7 tumor cells. ESP provides relatively low-cost, high-resolution mapping of rearrangements in tumor cells. Motivated by data from such ESP experiments, we formulate the ESP Genome Reconstruction Problem and develop an algorithm that solves it in the case of sparse ESP data. We apply our algorithm to ESP data from human MCF7 tumor cells and obtain the first reconstruction of the putative architecture of a human MCF7 tumor genome. We use this reconstruction to analyze rearrangements associated with tumor progression. Finally, we describe simulations that elucidate the relationship between the coverage of ESP data and the reliability of the reconstructed genome. The results of these simulations help organize BAC re-sequencing efforts in the ongoing ESP analysis of tumor genomes.