Automatic generation of cell-wide pathway model from complete genome
Kazuharu Arakawa1, Yohei Yamada2, Hiromi Komai, Kosaku Shinoda, Yoichi Nakayama, Masaru Tomita
1gaou@g-language.org, Institute for Advanced Biosciences, Keio University; 2skipper@g-language.org, Keio University
Knowledge in molecular biology is rapidly accumulating in genome,
transcriptome, proteome, and metabolome research, creating demand for a
systems biology approach that views the dynamic behavior of a cell as a
complex system. However, simulation is a challenging task, especially
where large-scale modeling is required, because a vast number of
accurate parameters is needed. The E-Cell project estimated the cost of
modeling a whole Escherichia coli cell to be at least 1800 man-months,
based on its experience in modeling an in silico mitochondrion.
Large-scale in silico modeling of the cell therefore demands a novel
high-throughput approach. If successfully integrated, the large amounts
of available genome sequences, transcript and expression data, enzyme
reaction data, metabolic pathway maps, and data on cellular metabolites
will form a strong base for a qualitative cell model.
The Genome-based E-cell Modeling System (GEM System), developed on the
G-language Genome Analysis Environment, performs a fully automatic
conversion of genome sequence data into a qualitative in silico cell
model, linking information from major public databases such as GenBank,
EMBL, SWISS-PROT, KEGG, ARM, BRENDA, and WIT. ORFs are matched to their
corresponding proteins, and thus to stoichiometric reactions, through a
combined method using annotation, homology, and orthology. The generated
reaction network is then checked against KEGG reference pathway maps for
false positives and false negatives and, where applicable, for pathway
connectivity.
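The mapping and checking steps described above can be illustrated with a
minimal sketch. It is not the GEM System implementation: the ORF names,
the lookup tables, the toy reference set, and all function names are
hypothetical, the reactions are simplified to one substrate and one
product (cofactors omitted), and only the annotation route (ORF to EC
number) is shown, not homology or orthology.

```python
# Hypothetical toy data; real input would come from the public databases.
ORF_TO_EC = {"orf001": "2.7.1.1", "orf002": "5.3.1.9", "orf003": "1.1.1.1"}

# EC number -> (substrate, product), cofactors omitted for brevity.
EC_TO_REACTION = {
    "2.7.1.1": ("glucose", "glucose-6-phosphate"),
    "5.3.1.9": ("glucose-6-phosphate", "fructose-6-phosphate"),
    "2.7.1.11": ("fructose-6-phosphate", "fructose-1,6-bisphosphate"),
    "1.1.1.1": ("ethanol", "acetaldehyde"),
}

# Toy stand-in for the EC numbers listed on one reference pathway map.
REFERENCE_ECS = {"2.7.1.1", "5.3.1.9", "2.7.1.11"}


def predict_reactions(orf_to_ec, ec_to_reaction):
    """Map each annotated ORF to a reaction via its EC number."""
    return {ec: ec_to_reaction[ec]
            for ec in orf_to_ec.values() if ec in ec_to_reaction}


def compare_with_reference(predicted_ecs, reference_ecs):
    """Return (extra, missing): candidates for false positives and negatives."""
    predicted, reference = set(predicted_ecs), set(reference_ecs)
    return predicted - reference, reference - predicted


def connected_components(reactions):
    """Group metabolites that are linked by at least one reaction."""
    neighbours = {}
    for substrate, product in reactions:
        neighbours.setdefault(substrate, set()).add(product)
        neighbours.setdefault(product, set()).add(substrate)
    seen, components = set(), []
    for start in neighbours:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first traversal from an unvisited metabolite
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        components.append(component)
    return components


predicted = predict_reactions(ORF_TO_EC, EC_TO_REACTION)
extra, missing = compare_with_reference(predicted, REFERENCE_ECS)
components = connected_components(predicted.values())
print("extra:", sorted(extra), "missing:", sorted(missing),
      "components:", len(components))
```

Reactions flagged as extra or missing are only candidates for review,
and more than one connected component hints at a gap in the pathway.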
In the hybrid algorithm of dynamic and static simulation methods, once
all rate-limiting reactions are represented dynamically, every other
reaction can be represented statically with the same accuracy as a
completely dynamic simulation. The GEM System can semi-automatically
generate the static part of this hybrid model and provides a base for
large-scale dynamic simulation. The generated models can be simulated
directly using E-Cell, and porting to SBML is under development.
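The idea behind the hybrid representation can be sketched on a toy
linear pathway S -> A -> B -> P. This is an illustration under stated
assumptions, not the algorithm used by E-Cell or the GEM System: the
first step is taken to be the only rate-limiting reaction and is given a
Michaelis-Menten rate law, the intermediates A and B are assumed to be
at quasi-steady state so that the static fluxes follow from mass balance
alone, and all names and parameter values are hypothetical.

```python
def michaelis_menten(substrate, vmax, km):
    """Rate law for the dynamic (rate-limiting) reaction S -> A."""
    return vmax * substrate / (km + substrate)


def simulate_hybrid(s0, vmax, km, dt, steps):
    """Euler integration of S and P with a dynamic first step.

    Assumption: A and B stay at quasi-steady state, so the static
    reactions A -> B and B -> P carry the same flux as S -> A and need
    no kinetic parameters of their own.
    """
    s, p = s0, 0.0
    for _ in range(steps):
        v1 = michaelis_menten(s, vmax, km)  # dynamic: needs vmax and km
        v2 = v1  # static: balance on A gives v2 = v1
        v3 = v2  # static: balance on B gives v3 = v2
        s -= v1 * dt
        p += v3 * dt
    return s, p


# Hypothetical parameter values chosen only for illustration.
s_final, p_final = simulate_hybrid(s0=10.0, vmax=1.0, km=0.5,
                                   dt=0.01, steps=1000)
print(s_final, p_final)
```

Only the dynamic step requires measured kinetic parameters, which is why
a static part generated from stoichiometry can reduce the modeling cost.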