Bioinformatics Journal
APBC2003
APBC2004
SIGSIM: Systems Biology of E.coli

Poster Abstracts

Data Mining
Data Visualisation
Databases
Functional Genomics
Genome Annotation
Microarrays
New Frontiers
Phylogeny and Evolution
Predictive Methods
Sequence Comparison
Structural Biology
Systems Biology



Data Mining
B-1  Biclustering Microarray Data by Gibbs Sampling
Qizheng Sheng1, Yves Moreau2, Bart De Moor
1qizheng.sheng@esat.kuleuven.ac.be, Katholieke Universiteit Leuven, Department of Electrical Engineering; 2yves.moreau@esat.kuleuven.ac.be, Katholieke Universiteit Leuven, Department of Electrical Engineering
Correspondence address: qizheng.sheng@esat.kuleuven.ac.be

We have adapted the Gibbs sampling strategy, which has become a method of choice for the discovery of motifs in DNA and protein sequences, to the biclustering of discretized microarray data. In contrast with standard clustering, biclustering reveals similar expression behavior of genes over a subset of conditions in a microarray data set.


B-2  Hidden Multivariate Markov Models for Pattern Recognition in Genomic DNA Sequences
Leo Wang-Kit Cheung1
1lcheung@crch.hawaii.edu, Cancer Research Center of Hawaii, University of Hawaii
Correspondence address: lcheung@crch.hawaii.edu

Hidden Multivariate Markov Models (HM3s) are introduced for modeling multi-dimensional genomic DNA data. A bivariate version of HM3s is developed for studying the joint behavior of the C+G richness pattern and the bendability pattern of DNA. Applications of the bivariate HM3s for recognition/prediction of eukaryotic promoter regions are illustrated.


B-4  Domain-Domain correlations in Yeast protein complexes
Doron Betel1, Christopher W.V. Hogue2
1doron.betel@utoronto.ca, Samuel Lunenfeld Research Institute, Mt. Sinai Hospital and Department of Biochemistry, University of Toronto; 2hogue@mshri.on.ca, Samuel Lunenfeld Research Institute, Mt. Sinai Hospital and Department of Biochemistry, University of Toronto
Correspondence address: doron.betel@utoronto.ca

We introduce a new method for detecting statistically meaningful functional associations between domains from molecular complexes. Two random control sets were used to compute P-values for domain co-occurrences in complexes. Results from four different datasets show that many of the correlations are between domains of similar or associated function.


B-5  ReBIL: Relating Biological Information through Literature
Francisco M. Couto1, Pedro Coutinho2, Mário J. Silva
1fjmc@di.fc.ul.pt, Faculdade de Ciencias, Universidade de Lisboa; 2pedro@afmb.cnrs-mrs.fr, UMR 6098, Architecture et Fonction des Macromolécules Biologiques, CNRS
Correspondence address: fjmc@di.fc.ul.pt

ReBIL aims to improve the efficiency of information extraction systems applied to biological literature. The project is based on the correlation between structural and functional classifications of gene products. We evaluate extracted information by checking whether it preserves this correlation. More information about ReBIL is available at http://xldb.fc.ul.pt/rebil/.


B-6  Pol II Transcription Termination Signals in Human
Aroul Selvam1, Thomas Down2, Tim Hubbard
1asr25@cam.ac.uk, The Wellcome Trust Sanger Institute; 2td2@sanger.ac.uk, The Wellcome Trust Sanger Institute
Correspondence address: asr25@cam.ac.uk

Although RNA polymerase II is important because it transcribes all the protein-coding genes in the cell, little is known about its termination process. This study focuses on identifying motifs linked to transcription termination and the polymerase release process.


B-7  GOstat: Find statistically overrepresented Gene
Tim Beissbarth1, Terry Speed2
1beissbarth@wehi.edu.au, WEHI; 2terry@wehi.edu.au, WEHI
Correspondence address: beissbarth@wehi.edu.au

GOstat is a tool for finding biological processes or annotations characteristic of a group of genes. This is helpful when analyzing lists of genes resulting from high-throughput screening experiments, such as microarrays, for their biological meaning. GOstat is accessible via the Internet at http://gostat.wehi.edu.au.


B-8  Multi-Dynamic Bayesian Networks for Pattern Recognition in Genomic DNA Sequences
Leo Wang-Kit Cheung1, Angel Yee-Man Cheung2
1lcheung@crch.hawaii.edu, Cancer Research Center of Hawaii, University of Hawaii; 2angelymch@yahoo.com, Department of Computer Science, Chu Hai College
Correspondence address: angelymch@yahoo.com

Multi-Dynamic Bayesian Networks (MDBNs) are introduced for analyzing multi-dimensional genomic DNA data. A two-dimensional version of MDBNs is developed for learning and predicting the joint behaviour of the C+G richness pattern and the bendability pattern of DNA. Applications of these MDBNs for recognition/prediction of eukaryotic promoter regions are illustrated.


B-9  High-throughput gene expression analysis with GATO
David Vilanova1, James Holzwarth2, Marie Camille Zwahlen, Frank Desiere, Matthew Alan Roberts
1david.vilanova@rdls.nestle.com, Nestle Research Center; 2james.holzwarth@rdls.nestle.com, Nestle Research Center
Correspondence address: david.vilanova@rdls.nestle.com

We present GATO (gene annotation tool), a tool for analysing gene expression data based on the Ensembl database and Gene Ontology. We describe how our tool can be used to rapidly mine gene expression data from Affymetrix arrays and to drive biological interpretation.


B-10  Continuous in situ Analysis of Cell Growth and Cell Viability
Petra Haenel1, Franz Kummert2, Karl Friehs, Erwin Flaschel, Gerhard Sagerer
1iphaenel@techfak.uni-bielefeld.de, Bielefeld University, Germany; 2franz@techfak.uni-bielefeld.de, Bielefeld University, Germany
Correspondence address: iphaenel@techfak.uni-bielefeld.de

We present an image analysis system to detect and count eukaryotic cells in darkfield microscopy images. Analyzing undiluted yeast suspensions, the tool differentiates between single cells, budding cells and cell clusters, as well as between dead and vital cells. The cells within a cluster, as well as the cell features, are detected by active contours, which yields precise information for fermentation.


B-11  Partially supervised clustering of gene expression time course data
Alexander Schoenhuth1, Alexander Schliep2, Christine Steinhoff
1aschoen@zpr.uni-koeln.de, Center for Applied Computer Science, University of Cologne; 2schliep@molgen.mpg.de, Max Planck Institute for Molecular Genetics, Berlin
Correspondence address: aschoen@zpr.uni-koeln.de

As the number of genes with known function grows, there is a need for classification methods that allow the use of prior knowledge. Partially supervised clustering of time courses stemming from gene expression experiments addresses this problem. Here a model-based clustering approach using Hidden Markov Models is proposed.


B-12  ExMI: Extracting Molecular Interaction from Large Biomedical Literature
Yoshihiro Ohta1, Tohru Natsume2, Tetsuo Nishikawa, Hiroko Ohi, Tohru Hisamitsu
1yoh@crl.hitachi.co.jp, HITACHI Central Research Laboratory; 2natsume@jbirc.aist.go.jp, National Institute of Advanced Industrial Science and Technology
Correspondence address: yoh@crl.hitachi.co.jp

Extracting molecular interactions from the rapidly growing biomedical literature is important for the systematic exploration of relationships between genes and proteins. However, many of the existing computer-aided methods cannot process a huge amount of literature. Extraction techniques include molecule name detection, interaction event detection, and graphical interface construction. We explore these techniques and show system examples.


B-13  An algorithm to select abstracts from MEDLINE concerning UV-regulated genes
Hiroko Ao1, Toshihisa Takagi
1aohiroko@ims.u-tokyo.ac.jp, Department of Computational Biology
Correspondence address: aohiroko@ims.u-tokyo.ac.jp

With the rapid growth of machine-readable literature, such as the MEDLINE database, searching for articles has become an important task. We therefore propose an efficient algorithm to select information from the results of a PubMed search. When applied to 487 UV-regulated genes, it extracted sentences containing the query with 97% precision and 97% recall.


B-14  TextLens: A Fast and Practical Partial Parser for Biomedical Literature
Yasunori Yamamoto1, Hiroko Ao2, Toshihisa Takagi
1yayamamo@ims.u-tokyo.ac.jp, Department of Computer Science, University of Tokyo; 2aohiroko@ims.u-tokyo.ac.jp, Department of Computational Biology
Correspondence address: yayamamo@ims.u-tokyo.ac.jp

The TextLens Partial Parser identifies the main subject and predicate pair of a sentence. It aims to improve information extraction by appropriately capturing each chunk of words and the structure of a sentence. It uses a rule-based algorithm that builds an abstract representation of a sentence.


B-16  MutationMiner: A Graph Theoretic Approach to Extract Point Mutation Data from Biomedical Literature
Lawrence C. Lee1, Florence Horn2, Fred E. Cohen
1lle8@itsa.ucsf.edu, University of California San Francisco; 2horn@cmpharm.ucsf.edu, University of California San Francisco
Correspondence address: lle8@itsa.ucsf.edu

MutationMiner is a program which automates extraction of point mutation data from biomedical literature. It uses regular expressions and a graph theoretic approach to find point mutations in the text and then confirms the mutations with SwissProt data. MutationMiner can search over one thousand journal articles in 24 hours.
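
The regular-expression step can be illustrated with a minimal sketch. The pattern below is an assumption made for illustration (wild-type residue, position, mutant residue in one-letter code, as in "A123T"); it is not MutationMiner's actual expression set, and the function name is hypothetical. The graph theoretic step and the SwissProt confirmation are not covered.

```python
import re

# One-letter codes of the 20 standard amino acids.
AA1 = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical pattern: wild-type residue, position, mutant residue, e.g. "A123T".
POINT_MUTATION = re.compile(rf"\b([{AA1}])([1-9][0-9]*)([{AA1}])\b")


def find_point_mutations(text):
    """Return (wild_type, position, mutant) tuples found in text."""
    hits = []
    for wild, pos, mutant in POINT_MUTATION.findall(text):
        if wild != mutant:  # a substitution must change the residue
            hits.append((wild, int(pos), mutant))
    return hits
```

A pattern this simple also matches unrelated tokens of the same shape (cell line or plasmid names, for example), which is one reason a later confirmation step against sequence data is useful.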


B-17  Use of hidden Markov models and phylogenetic algorithms to predict functionally distinct subclasses of chromodomains in different families of chromatin-modifying proteins
Khairina Tajul-Arifin1, Rohan Teasdale2, John S. Mattick
1k.tajularifin@imb.uq.edu.au, IMB, UQ; 2r.teasdale@imb.uq.edu.au, IMB, UQ
Correspondence address: k.tajularifin@imb.uq.edu.au

We have used phylogenetic algorithms and hidden Markov models to identify functionally distinct subsets within the family of chromodomains. Our results demonstrate the validity of using bioinformatic analysis of large datasets to predict subtle but meaningful differences in protein domain function and structure-function relationships.


B-18  FatiGO: a web tool for finding significant associations of Gene Ontology terms with groups of genes
Fatima Al-Shahrour1, Ramon Diaz-Uriarte2, Joaquin Dopazo
1falshahrour@cnio.es, Bioinformatics Unit, Spanish National Cancer Center, CNIO.; 2rdiaz@cnio.es, Bioinformatics Unit, Spanish National Cancer Center, CNIO.
Correspondence address: jdopazo@cnio.es

FatiGO (http://fatigo.bioinfo.cnio.es) is a simple but powerful procedure to extract Gene Ontology terms that are (upon application of a statistical test) significantly over- or under-represented in sets of genes within the context of genome-scale experiments (DNA microarray, proteomics, etc.).
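
The kind of test involved can be sketched as follows. The abstract does not say which statistical test FatiGO applies, so the one-sided hypergeometric test below is an assumption chosen because it is a common choice for term over-representation; the function and parameter names are hypothetical, and no multiple-testing correction is included.

```python
from math import comb


def enrichment_p_value(population, annotated, sample, annotated_in_sample):
    """P(X >= annotated_in_sample) under the hypergeometric distribution.

    population: number of genes in the reference set
    annotated: reference genes carrying the GO term
    sample: number of genes in the list of interest
    annotated_in_sample: listed genes carrying the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probabilities of seeing at least the observed number of annotated genes.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(annotated_in_sample, upper + 1)
    )
    return tail / total
```

A small p-value suggests the term is over-represented in the gene list; testing for under-representation would sum the lower tail instead.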


B-19  Current status of the GENIA Corpus: an Annotated Corpus in Molecular Biology Domain
Tomoko Ohta1, Jin-Dong Kim2, Yuka Tateisi, Masayoshi Tsuruoka, Jun'ichi Tsujii
1okap@is.s.u-tokyo.ac.jp, CREST, JST; 2jkdim@is.s.u-tokyo.ac.jp, University of Tokyo
Correspondence address: okap@is.s.u-tokyo.ac.jp

GENIA corpus 3.0p and 3.01, consisting of 2,000 MEDLINE abstracts, have been released with linguistically rich annotations including sentence boundaries, term boundaries, term classifications, semi-structured coordinated clauses, recovered ellipsis in terms, part-of-speech, etc. This poster is intended to provide the current status of the GENIA corpus.


B-20  The automatic discovery of structural principles describing protein fold space
Adrian P Cootes1, Michael J.E. Sternberg2, Stephen H Muggleton
1a.cootes@ic.ac.uk, Imperial College; 2m.sternberg@ic.ac.uk, Imperial College
Correspondence address: a.cootes@ic.ac.uk

The rapid increase in protein structures produced by structural-genomics projects will make it increasingly difficult to analyse and understand the distribution of proteins in fold space. We have applied a machine-learning strategy to automatically determine the structural principles describing 45 folds.


B-21  A Constructional Approach to Extraction
Cornelia M. Verspoor1, George J. Papcun2, Kari Sentz
1verspoor@lanl.gov, Los Alamos National Laboratory; 2gjp@lanl.gov, Los Alamos National Laboratory
Correspondence address: verspoor@lanl.gov

We present a prototype implementation of a system for extracting protein/gene interactions from biological literature which is motivated by the theory of Construction Grammar. CG provides a powerful framework for combining domain-specific terminology management with patterns incorporating generic linguistic structural constraints.


B-22  Discovery of Analog Enzymes in Thiamin Biosynthesis by Anticorrelation
Enrique Morett1, J. Korbel2, K. Emmanuvel Rajan1, G. Saab-Rincon1, L. Olvera1, M. Olvera1, B. Snel2, S. Schmidt2, and P. Bork2.
1emorett@ibt.unam.mx, Instituto de Biotecnologia, Universidad Nacional Autonoma de Mexico, AP 510-3, Cuernavaca Mor. 62250, Mexico; 2European Molecular Biology Laboratory, Meyerhofstrasse 1, Heidelberg 69117, Germany.
Correspondence address: emorett@ibt.unam.mx

Prediction of gene function is one of the most challenging tasks in genomic science when there is no clear sequence similarity to annotated genes. Here we present a new method, called Anticorrelation of Gene Presence, to predict gene function. Using this method we identified four new genes involved in thiamin biosynthesis.


B-23  EST based method to identify differentially expressed gene clusters along chromosomes
Karine Megy1, Stephane Audic2, Francois Enault, Jean-Michel Claverie
1km369@cam.ac.uk, University of Cambridge; 2audic@igs.cnrs-mrs.fr, CNRS
Correspondence address: audic@igs.cnrs-mrs.fr

We developed a method based on a statistical analysis of Expressed Sequence Tags (ESTs) to evaluate the positional clustering of differentially expressed genes. Human chromosomes 20, 21 and 22 were analysed with this method and show clusters of specifically expressed genes.


B-24  Surfing data sources in drug discovery
Dennis Madsen1
1dnnm@novonordisk.com, Novo Nordisk
Correspondence address: dnnm@novonordisk.com

A general-purpose data integration tool has been developed that allows several data sources to be queried simultaneously. The query is restricted to specific types such as project or metabolite name. The hits are displayed with links to the originating data source and an option to use the hit as the next query.


B-25  Novel members of the C12/C19 cysteine proteases identified through human genome mining efforts: primary characterization of selected genes
Pierrat Benoit1, Bruengger Adrian2, Cai Richard, Gerhartz Bernd, Kossida Sophia, Nirmala Nanguniri, Worpenberg Susanne
1benoit.pierrat@pharma.novartis.com, Novartis Institute of Biomedical Research; 2adrian.bruengger@pharma.novartis.com, Novartis Institute of Biomedical Research
Correspondence address: benoit.pierrat@pharma.novartis.com

DUBs are cysteine proteases controlling the ubiquitination status of target proteins. Using genome mining tools, we have conducted searches for new human DUB members leading to the identification of 11 new sequences. Here we report on their in silico annotation and discuss the primary functional characterization of selected members.


B-26  Mapping and Visual Exploration of GPCR Classification Hierarchy in Interpro and GPCRDB System
Yanwei Niu1, Xiangyun Wang2, Yockey, Anastasia Christianson, Guang R. Gao
1niu@capsl.udel.edu, Department of ECE, University of Delaware, USA; 2Xiangyun.Wang@astrazeneca.com, EST Informatics Wilmington, Astra Zeneca PLC
Correspondence address: niu@capsl.udel.edu

Interpro and GPCRDB are two GPCR classification systems. Using data mining techniques, we compared the two systems family by family at each level of the classification hierarchy and established a mapping between them. We introduce a novel visualization tool that allows us to compare them directly and easily.


B-27  Discovering biological knowledge from gene expression using association rules
P. Carmona-Saez1, M. Chagoyen2, A. Rodriguez, O. Trelles, J.M. Carazo and A. Pascual-Montano
1pcarmona@cnb.uam.es, National Center of Biotechnology. Madrid; 2monica@cnb.uam.es, National Center of Biotechnology. Madrid
Correspondence address: pascual@cnb.uam.es

We describe the application of an association rule discovery technique to find relevant relations between different gene attributes and experimental conditions in microarray expression datasets. This method can be used to extract interesting and very diverse biological information. The method is implemented in the EngeneTM software package, which is freely available upon request at http://www.engene.cnb.uam.es


B-28  Human and Mouse expression maps from in silico expression profiles
Alia BenKahla1, Ralf Herwig2, Hans Lehrach, Marie-Laure Yaspo
1kahla@molgen.mpg.de, Max Planck Institute for Molecular Genetics; 2herwig@molgen.mpg.de, Max Planck Institute for Molecular Genetics
Correspondence address: kahla@molgen.mpg.de

We present the strategy used to extract the "in silico expression profiles" of human and mouse genes (an EST mining approach) and the data describing differentially expressed genes, disease-related genes, and clusters of genes potentially involved in a common cellular function. A comparison of orthologous gene expression will also be presented.


B-29  New Datasets for Structural Data Mining Studies.
Carmen K. Chu1, Merridee A. Wouters2
1cchu@cse.unsw.edu.au, Computational Biology and Bioinformatics Program, Victor Chang Cardiac Research Institute; 2m.wouters@victorchang.unsw.edu.au, Computational Biology and Bioinformatics Program, Victor Chang Cardiac Research Institute
Correspondence address: m.wouters@victorchang.unsw.edu.au

We compared the sequence-derived representative dataset PDB_SELECT with the structural database SCOP. Some folds remain overrepresented in PDB_SELECT. After filtering, we obtain a subset of unique protein fold representatives: approximately ¼ of the original PDB_SELECT 25% list. We also discuss using unique representatives of SCOP folds as a representative dataset.


B-30  An architecture for a modularized gene information retrieval and summarization tool: Bioretrieve
Anton Bergheim1, Sheila Rock2
1anton@cs.wits.ac.za, University of the Witwatersrand; 2sheila@cs.wits.ac.za, University of the Witwatersrand
Correspondence address: anton@cs.wits.ac.za

The ability to process natural language based information computationally is becoming a necessity for the geneticist. We present here an architecture for Bioretrieve, a computational tool for the management of the extremely large body of knowledge that exists about genes. Designed in a modular fashion and employing an open-source approach, it has advantages over existing monolithic systems.


B-31  SEMA, A semantic literature annotator
Alex Garcia1, John Cleary2, Mark A. Ragan, Yi-Ping Phoebe Chen
1a.Garcia@imb.uq.edu.au, Institute for Molecular Bioscience; 2jcleary@reeltwo.com, Reel Two
Correspondence address: a.garcia@imb.uq.edu.au

We are using a machine-learning algorithm implemented in GO-KDS to complement SwissProt literature citation fields for each database entry. SEMA organizes this new relevant information, then builds a conceptual navigable map that is presented to the user as a flat or hyperbolic tree. This map allows redefining queries over the same database or over other information sources.


B-32  Non-negative matrix factorization for gene expression and scientific texts analysis
A. D. Pascual-Montano1, P. Carmona-Saez2, M. Chagoyen and J.M. Carazo
1pascual@cnb.uam.es, National Center of Biotechnology. Madrid. Spain; 2pcarmona@cnb.uam.es, National Center of Biotechnology. Madrid. Spain
Correspondence address: pascual@cnb.uam.es

We describe the application of the Non-negative Matrix Factorization (NMF) technique to reduce dimensionality and to find local patterns hidden in gene expression data sets and in the scientific literature. Results show the potential of this new machine learning technique to find relevant biological information.


B-33  GENAW: GEnetic Network Analysis Workbench for microarray raw data
Pan-Gyu Kim1, Kyung Shin Lee2, Seon-Hee Park, Mi Young Shin, Hwan-Gue Cho
1pgkim@pearl.cs.pusan.ac.kr, Department of Computer Science, Pusan National University; 2kslee@pearl.cs.pusan.ac.kr, Department of Computer Engineering, Pusan National University
Correspondence address: pgkim@pearl.cs.pusan.ac.kr

We have developed the GENAW (GEnetic Network Analysis Workbench for microarray raw data) system, which automatically produces a genetic network from raw expression data. GENAW accepts the data formats of various commercial tools and provides various visualization tools. We experimented with yeast cell cycle data from Stanford University, and the results were sufficiently reasonable.


B-34  Multi-class protein fold classification using an integrative machine learning approach
Aik Choon Tan1, David Gilbert2
1actan@brc.dcs.gla.ac.uk, Bioinformatics Research Centre, Department of Computing Science, University of Glasgow; 2drg@brc.dcs.gla.ac.uk, Bioinformatics Research Centre, Department of Computing Science, University of Glasgow
Correspondence address: actan@brc.dcs.gla.ac.uk

We devised a novel approach to integrate rules induced from multi-class and unbalanced data sets, and demonstrate its usefulness for multi-class protein fold classification on a data set that contains 700 examples for 27 SCOP folds. We show that this approach increases the sensitivity of the classifiers and yields more useful classifiers.


B-35  C. elegans microarray data seen through a novel nonmetric multidimensional scaling method
Y-h. Taguchi1, Y. Oono2
1tag@granular.com, Department of Physics, Chuo University; 2y-oono@uiuc.edu, Department of Physics, UIUC
Correspondence address: tag@granular.com

C. elegans microarray data is analysed by a novel nonmetric multidimensional scaling method that is maximally nonmetric. The genes are embeddable in 3D. Their annotations are consistent with their positions in this space. A method to compute the 3D coordinates directly from the microarray data is also developed.


B-36  Non-metric analysis of temporal patterns captured in microarray data
Y-h. Taguchi1, Y. Oono2
1tag@granular.com, Department of Physics, Chuo University; 2y-oono@uiuc.edu, Department of Physics, UIUC
Correspondence address: tag@granular.com

A nonmetric multidimensional scaling analysis of the gene activity response of cell cycle-synchronized human fibroblasts to serum [Iyer et al. Science 283, 83-87 (1999)] automatically gives a ring-like gene arrangement along which the expression level peak rotates. This unambiguously demonstrates the power of a nonlinear data mining method.


B-37  Extracting Transcription Factor Interactions from Medline Abstracts
Marc Light1, Robert Arens, Vladimir Leontiev, Meredith Patterson, Xinying Qiu, Hudong Wang
1marc-light@uiowa.edu, University of Iowa
Correspondence address: marc-light@uiowa.edu

Staying abreast of research on transcription factors (TFs) is currently a difficult task for biologists. We are building a system that will extract TF interactions from Medline abstracts automatically. To date, we have annotated a corpus for TF interactions and evaluated a number of component technologies.


B-38  Construction of the plant gene index system based on tissue-categorized EST sets
Seung-Jae Noh1, Cheol-Goo Hur2, Sung-Ho Goh, Ho-Jin Chung, Kyoung-Oak Choi
1sjnoh@kribb.re.kr, Korea Research Institute of Bioscience and Biotechnology; 2hurlee@kribb.re.kr, Korea Research Institute of Bioscience and Biotechnology
Correspondence address: sjnoh@kribb.re.kr

Our plant gene index system, based on stackPACK EST clustering with tissue categorization, contains valuable information about 150,000 consensus sequences obtained from 9 principal plant model organisms. The information can be browsed, searched, and downloaded through a user-friendly web interface at http://plant.pdrc.re.kr/new_korea/genepool/Plant/index.html


B-39  Regression analysis in optimal gene selection for DNA microarray analysis
Si-Ho Yoo1, Sung-Bae Cho2
1bonanza@candy.yonsei.ac.kr, Yonsei University; 2sbcho@cs.yonsei.ac.kr, Yonsei University
Correspondence address: bonanza@candy.yonsei.ac.kr

We propose a new gene selection method based on the forward selection method of regression analysis. This method reduces redundant information about cancer that could be present in the subset of selected genes. The results show a high accuracy of 90.3% on colon cancer data.


B-40  BioInfoCallboratory: Towards an Agent-Assisted Web-Based Collaboration Environment for Bioinformatics
Yan Chen1, Yi-Ping Phoebe Chen2
1y52.chen@student.qut.edu.au, Queensland University of Technology; 2p.chen@qut.edu.au, Queensland University of Technology
Correspondence address: y52.chen@student.qut.edu.au

Collaboration in web-based bioinformatics environments requires intelligent support for human-computer interaction. BioInfoCallboratory is an agent-assisted, web-based environment for supporting bioinformatics research. It facilitates sophisticated interactions such as matchmaking based on common interests, internet-spanning data mining for bioinformatics data, and event alerts for interested parties.


B-41  Pathogenic archaea: do they exist?
Neil Saunders1, Ricardo Cavicchioli2, Paul M.G. Curmi, Torsten Thomas
1neil.saunders@unsw.edu.au, The University of New South Wales; 2r.cavicchioli@unsw.edu.au, The University of New South Wales
Correspondence address: neil.saunders@unsw.edu.au

We have developed a rapid, automated search strategy for the detection of contaminating sequence from putative novel pathogenic archaea in human EST sequence data. The system has general application to the detection of microbial pathogens and will be available at http://psychro.bioinformatics.unsw.edu.au.


B-42  Structural Classification in the Gene Ontology
Cliff Joslyn1, Susan Mniszewski2, Andy Fulmer, Gary Heaton
1joslyn@lanl.gov, Los Alamos National Laboratory; 2smm@lanl.gov, Los Alamos National Laboratory
Correspondence address: joslyn@lanl.gov

We present the Gene Ontology Clusterer (GOC), which structurally classifies the GO based on pseudo-distances between comparable nodes in posets, in conjunction with scoring algorithms, to rank-order the GO nodes with respect to a set of requested genes. We will also share lessons we've learned about working with the GO.


B-43  Extracting informative genes with negative correlation for accurate cancer classification
Hong-Hee Won1, Sung-Bae Cho2
1cool@candy.yonsei.ac.kr, Yonsei University; 2sbcho@cs.yonsei.ac.kr, Yonsei University
Correspondence address: cool@candy.yonsei.ac.kr

We define two negatively correlated ideal gene vectors that represent the class patterns well and extract two significant gene subsets (SGSs) based on similarity to the two ideal genes. We train neural network classifiers with the SGSs and combine them. The ensemble classifier produces the best recognition rates: 97.1% on Leukemia, 87.1% on Colon, and 92.0% on Lymphoma.


B-44  Issues and principles in the analysis of large genomic datasets.
Francis Clark1, Susan Lilley2
1fc@maths.uq.edu.au, Advanced Computational Modelling Centre, University of Queensland, Australia.; 2s364202@student.uq.edu.au, School of Information Technology & Electrical Engineering, University of Queensland, Australia.
Correspondence address: fc@maths.uq.edu.au

Development of research analysis pipelines often involves working with poorly understood data to answer questions that are, initially, simplistic. This poster gives an overview of some strategies and best practices that may be employed in such work, including handling and appraisal of the data, choice of appropriate thresholds, extrapolation, and checking for reasonableness.


B-45  Hierarchical classification of cDNA libraries for gene expression analysis
Bumjin Kim1, Sanghyuk Lee2, Hyunjung Lee, Young-Ah Shin, Euiju Jung, Pora Kim
1unikbj@ewha.ac.kr, Division of Molecular Life Sciences, Ewha Womans University, Seoul 120-750, KOREA; 2sanghyuk@ewha.ac.kr, Division of Molecular Life Sciences, Ewha Womans University, Seoul 120-750, KOREA
Correspondence address: uandikbj@hotmail.com

Approximately 8,200 human cDNA libraries in dbEST were hierarchically classified into four gene expression categories: tissue, pathology, developmental stage, and sex. A web-based application for profiling gene expression using the resulting database is available at http://genome.ewha.ac.kr/EODB/.


B-46  Detection of implicit protein-protein interactions from literature
Tomonori Izumitani1, Frederic Tingaud2, Hirotoshi Taira, Eisaku Maeda
1izumi@cslab.kecl.ntt.co.jp, NTT Communication Science Laboratories; 2tingaud@cslab.kecl.ntt.co.jp, NTT Communication Science Laboratories
Correspondence address: izumi@cslab.kecl.ntt.co.jp

In this study, we propose a method to detect explicit or implicit protein-protein interactions from text data. It was applied to the detection of interactions between yeast proteins. The result indicates that the putative interactions detected by the method can contain true and experimentally unidentified interactions.


B-47  Comparison of intra-molecular disulphide bonding arrangements between disulphide-rich and -poor proteins in the Protein Data Bank
Gerald Hartig1, Tran Trung Tran2, Mark Smythe
1g.hartig@imb.uq.edu.au, Institute for Molecular Bioscience; 2tran@doctor.com, Protagonist Pty Ltd
Correspondence address: g.hartig@imb.uq.edu.au

Intra-molecular disulphide bonds (IDSB) are an important determinant of a protein’s 3D conformation. This work describes the differences in IDSB arrangements between disulphide-rich and -poor proteins. A naturally occurring partition of 25.2 residues / IDSB was used, revealing differences in PDB headers, SCOP folds, IDSB bonding patterns and loop lengths.


B-48  Intimately Incorporated NLP System Adapted for Bio-Text Mining
Young-Sook Hwang1, Hae-Chang Rim2, Kyoung-Me Park, Ki-Joong Lee, Hong-Woo Chun
1yshwang@nlp.korea.ac.kr, Korea Univ.; 2rim@nlp.korea.ac.kr, Korea Univ.
Correspondence address: yshwang@nlp.korea.ac.kr

BioNLPro provides the base for a robust bio-text mining system. It is an intimately integrated NLP system consisting of core NLP modules adapted to the peculiarities of bio-text, including a POS tagger, a biological term recognizer, a grammatical relation tagger based on chunking, and a biological event extractor.


B-49  Data to Diamonds: Multivariate Datamining Leads to Concise Gene
Rob Dunne1, Glenn Stone2
1Rob.Dunne@csiro.au, CSIRO; 2Glenn.Stone@csiro.au, CSIRO
Correspondence address: Rob.Dunne@csiro.au

CSIRO Bioinformatics have developed an analysis methodology based on generalized linear models coupled with a specialized Bayesian variable selection technique. This methodology is capable of producing parsimonious predictors of classification targets, numeric targets (using Gaussian, Poisson or Gamma regressions), or survival targets (using Cox's proportional hazards regression).



B-50  Detection of Program Source Code Plagiarism Using Genomic Sequence Alignments Methodology
Eun-Mi Kang1, Hwan-Gue Cho2, Young-Min Kang
1emkang@pearl.cs.pusan.ac.kr, Pusan National University; 2hgcho@pusan.ac.kr, Pusan National University
Correspondence address: emkang@pearl.cs.pusan.ac.kr

We propose a new method for detecting plagiarism that exploits genomic sequence alignment. The system extracts linear sequences of keywords from the source code flow and computes local alignments to detect local similarity to the original sources. The experimental results show that this approach is more powerful than fingerprint-based matching.
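
The idea of aligning keyword sequences can be illustrated with a minimal sketch. It assumes a small hypothetical keyword set and Smith-Waterman scoring with made-up match, mismatch and gap values; the abstract does not state which keywords or scoring scheme the authors use.

```python
import re

# Hypothetical keyword set; the abstract does not list the keywords used.
KEYWORDS = {"if", "else", "for", "while", "return", "int", "float", "break"}


def keyword_sequence(source):
    """Reduce source text to the ordered list of keywords it contains."""
    tokens = re.findall(r"[A-Za-z_]\w*", source)
    return [tok for tok in tokens if tok in KEYWORDS]


def local_alignment_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best Smith-Waterman local alignment score between two token lists."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            step = match if a[i - 1] == b[j - 1] else mismatch
            # A cell never drops below zero, which is what makes the alignment local.
            curr[j] = max(0, prev[j - 1] + step, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

A high score relative to the length of the shorter keyword sequence suggests a shared code fragment, even when identifiers have been renamed, because identifiers are discarded before alignment.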

Long abstract


B-51  Computational comparative analysis framework at the Centre for Bioinformatics and Biological Computing
M Bellgard1, A Hunter2, D Schibeci
1m.bellgard@murdoch.edu.au, CBBC, Murdoch University; 2a.hunter@cbbc.murdoch.edu.au, CBBC, Murdoch University
Correspondence address: m.bellgard@murdoch.edu.au

The CBBC conducts research in computational biology ranging from comparative genomic sequence analysis and microarray and proteomic data analysis to novel algorithms and software. The CBBC is developing a comparative analysis framework incorporating audit trailing of analysis, open source activities and distributed resource management. We present an overview of this framework.

Long abstract


B-52  Extraction of patterns in each domain of G-protein-coupled receptors
Jeongho Huh1, Chungoo Park2, Dong Soo Jung, Hong Gil Nam, Jiin Choi, Young Bock Lee
1artist3@postech.ac.kr, Division of Molecular Life Sciences, Pohang University of Science and Technology; 2madreach@bric.postech.ac.kr, Biological Research Information Center(BRIC), Pohang University of Science and Technology
Correspondence address: artist3@postech.ac.kr

Detecting local functional sequence patterns is well suited to GPCR sequence analysis. We attempted to extract patterns that are specific to GPCR subtypes. To extract the patterns, we applied different rules to each of the three GPCR domains. As a result, we obtained patterns specific to GPCR clans with high frequency and high specificity.

Long abstract


B-53  Pathway data mining: tissue specificity and potential cross talks between pathways
Yu-Tai Wang1, Ueng-Cheng Yang2, Yung-Wen Deng, Cheng-Min Wei, Kai-Lung Tang, Der-Ming Liou
1ytwang@ym.edu.tw, Institute of Biochemistry, National Yang-Ming University, Taiwan; 2yang@ym.edu.tw, Bioinformatics Research Center, National Yang-Ming University, Taiwan
Correspondence address: ytwang@ym.edu.tw

A pathway is a way to present the mechanism behind a biological phenomenon. We have developed methods to integrate pathway-related information. By querying this integrated database, users will be able to look up tissue-specific pathways and discover possible cross talk between pathways. The query results can be output in graphical form.

Long abstract


B-54  GHMM and HMMEd: A toolkit for Hidden Markov Models
Wasinee Rungsarityotin1, Alexander Schliep2
1rungsari@molgen.mpg.de, Max Planck Institute for Molecular Genetics; 2schliep@molgen.mpg.de, Max Planck Institute for Molecular Genetics
Correspondence address: schliep@molgen.mpg.de

We have developed and implemented a library for a general Hidden Markov Model (GHMM) to assist in designing a topology and visualizing parameters for an HMM. The tool has been used in solving problems such as identification of circular permutation with Profile HMMs. GHMM and HMMEd are freely available at http://sourceforge.net/projects/ghmm/.


Long abstract


B-55  Efficiently finding regulatory elements using correlation with gene expression
Hideo Bannai1, Shunsuke Inenaga2, Ayumi Shinohara, Masayuki Takeda, Satoru Miyano
1bannai@ims.u-tokyo.ac.jp, Human Genome Center, Institute of Medical Science, University of Tokyo; 2s-ine@i.kyushu-u.ac.jp, Department of Informatics, Kyushu University
Correspondence address: bannai@ims.u-tokyo.ac.jp

We present an efficient algorithm for detecting putative regulatory elements in the upstream sequences of genes, using expression data obtained from microarrays. We are able to find the optimal pattern, most correlated with the expression levels of the genes, in time linear in the total length of the upstream sequences.

Long abstract


B-56  Discovering useful patterns from DNA microarray experiment with large-scale multifactor design by genetic algorithm and permutation test
Ju Han Kim1, Tae Su Chung2, Jihun Kim, Ji Yeon Park, Hye Won Lee, Jihoon Kim, Mingoo Kim
1juhan@snu.ac.kr, Seoul National University Human Genome Research Institute; 2epiai@korea.com, Seoul National University Human Genome Research Institute
Correspondence address: juhan@snu.ac.kr

We present a method that uses a genetic algorithm to discover useful patterns in DNA microarray experiments with a large-scale multifactor design and no replication. A permutation test on the distance measures between the observations and the patterns identified statistically significant multifactor gene expression patterns with simple biological interpretations.
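
A permutation test on a distance measure can be sketched as follows. This is only an illustration under stated assumptions: Euclidean distance, shuffling of the observed values across conditions as the null model, and the function names are all hypothetical choices, not details taken from the abstract.

```python
import random


def euclidean(x, y):
    """Euclidean distance between two equal-length profiles."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5


def permutation_p_value(observed, pattern, n_perm=999, seed=0):
    """Estimate how often a shuffled profile is at least as close to the pattern."""
    rng = random.Random(seed)
    d_obs = euclidean(observed, pattern)
    shuffled = list(observed)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the link between values and conditions
        if euclidean(shuffled, pattern) <= d_obs:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

A small p-value indicates that the observed profile matches the candidate pattern more closely than would be expected if the condition labels carried no information.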

Long abstract


B-57  QTL analysis for outcrossing family data using genetic algorithm and simulated EM algorithm
Reiichiro Nakamichi1, Satoru Miyano2
1rei-naka@ims.u-tokyo.ac.jp, Human Genome Center, Institute of Medical Science, the University of Tokyo; 2miyano@ims.u-tokyo.ac.jp, Human Genome Center, Institute of Medical Science, the University of Tokyo
Correspondence address: rei-naka@ims.u-tokyo.ac.jp

We propose a new method of quantitative trait loci (QTL) mapping using a genetic algorithm (GA) with a simulated EM algorithm. It detects QTL without a highly organized experimental cross and is applicable to human genetics. Simulation studies showed high performance of our method in cases not supported by traditional gene mapping.

Long abstract


B-58  Hierarchical-Partitioning: A New Clustering Framework for Gene Expression Data Analysis
Alan Wee-Chung Liew1, Hong Yan2, Lap Keung Szeto
1itwcliew@cityu.edu.hk, Dept of Computer Engineering and Information Technology, City University of Hong Kong; 2ityan@cityu.edu.hk, Dept of Computer Engineering and Information Technology, City University of Hong Kong
Correspondence address: itwcliew@cityu.edu.hk

We introduce a novel hierarchical-partitioning clustering algorithm for gene expression data analysis, which combines features of both hierarchical and partitioning-based clustering. Our algorithm performs a successive binary subdivision of the data into smaller and smaller partitions hierarchically, until no further splitting of a (parent) partition into two smaller (children) partitions is possible.
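
The successive binary subdivision can be sketched roughly as below. The abstract does not say how a partition is split or when splitting stops, so this sketch assumes a 2-means bisection and a hypothetical stopping rule (minimum partition size and a minimum reduction in within-partition squared error); it is not the authors' algorithm.

```python
def mean(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    return [sum(coord) / len(points) for coord in zip(*points)]


def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def sse(points):
    """Sum of squared distances to the partition centroid."""
    centre = mean(points)
    return sum(sq_dist(p, centre) for p in points)


def bisect(points, iters=10):
    """Split points in two with 2-means, seeded by two mutually distant points."""
    centre = mean(points)
    a = max(points, key=lambda p: sq_dist(p, centre))
    b = max(points, key=lambda p: sq_dist(p, a))
    left, right = list(points), []
    for _ in range(iters):
        left = [p for p in points if sq_dist(p, a) <= sq_dist(p, b)]
        right = [p for p in points if sq_dist(p, a) > sq_dist(p, b)]
        if not left or not right:
            break
        a, b = mean(left), mean(right)
    return left, right


def hierarchical_partition(points, min_gain=0.5, min_size=2):
    """Recursively bisect until a split is too small or gains too little (assumed rule)."""
    if len(points) < 2 * min_size:
        return [points]
    left, right = bisect(points)
    if len(left) < min_size or len(right) < min_size:
        return [points]
    total = sse(points)
    if total == 0 or sse(left) + sse(right) > (1 - min_gain) * total:
        return [points]
    return (hierarchical_partition(left, min_gain, min_size)
            + hierarchical_partition(right, min_gain, min_size))
```

Each expression profile is passed as a tuple of values, and the function returns the leaf partitions of the resulting binary tree.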

Long abstract


B-59  P-quasi complete linkage clustering method for gene-expression profiles based on distribution analysis
Shigeto Seno1, Reiji Teramoto2, Yoichi Takenaka, Hideo Matsuda
1s-senoo@ist.osaka-u.ac.jp, Department of Bioinformatic Engineering, Graduate School of Information Science and Technology, Osaka University; 2teramoto@sumitomopharm.co.jp, Genomic Science Laboratories, Research Division, Sumitomo Pharmaceuticals
Correspondence address: s-senoo@ist.osaka-u.ac.jp

We propose a new clustering method with the following two features. First, this method exploits a new similarity measure based on distribution of gene expressions. Second, this method leverages the P-quasi complete linkage algorithm for describing clusters. The synergy of the two features provides more informative clustering than traditional ones.

Long abstract


B-60  New tools for exploring noncoding RNA-mediated regulatory networks
S.Stanley1
1S.Stanley@imb.uq.edu.au, IMB
Correspondence address: S.Stanley@imb.uq.edu.au

We present a method for extracting minimal complete sets of exactly repeated sequences from genomes and then extracting subsets with the potential to produce primary sequence-dependent RNA regulatory signals and regulatory networks. The work initially focuses on intronic sequences, in which we can examine clustering of matched sequences by functional group. Results are presented for S. cerevisiae.
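
As a simplified illustration of finding exactly repeated sequences, the sketch below indexes every substring of a fixed length k and keeps those that occur more than once. The fixed length and the function name are assumptions; the construction of "minimal complete sets" described in the abstract is not reproduced here.

```python
from collections import defaultdict


def exact_repeats(sequence, k):
    """Map each length-k substring that occurs more than once to its start positions."""
    positions = defaultdict(list)
    for i in range(len(sequence) - k + 1):
        positions[sequence[i:i + k]].append(i)
    return {kmer: starts for kmer, starts in positions.items() if len(starts) > 1}
```

For example, `exact_repeats("ACGTACGA", 3)` reports that `ACG` starts at positions 0 and 4 (0-based).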

Long abstract


B-61  DESCRIBER: Graphical Relational Models for Collaborative Filtering in Microarray Data Mining
William H. Hsu1, Roby Joehanes2, Prashanth Boddhireddy
1bhsu@cis.ksu.edu, Kansas State University; 2robbyjo@cis.ksu.edu, Kansas State University
Correspondence address: bhsu@cis.ksu.edu

This poster presents DESCRIBER, a system that uses graphical models to represent relational data in computational genomics portals (cf. myGrid), integrating descriptive data models for microarray data mining and extending the information retrieval capabilities of indices such as ResearchIndex. The objective is to provide collaborative filtering (CF) over data, metadata, source code (cf. OpenBio), and experimental documentation.

Long abstract


B-62  A Software Toolkit for Learning Dynamic Graphical Models of Gene Regulatory Structure from Microarray Data
William H. Hsu1, Youping Deng2, J. Clare Nelson, Judith L. Roe
1bhsu@cis.ksu.edu, Kansas State University; 2ypdeng@ksu.edu, Kansas State University
Correspondence address: bhsu@cis.ksu.edu

We present BNJ, an experimental Java-based software toolkit for learning network models of gene regulation from microarray data. We survey current research issues in learning the structure of graphical models, outline the components of BNJ used in modeling regulatory dynamics of S. cerevisiae, and present preliminary results and current research directions.

Long abstract


B-63  Computational prediction of macrophage specific regulatory network
Brendan Tse1, Timothy Ravasi2, Christine Wells, Yi-Ping Phoebe Chen, David Hume
1s371293@student.uq.edu.au, University of Queensland; 2t.ravasi@imb.uq.edu.au, University of Queensland
Correspondence address: s371293@student.uq.edu.au

This project aims to create an analytical pipeline that links existing pattern discovery tools so that transcriptional element pattern predictions can be performed automatically and simultaneously across multiple species, directly from microarray experimental results. The system may be used as a means to map transcriptional pathways to macrophage-specific regulatory networks.

Long abstract


B-64  Protein Superfamily Clustering using Biomedical Text Mining via the Information Bottleneck Method
Sahng-Joon Auh1, Jae-Hong Eom2, Byoung-Hee Kim, Byoung-Tak Zhang
1sjauh@bi.snu.ac.kr, Biointelligence Laboratory; 2jheom@bi.snu.ac.kr, Biointelligence Laboratory
Correspondence address: jheom@bi.snu.ac.kr

We present a novel implementation of protein superfamily clustering using biomedical literature via the recently introduced information bottleneck method, which shows good performance in document clustering. We test our method on 1866 Saccharomyces cerevisiae proteins in COGs (Clusters of Orthologous Groups of proteins) by NCBI (National Center for Biotechnology Information).

Long abstract


B-65  A Bayesian HMM algorithm for the identification of gene families
Richard Boys1, Daniel Henderson2
1richard.boys@ncl.ac.uk, University of Newcastle upon Tyne; 2d.a.henderson@open.ac.uk, The Open University
Correspondence address: richard.boys@ncl.ac.uk

We describe an algorithm that identifies families of genes with similar nucleotide patterns and hopefully similar function. The approach is quite general in terms of its flexibility with respect to the number of different families that may be present and the complexity of the structure within each family.

Long abstract


B-66  Bio-Linux: An integrated bioinformatics solution for the EG community
Dan Swan1, Bela Tiwari2, Dawn Field
1dswan@ceh.ac.uk; 2btiwari@ceh.ac.uk
Correspondence address: dswan@ceh.ac.uk

Bio-Linux is an integrated, bioinformatics-centred, research platform. By providing both standard favourite and cutting edge bioinformatics tools on a Linux-based system, it combines the benefits of being powerful, configurable, and easily updateable, with the ease of use and potential for software integration required for the handling and analysis of biological data.

Long abstract



Data Visualisation
C-1  Automated Construction of Comparative Maps between Zebrafish, Human, Rat and Mouse
Jedidiah Mathis1, Victor Ruotti2, Jeff Nie, Dan Chen, John Postlethwait, Monte Westerfield, Michael Thomas, Michael Carvan, Peter Tonellato
1jmathis@mcw.edu, Medical College of Wisconsin; 2vruotti@mcw.edu, Medical College of Wisconsin
Correspondence address: jmathis@mcw.edu

Radiation hybrid maps, coupled with the mapping of expressed sequence tags and their organization into UniGene clusters, have revolutionized the way comparative maps are built and maintained. We have used publicly available rat, mouse, human, and zebrafish data to build completely integrated comparative maps.

Long abstract


C-2  GET3D, A Genomic Exploration Tool in 3D
John Gill1
1john.gill@monash.edu.au, Victorian Bioinformatics Consortium, Monash University
Correspondence address: john.gill@med.monash.edu.au

The GET3D (Genomic Exploration Tool in 3D) software tool is a complementary product to the CAS (Categorised Annotation Set) system. Through 3D visualization and interaction techniques it allows for the manipulation of the CAS dataset, including the highlighting and exploration of data relationships, and for adding new information and relationships.

Long abstract


C-3  Application of Q-Gene software for the quantitation of Nipah virus in experimental animals using real-time PCR
L.I. Pritchard1, Y. Kaku2, G. Crameri, B.T. Eaton, D.B. Boyle
1ian.pritchard@csiro.au, AAHL CSIRO; 2, National Institute of Animal Health, Tokyo, Japan
Correspondence address: ian.pritchard@csiro.au

Quantitative real-time PCR represents a highly sensitive and powerful technique for the high-throughput analysis of virus load and gene expression. We used the Q-Gene software (Muller et al., 2002) to expedite the statistical analysis, graphical presentation and evaluation of the real-time PCR quantitation of Nipah virus in experimental animals.

Long abstract


C-4  Comparing Patterns in Gene Expression in Longitudinal Array Experiments Using a Novel Algorithm, TAPiR: Time-course Algorithm for Pattern Recognition
Catherine Campbell1, Raj Lingam2, Yang Fann
1campbelc@ninds.nih.gov, NIH-NINDS; 2lingamr@ninds.nih.gov, NIH-NINDS
Correspondence address: campbelc@ninds.nih.gov

TAPiR is a novel algorithm for identifying, clustering and visualizing time-course microarray experiments based on either fold change or t-test p-value thresholds. TAPiR assigns letters to specific patterns of change that can be oriented along the time-course to form “words” that can then be sorted alphabetically to identify similar clusters.
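
The letter-and-word encoding can be illustrated with a minimal sketch. The letters (U, D, N), the comparison of consecutive time points, and the two-fold threshold are assumptions made for illustration; the abstract does not specify TAPiR's actual alphabet or how changes are referenced.

```python
from collections import defaultdict


def pattern_word(values, threshold=2.0):
    """Encode each step of a positive-valued time course as U (up), D (down) or N (no change)."""
    letters = []
    for before, after in zip(values, values[1:]):
        fold = after / before
        if fold >= threshold:
            letters.append("U")
        elif fold <= 1.0 / threshold:
            letters.append("D")
        else:
            letters.append("N")
    return "".join(letters)


def group_by_word(profiles, threshold=2.0):
    """Group gene names by their pattern word, with words in alphabetical order."""
    groups = defaultdict(list)
    for gene, values in profiles.items():
        groups[pattern_word(values, threshold)].append(gene)
    return dict(sorted(groups.items()))
```

Sorting the words alphabetically places genes with the same early behaviour next to each other, which is the property the abstract relies on to identify similar clusters.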

Long abstract


C-5  Correlation analysis as a preprocessing tool in clustering of time-course gene expression data
Christopher Bowman1, Richard Baumgartner2, Stephanie Booth
1Christopher.Bowman@nrc-cnrc.gc.ca, Institute for Biodiagnostics; 2Richard.Baumgartner@nrc-cnrc.gc.ca, Institute for Biodiagnostics
Correspondence address: Christopher.Bowman@nrc-cnrc.gc.ca

We apply correlation analysis, a tool developed for analysis of functional magnetic resonance images, to microarray timecourse experiments. Although fMRI and microarray timecourses have little in common physically, the data analysis tasks are quite similar, and correlation analysis is shown to be a useful preprocessing tool to apply prior to clustering gene expression timecourses.

Long abstract


C-6  Uncovering Hidden Linkages among Disparate Information Sources
Edy S. Liongosari1, Mitu Singh2
1edy.s.liongosari@accenture.com, Accenture Technology Labs; 2mitu.singh@accenture.com, Accenture Technology Labs
Correspondence address: edy.s.liongosari@accenture.com

The Knowledge Discovery Tool (KDT) uses a knowledge modeling approach to intelligently extract and integrate a large set of disconnected bio-medical information. Its unique user interface allows its users to see how the entities are linked together, uncovers hidden linkages and highlights certain unusual links that might be worth exploring.

Long abstract


C-7  Linguistic profiling of genome sequences. The Sequence identifier position end set cardinal: an estimate of linear sequence complexity: algorithms and genome profiling applications.
Christophe Lefevre1
1chris.lefevre@med.monash.edu.au, Victorian Bioinformatics Consortium
Correspondence address: chris.lefevre@med.monash.edu.au

The sequence identifier end set cardinal is proposed as a new estimator of linear sequence linguistic complexity. A scanning window algorithm to compute this value is presented and profiles obtained with genomic sequences are discussed.

Long abstract


C-8  GENOME-WIDE HAPLOTYPE STRUCTURE VISUALIZATION AND ANALYSIS IN MOUSE
Tim Wiltshire1, Serge Batalov2, Mathew Pletcher, R.J.Mural, M.D.Adams, C.F.Fletcher
1timw@gnf.org, GNF; 2batalov@gnf.org, GNF
Correspondence address: timw@gnf.org

SNPview, the interactive navigator for the individual SNPs, SSLPs, alleles and haplotypes projected to the genomic axis, is available on-line at http://www.gnf.org/SNP/. Large but discrete regions of the genome are not very polymorphic between particular strain pairs, and thus cannot easily be interrogated for natural genetic variations influencing QTLs.

Long abstract


C-9  Molecular Modeling using Virtual Reality with Force Feed Back
Hiroshi Mizushima1, Hiroshi Tanaka2, Masaaki Hatsuta, Daisuke Arai, Hiroshi Nagata
1hmizushi@ncc.go.jp, National Cancer Center Research Institute; 2tanaka@tmd.ac.jp, Tokyo Medical Dental University
Correspondence address: hmizushi@ncc.go.jp

We developed a computer aided molecular modeling system using virtual reality technologies. Although it is still a prototype, the most characteristic function of the system is enabling its user to “touch” and “feel” the electrostatic potential field of a protein or a drug molecule.

Long abstract


C-10  BIRCH - A portable and comprehensive bioinformatics platform
Brian Fristensky1
1frist@cc.umanitoba.ca, University of Manitoba
Correspondence address: frist@cc.umanitoba.ca

BIRCH is a resource of integrated programs and databases for molecular biology, unified through the GDE graphic interface. The BIRCH framework is designed for semi-automated installation and customization on Unix systems, and integration of locally-installed software and databases into BIRCH. http://home.cc.umanitoba.ca/~psgendb


Long abstract


C-11  BioViz: Brassica Arabidopsis Comparative Genome Browser - The application of Scalable Vector Graphics to comparative genomics
Christopher T Lewis1, Andrew Sharpe2, Stephen Karcz, Isobel AP Parkin, Derek Lydiate
1LewisCT@agr.gc.ca, Agriculture and Agri-food Canada; 2sharpea@agr.gc.ca, Agriculture and Agri-food Canada
Correspondence address: LewisCT@agr.gc.ca

SVG has enabled a visually appealing application for the visual comparison of Brassica napus and Arabidopsis thaliana. SVG overcomes two key drawbacks of current web-based genome browsers: fixed displays and frequent page reloads. The Brassica Arabidopsis Comparative Genome Browser is available online at http://www.brassica.ca.


Long abstract


C-12  STING MILLENNIUM SUITE v.3 and JAVA PROTEIN DOSSIER: a novel concept in data visualization and analysis of the protein structure/function relationship
Goran Neshich1, Roberto Togawa2, Walter Rocchia, Adauto L. Mancini, Paula R. Kuser, Michel E. B. Yamagishi, Alexandre Alvaro, Christian Baudet and Roberto H. Higa
1neshich@cnptia.embrapa.br, EMBRAPA/CNPTIA; 2togawa@cenargen.embrapa.br, EMBRAPA/CENARGEN
Correspondence address: neshich@cnptia.embrapa.br

STING Millennium (SMS) and Java Protein Dossier (JPD) make a powerful duo for the structural analysis of macromolecules. SMS is a web based set of programs and databases for analysis of protein structures, while JPD provides an abundant collection of physical and chemical descriptors/parameters. SMS/JPD v.3 is available at http://www.cbi.cnptia.embrapa.br

Long abstract


C-13  A Visualization Framework to Assist in the Selection of SNP Markers for Association Studies of Complex Diseases
Francisco M. De La Vega1, Hadar Avi-Itzhak2
1delavefm@appliedbiosystems.com, Applied Biosystems; 2AviitzHI@appliedbiosystems.com, Applied Biosystems
Correspondence address: delavefm@appliedbiosystems.com

We developed a framework to visualize SNPs, haplotype blocks, and genes across chromosomal physical maps and their relationship with linkage disequilibrium (LD) maps. This visualization is aimed at the cost-effective selection of SNP markers for disease association studies as a function of the profile of LD obtained on reference population samples.

Long abstract


C-14  Metabolic Control Analysis of Gene-knockout Escherichia coli Based on the Inverse Flux Analysis with Experimental Verification
Md. Aminul Hoque1, Khandaker Al Zaid Siddiquee2, Kazuyuki Shimizu
1aminul@sfc.keio.ac.jp, Institute for Advanced Bioscience, Keio University, Tsuruoka, 997-0035, Japan; 2, Department of Biochemical Engineering and Science, Kyushu Institute of Technology,
Correspondence address: aminul@sfc.keio.ac.jp

Inverse Flux Analysis (IFA) of Escherichia coli showed that when pyk was knocked out, the flux through ppc and the TCA cycle pathway increased and the acetate production flux decreased, whereas acetate production increased when ppc was knocked out. The experimental data agreed well with the IFA results.

Long abstract



Databases
D-1  GeneView: A Dynamic Gene Annotation System and Its Application to Microarray Data Analysis
Xiang Yao1, Heng Dai2, Bin Tian, David Zhao, Albert Leung, Simon Smith, and Jackson Wan
1xyao@prdus.jnj.com, Johnson and Johnson PRD; 2hdai1@prdus.jnj.com, Johnson and Johnson PRD
Correspondence address: xyao@prdus.jnj.com

We have developed a system that monitors various data sources, dynamically extracts gene information, comprehensively matches genes, and integrates them into a central database by categories, such as pathway, genetic mapping, phenotype, expression profile, domain structure, protein interaction, disease association, and references. The system achieves high performance when querying a large batch of genes together.

Long abstract


D-2  http://elm.eu.org - ELM Resource for prediction of functional sites in proteins
Rune Linding1, ELM Consortium2
1linding@embl.de, EMBL; 2info@elm.eu.org, ELM Consortium
Correspondence address: linding@embl.de

ELM is a resource for predicting functional sites in eukaryotic proteins. Putative functional sites are identified by conventional methods, such as patterns (regular expressions) or hidden Markov models. To improve the predictive power, context-based rules and logical filters will be developed and applied to reduce the number of false positives.
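
Pattern-based site prediction with regular expressions can be illustrated with a short sketch. The motif used here is the classic N-glycosylation consensus N-{P}-[ST]-{P}, chosen only as a familiar example; it is not taken from ELM, and the function name is hypothetical.

```python
import re

# N-glycosylation consensus N-{P}-[ST]-{P} written as a regular expression.
# The lookahead lets overlapping matches be reported.
MOTIF = re.compile(r"(?=(N[^P][ST][^P]))")


def find_sites(sequence):
    """Return (1-based position, matched residues) for every motif occurrence."""
    return [(m.start() + 1, m.group(1)) for m in MOTIF.finditer(sequence)]
```

Short patterns like this match by chance in many proteins, which is why filtering the raw matches by context is needed to reduce false positives.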

Long abstract


D-3  FlyMine: An integrated database for Drosophila and Anopheles genomics
Gos Micklem1, Andrew Varley2, Richard Smith, Rachel Lyne
1gos@gen.cam.ac.uk, University of Cambridge; 2ajv12@cam.ac.uk, University of Cambridge
Correspondence address: ajv12@cam.ac.uk

FlyMine is a project to build an integrated database of genomic, expression and protein data for Drosophila and Anopheles. Data are stored massively redundantly and arbitrary queries are allowed using a web interface or Java API. Queries are re-written in real time to make use of redundant tables.

Long abstract


D-4  InterPro, a protein functional classification resource
Nicola Mulder1, InterPro Consortium2
1mulder@ebi.ac.uk, EBI; 2interpro@ebi.ac.uk, EBI
Correspondence address: mulder@ebi.ac.uk

InterPro is an integrated protein signature resource for predicting protein families, domains and functional sites. It incorporates data from PROSITE, Pfam, PRINTS, ProDom, SMART, TIGRFAMs, PIR Superfamilies and the structure-based SUPERFAMILY. InterPro classifies 80% of SWISS-PROT/TREMBL and provides links to the proteins, methods, specialised protein family resources and structural information.

Long abstract


D-5  dbZach: An Integrative Toxicogenomic Supportive Relational Database System
Lyle D Burgoon1, Paul C Boutros2, Edward Dere, Shane Doran, Shraddha Pai, Raeka Aiyar, Jigger Vakharia, Rebecca Rotman, Tim Zacharewski
1burgoonl@msu.edu, Michigan State University; 2 Michigan State University
Correspondence address: burgoonl@msu.edu

The dbZach System is a microarray information management system meant for local installation in labs or larger, enterprise environments. The dbZach database includes four core subsystems, and six modular subsystems. The system also includes GUI data mining tools, and an API. Source code will be available soon at: http://dbzach.fst.msu.edu.

Long abstract


D-6  MyMED - An Internal XML Relational Database Implementation of MEDLINE Citations
K Lewis1, CW Hogue2
1lewis@mshri.on.ca, Samuel Lunenfeld Research Institute, Mt Sinai Hospital, Dept of Biochemistry, University of Toronto; 2hogue@mshri.on.ca, Samuel Lunenfeld Research Institute, Mt Sinai Hospital, Dept of Biochemistry, University of Toronto
Correspondence address: lewis@mshri.on.ca

MyMED is an internal relational XML database implementation of MEDLINE citations. The MyMED database is necessary to execute text mining algorithms and complex text searches in a fast, secure manner. Data is stored in a DB2 database that is enabled for the XML Extender and Text Information Extender.

Long abstract


D-7  RAT GENOME DATABASE - RGD - DISEASE ORIENTED RESEARCH RESOURCE
Aubrey Hughes1, Jedediah Mathis2, Mary Shimoyama, Milton Datta, Simon Twigger, Charles W.Wang, Nataliya Nenasheva, Dean Pasko, Norberto de la Cruz, Victor Ruotti, Susan Bromberg, Chin-Fu Chen, Rajni Nigam, Gopal Gopinathrao, Angela Zuniga-Myer and Peter Tonellato
1ahughes@mcw.edu, Medical College of Wisconsin; 2jmathis@mcw.edu, Medical College of Wisconsin
Correspondence address: ahughes@mcw.edu

The Rat Genome Database (RGD) provides research groups access to rat genomic and genetic data, including annotated sequence, related to particular diseases. The Disease Oriented Research Resource (DORR) will help RGD answer the need of the rat community for access to curated data of interest. RGD is available at rgd.mcw.edu.

Long abstract


D-8  RAT GENOME DATABASE - RGD - MAPPING DISEASE ONTO THE GENOME
Aubrey Hughes1, Jedediah Mathis2, Mary Shimoyama, Norberto de la Cruz, Charles W. Wang, Nataliya Nenasheva, Dean Pasko, Jiali Chen, Lan Zhao, Chunyu Fan, Wenhua Wu, Chin-Fu Chen, Rajni Nigam, Gopal Gopinathrao, Angela Zuniga-Meyer, Susan Bromberg, Jessica Ginster, Anne Kwitek-Black, Janan Eppig, Lois Maltais, Donna Maglott, Greg Schuler, Simon Twigger, Howard Jacob and Peter Tonellato
1ahughes@mcw.edu, Medical College of Wisconsin; 2jmathis@mcw.edu, Medical College of Wisconsin
Correspondence address: ahughes@mcw.edu

The Rat Genome Database (RGD) is a disease-centric resource of comprehensively curated data from rat genetic and genomic research. RGD's goal is to provide information that will aid researchers in using the rat as a model organism for human disease studies. RGD is available at rgd.mcw.edu.

Long abstract


D-9  Submission tools for EMBL-Bank, EMBL-Align and SWISS-PROT databases
Vincent Lombard1, Mary Ann Tuli, Robert Vaughan, Minna Lehvaslaiho, Weimin Zhu and Rolf Apweiler
1lombard@ebi.ac.uk, EBI
Correspondence address: lombard@ebi.ac.uk

Webin, Webin-Align and SPIN are the web-based tools for submitting nucleotide sequences, nucleotide sequence alignments and protein sequences respectively to EMBL-Bank, EMBL-Align, or SWISS-PROT databases. These tools guide you through a sequence of WWW forms allowing interactive submission. All the information required to create a database entry is collected during this process. All submission tools are available at http://www.ebi.ac.uk/Submissions/index.html


Long abstract


D-10  UniProt: Universal Protein Databases for Protein Sequences and Function
Allyson Williams1, Maria Jesus Martin2, Claire O'Donovan, Daniel Barrell, Alexander Fedotov, Rolf Apweiler
1allyson@ebi.ac.uk, EMBL - EBI; 2martin@ebi.ac.uk, EMBL - EBI
Correspondence address: allyson@ebi.ac.uk

The UniProt Consortium (European Bioinformatics Institute, Swiss Institute of Bioinformatics, and the Protein Information Resource) was created to merge Swiss-Prot, TrEMBL and PIR database activities into UniProt, a comprehensive resource of protein sequences and function. UniProt has three layers: protein sequence archive, protein knowledgebase, and non-redundant reference (NREF) databases.

Long abstract


D-11  PRIME: automatically extracted PRotein Interactions and Molecular Information databasE.
Asako Koike1, Yoshiyuki Kobayashi2, Toshihisa Takagi
1akoike@ims.u-tokyo.ac.jp, Human Genome Center, The Institute of Medical Science, Univ. of Tokyo; 2yashi@ls.hitachi.co.jp, Life Science Group, Hitachi, Ltd.
Correspondence address: akoike@ims.u-tokyo.ac.jp

PRIME (http://prime.ontology.ims.u-tokyo.ac.jp/) is an integrated database involving major completely sequenced eukaryotes. It contains the protein-protein/gene/compound interaction data extracted by natural language processing, domain information, structural information, protein kinase classification, and ortholog tables among organisms. The comparison and prediction of pathways are also available through an automatic pathway graphic image interface.

Long abstract


D-12  Automating Data Collection And Categorisation Using The CAS Software
John Gill1
1john.gill@monash.edu.au, Victorian Bioinformatics Consortium, Monash University
Correspondence address: john.gill@med.monash.edu.au

CAS is a data integration system that allows for the creation of categories and integrated data sets. Each category consists of a set of attributes and methods, which act as an object item. Incoming data, obtained manually or through the automated data collection component, is linked to category attributes according to user defined rules and conditions.

Long abstract


D-13  Building a Database of Protein Structure Using a Geographic Model based on Topological Consistency
Sung-Hee Park1, Keun Ho Ryu2, Byeong-Jin Jeong, Hyeon S. Son
1shpark@dblab.chungbuk.ac.kr, Chungbuk National University; 2khryu@dblab.chungbuk.ac.kr, Chungbuk National University
Correspondence address: shpark@dblab.chungbuk.ac.kr

We propose protein structure modeling using a geographic model and build a structure database which includes thematic information and geometry of protein. In the modeling, geometry is represented by spatial types and thematic information includes the physico-chemical data. We state queries to retrieve topological relationship between structural elements with spatial operators.

Long abstract


D-14  Version Management of a Genomic Sequence Database Using Active Rules and Temporal Concepts
Sung-Hee Park1, Keun Ho Ryu2, Byeong-Jin Jeong, Hyeon S. Son
1shpark@dblab.chungbuk.ac.kr, Chungbuk National University; 2khryu@dblab.chungbuk.ac.kr, Chungbuk National University
Correspondence address: shpark@dblab.chungbuk.ac.kr

We propose to model sequence versions, which record changes to the same piece of DNA, using a time stamp attribute in a temporal data model. Sequence versions are managed in the sequence database by applying trigger rules (Event-Condition-Action) in an active database system.
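
An Event-Condition-Action rule combined with time stamps can be sketched with an SQLite trigger. The table names, column names and helper functions below are hypothetical, and SQLite stands in for whatever active database system the authors use.

```python
import sqlite3

SCHEMA = """
CREATE TABLE sequence (
    accession  TEXT PRIMARY KEY,
    residues   TEXT NOT NULL,
    valid_from TEXT NOT NULL
);
CREATE TABLE sequence_history (
    accession  TEXT NOT NULL,
    residues   TEXT NOT NULL,
    valid_from TEXT NOT NULL,
    valid_to   TEXT NOT NULL
);
CREATE TRIGGER archive_old_version
AFTER UPDATE OF residues ON sequence      -- event: the sequence is updated
WHEN OLD.residues <> NEW.residues         -- condition: the residues really changed
BEGIN                                     -- action: archive the superseded version
    INSERT INTO sequence_history
    VALUES (OLD.accession, OLD.residues, OLD.valid_from, NEW.valid_from);
END;
"""


def open_db():
    """Create an in-memory database with the versioning trigger installed."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def update_sequence(conn, accession, residues, timestamp):
    """Store a new version; the trigger archives the old one with its validity interval."""
    conn.execute(
        "UPDATE sequence SET residues = ?, valid_from = ? "
        "WHERE accession = ? AND residues <> ?",
        (residues, timestamp, accession, residues),
    )
```

The current version always lives in `sequence`, while `sequence_history` holds every earlier version together with the interval during which it was valid.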

Long abstract


D-15  GCC: a database system for immune cells transcriptomes
Andrea Splendiani1, C.Vizzardelli, N.Pavelka, M.Pelizzola, M.Capozzoli, O.Beretta, F.Granucci, P.Ricciardi-Castagnoli
1andrea.splendiani@unimib.it, Univ. Milano Bicocca
Correspondence address: andrea.splendiani@unimib.it

GCC is a database for immune cell transcriptomes. It is based on a database system (built using open source technologies) that allows for data storage and intelligent retrieval. It is targeted at the Affymetrix platform, allows for MIAME-compliant experimental annotation and supports the adoption of ontologies.

Long abstract


D-16  dbSTR: A Database for Short Tandem Repeats
Haifeng Liu1, Loo Nin Teo2, Eric Yap, Linda Gan, Hui Min Wu, Sock Hoon Ng, Adrian Eng, Loo See Teo, Keng Wah Chao
1lhaifeng@dso.org.sg, DSO National Laboratories, Singapore; 2tloonin@dso.org.sg, DSO National Laboratories, Singapore
Correspondence address: lhaifeng@dso.org.sg

dbSTR is a repository of short tandem repeats (microsatellites) whose polymorphisms have either been predicted using machine learning or verified using wet-lab methods. These STRs could be useful markers for high resolution linkage and association studies. dbSTR is freely available at http://www.dbstr.org.

Long abstract


D-17  Integration and representation of heterogeneous metabolic databases for the analysis of metabolism: BIOSILICO
Jin Sik Kim1, Ji Hoon Jun2, Yong Wook Kim, Sujin Chae, Mira Roh, Yong-Ho In and Sang Yup Lee
1jskim@mail.kaist.ac.kr, KAIST; 2gene2@bioinfomatix.com, Bioinfomatix Inc.
Correspondence address: jskim@mail.kaist.ac.kr

BIOSILICO is a web-based database system that facilitates the search and analysis of metabolic pathways. BIOSILICO allows efficient retrieval of all available information on enzymes, compounds, reactions and pathways by integrating the heterogeneous metabolic databases and generates well-designed view pages showing retrieved data in a systematic way for easy understanding.

Long abstract


D-18  GlycoSuiteDB: A curated relational database of glycoprotein glycan structures
Hiren J. Joshi1, Sarah Jarvis2, Jonathan W Arthur, Mathew J. Harrison, Marc R. Wilkins, Nicolle H. Packer, Catherine A. Cooper
1hirenj@proteomesystems.com, Proteome Systems; 2sjarvis@proteomesystems.com, Proteome Systems
Correspondence address: Jonathan.Arthur@proteomesystems.com

GlycoSuiteDB is a relational database of published glycan structures designed to assist researchers in the analysis of glycans. GlycoSuiteDB can be accessed from http://www.glycosuite.com

Long abstract


D-19  SNP PrimerPicker
Yip-Kuen Lau1, Ching-Fun Lau2, Henry Yiu-Hang Fu, Hong Xue
1henryfu@ust.hk, Applied Genomics Center, Hong Kong Bioinformatics Center, ParmacoGenetics Ltd, Department of Biochemistry, Hong Kong University of Science and Technology; 2carolau@ust.hk, Applied Genomics Center, Hong Kong Bioinformatics Center, ParmacoGenetics Ltd, Department of Biochemistry, Hong Kong University of Science and Technology
Correspondence address: hxue@ust.hk

SNP PrimerPicker is a software system for designing primers. Sequences in record are aligned with the source using bl2seq. It utilizes Primer3 to find primers and blastcl3 to compare similarity with the chromosomes. The components are integrated to make the procedure convenient. It is web accessible at http://bcz099.ust.hk/primerpicker/.

Long abstract


D-20  Melbourne Brain Genome Project
Seong-Seng Tan1, Lavinia Hyde2, Masters C, Gunnersen J, Kenshole B, Job C, Augustine C, Boon W-M, Brown M, Scott HS
1s.tan@hfi.unimelb.edu.au, Howard Florey Institute of Experimental Medicine and Physiology; 2hyde@wehi.edu.au, Walter and Eliza Hall Institute
Correspondence address: hyde@wehi.edu.au

The Melbourne Brain Genome Project is an Internet resource for studying gene expression as measured by serial analysis of gene expression, in both normal mice and specific mouse models. These models mimic neurodegenerative human diseases. The resource includes tools developed to analyse this data and is available at http://www.mbgproject.org.

Long abstract


D-22  Assembly as part of a DNA sequence management system
Zmasek, C. M.1, Lapp, H.2, Ching, K.; Wiltshire, T.; Fletcher, C.; Orth, A.
1czmasek@gnf.org, Genomics Institute of the Novartis Research Foundation; 2hlapp@gnf.org, Genomics Institute of the Novartis Research Foundation
Correspondence address: czmasek@gnf.org

We describe a tool to manage assembly processes. The system has two main advantages: [i] It is based on "assembly projects", which contain all the relevant data of an assembly. This allows for re-assembly at a later point with, for example, additional input sequences. [ii] All user interaction is through a GUI.

Long abstract


D-23  Yeast Protein Interactomes: The Novel Platform and Value-Added Database
Chung-Yen Lin1, Chi-Shang Cho2, Chen-Zen Lo, Chao A. Hsiung
1cylin@nhri.org.tw, National Health Research Institutes; 2vecstar@nhri.org.tw, National Health Research Institutes
Correspondence address: cylin@nhri.org.tw

Using PHP, MySQL and Linux, we constructed a database of S. cerevisiae (15,000 entries) and putative C. elegans protein interactions. The database can help to annotate novel proteins through their interacting partners, and can also provide suitable candidates to narrow down the scale of further high-throughput screening experiments.

Long abstract


D-24  Free public services from the European Bioinformatics Institute, the European Molecular Biology Laboratory outstation.
Nicola Harte1, Rodrigo Lopez2, Karyn Duggan, Rob Harper, Asif Kibria, Adam Lowe, Gulam Patel, Sharmila Pillai, Emmanuel Quevillon, Stephen Robinson, Ville Silventoinen
1nharte@ebi.ac.uk, EMBL-EBI; 2rls@ebi.ac.uk, EMBL-EBI
Correspondence address: nharte@ebi.ac.uk

The EBI provides free, publicly available bioinformatics services for the scientific community. These can be divided up into the following categories: data submissions processing, biological database production, access to query, analysis and retrieval systems, ftp downloads, training and education and user support. These services are available at: http://www.ebi.ac.uk/services.

Long abstract


D-25  KEGG API: A new web service for accessing the KEGG database
Shuichi Kawashima1, Toshiaki Katayama2, Yoko Sato, Minoru Kanehisa
1shuichi@kuicr.kyoto-u.ac.jp, Bioinformatics Center, Institute for Chemical Research, Kyoto University; 2k@bioruby.org, Bioinformatics Center, Institute for Chemical Research, Kyoto University
Correspondence address: shuichi@kuicr.kyoto-u.ac.jp

KEGG API is a new web service for accessing the KEGG database. Using the APIs in the local program, the user can retrieve various information about genes, pathways, chemical compounds etc. stored in the latest versions of the KEGG database. KEGG API is available at http://www.genome.ad.jp/kegg/soap/.


Long abstract


D-26  A technology for integration of databases with common subject domains
Maria Samsonova1, Andrei Pisarev2, Maxim Blagov
1samson@spbcas.ru, SPbSPU; 2pisarev@spbcas.ru, SPbSPU
Correspondence address: samson@spbcas.ru

We present a novel approach to the integration of distributed molecular biology information resources, which consists in a design of an adaptive natural language interface and application of multiagent technology. Our approach permits integration of any databases which have a common subject domain. The implemented prototype is available at http://urchin.spbcas.ru/NLP/NLP.htm.

Long abstract


D-27  JPIPE, A pipeline module for JEMBOSS
Alex Garcia1, Leyla J. Garcia2, Mark A. Ragan, Yi-Ping Phoebe Chen
1a.Garcia@imb.uq.edu.au, Institute for Molecular Bioscience; 2leyla.garcia@unisabana.edu.co, U. de la Sabana
Correspondence address: a.garcia@imb.uq.edu.au

We present a module (JPIPE) that allows EMBOSS users to build analysis pipelines under the JEMBOSS GUI. JPIPE is a flexible workflow system that complements JEMBOSS. A tracking system is a part of JPIPE so the user is able to recreate pipelines, compare results at a particular point of the workflow and administer ongoing jobs.

Long abstract


D-28  An integrative searchable database for Bioinformatics tools, algorithms and software: pBIRD
Sumeet Muju1, Catherine Campbell2, KaiIng Chow Yang Fann
1mujus@ninds.nih.gov, NIH-NINDS; 2campbelc@ninds.nih.gov, NIH-NINDS
Correspondence address: mujus@ninds.nih.gov

pBIRD is a fully-searchable web based system designed to identify, catalog and describe a broad range of available bioinformatics software. The system is designed to upload, store, track and retrieve information associated with and about bioinformatics software products deposited in the database from both internal and external sources.

Long abstract


D-29  A genetic polymorphism object model and XML implementation: Biological Variation Markup Language.
Greg Tyrelle1, Garry C. King2
1greg@kinglab.unsw.edu.au, UNSW; 2garry@kinglab.unsw.edu.au, UNSW
Correspondence address: greg@kinglab.unsw.edu.au

As molecular genotyping technologies accelerate there is an increasing need to communicate precise information on polymorphism data in machine-readable format. We have developed a hierarchical object model and XML implementation called Biological Variation Markup Language (BVML) to facilitate exchange between genotyping laboratories and distributed databases.

Long abstract


D-30  updateBASE : Real-time automatic updating system of biological databases under the client-server environment.
Sujin Chae1, Mira Roh2, Ji-Hoon Jun, Geunwoo Lee, Yong-ho In
1sujin@bioinfomatix.com, Bioinfomatix Inc.; 2mrroh@bioinfomatix.com, Bioinfomatix Inc.
Correspondence address: sujin@bioinfomatix.com

We developed the updateBASE system which provides real-time, automatic updating of biological databases under the client-server architecture. Using this system, an annotator can get up-to-date database sources and get more confident annotation results. The updateBASE will play an effective supporting role in predicting functions and relationships of unknown sequences.

Long abstract


D-31  CleanBank: a database of sequence artifacts
Hanne Volpin1, Eitan Rubin2
1hanne@agri.gov.il, Bioinformatics, Agricultural Research Organization, Bet Dagan, Israel; 2Eitan.Rubin@weizmann.ac.il, Bioinformatics and Biological Computing, Weizmann Institute of Science, Rehovot, Israel
Correspondence address: Eitan.Rubin@weizmann.ac.il

CleanBank is a database that documents suspected artifacts found in sequences and/or their annotation in the international sequence databases. The artifacts are either reported by researchers, or identified by curated algorithms. Current algorithms detect E. coli and vector contamination. For a detailed description and a preview, see http://bip.weizmann.ac.il/MIW/CleanBank/index.html

Long abstract


D-32  Optimizing Genome Interval Overlap Queries Using an R-Tree Index
Hilmar Lapp1, Chris Mungall2, Scott Cain, Lincoln Stein
1hlapp@gnf.org, GNF; 2cjm@fruitfly.org, University of California, Berkeley
Correspondence address: hlapp@gnf.org

We present a solution to the huge variance problem that has plagued B-tree supported genome interval overlap queries. Our approach is based on translating the overlap query into a two-dimensional point-in-box geometric query supported by an R-tree index.
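The translation into a point-in-box query can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the exact mapping, so the sketch assumes closed intervals and the common choice of mapping an interval [start, end] to the point (start, end). A linear scan stands in for the R-tree index, and all function names are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # closed interval (start, end) with start <= end

def to_point(iv: Interval) -> Tuple[int, int]:
    """Map an interval to a 2-D point: x = start, y = end."""
    return (iv[0], iv[1])

def query_box(q: Interval, lo: int, hi: int):
    """Box containing exactly the points of intervals that overlap q.

    An interval (s, e) overlaps q = (qs, qe) iff s <= qe and e >= qs,
    i.e. x lies in [lo, qe] and y lies in [qs, hi], where lo and hi
    bound the coordinate space (for example, 0 and the chromosome length).
    """
    qs, qe = q
    return ((lo, qe), (qs, hi))

def point_in_box(p, box) -> bool:
    (xmin, xmax), (ymin, ymax) = box
    return xmin <= p[0] <= xmax and ymin <= p[1] <= ymax

def overlap_query(intervals: List[Interval], q: Interval,
                  lo: int = 0, hi: int = 10**9) -> List[Interval]:
    """Return intervals overlapping q via the point-in-box test.

    The list comprehension is a placeholder: a spatial index such as an
    R-tree would answer the same box query without scanning every point.
    """
    box = query_box(q, lo, hi)
    return [iv for iv in intervals if point_in_box(to_point(iv), box)]

def overlaps_direct(a: Interval, q: Interval) -> bool:
    """Reference definition of overlap, used to check the translation."""
    return a[0] <= q[1] and a[1] >= q[0]
```

For example, `overlap_query([(100, 200), (150, 400), (500, 600), (50, 99)], (180, 300))` selects the first two features, matching the direct overlap test.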

Long abstract


D-33  A Pathway DB: Annotating Signal Transduction Pathways with bio-processes using hierarchical multi-layered structures.
Ken Ichiro Fukuda1, Yuki Yamagata2, Toshihisa Takagi
1fukuda-cbrc@aist.go.jp, CBRC, AIST; 2snowfox@hgc.jp, BIRD, JST
Correspondence address: fukuda-cbrc@aist.go.jp

We present a database that formalizes Signal Transduction Pathway knowledge from the scientific literature. The database focuses on annotating pathways or sub-pathways according to their related biological processes. Every process and element in a pathway has a pointer to ontologies, such as GO, and these pointers can be used to search for (sub-)pathways and molecules.

Long abstract


D-34  ANTIMIC: A database of antimicrobial peptides
Manisha Brahmachary1, Judice L.Y. Koh, Mohammad Asif Khan, Seah Seng Hong, Tin Wee Tan, Vladimir Bajic
1manisha@lit.org.sg, Institute of Infocomm Research
Correspondence address: manisha@lit.org.sg

ANTIMIC is a specialized database dedicated to antimicrobial peptides. It contains analysis tools that can help wet-lab scientists to determine the family and function of a putative new antimicrobial peptide and to design artificial antimicrobial peptides.

Long abstract


D-35  OrthoDisease: A Human Disease Ortholog Database
Kevin O Brien1, Isabelle Westerlund, Erik Sonnhammer
1kevobr@mbox.ki.se, Karolinska Institutet
Correspondence address: kevobr@mbox.ki.se

We report the construction of a novel database termed OrthoDisease, which was constructed using the Inparanoid program to analyze a list of disease genes derived from the Mendelian Inheritance in Man database. Our database is accessible online at orthodisease.cgb.ki.se and can be searched according to disease/gene/protein name or EC/MIM number.

Long abstract


D-36  BASE - a free microarray database system
Carl Troein1, Johan Vallon-Christersson2, Lao Saal, Jari Häkkinen
1carl@thep.lu.se, Dept. of Theor. Phys., Lund University; 2johan.vallon-christersson@onk.lu.se, Dept. of Oncology, Lund University
Correspondence address: carl@thep.lu.se

BASE is a free microarray database system with a clean and intuitive web interface. It manages biomaterials, array production and raw data with images. Analysis tools are included, and users can provide new tools through a plugin interface. The BASE web site is http://base.thep.lu.se/.

Long abstract


D-37  MARS: Mutation Analysis Reporting System for Human Genetic Disease
Byeong-Chul Kang1, Jun-Hyung Park2, In-Joo Kim, Hyo-Myung Kim, Hee-Kyung Park, and Cheol-Min Kim
1bckang@pusan.ac.kr, Interdisciplinary Program of Bioinformatics, Graduate School, Pusan National University; 2jhaprk98@pusan.ac.kr, Busan Genome Center, College of Medicine, Pusan National University
Correspondence address: bckang@pusan.ac.kr

MARS is an intelligent diagnosis system for human genetic disease. MARS consists of databases of human genetic disease information and a mutation detection system. The first release of MARS contains genetic information on the MECP2 gene, and its website is available at http://www.genome.re.kr/mars/

Long abstract


D-38  SDPS: Small Disulphide-bonded Proteins Structural database
Lesheng Kong1, Shoba Ranganathan2
1lesheng@bic.nus.edu.sg, National University of Singapore; 2shoba@bic.nus.edu.sg, National University of Singapore
Correspondence address: lesheng@bic.nus.edu.sg

The SDPS database is a comprehensive structural database of small disulphide-bonded proteins. It is enriched with a number of new features which cannot be easily accessed through public databases. The database aims to facilitate research on small disulphide-bonded proteins, especially on disulphide connectivity features. The SDPS database can be accessed freely at http://origin.bic.nus.edu.sg/sdps.

Long abstract


D-39  BioPAX - Biological Pathway Data Exchange Format
BioPAX Group1
1pax@cbio.mskcc.org, BioPAX
Correspondence address: jluciano@biopathways.org

BioPAX (http://biopax.org) is a new community-based initiative to address the growing need for a unified framework for sharing pathway information.  Several groups are participating in BioPAX to develop a data exchange format that will allow communication between existing pathway databases and facilitate deposition of data into a common public repository.


Long abstract


D-40  Mouse Genome Informatics: Integration Nexus for Mammalian Biology
B Sinclair1, JA Blake2, M Ringwald, CJ Bult, JA Kadin, JE Richardson, JT Eppig, Mouse Genome Informatics Group
1bobs@informatics.jax.org, Jackson Laboratory; 2jblake@informatics.jax.org, Jackson Laboratory
Correspondence address: jblake@informatics.jax.org

The Mouse Genome Informatics (MGI) databases provide access to comprehensive, integrated, experimental data for the laboratory mouse in the domains of sequence, expression, gene function (GO), molecular variation, phenotype, inbred strain characterization, homology, and tumor biology. MGI provides the definitive mouse gene index. MGI can be accessed at http://www.informatics.jax.org/.

Long abstract


D-41  An Effective Query Method for DNA Sequence
Jiyuan An1, Yi-Ping Phoebe Chen2
1j.an@qut.edu.au, Queensland University of Technology; 2p.chen@qut.edu.au, Queensland University of Technology
Correspondence address: j.an@qut.edu.au

Edit distance is the typical measure used when searching for similar DNA sequences. In some cases, however, time warping distance is more appropriate for measuring the similarity between two DNA sequences. In this paper we propose a query method based on time warping distance, applied to coded DNA sequences.
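To make the distinction from edit distance concrete, here is a minimal sketch of a time warping distance on numerically coded DNA. The coding used (A=0, C=1, G=2, T=3) and the absolute-difference cost are illustrative assumptions only; the abstract does not state which coding or cost the authors use, and the function names are hypothetical.

```python
# Hypothetical numeric coding of nucleotides; the order is arbitrary.
DEFAULT_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}

def encode(seq, code=None):
    """Turn a DNA string into a list of numbers using the given coding."""
    code = DEFAULT_CODE if code is None else code
    return [code[base] for base in seq.upper()]

def time_warping_distance(a, b):
    """Classic dynamic-programming time warping distance.

    Each element of one sequence may be aligned with a run of elements
    of the other, so stretched or compressed regions cost nothing extra.
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    inf = float("inf")
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j],      # a[i-1] repeats against b
                                 d[i][j - 1],      # b[j-1] repeats against a
                                 d[i - 1][j - 1])  # both advance
    return d[n][m]
```

Under this coding, "ACG" and "AACCGG" have a time warping distance of zero because one is a stretched copy of the other, whereas their edit distance is three insertions.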

Long abstract


D-42  Generating Database Technologies and Simulations for Branching Structure Applications
Yi-Ping Phoebe Chen1
1p.chen@qut.edu.au, Queensland University of Technology
Correspondence address: p.chen@qut.edu.au

This research will investigate the ways in which biologists analyse data in their plant studies, and how these requirements may be expressed through a visual query language that allows researchers to directly address the plant characteristics that are of interest.

Long abstract


D-43  A dimensional data warehouse for biological data
Tore Eriksson1, Katsuki Tsuritani2
1tore.eriksson@po.rd.taisho.co.jp, Taisho Pharmaceuticals, Co., Ltd.; 2k.tsuritani@po.rd.taisho.co.jp, Taisho Pharmaceuticals, Co., Ltd.
Correspondence address: tore.eriksson@po.rd.taisho.co.jp

Data warehousing using dimensional modeling and star schema data marts was applied to biological information. The data marts capture biological relationships such as gene--protein, as well as numerical data at the intersection between dimensions, for example expression data.

A further focus was on building automated extraction and transformation for a wide range of public data sources.


Long abstract


D-44  SeqHound: biological sequence and structure database as a platform for bioinformatics research
Katerina Michalickova1, Hao Lieu2, Gary D. Bader, Michel Dumontier, Doron Betel, Ruth Isserlin, Christopher W.V. Hogue
1katerina@mshri.on.ca, Samuel Lunenfeld Research Institute and Department of Biochemistry, University of Toronto; 2lieu@mshri.on.ca, Samuel Lunenfeld Research Institute
Correspondence address: katerina@mshri.on.ca

SeqHound is a resource containing daily updated Entrez databases and 3-D structural data. It holds links to similar sequences, taxonomy, complete genomes, functional annotation, structural domains and literature. SeqHound is accessible directly through a C API and via a web server through PERL, Bioperl, C or C++ APIs.

Long abstract


D-45  GENA - Genomics Array Database
Gavin Kennedy1
1gavin.kennedy@csiro.au, CSIRO
Correspondence address: gavin.kennedy@csiro.au

The GENA Genomics Array database stores data generated by the microarray process as well as information describing the experimental conditions. GENA provides structured queries to extract meaningful data that supports comparative analysis of gene expression ratios. GENA is unique in its capacity to mine gene expression data from several perspectives.

Long abstract


D-46  ScriptSure: A Non Redundant View of the Human Transcriptome
Jarret Glasscock1, Warren Gish2
1jglassco@sapiens.wustl.edu, Washington University; 2gish@watson.wustl.edu, Washington University
Correspondence address: jglassco@sapiens.wustl.edu

The goal of the ScriptSure project is to create a database that gives an accurate, comprehensive representation of the human transcriptome. ScriptSure provides a non-redundant representation of the transcript data, provides high quality (genomic) sequence, and alleviates problems associated with current approaches to representing transcript data. http://sapiens.wustl.edu/ScriptSure

Long abstract


D-47  Designing XML and XML Schemas for Bioinformatics using UML
Philip Burton1, Russel Bruhn2
1pjburton@ualr.edu, University of Arkansas at Little Rock; 2rebruhn@ualr.edu, University of Arkansas at Little Rock
Correspondence address: pjburton@ualr.edu

The Unified Modeling Language (UML) can be used to display Bioinformatic data objects and their relationships graphically. The first step in the process is done at the conceptual level, allowing domain experts like biologists to participate. In this paper, we sketch the process of creating an XML document from scratch.

Long abstract



Functional Genomics
F-1  Bioinformatics Tools to Support siRNA Technology
Fran Lewitter1, Bingbing Yuan2, Markus Hossbach, Thomas Tuschl, George Bell, Robert Latek
1lewitter@wi.mit.edu, Biocomputing Group, Whitehead Institute, Cambridge MA; 2yuan@wi.mit.edu, Biocomputing Group, Whitehead Institute, Cambridge MA
Correspondence address: lewitter@wi.mit.edu

We have built a first-generation computational tool for siRNA selection (http://jura.wi.mit.edu/bioc/siRNA) which implements sophisticated selection algorithms to identify siRNAs with a high probability of specifically silencing the target gene. We have also designed a prototype database called sirBank to be a repository for siRNA molecules known to silence target genes.

Long abstract


F-2  Cancer-Specific Alternative Splicing is prevalent in the Human Genome
Qiang Xu1, Christopher Lee2
1qxu@chem.ucla.edu, Molecular Biology Institute, Department of Chemistry and Biochemistry, UCLA; 2leec@mbi.ucla.edu, Molecular Biology Institute, Department of Chemistry and Biochemistry, UCLA
Correspondence address: qxu@chem.ucla.edu

We found strong evidence (p<0.01) of cancer-specific splice variants in 316 human genes through a genome-wide analysis of human expressed sequences. The majority of these genes have functions associated with cancer. For a large number of cancer-associated genes, it appears to be the normal form, rather than the cancer form, that was previously uncharacterized.

Long abstract


F-3  Identification of Novel Two-partner Secretion Family in Burkholderia pseudomallei
Annapoorna Nimaggadda1, Sheila Nathan2, Rahmah Mohamed
1anulins@yahoo.com, Universiti Kebangsaan Malaysia; 2sheila@pkrisc.cc.ukm.my, Universiti Kebangsaan Malaysia
Correspondence address: sheila@pkrisc.cc.ukm.my

Filamentous hemagglutinin belongs to the Two Partner Secretion (TPS) family. Sequence analysis indicated the presence of filamentous hemagglutinin (FhaB) and its transporter (FhaC) in an operon in Burkholderia pseudomallei. Motif recognition and phylogenetic analysis using N-J and PAM matrix methods provide more information on the functionality of the operon.

Long abstract


F-5  BRIDGE - Building a Bioinformatics Resource for the Integration of heterogeneous Data from Genomic Explorations into a platform for Systems Biology
Alexander Goesmann1, Folker Meyer2, D. Bartels, L. Krause, B. Linke, O. Rupp, A. Pühler
1Alexander.Goesmann@Genetik.Uni-Bielefeld.DE, Center for Genome Research, Bielefeld University; 2fm@Genetik.Uni-Bielefeld.DE, Center for Genome Research, Bielefeld University
Correspondence address: Alexander.Goesmann@Genetik.Uni-Bielefeld.DE

We describe our concept for the integration of heterogeneous data into a platform for systems biology. We have implemented a Bioinformatics Resource for the Integration of heterogeneous Data from Genomic Explorations (BRIDGE) and illustrate the usability of our approach as a platform for systems biology with two sample applications.

Long abstract


F-6  A Reconstruction Algorithm from Expression Data for Sparse Noninteracting Gene Networks
Ilaria Mogno1, Lorenzo Farina2, Salvatore Monaco
1mogno@dis.uniroma1.it, DIS, Universita di Roma La Sapienza; 2lorenzo.farina@uniroma1.it, DIS, Universita di Roma La Sapienza
Correspondence address: mogno@dis.uniroma1.it

We propose an algorithm that reconstructs gene networks from expression data. It addresses the problem of the "small" amount of available data by assuming some reasonable, biologically consistent hypotheses. We also evaluate the algorithm's performance on artificial problems.

Long abstract


F-7  Extraction of Pathways Involved in Microarray Time Course Experiments
Christine Steinhoff1, Tobias Mueller2, Hannes Luz, Martin Vingron
1christine.steinhoff@molgen.mpg.de, Max Planck Institute; 2Tobias.Mueller@biozentrum.uni-wuerzburg.de, Biocenter, University Würzburg
Correspondence address: christine.steinhoff@molgen.mpg.de

We present a procedure that integrates knowledge of multiple biological databases to recover potentially involved pathways from time course microarray experiments. Starting with a refined new clustering algorithm we group similarly behaving genes, followed by an integrated analysis of common transcription factor binding patterns, functional categories and biologically verified pathways.

Long abstract


F-8  Computational Discovery of Gene Modules and Regulatory Networks
Georg K. Gerber1, Ziv Bar-Joseph2, Tong Ihn Lee, François Robert, D. Benjamin Gordon, Ernest Fraenkel, Itamar Simon, Tommi S. Jaakkola, Richard A. Young, David K. Gifford
1georg@mit.edu, Massachusetts Institute of Technology, Laboratory for Computer Science; 2georg@mit.edu, Massachusetts Institute of Technology, Laboratory for Computer Science
Correspondence address: georg@mit.edu

We present an algorithm for combining genome-wide expression and protein-DNA binding data to discover co-regulated modules of genes and associated regulatory networks. Our algorithm operates on discovered networks to label transcription factors as activators or repressors, identify patterns of combinatorial regulation, and uncover sub-networks for biological processes in Saccharomyces cerevisiae.

Long abstract


F-9  Non-conserved alternative splicing of human and mouse genes
I. Artamonova1, M. Gelfand2, A. Mironov, R. Nurtdinov.
1irena@humgen.siobc.ras.ru, Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry RAS, Miklukho-Maklaya 16-10, Moscow, 117997, Russia; 2gelfand@ig-msk.ru, State Scientific Center GosNIIGenetika, 1st Dorozhny 1, Moscow 113545, Russia
Correspondence address: irena@humgen.siobc.ras.ru

We analyzed conservation of alternative splicing patterns in pairs of orthologous genes from the human and mouse genomes. Our results demonstrate considerable diversity of alternative splicing in these genomes: at least half of alternatively spliced genes have species-specific isoforms. Orthologs with non-conserved isoforms may play a role in species-specific development.

Long abstract


F-10  ProDB - Bioinformatics support for high throughput proteomics
Andreas Wilke1, Christian Rueckert2, Sebastian Kespoh, Martina Mahne, Andrea T. Hueser, Folker Meyer
1andreas.wilke@genetik.uni-bielefeld.de, Bielefeld University, Institute for Genome Research, Germany; 2christian.rueckert@genetik.uni-bielefeld.de, Int. NRW Grad. School in Bioinformatics Genome Research, Bielefeld University, Germany
Correspondence address: andreas.wilke@genetik.uni-bielefeld.de

To cope with the need for automated data conversion, storage, and analysis in the field of proteomics, the open source system ProDB was developed. The system handles data conversion from different mass spectrometer software, automates data analysis, and will allow the annotation of MS spectra.

Long abstract


F-11  Functional Annotation in the Twilight Zone using Machine Learning
Ali Al-Shahib1, David Gilbert2
1alshahib@dcs.gla.ac.uk, University of Glasgow; 2drg@dcs.gla.ac.uk, University of Glasgow
Correspondence address: alshahib@dcs.gla.ac.uk

In functional genomics, the large number of genes of unknown function is a widespread concern. We think that one contributing factor is the uncertainty of alignments at low sequence similarity (the twilight zone). Our work involves the development of a rule-based system that allows us to accurately assign functional annotation in the twilight zone.

Long abstract


F-12  Predicting Co-Complexed Protein Pairs Using Genomic and Proteomic Data Integration
Lan V. Zhang1, Sharyl L. Wong2, Oliver D. King, Frederick P. Roth
1lan_zhang@student.hms.harvard.edu, Harvard Medical School; 2sharyl_wong@student.hms.harvard.edu, Harvard Medical School
Correspondence address: lan_zhang@student.hms.harvard.edu

We took a probabilistic decision tree approach to predict co-complexed pairs (CCPs) of proteins by integrating high-throughput interaction datasets with other characteristics of gene/protein pairs. Our method made more sensitive and specific predictions than high-throughput interaction screens, and is also promising in detecting unknown CCPs.

Long abstract


F-13  Protein Function Prediction Using Probabilistic Protein Interaction Networks
Debra S. Goldberg1, Sharyl Wong2, Frederick P. Roth
1debg@hms.harvard.edu, Harvard Medical School; 2sharyl_wong@student.hms.harvard.edu, Harvard Medical School
Correspondence address: debg@hms.harvard.edu

To improve protein function prediction from high-throughput, error-prone data, we compute a posterior probability for the validity of each interaction. These edge weights are based on the experimental data and how well each observation fits the expected network topology. Our function predictions compare favourably to previously published methods.
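As a rough illustration of how a posterior probability for an interaction can serve as an edge weight, the sketch below applies Bayes' rule in odds form. It is not the authors' model: the prior, the likelihood ratios, and the assumption that the evidence sources are independent are all hypothetical choices made for the example.

```python
from math import prod

def posterior_true(prior, likelihood_ratios):
    """Posterior probability that an interaction is real.

    prior: assumed probability that a randomly chosen reported
        interaction is true, strictly between 0 and 1.
    likelihood_ratios: one ratio P(evidence | true) / P(evidence | false)
        per evidence source (for example, one for the experiment type and
        one for how well the edge fits the expected topology). They are
        multiplied, which assumes the sources are independent.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    if any(lr < 0 for lr in likelihood_ratios):
        raise ValueError("likelihood ratios must be non-negative")
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * prod(likelihood_ratios)
    return posterior_odds / (1.0 + posterior_odds)

def weight_edges(edges, prior):
    """Attach a posterior weight to each edge.

    edges: mapping from (protein_a, protein_b) to a list of likelihood ratios.
    """
    return {pair: posterior_true(prior, lrs) for pair, lrs in edges.items()}
```

Evidence with a likelihood ratio above 1 raises the weight of an edge above the prior, evidence below 1 lowers it, and an edge with no evidence keeps the prior as its weight.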

Long abstract


F-14  GOPArcII - new features of the GeneOntology and Pathways Architecture
Daniela Bartels1, Alexander Goesmann2, Oliver Rupp, Folker Meyer
1Daniela.Bartels@Genetik.Uni-Bielefeld.DE, Center for Genome Research, Bielefeld University; 2Alexander.Goesmann@Genetik.Uni-Bielefeld.DE, Center for Genome Research, Bielefeld University
Correspondence address: Daniela.Bartels@Genetik.Uni-Bielefeld.DE

We present GOPArcII, a new version of our comprehensive, open source framework for the integration of functional classifications and metabolic pathways. GOPArcII is based on a relational database. It enables a user to search and handle data like genome data from the perspective of functional categories and metabolic pathways.

Long abstract


F-15  On the Sequence Pattern Distribution in Splice Junctions. An Analysis Using Information Theoretic and Machine Learning
Christina Zheng1, Virginia R de Sa2, Michael Gribskov, T. Murlidharan Nair
1nair@sdsc.edu, UCSD SDSC; 2desa@cogsci.ucsd.edu, UCSD
Correspondence address: nair@sdsc.edu

The computational recognition of precise splice junctions is a challenge faced in the analysis of newly sequenced genomes. To understand the sequence signatures at the splice junctions, we carried out a comparative analysis using both neural network based calliper randomization and information theoretic based feature selection approaches.

Long abstract


F-16  LOC3D: annotate sub-cellular localization for protein structures
Rajesh Nair1, Burkhard Rost2
1nair@cubic.bioc.columbia.edu, Columbia University; 2rost@columbia.edu, Columbia University
Correspondence address: nair@cubic.bioc.columbia.edu

LOC3D is both a database and a web server for predicting the sub-cellular localization of eukaryotic proteins of known structure. Localization is predicted using a combination of four different methods: prediction of nuclear localization signals, sequence homology to proteins with known localization, automatic text analysis of SWISS-PROT keywords, and neural networks.

Long abstract


F-17  Identification of putative insulin binding motifs of the insulin receptor
Steve Bottomley1, Jessica Mitchell2, Brian Plewright, Erik Helmerhorst
1S.Bottomley@curtin.edu.au, Western Australian Biomedical Research Institute, School of Biomedical Sciences, Curtin University of Technology; 2MITCHEJM@ses.curtin.edu.au, Western Australian Biomedical Research Institute, School of Biomedical Sciences, Curtin University of Technology
Correspondence address: S.Bottomley@curtin.edu.au

Overlapping 9-mer and 15-mer peptides covering the insulin receptor alpha-subunit sequence were synthesised and measured for their ability to specifically bind 125I-insulin. The insulin-binding sequences were analysed to identify putative insulin-binding regions of the receptor and insulin-binding motifs, and to develop a preliminary insulin-binding scoring matrix.

Long abstract


F-18  A Functional Annotation Project for Novel and Uncharacterised Genes
William Wilson1, Emily Hodges2, Ivana Novak, Claes Wahlestedt, Christer Höög, Boris Lenhard
1bill.wilson@cgb.ki.se, Karolinska Institute; 2emily.hodges@cgb.ki.se, Karolinska Institute
Correspondence address: bill.wilson@cgb.ki.se

Annotation of novel genes is a challenge to genome sequencing efforts. We streamlined the process for genes with novel protein-coding domains by integrating web-based databases and annotation tools with gene data from diverse sources. We show examples of how our approach enhances experimental design and leads to accurate gene annotation.

Long abstract


F-19  Interacting Determinants of Migraine Susceptibility
Rod Lea1, Lyn Griffiths2
1r.lea@griffith.edu.au, Genomics Research Centre, Griffith University; 2l.griffiths@griffith.edu.au, Genomics Research Centre, Griffith University
Correspondence address: r.lea@griffith.edu.au

It is likely that multiple genetic variants interact to confer susceptibility to complex disease. We have shown that functional variants in the MTHFR and ACE genes interact to increase risk of migraine.

Long abstract


F-20  Computational analysis of stop codon readthrough in D.melanogaster
Misaki Sato1, Hitomi Umeki2, Rintaro Saito, Akio Kanai, Masaru Tomita
1s00457ms@sfc.keio.ac.jp, Laboratory for Bioinformatics, Institute for Advanced Biosciences, Keio University; 2t01513hu@sfc.keio.ac.jp, Laboratory for Bioinformatics, Institute for Advanced Biosciences, Keio University
Correspondence address: s00457ms@sfc.keio.ac.jp

We constructed a system that lists candidate readthrough genes based on the existence of a “protein motif” in the 3’UTR. Using this system, we extracted 85 candidates in Drosophila melanogaster and found features in those sequences that are known to affect readthrough events.

Long abstract


F-21  GOODIES: Gene Ontology-based Data-mining Tool for Biological Interpretation and Functional Classification on a Group of Biological Entities
Sung Geun Lee1, Wan Seon Lee2
1sglee@istech21.com, Bioinformatics Unit, ISTECH Inc.; 2konan@istech21.com, Bioinformatics Unit, ISTECH Inc.
Correspondence address: yskim@istech21.com

GOODIES is a Gene Ontology-based data-mining tool for effective functional classification. Given biological entities, GOODIES classifies them along their annotational attributes and selects optimal GO candidate terms from combinatorially many choices for overall biological interpretation, with intuitive visualization. The major applications of GOODIES include biologically-oriented cluster analysis and functional categorization.

Long abstract


F-22  Generation and Clustering of Phylogenetic Profiles for automatic Functional Annotation of Proteins
Yen-Chen Steven Huang1, Vic Arcus, Ted Baker, Shaun Lott, Patricia Riddle, Chris Triggs
1yc.huang@auckland.ac.nz, The Centre for Molecular Biodiscovery, University of Auckland
Correspondence address: yc.huang@auckland.ac.nz

Phylogenetic profile analysis assigns functional clues to proteins in a manner that is independent of sequence similarity. We present an improved algorithm that constructs the phylogenetic profiles of proteins based on the unambiguous Smith-Waterman Alignment algorithm. We used MetaCyc metabolic pathway clusters to estimate the prediction accuracy of the method.
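
The idea of a phylogenetic profile can be illustrated with a minimal sketch. It is not the authors' algorithm: the score threshold, the function names and the greedy grouping step are all assumptions made for illustration. Each protein gets a presence/absence vector across genomes (here derived from hypothetical best alignment scores), and proteins with similar vectors are grouped together.

```python
def build_profile(best_scores, genomes, threshold):
    """Turn per-genome best alignment scores into a 0/1 presence vector.

    best_scores: dict mapping genome name -> best score (hypothetical input).
    threshold:   assumed cut-off above which a homolog counts as present.
    """
    return tuple(1 if best_scores.get(g, 0.0) >= threshold else 0 for g in genomes)


def hamming(a, b):
    """Number of genomes in which two profiles disagree."""
    return sum(x != y for x, y in zip(a, b))


def group_by_profile(profiles, max_distance=0):
    """Greedy single-pass grouping of proteins with similar profiles.

    profiles: dict mapping protein name -> presence vector.
    A protein joins the first group whose seed profile is within
    max_distance; otherwise it starts a new group.
    """
    groups = []
    for name, profile in profiles.items():
        for group in groups:
            if hamming(profile, group["profile"]) <= max_distance:
                group["members"].append(name)
                break
        else:
            groups.append({"profile": profile, "members": [name]})
    return groups


genomes = ["g1", "g2", "g3", "g4"]
profiles = {
    "protA": build_profile({"g1": 120.0, "g3": 95.0}, genomes, threshold=50.0),
    "protB": build_profile({"g1": 80.0, "g3": 60.0, "g4": 10.0}, genomes, threshold=50.0),
    "protC": build_profile({"g2": 200.0}, genomes, threshold=50.0),
}
groups = group_by_profile(profiles)
```

In this toy input, protA and protB share a profile and end up in one group, which is the kind of sequence-independent functional clue the abstract refers to.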

Long abstract


F-23  A Whole-genome Analysis of Transcription Factor Binding Site Data.
Caroline Finnerty1, Dr. James McInerney2
1caroline.s.finnerty@may.ie, Bioinformatics and Pharmacogenomics Laboratory; 2james.o.mcinerney@may.ie, Bioinformatics and Pharmacogenomics Laboratory
Correspondence address: caroline.s.finnerty@may.ie

It is widely accepted that our complexity as a species results from the regulation of our genes. Our approach is to analyse, on a genome-wide scale, the upstream regions of human genes, with particular emphasis on transcription factor binding sites. The ultimate goal is to infer expression patterns from sequence.

Long abstract


F-24  In the search of genomic clusters of human co-expressed genes using microarray gene expression data.
Johannes Olson1, Per Broberg2, Krzysztof Pawlowski
1Johannes.EXT.Olson@astrazeneca.com, AstraZeneca; 2Per.Broberg@astrazeneca.com, AstraZeneca
Correspondence address: Krzysztof.Pawlowski@astrazeneca.com

Co-expression of adjacent human genes was investigated using the GeneLogic library. We analyzed average and maximum expression profile correlation for gene pairs within sliding genomic windows for normal or diseased tissue samples. Significance estimates used randomized gene sets. Co-expressed clusters were analyzed for gene duplication, subcellular localization, functional themes, and inter-species conservation.
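
A minimal sketch of the sliding-window statistic described above, assuming genes are already ordered by genomic position and that Pearson correlation is the similarity measure (the abstract does not name one). The function names and the window size are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def window_scores(profiles, window):
    """Average and maximum pairwise correlation in each sliding window.

    profiles: list of expression profiles in genomic order.
    window:   number of adjacent genes per window (must be >= 2).
    Returns a list of (start_index, mean_r, max_r) tuples.
    """
    scores = []
    for start in range(len(profiles) - window + 1):
        genes = profiles[start:start + window]
        rs = [
            pearson(genes[i], genes[j])
            for i in range(len(genes))
            for j in range(i + 1, len(genes))
        ]
        scores.append((start, sum(rs) / len(rs), max(rs)))
    return scores


profiles = [[1, 2, 3], [2, 4, 6], [3, 2, 1], [1, 2, 3]]
scores = window_scores(profiles, window=2)
```

A significance estimate in the spirit of the abstract could be obtained by shuffling the gene order many times and comparing the observed window scores against the shuffled distribution.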

Long abstract


F-25  LacplantCyc: a Pathway / Genome Database for Lactobacillus plantarum as the model for Lactic Acid Bacteria.
Frank H.J. van Enckevort1,2, Bas Teusink1,3, Roland J. Siezen1,2,3
Frank.van.Enckevort@nizo.nl, Bas.Teusink@nizo.nl, Roland.Siezen@nizo.nl
1NIZO food research, Ede, The Netherlands; 2Centre for Molecular and Biomolecular Informatics, University of Nijmegen, The Netherlands; 3Wageningen Centre for Food Sciences, Wageningen, The Netherlands.
Correspondence address: Frank.van.Enckevort@nizo.nl

Lactobacillus plantarum is a versatile lactic acid bacterium that is encountered in various niches. LacplantCyc is a newly created pathway/genome database predicted from the annotated complete genome sequence of L. plantarum WCFS1 (PNAS 2003;100:1990), using the PathoLogic software from PathwayTools (Peter Karp). Manual editing and experimental verification are in progress.

Long abstract


F-26  Reconstruction of Genetic Networks from Gene Expression Perturbation Data Using a Boolean Model
Ronald Taylor1
1ronald.taylor@uchsc.edu, U of Colorado
Correspondence address: ronald.taylor@uchsc.edu

The use of Boolean models is explored in reconstruction of the topology of genetic transcriptional networks, employing gene expression data from simulated perturbations. The construction and employment of a software suite for such exploration is described. Results are compared for different simulated topologies, inference methods, and amount of noise.

Long abstract


F-27  Scalable multi-processor application for gene expression profile clustering
Andrey Ptitsyn1
1ptitsyaa@pbrc.edu, Pennington Biomedical Research Center
Correspondence address: ptitsyaa@pbrc.edu

We report a scalable multi-processor application for clustering gene expression profiles. The program implements the flexible clustering algorithms developed at PBRC. It is written to the MPI standard and has been tested on the IBM AIX Parallel Environment.

Long abstract


F-28  Detection of Global and Gene Specific Translational Control Signals in mRNAs
Chris M Brown1, Grant Jacobs2, Mark Dalphin and Peter Stockwell
1chris.brown@otago.ac.nz, University of Otago; 2gjacobs@bioinfotools.com, Bioinfotools
Correspondence address: chris.brown@otago.ac.nz

We wish to detect signals in mRNAs that influence their translation. These include signals that modulate translation efficiency, mRNA stability or its localisation. To do this we have developed the TransTerm database (http://transterm.otago.ac.nz/) of mRNA regions and regulatory elements. We have detected novel elements and are testing them in vivo.

Long abstract


F-29  PATIKA: Pathway Analysis Tool for Integration and Knowledge Acquisition
Emek Demir1, Ozgun Babur2, Ugur Dogrusoz, Attila Gursoy, Asli Ayaz, Gurcan Gulesir, Gurkan Nisanci, Rengul Cetin-Atalay, and Mehmet Ozturk
1emek@cs.bilkent.edu.tr, BCBI, Bilkent University, Ankara, Turkey; 2babur@cs.bilkent.edu.tr, BCBI, Bilkent University, Ankara, Turkey
Correspondence address: ugur@cs.bilkent.edu.tr

PATIKA is an ongoing research and development project for collaborative construction and analysis of cellular pathways. Our software tool provides an integrated, multi-user environment for visualizing and manipulating networks of cellular events. PATIKA is available at http://www.patika.org.

Long abstract


F-30  Prediction of a full length gene from partial sequence
Chung, Myungguen1, Cho, Sooyoung2, Ban, hyojeong ; Kim, hyun and Lee Youngseek
1aobo@ihanyang.ac.kr, Hanyang University; 2singylu@hanmail.net, Hanyang University
Correspondence address: aobo@chollian.net

We obtained partial 3’ end cDNA sequences for which BLAST found no homologs in ‘nr’. We predicted full-length genes from the partial sequences and cloned the full-length genes using the predicted sequences.

Long abstract


F-31  Design of Antisense Oligonucleotides
Alistair M. Chalk1, Erik L.L. Sonnhammer2
1alistair.chalk@cgb.ki.se, CGB, Karolinska Institute; 2esr@algol.cgb.ki.se, CGB, Karolinska Institute
Correspondence address: alistair.chalk@cgb.ki.se

Antisense oligonucleotides are an important tool for gene-knockdown approaches in functional genomics. We assess the usefulness of current approaches predicting accessibility and/or efficacy using a database of known results. A set of utilities developed for AO and siRNA design (control design, specificity, site selection) is available at http://sonnhammer.cgb.ki.se.

Long abstract


F-32  Detection of natural antisense transcripts conserved between human and mouse
Par Engstrom1, Hidenori Kiyosawa2, Claes Wahlestedt, Yoshihide Hayashizaki, Boris Lenhard
1par.engstrom@cgb.ki.se, Center for Genomics and Bioinformatics, Karolinska Institutet, Stockholm, Sweden; 2kiyosawa@rtc.riken.go.jp, RIKEN Bio Resource Center, RIKEN Tsukuba Institute, Tsukuba, Japan
Correspondence address: par.engstrom@cgb.ki.se

We devised an automated computational procedure to detect pairs of overlapping and oppositely directed human transcriptional units (sense-antisense pairs) equivalent to a large set of putative mouse sense-antisense pairs previously predicted by cDNA mapping. We found support in the human transcriptome for a significant proportion of mouse sense-antisense pairs.

Long abstract


F-33  A modular software platform integrating the processing and bioinformatic analysis of proteomics data
Soeren Schandorff1, Hans Jespersen2, C. H. Ahrens, M. Damsbo, S. Larsen, B. K. Ramsgaard, E. T. Nielsen, G. Thorvil, J. P. Kristensen, K. P. Budin, J. Matthiesen, P. Venø, J. C. Brønd, T. Topaloglou, P. T. Ruhoff
1schandorff@mdsdenmark.com, MDS Denmark; 2hjespersen@mdsdenmark.com, MDS Denmark
Correspondence address: cahrens@mdsdenmark.com

We have developed an integrated software platform that addresses data generation and handling, data verification/quality control and bioinformatic analysis steps of large scale proteomics projects. Experimental data is integrated with computationally enriched data from public and proprietary databases enabling protein isoform distinction, protein-protein interactions analysis, pathway analysis and text mining.

Long abstract


F-34  LION Target Engine 1.0: an Enterprise Platform for Target Identification and Validation
S. Bernauer, Z. Bilkic, N. Bojunga, T. Brostroem, D. Croft, N. Delhomme, A. Denagbe, L. Ehrlich, K. Fries, C. Girardot, M. Goeschl, M. Gumbel, J. Hermanns, C. Kaestner, C. Katz, U. Keck, H.-P. Keck, R. Kern, G. Kurapkat, P. Lederer, D. Leon, S. Marcel, S. Markel, B. Markus, J.E.M. Meyer, E. Minch, J. Mistry, C. Muench, S. I O'Donoghue1, C. Ohr, S. Richter, H.-J. Roemming, R. Russ, S. Schaefer, A. Schafferhans, T. Schlegl, T. Schlueter, A. Schmidt, O. Schmidt, D. Schulz, A. Sooky, A. Sergienko, F. Spangenberg, J. Suckow, B. Sulzer, C. Suter-Crazzolara, A. Tarasenko, E. Vatcheva, H. Voss, M. Weindel, G. Zhang
1sean.odonoghue@lionbioscience.com, LION bioscience AG, Waldhoferstr. 98, 69123 Germany
Correspondence address: sean.odonoghue@lionbioscience.com

LION Target Engine is a user-friendly system designed to streamline the identification and validation of targets in an enterprise environment. The system includes components for sequence registration and curation; sequence analysis; gene and protein index; text mining; pathways and interaction networks; 3D structures; TaqMan and microarray data handling; and target tracking.

Long abstract


F-35  siRNA Design Tool: A Functional Genomics Accelerator
Natasha Levenkova1, Qingjuan Gu2, John J. Rux
1nlevenkov@wistar.upenn.edu, The Wistar Institute; 2qingjuan@wistar.upenn.edu, The Wistar Institute
Correspondence address: rux@wistar.upenn.edu

Small interfering RNA (siRNA) is used in functional genomics applications to produce “knock-down” cells. The siRNA design tool scans a target gene for candidate siRNA sequences that satisfy user-adjustable rules. Selected candidates are then screened to identify those siRNA sequences that match only the gene of interest.
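
The two stages described above (rule-based scanning, then a specificity screen) can be sketched as follows. This is an illustration only, not the tool's actual rule set: the 21-nt length, the GC-content bounds, the homopolymer-run limit and the exact-substring specificity check are all assumed defaults, and a real screen would more likely use a similarity search than exact matching.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def candidate_sirnas(target, length=21, gc_min=0.30, gc_max=0.52, max_run=4):
    """Scan a target gene for sites that pass user-adjustable rules.

    All thresholds are hypothetical defaults chosen for illustration.
    Returns a list of (position, site) tuples.
    """
    target = target.upper().replace("U", "T")
    hits = []
    for i in range(len(target) - length + 1):
        site = target[i:i + length]
        if not gc_min <= gc_fraction(site) <= gc_max:
            continue  # GC content outside the allowed band
        if any(base * max_run in site for base in "ACGT"):
            continue  # reject runs of max_run identical bases
        hits.append((i, site))
    return hits


def specific_candidates(candidates, other_transcripts):
    """Keep only candidates that do not occur in any other transcript."""
    others = [t.upper().replace("U", "T") for t in other_transcripts]
    return [(i, s) for i, s in candidates if not any(s in t for t in others)]


target = "ATGC" * 6 + "A"
candidates = candidate_sirnas(target)
off_target = [target[0:21] + "GGG"]
specific = specific_candidates(candidates, off_target)
```

Keeping the rules as keyword arguments mirrors the "user-adjustable" aspect mentioned in the abstract: each rule can be tightened or relaxed without changing the scanning loop.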

Long abstract


F-36  Statistical Analysis of Arabidopsis T-DNA-flanking sequences
Hyung Seok Choi1
1gnie@lycos.co.kr, Seoul National University
Correspondence address: shchoe@snu.ac.kr

T-DNA use in plant functional genomics is based on the hypothesis that T-DNA inserts randomly into plant genomes. To test this, we analyzed 120,000 T-DNA flanking sequences from the SIGnAL database. Of the total 29,084 Arabidopsis genes, approximately 70% have at least one insert, whereas 8760 (30%) genes are still left without any.
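
As a quick arithmetic check of the reported proportions, and a rough illustration of what a random-insertion model would imply, the sketch below uses only the two gene counts from the abstract. The Poisson model with equal-sized gene targets is an assumption introduced here for illustration; it is not stated in the abstract.

```python
from math import exp, log

total_genes = 29084           # from the abstract
genes_without_insert = 8760   # from the abstract

fraction_without = genes_without_insert / total_genes   # about 0.30
fraction_with = 1 - fraction_without                     # about 0.70


def expected_fraction_empty(mean_inserts_per_gene):
    """Fraction of genes with zero inserts under a Poisson model.

    Assumes every gene is an equally likely target, which ignores
    gene length and any insertion bias.
    """
    return exp(-mean_inserts_per_gene)


# Mean number of inserts per gene that would leave the observed
# fraction of genes empty if insertion were purely random.
implied_mean = -log(fraction_without)   # about 1.2
```

Comparing such an implied mean with the number of inserts actually observed in genes is one simple way to judge whether the random-insertion hypothesis is plausible.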

Long abstract


F-37  The Yeast Interactome -- analysis and evaluation of diverse sources of information.
Jeremiah J Faith1, Ravi Sachidanandam2
1faith@cshl.org, Cold Spring Harbor Laboratory; 2sachidan@cshl.org, Cold Spring Harbor Laboratory
Correspondence address: faith@cshl.org

A network analysis of the protein-protein interactions in yeast reveals distinct clusters. Some of the clusters are due to functional groups, while most are due to methods of detection; interactions detected by yeast two-hybrid tend to cluster proteins into groups that are different from the clusters due to mass spectrometry. We quantify these differences and discuss implications.

Long abstract


F-38  cSAGE and the Serial Analysis of Gene Expression in Arabidopsis thaliana
Christopher T Lewis1, Stephen Robinson2, Tony Kusalik, Isobel AP Parkin
1LewisCT@agr.gc.ca, Agriculture and Agri-food Canada; 2RobinsonS@agr.gc.ca, Agriculture and Agri-food Canada
Correspondence address: LewisCT@agr.gc.ca

cSAGE is an open-source application written in C to provide an efficient mechanism for extracting SAGE tags and assigning matches to DNA sequences. It has been used for a cold tolerance experiment in A. thaliana involving 3 libraries with more than 180,000 tags. See http://homepage.usask.ca/ctl271/csage for more information.


Long abstract


F-39  Reconstructing Genome Architectures by End Sequence Profiling: Applications to Tumor Genomes
Ben Raphael1, Pavel Pevzner2, Stas Volik, Colin Collins
1braphael@ucsd.edu, University of California, San Diego; 2ppevzner@cs.ucsd.edu, University of California, San Diego
Correspondence address: braphael@ucsd.edu

We describe a computational approach to the reconstruction of the architecture of a rearranged genome based on data from end sequence profiling experiments. We apply our techniques to the reconstruction of the genome of a human MCF7 tumor cell.

Long abstract


F-40  Phosphoregulators: Protein kinases and Protein phosphatases of mouse
Alistair RR Forrest1, Timothy Ravasi2, Darrin Taylor, Rohan Teasdale, RIKEN GER Group Members, and Sean Grimmond
1a.forrest@imb.uq.edu.au, IMB; 2t.ravasi@imb.uq.edu.au, IMB
Correspondence address: a.forrest@imb.uq.edu.au

We describe the identification and classification of the complement of protein kinases and phosphatases in mouse. We also present preliminary results from a functional screen of these proteins, coupling sequence based classification with high throughput functional screens.

Long abstract


F-41  BioinformatIQ: Integrating devices, data types, and bioinformatic analysis in an information management system for proteomics
F. Keith Junius1, P. Bizannes, P. Doggett, M. Harrison, B. Srinivasan, E. Shaw, M. Traini, W. McDonald, and Marc R. Wilkins
1Keith.Junius@proteomesystems.com, Proteome Systems
Correspondence address: Keith.Junius@proteomesystems.com

BioinformatIQ® is an integrated system for handling the information needs of proteomics from sample preparation, through automation of instrumentation, to protein identification and characterization. This informatics platform for proteomics is demonstrated through application to the proteomic analysis of human plasma. More information on BioinformatIQ® can be found at http://www.proteomesystems.com

Long abstract


F-42  Visualizing and Exploring Linked Functional Genomic Data Sets in YETI: Yeast Exploration Tool Explorer
Richard J. Orton1, William I. Sellers2, Dietlind L. Gerloff
1Richard.Orton@ed.ac.uk, University of Edinburgh; 2W.I.Sellers@lboro.ac.uk, University of Loughborough
Correspondence address: d.gerloff@ed.ac.uk

YETI is a novel bioinformatics tool for integrated visualization and analysis of functional genomic data from the yeast Saccharomyces cerevisiae. YETI 1.0 consists of three fully inter-linked sections allowing users to explore the “genomic” (e.g. chromosomal location), and “proteomic” (e.g. associated protein-protein interactions) context of multiple proteins of interest simultaneously.

Long abstract



Genome Annotation
E-1  In silico prediction of UTR repeats using clustered EST data
Stefan Rensing1, Daniel Lang2, Ralf Reski
1stefan.rensing@biologie.uni-freiburg.de, University of Freiburg, Plant Biotechnology; 2daniel.lang@biologie.uni-freiburg.de, University of Freiburg, Plant Biotechnology
Correspondence address: stefan.rensing@biologie.uni-freiburg.de

Three approaches for the in silico prediction of UTR repeats have been used on a test data set, resulting in the detection of sequence stretches in ~5% of the input sequences during clustering and reduction in size of large clusters. Seven of those putative repeats have been proven to be repetitive in vivo by Southern blot analysis.

Long abstract


E-2  Using proteomics to mine genome sequences
Jonathan W Arthur1, Marc R Wilkins2
1jonathan.arthur@proteomesystems.com, Proteome Systems Ltd; 2marc.wilkins@proteomesystems.com, Proteome Systems Ltd
Correspondence address: jonathan.arthur@proteomesystems.com

We present a hypothesis-independent method for identifying the region of a genome coding for a protein sequence using proteomic information. The method can be used to identify novel genes that were not found by other annotation techniques. It is demonstrated using theoretical and experimental data sets from prokaryotic and eukaryotic organisms.


Long abstract


E-3  Global insights into protein complexes through integrated analysis of the interactome and knockout lethality
Harukazu Suzuki1, Rintaro Saito2, Yoshihide Hayashizaki
1harukazu@gsc.riken.go.jp, RIKEN Genomic Sciences Center; 2, RIKEN Genomic Sciences Center
Correspondence address: harukazu@gsc.riken.go.jp

We have developed a new interaction generality measure (IG2), which can be used to computationally assess the reliability of interactome data. We performed an integrated analysis using a comprehensive phenotype dataset and an IG2-treated interactome dataset from yeast, which yielded global insights into the biological features of protein complexes.

Long abstract


E-4  An evaluation of new criteria for CpG islands in the human genome as gene markers
Patrick, Yong Wang1, Frederick, C. Leung2
1wangyong@hkusua.hku.hk, HKU, Dept of Zoology; 2fcleung@hkucc.hku.hk, HKU, Dept of Zoology
Correspondence address: wangyong@hkusua.hku.hk

Using the new criteria for CpG islands introduced by Takai and Jones, we investigated several association types between CpG islands and genes to further establish the importance of CpG islands as gene markers. Our investigation gave us a useful tool for evaluating the accuracy of gene annotation in human chromosomes.
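
For readers unfamiliar with how CpG island criteria are applied, here is a minimal sketch. The thresholds (length of at least 500 bp, GC content of at least 55%, observed/expected CpG ratio of at least 0.65) are the values commonly cited for the Takai and Jones criteria; they are not stated in the abstract and are included here as an assumption. The function names are hypothetical.

```python
def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a sequence.

    Observed/expected is computed as (#CpG * N) / (#C * #G),
    where N is the sequence length.
    """
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")
    gc = (c + g) / n if n else 0.0
    obs_exp = cpg * n / (c * g) if c and g else 0.0
    return gc, obs_exp


def is_cpg_island(seq, min_len=500, min_gc=0.55, min_obs_exp=0.65):
    """Check a candidate region against the assumed thresholds."""
    gc, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc >= min_gc and obs_exp >= min_obs_exp


cg_rich = "CG" * 250      # 500 bp, passes all three checks
at_rich = "AT" * 250      # 500 bp, no G or C at all
too_short = "CG" * 100    # CpG-rich but only 200 bp
```

An association study like the one described would then relate regions passing this test to annotated gene positions, for example by asking whether an island overlaps a gene's 5' end.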

Long abstract


E-5  cDNA2Genome: A TOOL FOR MAPPING AND ANNOTATING cDNAS
Coral del Val1, Karl-Heinz Glatting2, S.Suhai
1c.delval@dkfz.de, Department of Molecular Biophysics DKFZ, German Cancer Research Center; 2glatting@dkfz.de, Department of Molecular Biophysics DKFZ, German Cancer Research Center
Correspondence address: c.delval@dkfz.de

cDNA2Genome is a web tool for automatic high-throughput mapping and characterization of cDNAs. It uses already existing annotation data and improves them when possible in the case of ESTs, proteins and mRNAs. It is focussed on the determination of the cDNA exon-intron structure. The final result of cDNA2Genome is an XML file with all information obtained by the task.


Long abstract


E-6  Identifying and Annotating Disease Specific Rat Genome Sequences
Jedidiah Mathis1, Mary Shimoyama2, Aubrey Hughes, Norberto Dela Cruz, Charles Wang, Simon Twigger, Michael Jensen-Seamen, Michelle Feldmann, Artur Rangel Filho, Jozef Lazar, Howard Jacob, Peter Tonellato
1jmathis@mcw.edu, Medical College of Wisconsin; 2shimoyma@mcw.edu, Medical College of Wisconsin
Correspondence address: jmathis@mcw.edu

Disease-related Genomic Regions Of Interest (GROI) were submitted to the Rat Genome Sequencing Consortium for prioritized sequencing. These areas were analyzed to determine whether greater coverage facilitated denser and more accurate annotation. Functional annotation of genes in these regions was achieved using data mining of public databases and manual curation.

Long abstract


E-7  Assembly and finishing tools for repeated and polymorphic genomes
Martti T. Tammi1, Erik Arner2, Ellen Kindlund, Björn Andersson
1martti.tammi@cgb.ki.se, Center for Genomics and Bioinformatics, Karolinska Institutet; 2erik.arner@cgb.ki.se, Center for Genomics and Bioinformatics, Karolinska Institutet
Correspondence address: martti.tammi@cgb.ki.se

DNPTrapper is a graphical tool specifically designed for finishing shotgun assemblies containing complex repeated regions. DNPTrapper allows visualization of sequences and sequence features (e.g. DNPs, mate-pairs, repeat boundaries, chromatograms) in horizontal and vertical representation, followed by manual and semi-automatic manipulation, which greatly simplifies finishing.

Long abstract


E-8  e-PROTEIN: A Distributed Pipeline for Structure-based Proteome Annotation using Grid Technology
Keiran Fleming1, Liam McGuffin2, Stefano Street, Andreas Kahari, Tim Massingham, Steven Newhouse, James Cuff, Ewan Birney, Soren Sorenson, Christine Orengo, John Darlington, David Jones, Janet Thornton, Michael Sternberg
1k.fleming@imperial.ac.uk, Imperial College London; 2l.mcguffin@cs.ucl.ac.uk, University College London
Correspondence address: k.fleming@imperial.ac.uk

The e-Protein project aims to provide structure-based annotations of proteins in the major genomes by linking resources via Grid technology at 3 sites: Imperial College London, University College London, and the EBI. At the end of the first 6 months we have established a pre-prototype using GLOBUS/Grid technology.


Long abstract


E-9  Gene Ontology Toolkit: A Comprehensive Software Package for Working with Gene Ontology
Jing Ding1, Jun Xu2, Andy W. Fulmer
1dingjing@iastate.edu, Iowa State University; 2xu.j.1@pg.com, The Procter and Gamble Company
Correspondence address: fulmer.aw@pg.com

Gene Ontology Toolkit is a Java GUI for working with the Gene Ontology (GO). Its three integrated components (editor, annotator and merger) enable biologists to visualize and extend GO, annotate gene products with GO terms, and merge the extensions and/or the annotations with new releases of GO or others’ extensions.

Long abstract


E-10  Assembly and Annotation of the Leptospira borgpetersenii serovar Hardjobovis Genome Sequence.
Annette McGrath1, John Davis2, Peter J Wilson, Dieter Bulach, Torsten Seemann, John Davies, Ross Coppel, Ben Adler, Elizabeth S Kuczek
1annette@agrf.org.au, Australian Genome Research Facility; 2john@agrf.org.au, Australian Genome Research Facility
Correspondence address: annette@agrf.org.au

We have completed the assembly of the Leptospira borgpetersenii serovar hardjobovis genome, which contains a large chromosome (CI) of 3,614,529 base pairs and a smaller chromosome (CII) of 317,585 base pairs. The annotation of CII is complete and is currently in progress for CI.

Long abstract


E-11  Towards the Bovine Ensembl
Sean M. McWilliam1, Wes Barris2, Brian P. Dalrymple
1sean.mcwilliam@csiro.au, CSIRO Livestock Industries, Brisbane, Australia; 2wes.barris@csiro.au, CSIRO Livestock Industries, Brisbane, Australia
Correspondence address: sean.mcwilliam@csiro.au

We have implemented the Ensembl genome sequence database and interface for handling the annotation of the bovine genome. Initial efforts have focussed on display of the annotation of small and micro RNAs and linking to the SNP database, IBISS.

Long abstract


E-12  Annotation of non-coding RNA molecules in cattle genomic sequences
Brian Dalrymple1, Sean McWilliam2, Wes Barris, Pradeep Tokachichu
1Brian.Dalrymple@csiro.au, CSIRO Livestock Industries; 2Sean.McWilliam@csiro.au, CSIRO Livestock Industries
Correspondence address: Brian.Dalrymple@csiro.au

To identify members of known families of RNAs, a combination of BLAST and INFERNAL is used with the RNA covariance models in Rfam. To identify potential new RNAs, a combination of genome-specific BLAST and QRNA is used. The results of analysis of bovine BAC-end sequences will be shown.

Long abstract


E-13  ASmodeler: Gene modeling of alternative splicing events from genomic alignment of mRNA and ESTs
Namshin Kim1,2, Seokmin Shin2, Sanghyuk Lee
1deepreds@hanmail.net, Division of Molecular Life Sciences, Ewha Womans University, Seoul 120-750, KOREA; 2sshin@snu.ac.kr, School of Chemistry, Seoul National University
Correspondence address: deepreds@hanmail.net

ASmodeler is a novel web-based utility to find gene models of alternative splicing events from genomic alignment of mRNA and ESTs. It can be used as a transcript assembly program, an EST clustering utility, and a method of comparative gene modeling. ASmodeler is available at http://genome.ewha.ac.kr/ASmodeler/.

Long abstract


E-14  Protein domain extraction by quasi-convex set functions
HwaSeob Yun1, Casimir Kulikowski2, Ilya Muchnik
1seabee@cs.rutgers.edu, Rutgers University; 2kulikows@cs.rutgers.edu, Rutgers University
Correspondence address: seabee@cs.rutgers.edu

We present a fast, fully automatic procedure for protein domain extraction from single query sequences without pre-calculation of domain statistics. Combinatorial clustering of domains from BLAST hits using quasi-convex set functions, followed by domain parsing and pattern discovery permits highly efficient (polynomial time) whole genome functional annotation.

Long abstract


E-15  Development of a Web-based Genome Annotation System and Two Analysis Tools
Hongseok Tae1, Hyeweon Nam2, Daesang Lee, Kiejung Park
1hstae@smallsoft.co.kr, Dept. of Microbiology, Kyungpook National University; 2hwnam@smallsoft.co.kr, Information Technology Institute, SmallSoft Co., Ltd.
Correspondence address: hstae@smallsoft.co.kr

Our web-based genome annotation system has major modules for gene prediction, homology search, promoter analysis, motif analysis, gene ontology analysis, and annotation databases, plus a genome browser that shows all the information for a genome. We have also developed motif analysis and gene prediction programs based on HMMs.

Long abstract


E-16  New approach to build models for predicting prokaryotic genes
Chungoo Park1, Mihwa Park2, Jongwon Chang, Jeongho Huh, Dong Soo Jung, Hong Gil Nam, Young Bock Lee, Jiin Choi, Seungsik Yoo, Jaewoo Kim
1madreach@bric.postech.ac.kr, Biological Research Information Center, Pohang University of Science and Technology; 2bfpark@posdata.co.kr, Solution Development Research Institute,POSDATA
Correspondence address: madreach@bric.postech.ac.kr

We propose a new method for increasing gene prediction accuracy without using information about known genes. To increase accuracy, we used additional training data obtained through a phylogenetic approach. Tests on 3 complete prokaryotic genomes performed with the GLIMMER program demonstrate the ability of the new approach to detect additional genes.

Long abstract


E-17  A flexible model for promoter motifs
Wei-Mou Zheng1
1zheng@itp.ac.cn, Inst. Theor. Phys., Academia Sinica
Correspondence address: zheng@itp.ac.cn

A general and flexible multi-motif model is proposed for promoter motif analysis based on dynamic programming. By extending the Gibbs sampler to dynamic programming and introducing temperature, an efficient algorithm is developed for searching for motifs in promoters. The algorithm is tested with plant promoters.

Long abstract


E-18  Analysis of human herpesvirus genomes based on COGs and Phylogenomics
Chang-Jin Shin1, Cheol-Min Kim2, Byeong-Chul Kang, Jun-Hyung Park, Dong-Hoon Shin, Ok-Kyung Ham, Yoon-Jung Choi, In-Joo Kim, Choon-Hwan Lee, Cheol-Min Kim
1teragene@pusan.ac.kr, Busan Genome Center, College of Medicine, Pusan National University; 2kimcm@pusan.ac.kr, Busan Genome Center, College of Medicine, Pusan National University
Correspondence address: teragene@pusan.ac.kr

The aim of this study is to develop a suitable procedure for predicting the function of viral genes. To overcome the limitations of conventional similarity-based searches, human herpesvirus (HHV) genomes were analyzed by COG and phylogenomic methods. This will provide a practical method for predicting the function of new genes in viral genomes.

Long abstract


E-19  Automated Gene Ontology annotation for anonymous sequence data
Steffen Hennig1, Detlef Groth2, Hans Lehrach
1hennig@molgen.mpg.de, MPI for Molecular Genetics, Berlin; 2dgroth@molgen.mpg.de, MPI for Molecular Genetics, Berlin
Correspondence address: hennig@molgen.mpg.de

The unified vocabulary of terms provided by the Gene Ontology consortium has become a standard tool in annotation of genes and their products. We present a web-service available at http://goblet.molgen.mpg.de, which allows annotation of anonymous cDNA or protein sequences by GO terms.

Long abstract


E-20  IBISS - the interactive bovine in Silico SNP database.
Rachel Hawken1, Wes Barris2, Brian Dalrymple
1Rachel.Hawken@csiro.au, CSIRO Livestock Industries; 2Wes.Barris@csiro.au, CSIRO Livestock Industries
Correspondence address: Rachel.Hawken@csiro.au

A bovine in silico SNP database has been constructed. Contigs of ‘unique bovine sequences’ were established and treated as model mRNAs. A comprehensive web interface has been developed which highlights the putative identity of each contig, putative SNPs, the location of predicted intron-exon boundaries, and genome mapping data for each model mRNA.

Long abstract


E-21  The Encyclopedia of Life (EOL) Project
Phil Bourne1, Wilfred Li2, Baldridge, K.; Baru, C.; Byrnes, R.; Clingman, E.; Cotofana, C.; Ferguson, C.; Fountain, A.; Greenberg, J.; Jermanis, D.; Matthews, J.; Miller, M.; Mitchell, J.; Mosley, M.; Pekurovsky, D.; Quinn, G.B.; Reyes, V.; Rowley, J.; Shindyalov, I.; Smith, C.; Stoner, D.; Veretnik, S.
1bourne@sdsc.edu, San Diego Supercomputer Center; 2wilfred@sdsc.edu, San Diego Supercomputer Center
Correspondence address: bourne@sdsc.edu

The Encyclopedia of Life Project (EOL; http://www.eolproject.info) is aimed at utilizing Grid computing resources to catalog the complete proteome of every living species in a flexible, powerful reference system using a scalable protein annotation pipeline. Recognized protein sequences are assigned putative functional annotations and structure assignments, and are cross-referenced to other data sources.

Long abstract


E-22  Sabiá - System for Automated Bacterial Integrated Annotation
Ana Tereza R. Vasconcelos1, Roger Paixao2, Rangel C. Souza, Luiz Gonzaga, Gisele C. da Costa, Frank J. A. Barrientos, Marcelo T. dos Santos and Darcy F. de Almeida.
1atrv@lncc.br, Laboratorio Nacional de Computacao Cientifica; 2roger@lncc.br, Laboratorio Nacional de Computacao Cientifica
Correspondence address: atrv@lncc.br

A new tool called System for Automated Bacterial Integrated Annotation - SABIA was developed for the assembly and annotation of bacterial genomes. This system performs automatic tasks of assembly analysis, ORF identification/analysis, and extragenic region analysis. Genome assembly data and automatic contig annotation data are also available in the same working environment.

Long abstract


E-23  Identification of putative transcription factor binding sites conserved across orthologous human, mouse and rat sequences
Alex Gout1, Tim Beissbarth2, Joelle Michaud, Catherine Carmichael, Matthew Ritchie, Gordon
Correspondence address: gout@wehi.edu.au

The patterns of transcription factor binding sites (TFBSs) within upstream regions of differentially expressed genes identified via microarray analysis may help identify regulatory genetic networks. We have thus created a database of putative TFBSs conserved across orthologous human, mouse and rat genes through the use of MAVID, Match and Transfac.

Long abstract


E-24  An Automated Procedure to Create a Protein Structure Family Database and Application to Whole-Genome Annotation
Kenneth J Kelly1
1kjk@chemcomp.com, Chemical Computing Group Inc
Correspondence address: kjk@chemcomp.com

We present a fully-automated procedure to create a protein structure family database, along with a corresponding homology searching algorithm based on a combined E-value/Z-score approach. Whole-genome structure-based annotation tests on several completely sequenced genomes have demonstrated results comparable to PSI-BLAST.

Long abstract


E-25  Disease Ontology - Unifying Bioinformatics and Clinical Medicine
Patricia A. Dyck1, Rex L. Chisholm2
1p-dyck@northwestern.edu, Northwestern University; 2r-chisholm@northwestern.edu, Northwestern University
Correspondence address: p-dyck@northwestern.edu

The Disease Ontology is a hierarchical controlled vocabulary created to represent human disease. The ontology was created in order to enable database curation of disease gene associations. All terms in the ontology also map to billing codes for the purpose of medical record mining. The ontology is available at: http://sourceforge.net/projects/diseaseontology/.

Long abstract



Microarrays
A-1  Binned-Intensity Normalization Algorithm for Single-Dye Microarrays
Gene Cutler1
1cutler@tularik.com, Tularik Inc
Correspondence address: cutler@tularik.com

To generate meaningful mRNA expression ratios, data from separate arrays or probes must be normalized. Median normalization performs adequately only when differences between data sets are linear. To cope with non-linearities and noisy data, I have implemented a binned-intensity normalization algorithm which outperforms simple median normalization.
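
The contrast between a single global correction and a per-bin correction can be sketched in a few lines of Python. This is an illustrative sketch only, not the author's implementation: the function names, the equal-size binning by combined log intensity, and the use of a per-bin median log ratio are all assumptions made for the example.

```python
import math
from statistics import median


def median_normalize(sample, reference):
    """Scale the whole sample by one global factor so the medians match."""
    factor = median(reference) / median(sample)
    return [x * factor for x in sample]


def binned_normalize(sample, reference, n_bins=4):
    """Correct the sample with one median log-ratio per intensity bin.

    Assumed scheme (hypothetical): probes are ordered by combined log
    intensity, split into equal-size bins, and each bin is shifted by
    its own median log2(sample / reference).
    """
    order = sorted(
        range(len(sample)),
        key=lambda i: math.log2(sample[i]) + math.log2(reference[i]),
    )
    corrected = list(sample)
    size = math.ceil(len(order) / n_bins)
    for start in range(0, len(order), size):
        idx = order[start:start + size]
        shift = median(math.log2(sample[i] / reference[i]) for i in idx)
        for i in idx:
            corrected[i] = sample[i] / 2 ** shift
    return corrected


# Toy data: the bias is absent at low intensity and two-fold at high intensity.
reference = [10, 20, 40, 80, 160, 320, 640, 1280]
sample = [10, 20, 40, 80, 320, 640, 1280, 2560]
binned = binned_normalize(sample, reference, n_bins=2)
```

On this toy data the intensity-dependent bias is removed bin by bin, whereas a single global factor cannot correct both halves of the intensity range at once.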

Long abstract


A-2  Overcoming Confounded Controls in the Analysis of Gene Expression Data from Microarray Experiments
Soumyaroop Bhattacharya1, Dang Duc Long2, James Lyons-Weiler
1bhattacharyas@msx.upmc.edu, Benedum Oncology Informatics Center, University of Pittsburgh; 2Dang_Long@student.uml.edu, Center for Bioinformatics and Computational Biology, University of Massachusetts Lowell
Correspondence address: bhattacharyas@msx.upmc.edu

The robust clustering of some normal samples within tumor groups and robust clustering of other normal samples in a separate, 'normal' group indicates the confounding of control samples. Our approach uses the maximum difference subset algorithm (MDSS) and bootstrap validation, which evaluates the difference in mean expression between two groups.

Long abstract


A-3  Gene Ontology Driven Classification of Gene Expression Patterns
Claudio Lottaz1, Renate Kirschner2, Stefan Bentink, Christian Hagemeier and Rainer Spang
1Claudio.Lottaz@molgen.mpg.de, Max-Planck-Institute for Molecular Genetics, Berlin; 2r.kirschner@charite.de, Medical Center Charité, Berlin
Correspondence address: r.kirschner@charite.de

We propose to structure the analysis of microarray data according to biological knowledge in order to provide an intuitive and biologically meaningful rationale for computational classification results. To this end, we rely on the Gene Ontology to assign genes to biological aspects and on standard machine learning methods for classification.

Long abstract


A-4  Improving the reliability of transcriptomics data; The effect of quenching on RNA transcription profiles
Bart Pieterse1, Renger H. Jellema2, Mariët J. van der Werf
1Pieterse@voeding.tno.nl, 1. Wageningen Centre for Food Sciences; 2jellema@voeding.tno.nl, 2. TNO Nutrition and Food Research
Correspondence address: jellema@voeding.tno.nl

We validated a quenching method for the harvesting of micro-organisms from liquid cultures for gene expression studies. The transcription profiles of quenched L. plantarum WCFS1 cells were compared with those of cells that were harvested by alternative methods. PCA and hierarchical clustering of the resulting transcriptomics data show a clear effect of this quenching method on the transcription profiles.


Long abstract


A-5  Statistical Promoter Regulatory Element Analysis of cDNA Microarray Data For the Prediction of cAMP Responsive Genes
Lyle D Burgoon1, Ken Y Kwan, Tim Zacharewski2
1 Dept of Pharmacology & Toxicology; 2 Institute for Environmental Toxicology, National Food Safety & Toxicology Center
Correspondence address: burgoonl@msu.edu

We have developed a method for predicting transcription factor responsive genes by combining response element prediction with cDNA microarray data. Our Statistical Promoter REgulatory Element (SPREE) application program identified cAMP responsive elements. SPREE output was combined with cDNA microarray data to design an SVM model for predicting cAMP responsive genes.

Long abstract


A-6  The RNA Abundance Database and its Annotation Web-Forms.
Elisabetta Manduchi1, G.R. Grant, Hongxian He, J. Liu, M.D. Mailman, A. Pizarro, P.L. Whetzel, C.J. Stoeckert Jr.
1manduchi@pcbi.upenn.edu, Center for Bioinformatics, University of Pennsylvania
Correspondence address: manduchi@pcbi.upenn.edu

RAD and its web-based annotation forms are a system aimed at the collection, organization, and exchange of all relevant information pertaining to gene expression array (and SAGE) studies. The richness of information captured and the use of ontologies render RAD a very powerful infrastructure for querying and analysis (http://www.cbil.upenn.edu/RAD3).

Long abstract


A-7  MPRIME: Efficient Large Scale Multiple Primer Design for Customized Microarrays
Eric Rouchka1, Nigel Cooper2, Abdelnaby Khalyfa
1eric.rouchka@louisville.edu, University of Louisville; 2nigelcooper@louisville.edu, University of Louisville
Correspondence address: eric.rouchka@louisville.edu

MPrime is a system for efficiently creating large sets of PCR primer pairs for use in designing products for custom cDNA microarrays. MPrime has allowed us to effectively design custom neurodegenerative microarray chips for humans as well as the rat and mouse genomes. MPrime is available at: http://kbrin.a-bldg.louisville.edu/Tools/MPrime/

Long abstract


A-9  Estimation of oncogenes by Bayesian inverse modeling of gene-expression patterns
Mathaeus Dejori1, Martin Stetter2
1mathaeus.dejori.external@mchp.siemens.de, Technical University of Munich; 2stetter@siemens.com, Siemens AG
Correspondence address: mathaeus.dejori.external@mchp.siemens.de

We train a Bayesian network to represent statistical dependencies between gene-expression levels from DNA-microarray datasets. The trained network is used to predict the effect of pathologically altered expression levels on the global expression pattern (inverse modeling). By use of this ability we can powerfully predict new genes involved in pathogenesis.

Long abstract


A-10  Using scale-free topology to estimate critical genes from regulatory networks
Mathaeus Dejori1, Martin Stetter2
1mathaeus.dejori.external@mchp.siemens.de, Technical University of Munich; 2stetter@siemens.com, Siemens AG
Correspondence address: mathaeus.dejori.external@mchp.siemens.de

We present a method for estimating key regulatory genes of genetic networks by analyzing their network topology. In networks learned from childhood leukemia microarray datasets we find a small number of genes such as POU2AF1 that may contribute to B-cell tumorigenesis.

Long abstract


A-11  Using Bayesian network learning to model yeast transcriptional response to nitrogen oxide
Jingchun Zhu1, Joe DeRisi2
1jzhu@itsa.ucsf.edu, UCSF; 2joe@derisilab.ucsf.edu, UCSF
Correspondence address: jzhu@itsa.ucsf.edu

We used a Bayesian Network learning technique to analyze microarray transcriptional response profiles of yeast to nitrogen oxide. Using gene clusters as network nodes, the learned transcription response networks are consistent with the proposed biological hypotheses. The model also revealed a previously unknown link between galactose input and a fzf1 dependent cluster.

Long abstract


A-12  GEPAS, a web-based resource for microarray gene expression data analysis
Javier Herrero1, Fatima Al-Shahrour2, Ramon Diaz-Uriarte, Alvaro Mateos, Juan M. Vaquerizas, Javier Santoyo, Joaquin Dopazo
1jherrero@cnio.es, CNIO; 2falshahrour@cnio.es, CNIO
Correspondence address: jdopazo@cnio.es

GEPAS is a web-based pipeline for microarray gene expression profile analysis, freely available at http://gepas.bioinfo.cnio.es. The most commonly used tools for the processing and management of different functional genomics data are included in GEPAS as interconnected modules that exchange information in a user friendly manner.

Long abstract


A-13  Robust k-means Clustering of Gene Expression
Chris Benner1, Dimitri Petrov2, Yong-Chuan Tao, Karine G. Le Roch, Garret Hampton, Elizabeth A. Winzeler, Jiayu Liao, Guangzhou Zou, Peter Schultz, Yingyao Zhou
1cbenner@gnf.org; 2dpetrov@gnf.org
Correspondence address: zhou@gnf.org

The existing k-means clustering algorithm for gene expression data suffers from its uncertainty and ambiguity. This robust k-means clustering algorithm demonstrates how both variations in data sources and the intrinsic indeterminacy of clustering procedures can be overcome and that reliable, informative, and optimal clustering results can be achieved.

Long abstract


A-14  Quantitative microarray spot profile optimization: A systematic evaluation of buffer/slide combinations
D P Kreil1, R P Auburn, L A Meadows, S Russell, G Micklem
1ISMB03@Kreil.Org, University of Cambridge
Correspondence address: ISMB03@Kreil.Org

Selection of spotting-buffer and slide-chemistry is critical for the reliability of microarray-hybridization-experiments. Comparisons, however, have tended to be subjective and not suitable for systematic study. We present a novel approach that objectively assesses spot-morphology and -variance by measuring the average (variance) of radial spot-pixel-intensity-profiles. It is successfully demonstrated by comparing over 24x6 buffer/slide combinations.

Long abstract


A-15  Module Networks: Identifying Regulatory Modules and their Condition Specific Regulators from Gene Expression Data
Eran Segal1, Aviv Regev2, Dana Pe'er, Michael Shapira, David Botstein, Daphne Koller, Nir Friedman
1eran@cs.stanford.edu, Stanford; 2ARegev@CGR.Harvard.edu, CGR
Correspondence address: eran@cs.stanford.edu

We present a probabilistic method for identifying regulatory modules from gene expression data. Our procedure identifies modules of coregulated genes, their regulators and the conditions under which regulation occurs, generating testable hypotheses in the form ‘regulator X regulates module Y under conditions W’. We present microarray experiments supporting three novel predictions, suggesting regulatory roles for previously uncharacterized proteins.

Long abstract


A-16  Integrated Storage For Microarray Experimental Data
Supawan Prompramote1, Yi-Ping Phoebe Chen2, Frederic Maire
1s.prompramote@student.qut.edu.au, Centre for Information Technology Innovation, Faculty of Information Technology, Queensland University of Technology; 2p.chen@qut.edu.au, Centre for Information Technology Innovation, Faculty of Information Technology, Queensland University of Technology
Correspondence address: s.prompramote@student.qut.edu.au

The use of different terminologies and structures in microarray databases is limiting the sharing of data and the collating of results between laboratories. We have proposed an integrated information management architecture for microarray experimental data that will focus on addressing these problems.


Long abstract


A-17  Implementation of BASE for microarray data analysis at ACGT Microarray facility, Pretoria, South Africa
Daniel F. Theron1, David K. Berger2, Sanushka Naidoo, Fourie Joubert
1danie.theron@fabi.up.ac.za, University of Pretoria; 2dberger@postino.up.ac.za, University of Pretoria
Correspondence address: danie.theron@fabi.up.ac.za

This concise guide demonstrates the implementation of BASE as a microarray data analysis pipeline. The BioArray Software Environment is an open-source platform for archiving, analyzing and visualizing microarray data. The data for this demonstration compare gene expression between wild-type and mutant Arabidopsis plants; the comparison identified 86 differentially expressed genes.


Long abstract


A-18  Estimating Gene Networks by Bayesian Networks from Microarrays and Biological Knowledge
Seiya Imoto1, Tomoyuki Higuchi2, Takao Goto, Kousuke Tashiro, Satoru Kuhara, Satoru Miyano
1imoto@ims.u-tokyo.ac.jp, University of Tokyo; 2higuchi@ism.ac.jp, The Institute of Statistical Mathematics
Correspondence address: imoto@ims.u-tokyo.ac.jp

We propose a statistical method for estimating a gene network based on Bayesian networks from microarray data together with biological knowledge including protein-protein interactions, protein-DNA interactions, binding site information, existing literature and so on. Our method can optimize the balance between microarray and biological knowledge automatically.

Long abstract


A-19  A new FDR algorithm for differential expression analysis of microarray data
Gregory Grant1, Elisabetta Manduchi2, Christian Stoeckert
1ggrant@pcbi.upenn.edu, CBIL; 2manduchi@pcbi.upenn.edu, CBIL
Correspondence address: ggrant@pcbi.upenn.edu

PaGE is an algorithm we have developed at CBIL which uses statistical methods to assign discrete patterns to gene expression data. This poster will highlight the new implementation (version 5.0), with an improved FDR algorithm based on the Westfall and Young minP step-down distributions and a new interface.

Long abstract


A-20  GenMAPP and MAPPFinder 2.0: Tools for Viewing and Analyzing Genomic and Proteomic Data Using Gene Ontology and Biological Pathways
Kam D. Dahlquist1, Scott W. Doniger2, Nathan Salomonis, Karen Vranizan, Steven C. Lawlor, and Bruce R. Conklin
1kadahlquist@vassar.edu, Department of Biology, Vassar College; 2sdoniger@gladstone.ucsf.edu, Gladstone Institute of Cardiovascular Disease
Correspondence address: kadahlquist@vassar.edu

GenMAPP is designed for viewing expression data on biological pathways. GenMAPP automatically color-codes the genes according to criteria supplied by the user. MAPPFinder matches expression data to the Gene Ontology and indicates whether there is a significant over-representation of genes meeting the user’s criterion for each GO term. http://www.GenMAPP.org.

Long abstract


A-21  Decision-tree approach to the classification of prostate tissue samples using microarray gene expression data
Changqing Ma1, Rajiv Dhir2, Jianhua Luo, George Michalopoulos, Michael Becich, John Gilbertson
1chmst40@pitt.edu, University of Pittsburgh; 2dhirr@MSX.UPMC.EDU, University of Pittsburgh
Correspondence address: chmst40@pitt.edu

A decision-tree learning approach was applied to classify three types of prostate tumor tissue samples using microarray gene-expression data. In LOOCV, results were comparable to those obtained by applying SVMs or a weighted voting method to this dataset. Furthermore, human-understandable models from decision-tree learning correctly predicted sample classes in previously published prostate tumor datasets.

Long abstract


A-22  Common transcription factor binding sites in the regulatory regions of a cluster of genes statistically linked to the hox gene HB24
Mar Bellido 1, Whipple Neely2, Fan W, Beppu L, Zhao LP, Radich JP
1mbellido@fhcrc.org, Fred Hutchinson Cancer Research Center; 2whipple@fhcrc.org, Fred Hutchinson Cancer Research Center
Correspondence address: mbellido@fhcrc.org

The hox gene HB24 encodes a protein expressed in CD34 cells which plays a role in T-cell activation. Using a statistical approach based on regression analysis we identified 29 genes co-regulated with the HB24 gene. The analysis of the regulatory regions of these genes revealed common transcription factor binding sites.

Long abstract


A-23  BioRag - Bio Resource for Array Genes: An Online Resource for Analyzing and interpreting Microarray data
Ritu Pandey1, Raghavendra K Guru 2, David W Mount
1ritu@u.arizona.edu, Bioinformatics, Arizona Cancer Center, University of Arizona; 2graghave@cs.arizona.edu, Bioinformatics, Arizona Cancer Center, University of Arizona
Correspondence address: ritu@u.arizona.edu

BioRag (Bio Resource for Array Genes at http://www.biorag.org) is an interactive platform for analyzing and developing a biological interpretation of microarray results. Differential gene expression patterns can be interpreted using tools that mine and extract a variety of biological relationships captured in this integrative resource.

Long abstract


A-24  QUINTET: An R-based unified cDNA microarray data analysis system with graphical user interface
Tae-Hoon Chung1, Cheol-Goo Hur2, Sun Yong Park, Hyo Soo Lee
1thcng@kribb.re.kr, KRIBB; 2hurlee@kribb.re.kr, KRIBB
Correspondence address: thcng@kribb.re.kr

We present QUINTET: an R-based unified cDNA microarray data analysis system with graphical user interface. It can seamlessly perform five principal categories of the data analysis: data quality assessment, faulty spot filtering and normalization, identification of differentially expressed genes, clustering of gene expression profiles and classification of samples.

Long abstract


A-25  Combining Bayesian Network Model with Promoter Element Detection for Estimating Gene Networks from Gene Expression Data
Yoshinori Tamada1, SunYong Kim2, Hideo Bannai, Seiya Imoto, Kousuke Tashiro, Satoru Kuhara, Satoru Miyano
1tamada@kuicr.kyoto-u.ac.jp, Laboratory of Biological Information Network, Bioinformatics Center, Institute for Chemical Research, Kyoto University; 2sunk@ims.u-tokyo.ac.jp, Laboratory of DNA Information Analysis, Human Genome Center, Institute of Medical Science, University of Tokyo
Correspondence address: tamada@kuicr.kyoto-u.ac.jp

We developed a statistical method for estimating gene networks and detecting promoter elements simultaneously. The estimation of a gene network from cDNA microarray data alone is likely to produce misdirected edges. Our method overcomes this problem by integrating microarray data and DNA sequence information into a Bayesian network estimation.

Long abstract


A-26  Design of the custom whole-genome malaria oligonucleotide array
Serge Batalov1, Elizabeth A. Winzeler2
1batalov@gnf.org, GNF; 2winzeler@scripps.edu, TSRI/GNF
Correspondence address: batalov@gnf.org

To study the transcriptome of the malaria parasite, we designed a custom no-mismatch oligonucleotide array containing 260,596 25mer single stranded probes from predicted coding sequence (including mitochondrion and plastid genome sequences) and 106,630 probes from non-coding sequence. In addition, 124,957 probes from Plasmodium yoelii contigs are included on the array.

Long abstract


A-27  Better Affymetrix Estimates
Mark Reimers1
1mark.reimers@cgb.ki.se, Karolinska
Correspondence address: mark.reimers@cgb.ki.se

This shows improvements on the best current estimates for gene abundance using Affymetrix raw data, by compensating for spatial heterogeneity, and by assessing individual probe quality and background, prior to using a robust method for fit.

Long abstract


A-28  A New Method of Block/Spot Indexing with Maximal epsilon-Regularity Point Set for Microarray Image Analysis
Hee-Jeing Jin1, Ho-Youl Jung2, Hyun-Kyung Lee, Choon-Hwan Lee, Hwan-Gue Cho
1hjjin@pearl.cs.pusan.ac.kr, Pusan National University; 2hyjung@ngri.re.kr, National Genome Research Institute
Correspondence address: hjjin@pearl.cs.pusan.ac.kr

It is very difficult to automatically analyze microarray images due to several problems such as spot position variation. We propose a novel block and spot indexing algorithm with the use of maximal epsilon-regularity. The time complexity of our algorithm is O(n^2), where n is the number of cells.

Long abstract


A-29  A Novel Feature Selection Method using Evolving Supervised Clustering and Applications for Gene Expression Data Modeling
Nikola Kasabov1, Liang Goh2
1nkasabov@aut.ac.nz, KEDRI; 2liang.goh@aut.ac.nz, KEDRI
Correspondence address: liang.goh@aut.ac.nz

The method combines the tasks of classification and feature selection by using the obtained clusters in ECOS to further extract specific features for each of the clusters. The method overcomes the problem of the signal to noise ratio method when data of the same class are spread in several clusters.

Long abstract


A-30  GeneAnnot: Annotation of high-density oligonucleotide arrays and their linking with GeneCards.
Vered Chalifa-Caspi1, Itai Yanai2, Ron Ophir, Michael Shmoish, Hila Benjamin-Rodrig, Naomi Rosen, Pavel Kats, Marilyn Safran, Orit Shmueli and Doron Lancet.
1vered.caspi@weizmann.ac.il, Weizmann Institute of Science; 2Iyanai@wisemail.weizmann.ac.il, Weizmann Institute of Science
Correspondence address: vered.caspi@weizmann.ac.il

The availability of entire genomic sequences enables matching the short probe sequences of oligonucleotide arrays to their annotated gene representations. Here, we present a framework for estimating the sensitivity and specificity of gene-representing probe sets, and for integrating this information with the comprehensive genome and transcriptome repositories of the GeneCards databases suite.

Long abstract


A-31  Reliable feature extraction from mechanically spotted two-color microarrays.
Yuching Lai1, Greg Tyrelle2, Daniel Di Giusto, Garry C. King
1yuching@kinglab.unsw.edu.au, UNSW; 2greg@kinglab.unsw.edu.au, UNSW
Correspondence address: yuching@kinglab.unsw.edu.au

An intrinsic weakness – spot inhomogeneity – can be turned into a strength by using pixel correlation methods to reliably extract red/green ratios, identify dye-selective quenching and cull spots by data quality. We compare our methods to established approaches.

Long abstract


A-32  Target selection for the custom oligonucleotide array by clustering experimentally determined and computationally predicted transcript sets in mouse
Serge Batalov1
1batalov@gnf.org, GNF
Correspondence address: batalov@gnf.org

Custom oligonucleotide array design is aimed at effectively interrogating the largest possible non-redundant set of transcripts under a physical size constraint. 200,000+ publicly available and proprietary mouse transcript sequences were clustered. The custom chip was subsequently used extensively to profile expression in 70+ different tissues.

Long abstract


A-33  Normalization of cDNA Microarray Data Using R-Language
Sang Cheol Kim1, In Uk Hwang2, In Young Kim¹, Sunho Lee³, Hyun Cheol Chung¹, Sun Young Rha¹, Byung Soo Kim²
1kimsc77@yonsei.ac.kr, Brain Korea 21 Project for Medical Science, Cancer Metastasis Research Center, Yonsei University College of Medicine; 2mzhwang@yonsei.ac.kr, Applied Statistics, Yonsei University
Correspondence address: kimsc77@yonsei.ac.kr

NOM-R, a user-friendly R-based software program that implements the normalization procedures of Yang et al., has been developed. The program is convenient for biologists to use for microarray data normalization and can also handle repeated intensity values and dye-swap experiments simultaneously.

Long abstract


A-34  Expression profiling and analysis of transcription factors for neuronal differentiation from stem cells
Dong Mi Shin1, Joon Ik Ahn2, Ki Hwan Lee, Young Seek Lee, Yong Sung Lee
1dongmishin@yahoo.com, Hanyang University; 2joonic@gaiagene.com, Hanyang University
Correspondence address: dongmishin@yahoo.com

To identify transcription factors that may play an important role in the differentiation of stem cells into neurons, high-throughput gene expression experiments and computational analysis were performed. Our results suggest many candidate transcription factors, both novel ones and those previously known to be involved in differentiation signaling.

Long abstract


A-35  Latin Square Design to Gene Expression Experiments
Tetsutaro Hamano1, Akira Ohide2, Masaru Sekijima, Kazuto Nishio, Masahiro Takeuchi, Yasuhiro Fujiwara
1hamanot@pharm.kitasato-u.ac.jp, Division of Biostatistics, Kitasato University Graduate School, Japan; 2a-ohide@ankaken.co.jp, Applied Biology Division, Kashima Laboratory, Mitsubishi Chemical Safety Institute, Japan
Correspondence address: hamanot@pharm.kitasato-u.ac.jp

We used a randomized Latin square design for blocking experimental variation and for estimating the effect of tamoxifen in an assay of human breast cancer cell lines. Orthogonal decomposition of a gene expression map based on the experimental design is expected to elucidate treatment effects as well as systematic error components.
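
A randomized Latin square of this kind can be generated with a short Python sketch. The function name and the construction (a cyclic square with shuffled rows, columns and treatment labels) are assumptions made for illustration; the abstract does not say how the authors randomized their design, and this construction does not sample uniformly from all possible Latin squares.

```python
import random


def randomized_latin_square(treatments, seed=None):
    """Return an n x n layout in which each treatment appears exactly
    once in every row and every column.

    Starts from the cyclic square (row + column) mod n and randomly
    permutes rows, columns and treatment labels.
    """
    rng = random.Random(seed)
    n = len(treatments)
    rows = list(range(n))
    cols = list(range(n))
    labels = list(treatments)
    rng.shuffle(rows)
    rng.shuffle(cols)
    rng.shuffle(labels)
    return [[labels[(r + c) % n] for c in cols] for r in rows]


# Hypothetical treatment labels, with rows and columns standing for two
# blocking factors (for example, processing day and plate position).
design = randomized_latin_square(["control", "dose1", "dose2", "dose3"], seed=1)
for row in design:
    print(row)
```

Because every treatment occurs once per row and once per column, row and column effects can be separated from treatment effects in the subsequent analysis.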

Long abstract


A-36  Quality measures for Affymetrix data
Ken Simpson1
1ksimpson@wehi.edu.au, The Walter and Eliza Hall Institute
Correspondence address: ksimpson@wehi.edu.au

We present several methods (qualitative and quantitative) for determining the quality of hybridizations to Affymetrix GeneChips. In particular, we make an attempt to quantify the effect of hybridization artifacts on estimates of gene expression.

Long abstract


A-37  Statistical Characterization of Supervised Learning and Gene Selection Algorithms for Gene Expression Analysis
Eisaku Maeda1, Ichiro Takemasa2, Tomonori Izumitani, Hirotoshi Taira, Kenichi Matsubara, Morito Monden
1maeda@cslab.kecl.ntt.co.jp, NTT Communication Science Laboratories; 2alfa-t@sf6.so-net.ne.jp, Graduate School of Medicine, Osaka University
Correspondence address: maeda@cslab.kecl.ntt.co.jp

We focused on histopathological phenotype prediction in colorectal carcinoma from microarray expression data, and statistically investigated the prediction performance of various combinations of classification techniques and gene selection methods. The results demonstrated that the detected marker genes and the prediction accuracy depend strongly on the combination employed.

Long abstract


A-38  Automation of cDNA microarray image analysis
Jin Hyuk Kim1, Hye Young Kim2, Tae Sung Park, Ki Woong Kim, Young Seek Lee, and Yong Sung Lee
1jhkim1@hanyang.ac.kr, Hanyang University College of Medicine; 2hykim121@hanyang.ac.kr, Hanyang University College of Medicine
Correspondence address: jhkim1@hanyang.ac.kr

To automate microarray image analysis, several processes were developed for detecting spots, filtering bad spots, and generating HTML reports. The system can analyze large numbers of microarray images without user intervention and can therefore serve as a component of a high-throughput pipeline.

Long abstract


A-39  Compensation of scanner before robust M regression normalization in cDNA microarray
Hye Young Kim1, Jin Hyuk Kim2, Yong Sung Lee, Young Seek Lee, Tae Sung Park, Ki Woong Kim, and Hyun Ju Chang
1hykim121@hanyang.ac.kr, Hanyang University College of Medicine; 2jhkim1@hanyang.ac.kr, Hanyang University College of Medicine
Correspondence address: hykim121@hanyang.ac.kr

In cDNA microarray, the conversion of the amount of fluorescence to image intensity with the scanning process must be carefully handled to find the gene expression ratio. We developed a reverse scanning method for the microarray image and applied robust M regression to normalize the data from the compensated image.

Long abstract


A-40  OligoDesign: Design of LNA oligonucleotides for gene expression arrays
Niels Tolstrup1, Peter S. Nielsen and Sakari Kauppinen
1tolstrup@exiqon.com, Exiqon
Correspondence address: tolstrup@exiqon.com

OligoDesign is a web service for the design of DNA/LNA mixmer oligonucleotides. The OligoDesign software features recognition and filtering of the target sequence by genome-wide BLAST analysis. It includes routines for prediction of melting temperature, self-annealing and secondary structure for LNA-substituted oligonucleotides. The OligoDesign program is freely accessible at http://lnatools.com/

Long abstract


A-41  Hierarchical Clustering of Gene Expression Data with the Agglomerative Information-Bottleneck Method
Byoung-Hee Kim1, Kyu-Baek Hwang2, Jung-Ho Chang, Byoung-Tak Zhang
1bhkim@bi.snu.ac.kr, Biointelligence Lab, Seoul National University; 2kbhwang@bi.snu.ac.kr, Biointelligence Lab, Seoul National University
Correspondence address: bhkim@bi.snu.ac.kr

By applying the double clustering with the agglomerative information-bottleneck method to NCI60 cell lines, the correspondence between gene expression patterns and the ostensible origins of the tumours was verified. By computing 'entropy', mutual information, and its variation for several stages of clustering, an appropriate number of clusters could be estimated.
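
The entropy and mutual information quantities mentioned above can be computed from a contingency table of cluster labels against tumour origins. The following Python sketch is illustrative only: the function names and the toy count tables are assumptions, and it shows the standard definitions rather than the agglomerative information-bottleneck procedure itself.

```python
import math


def entropy(probs):
    """Shannon entropy in bits; zero-probability cells contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def mutual_information(joint_counts):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for a table of co-occurrence counts.

    Rows index one variable (e.g. cluster label) and columns the other
    (e.g. tissue of origin).
    """
    total = sum(sum(row) for row in joint_counts)
    joint = [[c / total for c in row] for row in joint_counts]
    p_rows = [sum(row) for row in joint]
    p_cols = [sum(col) for col in zip(*joint)]
    h_joint = entropy(p for row in joint for p in row)
    return entropy(p_rows) + entropy(p_cols) - h_joint


# Toy tables (hypothetical): clusters that match origins perfectly,
# and clusters that carry no information about origin.
perfect = [[5, 0], [0, 5]]
uninformative = [[1, 1], [1, 1]]
print(mutual_information(perfect), mutual_information(uninformative))
```

Tracking how such a quantity changes as clusters are merged is one way to look for a natural stopping point, which is the kind of use the abstract describes.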

Long abstract


A-42  A multivariate method for comparison of microarray data from different platforms
Aedin C Culhane1, Guy Perriére2, Desmond G. Higgins
1A.Culhane@ucc.ie, University College Cork; 2perriere@biomserv.univ-lyon.fr, Universite Claude Bernard
Correspondence address: A.Culhane@ucc.ie

We describe a powerful method for comparison and visualisation of gene expression data from different microarray platforms. Co-inertia analysis (CIA) is a multivariate method that identifies co-relationships in multiple datasets. The genes from each dataset, which define these trends, can be identified. Further details: http://bioinfo.ucc.ie.

Long abstract


A-43  Exact Power Under Independence for the False Discovery Rate in Gene Expression Array Experiments
Lawrence Hunter1, Deborah H. Glueck2, Anis Karimpour-Fard and Keith E. Muller
1Larry.Hunter@uchsc.edu, U. Colorado School of Medicine; 2, U. Colorado School of Medicine
Correspondence address: Larry.Hunter@uchsc.edu

The false discovery rate (Benjamini & Hochberg, 1995) is widely used for multiple comparison problems, including gene expression array studies. For independent, but not necessarily identically distributed test statistics, we derive the joint probability distribution of the number of total and false rejections, and thereby provide methods for exact small sample power and sample size.
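
For reference, the Benjamini and Hochberg step-up procedure whose power is being studied can be sketched in Python as follows. The function name and the example p-values are assumptions made for illustration; the sketch covers only the rejection rule, not the exact power or sample size calculations described in the abstract.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans marking which hypotheses are rejected.

    Step-up rule: sort the m p-values, find the largest rank k with
    p_(k) <= k * q / m, and reject the k hypotheses with the smallest
    p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * q / m:
            cutoff_rank = rank  # keep the largest rank that passes
    rejected = [False] * m
    for i in order[:cutoff_rank]:
        rejected[i] = True
    return rejected


# Hypothetical p-values for ten genes.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(sum(benjamini_hochberg(p, q=0.05)))
```

Because the rule is step-up, a hypothesis can be rejected even if its own p-value exceeds its own threshold, provided a larger p-value further down the sorted list passes.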

Long abstract


A-44  Application of Stellar Photometry To The Analysis of Microarray Images
Mahyar Sabripour1, Christopher I. Amos2, Kevin Coombes
1msabripo@mdanderson.org, Department of Epidemiology, The University of Texas M. D. Anderson Cancer Center; 2camos@request.mdacc.tmc.edu, Department of Epidemiology, The University of Texas M. D. Anderson Cancer Center
Correspondence address: msabripo@mdanderson.org

Improvements in quantifying spots can directly impact the identification of genes critical to the development and progression of cancer. We utilize a stellar photometric model, the Moffat function, to analyze cDNA membrane microarray images. In the current setting, we fit the Moffat function to cDNA spots on the array.
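
The Moffat function is a radial profile commonly used for stellar images. A minimal Python sketch of the profile and its width follows; the parameter names (amplitude, alpha, beta, background) are assumptions chosen for the example, and the fitting step applied to the cDNA spots is not shown.

```python
import math


def moffat(r, amplitude, alpha, beta, background=0.0):
    """Moffat profile: background + amplitude * (1 + (r/alpha)^2)^(-beta).

    r is the distance from the spot centre, alpha sets the width and
    beta controls how quickly the wings fall off.
    """
    return background + amplitude * (1.0 + (r / alpha) ** 2) ** (-beta)


def moffat_fwhm(alpha, beta):
    """Full width at half maximum of the profile above the background."""
    return 2.0 * alpha * math.sqrt(2.0 ** (1.0 / beta) - 1.0)


# Hypothetical spot: peak intensity 100, alpha = 2 pixels, beta = 2.5.
half_width = moffat_fwhm(2.0, 2.5) / 2.0
print(moffat(0.0, 100.0, 2.0, 2.5), moffat(half_width, 100.0, 2.0, 2.5))
```

In a fit, the amplitude, alpha, beta and background would be adjusted so that the profile matches the observed pixel intensities of each spot.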


Long abstract


A-45  An automatic and unbiased GA for finding the most discriminant gene sets on Microarray
Han-Yu Chuang1, Hwa-Sheng Chiu2, Huai-Kuang Tsai, Cheng-Yan Kao
1r90002@csie.ntu.edu.tw, Bioinfo Lab., Department of CSIE, National Taiwan University; 2r91031@csie.ntu.edu.tw, Bioinfo Lab., Department of CSIE, National Taiwan University
Correspondence address: cykao@csie.ntu.edu.tw

A multi-objective genetic algorithm based approach, combining univariate and multivariate techniques, was proposed to find optimal gene sets for sample classification on gene expression data automatically and without bias. Eight genes with 93% LOOCV accuracy using KNN were selected as the optimal predictive gene set for the colon cancer dataset.
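
The LOOCV accuracy of a KNN classifier, the figure of merit quoted above, can be computed as in the following Python sketch. The function names, the Euclidean distance, the value of k and the toy expression values are all assumptions made for illustration; the sketch does not include the genetic algorithm or any gene selection step.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Predict the majority label among the k nearest training samples."""
    nearest = sorted(
        range(len(train_x)), key=lambda i: math.dist(train_x[i], query)
    )[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def loocv_accuracy(samples, labels, k=3):
    """Hold out each sample in turn, classify it from the remaining
    samples, and return the fraction classified correctly."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if knn_predict(train_x, train_y, samples[i], k) == labels[i]:
            correct += 1
    return correct / len(samples)


# Toy expression values for two hypothetical genes in eight samples.
samples = [[0, 0], [0, 1], [1, 0], [1, 1],
           [10, 10], [10, 11], [11, 10], [11, 11]]
labels = ["normal"] * 4 + ["tumor"] * 4
print(loocv_accuracy(samples, labels, k=3))
```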

Long abstract


A-46  MGraph: graphical models for microarray data analysis
Junbai Wang1, Ola Myklebost2, Eivind Hovig, Norwegian Radium Hospital, j.e.hovig@labmed.uio.no
1junbaiw@radium.uio.no, Norwegian Radium Hospital; 2olam@radium.uio.no, Norwegian Radium Hospital
Correspondence address: junbaiw@radium.uio.no

MGraph is a MATLAB toolbox that applies graphical models to solve problems in microarray data analysis. With its graphical interface, MGraph allows users to predict genetic regulatory networks with a graphical Gaussian model, and to quantify the effects of different experimental treatment conditions on gene-expression profiles with a graphical log-linear model.

Long abstract


A-47  Elucidating Patterns within Patterns: A Post-Processing Step in Promoter Sequence Analysis
Jessica Mar1, Alvis Brazma2
1jess@ebi.ac.uk, European Bioinformatics Institute; 2brazma@ebi.ac.uk, European Bioinformatics Institute
Correspondence address: jess@ebi.ac.uk

SPEXS is an algorithm that extracts statistically overrepresented patterns in a set of sequences. SPEXS output generally contains too many significant patterns for a user to survey in detail, hence a post-processing step designed to group these patterns into key clusters is helpful. We present approaches to isolate these clusters.

Long abstract


A-48  Infectomic Analysis of Cryptococcus Infections Using DNA Microarray
Ambrose Jong1, Timothy Triche2, Steven H-M Chen, Sheng-He Huang
1ajong@chla.usc.edu, Childrens Hospital Los Angeles/University of Southern California; 2ttriche@chla.usc.edu, Childrens Hospital Los Angeles/University of Southern California
Correspondence address: shhuang@hsc.usc.edu

Our laboratories have been keenly interested in infectomic analysis of transcription profiles in human BMEC infected with C. neoformans. We performed a time-course study of C. neoformans infection using oligonucleotide microarrays. We have found dynamic changes in transcription profiles of cytokines that are important for pathogenesis of Cryptococcus meningitis.

Long abstract



New Frontiers
L-1  HyBrow: A Hypothesis Space Browser
Nigam Shah1, Stephen Racunas2, Nina V. Fedoroff
1nigam@psu.edu, Penn State University; 2sar147@psu.edu, Penn State University
Correspondence address: nigam@psu.edu

HyBrow is a prototype computer system comprising an event-based ontology for biological processes, an associated database and programs to perform hypothesis evaluation using a wide variety of available data. We demonstrate the feasibility of HyBrow, using the galactose metabolism gene network in Saccharomyces cerevisiae as our test system, for ranking alternative hypotheses in an automated manner.

Long abstract


L-2  Phylogenetic footprinting of co-expressed genes by Tree-Gibbs sampling
Stefan Van Yper1, Olivier Thas2, Jean-Pierre Ottoy and Wim Van Criekinge
1Stefan@biomath.rug.ac.be, Department of Applied Mathematics, Biometrics and Process Control, Ghent University; 2olivier.thas@rug.ac.be, Department of Applied Mathematics, Biometrics and Process Control, Ghent University
Correspondence address: Stefan@biomath.rug.ac.be

Using site/motif Gibbs sampling, transcription factor binding sites can be found by analysing either the promoter sequences of co-expressed genes or the promoter sequences of orthologous genes. Tree-Gibbs sampling combines both data sources in one algorithm. The additional information improves both the speed and the accuracy of motif finding.

Long abstract


L-3  Statewide Bioinformatics in Kentucky
Eric Rouchka1, Nigel Cooper2
1eric.rouchka@louisville.edu, University of Louisville; 2nigelcooper@louisville.edu, University of Louisville
Correspondence address: eric.rouchka@louisville.edu

The KBRIN bioinformatics core is attempting to create a Kentucky-wide network of bioinformatics expertise. This venture has led to the identification of knowledge- and compute-based resources. The core seeks to improve bioinformatics knowledge through research and the creation of bioinformatics courses, certificates, and degrees. The core web site is: http://www.kbrin.louisville.edu/about/bioinform_core.html

Long abstract


L-4  Developing Analysis and Visualization Tools for Lead Discovery
Dimitri Petrov1, Shumei Jiang2, Andrey Santrosyan, Hayk Asatryan, Kaisheng Chen, Chris Benner, Robert Downs, John Isbell, Yingyao Zhou
1dpetrov@gnf.org; 2sjiang@gnf.org
Correspondence address: zhou@gnf.org

The Genomics Institute of the Novartis Research Foundation (GNF) is developing data analysis and visualization tools on top of a web-based informatics system for its lead discovery biomedical research. Recent tools include dose-response data fitting and visualization, structural similarity-based compound hierarchical clustering, ring component-based compound diversity analysis, and LCMS data visualization.

Long abstract


L-5  Using Web Services as part of the 2D gel analysis workflow
Nataliya Sklyar1, Matthias Berth2, Dirk Lewerentz
1sklyar@informatik.uni-leipzig.de, University of Leipzig; 2berth@decodon.com, DECODON GmbH
Correspondence address: sklyar@informatik.uni-leipzig.de

We present the integration of web services into Delta2D, an end-user 2D gel analysis application. With Delta2D's web services plugins, users can access data from external sources and display it alongside the image analysis results in a uniform way. We describe the plugin architecture and first experiences in its use.

Long abstract


L-6  A DNA-based Theorem Proving by Resolution Refutation
Ji-Yoon Park1, In-Hee Lee2, Young-Gyu Chai, Byoung-Tak Zhang
1scolaswhite@hotmail.com, Dept. of Biochemistry and Molecular Biology, Hanyang University; 2ihlee@bi.snu.ac.kr, Biointelligence Laboratory, School of Computer Science and Engineering, Seoul National University
Correspondence address: scolaswhite@hotmail.com

Theorem proving is a method involving logical reasoning and has a variety of applications, including diagnosis and decision-making. Resolution refutation is a general technique for proving the validity of a theorem given a set of axioms and rules. We implement theorem proving by resolution refutation using DNA molecular reactions.

Long abstract


L-7  G-language Genome Analysis Environment Version 2
Yohei Yamada1, Kazuharu Arakawa2, Ryo Hattori, Yusuke Kobayashi, Hayataro Kouchi, Atsuko Kishi, Masaru Tomita
1skipper@g-language.org, Keio University, Department of Environmental Information; 2gaou@g-language.org, Keio Institute for Advanced Biosciences
Correspondence address: skipper@g-language.org

Version 2 of the G-language Genome Analysis Environment was developed to gain further speed and integrity through an object-oriented architecture. Powered by the front-end "inspire", based on Flash/HTML, and the database system "bluebird", the new environment aims to achieve greater flexibility and integrity.

Long abstract


L-8  Integrated data Modeling of Protein Structures by using a fact constellation model based on an XML Mediated Warehouse System
RongHua Li1, Sung-Hee Park2, Kwang Su Jeong, Keun Ho Ryu
1lrh@dblab.chungbuk.ac.kr, Chungbuk University; 2shpark@dblab.chungbuk.ac.kr, Chungbuk University
Correspondence address: ksjeong@dblab.chungbuk.ac.kr

This paper describes integrated protein structure modeling using a fact constellation model, and represents the model in XML in order to store and query highly complex protein data on an XML-mediated warehouse system. Complex queries used during the analysis process are performed with XML query processing.

Long abstract


L-9  Using XML-RPC for Distributed BLAST -Desterilizing idle resources-
Yong Wook Kim1, Keun Woo Lee, Hee Won, Yong-Ho In
1yongari@bioinfomatix.com, Bioinfomatix, Inc.
Correspondence address: yongari@bioinfomatix.com

Personal computers usually run the MS Windows operating system, and their computing power is mostly used for simple tasks. We try to use these idle resources as member nodes of a cluster system. XML-RPC provides a straightforward way to distribute BLAST and allows machines with heterogeneous operating systems to serve as members of the distributed BLAST system.

Long abstract


L-10  MineLink: A novel information integration framework for Life Sciences
Tanveer Syeda-Mahmood1, Bhooshan Kelkar2
1stf@almaden.ibm.com, IBM Almaden Research; 2bkelkar@us.ibm.com, IBM Life Sciences
Correspondence address: stf@almaden.ibm.com

MineLink is a novel information integration framework that can pull together life sciences data and analytic applications from disparate sources. It specifies a design methodology for automatically integrating individual components, be they data sources, processors, data miners or visualization components without the need for explicit programming. It addresses both syntactic and semantic aspects of information integration.

Long abstract


L-11  TransMiner: Biotransformation Prediction
Minesh Upadhyaya1, Imran Shah2, Daniel McShan, Weiming Zhang
1minesh.upadhyaya@uchsc.edu, UCHSC; 2imran.shah@uchsc.edu, UCHSC
Correspondence address: minesh.upadhyaya@uchsc.edu

We developed TransMiner, a symbolic computational approach for inferring novel biochemical functions. We have developed a novel sub-graph isomorphism-based algorithm to search the detailed representations of known biotransformations to induce biocatalytic "rules". These symbolic biocatalytic rules represent the simplified functions of enzymes and can be used to infer novel biochemistry.

Long abstract


L-13  Gene-Protein networks in Drosophila melanogaster
Inigo San Gil1, Kevin White2, Joel Bader, Tong-Ruei Li
1inigo.sangil@yale.edu, Yale University; 2kevin.white@yale.edu, Yale University
Correspondence address: inigo.sangil@yale.edu

The poster shows a gene network based on a map of interactions between proteins and genes in Drosophila melanogaster. The map is based on cross-correlations of genome-wide time courses of gene expression and yeast two-hybrid interactions. Results show a rich new network of connections between genes and proteins.

Long abstract


L-14  Performing in silico experiments on the Grid using myGrid
Robert Stevens1, Tom Oinn2, Peter Li
1robert.stevens@cs.man.ac.uk, University of Manchester; 2tmo@ebi.ac.uk, European Bioinformatics Institute
Correspondence address: robert.stevens@cs.man.ac.uk

myGrid aims to exploit Grid technology and provide high-level services and middleware that make it suitable for bioinformatics. myGrid uses resource discovery and workflow enactment services that allow scientists to perform in silico experiments over bioinformatics resources. Services are provided to support the scientific method, notably provenance management, change notification and personalization.


Long abstract



Phylogeny and Evolution
G-1  The Evaluation of Different Approaches to Infer Positive Selection Sites
Li Jia1, Tao Jiang2, Michael Clegg
1lijia@cs.ucr.edu, University of California; 2jiang@cs.ucr.edu, University of California
Correspondence address: lijia@cs.ucr.edu

It is important to infer positive selection sites associated with a given gene family. Different approaches have been proposed to detect positive selection at single amino acid sites. We evaluated the performance of these approaches so that researchers can choose a method appropriate to the circumstances of their own study.

Long abstract


G-2  CompMapper: An Automatic Pipeline to Define Conserved Segments between Genomes Systematically
Fu Lu1, Zhenyuan Wang2, Xiangqun Holly Zheng, Wenyan Zhong, Fei Zhong, Richard Mural
1fu.lu@celera.com, Celera Genomics; 2jack.wang@celera.com, Celera Genomics
Correspondence address: fu.lu@celera.com

To overcome the limitations of comparative mapping using orthologous genes, we have developed a new paradigm and systematic approach to define conserved synteny between human and mouse directly from genomic sequence. The automatic pipeline should be applicable to compare any species with complete or draft genome sequence and within an appropriate phylogenetic distance.

Long abstract


G-3  Evolutionary analysis of long terminal repeats of human endogenous retroviruses
Artamonova I.1
1irena@humgen.siobc.ras.ru, Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry RAS, Miklukho-Maklaya 16-10, Moscow, 117997, Russia
Correspondence address: irena@humgen.siobc.ras.ru

We analyzed the locations of HERV-K LTRs in the human genome. Their distribution is not uniform among human chromosomes or among different regions of one chromosome. The majority of HERV-K LTRs are clustered. Positions of LTR clusters correlate with the Giemsa segmentation of human chromosomes. Most clusters are observed in GC-rich regions.

Long abstract


G-5  Development of SCAR marker for the Discrimination of Bang-Poong species of Herbal Medicine
Mi Young Lee1, Byong Seob Ko2, Seong Mi Hong, Jung Eun Kim, Sung Jin Lee
1mylee2020@hanmail.net, Korea Institute of Oriental Medicine; 2bsko@kiom.re.kr, Korea Institute of Oriental Medicine
Correspondence address: mylee2020@hanmail.net

Bang-Poong species is one of the most important species of herbal medicine. We have applied a molecular approach to developing SCAR markers. We show that this methodology can be applied to dried herbal medicine.

Long abstract


G-6  Novel Heterogeneous Maximum Likelihood Methods for The Detection of Adaptive Evolution.
Jennifer Commins1, Dr. James O. McInerney 2
1jennifer.commins@may.ie, NUI, Maynooth; 2james.o.mcinerney@may.ie, NUI, Maynooth
Correspondence address: jennifer.commins@may.ie

Maximum Likelihood methods are popular for analysing sequences to detect adaptive evolution. We have designed new methods for robustly inferring the evolutionary history of extant sequences and identifying signatures of adaptive evolution, performing analyses of multiple sequence alignments in ways closer to biological reality than existing methods. Available at http://bioinf.may.ie/likelihood.

Long abstract


G-7  Modelling change in codon substitution, using serially sampled sequence data.
Matthew Goode1, Allen Rodrigo2
1m.goode@auckland.ac.nz, University of Auckland; 2a.rodrigo@auckland.ac.nz, University of Auckland
Correspondence address: m.goode@auckland.ac.nz

We describe a method for modeling changes in codon substitution in populations where evolution can be measured, e.g. rapidly evolving viral populations. Our model extends Nielsen and Yang's codon model to allow parameters associated with selection and the proportion of selected sites to vary over time.

Long abstract


G-8  Evolution of Toll genes from the perspective of transcriptional regulatory regions
Rajakumar Sankula1, Narayanan Perumal2, Lang Li
1rsankula@iupui.edu, School of Informatics, Indiana University Purdue University Indianapolis; 2nperumal@iupui.edu, School of Informatics, Indiana University Purdue University Indianapolis
Correspondence address: nperumal@iupui.edu

We performed a phylogenetic analysis of Toll gene evolution across insects, plants, and mammals using their transcriptional regulatory regions. This analysis is based on a unique approach employing the frequency of “evolutionarily” informative transcription factor binding sites. Interestingly, the resulting phylogeny is similar to the protein-based phylogeny.

Long abstract


G-11  PyPop: A framework for large-scale population genomics analysis
Alex Lancaster1, Mark P. Nelson2, Richard M. Single; Diogo Meyer; Glenys Thomson
1alexl@socrates.berkeley.edu, UC Berkeley; 2, UC Berkeley
Correspondence address: alexl@socrates.berkeley.edu

PyPop (Python for Population Genetics) is a suite of programs for the analysis of multi-locus population genetic data. Outputs are stored in XML and can be transformed into other data formats. PyPop will be made freely available under the GNU GPL at: http://allele5.biol.berkeley.edu/pypop/

Long abstract


G-12  Bayesian Population Genetics and the Human History of China
Michael Black 1, Cheryl Wise 1, Wei Wang, Alan Bittles
1m.black@ecu.edu.au, Centre for Human Genetics, Edith Cowan University, Perth, Western Australia
Correspondence address: m.black@ecu.edu.au

The history of China is a record of migration, population admixture and community endogamy. In comparing current Bayesian and "Classical" methodologies, we found these factors to be prevalent in the genetic structure of eight Chinese ethnic populations. We conclude that Bayesian models comprising multiple-system data and historical factors are required for future studies.

Long abstract


G-13  A hybrid clustering approach to genome-scale recognition of protein families
Timothy J. Harlow1, J. Peter Gogarten2, Mark A. Ragan
1t.harlow@imb.uq.edu.au, Institute for Molecular Bioscience, University of Queensland; 2gogarten@uconn.edu, University of Connecticut
Correspondence address: m.ragan@imb.uq.edu.au

We develop a hybrid approach to recognizing protein families among multi-genomic datasets based on Markov and single-linkage clustering of normalised pairwise BLASTP bit scores. We present results for all proteins from 114 microbial genomes, and illustrate its utility by recognizing orthologs and paralogs of rotary motor ATP synthetase F1 subunits.
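
The single-linkage half of such a scheme can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name, the input format (protein pairs with a normalised score) and the fixed threshold are assumptions, and the Markov clustering step mentioned in the abstract is not shown.

```python
def single_linkage(proteins, scored_pairs, threshold):
    """Group proteins into clusters by single linkage.

    scored_pairs is an iterable of (protein_a, protein_b, score) tuples.
    Any pair scoring at or above the threshold joins the two proteins'
    clusters (union-find with path compression).
    """
    parent = {p: p for p in proteins}

    def find(x):
        # Walk to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, score in scored_pairs:
        if score >= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    clusters = {}
    for p in proteins:
        clusters.setdefault(find(p), set()).add(p)
    # Sort clusters by their members for a deterministic result.
    return sorted(clusters.values(), key=lambda c: sorted(c))
```

Single linkage merges clusters through any one strong pair, so it tends to chain distant members together; combining it with a second clustering method, as the abstract describes, is one way to counter that tendency.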

Long abstract


G-14  Computing accurate phylogenies from gene-order data
Jijun Tang1, Bernard M.E. Moret2
1jtang@cs.unm.edu, University of New Mexico; 2moret@cs.unm.edu, University of New Mexico
Correspondence address: jtang@cs.unm.edu

DCM-GRAPPA is a method for phylogeny reconstruction based on gene-order data; it scales gracefully to one thousand genomes, using a day or two of computation and producing highly accurate results (within 1%). GRAPPA and DCM-GRAPPA are available in source form at http://www.cs.unm.edu/~moret/GRAPPA/

Long abstract


G-15  EVOLVE: a toolkit for statistical molecular evolutionary analysis of genomes
Gavin Huttley1, Alex Isaev2, Andrew Butterfield, Edward Lang, Cath Lawrence
1gavin.huttley@anu.edu.au, ANU; 2Alexander.Isaev@maths.anu.edu.au, ANU
Correspondence address: gavin.huttley@anu.edu.au

The number of genes and species for which data are now available presents an opportunity for statistically powerful examinations of molecular evolutionary processes. We will present a description of the functionality and performance of EVOLVE (cbis.anu.edu.au/software), a high performance computing package for phylogeny-based maximum likelihood modeling and hypothesis testing.

Long abstract


G-16  Analyses using novel Markov models of substitution support a significant role for germline methylation in male biased evolution.
Matthew J Wakefield1, Gavin A Huttley2, Alexander Isaev, Andrew Butterfield, Edward Lang & Cath Lawrence
1Matthew.Wakefield@kangaroo.genome.org.au, Centre for Bioinformation Science, The Australian National University; 2Gavin.Huttley@anu.edu.au, Centre for Bioinformation Science, The Australian National University
Correspondence address: matthew.wakefield@kangaroo.genome.org.au

We have constructed a novel Markov model of dinucleotide substitution, including parameters for methylation and strand, using the EVOLVE toolkit. Our analysis supports a significant contribution of differential germline methylation to elevated male mutation rates, and demonstrates the utility of EVOLVE (http://cbis.anu.edu.au/software/) in developing new sequence evolution models.


Long abstract


G-17  BOSS: Boxes of Sequence Similarity
Robert Flegg1, Malcolm Simons2
1robert.flegg@med.monash.edu.au, GeneType Pty. Ltd., Fitzroy Vic 3065, Australia and Victorian Bioinformatics Consortium, PO Box 53, Monash University, Clayton Vic 3800, Australia; 2mjsimons@optusnet.com.au,
Correspondence address: robert.flegg@med.monash.edu.au

Alignments display a mosaic pattern due to recombination. Existing programs use a broad window to find major events. This leads to not recognising the finer detail where multiple events occur at different points in different lineages. BOSS uses pairwise comparison and a short sliding window to analyse this mosaic structure.
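
The general idea of a pairwise comparison over a short sliding window can be sketched as below. This is not the BOSS algorithm itself, which the abstract does not specify; the function name, window size and the use of plain percent identity are assumptions chosen for illustration.

```python
def window_identity(seq_a, seq_b, window=10, step=1):
    """Fraction of identical positions in each window of two aligned sequences.

    Returns a list of (window_start, identity) tuples. Abrupt changes in
    identity along the alignment are the kind of signal a mosaic
    (recombinant) structure would produce.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    results = []
    for start in range(0, len(seq_a) - window + 1, step):
        pairs = zip(seq_a[start:start + window], seq_b[start:start + window])
        matches = sum(1 for x, y in pairs if x == y)
        results.append((start, matches / window))
    return results
```

A short window resolves closely spaced breakpoints at the cost of noisier identity estimates, which is the trade-off the abstract alludes to when contrasting broad and short windows.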

Long abstract


G-18  Determining the Eukaryote Phylogeny
Gayle Philip1, James McInerney2
1gayle.k.philip@may.ie, National University of Ireland, Maynooth; 2james.o.mcinerney@may.ie, National University of Ireland, Maynooth
Correspondence address: gayle.k.philip@may.ie

The relationship of nematodes to arthropods and vertebrates can be described by the Coelomata and Ecdysozoa hypotheses. Our aim was to test these hypotheses by finding the supertree that best described the relationship of orthologous, single gene family trees from ten eukaryotic taxa. Our results support the traditional Coelomata hypothesis.

Long abstract


G-19  Evolutionary analysis of single nucleotide polymorphism distribution in duplicated gene pairs of Arabidopsis thaliana
Brad Chapman1, Andrew Paterson2
1chapmanb@uga.edu, University of Georgia; 2paterson@uga.edu, University of Georgia
Correspondence address: chapmanb@uga.edu

Whole genome duplication has played a major role in the structuring of the Arabidopsis thaliana genome. We examined single nucleotide polymorphism (SNP) variation in duplicate genes retained after these duplication events. We compare SNP accumulation in duplicates and singletons with respect to their effect on protein evolution.

Long abstract


G-20  MICROSATELLITE REPEATS IN PLANTS
Chandri N Yandava1, Roger Pennell2, Kenneth Feldmann, Peter Mascia, Richard Flavell, William Kimmerly
1cyandava@ceres-inc.com, Ceres Inc; 2rpennell@ceres-inc.com, Ceres Inc
Correspondence address: cyandava@ceres-inc.com

The classes AAC and AAG are present in higher numbers in Arabidopsis, whereas AAT, AGG and CCG are more abundant in rice. Repeats with A and T bases (except AAT) are more frequent in Arabidopsis, whereas repeats with G and C bases are more frequent in rice, as the rice genome has a high GC content.
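
Counts of this kind can be obtained with a simple scan for tandem repeats, sketched below. The function name and the minimum of four repeat units are assumptions. A repeat class such as AAC would normally also cover rotations and reverse complements of the motif, which this sketch does not handle.

```python
import re


def find_repeat_tracts(sequence, motif, min_units=4):
    """Locate tandem repeats of a motif in a DNA sequence.

    Returns a list of (start_position, number_of_units) tuples for every
    run of at least min_units consecutive copies of the motif.
    """
    pattern = re.compile("(?:%s){%d,}" % (re.escape(motif), min_units))
    return [
        (match.start(), len(match.group(0)) // len(motif))
        for match in pattern.finditer(sequence.upper())
    ]
```

Summing the tract counts per motif over each genome would then give the per-class totals that the comparison between species relies on.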

Long abstract


G-21  Phylogeny of DNA Methyltransferases recognizing GATC and related DNA sequences.
Richard D. Morgan1
1morgan@neb.com, New England Biolabs
Correspondence address: morgan@neb.com

DNA modification plays important roles in nucleic acid metabolism. Methyltransferases modifying the related DNA sequences GATC and GANTC are close homologs. We present a phylogeny of these enzymes as an example of how DNA sequence recognition has evolved, and predict how to evolve further specificities in vitro.

Long abstract


G-22  Freeing Phylogenies from Alignments
Michael Höhl1, Isidore Rigoutsos2, Mark Ragan
1m.hoehl@imb.uq.edu.au, Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD 4072, Australia.; 2rigoutso@us.ibm.com, Computational Biology Center, IBM Thomas J Watson Research Center
Correspondence address: m.hoehl@imb.uq.edu.au

To free phylogenies from alignments, we present two approaches based on pattern discovery using TEIRESIAS. The first computes distances from patterns, where distance is defined analogously to distances on alignments. The second transforms patterns into character data, meaning that we do not have to explicitly extract relevant properties.


Long abstract


G-23  Genome phylogenies based on the mean normalized BLASTP score
Robert G. Beiko1, Robert L. Charlebois2, Mark A. Ragan
1rbeiko@science.uottawa.ca, Dept. of Biology, University of Ottawa, 30 Marie Curie, Ottawa, ON, K1N 6N5, Canada; Institute for Molecular Bioscience, The University of Queensland, Brisbane, Australia; 2rlcharl@neurogadgets.com, GenomeAtlantic, 1721 Lower Water St., Suite 401, Halifax, NS, B3J 1S5, Canada; Dept. of Biology, University of Ottawa, 30 Marie Curie, Ottawa, ON, K1N 6N5, Canada; NeuroGadgets Inc., www.neurogadgets.com; Evolutionary Biology Program, Canadian Institute for Advanced Research
Correspondence address: rbeiko@science.uottawa.ca

Individual genes often fail to reproduce similar trees, sometimes with disparities so serious as to question the entire concept of organismal phylogeny for prokaryotes. However, trees produced from bulk genomic signal display topologies that are largely consistent with accepted taxonomy, suggesting that the underlying mode of prokaryotic evolution is clonal.
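
One way a mean normalized BLASTP score between two genomes might be computed is sketched below. The abstract does not give the normalization, so dividing each best-hit bit score by the query's self-hit bit score is an assumption, as are the function name and the dictionary inputs.

```python
def mean_normalized_score(best_hits, self_scores):
    """Mean of best-hit bit scores normalised by self-hit bit scores.

    best_hits maps each query protein to its best bit score against the
    other genome; proteins with no hit are treated as scoring 0.
    self_scores maps each query protein to its bit score against itself.
    """
    ratios = [
        best_hits.get(protein, 0.0) / self_score
        for protein, self_score in self_scores.items()
        if self_score > 0
    ]
    return sum(ratios) / len(ratios) if ratios else 0.0
```

A value near 1 indicates two genomes whose proteins are nearly identical, and a value near 0 indicates little detectable similarity; a distance for tree building could be derived from such a value, for example as one minus the score.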

Long abstract


G-24  Evidence that rice, and other cereals, are ancient aneuploids
Klaas Vandepoele1, Cedric Simillion2, Stephane Rombauts and Yves Van de Peer
1klpoe@gengenp.rug.ac.be, University of Gent, VIB dep. Plant Systems Biology; 2cesim@gengenp.rug.ac.be, University of Gent, VIB dep. Plant Systems Biology
Correspondence address: strom@gengenp.rug.ac.be

By analyzing the genome of Oryza sativa (rice), we show that a substantial fraction (15%) of all rice genes is found in duplicated segments. However, detailed analysis shows that rice is not an ancient polyploid as previously suggested, but an ancient aneuploid that has experienced partial genome duplication, approximately 70 million years ago.

Long abstract


G-25  Phylogeny and Evolution of Human Cathepsins
Veronika Stoka1, Vito Turk2
1veronika.stoka@ijs.si, J. Stefan Institute; 2Vito.Turk@ijs.si, J. Stefan Institute
Correspondence address: veronika.stoka@ijs.si

Cathepsins are proteolytic enzymes of lysosomal origin. According to their catalytic mechanism, they can be classified as cysteine, serine or aspartic proteases. In this work we investigated the phylogeny and evolution of human cathepsins.

Long abstract


G-26  EVA: Examining foodborne virus evolution using the Enteric Virus Analysis tool
Graham Etherington1, Ian Roberts2, Jo Dicks, Vic Rayward-Smith
1Graham.Etherington@bbsrc.ac.uk, John Innes Centre; 2Ian.Roberts@bbsrc.ac.uk, Institute of Food Research
Correspondence address: Graham.Etherington@bbsrc.ac.uk

EVA (Enteric Virus Analysis) is an analysis tool that brings together existing and novel computational techniques within a single integrated software environment. Here we describe the design of EVA and its use in examining the evolution of an emerging group of foodborne viruses.

Long abstract


G-27  LUMBERJACK: A Heuristic Tool For Sequence Alignment Exploration And Phylogenetic Inference
Carolyn J. Lawrence1, Christian M. Zmasek2, R. Kelly Dawe, Russell L. Malmberg
1carolyn@plantbio.uga.edu, Department of Plant Biology, University of Georgia Athens; 2czmasek@gnf.org, Genomics Institute of the Novartis Research Foundation
Correspondence address: czmasek@gnf.org

LumberJack is a phylogenetic tool intended to facilitate sampling treespace to find likely tree topologies quickly and to map phylogenetic signal onto regions of an alignment in a heuristic manner.

Long abstract


G-28  Identification and genetic relationships within Arisaema determined using microsatellite markers
Byong Seob Ko1, Mi Young Lee2, Seong Jin Lee, Jeong Eun Kim, Seong Mi Hong, Young seung Ju
1bsko@kiom.re.kr, Korea Institute of Oriental Medicine; 2mylee2020@hanmail.net, Korea Institute of Oriental Medicine
Correspondence address: mylee2020@hanmail.net

Microsatellite technology rapidly reveals highly polymorphic fingerprints and thus determines genetic markers. In combination with oligonucleotides of arbitrary sequence, 5' anchored oligonucleotides based on simple sequence repeats were used in PCR to produce Arisaema DNA fingerprints.

Long abstract



Predictive Methods
H-1  Classification of Virus Risk Types Using Kernel-Based Classifiers
Je-Gun Joung1, Sirk June Augh2, Byoung-Tak Zhang
1jgjoung@bi.snu.ac.kr, Graduate Program in Bioinformatics, Seoul National University, Korea; 2sjaugh@cbit.snu.ac.kr, Center for Bioinformation Technology, Seoul National University, Korea
Correspondence address: jgjoung@bi.snu.ac.kr

Classification of virus risk types is important for understanding the mechanisms of infection. We propose a machine learning approach to classify HPV risk types. Our approach is based on the kernel method, which provides efficient computation. In our experiments, the string kernel-based classifier correctly predicted four unknown HPV types.

Long abstract


H-2  Studies of the transcriptional regulation of the genes coding for the novel IL28A,B and IL29 protein family: Illustration of an in silico approach applicable on a genomic scale
William Krivan1, Brian Fox2, Emily Cooper, Teresa Gilbert, Frank Grant, Betty
Correspondence address: krivan@zgi.com

We use the novel IL28A,B and IL29 protein family to illustrate an approach to the computational identification and characterization of putative transcriptional regulatory regions. Our technique consists of a combination of phylogenetic footprinting and detection of statistically significant clusters of binding sites and can be applied on a genomic scale.

Long abstract


H-3  Prediction of Protein Function from Primary Structure
Paul J. Tan1, Vladimir Brusic2, Asif M. Khan, Judice L.Y. Koh, Seng-Hong Seah
1tjtan@i2r.a-star.edu.sg, I2R; 2vladimir@i2r.a-star.edu.sg, I2R
Correspondence address: tjtan@i2r.a-star.edu.sg

An approach was developed for predicting the presence of a specific functional effect for active peptides. It involved multiple steps: a) collection of protein sequences from multiple sources, b) data cleaning and functional annotation, c) definition of basic structure-function unit groups, and d) prediction of protein function by an intelligent agent.


Long abstract


H-4  HMM Frameworks for Nuclear Receptor Binding Sites
Albin Sandelin1, Wyeth Wasserman2
1albin.sandelin@cgb.ki.se, Karolinska Institutet; 2wyeth@cmmt.ubc.ca, University of British Columbia
Correspondence address: albin.sandelin@cgb.ki.se

Nuclear Receptors (NR) control diverse programs of gene expression. These transcription factors bind in homo- and heterodimeric forms to complex target sequences. Due to variable spacing and orientation of half-sites, standard profile models are inadequate. We construct an HMM framework for the prediction and classification of NR binding sites.


Long abstract


H-6  The model representation of the mode of action of combination therapy of chloroquine, puritine and ascorbic acid
Onimisi Hassan Bello1
1hassanbello2001@yahoo.com, bima tutorial outfit
Correspondence address: hassanbello2001@yahoo.com

The action of the combined therapy of chloroquine, puritine and ascorbic acid has baffled biochemists for some time. It is believed to potentiate its action through perturbation of the lipid bilayer surrounding the cells, but further studies to reveal the genetic conditions have not yielded many results.

Long abstract


H-7  Species-Specific Protein Sequence and Fold Optimizations
Michel Dumontier1, Katerina Michalickova2, Christopher W.V. Hogue
1micheld@mshri.on.ca, Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada M5S 1A8, Samuel Lunenfeld Research Institute, Mount Sinai Hospital, 600 University Ave.,Toronto, Ontario, Canada M5G 1X5; 2katerina@mshri.on.ca, Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada M5S 1A8, Samuel Lunenfeld Research Institute, Mount Sinai Hospital, 600 University Ave.,Toronto, Ontario, Canada M5G 1X5
Correspondence address: micheld@mshri.on.ca

The ability of each and every organism to adapt to its particular environmental niche is of fundamental importance to its survival and proliferation. We have identified species-specific protein sequence and fold optimizations, which we exploited to generate predictive scoring functions. These scoring functions will be used in future species-specific protein identification and optimization experiments.

Long abstract


H-8  An evolving approach to finding Schemas for protein secondary structure prediction
Huang,Hsiang Chi1
1illusion@iii.org.tw, Institute for Information Industry
Correspondence address: illusion@iii.org.tw

A genetic algorithm has been applied to predict building schemas of protein secondary structure. Although the average Q3 obtained in this work is not the highest reported, some fundamental and useful building schemas of protein secondary structure have been found.

Long abstract


H-9  Inhibitors of Glycogen Synthase Kinase-3beta and Cyclin-Dependent Kinases Modelled by 3D-QSAR Using a Novel Alignment Method Based on Electrostatic Potentials.
Mahindra Makhija 1, Erik Helmerhorst2
1M.Makhija@exchange.curtin.edu.au, Western Australian Biomedical Research Institute, School of Biomedical Sciences, Curtin University of Technology, Bentley, WA 6102, Australia; 2E.Helmerhorst@curtin.edu.au, Western Australian Biomedical Research Institute, School of Biomedical Sciences, Curtin University of Technology, Bentley, WA 6102, Australia
Correspondence address: M.Makhija@exchange.curtin.edu.au

The ability of paullones and aloisines to inhibit glycogen synthase kinase-3β and cyclin-dependent kinases was well predicted by 3D-QSAR using CoMSIA in conjunction with a novel alignment method based on electrostatic potentials. The predictive value of this approach may lead to the development of better therapeutics for Alzheimer’s disease.

Long abstract


H-10  REMmatch program: finding potential hormone responsive elements
Nikolai Aksenov1, Alex Lyakhovich2
1nikolay.aksenov@plantphys.umu.se, Umea University; 2alexlyak@umich.edu, University of Michigan
Correspondence address: nikolay.aksenov@plantphys.umu.se

The REMmatch program is designed for preliminary screening of any sequence database for potential hormone responsive elements (HREs) and is comparable with other known programs (TESS, Match, etc.). REMmatch is available at http://www.math.wisc.edu/~karp/REMmatch.exe

Long abstract


H-11  Prediction of New Regulatory Properties For Proteins Sharing Different Functional Motifs
Alex Lyakhovich1, Anatoly Karp2
1alexlyak@umich.edu, University of Michigan; 2, University of Wisconsin-Madison
Correspondence address: alexlyak@umich.edu

We suggest a new algorithm that allows prediction of additional regulatory functions for proteins containing different structural motifs. This algorithm was successfully applied to a set of proteins containing ubiquitin motifs, for which we could also show regulation by certain protein kinases.

Long abstract


H-12  Predicting Synthetic Lethality
Sharyl L. Wong1, Lan O. Zhang2, Amy H. Tong, Debra S.
Correspondence address: sharyl_wong@student.hms.harvard.edu

We successfully predicted synthetic lethal gene pairs in Saccharomyces cerevisiae. Using probabilistic decision trees, we integrated multiple data types including correlated mRNA expression, physical interaction, protein function, and sequence homology. Our predictions may help identify redundant genes and pathways and may better our understanding of genetic robustness.

Long abstract


H-13  PSORT-B: A Web-based Tool for Bacterial Subcellular Localization Prediction
Jennifer L. Gardy1, Cory A. Spencer2, Fiona S.L. Brinkman
1jlgardy@sfu.ca, Dept. of Molecular Biology and Biochemistry, Simon Fraser University; 2cspencer@sprocket.org, Dept. of Molecular Biology and Biochemistry, Simon Fraser University
Correspondence address: jlgardy@sfu.ca

We present PSORT-B (http://www.psort.org) - a subcellular localization predictor with a measured accuracy of 97% for Gram-negative bacteria. Issues including the handling of proteins resident at multiple localizations, the importance of the “unknown” result and the development of a Gram-positive version are discussed. Selected whole genome analysis is also presented.

Long abstract


H-14  Low-budgetary scheme for differentiation and DNA quantity investigation in blood lymphocytes of patients with chronic tonsillitis
Boris V.Shilov1, Dmitry A.Dolgun2
1bvshilov@hotbox.ru, SSMU; 2, SSMU
Correspondence address: bvshilov@hotbox.ru

The cell type to which each nucleus belongs was determined in order to estimate the number of atypical lymphocytes in patients with chronic tonsillitis and to analyze the life cycle of those cells, based on a classification of transformed cells stained according to Romanowsky-Giemsa. The image segmentation process was adapted to the conditions of low-budget science.

Long abstract


H-16  New features for microRNA gene finding
Uwe Ohler1, Chris Burge2, David Bartel
1ohler@mit.edu, MIT; 2cburge@mit.edu, MIT
Correspondence address: ohler@mit.edu

MicroRNAs are a class of tiny RNA genes excised from precursor hairpin structures. We identified a highly specific conserved motif upstream of precursors that might be involved in miRNA transcription, and describe how much this motif, plus features of general upstream and downstream conservation, aid in miRNA gene finding.

Long abstract


H-17  Long time scale simulations of Molecular Systems
Benjamin Gladwin1, Dr Thomas Huber2
1gladwin@maths.uq.edu.au, University of Queensland Department of Mathematics.; 2huber@maths.uq.edu.au, University of Queensland Department of Mathematics.
Correspondence address: gladwin@maths.uq.edu.au

Modelling large bio-molecules is still primarily limited to the simulation of short timeframes, which in many cases are not biologically significant. The goal of this project is to use the optimisation of Hamiltonian paths to enable calculation of the behaviour of molecular systems over large time frames.

Long abstract


H-18  Identification of PKC Phosphorylation Sites on AC7 using a Directed Bioinformatics Approach.
Eric J. Nelson1, John VanHoven2, Vlad Verkhusha, Tonny deBeer, and Boris Tabakoff.
1eric.nelson@uchsc.edu, Univ. of Colorado; 2john.vanhoven@uchsc.edu, Univ. of Colorado
Correspondence address: eric.nelson@uchsc.edu

A bioinformatics approach is presented that can, in some instances, bypass the traditional phosphopeptide mapping of a radiolabeled target protein to identify PKC phosphorylation sites. This directed bioinformatics approach utilizes comparative sequence analysis, molecular modeling, and machine learning techniques to assist in the discovery of PKC phosphorylation sites.

Long abstract


H-19  Evaluating the Predictability of RNA Pseudoknots
J. Reeder1, R. Giegerich2
1robert@techfak.uni-bielefeld.de, Bielefeld University; 2jreeder@techfak.uni-bielefeld.de, Bielefeld University
Correspondence address: robert@techfak.uni-bielefeld.de

We define a new class of pseudoknots and present algorithms for thermodynamic folding of RNA secondary structures including such pseudoknots. Their time and space complexities are O(n^4) and O(n^2), respectively. We also compute the best structure guaranteed to contain a pseudoknot, and the most tightly knotted substructure. An extensive evaluation of Pseudobase is performed.

Long abstract


H-20  A sequence-independent strategy for the prediction of prokaryotic promoters
Pierre-Etienne Jacques1, Sebastien Rodrigue2, Jocelyn Beaucher, Jean-François Jacques, Luc Gaudreau, Jean Goulet and Ryszard Brzezinski
1pierre-etienne.jacques@hermes.usherb.ca, Universite de Sherbrooke; 2, Universite de Sherbrooke
Correspondence address: pierre-etienne.jacques@hermes.usherb.ca

Our strategy is based on the biological fact that promoters are localized in the upstream regulatory regions of genes. The likelihood that a particular sequence is a bona fide promoter can be evaluated from its mismatch distribution among the various areas of the genome.

Long abstract


H-21  NetOGlyc 3.0: Prediction of mucin type O-glycosylation sites from sequence and sequence-derived features.
Karin Julenius1, Ramneek Gupta, Kristoffer Rapacki, Lars Juhl Jensen, Søren Brunak
1kj@cbs.dtu.dk, Center for Biological Sequence Analysis, BioCentrum-DTU
Correspondence address: kj@cbs.dtu.dk

NetOGlyc 3.0 is a predictor of mucin type O-glycosylation sites, predicting from the protein sequence alone. NetOGlyc 3.0 shows much better generalization behaviour (the ability of the network to correctly predict for completely new examples) than its predecessor and is available at www.cbs.dtu.dk/services/NetOGlyc/

Long abstract


H-22  Genomics of Vertebrate Splicing Regulatory Elements
Gene Yeo1, Shawn Hoon2, Chris Burge
1geneyeo@mit.edu, MIT; 2shawnh@fugu-sg.org, IMCB, Singapore
Correspondence address: geneyeo@mit.edu

We find differences in the distribution and conservation of vertebrate splicing regulatory elements and relevant trans-factors given the availability of large-scale genomic data for Homo sapiens, Mus musculus and Fugu rubripes.

Long abstract


H-23  Support Vector Machine Approach for Cancer Detection using Amplified Fragment Length Polymorphism (AFLP) Screening Method
Waiming KONG1, Lawrence THAM2, Kee Yew Wong, Patrick Tan, Keng Wah CHOO
1KONG_WAI_MING@nyp.gov.sg, Nanyang Polytechnic; 2wm_tham@hotmail.com, Nanyang Polytechnic
Correspondence address: kongwm@hotmail.com

We investigated the novel use of Amplified Fragment Length Polymorphism screening in the diagnosis and classification of cancers using a set of 58 gastric tumor and 16 normal genomic DNA samples. The results show that SVM can be used to differentiate cancer from non-cancer tissues with high accuracy.

Long abstract


H-25  An automated protocol for membrane protein prediction and annotation
Melissa J. Davis1, Zheng Yuan, Shane Fashang Zhang and Rohan D. Teasdale
1m.davis@imb.uq.edu.au, Institute of Molecular Biosciences
Correspondence address: m.davis@imb.uq.edu.au

In order to annotate the membrane organization of whole-proteome datasets, we have developed a consensus annotation protocol automated for high-throughput analysis. This protocol predicts the presence of transmembrane domains, GPI modifications and signal peptides. These features are combined to generate a prediction of membrane organization.

Long abstract


H-26  A Mathematical Model for Protein Folding
Yi Fang1, Warren Kaplan2
1yi@maths.anu.edu.au, CBiS, ANU; 2w.kaplan@garvan.org.au, Garvan Institute
Correspondence address: yi@maths.anu.edu.au

We mimic the major geometric features of the native structures of globular proteins: compactness, hydrophobic core, and smaller surface area. We hypothesize that the native structure of a globular protein should minimize all three of the above geometric features simultaneously and coherently among all conformations satisfying a relaxed steric condition.


Long abstract


H-27  Reparametrizing loop entropy weights: effect on DNA melting curves
Ralf Blossey1, Enrico Carlon2
1blossey@bioinf.uni-sb.de, IRI; 2carlon@lusi.uni-sb.de, IRI
Correspondence address: blossey@bioinf.uni-sb.de

We report an analysis of melting curves for genomic DNA. Our in-house software employs novel estimates for the weights of loop entropy factors. As test cases, we studied D. discoideum and synthetic sequences inserted in a linearized plasmid to compare with experiment. We find that the cooperativity parameter may be one order of magnitude larger than its consensus value.

Long abstract


H-28  Computational Analysis of Homeodomain Protein Interaction Interfaces
Christopher Warren1, Mary Brezinski, Aseem Ansari
1clwarren2@wisc.edu, University of Wisconsin - Madison
Correspondence address: clwarren2@wisc.edu

Ubx and Exd are homeodomain transcription factors in Drosophila that cooperatively bind DNA to regulate cell differentiation. Using FADE, we discovered an interaction between these proteins that is nearly absent in the human homologs. Through chemical analysis we find that this interaction may increase binding in fly, but we do not expect this in human.

Long abstract


H-29  Identifying Bacterial Outer Membrane Proteins using Frequent Subsequences - A Data Mining Approach
Rong She1, Fei Chen2, Ke Wang, Martin Ester, Jennifer L. Gardy, Fiona S.L. Brinkman
1rshe@cs.sfu.ca, School of Computing Science, Simon Fraser University; 2fchena@cs.sfu.ca, School of Computing Science, Simon Fraser University
Correspondence address: jlgardy@sfu.ca

Outer membrane proteins (OMPs) of bacteria are medically important as drug targets. We developed two OMP predictors based on frequent subsequences studied in data mining. Both significantly outperformed the state-of-the-art method and one also produced explicit patterns of OMPs that can be used for further biological analysis.

Long abstract


H-30  A New Hybrid Haplotype Inference Method based-on Maximum Likelihood Estimation
Ho-Youl Jung1, Gil-Mi Ryu2, Jee-Yeon Heo, Ju-Young Lee, Hyo-Mi Kim, Jong-Keuk Lee, Chan Park, Bermseok Oh, and Kuchan Kimm
1hyjung@ngri.re.kr, National Genome Research Institute; 2gmryu@ngri.re.kr, National Genome Research Institute
Correspondence address: hyjung@ngri.re.kr

This article presents a hybrid method that can identify the individual's haplotype from the given genotypes. Our method combines statistical and computational approaches in order to increase the accuracy. The individuals' haplotypes are resolved by considering the MLE (maximum likelihood estimation) in the process of computing the frequencies of the common haplotypes.

Long abstract


H-31  Prediction of snoRNAs in Human Genome
Sagara Jun-Ichi1, Asai Kiyoshi2, Nakamura Shugo, Kenmochi Naoya
1jun@ni.aist.go.jp, CBRC; 2, CBRC
Correspondence address: jun@ni.aist.go.jp

We predict snoRNAs in the human genome using several methods for sequence analysis. We also develop a Predicted Human Intron database produced from exons predicted by Gene Decoder, a gene finding technology based on HMMs. We show the results of the snoRNA prediction and the databases of human introns.

Long abstract


H-32  PIVS: Protein-protein interaction inference and visualization system using sequence-based homology search with DIP and BIND
Ki-Bong Kim1, Mi-Kyung Lee2, Seo Hwajung
1kbkim@bioinfo.smallsoft.co.kr, SmallSoft Co., Ltd.; 2mklee@bioinfo.smallsoft.co.kr, SmallSoft Co., Ltd.
Correspondence address: kbkim@bioinfo.smallsoft.co.kr

We developed PIVS, a system for predicting the function and interactions of an unknown protein sequence and for visualizing its protein-protein interaction map. In addition, it offers integrated genomic and motif/domain-related information concerning the unknown input protein sequence.

Long abstract


H-33  Correlated Feature Extraction for Classification of Microarray and Mass Spectroscopy Data
Christopher Bowman1, Richard Baumgartner2, Ray Somorjai
1Christopher.Bowman@nrc-cnrc.gc.ca, Institute for Biodiagnostics; 2Richard.Baumgartner@nrc-cnrc.gc.ca, Institute for Biodiagnostics
Correspondence address: Christopher.Bowman@nrc-cnrc.gc.ca

We present a novel correlation-based technique for unsupervised feature extraction in large datasets. The algorithm selects features based on their redundancy and, unlike PCA, preserves the spatial information in the data, allowing one to easily interpret the extracted features. We demonstrate that classification accuracy on the reduced feature data is comparable to that on the original data.

Long abstract


H-34  Approaches for Predicting Protein-Protein Interaction Residues from Amino Acid Sequences
Changhui Yan1, Vasant Honavar2, Drena Dobbs
1chhyan@iastate.edu, Iowa State University, IA, USA; 2honavar@cs.iastate.edu, Iowa State University, IA, USA
Correspondence address: chhyan@iastate.edu

We have used support vector machines and Naive Bayes methods to classify protein surface residues into interface residues and non-interface residues based on the sequence neighbors of target residues. The results showed that both methods are able to successfully discover and use sequence neighbor features predictive of functional properties to identify interface residues.

Long abstract


H-35  Determination of sub-cellular localization of membrane proteins
Kevin C Miranda1, Rajith Aturaliya2, Melissa Davis, Zheng Yuan, Cameron Flegg and Rohan Teasdale
1k.miranda@imb.uq.edu.au, University of Queensland; 2r.aturaliya@imb.uq.edu.au, University of Queensland
Correspondence address: k.miranda@imb.uq.edu.au

The RIKEN Representative Protein Set was subdivided into groups based on signal peptide and transmembrane domain combination. Functional analysis using InterPro and SCOP was performed on three sub-groups: type I and II transmembrane and secreted proteins. The sub-cellular localization of type II proteins was computationally predicted and tested in vivo.

Long abstract


H-36  A Statistical Model of Protein Sequences in Interaction Networks and Its Solution via Gibbs Sampling
David J. Reiss1, Benno Schwikowski2, Andrew F. Siegel, Stanley Fields
1dreiss@systemsbiology.org, Institute for Systems Biology; 2benno@systemsbiology.org, Institute for Systems Biology
Correspondence address: dreiss@systemsbiology.org

We describe a novel statistical model of protein sequences and interaction networks that utilizes discriminative and informed priors, and apply it via a Gibbs sampling algorithm to the experimental SH3 domain-peptide interaction network of Tong et al (2001). Our results reveal that such interaction networks can, to a large degree, be modelled by this technique.

Long abstract


H-37  Ligand specificity of proteases and Kinases: an application to IC50 prediction on a large scale
Shandar Ahmad1, Koji Kitajima2, Akinori Sarai
1shandar@bse.kyutech.ac.jp, Department of Biochemical Engineering and Science, Kyushu Institute of Technology, Japan; 2,
Correspondence address: sarai@bse.kyutech.ac.jp

We have attempted neural-network-based predictions of the inhibition coefficient (IC50) from the SMILES of ligands for kinases and proteases in the protein-ligand interaction database, ProLINT. Our method is useful for large-scale filtering of ligands in drug design. We have also attempted to develop ligand fragment signatures for proteins in ProLINT.

Long abstract


H-38  EPP: Eukaryotic Promoter Prediction system using an efficient training approach
Sang-Soo Yeo1, Sung-Kwon Kim2, Jung-Won Rhee, Kyoung-Rak Na
1ssyeo@alg.cse.cau.ac.kr, Chung-Ang University, Seoul, Republic of Korea; 2skkim@cau.ac.kr, Chung-Ang University, Seoul, Republic of Korea
Correspondence address: ssyeo@alg.cse.cau.ac.kr

EPP is a eukaryotic promoter prediction system. In EPP, the training set is first divided into many clusters, and each cluster is then trained separately to produce a decision model. This approach enhances the sensitivity and specificity of EPP. EPP is available at http://epp.cau.ac.kr


Long abstract


H-39  Multiplexed SBE primer design for highly polymorphic loci.
Greg Tyrelle1, Daniel Di Giusto2, Garry C. King
1greg@kinglab.unsw.edu.au, UNSW; 2daniel@kinglab.unsw.edu.au, UNSW
Correspondence address: daniel@kinglab.unsw.edu.au

Single base extension (SBE) is the most widely used SNP genotyping method. When multiplex SBE (MSBE) is applied to highly polymorphic regions, hybridisation to variable DNA may occur. We have developed an MSBE primer design algorithm that increases the coverage of MSBE for these regions by up to 20%.

Long abstract


H-40  Genefiler: High throughput genetic analysis. Raw data to analysed results with one click
Paul Matthews1, I Findlay2, D Mouradov, BK Mulcahy
1paul@agrf.org.au, AGRF; 2Ian@agrf.org.au, AGRF
Correspondence address: paul@agrf.org.au

Current genotyping analysis packages provide basic genetic marker interpretation. However, many applications often require further manual specialised analysis, which is labour intensive and expensive. We developed Genefiler, software providing massive genotyping capability, giving diagnostic results from raw data, comprehensive project management, intuitive GUI and a flexible modular format.

Long abstract


H-41  QSAR Analysis of Transcription Factors
Akinori Sarai1, Samuel Selvaraj2, Michael M. Gromiha, Hidetoshi Kono
1sarai@bse.kyutech.ac.jp, KIT; 2sel_emi@yahoo.co.uk, Bharathidasan University
Correspondence address: sarai@bse.kyutech.ac.jp

We have analyzed the relationship between structure and function (activity) of transcription factors, based on two approaches: a knowledge-based approach, utilizing structural data of protein-DNA complexes, and computer simulations. We have examined the roles of structural deformation of DNA and cooperativity in target recognition, and predicted targets in the yeast genome.

Long abstract


H-42  Improved Approach to Protein Identification Using Peptide Mass Fingerprint
Won-A Joo1, Kap-Soon Noh2, Chan-Wha Kimm
1wajoo0824@hanmail.net, Graduate School of Life Sciences and Biotechnology; 2, Graduate School of Life Sciences and Biotechnology
Correspondence address: cwkim@korea.ac.kr

Peptide mass fingerprinting (PMF) has been a useful method for rapid and high-throughput protein identification. In our study, we compared software frequently used to identify the proteins of Homo sapiens and Halobacterium salinarum. These attempts could provide a more effective algorithm for protein identification in each species using PMF.

Long abstract


H-43  Optimizing the location and the number of the maximal scoring subsequences with constrained segment lengths with MaxSubSeq
Piero Fariselli1, Pier Luigi Martelli2, Ivan Rossi and Rita Casadio
1piero@lipid.biocomp.unibo.it, Department of Biology CIRB, University of Bologna; 2 Department of Biology CIRB, University of Bologna
Correspondence address: piero@lipid.biocomp.unibo.it

We describe a general dynamic programming-like algorithm (MaxSubSeq) specifically designed to optimise the number and length of segments with constrained length in a protein sequence. Our algorithm is independent of the underlying predictive method and is available through the web interface at http://gpcr.biocomp.unibo.it

Long abstract
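
The kind of optimisation described above can be illustrated with a small dynamic program. The sketch below is not the MaxSubSeq implementation; it is a generic formulation under stated assumptions: per-residue scores come from some upstream predictor, segments must not overlap (they may touch), each segment length lies in [min_len, max_len], and at most max_segments segments are chosen. All names are hypothetical.

```python
def max_scoring_segments(scores, min_len, max_len, max_segments):
    """Choose up to max_segments non-overlapping segments, each of length
    min_len..max_len, that maximise the summed per-residue score."""
    n = len(scores)
    prefix = [0.0]
    for s in scores:
        prefix.append(prefix[-1] + s)

    # best[k][i]: best total using at most k segments within scores[:i].
    best = [[0.0] * (n + 1) for _ in range(max_segments + 1)]
    # choice[k][i]: length of a segment ending at i, or None if none ends there.
    choice = [[None] * (n + 1) for _ in range(max_segments + 1)]

    for k in range(1, max_segments + 1):
        for i in range(1, n + 1):
            best[k][i] = best[k][i - 1]  # option: no segment ends at i
            for length in range(min_len, min(max_len, i) + 1):
                cand = best[k - 1][i - length] + prefix[i] - prefix[i - length]
                if cand > best[k][i]:
                    best[k][i] = cand
                    choice[k][i] = length

    # Trace back the chosen segments as half-open intervals (start, end).
    segments = []
    k, i = max_segments, n
    while k > 0 and i > 0:
        length = choice[k][i]
        if length is None:
            i -= 1
        else:
            segments.append((i - length, i))
            i -= length
            k -= 1
    return best[max_segments][n], segments[::-1]


# Hypothetical per-residue scores (e.g. a propensity minus a threshold).
total, segs = max_scoring_segments([-1, 2, 3, -5, 4, 4, -1], 2, 3, 2)
print(total, segs)
```

The running time of this sketch is O(K * n * (max_len - min_len + 1)) for K segments and n residues, which is modest for protein-length inputs.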


H-44  Finding transcription regulatory elements, using transcription factor data base and genome comparison.
Hiroshi Mizushima1, Kozo Kawahara2, Mitsuru Takatsu, Teruhiko Yoshida
1hmizushi@ncc.go.jp, National Cancer Center Research Institute; 2kkawahara@w-fusion.co.jp, World Fusion Co.
Correspondence address: hmizushi@ncc.go.jp

We have developed a Web-based system for searching for transcriptional regulatory elements in human genome sequences using the Transcription Factor Database (TFDB, which we have been maintaining) and genome comparison. Combined with microarray data, the system reports regulatory elements shared by genes with similar expression.

Long abstract


H-45  PathMiner: de novo Metabolic Pathway Synthesis
McShan, D.1, Upadhyaya, M.2, Imran Shah
1Daniel.McShan@uchsc.edu, UCHSC; 2Minesh.Upadhayaya@uchsc.edu, UCHSC
Correspondence address: Daniel.McShan@uchsc.edu

This poster presents PathMiner, a computational framework for exploring metabolic pathways. To make inferences about pathways we abstract metabolism as a state-space in which compounds are points and biotransformations are state-transitions. In this poster, we discuss applications of PathMiner to two quite different biological problems.

Long abstract


H-46  Polymorphism Prediction
Angela Baldo1, J. Labate2
1abaldo@pgru.ars.usda.gov, USDA ARS PGRU; 2jl265@cornell.edu, USDA ARS PGRU
Correspondence address: abaldo@pgru.ars.usda.gov

Single nucleotide polymorphism (SNP) distribution has been demonstrated to be nonrandom in the genomes of animals and plants. While genetic variation is traditionally expected in noncoding regions, additional genetic features have been correlated with SNPs. We investigate whether this information might be used to predict regions of polymorphism.

Long abstract


H-47  On the Correspondence between Scoring Matrices and Binding Site Sequence Distributions
Jan E. Gewehr1, Jan T. Kim2, Thomas Martinetz
1gewehr@bio.informatik.uni-muenchen.de, Institute for Computer Science, Ludwig-Maximilians-University Munich, Theresienstr. 39, D-80333 Munich, Germany; 2kim@inb.uni-luebeck.de, Institute for Neuro- and Bioinformatics, University of Luebeck, Seelandstr. 1a, D-23569, Germany
Correspondence address: gewehr@bio.informatik.uni-muenchen.de

Using maximum likelihood estimation, we analyze the correspondence between popular scoring matrix classifiers for binding site prediction and specific probability distributions of binding site sequences. For unknown distributions, the binding matrix is a good choice since it achieves maximal specificity under the constraint that all known sequences are classified correctly.

Long abstract
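
As background for the scoring-matrix classifiers discussed above, here is a minimal sketch of one popular variant: a log-odds position weight matrix whose per-position base frequencies are estimated from aligned binding sites. The pseudocount, the uniform background, and the example sites are assumptions chosen for illustration; this is not the poster's binding matrix or its analysis.

```python
import math

ALPHABET = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=None):
    """Build a log2-odds matrix from equal-length aligned binding sites."""
    background = background or {b: 0.25 for b in ALPHABET}
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    pwm = []
    for pos in range(length):
        # Start from the pseudocount so unseen bases keep a finite score.
        counts = {b: pseudocount for b in ALPHABET}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2(counts[b] / total / background[b])
                    for b in ALPHABET})
    return pwm


def score(pwm, seq):
    """Sum the log-odds contributions of each base in seq."""
    return sum(column[base] for column, base in zip(pwm, seq))


# Hypothetical aligned sites resembling a -10 promoter element.
sites = ["TATAAT", "TATAAT", "TATGAT", "TACAAT"]
pwm = build_pwm(sites)
print(score(pwm, "TATAAT"), score(pwm, "GGGCCC"))
```

A sequence is then classified as a binding site when its score exceeds a chosen threshold; how that threshold trades sensitivity against specificity is the question the abstract addresses.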


H-48  In silico detection of CpG-island in plants
Stephane Rombauts1, Kobe Florquin2, Rouze Pierre and Yves Van de Peer
1strom@gengenp.rug.ac.be, University of Gent, dep. Plant Systems Biology; 2koflo@gengenp.rug.ac.be, University of Gent, dep. Plant Systems Biology
Correspondence address: strom@gengenp.rug.ac.be

CpG-islands are considered evolutionary remnants linked to the regulation of genes. Compared to animal systems, plants have a higher number of genes encoding DNA-methyltransferases that show a broader specificity. In this work we explored the compositional landscape surrounding promoters that would enable the identification of distinct CpG or CpNpG-islands.

Long abstract
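
To make the notion of a CpG-island scan concrete, the following sketch slides a window along a sequence and reports windows that pass a GC-content and observed/expected CpG threshold. The thresholds (200 bp window, GC content of at least 50%, observed/expected ratio of at least 0.6) follow the commonly cited criteria derived from vertebrate genomes and are assumptions here; they are not the parameters used in this work, and plant sequences may call for different values. The CpNpG counter is likewise a hypothetical helper.

```python
import re


def cpg_stats(window):
    """Return (GC fraction, observed/expected CpG ratio) for one window."""
    n = len(window)
    c = window.count("C")
    g = window.count("G")
    cpg = window.count("CG")
    gc_fraction = (c + g) / n
    # Expected CpG count under independence is c * g / n.
    obs_exp = cpg * n / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


def find_cpg_windows(seq, window=200, step=1, min_gc=0.5, min_oe=0.6):
    """List half-open (start, end) windows that pass both thresholds."""
    seq = seq.upper()
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        gc, oe = cpg_stats(seq[start:start + window])
        if gc >= min_gc and oe >= min_oe:
            hits.append((start, start + window))
    return hits


def count_cpnpg(seq):
    """Count (possibly overlapping) CpNpG trinucleotides."""
    return len(re.findall(r"(?=C[ACGT]G)", seq.upper()))


# Synthetic example: a CG-rich stretch followed by an AT-rich stretch.
example = "CG" * 100 + "AT" * 100
print(find_cpg_windows(example, window=200, step=50))
print(count_cpnpg("CCGG"))
```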


H-49  Predicting Chemical Carcinogenesis: Problem Representation Governs Model Performance
Douglas W. Bristol1
1bristol@niehs.nih.gov, NIH-NIEHS
Correspondence address: bristol@niehs.nih.gov

The predictive performance and comprehensibility of 54 chemical-carcinogenicity models, generated by three Predictive-Toxicology Challenges, was evaluated using ROC convex-hull analysis. Models from problem representations that used a mix of attributes, reflecting interactions between chemical and biological features, clearly outperformed those derived using only attributes of chemical structure or biological features.

Long abstract


H-50  Predicting accuracy of comparative gene finders using evolutionary models
Vladimir Pavlovic 1, Lingang Zhang 2, Charles Cantor and Simon Kasif
1vladimir@cs.rutgers.edu, Rutgers University; 2zlg@bu.edu, Boston University
Correspondence address: vladimir@cs.rutgers.edu

Comparative computational analysis can lead to improved identification of genes, while relying on a given pair of genomes, such as human and mouse. We propose a formal way to select an optimal pair of genomes by linking Markov models of molecular evolution to comparative HMMs and studying their prediction accuracy.

Long abstract


H-51  The Prion Paradox: Infection or Polymerization?
Jan C. Biro1
1Jan.biro@kbh.ki.se, Homulus Informatics
Correspondence address: jan.biro@kbh.ki.se

There is a weak but significant similarity between the prion protein (PrP) and some transcription factors and Zn-finger proteins. A molecular model of the Cu++-binding monomeric (normal, cytoplasmic) PrPC and the Cu++-stabilized polymeric (scrapie - pathogenic) PrPSc is presented and is called the CUPRION model.

Long abstract



Sequence Comparison
I-1  The distance function for computing the continuous distance of biopolymer sequences
G.H. Hakobyan1, T.V. Margaryan2
1gaghakob@ysu.am, YSU; 2
Correspondence address: gaghakob@ysu.am

In some applications of sequence comparison theories, the actual items to be compared are not successions of discrete elements, but "continuous" functions of a continuous argument. The present paper aims to construct a "continuous" distance function with the help of a given "distance" matrix D.

Long abstract


I-2  Species-Specific Substitution Matrices
Michel Dumontier1, Christopher W.V. Hogue2
1micheld@mshri.on.ca, Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada M5S 1A8, Samuel Lunenfeld Research Institute, Mount Sinai Hospital, 600 University Ave.,Toronto, Ontario, Canada M5G 1X5; 2hogue@mshri.on.ca, Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada M5S 1A8, Samuel Lunenfeld Research Institute, Mount Sinai Hospital, 600 University Ave.,Toronto, Ontario, Canada M5G 1X5
Correspondence address: micheld@mshri.on.ca

We derived and tested novel species-specific substitution matrices (SSSMs) for sequence alignment. Using CASA’s SCOP-based alignment test sets, our results show increased alignment accuracy in the 20-30% sequence identity range, but decreased alignment length, compared with popular sequence alignment programs such as PSI-BLAST (using BLOSUM).

Long abstract


I-3  Secondary structure interpretation of genetic sequence variation in Plasmodium falciparum cell surface antigens
Stanley Adoro1, Roseangela Nwuba, Chiaka Anumudu, Mark Nwagwu
1stanleyadoro@hotmail.com, University of Ibadan
Correspondence address: stanleyadoro@hotmail.com

We have analyzed genetic and amino acid residue variation of Plasmodium falciparum cell surface antigens (merozoite surface proteins 1 and 2; circumsporozoite protein; stevor and rifin) in the context of their known or predicted secondary structures. Locations of structural motifs suggest the presence of functional domains or antigenic epitopes.

Long abstract


I-4  IMPROVING SEQUENCE ASSEMBLIES USING HIGH-QUALITY OVERLAPS
Michael Roberts1, James Yorke2, Brian Hunt, Wayne Hayes, Aleksey Zimin, Cevat Ustun, Paul Havlak
1tri@ipst.umd.edu, University of Maryland; 2yorke@ipst.umd.edu, University of Maryland
Correspondence address: wayne@cs.toronto.edu

Finishing a genome costs as much as initial assembly. Since initial assembly recovers about 95% of the genome, gaining just a few extra percent initially can save tens of millions of dollars in finishing costs. We present computational techniques for improving "overlaps" that result in up to 5% additional sequence during initial assembly.

Long abstract


I-5  The Bielefeld University Bioinformatics Server
Alexander Sczyrba1, Jan Krueger2, Robert Giegerich
1asczyrba@techfak.uni-bielefeld.de, Bielefeld University, Germany; 2jkrueger@techfak.uni-bielefeld.de, Bielefeld University, Germany
Correspondence address: asczyrba@techfak.uni-bielefeld.de

The Bielefeld University Bioinformatics Server (BiBiServ), http://bibiserv.techfak.uni-bielefeld.de, supports Internet-based collaborative research and education in bioinformatics. Currently, 15 software tools and various educational media are available. These include tools from different areas such as Genome Comparison, Alignments, Primer Design, RNA Structures, and Evolutionary Relationships. In 2002, approximately 14,000 users per month used BiBiServ services, with a rising tendency.

Long abstract


I-6  Evolutionary significance of G1/S checkpoint among Eukaryotes
Keng Hwa Tan1, Pawan Dhar2
1eric@bii.a-star.edu.sg, BioInformatics Institute; 2pk@bii.a-star.edu.sg, BioInformatics Institute
Correspondence address: eric@bii.a-star.edu.sg

The G1/S checkpoint plays a pivotal role during the early stages of the cell cycle. Progression through the G1/S boundary is controlled by a series of regulators. To explore the roles of these regulators and identify targets of CDKs, bioinformatics analyses are being performed on DNA and protein sequences from different eukaryotic organisms.

Long abstract


I-7  Efficient algorithms for sequence comparison and overlap detection - Correcting errors in shotgun sequence reads
Martti T. Tammi1, Erik Arner2, Ellen Kindlund, Björn Andersson
1martti.tammi@cgb.ki.se, Center for Genomics and Bioinformatics, Karolinska Institutet; 2erik.arner@cgb.ki.se, Center for Genomics and Bioinformatics, Karolinska Institute
Correspondence address: martti.tammi@cgb.ki.se

We developed a rapid approximate pattern matching algorithm and a linear time algorithm for multiple alignment construction, with no previous pairwise matching of sequences required. These are implemented in a program for shotgun sequence error correction able to correct 99% of sequencing errors. This is significantly better than previous methods.

Long abstract


I-8  Parallel Implementation of Hmm-pfam on EARTH platform Using THREADED-C
Weirong Zhu1, Yanwei Niu2, Jizhu Lu, Guang R. Gao
1weirong@capsl.udel.edu, University of Delaware; 2niu@capsl.udel.edu, University of Delaware
Correspondence address: weirong@capsl.udel.edu

A parallel HMM-pfam is implemented on EARTH, an event-driven fine-grain multi-threaded program execution model. It demonstrated a significant performance improvement over another, PVM-based version. On a cluster of 128 dual-CPU nodes, the execution time of a representative testbench is reduced from 15.9 hours to 4.3 minutes.

Long abstract


I-9  Human and Mouse Genome Comparison Using Genome-Wide Unique Sequences
Ben-Yang Liao1, Yu-Jung Chang2, Jan-Ming Ho and Ming-Jing Hwang
1liaoby@gate.sinica.edu.tw, Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan; 2yjchang@iis.sinica.edu.tw, Institute of Information Science, Academia Sinica, Taipei, Taiwan
Correspondence address: yjchang@iis.sinica.edu.tw

We used coexistent genome-wide unique sequences in human and mouse genomes to recognize their homologous regions. The resulting syntenic map revealed more than 400 conserved segments covering more than 90% of both genomes. This alignment-free method is capable of comparing two mammalian genomes within hours on one personal computer.

Long abstract


I-10  Long-range correlation in protein sequences and its implication
Kazuhito Shida1, Makoto Ikeda2, Atsuo Kasuya
1shida@cir.tohoku.ac.jp, CIR Tohoku University; 2ikeda@imr.edu, CIR Tohoku University
Correspondence address: shida@cir.tohoku.ac.jp

Long-range correlations in the amino-acid sequences of natural proteins are extracted from GenBank sequences. Some bigrams with gaps of more than one letter turned out to be clearly over-represented. These data may improve the assessment of alignment quality, phylogeny analyses, and perhaps database searches.

Long abstract


I-11  Implementing the Smith-Waterman Algorithm on a Reconfigurable Computer
Gianpaolo Gioiosa1, David Kearney2
1GIOGY001@students.unisa.edu.au, University of South Australia; 2David.Kearney@unisa.edu.au, University of South Australia
Correspondence address: GIOGY001@students.unisa.edu.au

The Smith-Waterman algorithm is a dynamic-programming algorithm that finds the optimal local alignment between two biological sequences. The algorithm was implemented on a field-programmable gate array (FPGA) in order to investigate the advantages and disadvantages of using reconfigurable computers and associated tools for computationally intensive sequence analysis applications.

Long abstract
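
The dynamic-programming recurrence that poster I-11 maps onto hardware can be sketched in software as follows. This is only a minimal illustration, not the authors' FPGA design: the function name, the scoring values (match, mismatch, gap) and the linear gap penalty are assumptions chosen for the example.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between sequences a and b.

    The scoring values are illustrative assumptions, and gaps are scored
    with a simple linear penalty.
    """
    # prev[j] and curr[j] hold the best score of a local alignment that ends
    # at the current (or previous) character of a and at b[j - 1].
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # The 0 lets an alignment restart anywhere, which is what makes
            # the alignment local rather than global.
            curr[j] = max(0, diagonal, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

Each cell depends only on its left, upper and upper-left neighbours, so all cells on one anti-diagonal can be computed at the same time; this independence is what parallel implementations generally exploit.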


I-12  Multiple alignments of sequences and structures using T-Coffee
Orla OSullivan1, Desmond Higgins2, Cedric Notredame
1ojos@student.ucc.ie, University College Cork; 2 University College Cork
Correspondence address: ojos@student.ucc.ie

T-Coffee is a novel method for multiple sequence alignment that allows you to combine heterogeneous sources of data to produce very accurate alignments. In this poster we look at the effects of mixing sequence and structural information using two structural alignment programs, SAP and FUGUE, in combination with T-Coffee.

Long abstract


I-13  Correlation between antisense activity and RNA secondary structure
Li Liao1, Zhongwei Li2
1lliao@cis.udel.edu, University of Delaware; 2zli@fau.edu, Florida Atlantic University
Correspondence address: lliao@cis.udel.edu

The correlation between the activity of antisense oligonucleotides and local structural features of target RNAs is studied. Statistical analysis showed that high antisense activity is more likely to occur in regions of the target RNA that have hairpin or multi-branched loops with flanking stems.

Long abstract


I-14  Alexa: an improved EST and genomic sequence alignment tool
Miao Zhang1, Warren Gish2
1mzhang@sapiens.wustl.edu, Department of Genetics and Department of Biomedical Engineering, Washington University; 2gish@watson.wustl.edu, Department of Genetics, Washington University
Correspondence address: mzhang@sapiens.wustl.edu

To better align EST and genomic sequences, we developed a tool named Alexa, which incorporates a splice site model into the recursive dynamic programming equation. For reduced memory consumption and increased speed, Alexa can be guided by an input file produced by WU-BLASTN. Alexa is available at http://sapiens.wustl.edu/alexa.

Long abstract


I-15  ASAD-A Sequence Attribute Display tool
Keith Satterley1
1keith@wehi.edu.au, The Walter and Eliza Hall Institute of Medical Research
Correspondence address: keith@wehi.edu.au

ASAD is A Sequence Attribute Display tool. It is written as a set of Excel macros. It builds on the familiar Excel interface to allow for flexible and efficient display of attributes (hydrophobicity etc.) using similar colours and styles for one sequence or multiple aligned sequences. ASAD is available at ftp://ftp.wehi.edu.au/pub/biology/ASAD.

Long abstract


I-16  Detection of false positive results from PSI-BLAST
N. Faux1, M. Cameron2, M. Garcia de la Banda, J.C. Whisstock
1noel.faux@med.monash.edu.au, Department of Biochemistry and Molecular Biology. Monash University; 2mcam@csse.monash.edu.au, School of Computer Science and Software Engineering. Monash University
Correspondence address: noel.faux@med.monash.edu.au

PSI-BLAST is a sensitive and fast database search algorithm; however, false positives can taint the final results. We are currently investigating ways of detecting when the matrix generated by PSI-BLAST no longer describes the original query sequence or family.

Long abstract


I-17  Use of motif analysis programs to identify putative regulatory elements in the orthologous human promoters of co-regulated bovine genes
Amonida Zadissa1, John McEwan2, Chris Brown
1amonida@sanger.otago.ac.nz, Biochemistry Department, Otago University; 2john.mcewan@agresearch.co.nz, AgResearch
Correspondence address: amonida@sanger.otago.ac.nz

We aim to identify putative regulatory elements in the orthologous human promoters of co-regulated bovine genes. Promoters were extracted and motif prediction was performed with MEME. The MatInspector Professional software and BioPerl programs were used to assess the results. Novel putative elements were identified in the human promoters.

Long abstract


I-18  LAGAN2: Probabilistic Global Alignment of DNA Under Multiple Conservation Models
Chuong B. Do1, Michael Brudno2, Serafim Batzoglou
1chuong.do@stanford.edu, Stanford University; 2brudno@cs.stanford.edu, Stanford University
Correspondence address: chuong.do@stanford.edu

LAGAN2 is a probabilistic method for limited area global nucleotide alignment of distantly related species. The algorithm incorporates multiple conservation models for protein-coding, non-coding, and unconstrained alignments, approximate logarithmic gap penalties, and Hidden Markov Model based training of alignment parameters through expectation-maximization. Additional information is available at http://lagan.stanford.edu.

Long abstract


I-19  Frequency enumeration of DNA subsequences from large-scale sequences using linear codes
Yoichi Takenaka1, Hideo Matsuda2
1takenaka@ist.osaka-u.ac.jp, Department of Bioinformatic Engineering, Osaka Univ. Japan; 2matsuda@ist.osaka-u.ac.jp, Department of Bioinformatic Engineering, Osaka Univ. Japan
Correspondence address: takenaka@ist.osaka-u.ac.jp

Frequency enumeration of DNA subsequences is an important technique. The algorithm is simple, but it requires an enormous amount of memory. We propose an enumeration method that uses linear codes to reduce the memory requirement. With a binary (31,26) Hamming code, the method requires 1/60 of the usual memory space.

Long abstract
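
To make the memory problem described in I-19 concrete, the simple enumeration can be sketched as a dense table with one counter per possible subsequence. This shows only the baseline approach, not the authors' linear-code method; the names `count_kmers` and `kmer_index` and the 2-bit base encoding are assumptions made for the example.

```python
from array import array

# Assumed 2-bit encoding of the four bases.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def kmer_index(kmer):
    """Map an uppercase k-mer to its position in the dense count table."""
    index = 0
    for base in kmer:
        index = (index << 2) | CODE[base]
    return index


def count_kmers(seq, k):
    """Count every length-k subsequence of seq in a table of 4**k counters."""
    counts = array("L", [0]) * (4 ** k)  # memory grows as 4**k
    mask = 4 ** k - 1
    index = 0
    valid = 0  # consecutive unambiguous bases seen so far
    for base in seq.upper():
        code = CODE.get(base)
        if code is None:
            # An ambiguous base such as N interrupts every k-mer covering it.
            index = 0
            valid = 0
            continue
        index = ((index << 2) | code) & mask
        valid += 1
        if valid >= k:
            counts[index] += 1
    return counts
```

Because the table holds 4**k counters regardless of the input length, each extra base in the subsequence length multiplies the memory by four, which is the cost the abstract sets out to reduce.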


I-20  SeqFreq: A Statistical Repetitive Motif Discovery Tool
Roger Craig1, Li Liao2, Javier Garcia-Frias, Adam Marsh
1rcraig@eecis.udel.edu, University of Delaware; 2lliao@cis.udel.edu, University of Delaware
Correspondence address: lliao@cis.udel.edu

SeqFreq is a repetitive motif discovery tool for finding repeats in both intragenomic and intergenomic sequences. SeqFreq uses a numerical suffix tree method to enumerate repetitive n-mers. Currently, intragenomic n-mer repeats of multiple bacterial genomes and their statistical distributions have been analyzed.

Long abstract


I-21  DSC: Efficient Primer design algorithm with partial order graphs
Yu-Cheng Huang1, Ming-Hui Jin2, Cheng-Yan Kao
1r91021@csie.ntu.edu.tw, NTU. Computer Science and Information Engineering Department; 2jinmh@db.csie.ntu.edu.tw, Bioinformatics Research Center, National Taiwan University, Taiwan
Correspondence address: jinmh@db.csie.ntu.edu.tw

A novel method called DSC (Difference String Comparison) is proposed for speeding up the primer finding procedure. The DSC method represents DNA sequences as a partially ordered graph while preserving all of the sequence information.

Long abstract


I-22  Analysis of corona virus genome sequences
Jingchu Luo1
1luojc@pku.edu.cn, Centre of Bioinformatics, Peking University
Correspondence address: luojc@pku.edu.cn

ClustalW analysis of 27 corona virus genome sequences reveals that 9 SARS virus sequences are identical. Multiple sequence alignment was also performed on the different groups of corona virus genome sequences and coding products to find conserved and divergent regions. All analysis results are available at ftp://ftp.cbi.pku.edu.cn/pub/sars/analysis/.

Long abstract


I-23  BlastNP: A new sequence similarity searching and visualization method
Jan C. Biro1
1jan.biro@kbh.ki.se, Homulus Informatics
Correspondence address: jan.biro@kbh.ki.se

An alternative to TblastX, called blastNP, has been developed. Nucleic acids in database and query sequences were translated into overlapping protein-like sequences (overlappingly translated sequences, OTSs) before searching with blastP. Thus, each nucleic acid sequence is represented by a single “protein-like” sequence (instead of three reading frames).

Long abstract
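
One possible reading of the overlapping translation described in I-23 is a sliding window that moves one nucleotide at a time and emits one residue per window. The sketch below follows that reading, which is an assumption and not confirmed by the abstract; the function name and the use of "X" for windows containing ambiguous bases are also illustrative choices. The codon table is the standard genetic code.

```python
# Standard genetic code, with codons ordered by the bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def overlapping_translation(nucleotides):
    """Translate every overlapping triplet, advancing one base at a time."""
    dna = nucleotides.upper().replace("U", "T")
    # Triplets with ambiguous bases are written as X (an assumption).
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(len(dna) - 2)
    )
```

Under this reading a sequence of n nucleotides yields a single string of n - 2 residues in which all three reading frames are interleaved, so one protein-level search covers them together.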


I-24  A Shannon entropy-based filter improves the detection of high quality profile-profile alignments in remote-homologous searching.
Emidio Capriotti1, Ivan Rossi2, Piero Fariselli and Rita Casadio
1emidio@biocomp.unibo.it, CIRB Biocomputing Unit & BioDec srl; 2ivan@biocomp.unibo.it, CIRB Biocomputing Unit & BioDec srl
Correspondence address: ivan@biocomp.unibo.it

An analysis of the quality of the profile-profile alignments generated by a BASIC-like algorithm shows that Shannon entropy can be used to filter out most of the bad high-scoring alignments, enhancing the algorithm's reliability in the detection of remote homology. When entropy filtering is used, the best-scoring alignments are comparable to those obtained by the CE structural alignment algorithm.

Long abstract
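
The Shannon entropy on which the filter in I-24 is based can be computed per profile column as below. This is a generic sketch: the function names and the choice of averaging over columns are assumptions, and the abstract does not say how the authors turn entropy values into a filtering rule.

```python
import math


def column_entropy(probabilities):
    """Shannon entropy, in bits, of one profile column.

    probabilities holds the residue frequencies of the column and is
    assumed to sum to 1; zero entries contribute nothing.
    """
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def mean_profile_entropy(profile):
    """Average column entropy over a profile given as a list of columns."""
    return sum(column_entropy(column) for column in profile) / len(profile)
```

A fully conserved column has entropy 0, while a column with all 20 amino acids equally frequent reaches the maximum of log2(20), about 4.32 bits.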



Structural Biology
J-1  Analyzing Protein Structure-Function Correlations Using Statistical Geometry
Majid Masso1, Iosif Vaisman2
1mmasso@gmu.edu, George Mason University; 2ivaisman@gmu.edu, George Mason University
Correspondence address: mmasso@gmu.edu

An approach based on computational geometry is used to elucidate structural changes in an HIV-1 protease monomer caused by dimerization and inhibitor binding. A comprehensive mutational analysis of HIV-1 protease is also performed using this method and reveals a strong structure-function correlation.

Long abstract


J-2  http://globplot.embl.de - Exploring protein sequences for globularity and disorder
Rune Linding1, Rob Russell2, Victor Neduva, Toby Gibson
1linding@embl.de, EMBL; 2russell@embl.de, EMBL
Correspondence address: linding@embl.de

We present a new tool for the discovery of unstructured, or disordered, regions within proteins. GlobPlot (http://globplot.embl.de) is a web service that allows the user to plot the tendency within the query protein for order/globularity and disorder.

Long abstract


J-3  Protein unfolding governed by geometric and steric principles
Howard J Feldman1, Christopher WV Hogue2
1feldman@mshri.on.ca, Samuel Lunenfeld Research Institute, Mount Sinai Hospital; 2hogue@mshri.on.ca, Samuel Lunenfeld Research Institute, Mount Sinai Hospital
Correspondence address: feldman@mshri.on.ca

Using a novel physics-based approach, complete protein unfolding pathways for five distinct folds were computed and compared with published molecular dynamics and in vitro experiments. Agreement was good and the computational cost was reduced by three orders of magnitude, suggesting that unfolding pathways of small globular proteins are largely constrained by geometry and sterics.

Long abstract


J-4  Semi-Automated Homology Modeling Using A Modified TRADES Algorithm
Michel Dumontier1, Howard J. Feldman2, Christopher W.V. Hogue
1micheld@mshri.on.ca, Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada M5S 1A8, Samuel Lunenfeld Research Institute, Mount Sinai Hospital, 600 University Ave., Toronto, Ontario, Canada M5G 1X5; 2feldman@mshri.on.ca, Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada M5S 1A8, Samuel Lunenfeld Research Institute, Mount Sinai Hospital, 600 University Ave., Toronto, Ontario, Canada M5G 1X5
Correspondence address: micheld@mshri.on.ca

We present a modified version of the TRADES algorithm for homology modeling of protein sequences that have only weak similarity to an experimentally determined protein structure. It generates realistic, all-atom models of non-idealized geometry by incorporating backbone-dependent rotamers; reasonable bond lengths, bond angles, and torsion angles; and minimized electrostatic and van der Waals forces.

Long abstract


J-5  Role of Long-range Interactions in the Transition and Folded Native states of Two-state Proteins
M. Michael Gromiha1, S. Selvaraj2
1michael-gromiha@aist.go.jp, CBRC, AIST, Tokyo, Japan; 2 Department of Physics, Bharathidasan University, Tiruchirapalli, India
Correspondence address: michael-gromiha@aist.go.jp

We have proposed a novel parameter, long-range order (LRO), derived from the long-range contacts in a protein's structure. LRO correlates very well with experimental protein folding rates. The short- and medium-range non-bonded energy, long-range contacts, and helical/strand tendency are the major determinants of transition state structures of two-state proteins.

Long abstract


J-6  Homology modeling of dihydropteroate synthase from Plasmodium falciparum
T de Beer1, F Joubert2, AI Louw
1tjaart.de.beer@tuks.co.za, University of Pretoria; 2fjoubert@postino.up.ac.za, University of Pretoria
Correspondence address: tjaart.de.beer@tuks.co.za

A homology model was constructed for the DHPS-PPPK bifunctional enzyme of Plasmodium falciparum. This enzyme plays a vital part in folate synthesis and is targeted by current, failing therapies. The LUDI/ACD and NCI databases were scanned against the models to identify potential new inhibitors.


Long abstract


J-7  3D circle fitting of leucine-rich repeat proteins
Purevjav Enkhbayar1, Norio Matsushima2, Mitsuru Osaki
1penkh@chem.agr.hokudai.ac.jp, Hokkaido University; 2matusima@sapmed.ac.jp, Sapporo Medical University
Correspondence address: penkh@chem.agr.hokudai.ac.jp

Three-dimensional circle fitting using atomic coordinates was performed on all known structures of leucine-rich repeat (LRR)-containing proteins. The results indicate that there is a regular relationship between the radius of the LRR arc and the rotation angle about the central axis of the arc per repeating unit.

Long abstract


J-8  Protein Folding Simulations by Combining Tabu Search with Genetic Algorithms Based on HP Model
Tianzi Jiang1, Qinghua Cui2, Guihua Shi, Songde Ma
1jiangtz@nlpr.ia.ac.cn, Chinese Academy of Sciences; 2qhcui@nlpr.ia.ac.cn, Chinese Academy of Sciences
Correspondence address: jiangtz@nlpr.ia.ac.cn

A hybrid algorithm combining a genetic algorithm with tabu search is presented. The hybrid algorithm can be successfully applied to protein folding based on a hydrophobic-hydrophilic (HP) model. A protein structure database is also created. The results indicate that in all cases our method works better than a genetic algorithm or tabu search alone.

Long abstract
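
The objective that a search method like the one in J-8 would minimise under an HP model can be sketched as follows. The 2D square lattice, the move alphabet (U, D, L, R) and the function name are assumptions made for this illustration; the abstract does not specify the lattice used, and the search itself (genetic algorithm plus tabu search) is not shown.

```python
# Assumed move alphabet on a 2D square lattice.
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def hp_energy(sequence, moves):
    """Return the HP-model energy of a conformation, or None if it overlaps.

    sequence is a string of H (hydrophobic) and P (polar) residues; moves
    gives the lattice step from each residue to the next.
    """
    if len(moves) != len(sequence) - 1:
        raise ValueError("need exactly one move between consecutive residues")
    coords = [(0, 0)]
    for move in moves:
        dx, dy = MOVES[move]
        x, y = coords[-1]
        coords.append((x + dx, y + dy))
    if len(set(coords)) != len(coords):
        return None  # two residues share a lattice site: not self-avoiding
    position = {coord: i for i, coord in enumerate(coords)}
    energy = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each neighbouring pair is seen once.
        for dx, dy in ((1, 0), (0, 1)):
            j = position.get((x + dx, y + dy))
            # Count H-H lattice neighbours that are not adjacent in the chain.
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                energy -= 1
    return energy
```

For example, folding the four-residue chain HPPH into a square with the moves RUL brings the two H residues into contact, giving an energy of -1, whereas the fully extended chain RRR has energy 0.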


J-9  Relationship of the enzyme functional classes with the statistical attributes of their secondary structures
Rekha Iyer1, Sudhir Kumar2
1rekha_siyer@yahoo.com, Department of Biology, Arizona State University; 2s.kumar@asu.edu, Center for Evolutionary Functional Genomics, Arizona Biodesign Institute, and the Department of Biology, Arizona State University
Correspondence address: rekha_siyer@yahoo.com

In order to explore the relationship of protein function with secondary structural attributes, we compared simple statistical attributes of secondary structural elements in three major enzyme classes. We tested the usefulness of these inferences in predicting the enzyme functional class using neural networks. The strategy presented may help facilitate broad functional annotation of unknown proteins.

Long abstract


J-10  The analysis and prediction of protein-protein interacting sites.
Asako Koike1, Toshihisa Takagi2
1akoike@ims.u-tokyo.ac.jp, Human Genome Center, The Institute of Medical Science, Univ. of Tokyo; 2takagi@ims.u-tokyo.ac.jp, Human Genome Center, The Institute of Medical Science, Univ. of Tokyo
Correspondence address: akoike@ims.u-tokyo.ac.jp

We developed a support vector machine-based method for predicting protein interaction sites from sequence profiles, the accessible surface area of neighboring residues, surface patches, and other physicochemical characteristics. The relationship between the prediction accuracy and the characteristics of protein-protein interaction sites is discussed.

Long abstract


J-11  Incorporating Sequence and Biochemical Information in TOPS models - For Biologically Significant Pattern Matching and Pattern Discovery in Protein
Mallika Veeramalai1, David Gilbert2, David R Westhead
1mallika@dcs.gla.ac.uk, Bioinformatics Research Centre, Dept. of Computing Science, University of Glasgow; 2drg@dcs.gla.ac.uk,
Correspondence address: mallika@dcs.gla.ac.uk

Incorporating sequence and biochemical features in TOPS (Topological Models of Protein Structures) is significant for pattern matching and pattern discovery in protein structures. Interesting results would be valuable for efforts to predict protein structure and function from sequences. These problems remain key challenges of direct relevance to projects in structural and functional genomics. TOPS is available at http://www.tops.leeds.ac.uk

Long abstract


J-12  A bioinformatic toolbox for postprocessing of MASCOT results and its application to the proteome of Halobacterium salinarum
Carolina Garcia-Rizo1, Cristian Klein2, Pfeiffer, Siedler, Oesterhelt
1rizo@biochem.mpg.de, Max-Planck Institute Biochemistry; 2klein@biochem.mpg.de, Max-Planck Institute Biochemistry
Correspondence address: rizo@biochem.mpg.de

We present a bioinformatic toolbox for postprocessing of proteomic data obtained by peptide fingerprint analysis. This toolbox (i) increases the reliability of protein identifications, (ii) detects additional annotation information and (iii) corrects or validates start codon assignments by gene finders. The toolbox was developed for and applied to the proteome of Halobacterium salinarum strain R1 (http://www.halolex.mpg.d)

Long abstract


J-13  Theoretical prediction of the feasibility of identifying membrane proteins by MALDI-TOF
Carolina Garcia-Rizo1, Cristian Klein2, Pfeiffer, Siedler, Oesterhelt
1rizo@biochem.mpg.de, Max-Planck Institute Biochemistry; 2klein@biochem.mpg.de, Max-Planck Institute Biochemistry
Correspondence address: rizo@biochem.mpg.de

In the set of proteins of H. salinarum identified by MALDI-TOF peptide fingerprints, membrane proteins are severely underrepresented. Besides the technical problems encountered in their analysis by 2D gel electrophoresis, an ‘inherent data analysis problem’ is found by statistical analysis. The degree to which this effect aggravates problems in detection ratio of membrane proteins varies from organism to organism.

Long abstract


J-14  Correlated errors of neural network predictions improve fold recognition.
Dariusz Przybylski1, Burkhard Rost2
1dudek@cubic.bioc.columbia.edu, Columbia University; 2rost@columbia.edu, Columbia University
Correspondence address: dudek@cubic.bioc.columbia.edu

We present a fold recognition method that uses predicted secondary structure and solvent accessibility in a way that significantly improves both specificity and sensitivity of fold assignments compared to PSI-BLAST. The method can readily be used for high quality/throughput database annotations.

Long abstract


J-15  Protein class recognition with neural networks
Vadim Valuev1
1valuev@bionet.nsc.ru, Institute of Cytology and Genetics
Correspondence address: gease@mail.ru

There exist several classes of protein structure that are determined by the predominance of an element of secondary structure. Their recognition, starting from amino acid composition, gives some insight into the nature of this classification. Class recognition methods are built by means of neural networks. The accuracy of recognition has reached 75%.

Long abstract


J-16  Quantifying similarities in proteins using features based on hydropathy distribution along the protein sequence
Josef Panek1
1j.panek@imb.uq.edu.au, University of Queensland
Correspondence address: j.panek@imb.uq.edu.au

A computational approach is presented to explore relationships between the functional specificity of proteins and the hydropathy distribution in proteins. The approach uses features based on the hydrophobicity of amino acids to model the hydropathy distribution. A feature space is employed to identify functional protein families and the features that are specific to those families.

Long abstract


J-17  Protein structure comparison based on profiles of topological motifs
Juris Viksna1, David Gilbert2, Gilleain Torrance
1jviksna@cclu.lv, Institute of Mathematics and Computer Science, University of Latvia; 2drg@brc.dcs.gla.ac.uk, Bioinformatics Research Centre, Department of Computing Science, University of Glasgow
Correspondence address: drg@brc.dcs.gla.ac.uk

We present a new approach to protein structure comparison using the existing TOPS database, which is based on graph representations. Instead of comparisons based on single patterns, we use profiles of patterns, which allows "negative" information to be captured indirectly. This leads to a significant increase in prediction accuracy.

Long abstract


J-18  Into protein universe - a global representation of the protein fold space
Jingtong Hou1, Gregory E. Sims2, Chao Zhang, Sung-Hou Kim
1JTHou@lbl.gov, UC Berkeley; 2gsims1997@yahoo.com, UC Berkeley
Correspondence address: JTHOU@LBL.GOV

A global view of the “protein structure universe” was constructed. Such a representation reveals a high-level organization of the fold space that is intuitively interpretable, offering an interesting perspective on both the demography and the evolution of protein structures. The protein fold space is available at http://pro.lbl.gov/~jingtong/foldspace.

Long abstract


J-19  IDPharmo: A Virtual High Throughput Screening System Based on Structural Biology and Cheminformatics.
Jeong Hyeok Yoon1, Jee Young Lee2, Won Seok Oh, Doo Ho Cho, Jae Min Shin
1yoon@idrtech.com, IDRTech. Inc; 2jyoung@idrtech.com, IDRTech. Inc
Correspondence address: yoon@idrtech.com

IDPharmo™ integrates programs for the in silico discovery of potential lead compounds using protein structures and 3D chemical databases. The tool was evaluated on the discovery of potential lead compounds for various disease targets such as HIV-RT, PDE5, HCV pol and DPP IV. We obtained new chemical entities with IC50 values ranging from 50 µM to 1 µM.

Long abstract


J-20  Differences in dynamics of dimeric and monomeric human prion protein revealed by molecular dynamics simulations
Chie Motono1, Masakazu Sekijima2, Satoshi Yamasaki, Kiyotoshi Kaneko, and Yutaka Akiyama
1c-motono@aist.go.jp, Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology; 2sekijima@cbrc.jp, Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology
Correspondence address: c-motono@aist.go.jp

We performed molecular dynamics simulations on monomeric and dimeric forms of human prion protein under various conditions (at 300 K, at 500 K, at acidic pH, or with the D178N mutation) for 10 ns to investigate the differences in the dynamics of each form. Most differences resulted from additional inter-subunit interactions of dimeric HuPrP.

Long abstract


J-21  Database of Three-Dimensional Exon Structures, SEDB
Chesley Leslin1, Valentin Ilyin2, Alex Abyzov, Grigory Makarevich
1chesley.leslin@verizon.net, Theoretical Molecular Biology and Bioinformatics Lab, Northeastern University, Boston, Massachusetts; 2valentin.ilyin@verizon.net, Theoretical Molecular Biology and Bioinformatics Lab, Northeastern University, Boston, Massachusetts
Correspondence address: chesley.leslin@verizon.net

Expeditious mapping of exon borders and intron phase data onto structurally similar proteins presents a novel approach to studying protein structure evolution. We present SEDB, which allows researchers to examine and quickly identify where exon/intron boundaries are located and how these borders have possibly shaped current protein structure. SEDB is available at http://glinka.bio.neu.edu/~cleslin/SEDB/SEDB.html

Long abstract


J-22  Contribution of Interhelical Weak Interactions to the Regulation of Protein-Gated Electron Transfer in the Membrane Milieu
Ilan Samish1, Haim J. Wolfson2, Avigdor Scherz
1ilan.samish@weizmann.ac.il, Weizmann Institute of Science; 2, Tel Aviv University
Correspondence address: ilan.samish@weizmann.ac.il

The mechanism of protein-gated electron transfer between two quinones of photosystem II was investigated. Sequence and structural conservation of a high-packing motif including an intersubunit H-bond, combined with in silico and in vivo combinatorial mutagenesis and biophysical characterization of the H-bond donor, suggest that reversible dissociation of the bond regulates the gating.

Long abstract


J-23  Towards modulating protein-protein interactions: Clustering protein surfaces to identify biologically-relevant structural space to focus molecular design
Stephen Long1, Mark Smythe2, Peter Adams, Darryn Bryant and Tran Trung Tran
1sml@maths.uq.edu.au, School of Physical Sciences, Institute for Molecular Biosciences, The University of Queensland; 2M.Smythe@protagonist.com.au, Institute for Molecular Bioscience, The University of Queensland and Protagonist Pty. Ltd.
Correspondence address: sml@maths.uq.edu.au

Identifying small molecules that modulate protein-protein interactions continues to be a major challenge for drug discovery. From a database of homologous protein-protein interactions, datasets representing the interaction regions of the protein pairs in these complexes were extracted. The aim of this research is to cluster features of each of these datasets.

Long abstract


J-24  Comparing Protein Structures with Constraints
Su-Hyun Lee1, Jin-Hong Kim2, Geon-Tae Ahn, and Myung-Joon Lee
1suhyun@sarim.changwon.ac.kr, Changwon National University, South Korea; 2avenue@ulsan.ac.kr, University of Ulsan, South Korea
Correspondence address: suhyun@sarim.changwon.ac.kr

S4E, a protein structure comparison system using constraint technology, searches for common substructures of secondary structure elements between two proteins given in PDB format. For fast comparison, we developed an efficient algorithm for constructing the compatibility graphs and applied constraint programming to find maximal common subgraphs.


Long abstract


J-25  Molecular basis of ligand recognition in the human glucocorticoid receptor
Johannes R.G. von Langen1, Stephan Diekmann2, Alexander Hillisch
1langen@imb-jena.de, IMB-Jena; 2diekmann@imb-jena.de, IMB-Jena
Correspondence address: langen@imb-jena.de

The glucocorticoid receptor selectively recognises cortisol with high affinity. We built a homology model of this receptor on the basis of the progesterone receptor X-ray structure and simulated the binding of different steroids (estradiol, progesterone, testosterone, aldosterone and cortisol). Using molecular dynamics simulations and free energy calculations, we were able to reveal the molecular basis of ligand recognition.

Long abstract


J-26  Prediction of disulfide connectivity patterns in proteins
Shih-Chieh Chen1, Chi-Hung Tsai2, Huai-Kuang Tsai, Cheng-Yan Kao
1r90039@csie.ntu.edu.tw, Bioinfo Lab., Department of CSIE, National Taiwan University; 2d90008@csie.ntu.edu.tw, Bioinfo Lab., Department of CSIE, National Taiwan University
Correspondence address: cykao@csie.ntu.edu.tw

We propose an approach to predict protein disulfide connectivity directly from the sequence. The approach trains an SVR model and predicts the disulfide connectivity using Gabow’s algorithm. Experiments showed that the proposed method has an accuracy of 62%, which is promising for locating the disulfide bridges in proteins.

Long abstract


J-27  Prediction of RNA Secondary Structures with XNAfold
Yanming Zou1, Alan J. Hillier2, P. Scott Chandry
1y.zou@pgrad.unimelb.edu.au, University of Melbourne; 2Alan.Hillier@foodscience.afisc.csiro.au, University of Melbourne
Correspondence address: y.zou@pgrad.unimelb.edu.au

XNAfold is a JAVA-C hybrid program that takes an RNA sequence as input and predicts RNA secondary structure. The minimum free energy structure predicted by XNAfold matched the experimentally determined structure for 77 out of 133 different RNA molecules. XNAfold is freely available at http://www.student.unimelb.edu.au/~yanmz/index.html

Long abstract


J-28  Inside the beta sheet
Charlotte Deane1
1deane@stats.ox.ac.uk, Oxford University
Correspondence address: deane@stats.ox.ac.uk

Beta sheets are one of the two common repeating elements found in protein structures. Despite their importance in structure they are not well or fully understood. Here we investigate the properties of beta sheets in order to better predict protein structure/folding and understand amyloid (and general aggregate) formation.

Long abstract


J-29  Prevalence and Molecular Characterisation of Campylobacter spp. from Free-Living Animals and Dairy Cattle
Bijay Adhikari1, Joanne H Connolly2, Per Madie, Peter R Davies
1bijayadhikari@hotmail.com, Department of Livestock Services, Kathmandu, Nepal; 2J.H.Connolly@massey.ac.nz, Institute of Veterinary, Animal and Biomedical Sciences, Massey University, Palmerston North, New Zealand
Correspondence address: bijayadhikari@hotmail.com

Campylobacter jejuni was the only species isolated from the 290 samples collected. The highest isolation rate was from dairy cows (54%), followed by urban sparrows (40%), farm sparrows (38%), rodents (11%) and flies (9%). Molecular characterisation of the campylobacters with PFGE provided 22 restriction patterns, of which 7 were common to more than one source.

Long abstract


J-30  Modelling the main cysteine proteinase of the SARS virus
Shoba Ranganathan1, Victor Joo Chuan Tong2
1shoba@bic.nus.edu.sg, National University of Singapore; 2victor@bic.nus.edu.sg, National University of Singapore
Correspondence address: shoba@bic.nus.edu.sg

We present a structural model of the SARS main cysteine proteinase (MPro), based on the X-ray structure of the transmissible gastroenteritis (corona)virus. From the SARS genome, peptides representing real MPro substrates have been docked into the active site. MPro, with its docked ligands, provides important clues for designing proteinase inhibitors.

Long abstract


J-31  A New Efficient Conformation Search Method for ab initio Protein Folding
Jae-Min Shin1, Dai Sig Im2, Byungkook Lee
1jms@idrtech.com, IDR Tech. Inc.; 2idscom@idrtech.com, IDR Tech. Inc.
Correspondence address: jms@idrtech.com

The Window Growth Evolutionary Algorithm (WGEA) has been developed for protein 3D structure prediction. In WGEA, locally favored structures, populated during the initial search stages, are likely to survive and produce more offspring structures, leading to the final folded protein structures. By using RMS as a scoring function, WGEA successfully refolds many small proteins.

Long abstract


J-32  The Evolutionary Search for an RNA Common-Structural Grammar
Jin-Wu Nam1, Je-Gun Joung2, Byoung-Tak Zhang
1jwnam@bi.snu.ac.kr, Graduate Program In Bioinformatics, Seoul National University; 2jgjoung@bi.snu.ac.kr, Graduate Program In Bioinformatics, Seoul National University
Correspondence address: jwnam@bi.snu.ac.kr

We developed a system that automatically learns the common grammar of RNA secondary structures. Genetic programming was applied to evolve function trees that can be converted into RNA structural grammars. We show results of learning common-structural grammars of tRNA and RNA pseudoknots.

Long abstract



Systems Biology
K-1  Regulation of Cytokines and G-protein gene expression by Cholera toxin
Zafar Nawaz1, Bukhtiar H Shah2
1zafarn1@hotmail.com, University of Karachi; 2, Aga Khan University
Correspondence address: zafarn1@hotmail.com

The molecular mechanism of cholera toxin (CTX) action on Gαs and Gαq, inflammatory cytokines, and nitric oxide gene regulation was studied in mouse intestinal epithelial cells.

Long abstract


K-2  Shc-dependent pathway is redundant but dominant in MAPK cascade activation by EGF receptors: a computational result
Yunchen Gong1, Xin Zhao2
1ygong@po-box.mcgill.ca, McGill University; 2zhao@macdonald.mcgill.ca, McGill University
Correspondence address: ygong@po-box.mcgill.ca

MAPK is activated by EGFR via Shc-dependent and Shc-independent pathways. Exploring a mathematical model revealed the redundancy and dominance of the Shc-dependent pathway. Its dominance results from its consuming the majority of the common precursor. The results imply that organisms may use the longer pathway rather than the shorter alternative pathway for signal transduction.

Long abstract


K-3  The Biology of Ageing e-Science Integration and Simulation (BASIS) System
Darren Wilkinson1, Tom Kirkwood2, Richard Boys, Colin Gillespie, Carole Proctor, Daryl Shanley
1d.j.wilkinson@ncl.ac.uk, University of Newcastle upon Tyne, UK; 2tom.kirkwood@ncl.ac.uk, Newcastle upon Tyne, UK
Correspondence address: d.j.wilkinson@ncl.ac.uk

The BASIS project (www.basis.ncl.ac.uk) will significantly extend the scope of current integrative models of the ageing process and will make these models widely accessible. Accessibility will be achieved by developing a GRID node where investigators can explore models and run simulations for themselves.

Long abstract


K-4  Protein Feature Based Identification of Cell Cycle Regulated Proteins
Ulrik de Lichtenberg1, Thomas Skøt Jensen2, Lars Juhl Jensen, Anders Fausbøll, Søren Brunak
1ulrik@cbs.dtu.dk, Center for Biological Sequence Analysis, BioCentrum-DTU, The Technical University of Denmark; 2skot@cbs.dtu.dk, Center for Biological Sequence Analysis, BioCentrum-DTU, The Technical University of Denmark
Correspondence address: ulrik@cbs.dtu.dk

DNA microarrays have been used extensively to identify cell cycle regulated genes; however, the overlap between the sets of genes identified by different groups is surprisingly small. We show that cell cycle regulated proteins can be identified via certain features, including protein phosphorylation, glycosylation, subcellular location and instability/degradation.

Long abstract


K-5  Gene-O-Matic: Regulatory Network Simulation in Multicellular Organisms
Ute Platzer1, Rolf Lohaus2, Hans-Peter Meinzer
1u.platzer@dkfz.de, German Cancer Research Centre;
Correspondence address: u.platzer@dkfz.de

Gene-O-Matic is a graphical tool for the construction and simulation of regulatory networks in multicellular organisms. One developmental process that has been simulated successfully is the embryonic development of the worm Caenorhabditis elegans. Gene-O-Matic is available on request from the authors, or by e-mail to cello@dkfz.de. The project's website can be found at http://mbi.dkfz-heidelberg.de/projects/cellsim/.

Long abstract


K-6  Theoretical analysis of mutations and evolution of gene networks
Alexander V. Ratushny1, Vitaly A. Likhoshvai2, Yuri G. Matushkin, Nikolay A. Kolchanov
1ratushny@bionet.nsc.ru, Institute of Cytology and Genetics, SBRAS; 2likho@bionet.nsc.ru, Institute of Cytology and Genetics, SBRAS
Correspondence address: ratushny@bionet.nsc.ru

The mathematical model simulating cholesterol biosynthesis in a cell and its exchange with blood plasma cholesterol was used for computer analysis of a mutational portrait and evolution of this gene network. The graphic interface of the gene network and its computer dynamic model can be accessed at http://wwwmgs.bionet.nsc.ru/mgs/gnw/gn_model/.


Long abstract


K-7  Discovering novel regulatory controls of budding yeast cell cycle by Reverse Engineering and Bayesian Network Modeling
Yan Sun1, Pawan Dhar2
1sunyan@bii.a-star.edu.sg, Bioinformatics Institute, 21 Heng Mui Keng Terrace, Singapore; 2pk@bii.a-star.edu.sg, Bioinformatics Institute, 21 Heng Mui Keng Terrace, Singapore
Correspondence address: sunyan@bii.a-star.edu.sg

In this study, a Reverse Engineering and Bayesian Network Modeling (REBNM) approach has been used for inferring the cell cycle regulatory network from high-throughput gene expression data. The REBNM analyzer uses prior biological knowledge and a supervised classification scheme to group genes functionally for downstream processing by Bayesian network modeling.

Long abstract


K-8  Functional topology in a network of protein interactions
Natasa Przulj1, Dennis Wigle2, Igor Jurisica
1natasha@cs.toronto.edu, Department of Computer Science, University of Toronto, Canada; 2, Department of Surgery, University of Toronto, Canada
Correspondence address: natasha@cs.toronto.edu

We systematically analyzed the S. cerevisiae protein-protein interaction network using graph theoretic tools to determine structure-function relationships. The constructed computational models describe and predict the properties of functional groups, protein complexes, signaling pathways, lethal and viable mutations, and proteins participating in genetic interactions. Our models offer insight into the complex wiring underlying cellular function.

Long abstract


K-9  Cellware: A Modeling and Simulation tool for Large Scale Biological Systems
Sandeep Somani1, Chee Meng2, Li Ye, Anand Sairam, Zhu Hao, Mandar Chitre, Pawan Dhar
1ssomani@bii.a-star.edu.sg, Bioinformatics Institute; 2cheemeng@bii.a-star.edu.sg, Bioinformatics Institute
Correspondence address: ssomani@bii.a-star.edu.sg

A software tool for in-silico modeling and simulation of large scale biological networks is being developed. The emphasis is on using distributed and grid computing technologies, together with novel algorithm development, to meet the computational challenge of simulating large scale networks.

Long abstract


K-10  Parameter Estimation for Biochemical Pathways using Swarm Algorithm
Tan Chee Meng1, Sandeep Somani2
1cheemeng@bii.a-star.edu.sg, Bioinformatics Institute; 2ssomani@bii.a-star.edu.sg, Bioinformatics Institute
Correspondence address: cheemeng@bii.a-star.edu.sg

Because biological systems exhibit high robustness and operate over a broad range of kinetic parameters, it is important that optimization techniques capture this property of cells. In this study, we present an optimization technique called SWARM that is capable of detecting multiple optimal solutions.

Long abstract


K-11  A Novel Bayesian Network Model for the Study of Genetic Regulatory Networks
Y. Zeng1, J. Garcia-Frias2
1zeng@eecis.udel.edu, University of Delaware; 2jgarcia@eecis.udel.edu, University of Delaware
Correspondence address: zeng@eecis.udel.edu

We propose the use of Bayesian networks (BNs) with continuous valued variables, modeled by Student distributions, to simulate the cellular regulatory mechanism. Experimental results show the robustness of the proposed approach, which outperforms existing schemes based on BNs with either discrete variables or continuous variables with Gaussian distributions.

Long abstract


K-12  Population Genetic Structure in the South Pacific: Prospects for Identifying Disease Susceptibility Genes in New Zealanders with Polynesian Ancestry
Rodney Lea1, Geoffrey Chambers2
1Rod.Lea@vuw.ac.nz, Institute of Molecular Systematics; 2Geoff.Chambers@vuw.ac.nz, Institute of Molecular Systematics
Correspondence address: Rod.Lea@vuw.ac.nz

Understanding population genetic structure has important implications for identifying disease susceptibility genes. Our study will utilise multi-locus genotype data from unlinked microsatellite markers to describe the genetic structure in different Polynesian (Maori, Samoan, Tongan), European and admixed populations.

Long abstract


K-13  Data Preprocessing Facilitates Metabolic Pathway Identification from Time Profiles
Eberhard O. Voit1, Jonas S. Almeida2
1VoitEO@MUSC.edu, Medical University of S. Carolina; 2almeidaj@musc.edu, Medical University of S. Carolina
Correspondence address: VoitEO@MUSC.edu

The identification of metabolic pathway structure is a challenging problem that must be solved for the analysis of metabolic time profiles. Twofold data preprocessing significantly speeds up this identification. First, we model and smooth the data with an artificial neural network, and second, we replace differentials with estimated slopes.
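
The second preprocessing step in the K-13 abstract above (replacing differentials with estimated slopes) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the one-variable decay model dx/dt = -k*x, the function names, and the use of plain finite differences on noise-free data in place of the neural network smoothing the authors describe. Once slopes are estimated, the differential equation becomes an algebraic equation that can be fitted by ordinary least squares.

```python
import math

def estimate_slopes(t, x):
    """Estimate dx/dt at each sample: central differences inside, one-sided at the ends."""
    n = len(t)
    slopes = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        slopes.append((x[hi] - x[lo]) / (t[hi] - t[lo]))
    return slopes

def fit_decay_rate(x, slopes):
    """Least-squares estimate of k in the algebraic relation slope = -k * x."""
    numerator = sum(-s * xi for s, xi in zip(slopes, x))
    denominator = sum(xi * xi for xi in x)
    return numerator / denominator

# Hypothetical time profile: exponential decay with true rate k = 0.5.
t = [0.1 * i for i in range(51)]
x = [2.0 * math.exp(-0.5 * ti) for ti in t]

slopes = estimate_slopes(t, x)
k_hat = fit_decay_rate(x, slopes)
print(f"estimated k = {k_hat:.4f}")
```

No differential equation is integrated during the fit, which is where the speed-up in such decoupling approaches comes from; with noisy data a smoothing step would be needed before the slopes are taken.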

Long abstract


K-14  Towards more biological mutation operators in models of gene regulation
James Watson1, Nicholas Geard2, Janet Wiles
1jwatson@itee.uq.edu.au, University of Queensland; 2nic@itee.uq.edu.au, University of Queensland
Correspondence address: jwatson@itee.uq.edu.au

Gene regulation is often studied through models of directed graphs. Mutation operators applied to such networks impose limitations on how the models evolve. A method to extract a regulation network from an artificial nucleotide sequence is presented, and the impact of sequence-level mutations on network-level structure is discussed.

Long abstract


K-15  BioPACS: BioPathway Automatic Convert System for Genomic Object Net
Masao Nagasaki1, Atsushi Doi2, Hiroshi Matsuno, Satoru Miyano
1masao@ims.u-tokyo.ac.jp, Human Genome Center, University of Tokyo; 2atsushi@ib.sci.yamaguchi-u.ac.jp, Faculty of Science, Yamaguchi University
Correspondence address: matsuno@sci.yamaguchi-u.ac.jp

For the modeling and simulation of a biopathway, it is useful to select suitable information from public biopathway databases such as KEGG and BioCyc. We have developed a method to transform these pathway databases so that the converted biopathways can run on Genomic Object Net (http://www.GenomicObject.Net).

Long abstract


K-16  Models and Simulations in Systems Biology
Joao Carlos Marques Magalhaes1, Cedric Gondro2
1jcmm@bio.ufpr.br, Federal University of Parana; 2cgondro@pobox.une.edu.au, University of New England
Correspondence address: genetics@sigex.com.br

Computational models that simulate complex adaptive systems offer an alternative where analytical handling is untenable. A relatively small set of objects and simple rules, computationally implemented via techniques such as genetic algorithms or genetic programming, can replicate such complexity. An example of this approach is available for download from http://www.sigex.com.br/genetics.

Long abstract


K-17  Stochastic Neural Network Models for Gene Regulatory Networks
Tianhai Tian1, Kevin Burrage2
1tian@maths.uq.edu.au, ACMC, University of Queensland; 2kb@maths.uq.edu.au, ACMC, University of Queensland
Correspondence address: tian@maths.uq.edu.au

Stochastic models are presented by introducing stochastic processes into neural network models for studying the genome dynamics. Poisson random variables are used to represent chance events in the processes of synthesis and degradation. Using an example network, we show how to study robustness and stability properties of gene expression patterns.
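
As a rough illustration of the idea in the K-17 abstract above, the sketch below uses Poisson random variables for the number of synthesis and degradation events in each time step. It is not the authors' stochastic neural network model: the single-gene birth-death setting, the rate names `synth_rate` and `degr_rate`, and the fixed step size are all assumptions chosen to keep the example small.

```python
import math
import random

def poisson(rng, lam):
    """Draw a Poisson(lam) variate with Knuth's multiplication method (suitable for small lam)."""
    if lam <= 0:
        return 0
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def simulate(steps, dt, synth_rate, degr_rate, x0=0, seed=0):
    """Simulate molecule counts with Poisson-distributed synthesis and degradation events."""
    rng = random.Random(seed)
    x = x0
    trajectory = [x]
    for _ in range(steps):
        made = poisson(rng, synth_rate * dt)
        # Degradation is proportional to the current count and cannot exceed it.
        lost = min(x, poisson(rng, degr_rate * x * dt))
        x = x + made - lost
        trajectory.append(x)
    return trajectory

trajectory = simulate(steps=200, dt=0.1, synth_rate=5.0, degr_rate=0.5, seed=42)
print(trajectory[-10:])
```

Repeating the simulation with different seeds gives a distribution of expression levels around the deterministic steady state `synth_rate / degr_rate`, which is one simple way to examine the robustness of an expression pattern.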

Long abstract


K-18  Metabolic comparison of the in-silico phenotype-genotype relationship of Pseudomonas putida and Pseudomonas aeruginosa
Vitor A.P. Martins dos Santos1, Miguel Godinho de Almeida2, Jeremy S. Edwards2, Kenneth N. Timmis1
1vds@gbf.de, National Centre for Biotechnology Research; 2University of Delaware
Correspondence address: vds@gbf.de

We report on in-silico representations of Pseudomonas putida and Pseudomonas aeruginosa that describe their metabolic capacities within the scope of their environmental constraints. Using annotated genome sequence data, biochemical information and strain-specific knowledge, we analysed the cellular behaviour of these micro-organisms under a wide range of conditions relevant for both human health.

Long abstract


K-19  Functional Analysis of Mammalian Cell Cycle using a Computational Model of Hybrid Petri Net
Shuji Kotani1, Takashi Yoshioka2, Kaoru Takahashi, Akihiko Konagaya
1shuji.kotani@nifty.ne.jp, RIKEN Genomic Sciences Center; 2yoshiokatks@nttdata.co.jp, NTT Data Corp
Correspondence address: shuji.kotani@nifty.ne.jp

To describe the molecular mechanism of the mammalian cell cycle, we developed a new computational model using Hybrid Petri Nets. Using the model, we infer how changes in gene products influence cell cycle progression. This model may be useful for exploring the relationship between gene functions and diseases.

Long abstract


K-20  CADLIVE for constructing a yeast cell cycle network
Hiroyuki Kurata1
1kurata@bse.kyutech.ac.jp, Kyushu Institute of Technology
Correspondence address: kurata@bse.kyutech.ac.jp

CADLIVE is a powerful software suite with a GUI for constructing large-scale maps of complicated biochemical reaction networks. We constructed a biochemical map of the budding yeast cell cycle, which consists of 184 molecules and 152 reactions, and integrated postgenomic data into the map to predict novel pathways.

Long abstract


K-21  BSTLab: A Matlab Toolbox for Biochemical Systems Theory
John Schwacke1, Eberhard O. Voit2
1schwacke@musc.edu, Medical University of South Carolina; 2voiteo@musc.edu, Medical University of South Carolina
Correspondence address: schwacke@musc.edu

To facilitate application of Biochemical Systems Theory (BST), we have begun development of BSTLab, a Matlab-based toolbox implementing functions common to BST-based studies. The toolbox automates common computations, permits expansion and customization, and includes functions needed to reformulate and transport models between Matlab and the Systems Biology Markup Language (SBML).

Long abstract


K-22  Comparative Metabolic Flux Analysis by MetaFluxNet
Dong-Yup Lee1, Hongsoek Yun2, Sang-Yup Lee and Sunwon Park
1dylee@pse.kaist.ac.kr, KAIST; 2hsyun@pse.kaist.ac.kr, KAIST
Correspondence address: dylee@pse.kaist.ac.kr

MetaFluxNet is a program package for managing information on the metabolic reaction network and for quantitatively analyzing metabolic fluxes in an interactive and customized way. Using the comparative metabolic flux analysis feature of MetaFluxNet, one can interactively design and evaluate various metabolically engineered in silico strains. MetaFluxNet is available at http://mbel.kaist.ac.kr.

Long abstract


K-23  Ontologies in CellML: A Versatile Method to Describe Cellular Models
Poul Nielsen1, Matt Halstead2, Autumn Cuellar, Michael Dunstan, David Bullivant, Peter Hunter
1p.nielsen@auckland.ac.nz, University of Auckland; 2matt.halstead@auckland.ac.nz, University of Auckland
Correspondence address: p.nielsen@auckland.ac.nz

CellML is an XML-based exchange language used to describe the underlying mathematics and topology of a wide variety of biological models. Knowledge implicitly associated with a model, however, is not normally included in the CellML representation. To address this problem, facilities to include ontologies have been added to CellML.

Long abstract


K-24  Bayesian inference for stochastic models of genetic networks
Richard Boys1, Darren Wilkinson2, Tom Kirkwood, Wan Ng
1richard.boys@ncl.ac.uk, University of Newcastle upon Tyne; 2d.j.wilkinson@ncl.ac.uk, University of Newcastle upon Tyne
Correspondence address: richard.boys@ncl.ac.uk

This poster describes the detailed stochastic techniques used to model regulatory networks, and the computational tools needed for simulation and analysis. An overview is given of the modern computationally intensive statistical techniques which can in principle be used for carrying out Bayesian inference for the parameters underlying these network models.

Long abstract


K-25  Is Segmentation a Robust Gene Network?
Mark Reimers1
1mark.reimers@cgb.ki.se, Karolinska
Correspondence address: mark.reimers@cgb.ki.se

A major question in evolutionary developmental biology is whether conserved developmental networks are robust. We address this with a simulation study and an evolutionary comparison of gene networks.

Long abstract


K-26  On Metabolic Pathway Reconstruction from Gene Expression Data
Cedric Gondro1, Brian P. Kinghorn2
1genetics@sigex.com.br, University of New England; 2bkinghor@une.edu.au, University of New England
Correspondence address: genetics@sigex.com.br

This work aims to infer metabolic pathways and other biological processes from data generated in kinetically simulated microarray experiments using evolutionary algorithms. A preliminary test has shown correct reconstruction of lac operon model parameters derived from simulated expression data collected following a perturbation in the level of lactose.

Long abstract


K-27  Modelling the Role of Small RNAs in Gene Regulation
Nicholas Geard1, Janet Wiles2
1nic@itee.uq.edu.au, The University of Queensland; 2j.wiles@itee.uq.edu.au, The University of Queensland
Correspondence address: nic@itee.uq.edu.au

Small functional RNA molecules have been discovered to play an important role in the regulation of gene transcription. The abstract model presented here uses a sequence-matching paradigm to generate regulatory networks that utilise multiple levels of transcriptional control to increase their computational power.

Long abstract


K-28  High-level properties of genetic regulatory networks
Kai Willadsen1, Janet Wiles2
1kaiw@itee.uq.edu.au, University of Queensland; 2, University of Queensland
Correspondence address: kaiw@itee.uq.edu.au

Abstract models of gene regulation date back to the development of the Random Boolean Network model in 1969. This class of models aims to investigate emergent properties of genetic regulatory networks with a view to better understanding high-level characteristics of the behaviours that these systems display.
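
For readers unfamiliar with the Random Boolean Network model mentioned in the K-28 abstract above, the following is a minimal sketch of the classic formulation: N nodes, each with K randomly chosen inputs and a random Boolean update rule, updated synchronously. The function names, network size, and seeds are illustrative assumptions and are not taken from the poster.

```python
import random

def make_rbn(n, k, seed=0):
    """Build a random Boolean network: each node gets k random inputs and a random truth table."""
    rng = random.Random(seed)
    inputs = [rng.sample(range(n), k) for _ in range(n)]
    tables = [[rng.randint(0, 1) for _ in range(2 ** k)] for _ in range(n)]
    return inputs, tables

def step(state, inputs, tables):
    """Synchronously update every node from the current state."""
    new_state = []
    for node_inputs, table in zip(inputs, tables):
        index = 0
        for i in node_inputs:
            index = (index << 1) | state[i]  # pack input bits into a truth-table index
        new_state.append(table[index])
    return tuple(new_state)

def find_attractor(state, inputs, tables):
    """Iterate until a state repeats; return (transient length, attractor cycle length)."""
    seen = {}
    t = 0
    while state not in seen:
        seen[state] = t
        state = step(state, inputs, tables)
        t += 1
    return seen[state], t - seen[state]

inputs, tables = make_rbn(n=8, k=2, seed=1)
start = tuple([0] * 8)
transient, cycle = find_attractor(start, inputs, tables)
print(f"transient length = {transient}, attractor length = {cycle}")
```

Because the state space is finite and the update is deterministic, every trajectory ends on an attractor; the number and length of attractors across many random networks are typical examples of the emergent, high-level properties such models are used to study.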

Long abstract


K-29  Emergent Models in Complex System Simulations of Genetic and Biochemical Networks
Henk Stolk1, Kevin Gates2, Jim Hanan
1hjs@maths.uq.edu.au, University of Queensland; 2keg@maths.uq.edu.au, University of Queensland
Correspondence address: hjs@maths.uq.edu.au

Complex systems are simulated by interacting software agents at various hierarchical levels. Emergent macro-level models are derived from micro-level properties and behavior, for example by genetic programming algorithms. Micro-level models are also derived from macro-level behavior. The methodology can relate micro-level genetic and biochemical networks to macro-level gene expression patterns.

Long abstract


K-30  ISAWB: a Windows-based Integrated Workbench for Managing Sequence Information in Small Scale
Hongseok Tae1, Hyeweon Nam2, Daesang Lee, Kiejung Park
1hstae@smallsoft.co.kr, Dept. of Microbiology, Kyungpook National University; 2hwnam@smallsoft.co.kr, Information Technology Institute, SmallSoft Co., Ltd.
Correspondence address: hstae@smallsoft.co.kr

ISAWB is a small workbench with a Windows GUI for managing sequence data and analyzing them to obtain processed information. It contains data management features and eleven analysis modules suggested by a Korean MOST project. Each module is composed of a few objects implemented as C++ classes for reusability.

Long abstract


K-31  Multi-algorithm, multi-timescale cell simulation using E-Cell3
Kouichi Takahashi1, Kazunari Kaizu2, Bin Hu, Yohei Yamada, Masaru Tomita
1shafi@sfc.keio.ac.jp, Institute for Advanced Biosciences, Keio University; 2t00220kk@sfc.keio.ac.jp, Institute for Advanced Biosciences, Keio University
Correspondence address: yoyo@sfc.keio.ac.jp

The integration of sub-cellular models running on different types of algorithms poses a significant computational challenge. A heat-shock response model combining the Gillespie-Gibson stochastic algorithm and deterministic equations, and a multi-timescale model with multiple ODE components have been constructed. An implementation of the method is available at http://www.e-cell.org/.

Long abstract


K-32  SPACE-BLAST: Linux Cluster based Biological Sequence Parallel Processing system Summarized by Gene Ontology
Mihwa Park1, Jaewoo Kim2, Hyungsuk Won, Seungsik Yoo
1bfpark@posdata.co.kr, POSDATA; 2jaewoo@posdata.co.kr, POSDATA
Correspondence address: bfpark@posdata.co.kr

SPACE-BLAST (Super PArallel Computer Engine for BLAST) is a high performance bioinformatics system that implements NCBI’s BLAST system with low cost, Linux cluster based parallel processing to search DNA sequences at high speed. In addition, Gene Ontology is applied to summarize the massive volume of BLAST search results. SPACE-BLAST is available at http://space-blast.posdata.co.kr.

Long abstract


K-33  Transcription factor prediction in Bacillus subtilis using stochastic differential equations
Michiel de Hoon1, Sascha Ott2, SunYong Kim, Seiya Imoto, Satoru Miyano
1mdehoon@ims.u-tokyo.ac.jp, University of Tokyo; 2ott@ims.u-tokyo.ac.jp, University of Tokyo
Correspondence address: mdehoon@ims.u-tokyo.ac.jp

We model gene regulation in Bacillus subtilis using stochastic differential equations, including both linear and nonlinear interactions. Using synthetic data, we found that about fifty time points are needed to determine a gene network reliably. Combining several time-course experiments yielded a highly significant prediction for the sigW transcription factor.

Long abstract


K-34  Simulation Studies on Regulatory Network Inference
Ilana Saarikko1, Tero Aittokallio2, Mikko Katajamaa, Riitta Lahesmaa, and Mats Gyllenberg
1ilana.saarikko@btk.utu.fi, Turku Centre for Biotechnology, Univ of Turku, Department of Mathematics, Univ of Turku; 2, Turku Centre for Computer Science, Univ of Turku
Correspondence address: ilana.saarikko@btk.utu.fi

Methods such as Bayesian networks have been proposed to predict gene regulatory networks from expression data. The aim of this study is to simulate data corresponding to the expression levels of genes in the IL-4 signalling pathway and thus assess the amount of material needed to infer the initial network structure.

Long abstract


K-35  Automatic generation of cell-wide pathway model from complete genome
Kazuharu Arakawa1, Yohei Yamada2, Hiromi Komai, Kosaku Shinoda, Yoichi Nakayama, Masaru Tomita
1gaou@g-language.org, Institute for Advanced Biosciences, Keio University; 2skipper@g-language.org, Keio University
Correspondence address: gaou@sfc.keio.ac.jp

The Genome-based E-Cell Modeling System realizes a fully automatic conversion of genome sequence data into a qualitative virtual cell model, linking information from major public databases such as GenBank, EMBL, SWISS-PROT, KEGG, ARM, Brend, and WIT, by a combined method of annotation, homology, and orthology.

Long abstract


K-36  Social Network Analysis of DNA Microarray Data
Jung Hun Ohn1, Tae Su Chung2, Jihoon Kim, Mingoo Kim, Jihun Kim, Hye Won Lee, Ji Yeon Park, Ju Han Kim
1jhohn2@snu.ac.kr, SNUBI Seoul National University Biomedical Informatics; 2epiai@korea.com, SNUBI Seoul National University Biomedical Informatics
Correspondence address: satyrs1@snu.ac.kr

Genomic interactions are quite complicated and special skills are needed to understand the network. Application of social affiliation network analysis on the yeast gene expression compendium dataset of hundreds of systematic perturbations shows the core-peripheral and the significant intermediary genes by network density and centrality indices.

Long abstract


K-38  BioPAX Data Exchange Ontology for Biological Pathway Databases
BioPAX Group1
1pax@cbio.mskcc.org, BioPAX
Correspondence address: jluciano@biopathways.org

BioPAX (http://biopax.org) is a new community-based effort to develop a data exchange format for biological pathway data. The ontology will be able to represent data from diverse sources such as aMAZE, BIND, BioCyc, and WIT. Initial implementations of BioPAX will be available in both OWL and XML Schema.


Long abstract


K-39  Identification of metabolic characteristics through metabolic flux analysis using MetaFluxNet
Tae Yong Kim1, Soon Ho Hong2, Sang Yup Lee
1kimty@webmail.kaist.ac.kr, Metabolic and Biomolecular Engineering National Research Laboratory, Department of Chemical and Biomolecular Engineering, Korea Advanced Institute of Science and Technology; 2totenkof@mail.kaist.ac.kr, Metabolic and Biomolecular Engineering National Research Laboratory, Department of Chemical and Biomolecular Engineering, Korea Advanced Institute of Science and Technology
Correspondence address: kimty@webmail.kaist.ac.kr

The variation of intracellular metabolic flux distributions of E. coli was estimated by MetaFluxNet under various conditions, and the responses of the metabolic fluxes were evaluated. MFA results were applied to identify the metabolic characteristics of E. coli.

Long abstract


K-40  Recent Developments in the Systems Biology Markup Language
Michael Hucka1, Andrew Finney2, Benjamin Bornstein, Bruce Shapiro, John Doyle, Hiroaki Kitano
1mhucka@caltech.edu, Caltech; 2a.finney@herts.ac.uk, University of Hertfordshire
Correspondence address: mhucka@caltech.edu

SBML (http://www.sbml.org) is an XML-based format for representing biochemical network models and is intended to serve as a common exchange language for computational models in systems biology. This poster will summarize recent developments in SBML, including the new SBML Level 2.

Long abstract


K-41  Molecular characterization of NKR-p1 receptor in peripheral blood of pig
Banshi Sharma1
1banshisharma@yahoo.com, CVL
Correspondence address: banshisharma@yahoo.com

The strategy of using reverse transcriptase to obtain cDNA has been utilized. RT-PCR gives a 661 bp PCR product in pig. The PCR product is cloned into the pPCR-Script™ Amp SK(+) cloning vector.

Long abstract


K-42  Genome-scale reconstruction of the Mus musculus metabolic network
Kashif Sheikh1, Lars Nielsen2
1kashifs@cheque.uq.edu.au, University of Queensland, St. Lucia; 2Lars.Nielsen@uq.edu.au, University of Queensland, St. Lucia
Correspondence address: kashifs@cheque.uq.edu.au

We present the first genome-scale reconstruction of the metabolic network of a mammalian cell. The recently published full genome sequence of the common laboratory mouse, Mus musculus, was used as the basis of the reconstruction. The reconstruction may be used as an in silico template for the analysis of phenotypic functions.

Long abstract


K-43  In-Silico View of the Proteome Dynamics of the Eukaryote Cell Cycle
Thomas Skøt Jensen1, Ulrik de Lichtenberg2, Lars Juhl Jensen, Anders Fausbøll and Søren Brunak
1skot@cbs.dtu.dk, Center for Biological Sequence Analysis; 2ulrik@cbs.dtu.dk, Center for Biological Sequence Analysis
Correspondence address: skot@cbs.dtu.dk

Many genes have been identified that are periodically expressed during the eukaryote cell cycle. By integrating temporal DNA microarray data and protein features, we demonstrate that co-expressed, periodic genes encode proteins which share combinations of features, and provide an overview of the proteome dynamics during the eukaryote cell cycle.

Long abstract


K-44  Vector PathBlazer: A New Pathway Analysis And Visualization Tool
Valery Reshetnikov1, Artur Karpov2, David Pot, Feodor Tereshchenko
1resh@informaxinc.com, InforMax; 2art@informaxinc.com, InforMax
Correspondence address: feodor@informaxinc.com

Vector PathBlazer is a bioinformatics tool, providing the ability to build, visualize and analyze biological pathways, using metabolic and protein-protein interaction data from multiple sources, right on the desktop. It combines publicly maintained reaction information from KEGG, DIP, and BIND, with an easy-to-use interface for adding proprietary pathways data.

Long abstract