Special Session Details
Organizer(s): Juni Palmgren
Karolinska Institutet, Stockholm University, Sweden
Date: Monday, June 29
Start Time: 10:45 a.m. – 12:40 p.m.
Room: K1
The population-based disease and other registries in the Nordic countries offer a unique opportunity to study common multi-factorial diseases (cancer, cardiovascular disease, diabetes, etc.) from a human population perspective. We illustrate how large groups of patients and healthy individuals, coupled with modern bio-technology, e-epidemiology and bio-computing, can shed light on disease mechanisms and on the interplay between molecular biology, heredity, lifestyle and the environment.
Institute for Molecular Medicine Finland, Helsinki, Finland and Wellcome Trust Sanger Institute, Hinxton, UK
The on-going wave of genome-wide association (GWA) studies is rapidly increasing our knowledge about variation in DNA and its association with various complex diseases and traits. The success of these studies relies heavily on key advances in technology (unbiased and relatively cheap high-throughput genotyping), design and analysis (large-scale population-based study designs with control for confounding and false-positive risks) and computation (reliable algorithms for genotyping, genotype imputation, and association testing and estimation).
We will review some of the recent success stories using massive European cohorts with special focus on searching for variants associated with genetic risk of coronary heart disease and circulating serum lipids. We will also discuss the statistical and computational challenges in the next wave of large-scale genome-wide studies, including the data and analysis harmonisation efforts in European and global consortia.
Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm
Over the years, many studies have attempted to identify individuals at risk of life-threatening diseases. Epidemiological studies have traditionally focused on environmental or lifestyle factors, and these risk factors have turned out to be only weakly related to the risk of developing disease. So-called linkage studies have identified some genetic alterations that dramatically increase the risk of breast cancer, but these alterations are rare. More common genetic alterations, or polymorphisms, have been linked to different common disorders, but until recently few definite associations had been found, because few alterations could be studied simultaneously and most studies were severely underpowered.
Recent technological advances now allow common genetic variation throughout the human genome to be evaluated for association with disease. These “genome-wide” studies have identified novel genetic loci for breast cancer, demonstrating that such common cancers have a polygenic basis. They indicate that many more such loci can be identified, given sufficiently large studies, and that these loci, in combination and together with lifestyle/environmental risk factors, may provide powerful predictors of individual risk.
Lack of statistical power in studies conducted so far means, on the one hand, that relevant interactions may have gone undetected, while on the other hand too much attention may have gone to false-positive findings that could not be replicated in subsequent studies.
I will present some of the large-scale efforts in Europe that constitute excellent platforms for increasing our insight into both the effects of modifiable risk factors in genetically susceptible subgroups and the biological mechanisms underlying these associations.
Dept of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm.
There may be genes that have no effect on the mean expression level of a trait, but, depending on environment, have a greater or lesser variance of expression. It is not difficult to think of molecular mechanisms, e.g. promoters of different binding efficiency, to explain the existence of such variability genes. MZ twins offer a unique opportunity to test for the presence of such genes if a measured genotype is available. This may be particularly informative if the phenotype is measured longitudinally. Examples of indications of this type of gene-environment interaction in trajectories of cognitive change will be given.
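The idea of a variability gene can be illustrated with a small simulation. The sketch below is not the speaker's method; it assumes a hypothetical three-level measured genotype, a trait whose mean is the same for every genotype, and a non-shared environmental spread that differs by genotype (the `sd_by_genotype` values are invented for illustration). Because MZ twins share their genotype and their familial background, the absolute within-pair difference isolates the non-shared component, and a permutation test asks whether its size depends on genotype.

```python
import random
from statistics import mean


def simulate_mz_pairs(n_pairs=300, sd_by_genotype=(0.5, 1.0, 1.5), seed=1):
    """Simulate MZ twin pairs (hypothetical parameters).

    Each pair shares a genotype (0, 1 or 2) and a familial component.
    The genotype changes only the spread of the individual-specific
    component, not the mean of the trait.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_pairs):
        g = rng.choice((0, 1, 2))
        shared = rng.gauss(0, 1)  # cancels out in the within-pair difference
        t1 = shared + rng.gauss(0, sd_by_genotype[g])
        t2 = shared + rng.gauss(0, sd_by_genotype[g])
        pairs.append((g, t1, t2))
    return pairs


def mean_abs_diff_by_genotype(pairs):
    """Mean absolute within-pair difference for each genotype group."""
    groups = {}
    for g, t1, t2 in pairs:
        groups.setdefault(g, []).append(abs(t1 - t2))
    return {g: mean(d) for g, d in sorted(groups.items())}


def permutation_p_value(pairs, n_perm=2000, seed=2):
    """Permutation test: shuffle genotype labels across pairs and compare
    the spread (max minus min) of the group means with the observed spread."""
    rng = random.Random(seed)

    def spread(ps):
        m = mean_abs_diff_by_genotype(ps)
        return max(m.values()) - min(m.values())

    observed = spread(pairs)
    labels = [g for g, _, _ in pairs]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        shuffled = [(g, t1, t2) for g, (_, t1, t2) in zip(labels, pairs)]
        if spread(shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


pairs = simulate_mz_pairs()
group_means = mean_abs_diff_by_genotype(pairs)
p_value = permutation_p_value(pairs)
print(group_means, p_value)
```

With longitudinal phenotypes the same comparison could be applied to within-pair differences in a trajectory summary, such as a per-twin slope, instead of a single measurement.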
Dept of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm
The development of new technologies within medical research has, during the last decade, made it possible for researchers to assemble more detailed genetic and molecular information on biological samples. For the first time in medical history the technical prerequisites exist for integrating this information with demographic and lifestyle data on healthy individuals and connecting these data with information from administrative registers and hospital databases. The challenge lies in creating an infrastructure that combines modern information and communication technologies for high-throughput genomic data with modern communication technologies to assemble and organize demographic and lifestyle information. Adequate design, statistical analysis and interpretation of the masses of complex data from different sources constitute an added challenge.
Newly emerging IT technologies also simplify biobanking procedures. The unique Swedish national health care registry, together with the high level of Internet access in Swedish society, makes population-based prospective studies in Sweden both cost-effective and feasible. At present such studies are possible in only a few countries around the globe. Here the use of e-epidemiology is an important step.
E-epidemiology is the science underlying the acquisition, maintenance and application of epidemiological knowledge and information using digital media such as the Internet, mobile phones, digital paper and digital TV. E-epidemiology also refers to the large-scale science that will increasingly be conducted through distributed global collaborations enabled by the Internet.
We conclude by emphasizing the importance of developing user-friendly and coherent tools that cover the entire pipeline from raw data, via harmonisation, quality control and normalisation procedures, to high-level analytic modelling.
Special Session 2: Orthology Inference
Organizer(s): Erik Sonnhammer
Stockholm Bioinformatics Centre, Sweden
Michael Ashburner
Cambridge University, England
Date: Monday, June 29
Start Time: 2:15 p.m. – 4:10 p.m.
Room: K1
Orthologs are genes in different organisms that originate from a single gene in the last common ancestral species. Therefore, orthologs are more likely to have the same function than non-orthologous homologs. It is vital to many researchers in the ISMB community to accurately and comprehensively identify orthologs between all organisms that have been completely sequenced, currently some 100 eukaryotes and 800 prokaryotes. Correct and comprehensive orthology links between organisms are important to many aspects of protein annotation and comparative genomics. Inference of protein function by similarity to experimentally characterised proteins gets a quality stamp when derived from orthologs. Establishing orthology is further essential, e.g. for building species trees and for analysing lineage-specific family expansions. This special session will bring together experts in the ortholog field with the goal of highlighting the most recent advances.
To date, at least 15 ortholog databases exist. The reason for this diversity is that different research groups have focused on different species, different methodology, and different resolution. Some have prioritised sensitivity while others have minimized the error rate. In other words, there is a trade-off between coverage and specificity, and the available ortholog databases provide a wide range of solutions to this problem. Several comparative studies of ortholog datasets have been published (Alexeyenko et al., 2006; Hulsen et al., 2006; Chen et al., 2007; Dolinski and Botstein, 2007; Kuzniar et al., 2008). Although these reviews give a general picture of the advantages and disadvantages of each method, the lack of standards and reference data sets has prevented a comprehensive and objective comparison.
Orthology inference is highly relevant to the Model Organism Databases (MODs), as each of these needs to cross-reference orthologous proteins in different organisms as part of its annotation process. At present, several MODs, e.g. FlyBase and WormBase, have implemented pipelines to incorporate ortholog links.
The session will include presentations from the foremost ortholog databases as well as from representatives of MODs and the Gene Ontology (GO). This mix is intended to highlight the potential of a controlled vocabulary for orthology data, which would enhance its usage by MODs and in other ortholog-related research. The session aims to stimulate agreement between the producers of ortholog data sets on an exchange format for these data and on the use of common input protein data sets. The present absence of these standards greatly hinders the incorporation of ortholog data by the MODs and by databases such as the Gene Ontology.
References
Alexeyenko A, Lindberg J, Perez-Bercoff A, Sonnhammer ELL.
Overview and comparison of ortholog databases.
Drug Discovery Today: Technologies. 2006, 3:137-143.
Chen F, Mackey AJ, Vermunt JK, Roos DS.
Assessing performance of orthology detection strategies applied to eukaryotic genomes.
PLoS ONE. 2007, 2:e383.
Dolinski K, Botstein D.
Orthology and functional conservation in eukaryotes.
Annu Rev Genet. 2007, 41:465-507.
Hulsen T, Huynen MA, de Vlieg J, Groenen PM.
Benchmarking ortholog identification methods using functional genomics data.
Genome Biol. 2006, 7:R31.
Kuzniar A, van Ham RC, Pongor S, Leunissen JA.
The quest for orthologs: finding the corresponding gene across genomes.
Trends Genet. 2008, 24:539-551.
Presenters:
Erik Sonnhammer, Stockholm Bioinformatics Centre
http://sonnhammer.sbc.su.se/
“InParalogs and InParanoid”
Suzanna Lewis, Berkeley University
http://berkeleybop.org/people.html
“PAINT: The GO Workbench for Protein Family Annotation”
Albert Vilella, EBI
http://www.ebi.ac.uk/~avilella/
“EnsemblCompara GeneTrees: Complete, duplication aware phylogenetic trees in vertebrates”
Yuri Wolf, NCBI
http://www.ncbi.nlm.nih.gov/CBBresearch/Koonin/members.html#yuri
“Construction of clusters of orthologous genes for archaeal genomes”
Avril Coghlan, Cork University
http://www.ucc.ie/microbio/ac.html
“TreeFam”
Paul Thomas, SRI
http://www.ai.sri.com/esb/
“Panther”
Special Session 3: Membrane proteins – from biosynthesis to drug design
Organizer(s): Gunnar von Heijne
Stockholm University, Sweden
Date: Tuesday, June 30
Start Time: 10:45 a.m. – 12:40 p.m.
Room: K1
Membrane proteins account for ~30% of all protein-encoding genes in typical genomes but represent more than half of all drug targets. Bioinformatics and computational chemistry are integrated parts of current efforts to understand the structure of membrane proteins and to exploit them for use in the pharma and biotech industries. The Special Session will try to capture recent excitement in the field generated by the first high-resolution structures of the machinery that catalyzes the assembly of membrane proteins into cellular membranes, the coming-of-age of computational chemistry techniques for studying the basic physical chemistry of protein-lipid interactions, advances in structure prediction of membrane proteins, and the marriage of experimental and computational approaches in drug discovery and design as applied to membrane proteins.
Presenters:
Jochen Zimmer, Harvard Medical School: The molecular mechanism of membrane protein assembly.
Jochen Zimmer is a senior postdoc with Tom Rapoport, a world leader in the field of protein secretion and membrane protein assembly. Jochen has determined the first high-resolution structure of the SecYEG protein translocation channel in a complex with the motor protein SecA.
Erik Lindahl, Center for Biomembrane Research, Stockholm University (ERC Starting Grant recipient): Energetics of protein-lipid interactions probed by molecular dynamics simulations.
Erik Lindahl is one of the top young scientists who apply molecular dynamics to calculate free energies of protein-lipid interactions.
Patrick Barth, University of Washington: Membrane protein structure prediction using Rosetta.
The Rosetta program is the gold standard for protein structure prediction, and, thanks to the work of Patrick Barth in David Baker’s lab, can now be applied to membrane proteins.
Thue Schwartz, Panum Institute, Copenhagen University: Drug design and drug discovery for GPCR targets.
Thue Schwartz has long experience working on GPCRs and is a co-founder of a company that specializes in GPCR-targeted drug development.
Special Session 4: Advances and Challenges in Computational Biology hosted by PLoS Computational Biology
Organizer(s): Barb Bryant
PLoS Committee
Date: Wednesday, July 1
Start Time: 10:45 a.m. – 12:40 p.m.
Room: K1
Computation is now an integral component of life sciences research and education. What are the newest advances and challenges facing the emergent discipline of computational biology, and what are the promises if these challenges can be met? To address this question, leading scientists drawn from the PLoS Computational Biology Editorial Board provide their individual perspectives from their own research areas, followed by a panel discussion.
PLoS Computational Biology presents three talks from members of its Editorial Board, followed by a panel discussion, highlighting recent scientific advances made possible by computation and mathematics in three different fields. Addressing themes such as drivers of research directions, scientific impact, reproducibility, model validation, co-operation amongst researchers, the need for tools to enable sharing (such as common description languages), and interfacing with scientists who are not focused on computational approaches, this session aims to provide a wide-reaching overview of common issues faced by computational biologists in different areas.
Abigail Morrison
RIKEN Brain Science Institute
Wako-shi, Saitama, Japan
http://www.cnpsn.brain.riken.jp/cnpsnhomewiki/index.php/Abigail_Morrison
Adam Arkin
University of California, Berkeley
Berkeley, CA, USA
http://genomics.lbl.gov/
Donna Slonim
Tufts University
http://www.cs.tufts.edu/~slonim
Abigail Morrison
I am interested in the interaction of cortical dynamics and synaptic plasticity to generate structure in the brain, and in the interaction of structure, dynamics and plasticity to realize cortical functions. As these interests mean I spend a lot of time simulating spiking neuronal networks, I am also interested in simulation technology, particularly in efficient integration techniques and parallelisation strategies.
I am currently a Research Scientist in the Computational Neuroscience Group at RIKEN Brain Science Institute.
Adam Arkin
I work on the reverse and forward engineering of cellular networks. In the former I use evolutionary analysis, comparative functional genomics and detailed mathematical modeling to understand the structure and dynamics of cellular networks and how evolutionary forces shape them for both flexibility and optimal function. We apply these studies to environmental microbes as well as B. subtilis, E. coli and viruses such as HIV. In the latter work I aim to use principles learned from our reverse engineering work, along with good engineering practice, to engineer new complex functions into bacteria and viruses. We work on developing standardized genetic parts sets that allow rapid design and implementation of complex pathways in cells; standard protocols for characterization of these systems; and standards for their modeling and prediction. We are applying this approach to synthetic design of systems for biofuel production, and treatment of disease with engineered bacteria and viruses.
I am a professor of bioengineering at U.C. Berkeley, and a faculty scientist at the Lawrence Berkeley National Laboratory.
Donna Slonim
My research focuses on applying computational genomics methods to advance our understanding of human health and disease. I am currently most active in the field of human development, using genomic data to improve our ability to monitor the health of the living human fetus and to suggest potential therapies for developmental disorders. My other translational research projects have included cancer and Alzheimer's disease. For these efforts to succeed, we need a better understanding of how genes and proteins work together to carry out molecular functions in specific contexts. I therefore also study the functional and structural properties of molecular networks.
I am an Associate Professor of Computer Science at Tufts University, an Associate Professor of Pathology at Tufts University School of Medicine, and a member of the Genetics faculty at the Sackler School of Graduate Biomedical Sciences.
Special Session 5: Next Steps in eQTL Analysis: Gaining Insight at the Systems Level
Organizer(s): Andreas Beyer
Biotechnology Center
TU Dresden, Germany
Date: Wednesday, July 1
Start Time: 2:15 p.m. – 4:10 p.m.
Room: K1
Expression quantitative trait loci (eQTL) provide a unique concept for the analysis of expression regulation at the genomic level. eQTL utilize the fact that gene expression varies in genetically different individuals. Hence, correlating the genotypic patterns at many chromosomal locations (loci) with the expression of a certain gene allows for the detection of regulators of gene expression.
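The correlation idea described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not any presenter's method: the genotypes are simulated allele counts (0, 1 or 2), the locus names, the choice of `locus_7` as the regulator and the effect size are all hypothetical, and a plain Pearson correlation stands in for the more careful statistical models used in real eQTL studies.

```python
import random
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant vector carries no association signal
    return sxy / (sxx * syy) ** 0.5


def scan_loci(genotypes, expression):
    """Correlate the expression of one gene with the genotype at each locus.

    genotypes: dict mapping locus name -> allele counts, one per individual
    expression: expression level of the gene, one value per individual
    """
    return {locus: pearson_r(g, expression) for locus, g in genotypes.items()}


def simulate(n=200, n_loci=20, causal="locus_7", effect=1.0, seed=0):
    """Simulate genotypes and one gene whose expression depends on one locus
    (all parameters are hypothetical)."""
    rng = random.Random(seed)
    genotypes = {
        f"locus_{i}": [rng.choice((0, 1, 2)) for _ in range(n)]
        for i in range(n_loci)
    }
    expression = [effect * g + rng.gauss(0, 1) for g in genotypes[causal]]
    return genotypes, expression


genotypes, expression = simulate()
scores = scan_loci(genotypes, expression)
best = max(scores, key=lambda locus: abs(scores[locus]))
print(best, round(scores[best], 2))
```

A genome-wide analysis repeats such a scan for every expressed gene, which is where the dataset sizes and multiple-testing problems discussed in this session arise.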
This conceptually simple idea has recently gained considerable momentum since genome-wide expression measurement became routine. Cheaper genotyping options allow eQTL to be studied in yeast, mice, rats, plants and humans. Based on these datasets it should now be possible to detect regulators of transcription at a genomic level. Most importantly, eQTL may provide the missing link for explaining associations between genetic loci and important (disease) phenotypes at the molecular level.
However, the computational detection and analysis of eQTL has been challenging due to the sheer size of the datasets as well as due to statistical problems associated with noise and small effect sizes.
This session will introduce recent developments aimed at overcoming many of these problems. These novel computational methods integrate very heterogeneous sources of data using advanced statistical models and graph theory. Leading researchers in the field will introduce approaches for the integration of pathway and protein interaction network data with eQTL, the joint analysis of eQTL and disease association data, as well as the use of large community resources. The first presentation will briefly introduce the concept of eQTL itself so that non-experts can also follow the session.
Presenters:
Ritsert Jansen:
Ritsert Jansen published the first paper presenting the idea of genetical genomics (i.e. eQTL) in 2001.
He continues to be a pioneer in the field applying eQTL analysis e.g. in Arabidopsis thaliana.
http://www.rug.nl/staff/r.c.jansen/index
Andrew Su:
Andrew Su is leading experimental efforts for the measurement of eQTL in diverse inbred mouse lines (Mouse Diversity Panel). Such data are significantly more complex than the standard eQTL derived from recombinant inbred mouse lines. Yet, they are much richer in information; hence, Andrew Su is also pioneering bioinformatics efforts for detecting and analyzing eQTL originating from the Mouse Diversity Panel.
http://www.gnf.org/technology/computational-sciences-and-informatics/computational-biology.htm
Eric Schadt:
Eric Schadt published the first eQTL study in mouse in 2003. Since then he has been one of the major driving forces of the field, pushing the limits of the computational analysis.
http://www.rii.com/about/executives.html
Robert W. Williams:
Rob Williams established the GeneNetwork platform and has continued to develop the analysis of eQTL.
His in-depth knowledge, particularly of the BXD mouse panel, has been instrumental for many research projects in the field.
http://www.nervenet.org/people/rob_cv.html
Special Session 6: Regulatory Genome Architecture and Noncoding Mutations in Human Disease
Organizer(s): Gabriela Loots
Lawrence Livermore National Laboratory (LLNL)
Livermore, California, USA
Ivan Ovcharenko
National Institutes of Health (NIH)
Bethesda, Maryland, USA
Date: Thursday, July 2
Start Time: 10:45 a.m. – 12:40 p.m.
Room: K1
The emerging field of noncoding genomics (the study of disease-causing variation in non-genic regions of the genome) originates from a successful combination of computational developments, experimental biology studies, and technological advances. Combined, these efforts are aimed at the identification and functional characterization of gene regulatory elements in the human genome and the functional impact of population variation within them. Whole-genome association studies detecting disease-associated mutations in non-genic regions of the human genome have indicated not only the importance of variation in noncoding regions (with subtle variations having dramatic impacts on human health), but also the widespread distribution of noncoding variations associated with phenotypic abnormalities. To date, noncoding mutations have been linked to over 50 single or multi-gene disorders, including cancer, AIDS, and heart disease, to name a few. For example, one recent discovery was the identification of a common noncoding mutation in an intron of the RET gene that is strongly associated with Hirschsprung disease (Emison ES et al., Nature 2005). Another study reported a perfect association of blue/brown eye color with a noncoding mutation residing in an intron of a gene neighboring the gene whose regulation is disrupted (Eiberg et al., Human Genetics 2008). In addition to the rapidly growing field of experimental studies of noncoding disease association, multiple computational biology directions are being explored to facilitate the discovery of disease-causing noncoding mutations and to characterize disrupted gene regulatory pathways.
In particular, there are several groups developing genome data mining techniques to demarcate the gene regulatory architecture of the human genome, study the evolutionary divergence and innovation of regulatory elements, carry out population genetic studies of mutations in promoters and distant regulatory elements, and develop computational approaches to quantify the likelihood of disease association and causality for noncoding mutations using their genetic features. The speakers of this Special Session will describe their successful cross-discipline studies aimed at deciphering the gene regulatory architecture of the human genome and assessing the impact of noncoding mutations on the human phenotype and disease.
Presenters:
Veronica van Heyningen
MRC Human Genetics Unit
Edinburgh, UK
James P. Noonan
Department of Genetics
Yale University School of Medicine
New Haven, CT, USA
Stephen Montgomery
Wellcome Trust Sanger Institute
Hinxton, UK
Nadav Ahituv
Department of Biopharmaceutical Sciences and Institute for Human Genetics
University of California, San Francisco, USA
MRC Human Genetics Unit, Edinburgh, UK
Most closely studied malformations arise as a result of mutations affecting early transcriptional regulators with key roles in development. Such genes are frequently highly conserved in evolution, and over time have acquired multiple spatiotemporally distinct functions which need to be tightly regulated. Complex regulatory control elements occupy significant genomic space and their components are susceptible to variation and mutation in disease. The most readily recognised regulatory variants are those where a functional null mutation is mimicked by regulatory change. Many such mutations arise through chromosomal rearrangement, for example in the regions around SOX9, leading to campomelic dysplasia, the SOX9 null phenotype, or breakpoints upstream of sonic hedgehog, SHH, causing holoprosencephaly. More subtle partial phenotypes may be elicited by deletion or mutation of specific regulatory elements. Examples of this include pre-axial polydactyly due to distant regulatory element change affecting the expression of SHH solely in the developing hand or foot, and more recently distant disruption of SOX9 expression associated with the cleft palate phenotype Pierre Robin Sequence. Following the advent of molecular array technology capable of recognising small copy-number variations, novel regulatory mutations may be revealed when phenotypes not linked to intragenic mutations are assessed using suitable genomic arrays. Much common-disease-associated variation maps to genomic regions expected to harbour regulatory elements for neighbouring genes. Identifying the causative changes and defining the effect of such changes remain complex tasks. Association studies for more common anomalies may also reveal links to regulatory variants at known loci generally associated with more severe developmental phenotypes – as recently observed for cleft lip and the Van der Woude syndrome gene IRF6.
The regulatory elements associated with specific genes are generally recognisable as evolutionarily highly conserved regions within introns of the transcription unit, and particularly in extragenic regions flanking it. Such readily identified conserved non-coding elements (CNEs) can be spread over as much as a megabase of flanking sequence. CNEs can be functionally tested using reporter transgenic studies. Nevertheless, it is difficult to predict the phenotypes expected when a specific element is disrupted by deletion or sequence alteration. Identification of regulatory element mutations has therefore been difficult and, conversely, validation of regulatory region changes as causative for a particular phenotype has also been problematic. Discussion is needed on the design of techniques that may be used to predict phenotypes due to individual mutations in CNEs.
1) Computational and Mathematical Biology, Genome Institute of Singapore, Singapore
2) Department of Genetics, Yale University School of Medicine, New Haven, CT
3) Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA
4) Department of Comparative Medicine, Yale University School of Medicine, New Haven, CT
Identifying the molecular basis of human biological uniqueness is one of the fundamental challenges in human genetics. Although changes in gene regulation have long been thought to underlie the evolution of the morphological features that distinguish humans from other primates, evidence for human-specific developmental regulatory function has remained elusive. In vivo studies of conserved noncoding sequences (CNSs) have revealed them to be enriched in cis-regulatory transcriptional enhancers that confer specific expression patterns during development. We have identified a set of CNSs that show extreme human-specific evolutionary acceleration (HACNSs). Using mouse transgenic enhancer assays, we have shown that the most rapidly evolving conserved noncoding sequence in the genome, termed HACNS1, is an enhancer driving human-specific gene expression in the developing mouse limb, most notably in the mouse equivalent of the primordial thumb. In vivo analyses with synthetic enhancers indicate that 13 substitutions clustered in an 81-bp module within the enhancer are sufficient to confer human-specific limb expression. We are extending this strategy across the genome to identify and functionally characterize additional developmental enhancers that show human-specific activities. We are also using the HACNS1 enhancer as a test case to model the phenotypic impact of human-specific regulatory function by the generation of “humanized” mice in which the HACNS1 enhancer sequence is introduced into the mouse genome by homologous recombination. We are combining these genetic approaches with gene expression profiling of embryonic human, primate and mouse tissues and methods to identify the regulatory targets of enhancers in vivo, in order to develop an integrated understanding of how changes in developmental regulatory programs contributed to human evolution.
Howard Hughes Medical Institute, Departments of Pediatrics and Genetics, University of Pennsylvania, Philadelphia, PA 19104, U.S.A.
Individuals differ at the DNA sequence level; however, the effect of DNA sequence variants on phenotypes remains largely unknown. In this project, we are identifying DNA variants that influence gene expression in human cells at baseline and in response to perturbations. There is extensive variability in expression levels of many human genes, and there is a genetic component to this variation. We are treating the expression levels of genes as quantitative phenotypes, and using genetic mapping approaches to identify sequence variants that influence gene expression. The results are allowing us to identify cis- and trans-acting polymorphic regulators in the human genome. In this presentation, I will discuss our findings, and describe the regulatory landscape of human gene expression.
Wellcome Trust Sanger Institute, Hinxton, UK, CB10 1SA
Understanding genetic differences in gene expression is a fundamental component in building our understanding of the etiology of complex traits. To date, genome-wide investigations have largely focussed on understanding total gene expression differences between individuals in a population by quantification of transcripts via hybridization. The increasing availability of sequence information via next-generation sequencing technology and its application to transcriptome sequencing (RNAseq) has improved our sensitivity to detect gene expression variation at the single-nucleotide level. This increased resolution has enabled us to investigate genetic differences not only in transcript abundance but also in alternative splicing and transcription efficiency. Furthermore, our power to detect associations has been enhanced by the extra genotyping information provided by allelic imbalance.
We report on our RNAseq analysis of 60 CEU transcriptomes which have also been sequenced as part of the 1000 Genomes Project. We observe an enhanced dynamic range relative to array-based methodologies, especially for low-abundance transcripts. We will further report on unique features of this data set.
Department of Biopharmaceutical Sciences and Institute for Human Genetics, University of California, San Francisco, CA 94143, USA
While the annotation and characterization of the 2% of our genome that codes for genes has been extremely successful, the remaining 98% still remains a ‘wasteland’. One vital function that is clearly embedded in this wasteland is gene regulation. Using comparative genomics we can now identify evolutionarily conserved noncoding sequences within this terrain that may have potential gene regulatory function, but determining that function is not straightforward. In addition, unlike genes, where we know the genetic code and the consequences of nucleotide changes within it, in gene regulatory sequences we don’t have that knowledge. This knowledge is of extreme importance, with a wide variety of clinical and molecular data indicating that noncoding gene regulatory sequences are an important cause of human diversity and disease. In order to obtain a better understanding of the gene regulatory code, we set out to deconstruct previously characterized gene regulatory elements using mouse knockout and knockin technology. As our substrate we took advantage of ultraconserved elements, sequences that are 100% identical between human and mouse, that were previously characterized as functional enhancer sequences (sequences that instruct gene promoters when and where to turn on). We specifically generated genetic manipulations within these elements for the sole purpose of obtaining a better understanding of the gene regulatory code that is embedded within them. Our manipulations include their complete removal, insertions and 3 base pair nucleotide changes. Our preliminary results demonstrate that while complete removal of these elements does not cause an apparent phenotype, a subtle change within them can lead to increased lethality, suggesting a gain-of-function type mechanism.
By extensively analyzing in vivo the functional consequences following these manipulations, we are increasing our understanding of the gene regulatory code and the implications of nucleotide changes and rearrangements within regulatory sequences for human diversity and disease.