Studies of the transcriptional regulation of the genes coding for the novel IL28A,B and IL29 protein family:
Illustration of an in silico approach applicable on a genomic scale
William Krivan1, Brian Fox2, Emily Cooper, Teresa Gilbert, Frank Grant, Betty
Haldeman, Katherine Henderson, Wayne Kindsvogel, Kevin Klucher, Gary
McKnight, Patrick O'Hara, Scott Presnell, Monica Tackett, David Taft, and
Paul Sheppard
1 krivan@zgi.com, ZymoGenetics, Inc.; 2 bfox@zgi.com, ZymoGenetics, Inc.
The novel IL28A,B and IL29 protein family consists of three non-allelic
human proteins, and homologous mouse proteins, which are distantly related
to interferons and IL-10. We use this protein family to illustrate an approach
to the computational identification and characterization of putative
transcriptional regulatory regions that consists of a combination of available
and novel techniques that can be applied on a genomic scale.
Insights into the regulatory mechanisms of the novel IL28A,B and IL29 protein
family may be gained from comparisons of their potential regulatory regions
with the regulatory regions of characterized cytokines such as IFN-alpha,
IFN-beta, and IFN-gamma.
In metazoans, however, it is generally not feasible to study the co-regulation of
paralogous genes simply by aligning their upstream sequences, an approach that
has been successful for yeast and bacteria.
Comparisons of potential regulatory regions must therefore reveal more subtle
similarities, such as individual transcription factor binding sites. However, the
low binding specificity of transcription factors results in a high rate of
false-positive predictions in the analysis of genes from metazoan species.
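
The following is a minimal sketch of how binding sites are commonly predicted with a position weight matrix, and of why a permissive score threshold yields many hits. It is an illustration only, not the method used in this work: the count matrix COUNTS, the pseudocount, the uniform background frequency, and the threshold values are all hypothetical.

```python
import math

# Hypothetical count matrix for a 4-bp motif (one dict per motif position).
COUNTS = [
    {"A": 8, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 8, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 8},
    {"A": 4, "C": 0, "G": 0, "T": 4},
]


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert base counts to log2 odds against a uniform background (assumed)."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        matrix.append({
            base: math.log2((column[base] + pseudocount) / total / background)
            for base in "ACGT"
        })
    return matrix


def scan(sequence, matrix, threshold):
    """Return (start, window, score) for every window scoring at least threshold."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


if __name__ == "__main__":
    pwm = log_odds_matrix(COUNTS)
    for start, window, score in scan("CCAGTACC", pwm, threshold=4.0):
        print(start, window, round(score, 2))
```

Lowering the threshold in this sketch admits more windows, which mirrors the trade-off described above: short, degenerate motifs match frequently by chance in long non-coding sequences.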
The number of predicted sites can be reduced by about one order of magnitude, to
a set more likely to have sequence-specific functions, by means of phylogenetic
footprinting, a filter motivated by the biological observation that regulatory
regions are often more highly conserved between species than other
non-coding regions.
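
A conservation filter of this kind can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure used in this work: it assumes a precomputed pairwise alignment (for example human and mouse) given as two gapped, uppercase strings of equal length, predicted sites given as (start, width) in ungapped reference coordinates, and a hypothetical identity cutoff min_identity.

```python
def column_index(aligned_ref):
    """Map each ungapped reference position to its alignment column."""
    return [col for col, base in enumerate(aligned_ref) if base != "-"]


def conserved_sites(sites, aligned_ref, aligned_other, min_identity=0.8):
    """Keep predicted sites whose aligned columns are sufficiently identical.

    sites: iterable of (start, width) in ungapped reference coordinates.
    """
    if len(aligned_ref) != len(aligned_other):
        raise ValueError("aligned sequences must have equal length")
    columns = column_index(aligned_ref)
    kept = []
    for start, width in sites:
        cols = columns[start:start + width]
        if len(cols) < width:
            continue  # site runs past the end of the aligned reference
        # Use the full column span so that gaps inside the site count as mismatches.
        span = range(cols[0], cols[-1] + 1)
        matches = sum(1 for c in span if aligned_ref[c] == aligned_other[c])
        if matches / len(span) >= min_identity:
            kept.append((start, width))
    return kept


if __name__ == "__main__":
    ref = "ACGTACGTAA"
    other = "ACGTACTA-C"
    print(conserved_sites([(0, 4), (6, 4)], ref, other))
```

In this toy alignment the first site lies in a fully conserved stretch and is retained, while the second overlaps substitutions and a gap and is discarded.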
Another technique for selecting presumably functional motifs is motivated by the
observation that groups of transcription factors, rather than single factors,
are required for the function of regulatory regions. It is based on the
hypothesis that the statistical significance of clusters of sites is
correlated with biological function.
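
One simple way to attach a significance value to a cluster of sites is sketched below. The statistic is an assumption made for illustration, since the abstract does not specify one: predicted sites are modeled as occurring independently at a hypothetical background rate per base pair, so the number of sites in a window is treated as Poisson distributed. The window length and rate are hypothetical, and no correction for multiple testing or for anchoring windows at observed sites is applied.

```python
import math


def poisson_tail(k, mean):
    """P(X >= k) for X ~ Poisson(mean)."""
    below = sum(math.exp(-mean) * mean ** i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - below)


def best_cluster(positions, window, rate):
    """Return (start, members, p_value) for the most significant window.

    positions: start coordinates of predicted sites.
    window:    window length in bp (hypothetical parameter).
    rate:      assumed background rate of predicted sites per bp.
    """
    positions = sorted(positions)
    mean = rate * window
    best = None
    for i, start in enumerate(positions):
        # Sites falling in the window that begins at this site.
        members = [p for p in positions[i:] if p < start + window]
        p_value = poisson_tail(len(members), mean)
        if best is None or p_value < best[2]:
            best = (start, members, p_value)
    return best


if __name__ == "__main__":
    sites = [10, 500, 510, 525, 540, 900]
    print(best_cluster(sites, window=50, rate=0.005))
```

Under these assumptions, a window containing many more sites than the background rate predicts receives a small p-value and would be prioritized as a candidate regulatory module.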
We illustrate the combined application of these techniques for the
characterization of putative regulatory regions of IL28A,B and IL29.