Identification of Regulatory Regions which
Confer Muscle-Specific Gene Expression
Wyeth W. Wasserman, James W. Fickett
Journal of Molecular Biology
278(1): 167-181 (April 24, 1998)
Abstract
For many newly sequenced genes, sequence analysis of the
putative protein yields no clue to its function. It would be
beneficial to be able to identify in the genome the regulatory
regions that confer temporal and spatial expression patterns
for the uncharacterized genes. Additionally, it would be
advantageous to identify regulatory regions within genes of
known expression pattern without performing the costly and
time-consuming laboratory studies now required. To achieve
these goals, the wealth of case studies performed over the
past 15 years will have to be collected into predictive
models of expression. Extensive studies of genes expressed
in skeletal muscle have identified specific transcription
factors that bind to regulatory elements to control gene
expression. However, potential binding sites for these
factors occur so frequently that a gene without one is
rare. Analysis of experimentally
determined muscle regulatory sequences indicates that
muscle expression requires multiple elements in close
proximity. A model is generated with predictive capability for
identifying these muscle-specific regulatory modules.
Phylogenetic footprinting, the identification of sequences
conserved between distantly related species, complements
the statistical predictions. Through the use of logistic
regression analysis, the model promises to be easily
modified to take advantage of the elucidation of additional
factors, cooperation rules, and spacing constraints.
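
The idea of combining several nearby binding sites through logistic
regression can be illustrated with a minimal sketch. It is not the
model from the paper: the two consensus motifs, the window size, the
weights, and the bias below are all hypothetical values chosen for
illustration, and simple consensus matching stands in for whatever
site-scoring scheme a real model would use.

```python
import math

# Illustrative consensus motifs only (assumption, not taken from the abstract).
MOTIFS = {
    "E-box": "CANNTG",
    "CArG": "CCWWWWWWGG",
}

# Subset of IUPAC nucleotide codes needed for the motifs above.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT", "W": "AT"}


def count_sites(seq, motif):
    """Count positions where seq matches an IUPAC consensus motif."""
    seq = seq.upper()
    n = len(motif)
    return sum(
        all(seq[i + j] in IUPAC[m] for j, m in enumerate(motif))
        for i in range(len(seq) - n + 1)
    )


def window_features(window):
    """One 0/1 feature per motif: is at least one site present in the window?"""
    return [1.0 if count_sites(window, m) > 0 else 0.0 for m in MOTIFS.values()]


def module_probability(features, weights, bias):
    """Logistic function applied to a weighted sum of the features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def scan_modules(seq, weights, bias, window=200, step=50):
    """Slide a window along seq and return (start, probability) pairs."""
    last_start = max(1, len(seq) - window + 1)
    return [
        (start, module_probability(window_features(seq[start:start + window]),
                                   weights, bias))
        for start in range(0, last_start, step)
    ]


# Hypothetical weights: a single site is not enough to cross 0.5,
# but two different sites in the same window are.
WEIGHTS = [1.0, 1.0]
BIAS = -1.5
```

With these made-up weights, a window containing only one kind of site
scores below 0.5 and a window containing both scores above it, which
mirrors the observation that isolated sites are common while clusters
of sites are informative. Adding a factor amounts to adding a motif and
a weight, and in practice the weights would be fitted to experimentally
characterized regulatory regions rather than set by hand.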