Gene Prediction by Pattern Recognition and Homology Search
Ying Xu and Edward C. Uberbacher
Proceedings of the
Fourth International Conference on Intelligent Systems for
Molecular Biology, edited by David J. States,
Pankaj Agarwal, Terry Gaasterland, Lawrence Hunter, &
Randall F. Smith (AAAI Press, 1996), pages 241-251.
Abstract
This paper presents an algorithm for combining pattern recognition-based exon
prediction and database homology search in gene model construction. The goal is to
use homologous genes or partial genes existing in the database as reference models
while constructing (multiple) gene models from exon candidates predicted by pattern
recognition methods. A single unified framework handles gene modeling across the
full range of cases, from strong homology to no homology in the database. To
maximally use the homology information available, the algorithm applies homology
on three levels: (1) exon candidate evaluation, (2) gene-segment construction with a
reference model, and (3) (complete) gene modeling. Preliminary tests of the
algorithm show that (a) perfect gene modeling can be
expected when the initial exon predictions are reasonably good and strong
homology exists in the database; (b) homology, not necessarily strong, generally
improves the accuracy of gene modeling; and (c) multiple gene modeling becomes
feasible when homology exists in the database for the genes involved.
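
To make the idea of combining the two evidence sources concrete, the following is a minimal illustrative sketch and not the algorithm of this paper. It assumes that each exon candidate carries a pattern-recognition score and a homology score, that the two are combined by a simple weighted sum, and that a gene model is the highest-scoring chain of non-overlapping candidates. The names Exon, combined_score, best_gene_model, and the weight w are all hypothetical, as are the example coordinates and scores.

```python
from bisect import bisect_left
from typing import List, NamedTuple, Tuple


class Exon(NamedTuple):
    """Hypothetical exon candidate; coordinates are inclusive."""
    start: int
    end: int
    pattern: float   # assumed score from a pattern-recognition predictor
    homology: float  # assumed score from a database homology search


def combined_score(e: Exon, w: float) -> float:
    # Assumption: evidence is combined by a weighted sum.
    return e.pattern + w * e.homology


def best_gene_model(cands: List[Exon], w: float = 1.0) -> Tuple[float, List[Exon]]:
    """Return the best-scoring chain of non-overlapping exon candidates."""
    exons = sorted(cands, key=lambda e: e.end)
    ends = [e.end for e in exons]
    # best[i] holds the best (score, chain) using only the first i exons.
    best: List[Tuple[float, List[Exon]]] = [(0.0, [])]
    for i, e in enumerate(exons):
        # j = number of earlier exons that end strictly before e starts.
        j = bisect_left(ends, e.start, 0, i)
        take = best[j][0] + combined_score(e, w)
        if take > best[i][0]:
            best.append((take, best[j][1] + [e]))
        else:
            best.append(best[i])
    return best[-1]


# Made-up candidates: A and B overlap, C is compatible with either.
A = Exon(1, 100, pattern=0.6, homology=0.8)
B = Exon(50, 150, pattern=0.7, homology=0.0)
C = Exon(200, 300, pattern=0.5, homology=0.4)

score_no_hom, model_no_hom = best_gene_model([A, B, C], w=0.0)
score_hom, model_hom = best_gene_model([A, B, C], w=1.0)
```

In this toy input, ignoring homology (w = 0) selects the chain B, C because B has the higher pattern score, whereas including homology (w = 1) selects A, C. This mirrors, in a very simplified form, result (b): homology evidence can change which exon candidates enter the gene model.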