The large number of genomes being sequenced presents opportunities but also computational challenges. Ideally, annotation would be done by combined statistical alignment and annotation on a phylogenetic tree. This is computationally very demanding, however, and we have tried to address the problem by approximating the full phylogeny with a series of smaller phylogenies obtained by forcing some leaves to be internal nodes. The resulting genealogical structure is intermediate between a spanning tree [zero internal nodes] and a Steiner tree [maximal number of internal nodes]. We call these structures SPANNOIDS; they were previously known in the computer science literature as k-restricted spanning trees. This leads to a computational decomposition of the calculations. The annotation part consists of a model that uses TKF92 as a background model combined with an array of promoter-specific models. An overall annotation is obtained by combining annotation information from many components.
This approach is illustrated on simulated and real data.
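
To make the spanning-tree extreme described above concrete, the following is a minimal sketch and not the authors' implementation. It assumes a small, made-up matrix of pairwise distances between observed sequences, builds a minimum spanning tree with Prim's algorithm, and reports which sequences end up acting as internal nodes. The function names `minimum_spanning_tree` and `forced_internal`, and the toy distances, are hypothetical and chosen only for illustration. Components with more than two leaves, which is what places a spannoid between the spanning tree and the Steiner tree, are not covered here.

```python
from collections import Counter
from typing import List, Tuple


def minimum_spanning_tree(dist: List[List[float]]) -> List[Tuple[int, int]]:
    """Prim's algorithm on a dense, symmetric distance matrix.

    Every node of the returned tree is an observed sequence, so the tree
    has zero unobserved internal nodes (the spanning-tree extreme).
    """
    n = len(dist)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [float("inf")] * n   # cheapest known connection cost per node
    parent = [-1] * n           # tree node providing that connection
    best[0] = 0.0
    edges: List[Tuple[int, int]] = []
    for _ in range(n):
        # Pick the cheapest node that is not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] != -1:
            edges.append((parent[u], u))
        # Relax the connection costs of the remaining nodes.
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
                parent[v] = u
    return edges


def forced_internal(edges: List[Tuple[int, int]]) -> List[int]:
    """Sequences with degree > 1, i.e. leaves forced to act as internal nodes."""
    degree = Counter(node for edge in edges for node in edge)
    return sorted(node for node, d in degree.items() if d > 1)


# Hypothetical pairwise distances between four sequences (illustration only).
toy_dist = [
    [0, 1, 4, 5],
    [1, 0, 2, 6],
    [4, 2, 0, 3],
    [5, 6, 3, 0],
]

tree = minimum_spanning_tree(toy_dist)
print("components (pairs of sequences):", tree)
print("sequences acting as internal nodes:", forced_internal(tree))
```

In this sketch each edge is a component with two leaves, so any likelihood calculation would only ever involve a pair of sequences at a time. Allowing larger components trades more computation per component for a structure closer to a full phylogeny.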