An anchor-based model for global multiple alignment of whole genome sequences

Degree Level

masters

Degree Name

M. Sc.

Publisher

Memorial University of Newfoundland

Abstract

Advances in biotechnology have made large numbers of whole genome sequences available. Aligning whole genome sequences is a fundamentally different problem from aligning short sequences, and it has recently attracted intensive research. We propose an anchor-based model for global multiple alignment of whole genome sequences. The model has three main phases. First, an enhanced suffix array method is employed to find anchors. Next, an exact chaining algorithm, based on dynamic programming and the longest common subsequence idea, computes an anchor chain from the weighted anchors. Finally, a progressive multiple alignment method closes the gaps between the anchors. The proposed chaining procedure is based on evolutionary theory and can align whole genome sequences not only for close homologs but also for distantly related species. Combined with the exact suffix array approach, this model can compute partially accurate solutions and generate alignments that are of high quality both computationally and biologically.
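
As an illustration of the chaining phase only, the following is a minimal sketch of weighted anchor chaining by dynamic programming. It is not the algorithm from the thesis: it assumes just two sequences rather than many, treats each anchor as an exact match of a given length with a precomputed weight, and uses a simple quadratic-time recurrence. The names `Anchor` and `chain_anchors` are hypothetical and introduced here for illustration.

```python
from typing import List, NamedTuple


class Anchor(NamedTuple):
    """Hypothetical anchor: an exact match shared by two sequences."""
    pos_a: int     # start position in sequence A
    pos_b: int     # start position in sequence B
    length: int    # match length (assumed >= 1)
    weight: float  # assumed precomputed score for this anchor


def chain_anchors(anchors: List[Anchor]) -> List[Anchor]:
    """Return a maximum-weight chain of non-overlapping, collinear anchors.

    Anchor x may precede anchor y only if x ends before y starts in BOTH
    sequences. This mirrors the longest common subsequence idea: the chain
    must be increasing in both coordinates at once.
    """
    if not anchors:
        return []

    # Sorting by position in A guarantees every valid predecessor of an
    # anchor appears earlier in the list.
    order = sorted(anchors, key=lambda a: (a.pos_a, a.pos_b))
    n = len(order)
    best = [a.weight for a in order]  # best chain weight ending at each anchor
    prev = [-1] * n                   # back-pointers for reconstruction

    for j in range(n):
        for i in range(j):
            x, y = order[i], order[j]
            compatible = (x.pos_a + x.length <= y.pos_a
                          and x.pos_b + x.length <= y.pos_b)
            if compatible and best[i] + y.weight > best[j]:
                best[j] = best[i] + y.weight
                prev[j] = i

    # Walk the back-pointers from the best-scoring end anchor.
    end = max(range(n), key=best.__getitem__)
    chain = []
    while end != -1:
        chain.append(order[end])
        end = prev[end]
    chain.reverse()
    return chain


example = [
    Anchor(0, 0, 5, 5.0),
    Anchor(6, 20, 4, 4.0),   # collinear with the first anchor only
    Anchor(8, 6, 3, 3.0),
    Anchor(12, 10, 6, 6.0),
]
chain = chain_anchors(example)
```

In this example the anchor at `(6, 20)` is left out of the chain: keeping it would block the two later anchors, whose combined weight is higher. The regions between consecutive chained anchors are the gaps that a subsequent alignment step would need to close.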
