Multiobjective characteristic-based framework for very-large multiple sequence alignment

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

In the literature, we can find several heuristics for solving the multiple sequence alignment problem. The vast majority of them use flags to modify certain alignment parameters; if no flags are given, the aligner runs with its default parameter configuration, which is often not the optimal one. In this work, we propose a framework that, depending on the biological characteristics of the input dataset, runs the aligner with the best parameter configuration found for another dataset with similar biological characteristics, improving the accuracy and conservation of the resulting alignment. To train the framework, we use three well-known multiobjective evolutionary algorithms: NSGA-II, IBEA, and MOEA/D. We then perform a comparative study among several aligners proposed in the literature and the characteristic-based versions of Kalign, MAFFT, and MUSCLE on widely used benchmarks (PREFAB v4.0 and SABmark v1.65) and on very large benchmarks with thousands of unaligned sequences (HomFam).
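The selection step described in the abstract can be illustrated with a minimal sketch. It assumes a nearest-neighbour lookup over a small table of previously trained datasets. The characteristic names (`num_seqs`, `mean_len`, `identity`), the parameter names (`gap_open`, `gap_extend`), all numeric values, and the distance measure are hypothetical placeholders chosen for illustration; they are not taken from the article, and the training of the table with NSGA-II, IBEA, or MOEA/D is not shown.

```python
import math

# Hypothetical training table: (dataset characteristics, best configuration found).
# All names and numbers are illustrative assumptions, not values from the article.
TRAINED = [
    ({"num_seqs": 20, "mean_len": 250, "identity": 0.25},
     {"gap_open": 1.8, "gap_extend": 0.2}),
    ({"num_seqs": 5000, "mean_len": 120, "identity": 0.45},
     {"gap_open": 1.2, "gap_extend": 0.1}),
    ({"num_seqs": 50, "mean_len": 900, "identity": 0.70},
     {"gap_open": 2.5, "gap_extend": 0.4}),
]


def to_vector(characteristics):
    """Map characteristics to a point; counts and lengths are log-scaled
    so that no single characteristic dominates the distance."""
    return (
        math.log10(characteristics["num_seqs"]),
        math.log10(characteristics["mean_len"]),
        characteristics["identity"],
    )


def select_configuration(characteristics, trained=TRAINED):
    """Return the configuration stored for the most similar trained dataset,
    using Euclidean distance (an assumed similarity measure)."""
    query = to_vector(characteristics)
    _, best = min(trained, key=lambda entry: math.dist(query, to_vector(entry[0])))
    return best


if __name__ == "__main__":
    new_dataset = {"num_seqs": 25, "mean_len": 230, "identity": 0.30}
    print(select_configuration(new_dataset))
```

In such a scheme, the returned configuration would then be passed to the aligner in place of its default parameters.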

Original language: English
Pages (from-to): 719-736
Journal: Applied Soft Computing Journal
Volume: 69
Early online date: 2017
Publication status: Published - 2018

Keywords

  • Characteristic-based
  • Evolutionary algorithms
  • Multiobjective optimization
  • Multiple sequence alignment
