TY - JOUR
T1 - Multiobjective characteristic-based framework for very-large multiple sequence alignment
AU - Rubio-Largo, Álvaro
AU - Vanneschi, Leonardo
AU - Castelli, Mauro
AU - Vega-Rodríguez, Miguel A.
N1 - Rubio-Largo, Á., Vanneschi, L., Castelli, M., & Vega-Rodríguez, M. A. (2018). Multiobjective characteristic-based framework for very-large multiple sequence alignment. Applied Soft Computing Journal, 69, 719-736. [Advanced online publication on 27 June 2017] DOI: 10.1016/j.asoc.2017.06.022
PY - 2018
Y1 - 2018
N2 - In the literature, we can find several heuristics for solving the multiple sequence alignment problem. The vast majority of them use flags to modify certain alignment parameters; however, if no flags are used, the aligner runs with the default parameter configuration, which is often not the optimal one. In this work, we propose a framework that, depending on the biological characteristics of the input dataset, runs the aligner with the best parameter configuration found for another dataset that has similar biological characteristics, improving the accuracy and conservation of the obtained alignment. To train the framework, we use three well-known multiobjective evolutionary algorithms: NSGA-II, IBEA, and MOEA/D. Then, we perform a comparative study between several aligners proposed in the literature and the characteristic-based versions of Kalign, MAFFT, and MUSCLE, when solving widely used benchmarks (PREFAB v4.0 and SABmark v1.65) and very large benchmarks with thousands of unaligned sequences (HomFam).
KW - Characteristic-based
KW - Evolutionary algorithms
KW - Multiobjective optimization
KW - Multiple sequence alignment
UR - http://www.scopus.com/inward/record.url?scp=85023618707&partnerID=8YFLogxK
U2 - 10.1016/j.asoc.2017.06.022
DO - 10.1016/j.asoc.2017.06.022
M3 - Article
AN - SCOPUS:85023618707
SN - 1568-4946
VL - 69
SP - 719
EP - 736
JO - Applied Soft Computing Journal
JF - Applied Soft Computing Journal
ER -