A hybrid MPI/OpenMP parallel implementation of NSGA-II for finding patterns in protein sequences

David L. González-Álvarez, Miguel A. Vega-Rodríguez, Álvaro Rubio-Largo

Research output: Contribution to journal › Article › peer-review

3 Citations (Scopus)

Abstract

Since the late 1970s, when the first DNA-based genome was sequenced, the field of biology has experienced significant growth in the amount of data that needs to be processed. It long ago became impractical to analyze all this information manually, creating a great need for new techniques, algorithms and strategies to facilitate this work. Within the vast world of bioinformatics, we focus on proteomics and, more specifically, on the discovery of small repeated common patterns in sets of protein sequences that may represent some biological functionality. When a large number of sequences is analyzed, the problem exhibits non-deterministic polynomial time complexity, which implies that we could benefit from combining high-performance computing and computational intelligence techniques. In this paper, we address the discovery of repeated common patterns as a multiobjective optimization problem by means of a hybrid MPI/OpenMP approach that parallelizes a well-known multiobjective metaheuristic, the fast non-dominated sorting genetic algorithm (NSGA-II). Our main objective is to combine the benefits of the shared-memory and distributed-memory programming paradigms to discover patterns accurately and efficiently. Experiments conducted on six different datasets, comparisons with other well-known biological tools, and the obtained speed-ups and efficiencies show that our approach achieves significant performance in terms of both parallel and biological results.
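The ranking step that gives NSGA-II its name, fast non-dominated sorting, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract does not state the objectives used, so the objective tuples here are hypothetical, all objectives are assumed to be maximized, and the function names `dominates` and `fast_non_dominated_sort` are chosen only for this example.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a Pareto-dominates b (assumption: every objective is maximized)."""
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


def fast_non_dominated_sort(points: List[Sequence[float]]) -> List[List[int]]:
    """Group solution indices into Pareto fronts; front 0 is the non-dominated set."""
    n = len(points)
    dominated_sets = [[] for _ in range(n)]  # indices that solution p dominates
    domination_count = [0] * n               # how many solutions dominate p
    fronts: List[List[int]] = [[]]

    for p in range(n):
        for q in range(n):
            if p == q:
                continue
            if dominates(points[p], points[q]):
                dominated_sets[p].append(q)
            elif dominates(points[q], points[p]):
                domination_count[p] += 1
        if domination_count[p] == 0:
            fronts[0].append(p)

    # Peel off fronts one at a time: removing a front releases the
    # solutions that were dominated only by members of earlier fronts.
    i = 0
    while fronts[i]:
        next_front = []
        for p in fronts[i]:
            for q in dominated_sets[p]:
                domination_count[q] -= 1
                if domination_count[q] == 0:
                    next_front.append(q)
        fronts.append(next_front)
        i += 1

    return fronts[:-1]  # drop the trailing empty front


# Hypothetical two-objective scores for six candidate patterns.
scores = [(1, 5), (2, 4), (3, 3), (1, 1), (2, 2), (0, 0)]
fronts = fast_non_dominated_sort(scores)
```

With these made-up scores, the first three candidates are mutually non-dominated and form the first front; the remaining candidates fall into successively worse fronts.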

Original language: English
Pages (from-to): 2285-2312
Number of pages: 28
Journal: Journal of Supercomputing
Volume: 73
Issue number: 6
Publication status: Published - 1 Jun 2017

Keywords

  • Bioinformatics
  • Evolutionary computation
  • Multiobjective optimization
  • Parallel computing
  • Proteins
