TY - JOUR
T1 - Programming languages for data-Intensive HPC applications: A systematic mapping study
AU - Amaral, Vasco
AU - Norberto, Beatriz
AU - Goulão, Miguel
AU - Aldinucci, Marco
AU - Benkner, Siegfried
AU - Bracciali, Andrea
AU - Carreira, Paulo
AU - Celms, Edgars
AU - Correia, Luís
AU - Grelck, Clemens
AU - Karatza, Helen
AU - Kessler, Christoph
AU - Kilpatrick, Peter
AU - Martiniano, Hugo
AU - Mavridis, Ilias
AU - Pllana, Sabri
AU - Respício, Ana
AU - Simão, José
AU - Veiga, Luís
AU - Visa, Ari
N1 - This work is a result of activities from COST Action IC1406 High-Performance Modelling and Simulation for Big Data Applications (cHiPSet), funded by the European Cooperation in Science and Technology.
FCT, Portugal, for grants: NOVA LINCS Research Laboratory Ref. UID/CEC/04516/2019;
INESC-ID Ref. UID/CEC/50021/2019;
BioISI Ref. UID/MULTI/04046/2103;
LASIGE Research Unit Ref. UID/CEC/00408/2019.
PY - 2020/3/1
Y1 - 2020/3/1
N2 - A major challenge in modelling and simulation is the need to combine expertise in both software technologies and a given scientific domain. When High-Performance Computing (HPC) is required to solve a scientific problem, software development becomes a problematic issue. Considering the complexity of the software for HPC, it is useful to identify programming languages that can be used to alleviate this issue. Because the existing literature on the topic of HPC is very dispersed, we performed a Systematic Mapping Study (SMS) in the context of the European COST Action cHiPSet. This literature study maps characteristics of various programming languages for data-intensive HPC applications, including category, typical user profiles, effectiveness, and type of articles. We organised the SMS in two phases. In the first phase, relevant articles were identified using an automated keyword-based search in eight digital libraries. This led to an initial sample of 420 papers, which was then narrowed down in a second phase by human inspection of article abstracts, titles and keywords to 152 relevant articles published in the period 2006–2018. The analysis of these articles enabled us to identify 26 programming languages referred to in 33 of the relevant articles. We compared the outcome of the mapping study with results of our questionnaire-based survey that involved 57 HPC experts. The mapping study and the survey revealed that the desired features of programming languages for data-intensive HPC applications are portability, performance and usability. Furthermore, we observed that the majority of the programming languages used in the context of data-intensive HPC applications are text-based general-purpose programming languages. Typically, these have a steep learning curve, which makes them difficult to adopt. We believe that the outcome of this study will inspire future research and development in programming languages for data-intensive HPC applications.
AB - A major challenge in modelling and simulation is the need to combine expertise in both software technologies and a given scientific domain. When High-Performance Computing (HPC) is required to solve a scientific problem, software development becomes a problematic issue. Considering the complexity of the software for HPC, it is useful to identify programming languages that can be used to alleviate this issue. Because the existing literature on the topic of HPC is very dispersed, we performed a Systematic Mapping Study (SMS) in the context of the European COST Action cHiPSet. This literature study maps characteristics of various programming languages for data-intensive HPC applications, including category, typical user profiles, effectiveness, and type of articles. We organised the SMS in two phases. In the first phase, relevant articles were identified using an automated keyword-based search in eight digital libraries. This led to an initial sample of 420 papers, which was then narrowed down in a second phase by human inspection of article abstracts, titles and keywords to 152 relevant articles published in the period 2006–2018. The analysis of these articles enabled us to identify 26 programming languages referred to in 33 of the relevant articles. We compared the outcome of the mapping study with results of our questionnaire-based survey that involved 57 HPC experts. The mapping study and the survey revealed that the desired features of programming languages for data-intensive HPC applications are portability, performance and usability. Furthermore, we observed that the majority of the programming languages used in the context of data-intensive HPC applications are text-based general-purpose programming languages. Typically, these have a steep learning curve, which makes them difficult to adopt. We believe that the outcome of this study will inspire future research and development in programming languages for data-intensive HPC applications.
KW - Big data
KW - Data-intensive applications
KW - Domain-Specific language (DSL)
KW - General-Purpose language (GPL)
KW - High performance computing (HPC)
KW - Programming languages
KW - Systematic mapping study (SMS)
UR - http://www.scopus.com/inward/record.url?scp=85076201522&partnerID=8YFLogxK
U2 - 10.1016/j.parco.2019.102584
DO - 10.1016/j.parco.2019.102584
M3 - Article
AN - SCOPUS:85076201522
VL - 91
JO - Parallel Computing
JF - Parallel Computing
SN - 0167-8191
M1 - 102584
ER -