TY - JOUR
T1 - Advanced Genetic Programming vs. State-of-the-Art AutoML in Imbalanced Binary Classification
AU - Frank, Franz
AU - Bacao, Fernando
N1 - info:eu-repo/grantAgreement/FCT/3599-PPCDT/DSAIPA%2FDS%2F0116%2F2019/PT#
info:eu-repo/grantAgreement/FCT/6817 - DCRRNI ID/UIDB%2F04152%2F2020/PT#
Frank, F., & Bacao, F. (2023). Advanced Genetic Programming vs. State-of-the-Art AutoML in Imbalanced Binary Classification. Emerging Science Journal, 7(4), 1349-1363. https://doi.org/10.28991/ESJ-2023-07-04-021
Data Availability Statement: Publicly available datasets were analyzed in this study. This data can be found here: https://github.com/joaopfonseca/mlresearch.
This work was supported by a grant of the Portuguese Foundation for Science and Technology (“Fundação para a Ciência e a Tecnologia”), DSAIPA/DS/0116/2019, and project UIDB/04152/2020, Centro de Investigação em Gestão de Informação (MagIC).
PY - 2023/8/1
Y1 - 2023/8/1
AB - The objective of this article is to provide a comparative analysis of two novel genetic programming (GP) techniques, differentiable Cartesian genetic programming for artificial neural networks (DCGPANN) and geometric semantic genetic programming (GSGP), with state-of-the-art automated machine learning (AutoML) tools, namely Auto-Keras, Auto-PyTorch, and Auto-Sklearn. While all of these techniques were compared to several baseline algorithms upon their introduction, research still lacks direct comparisons between them, especially of the GP approaches with state-of-the-art AutoML. This study intends to fill this gap in order to analyze the true potential of GP for AutoML. The performance of the different tools is assessed by applying them to 20 benchmark datasets from the field of imbalanced binary classification, a frequent and challenging problem. The tools are compared across four categories: average performance, maximum performance, standard deviation of performance, and generalization ability, with F1-score, G-mean, and AUC used as evaluation metrics. The analysis finds that the GP techniques, while unable to completely outperform state-of-the-art AutoML, are already a very competitive alternative. These advanced GP tools therefore offer a new and promising approach for practitioners developing machine learning (ML) models.
KW - Genetic Programming
KW - Automated Machine Learning
KW - AutoML
KW - Imbalanced Binary Classification
UR - https://github.com/joaopfonseca/ml-research
UR - http://www.scopus.com/inward/record.url?scp=85168519835&partnerID=8YFLogxK
U2 - 10.28991/ESJ-2023-07-04-021
DO - 10.28991/ESJ-2023-07-04-021
M3 - Article
SN - 2610-9182
VL - 7
SP - 1349
EP - 1363
JO - Emerging Science Journal
JF - Emerging Science Journal
IS - 4
ER -