TY - JOUR
T1 - Machine Learning for the Prediction of Ionization Potential and Electron Affinity Energies Obtained by Density Functional Theory
AU - Pereira, Florbela
N1 - Funding Information:
info:eu-repo/grantAgreement/FCT/6817 - DCRRNI ID/UIDB%2F50006%2F2020/PT#
FP gratefully acknowledges FCT for an Assistant Research Position (CEECIND/01649/2021).
We thank Chemaxon Ltd. for access to JChem and Marvin. The author thanks João Aires‐de‐Sousa (NOVA School of Science and Technology, NOVA University of Lisbon) for many interesting discussions and suggestions about different aspects of this work.
Publisher Copyright:
© 2023 Wiley-VCH GmbH.
PY - 2023/4/25
Y1 - 2023/4/25
N2 - Quantum chemical (QC) calculations based on density functional theory (DFT) provide increasingly accurate estimates of various properties, but at a relatively high computational cost. Machine learning (ML) techniques can extract new knowledge from the resulting large volumes of data, yielding empirical models that rapidly predict QC results in new situations. Here, ML algorithms were explored for the fast estimation of ionization potential (IP) and electron affinity (EA) energies calculated by DFT using the B3LYP and PBE0 functionals with the 6-31G** basis set, based on molecular descriptors generated from DFT-optimized geometries. Databases of 9,410 and 9,627 small organic structures were used for modelling IP and EA energies, respectively. Several ML algorithms were screened, including random forest, support vector machines, deep learning multilayer perceptron networks, and light gradient-boosting machine. The best performance was achieved with a consensus regression model, which predicted external test sets of 972 and 963 small organic molecules with mean absolute errors of up to 0.23 eV and 0.32 eV for IP and EA energies, respectively.
AB - Quantum chemical (QC) calculations based on density functional theory (DFT) provide increasingly accurate estimates of various properties, but at a relatively high computational cost. Machine learning (ML) techniques can extract new knowledge from the resulting large volumes of data, yielding empirical models that rapidly predict QC results in new situations. Here, ML algorithms were explored for the fast estimation of ionization potential (IP) and electron affinity (EA) energies calculated by DFT using the B3LYP and PBE0 functionals with the 6-31G** basis set, based on molecular descriptors generated from DFT-optimized geometries. Databases of 9,410 and 9,627 small organic structures were used for modelling IP and EA energies, respectively. Several ML algorithms were screened, including random forest, support vector machines, deep learning multilayer perceptron networks, and light gradient-boosting machine. The best performance was achieved with a consensus regression model, which predicted external test sets of 972 and 963 small organic molecules with mean absolute errors of up to 0.23 eV and 0.32 eV for IP and EA energies, respectively.
KW - density functional theory (DFT)
KW - electron affinity energy
KW - ionization potential energy
KW - machine learning (ML)
KW - quantitative structure property relationships (QSPR)
UR - https://www.scopus.com/record/display.uri?eid=2-s2.0-85153742546&origin=resultslist&sort=plf-f&src=s&sid=c80d858e6735a451b3d7374f67c44dfa&sot=b&sdt=b&s=DOI%2810.1002%2Fslct.202300036%29&sl=13&sessionSearchId=c80d858e6735a451b3d7374f67c44dfa
U2 - 10.1002/slct.202300036
DO - 10.1002/slct.202300036
M3 - Article
SN - 2365-6549
VL - 8
JO - ChemistrySelect
JF - ChemistrySelect
IS - 16
M1 - e202300036
ER -