TY - JOUR
T1 - Machine Learning-Enhanced Quantum Chemistry-Assisted Refinement of the Active Site Structure of Metalloproteins
AU - Gigli, Lucia
AU - Silva, José Malanho
AU - Cerofolini, Linda
AU - Macedo, Anjos L.
AU - Geraldes, Carlos F. G. C.
AU - Suturina, Elizaveta A.
AU - Calderone, Vito
AU - Fragai, Marco
AU - Parigi, Giacomo
AU - Ravera, Enrico
AU - Luchinat, Claudio
N1 - Funding Information:
This work has been supported by the Fondazione Cassa di Risparmio di Firenze, by the project “Potentiating the Italian Capacity for Structural Biology Services in Instruct-ERIC - ITACA.SB” (Project no. IR0000009) within the call MUR 3264/2021 PNRR M4/C2/L3.1.1, funded by the European Union NextGenerationEU, by the Ministero dell’Università e della Ricerca - Dipartimenti di Eccellenza 2023–2027 (DICUS 2.0) to the Department of Chemistry “Ugo Schiff” of the University of Florence, by the Fundação para a Ciência e a Tecnologia (FCT-Portugal) for funding UCIBIO project (UIDP/04378/2020 and UIDB/04378/2020), the CQC-IMS project (UIDB/00313/2020 and UIDP/00313/2020), by the Associate Laboratory Institute for Health and Bioeconomy-i4HB Project (LA/P/0140/2020), by the Engineering and Physical Sciences Research Council (grant number EP/W022028/1), as well as by the H2020 research and innovation Program Fragment-Screen (101094131). The authors acknowledge the support and the use of resources of Instruct-ERIC, a landmark ESFRI project, and specifically the CERM/CIRMMP Italy center. The authors also thank FCT-Portugal for the Ph.D. grant awarded to J.M.S. (PD/BD/135180/2017) under the PTNMRPhD Program - NMR applied to chemistry, materials, and biosciences (PD/00065/2013). Support from the Royal Society (IES\R1\201135) is also acknowledged. We also acknowledge the CINECA award to L.G. under the ISCRA initiative, for the availability of high-performance computing resources and support.
Publisher Copyright:
© 2024 American Chemical Society
PY - 2024/6/10
Y1 - 2024/6/10
N2 - Understanding the fine structural details of inhibitor binding at the active site of metalloenzymes can have a profound impact on the rational drug design targeted to this broad class of biomolecules. Structural techniques such as NMR, cryo-EM, and X-ray crystallography can provide bond lengths and angles, but the uncertainties in these measurements can be as large as the range of values that have been observed for these quantities in all the published structures. This uncertainty is far too large to allow for reliable calculations at the quantum chemical (QC) levels for developing precise structure-activity relationships or for improving the energetic considerations in protein-inhibitor studies. Therefore, the need arises to rely upon computational methods to refine the active site structures well beyond the resolution obtained with routine application of structural methods. In a recent paper, we have shown that it is possible to refine the active site of cobalt(II)-substituted MMP12, a metalloprotein that is a relevant drug target, by matching to the experimental pseudocontact shifts (PCS) those calculated using multireference ab initio QC methods. The computational cost of this methodology becomes a significant bottleneck when the starting structure is not sufficiently close to the final one, which is often the case with biomolecular structures. To tackle this problem, we have developed an approach based on a neural network (NN) and a support vector regression (SVR) and applied it to the refinement of the active site structure of oxalate-inhibited human carbonic anhydrase 2 (hCAII), another prototypical metalloprotein target. The refined structure gives a remarkably good agreement between the QC-calculated and the experimental PCS. This study not only contributes to the knowledge of CAII but also demonstrates the utility of combining machine learning (ML) algorithms with QC calculations, offering a promising avenue for investigating other drug targets and complex biological systems in general.
UR - http://www.scopus.com/inward/record.url?scp=85194427839&partnerID=8YFLogxK
U2 - 10.1021/acs.inorgchem.4c01274
DO - 10.1021/acs.inorgchem.4c01274
M3 - Article
C2 - 38805564
AN - SCOPUS:85194427839
SN - 0020-1669
VL - 63
SP - 10713
EP - 10725
JO - Inorganic Chemistry
JF - Inorganic Chemistry
IS - 23
ER -