Machine learning of chemical reactivity from databases of organic reactions

Research output: Contribution to journal › Article › peer-review

18 Citations (Scopus)


Databases of chemical reactions contain knowledge about the reactivity of specific reagents. Although information is generally available explicitly only for compounds reported to react, information can also be derived about substructures that do not react in the reported reactions. Both types of information (positive and negative) can be used to train machine learning models to predict whether or not a compound reacts with a specific reagent. The whole process was implemented with two databases of reactions, one involving BuNH2 as the reagent and the other NaCNBH3. Negative information was derived using MOLMAP molecular descriptors, and classification models were developed with Random Forests, also based on MOLMAP descriptors. The MOLMAP descriptors were based exclusively on calculated physicochemical features of molecules. Correct predictions were achieved for ∼90% of the cases in independent test sets. While NaCNBH3 is a selective reducing reagent widely used in organic synthesis, BuNH2 is a nucleophile that mimics the reactivity of the lysine side chain (involved in an initiating step of the mechanism leading to skin sensitization).
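The classification step described in the abstract (train on positive and negative examples, then score an independent test set) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the four-number feature vectors are random toy data standing in for MOLMAP descriptors, the ensemble uses bootstrap-sampled one-split decision stumps with random feature subsets as a simplified stand-in for Random Forests, and all function names (`fit_stump`, `fit_forest`, `make_toy_data`, and so on) are hypothetical.

```python
import random

def fit_stump(X, y, features):
    """Return (feature, threshold, label_if_above) with the best training accuracy."""
    best = None  # (accuracy, feature, threshold, label_if_above)
    for f in features:
        for t in sorted({row[f] for row in X}):
            for above in (0, 1):
                hits = sum((above if row[f] >= t else 1 - above) == label
                           for row, label in zip(X, y))
                acc = hits / len(y)
                if best is None or acc > best[0]:
                    best = (acc, f, t, above)
    return best[1:]

def stump_predict(stump, row):
    f, t, above = stump
    return above if row[f] >= t else 1 - above

def fit_forest(X, y, n_trees=25, n_features=2, seed=0):
    """Bagged stumps: each one sees a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]           # bootstrap sample
        feats = rng.sample(range(len(X[0])), n_features)   # random feature subset
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest

def forest_predict(forest, row):
    """Majority vote: 1 = predicted to react, 0 = predicted not to react."""
    votes = sum(stump_predict(s, row) for s in forest)
    return 1 if 2 * votes > len(forest) else 0

def make_toy_data(n, rng):
    """Toy stand-in for descriptors: feature 0 carries the signal, the rest are noise."""
    X, y = [], []
    for _ in range(n):
        label = rng.randrange(2)
        signal = label + rng.uniform(-0.3, 0.3)
        X.append([signal, rng.random(), rng.random(), rng.random()])
        y.append(label)
    return X, y

rng = random.Random(42)
X_train, y_train = make_toy_data(80, rng)
X_test, y_test = make_toy_data(40, rng)   # independent test set
forest = fit_forest(X_train, y_train)
preds = [forest_predict(forest, row) for row in X_test]
accuracy = sum(p == t for p, t in zip(preds, y_test)) / len(y_test)
print(f"fraction of test cases predicted correctly: {accuracy:.2f}")
```

The printed fraction is the same kind of figure as the ∼90% reported above, although the number obtained on this toy data says nothing about the published models. In practice a library implementation of Random Forests would replace the hand-written stumps.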

Original language: English
Pages (from-to): 419-429
Number of pages: 11
Journal: Journal of Computer-Aided Molecular Design
Issue number: 7
Publication status: Published - 3 Jun 2009


Keywords

  • Chemical reactivity
  • Databases
  • Electrophilicity
  • Machine learning

