Random forest prediction of mutagenicity from empirical physicochemical descriptors

Research output: Contribution to journal › Article › peer-review

73 Citations (Scopus)


Fast-to-calculate empirical physicochemical descriptors were investigated for their ability to predict mutagenicity (positive or negative Ames test) from molecular structure. Fast methods are highly desirable for screening large libraries of compounds. Global molecular descriptors and MOLMAP descriptors of bond properties were used to train random forests. Error percentages as low as 15% and 16% were achieved for an external test set of 472 compounds and for the training set of 4083 structures, respectively. High sensitivity and specificity were observed. Random forests were able to assign meaningful probabilities to the predictions and to explain the predictions in terms of similarities between query structures and compounds in the training set.
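The three figures of merit named in the abstract (error percentage, sensitivity, specificity) can be illustrated with a minimal sketch. The function name `ames_metrics` and the toy labels below are hypothetical and serve only to show the arithmetic; they are not the authors' code or data, and the label convention (1 = Ames-positive, 0 = Ames-negative) is an assumption.

```python
def ames_metrics(y_true, y_pred):
    """Compute error rate, sensitivity and specificity for binary labels.

    Assumed convention (not from the paper): 1 = Ames-positive (mutagenic),
    0 = Ames-negative (non-mutagenic).
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # mutagens predicted positive
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-mutagens predicted negative
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-mutagens predicted positive
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # mutagens predicted negative

    return {
        # Fraction of all compounds that were misclassified
        "error_rate": (fp + fn) / len(pairs),
        # Fraction of true mutagens that were recognised
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Fraction of true non-mutagens that were recognised
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Made-up toy labels, for illustration only (not data from the study)
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
metrics = ames_metrics(y_true, y_pred)
print(metrics)
```

Reporting sensitivity and specificity alongside the overall error rate matters when the two classes are unevenly represented, since a low error rate alone can hide poor recognition of the minority class.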
Original language: Unknown
Pages (from-to): 1-8
Journal: Journal of Chemical Information and Modeling
Issue number: 1
Publication status: Published - 1 Jan 2007
