Chemical Journal of Chinese Universities

• Research Article •

Prediction of Dihydrofolate Reductase Inhibitors Activity Using Machine Learning Methods

CHEN Xiao-Mei1, RAO Han-Bing1, HUANG Wen-Li2, LI Ze-Rong1*

    1. College of Chemistry,
    2. Institute for Nanobiomedical Technology and Membrane Biology, Sichuan University, Chengdu 610064, China
  • Received: 2006-09-15 Online: 2007-11-10 Published: 2007-11-10
  • Contact: LI Ze-Rong

Abstract: Four machine learning methods, namely support vector machine (SVM), artificial neural network, regularized logistic regression, and k-nearest neighbor, were used to build activity classification models for a set of 761 dihydrofolate reductase (DHFR) inhibitors. Constitutional and topological descriptors were calculated to characterize the molecular structures and physicochemical properties of the compounds. The training set was designed with the Kennard-Stone method, and features were selected by Metropolis Monte Carlo simulated annealing. SVM outperformed the other machine learning methods used in this study, and the best SVM model after feature selection reached a prediction accuracy of 91.62%. This suggests that SVM, combined with proper training set design and feature selection, is well suited to classifying the activity of a diverse set of DHFR inhibitors.

Key words: Dihydrofolate reductase inhibitor, Support Vector Machine, Molecular descriptor
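
The abstract names two preprocessing steps without giving their details: Kennard-Stone training set design and Metropolis Monte Carlo simulated annealing for feature selection. The sketch below illustrates the general form of both techniques in plain Python. It is not the authors' implementation. The function names `kennard_stone` and `anneal_features`, the Euclidean distance, the geometric cooling schedule, and the single-feature flip move are all assumptions made for illustration, and `score` stands in for whatever model-quality measure (for example, cross-validated classifier accuracy) would be used in practice.

```python
import math
import random


def kennard_stone(points, n_select):
    """Return indices of n_select points picked by the Kennard-Stone rule.

    points is a list of equal-length numeric tuples (descriptor vectors).
    Euclidean distance is an assumption; the abstract does not name a metric.
    """
    n = len(points)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and len(points)")

    # Start from the two points that are farthest apart.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: math.dist(points[pair[0]], points[pair[1]]),
    )
    selected = [first, second]
    remaining = set(range(n)) - {first, second}

    # Distance from each unselected point to its nearest selected point.
    nearest = {
        i: min(math.dist(points[i], points[s]) for s in selected)
        for i in remaining
    }

    while len(selected) < n_select:
        # Add the point whose nearest selected neighbour is farthest away.
        chosen = max(sorted(remaining), key=lambda i: nearest[i])
        selected.append(chosen)
        remaining.remove(chosen)
        nearest.pop(chosen)
        for i in remaining:
            nearest[i] = min(nearest[i], math.dist(points[i], points[chosen]))
    return selected


def anneal_features(n_features, score, n_steps=500,
                    t_start=1.0, t_end=0.01, seed=0):
    """Search for a feature mask that maximizes score(mask).

    score is a caller-supplied function (hypothetical here) that maps a list
    of booleans, one per feature, to a number where higher is better.
    """
    rng = random.Random(seed)
    mask = [rng.random() < 0.5 for _ in range(n_features)]
    current = score(mask)
    best_mask, best = mask[:], current

    for step in range(n_steps):
        # Geometric cooling from t_start to t_end (an assumed schedule).
        t = t_start * (t_end / t_start) ** (step / max(1, n_steps - 1))

        # Propose a neighbour by switching one feature in or out.
        candidate = mask[:]
        k = rng.randrange(n_features)
        candidate[k] = not candidate[k]
        new = score(candidate)

        # Metropolis criterion: always accept an improvement, and accept a
        # worse mask with probability exp((new - current) / t).
        if new >= current or rng.random() < math.exp((new - current) / t):
            mask, current = candidate, new
            if current > best:
                best_mask, best = mask[:], current
    return best_mask, best
```

For example, `kennard_stone([(0,), (1,), (2,), (10,), (5,)], 3)` returns `[0, 3, 4]`: the two extremes are taken first, then the point farthest from both. In a workflow like the one the abstract describes, the selected indices would form the training set, and `anneal_features` would be run with a `score` that trains and validates a classifier on the masked descriptors.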
