Chemical Journal of Chinese Universities ›› 2025, Vol. 46 ›› Issue (3): 20240373. doi: 10.7503/cjcu20240373

• Analytical Chemistry •

Prediction of Chemical Bond Dissociation Energies of Small Organic Molecules Based on Random Forest

LUAN Yue, KONG Dingling, GUO Lili, ZHANG Qingyou, ZHOU Yanmei

  1. Henan Engineering Research Center of Industrial Circulating Water Treatment, College of Chemistry and Molecular Sciences, Henan University, Kaifeng 475004, China
  • Received: 2024-07-30 Published: 2025-03-10 Online: 2024-09-12
  • Contact: ZHANG Qingyou, ZHOU Yanmei E-mail: qingyou@vip.henu.edu.cn; zhouym@henu.edu.cn
  • Supported by:
    the National Natural Science Foundation of China (22278112)

Abstract:

A total of 1208 organic molecules containing C, H, O, N, and S atoms were manually collected from the iBonD organic bond energy database, together with the corresponding experimental bond dissociation energies. Bond type descriptors, heteroatom count descriptors, and branching degree descriptors were proposed and combined with previously proposed atom type descriptors to describe the environment around the target bond more comprehensively. Prediction models for bond dissociation energy were built with random forest. The combination of atom type and bond type descriptors around the target bond gave the best predictions, and good results were obtained without the aid of quantum chemistry. The predictions were better than the corresponding results reported in the literature. In addition, an applicability domain algorithm was designed to give a preliminary assessment of prediction quality, the training and test sets were randomly re-partitioned to verify the stability of the model, and the model was compared with a null model to assess its feasibility.

Key words: Bond dissociation energy, Random forest, iBonD, Atom type, Bond type
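
The validation steps summarized in the abstract (random forest regression, repeated random re-partitioning of the training and test sets, and comparison against a trivial baseline) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic integer counts standing in for atom type and bond type descriptors, the target values are generated by a made-up linear rule in arbitrary units, the baseline is taken to be a mean predictor (the paper's baseline may be defined differently), and scikit-learn is used as a convenient implementation rather than the authors' software. The applicability domain algorithm is not reproduced here.

```python
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split


def make_synthetic_data(n_bonds=300, n_descriptors=12, seed=0):
    """Hypothetical stand-in data, NOT the iBonD data set."""
    rng = np.random.default_rng(seed)
    # Integer counts mimic "how many atoms/bonds of type k are near the target bond".
    X = rng.integers(0, 4, size=(n_bonds, n_descriptors)).astype(float)
    # Made-up additive rule plus noise; arbitrary units.
    weights = rng.normal(0.0, 3.0, size=n_descriptors)
    y = 90.0 + X @ weights + rng.normal(0.0, 1.0, size=n_bonds)
    return X, y


def compare_with_baseline(X, y, n_repeats=5, test_size=0.2):
    """Repeat the random train/test split and score both models each time."""
    rf_maes, baseline_maes = [], []
    for seed in range(n_repeats):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=test_size, random_state=seed
        )
        forest = RandomForestRegressor(n_estimators=100, random_state=seed)
        forest.fit(X_tr, y_tr)
        # Assumed baseline: always predict the training-set mean.
        baseline = DummyRegressor(strategy="mean").fit(X_tr, y_tr)
        rf_maes.append(mean_absolute_error(y_te, forest.predict(X_te)))
        baseline_maes.append(mean_absolute_error(y_te, baseline.predict(X_te)))
    return np.array(rf_maes), np.array(baseline_maes)


X, y = make_synthetic_data()
rf_maes, baseline_maes = compare_with_baseline(X, y)
print("random forest MAE per split:", np.round(rf_maes, 2))
print("baseline MAE per split:     ", np.round(baseline_maes, 2))
```

A small spread of the random forest error across splits suggests the model is not sensitive to one particular partition, and an error clearly below the baseline suggests the descriptors carry real signal.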
