Chemical Journal of Chinese Universities ›› 2023, Vol. 44 ›› Issue (7): 20230165. doi: 10.7503/cjcu20230165

• Research Article •

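A Low-Generalization-Error Model for Predicting the Power Conversion Efficiency of Organic Solar Cells Based on Molecular Properties and Device Fabrication

张妍, 蒋行健, 刘明, 郑植, 张勇

  1. School of Materials Science and Engineering, Harbin Institute of Technology, Harbin 150001
  • Received: 2023-04-01 Published: 2023-07-10 Released online: 2023-05-05
  • Corresponding author: 张勇 E-mail: yongzhang@hit.edu.cn
  • Funding:
    National Key Research and Development Program of China (2021YFE0105800)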

Predict Efficiency of Organic Solar Cell with Low Generalization Error Based on Molecular Property and Device Fabrication

ZHANG Yan, JIANG Xingjian, LIU Ming, ZHENG Zhi, ZHANG Yong

  1. School of Materials Science and Engineering, Harbin Institute of Technology, Harbin 150001, China
  • Received: 2023-04-01 Online: 2023-05-05 Published: 2023-07-10
  • Contact: ZHANG Yong E-mail: yongzhang@hit.edu.cn
  • Supported by:
    the National Key Research and Development Program of China (2021YFE0105800)

Abstract (Chinese version):

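Organic solar cells have been a very active research field in recent years. Strategies for improving their power conversion efficiency fall into two main categories: developing new donors or acceptors, and optimizing the device fabrication process. Because the influencing factors are numerous and interact through complex mechanisms, it is almost impossible to build a complete theory that predicts the power conversion efficiency of a device, and machine learning may be a potential solution. In this work, parameters describing molecular properties are combined with device fabrication parameters to construct the dataset. To reduce the generalization error, models were built with random forest, support vector machine, and multilayer perceptron algorithms, of which random forest showed the best performance. After further optimization of the random forest, the mean test-set R² over 100 different random states converged to 0.9012, and quantitative importances of the dataset's parameters were obtained. The study finds that how the dataset is constructed plays a crucial role in the performance and conclusions of a machine learning model.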

Key words (Chinese version): Organic solar cell, Machine learning, Random forest, Generalization error

Abstract:

Organic solar cells (OSCs) have been a very active research field in recent years. The two main strategies for improving them are developing novel donor or acceptor materials and optimizing device fabrication. Because the influencing factors are numerous and interact through complicated mechanisms, it is almost impossible to build a complete theory that describes and predicts device power conversion efficiency (PCE). Machine learning, however, may be a feasible answer. In this research, molecular properties and device fabrication parameters are combined to build the dataset. To decrease the generalization error, models are developed with random forest, support vector machine, and multilayer perceptron algorithms. Random forest shows the best performance and is selected as the final algorithm. After further optimization, the mean test-set R² over 100 different random states converges to 0.9012, and quantitative feature importances are given. The dataset plays a critical role in the performance of a machine learning model. The results indicate that the outputs of machine learning models can feasibly serve as references for experiments and analysis.

Key words: Organic solar cell, Machine learning, Random forest, Generalization error
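
The evaluation protocol mentioned in the abstract, averaging the test-set R² over 100 different random states, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, the single descriptor and the 80/20 split are hypothetical, and a one-variable least-squares line stands in for the random forest so that the example needs only the Python standard library. It is not the authors' code or dataset.

```python
import random
import statistics


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Least-squares line (stand-in model, NOT the paper's random forest)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def mean_test_r2(xs, ys, n_states=100, test_fraction=0.2):
    """Refit on a fresh train/test split per random state; average test R2."""
    scores = []
    for state in range(n_states):
        rng = random.Random(state)          # one seed per "random state"
        idx = list(range(len(xs)))
        rng.shuffle(idx)
        n_test = int(len(idx) * test_fraction)
        test, train = idx[:n_test], idx[n_test:]
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        preds = [slope * xs[i] + intercept for i in test]
        scores.append(r2_score([ys[i] for i in test], preds))
    return statistics.fmean(scores), scores


# Synthetic data: x is a hypothetical descriptor, y a hypothetical PCE value.
data_rng = random.Random(0)
xs = [data_rng.uniform(0.0, 10.0) for _ in range(200)]
ys = [1.5 * x + 2.0 + data_rng.gauss(0.0, 1.0) for x in xs]

mean_r2, scores = mean_test_r2(xs, ys)
print(f"mean test R2 over {len(scores)} random states: {mean_r2:.4f}")
```

Reporting the mean over many splits, and not the score of a single split, reduces the chance that one favorable partition of a small dataset inflates the apparent accuracy.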
