Chemical Journal of Chinese Universities (高等学校化学学报) ›› 2020, Vol. 41 ›› Issue (1): 94. doi: 10.7503/cjcu20190400

• Analytical Chemistry •

Research on Blood Species Identification Algorithm Based on RF_AdaBoost Model

WEI Manman (魏曼曼)1, LU Haoxiang (路皓翔)2, YANG Huihua (杨辉华)1,3,*

  1. School of Computer and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
  2. School of Electronic Engineering and Automation, Guilin University of Electronic Technology, Guilin 541004, China
  3. School of Automation, Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Received: 2019-07-19 Online: 2020-01-10 Published: 2019-10-11
  • Contact: YANG Huihua, E-mail: yhh@bupt.edu.cn
  • Supported by:
    the National Natural Science Foundation of China (Nos. 21365008, 61105004) and the Director Fund Project of the Guangxi Key Laboratory of Automatic Testing Technology and Instrumentation, China (No. YQ18108)

Abstract:

To meet the need for non-destructive, efficient analytical methods for distinguishing human from non-human blood, a blood species identification method (RF_AdaBoost) is proposed that combines Random Forest (RF) with the Adaptive Boosting (AdaBoost) algorithm. The method uses RF as the weak classifier of AdaBoost in order to improve identification accuracy and strengthen the robustness of the model. RF_AdaBoost was compared with RF, Support Vector Machine (SVM), Extreme Learning Machine (ELM), Kernel Extreme Learning Machine (KELM), Stacked Auto-Encoder (SAE), Back Propagation network (BP), Principal Component Analysis with Linear Discriminant Analysis (PCA-LDA), and Partial Least Squares Discriminant Analysis (PLS-DA), and its performance was evaluated in identification experiments on training sets of different sizes drawn from blood Raman spectroscopy data. As the number of training samples increased, the identification accuracy of RF_AdaBoost reached up to 100% and the standard deviation of its predictions tended to zero. Compared with the other models, RF_AdaBoost showed higher classification accuracy and greater stability, providing a new method for blood species identification.

Key words: Raman spectroscopy, Random forest, AdaBoost algorithm, Ensemble learning, Blood species identification
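The core idea in the abstract, using a random forest as the weak classifier inside AdaBoost, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which software was used, so scikit-learn is an assumption, the hyperparameters and the helper name `build_rf_adaboost` are hypothetical, and synthetic data stands in for the blood Raman spectra.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split


def build_rf_adaboost(n_rounds=5, n_trees=10, seed=0):
    """Return an AdaBoost ensemble whose base learner is a small random forest.

    Hypothetical helper: the round count, tree count and depth are
    illustrative values, not settings reported in the paper.
    """
    forest = RandomForestClassifier(
        n_estimators=n_trees, max_depth=3, random_state=seed
    )
    # The base learner is passed positionally because its keyword name
    # differs between scikit-learn versions.
    return AdaBoostClassifier(forest, n_estimators=n_rounds, random_state=seed)


# Synthetic two-class data standing in for spectra
# (rows = samples, columns = spectral channels). Assumption, not real data.
X, y = make_classification(
    n_samples=300, n_features=50, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

model = build_rf_adaboost()
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
accuracy = float(np.mean(y_pred == y_test))
print(f"test accuracy: {accuracy:.3f}")
```

Each boosting round fits a fresh forest on reweighted samples, so examples misclassified in earlier rounds get more weight later. Repeating the split with training sets of different sizes and several random seeds would give the accuracy and standard deviation comparison that the abstract describes.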
