Chemical Journal of Chinese Universities ›› 2009, Vol. 30 ›› Issue (6): 1101.

• Research Article •

Adaptive Binning Method for NMR Spectroscopic Metabonomics Data Preprocessing

DONG Ji-Yang*, XU Le, XU Jing-Jing, CHEN Zhong*

  1. State Key Laboratory of Physical Chemistry of Solid Surface, Physics Department, Xiamen University, Xiamen 361005, China
  • Received: 2009-01-13 Online: 2009-06-10 Published: 2009-06-10
  • Contact: DONG Ji-Yang (male, Ph.D., associate professor, working on NMR-based metabonomics), E-mail: jydong@xmu.edu.cn; CHEN Zhong (male, Ph.D., professor and doctoral supervisor, working on NMR spectroscopy), E-mail: chenz@xmu.edu.cn
  • Supported by:

    the Scientific Research Foundation of the Ministry of Health of China and the Fujian Provincial Health and Education Joint Research Program (Grant No. WKJ2008-2-36), and the National Natural Science Foundation of China (Grant No. 10605019).

Abstract:

An adaptive binning method is proposed for preprocessing NMR-based metabonomics data. The statistical characteristics of each spectral data point are computed, and contiguous data points are then integrated adaptively according to the statistical discrepancies of neighboring points. This overcomes potential drawbacks of the widely used fixed-width binning, such as mutual cancellation of signals with opposite statistical discrepancies that fall into the same bin, masking of weak characteristic signals, and a reduced spectral signal-to-noise ratio, and thereby avoids their negative effects on subsequent statistical analysis. To compare adaptive binning with fixed-width binning, both computer-simulated NMR data and experimental spectra from individuals with different diets were analyzed. The results show that the proposed method effectively suppresses the influence of spectral noise and of signals without statistical significance, improves the reliability and interpretability of the subsequent principal component analysis (PCA) results, and makes the metabonomics results more biologically meaningful.

Key words: NMR spectroscopy, Metabonomics, Adaptive binning, Statistical discrepancy, Data preprocessing
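
The idea summarized in the abstract can be illustrated with a small sketch. The abstract does not state which per-point statistic or merging rule the authors use, so everything specific here is an assumption made for illustration only: a two-group Welch t-statistic per data point, a hypothetical threshold of 3.0, and a rule that merges contiguous points sharing the same discrepancy label (+1, 0, or -1). The function names are also hypothetical. The toy data place an "up in group A" signal next to an "up in group B" signal, which is the case where fixed-width binning may cancel the two.

```python
import numpy as np

def point_t_statistics(group_a, group_b):
    """Welch t-statistic at every spectral data point (rows = samples)."""
    mean_a, mean_b = group_a.mean(axis=0), group_b.mean(axis=0)
    var_a = group_a.var(axis=0, ddof=1) / group_a.shape[0]
    var_b = group_b.var(axis=0, ddof=1) / group_b.shape[0]
    return (mean_a - mean_b) / np.sqrt(var_a + var_b)

def adaptive_bin_edges(t_stat, threshold=3.0):
    """Assumed rule: label each point -1 / 0 / +1 and start a new bin
    wherever the label of neighboring points changes."""
    labels = np.where(t_stat > threshold, 1, np.where(t_stat < -threshold, -1, 0))
    change = np.flatnonzero(np.diff(labels)) + 1
    return np.concatenate(([0], change, [labels.size]))

def integrate_bins(spectra, edges):
    """Sum intensities of each sample inside every bin [edges[i], edges[i+1])."""
    return np.add.reduceat(spectra, edges[:-1], axis=1)

# Toy data: 3 samples per group, 12 data points, flat baseline of 1.0.
n_points = 12
effect_a = np.zeros(n_points)
effect_a[4:6] = 2.0                      # signal elevated in group A
effect_b = np.zeros(n_points)
effect_b[6:8] = 2.0                      # neighboring signal elevated in group B
jitter = np.array([-0.1, 0.0, 0.1])[:, None]   # small per-sample offset
group_a = 1.0 + effect_a + jitter
group_b = 1.0 + effect_b + jitter

# Fixed-width binning with 4 points per bin: both signals share one bin.
fixed_edges = np.arange(0, n_points + 1, 4)
fixed_diff = (integrate_bins(group_a, fixed_edges).mean(axis=0)
              - integrate_bins(group_b, fixed_edges).mean(axis=0))

# Adaptive binning: bin boundaries follow the per-point statistics.
t_stat = point_t_statistics(group_a, group_b)
adaptive_edges = adaptive_bin_edges(t_stat)
adaptive_diff = (integrate_bins(group_a, adaptive_edges).mean(axis=0)
                 - integrate_bins(group_b, adaptive_edges).mean(axis=0))
```

In this toy case the group difference in every fixed-width bin is essentially zero, because the two opposite signals sum to the same total, whereas the adaptive bins keep them apart with differences of about +4 and -4. Real spectra would need a more careful choice of statistic and threshold than this sketch assumes.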
