Chemical Journal of Chinese Universities, 2011, Vol. 32, Issue (2): 262.

• Research Article •

New Variable Scaling Method for NMR-based Metabolomics Data Analysis

DONG Ji-Yang1,2, LI Wei1, DENG Ling-Li1, XU Jing-Jing1, Julian L. Griffin2, CHEN Zhong1*   

  1. Fujian Key Laboratory of Plasma and Magnetic Resonance, Department of Physics, Xiamen University, Xiamen 361005, China;
  2. Department of Biochemistry, University of Cambridge, Cambridge CB2 1GA, UK
  • Received: 2010-05-19 Revised: 2010-08-10 Online: 2010-02-10 Published: 2011-02-23
  • Contact: CHEN Zhong E-mail: chenz@xmu.edu.cn
  • Supported by:

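    Scientific Research Fund of the Ministry of Health of China - Fujian Province Health and Education Joint Research Program (Grant No. WKJ2008-2-36) and the Natural Science Foundation of Fujian Province (Grant No. 2009J01299).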

Abstract: Variable scaling is an important data pre-processing step in NMR-based metabolomics, especially for biomarker identification. Its main purpose is to increase the weight of variables related to signature metabolites and to reduce the influence of noise and irrelevant metabolites, making the subsequent pattern recognition analysis easier and more reliable. This paper proposes a new scaling method. Rather than forcing all variables onto a common scale, the method starts from the original data and raises the weight of variables that are both highly stable and significantly different between sample classes, so as to enhance the information related to signature metabolites. Both a simulated dataset and a real metabolomic dataset are used to evaluate the performance of the proposed method, which is compared with the commonly used Unit Variance (UV), VAriable STability (VAST), and Level Scaling (LS) methods. The results show that the new scaling method improves the predictive ability of the multivariate statistical model, better preserves the molecular information of the NMR spectra, aids the identification of signature metabolites, and makes the results of subsequent analysis more interpretable. The proposed method is therefore better suited to biomarker identification.

Key words: Variable scaling, NMR, Metabolomics, Signature metabolites
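
For reference, the three baseline scalings named in the abstract are commonly defined per variable (spectral bin) as UV: (x - mean)/sd, VAST: UV multiplied by mean/sd, and LS: (x - mean)/mean. The minimal Python sketch below illustrates only these baselines, under those assumed textbook definitions. The function names and the toy data are hypothetical, and the weighting formula of the proposed method is not given in the abstract, so it is not reproduced here.

```python
from statistics import mean, stdev


def _scale_columns(data, scale):
    """Apply scale(x, column_mean, column_sd) to every value, column by column.

    `data` is a list of samples (rows); each row lists the variables
    (e.g. binned NMR intensities). The sample standard deviation is used.
    """
    scaled = []
    for col in zip(*data):  # iterate over variables
        m, s = mean(col), stdev(col)
        scaled.append([scale(x, m, s) for x in col])
    return [list(row) for row in zip(*scaled)]  # back to samples-by-variables


def unit_variance(data):
    """UV scaling: every variable ends up with unit standard deviation."""
    return _scale_columns(data, lambda x, m, s: (x - m) / s)


def vast(data):
    """VAST scaling: UV scaling weighted by mean/sd, favouring stable variables."""
    return _scale_columns(data, lambda x, m, s: (x - m) / s * (m / s))


def level_scaling(data):
    """Level scaling: deviation relative to the variable mean (mean must be nonzero)."""
    return _scale_columns(data, lambda x, m, s: (x - m) / m)


# Toy data (hypothetical): 3 samples, 2 variables.
spectra = [[1.0, 10.0], [2.0, 10.5], [3.0, 11.0]]
for name, method in [("UV", unit_variance), ("VAST", vast), ("LS", level_scaling)]:
    print(name, method(spectra))
```

In this toy table both variables look identical after UV scaling, whereas VAST gives the second variable, whose standard deviation is small relative to its mean, a much larger weight. Weighting by stability in this way is also one of the two criteria the abstract names for the proposed method.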
