Chem. J. Chinese Universities ›› 2009, Vol. 30 ›› Issue (7): 1309.

• Articles •

Prediction of HLA-A*0201 Binding Peptides Using Binding-environment-based Peptide Representation

ZHAO Pu, LI Tong-Hua*   

  1. Department of Chemistry, Tongji University, Shanghai 200092, China
  • Received: 2008-10-13 Online: 2009-07-10 Published: 2009-07-10
  • Contact: LI Tong-Hua. E-mail: lith@tongji.edu.cn
  • Supported by: the National Natural Science Foundation of China (Grant Nos. 20675057, 20705024).

Abstract:

In all vertebrates, a large genomic region or gene family known as the Major Histocompatibility Complex (MHC) has a major influence on graft survival. T cells recognize antigens only as complexes with MHC molecules, so predicting MHC-binding peptides is an important step in T-cell epitope discovery. To facilitate vaccine design, computational methods have been developed for predicting MHC-binding peptides, and a wide variety of machine-learning techniques are commonly used in this field. This work explored the Support Vector Machine (SVM) as one such method for developing prediction systems for HLA-A*0201 using an experimental dataset. Because data representation plays a key role in SVM models, we examined different types of input variables for predicting HLA-binding peptides. The areas under the ROC curve (AUCs) of these SVM models were 0.932 to 0.936. We then proposed a new way to encode peptides that uses information about the peptides' binding environment, which achieved an AUC of 0.953. Predictions on an independent dataset showed that the overall performance of the SVM models based on the new environment-based encoding improved on that of the traditional encodings.
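
As a rough illustration of the two ingredients named in the abstract, a peptide representation and the AUC used to compare models, the following Python sketch may help. It is not the authors' method: the sparse (one-hot) encoding is assumed here only as a typical example of a traditional encoding, the binding-environment-based encoding is not reproduced, the function names are hypothetical, and the scores are invented for demonstration.

```python
# Minimal sketch under stated assumptions; not the encoding or SVM pipeline of the paper.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot_encode(peptide):
    """Sparse encoding: 20 binary indicators per residue position."""
    vector = []
    for residue in peptide:
        if residue not in AMINO_ACIDS:
            raise ValueError(f"unknown residue: {residue!r}")
        vector.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return vector


def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen binder (label 1)
    scores higher than a randomly chosen non-binder (label 0).
    Ties count as one half."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one binder and one non-binder")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# A 9-residue peptide becomes a 180-dimensional input vector.
features = one_hot_encode("GILGFVFTL")

# Invented classifier scores for two binders and two non-binders.
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

In this toy case three of the four binder/non-binder pairs are ranked correctly, so `auc` is 0.75. An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported values of 0.932 to 0.953 should be read.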

Key words: HLA-A*0201; Binding peptides prediction; Support Vector Machine (SVM); Data representations; Receiver Operating Characteristic (ROC)
