Chinese Journal of Chromatography (色谱) ›› 2018, Vol. 36 ›› Issue (8): 772-779. DOI: 10.3724/SP.J.1123.2018.03001

• Research Article •

Gas chromatography-mass spectrometry data analysis algorithm based on a sparse model and its application in resolving severely overlapped peaks

WU Yizi1, WEI Weiwei1, YANG Huawu1, CHEN Zengping2, ZOU You3, LI Yanchun1, TUO Suxing1, YIN Shuangfeng2, ZHONG Kejun1

  1. Hunan Tobacco Industry Co., Ltd., Changsha 410014, China;
  2. College of Chemistry and Chemical Engineering, Hunan University, Changsha 410082, China;
  3. College of Network Education, Central South University, Changsha 410083, China
  • Received: 2018-03-05 Online: 2018-08-08 Published: 2014-06-28
  • Corresponding authors: YANG Huawu, Tel: (0731)85559111-22109, E-mail: yanghw0918@hngytobacco.com; CHEN Zengping, E-mail: zpchen@hnu.edu.cn
  • Supported by:

    National Natural Science Foundation of China (No. 21775038); Postdoctoral Fund of Hunan Tobacco Industry Co., Ltd. (No. KY2016JC0014).

Abstract:

A gas chromatography-mass spectrometry (GC-MS) data analysis algorithm is proposed. The mass spectrum at the apex of a chromatographic peak is taken as the spectrum to be resolved. A set of related reference spectra is retrieved from a spectral library, and an equation for the chromatographic response value of each pure component is then solved. Retrieval proceeds in steps: first, a high-speed index built on mass spectral fragmentation patterns is used for coarse selection; then the "strong peaks appear with high probability" criterion and the "extrusion resistance" criterion are applied to exclude unrelated spectra. To solve the response value equation, a regression algorithm based on a sparse model is proposed. Compared with traditional algorithms, the sparse algorithm is better at extracting the main structure of the spectrum to be resolved and avoids over-fitting. Experimental results show that the retrieval algorithm achieves relatively high accuracy while leaving a small set of remaining reference spectra, and that the sparse model algorithm can effectively resolve severely overlapped peaks. The proposed method provides an effective solution for resolving overlapped peaks in GC-MS data, especially severely overlapped peaks.

Key words: gas chromatography-mass spectrometry (GC-MS), mass spectrometry retrieval, severely overlapped peaks, sparse model
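
The abstract does not spell out the sparse model, so the following is only a minimal sketch of one common formulation, not the authors' algorithm: the spectrum at the peak apex is modeled as a non-negative combination of retrieved reference spectra, and an L1 penalty pushes the response values of irrelevant references to zero. The function name `sparse_nonneg_fit`, the penalty weight `lam`, the iteration count, and the toy spectra are all assumptions made for illustration.

```python
def sparse_nonneg_fit(refs, target, lam=1.0, n_iter=2000):
    """Illustrative L1-penalized non-negative least squares.

    Minimizes 0.5 * ||sum_j x[j] * refs[j] - target||^2 + lam * sum(x)
    subject to x >= 0, using proximal (projected) gradient descent.
    refs   : list of reference spectra, each a list of intensities
    target : spectrum to be resolved, same length as each reference
    """
    n_refs, n_bins = len(refs), len(target)
    # Step size from the squared Frobenius norm, which bounds the
    # Lipschitz constant of the gradient from above (so the step is safe).
    step = 1.0 / sum(v * v for r in refs for v in r)
    x = [0.0] * n_refs
    for _ in range(n_iter):
        # Residual of the current fit in every m/z bin.
        resid = [sum(x[j] * refs[j][i] for j in range(n_refs)) - target[i]
                 for i in range(n_bins)]
        for j in range(n_refs):
            # Gradient of the smooth term plus the L1 weight (valid for x >= 0).
            grad = sum(refs[j][i] * resid[i] for i in range(n_bins)) + lam
            # Gradient step followed by projection onto x >= 0.
            x[j] = max(0.0, x[j] - step * grad)
    return x


# Toy library of four hypothetical reference spectra over eight m/z bins.
refs = [
    [100, 0, 50, 0, 0, 20, 0, 0],
    [0, 100, 0, 30, 0, 0, 10, 0],
    [80, 0, 0, 0, 100, 0, 0, 40],
    [0, 0, 100, 60, 0, 0, 0, 20],
]
# Simulated overlapped peak: 2.0 x reference 0 plus 0.5 x reference 2.
target = [2.0 * a + 0.5 * c for a, c in zip(refs[0], refs[2])]

x_hat = sparse_nonneg_fit(refs, target)
print([round(v, 3) for v in x_hat])
```

In this noiseless toy case the fit should assign response values close to 2.0 and 0.5 to the two components that are actually present and drive the other two to zero. Raising `lam` trades fitting accuracy for a sparser solution, which is the general mechanism by which a sparse model can resist over-fitting when many candidate reference spectra are available.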

CLC number: