Chinese Journal of Chromatography (色谱), 2025, Vol. 43, Issue 6: 696-704. DOI: 10.3724/SP.J.1123.2024.05012

• Research article •

Rapid identification and analysis of hemoglobin isoelectric focusing electrophoresis images based on deep learning

季伟宸1, 田佑吏1, 符浩东2, 查根晗1, 曹成喜1, 魏丽3,*, 张强1,*

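  1. School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
    2. College of Chemistry and Chemical Engineering, Hainan Normal University, Haikou, Hainan 571158, China
    3. Shanghai Sixth People's Hospital, Shanghai Jiao Tong University, Shanghai 200235, China
  • Received: 2024-05-15 Online: 2025-06-08 Published: 2025-05-21
  • Corresponding authors: * E-mail: billy_zq@sjtu.edu.cn (ZHANG Qiang); E-mail: 18930173636@189.com (WEI Li).
  • Supported by:
    National Natural Science Foundation of China (22104082); National Natural Science Foundation of China (31727801)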

Rapid identification and analysis of hemoglobin isoelectric focusing electrophoresis images based on deep learning

JI Weichen1, TIAN Youli1, FU Haodong2, ZHA Genhan1, CAO Chengxi1, WEI Li3,*, ZHANG Qiang1,*

  1. School of Electronic Information & Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
    2. College of Chemistry and Chemical Engineering, Hainan Normal University, Haikou 571158, China
    3. Shanghai Sixth People’s Hospital, Shanghai Jiao Tong University, Shanghai 200235, China
  • Received: 2024-05-15 Online: 2025-06-08 Published: 2025-05-21
  • Supported by:
    National Natural Science Foundation of China (22104082); National Natural Science Foundation of China (31727801)

Summary:

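Traditional band-detection algorithms based on contour analysis involve cumbersome steps and require correction algorithms to cope with uneven backgrounds, lane distortion, and band deformation. To avoid the errors that such correction algorithms introduce into band analysis, this paper proposes a method for rapidly recognizing bands in gel electrophoresis images using a deep learning object detection algorithm, and applies it to the analysis of hemoglobin (Hb) isoelectric focusing (IEF) electrophoresis images. A total of 1665 Hb IEF images collected in microarray IEF (mIEF) experiments served as the training dataset for the YOLOv8 model. Based on the band bounding-box positions and classification results inferred by the model, the gray intensities of the pixels in each band region were summed to calculate the content of each protein. The results show that the YOLOv8n model achieved a detection accuracy of 92.9% and an inference time of 0.6 ms while keeping computational resource usage low, and that it accurately detected bands in Hb IEF images acquired without isoelectric point (pI) markers. Taking hemoglobin A2 (Hb A2) as an example, the Hb A2 content measured by this method was compared with clinical test results; regression analysis gave a linearity of 0.9812 and a correlation coefficient of 0.9800. Bland-Altman analysis was further used to assess the agreement between the two methods and showed that the proposed method agrees well with the clinical method. Compared with traditional automatic band-detection methods, the proposed method is fast and accurate, with better repeatability and stability. It can be applied to the determination of Hb A2 content in clinical practice and has potential for the auxiliary diagnosis of adult β-thalassemia.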
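The agreement check mentioned above can be illustrated with a minimal Bland-Altman sketch. It is not the authors' code: the function name `bland_altman` and the sample values are hypothetical, and the limits of agreement use the conventional bias ± 1.96 standard deviations of the paired differences, which the abstract does not spell out.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias is the mean of the paired differences (a - b); the limits of
    agreement are bias +/- 1.96 sample standard deviations (assumed here).
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical Hb A2 percentages, invented purely for illustration.
image_based = [2.6, 3.1, 5.4, 2.9, 4.8]
clinical = [2.5, 3.0, 5.6, 3.0, 4.7]
bias, lower, upper = bland_altman(image_based, clinical)
print(f"bias={bias:.3f}, limits of agreement=({lower:.3f}, {upper:.3f})")
```

Two methods are usually judged to agree when the bias is small and nearly all paired differences fall inside the limits; what counts as "small" is a clinical decision, not a statistical one.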

Keywords: deep learning, band detection, marker-free, isoelectric focusing electrophoresis, hemoglobin

Abstract:

Gel electrophoresis is used to separate and analyze macromolecules (such as DNA, RNA, and proteins) and their fragments, and highly reproducible and efficient automatic band-detection methods have been developed to analyze gel images. However, uneven backgrounds, low contrast, lane distortion, blurred band edges, and geometric deformation reduce the accuracy of automatic band detection. Various correction algorithms have been proposed to address these issues; however, they rely on researcher experience to adjust and optimize parameters according to image characteristics, which introduces human error into the qualitative and quantitative processing of bands. Isoelectric focusing (IEF) gel electrophoresis separates proteins with high resolution on the basis of differences in isoelectric point (pI). Microarray IEF (mIEF) is used for the auxiliary diagnosis of diabetes and adult β-thalassemia owing to its operational ease, low sample consumption, and high throughput. This diagnostic method relies on accurately locating and precisely quantifying protein bands. To avoid the errors associated with correction algorithms during band analysis, this paper introduces a method for rapidly recognizing bands in gel electrophoresis images that relies on a deep learning object detection algorithm, and uses it to classify and quantify the bands in IEF electrophoresis images of hemoglobin (Hb). We used mIEF experiments to collect 1665 pI-marker-free Hb IEF images as a dataset for training the YOLOv8 model. The trained model accepts an Hb IEF image as input and infers band bounding boxes and classification results. Using these inference results, the gray intensities of the pixels in each band area are summed to determine the content of each protein. The background and foreground of the image must be separated before the gray intensities are summed, and a threshold method is used to achieve this.
The threshold is defined as the average intensity of the background area, which is obtained by averaging the intensities of the gel areas between the detection bounding boxes of the protein bands. The baselines of the band areas are unified after the background is removed. The method requires only the input image and directly outputs the corresponding electrophoretic band information; it does not rely on the experience of professionals and is not affected by factors such as lane distortion or band deformation. In addition, the method does not depend on pI markers for the qualitative determination of bands, which reduces experimental costs and improves detection efficiency. YOLOv8n delivered a detection accuracy of 92.9% and an inference time of 0.6 ms while using limited computing resources. Using Hb A2 as an example, we compared the content measured by the developed method with clinical data. Regression analysis of the quantitative results delivered a linearity of 0.9812 and a correlation coefficient of 0.9800. Bland-Altman analysis further showed that the two methods are highly consistent. Compared with traditional automatic band-detection methods, the method developed in this study is fast and accurate, with better repeatability and stability. It can be used to determine Hb A2 content in clinical practice and has potential for the auxiliary diagnosis of adult β-thalassemia.
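The thresholded summation described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the image is a cropped lane given as a 2D list in which bands are brighter than the gel, the bands are separated along the x axis, the boxes use a hypothetical `(label, x0, y0, x1, y1)` layout with half-open pixel ranges, and the names `background_threshold` and `band_contents` are invented for this example.

```python
from statistics import mean


def background_threshold(image, boxes):
    """Mean gray level of the gel columns lying between adjacent band boxes."""
    spans = sorted((x0, x1) for _, x0, _, x1, _ in boxes)
    gap_pixels = []
    for (_, left_end), (right_start, _) in zip(spans, spans[1:]):
        for row in image:
            gap_pixels.extend(row[left_end:right_start])
    if not gap_pixels:
        raise ValueError("no background gap between the boxes")
    return mean(gap_pixels)


def band_contents(image, boxes):
    """Return {label: percent of the total background-corrected intensity}."""
    threshold = background_threshold(image, boxes)
    sums = {}
    for label, x0, y0, x1, y1 in boxes:
        total = 0.0
        for row in image[y0:y1]:
            # Pixels at or below the background level contribute nothing.
            total += sum(max(p - threshold, 0) for p in row[x0:x1])
        sums[label] = sums.get(label, 0.0) + total
    grand_total = sum(sums.values())
    if grand_total == 0:
        raise ValueError("no signal above background")
    return {label: 100.0 * s / grand_total for label, s in sums.items()}


# Toy lane: background 10, a strong band (110) and a weak band (20).
row = [10, 110, 110, 10, 10, 10, 20, 20, 10, 10]
lane = [list(row), list(row)]
boxes = [("Hb A", 1, 0, 3, 2), ("Hb A2", 6, 0, 8, 2)]
print(band_contents(lane, boxes))
```

Subtracting the same background level from every band puts all bands on a common baseline, so the relative contents are not inflated by the gel's own gray level. In practice the boxes would come from the detector's output instead of being written by hand.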

Key words: deep learning, band detection, marker-free, isoelectric focusing (IEF) electrophoresis, hemoglobin (Hb)

CLC number: