• 1. Department of Electronic Engineering, Tsinghua University, Beijing 100084, P. R. China;
  • 2. Beijing Institute of Technology, Beijing 100081, P. R. China;
ZENG Liang, Email: liang@bit.edu.cn

Leukemia is a common, frequently occurring, and dangerous blood disease, for which early diagnosis and treatment are very important. At present, the diagnosis of leukemia relies heavily on morphological examination of blood cell images by pathologists, which is tedious and time-consuming. The diagnostic results are also highly subjective, which may lead to misdiagnosis and missed diagnosis. To address these problems, we proposed an improved Vision Transformer model for blood cell recognition. First, a Faster R-CNN network was used to locate and extract individual blood cell slices from the original images. Then, each single-cell image was split into multiple image patches, which were fed into the encoder layers for feature extraction. Building on the self-attention mechanism of the Transformer, we proposed a sparse attention module that focuses on the discriminative parts of blood cell images and improves the fine-grained feature representation ability of the model. Finally, a contrastive loss function was adopted to further increase the inter-class difference and intra-class consistency of the extracted features. Experimental results showed that the proposed method outperformed the other approaches, reaching an accuracy of 91.96% on the Munich single-cell morphological dataset of leukocytes. The method is expected to provide a reference for physicians' clinical diagnosis.
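The patch-splitting and sparse-attention steps described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not specify how the sparse attention module selects discriminative regions, so the top-k masking used here is an assumption chosen for illustration, not the authors' design. The function names `split_into_patches` and `topk_sparse_attention`, the 32×32 image size, the patch size of 8, and `top_k=4` are all hypothetical, and the patches are used directly as queries, keys, and values without learned projections.

```python
import numpy as np


def split_into_patches(image, patch_size):
    """Split an (H, W, C) image into flattened, non-overlapping patches."""
    h, w, c = image.shape
    assert h % patch_size == 0 and w % patch_size == 0
    rows, cols = h // patch_size, w // patch_size
    patches = image.reshape(rows, patch_size, cols, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)  # (rows, cols, ps, ps, C)
    return patches.reshape(rows * cols, patch_size * patch_size * c)


def topk_sparse_attention(q, k, v, top_k):
    """Scaled dot-product attention that keeps only the top_k scores per query.

    Assumption: top-k masking stands in for the paper's sparse attention
    module, whose exact form is not given in the abstract.
    """
    scores = q @ k.T / np.sqrt(q.shape[-1])  # (n_queries, n_keys)
    # Per-row threshold: the top_k-th largest score in each row.
    kth = np.partition(scores, -top_k, axis=-1)[:, -top_k][:, None]
    masked = np.where(scores >= kth, scores, -np.inf)
    masked = masked - masked.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(masked)  # masked entries become exactly 0
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


# Toy usage on a random stand-in for a single-cell image.
rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))
patches = split_into_patches(image, patch_size=8)  # shape (16, 192)
out, weights = topk_sparse_attention(patches, patches, patches, top_k=4)
```

In this sketch each patch attends to only its four highest-scoring patches, which is one simple way to concentrate attention on a few regions. A trained model would also apply learned query, key, and value projections, position embeddings, and a classification head.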

Citation: SUN Tianyu, ZHU Qingtao, YANG Jian, ZENG Liang. An improved Vision Transformer model for the recognition of blood cells. Journal of Biomedical Engineering, 2022, 39(6): 1097-1107. doi: 10.7507/1001-5515.202203008
