West China Medical Publishers

Keyword search for "attention mechanism": 8 results
  • A hybrid attention temporal sequential network for sleep stage classification

    Sleep stage classification is a fundamental step in diagnosing sleep disorders and has attracted extensive attention in recent years. Traditional approaches, such as manual scoring and classical machine learning algorithms, suffer from low efficiency and poor generalization. Recently, deep neural networks have shown improved results thanks to their ability to learn complex patterns in sleep data. However, these models ignore the intra-temporal sequential information and the correlation among all channels in each segment of the sleep data. To solve these problems, this paper proposes a hybrid attention temporal sequential network that uses a recurrent neural network in place of the traditional convolutional neural network and extracts temporal features of polysomnography from the perspective of time. Furthermore, an intra-temporal attention mechanism and a channel attention mechanism are adopted to fuse the intra-temporal representation and the channel-correlated representation, respectively. Then, based on a recurrent neural network and an inter-temporal attention mechanism, the model fuses the inter-temporal contextual representation. Finally, end-to-end automatic sleep stage classification is performed on this hybrid representation. The proposed model is evaluated on two public benchmark sleep datasets downloaded from open-source websites, which contain a number of polysomnography recordings. Experimental results show that the proposed model outperforms ten state-of-the-art baselines. The overall accuracy of sleep stage classification reaches 0.801, 0.801 and 0.717, respectively, and the macro-average F1-scores reach 0.752, 0.728 and 0.700. These results demonstrate the effectiveness of the proposed model.

    Release date: 2021-06-18 04:50
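The attention-based fusion described in this abstract can be illustrated with a minimal attention-pooling sketch. This is not the authors' architecture: the `query` vector stands in for a learned parameter, and the function names and toy inputs are assumptions made for illustration only.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(hidden_states, query):
    """Fuse a sequence of hidden-state vectors into a single vector.

    Each time step is scored by its dot product with `query`
    (a hypothetical learned vector), the scores are normalized
    with softmax, and the states are averaged with those weights.
    """
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return pooled, weights

# Toy example: three time steps with two features each.
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled, weights = attention_pool(states, query=[1.0, 1.0])
```

The third time step scores highest against the query, so it receives the largest weight in the pooled representation.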
  • Cascaded multi-level medical image registration method based on transformer

    In deep learning-based image registration, deformable regions with complex anatomical structures are an important factor affecting registration accuracy. However, existing methods have difficulty focusing on the complex anatomical regions of images. At the same time, the receptive field of a convolutional neural network is limited by the size of its convolution kernel, so it struggles to learn relationships between spatially distant voxels and therefore to handle large deformations. To address these two problems, this paper proposes a cascaded multi-level registration network based on the transformer and equips it with a difficult deformable region perceptron based on mean square error. The difficult deformation perceptron uses sliding window and floating window techniques to scan the registered images, obtain the difficult deformation coefficient of each voxel, and identify the regions with the worst registration quality. The cascaded multi-level registration network uses the difficult deformation perceptron for hierarchical connection, and the self-attention mechanism extracts global features in the basic registration network to optimize the registration results at different scales. The experimental results show that the proposed method can progressively register complex deformation regions, thereby improving the registration of brain medical images and providing useful support for clinical diagnosis.

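The idea of locating poorly registered regions with a windowed mean square error can be sketched as follows. This is a simplified 2D illustration under assumptions, not the paper's perceptron: the abstract works on voxels and also mentions a floating window, neither of which is modeled here, and all function names are hypothetical.

```python
def window_mse(fixed, warped, top, left, size):
    """Mean squared error between two images inside one square window."""
    total = 0.0
    for r in range(top, top + size):
        for c in range(left, left + size):
            diff = fixed[r][c] - warped[r][c]
            total += diff * diff
    return total / (size * size)

def worst_window(fixed, warped, size, stride):
    """Slide a window over both images; return (mse, top, left) of the worst match."""
    rows, cols = len(fixed), len(fixed[0])
    worst = None
    for top in range(0, rows - size + 1, stride):
        for left in range(0, cols - size + 1, stride):
            err = window_mse(fixed, warped, top, left, size)
            if worst is None or err > worst[0]:
                worst = (err, top, left)
    return worst

# Toy 4x4 images that differ only in the bottom-right corner.
fixed = [[0.0] * 4 for _ in range(4)]
warped = [[0.0] * 4 for _ in range(4)]
warped[2][2] = 2.0
warped[3][3] = 2.0
result = worst_window(fixed, warped, size=2, stride=2)
```

Here `result` points at the bottom-right window, the only one where the two images disagree.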
  • Research on classification of Korotkoff sounds phases based on deep learning

    Objective: To recognize the different phases of Korotkoff sounds using deep learning, so as to improve the accuracy of blood pressure measurement in different populations. Methods: A classification model for Korotkoff sound phases was designed that fuses an attention mechanism (Attention), a residual network (ResNet) and bidirectional long short-term memory (BiLSTM). First, individual Korotkoff sound signals were extracted beat by beat from the full recording, and each was converted into a Mel spectrogram. Then, local features of the Mel spectrogram were extracted using the attention mechanism and ResNet, the BiLSTM network modeled the temporal relations between features, and a fully connected layer reduced the feature dimension. Finally, classification was performed with the softmax function. The dataset was collected from 44 volunteers (24 females and 20 males, with an average age of 36 years), and model performance was verified using 10-fold cross-validation. Results: The classification accuracy of the model for the 5 Korotkoff sound phases was 93.4%, higher than that of the other models. Conclusion: This study shows that deep learning can accurately classify Korotkoff sound phases, laying a technical foundation for the subsequent design of automatic blood pressure measurement methods based on Korotkoff sound phase classification.

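The 10-fold cross-validation mentioned in this abstract can be sketched as a plain index split. The abstract does not say whether folds were formed per volunteer or per beat, nor whether samples were shuffled; the sketch below assumes one sample per volunteer and no shuffling, and the function names are hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def train_test_splits(n_samples, k):
    """Yield (train_indices, test_indices) once per fold."""
    folds = k_fold_indices(n_samples, k)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# 44 samples (assumed: one per volunteer) split into 10 folds.
splits = list(train_test_splits(44, 10))
fold_sizes = [len(test) for _, test in splits]
```

Every sample appears in exactly one test fold, so each one is used for evaluation once and for training in the other nine rounds.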
  • Study on the method of polysomnography sleep stage staging based on attention mechanism and bidirectional gate recurrent unit

    Polysomnography (PSG) monitoring is an important method for the clinical diagnosis of conditions such as insomnia and apnea. Manual, frame-by-frame visual scoring of PSG for sleep staging in patients with sleep disorders is time-consuming and labor-intensive. To address this, this study proposes a deep learning model that combines convolutional neural networks (CNN) with bidirectional gated recurrent units (BiGRU). A dynamic sparse self-attention mechanism was designed to address the difficulty gated recurrent units (GRU) have in obtaining accurate vector representations of long-distance information. The study collected 143 overnight PSG recordings from patients with sleep disorders at the Shanghai Mental Health Center and combined them with 153 overnight PSG recordings from an open-source dataset. Nine electrophysiological channels were selected: 6 electroencephalogram (EEG) channels, 2 electrooculogram (EOG) channels and a single mandibular electromyogram (EMG) channel. These data were used for model training, testing and evaluation. After cross-validation, the accuracy was (84.0±2.0)% and Cohen's kappa was 0.77±0.50, better than the physician-score Cohen's kappa of 0.75±0.11. The experimental results show that the model stages sleep well across different populations and is widely applicable. It is of great significance for assisting clinicians in rapid, large-scale automatic PSG sleep staging.

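Cohen's kappa, the agreement measure reported in this abstract, corrects raw agreement for the agreement expected by chance. A minimal sketch follows; the stage labels and the two toy label sequences are invented for illustration and are not data from the study.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa between two equally long label sequences.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected if both raters labeled independently.
    Undefined (division by zero) when p_e equals 1.
    """
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1.0 - expected)

# Hypothetical per-epoch sleep stages from a model and from an expert.
model = ["W", "N1", "N2", "N2", "N3", "REM", "REM", "W"]
expert = ["W", "N2", "N2", "N2", "N3", "REM", "W", "W"]
kappa = cohens_kappa(model, expert)
```

In this toy case the raters agree on 6 of 8 epochs, and kappa comes out lower than the raw agreement of 0.75 because part of that agreement is attributable to chance.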
  • Research on classification method of multimodal magnetic resonance images of Alzheimer’s disease based on generalized convolutional neural networks

    Alzheimer’s disease (AD) is a progressive and irreversible neurodegenerative disease. Neuroimaging based on magnetic resonance imaging (MRI) is one of the most intuitive and reliable methods for AD screening and diagnosis. Clinical head MRI generates multimodal image data. To solve the problem of multimodal MRI processing and information fusion, this paper proposes a structural and functional MRI feature extraction and fusion method based on generalized convolutional neural networks (gCNN). The method includes a three-dimensional residual U-shaped network based on a hybrid attention mechanism (3D HA-ResUNet) for feature representation and classification of structural MRI, and a U-shaped graph convolutional neural network (U-GCN) for node feature representation and classification of brain functional networks derived from functional MRI. After the two types of image features are fused, the optimal feature subset is selected by discrete binary particle swarm optimization, and the prediction results are output by a machine learning classifier. Validation on a multimodal dataset from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) open-source database shows that the proposed models have superior performance in their respective data domains. The gCNN framework combines the advantages of the two models and further improves on methods using single-modal MRI, raising classification accuracy and sensitivity by 5.56% and 11.11%, respectively. In conclusion, the proposed gCNN-based multimodal MRI classification method can provide a technical basis for the auxiliary diagnosis of Alzheimer’s disease.

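Feature selection with discrete binary particle swarm optimization can be sketched as below. This is a generic textbook-style variant under assumptions, not the paper's configuration: the inertia and acceleration constants, the particle count, and the toy fitness function are all illustrative. In practice the fitness would be a classifier's validation score on the selected features.

```python
import math
import random

def binary_pso(fitness, n_features, n_particles=8, n_iters=30, seed=0):
    """Discrete binary PSO: each particle is a 0/1 mask over features."""
    rng = random.Random(seed)
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia and acceleration weights (illustrative)
    pos = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                # Sigmoid maps the velocity to the probability that the bit is 1.
                prob = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob else 0
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit

# Hypothetical fitness: reward picking features 0 and 2, penalize mask size.
useful = {0, 2}

def toy_fitness(mask):
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in useful)
    return hits - 0.1 * sum(mask)

mask, score = binary_pso(toy_fitness, n_features=5)
```

The returned `mask` marks which features to keep, and `score` is the fitness of that mask.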
  • Electrocardiogram signal classification based on fusion method of residual network and self-attention mechanism

    The analysis of electrocardiogram (ECG) signals has always played a crucial role in the diagnosis of cardiovascular diseases. Effectively identifying abnormal heartbeats by algorithm remains a difficult task in ECG signal analysis. To address this, a classification model that automatically identifies abnormal heartbeats was proposed, based on a deep residual network (ResNet) and a self-attention mechanism. First, an 18-layer convolutional neural network (CNN) based on the residual structure was designed, which helped the model fully extract local features. Then, a bidirectional gated recurrent unit (BiGRU) was used to explore temporal correlation and obtain temporal features. Finally, the self-attention mechanism was built to weight important information and enhance the model's ability to extract important features, which helped the model achieve higher classification accuracy. In addition, to mitigate the effect of data imbalance on classification performance, the study used multiple data augmentation approaches. The experimental data came from the arrhythmia database constructed by MIT and Beth Israel Hospital (MIT-BIH). The proposed model achieved an overall accuracy of 98.33% on the original dataset and 99.12% on the optimized dataset, demonstrating good performance in ECG signal classification and potential value for portable ECG detection devices.

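How a self-attention layer re-weights a sequence of features can be shown with scaled dot-product self-attention. The abstract does not specify the attention variant used, so this is an assumption; the sketch also uses identity query, key and value projections instead of learned matrices to stay short.

```python
import math

def self_attention(x):
    """Scaled dot-product self-attention with identity Q/K/V projections.

    x is a list of T feature vectors of dimension d. Returns the T
    re-weighted output vectors and the T x T attention weight matrix.
    """
    d = len(x[0])
    outputs, attn = [], []
    for q in x:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]
        attn.append(weights)
        outputs.append([sum(w * v[j] for w, v in zip(weights, x)) for j in range(d)])
    return outputs, attn

# Toy sequence: three time steps with two features each.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, attn = self_attention(features)
```

Each row of `attn` sums to 1 and shows how strongly one time step attends to every other time step.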
  • Research on Parkinson’s disease recognition algorithm based on sample enhancement

    Patients with Parkinson’s disease develop vocal cord damage early, and their voiceprint characteristics differ significantly from those of healthy individuals, which can be used to identify the disease. However, voiceprint datasets of Parkinson’s disease patients contain too few samples. This paper therefore proposes a double self-attention deep convolutional generative adversarial network for sample enhancement that generates high-resolution spectrograms, on which deep learning is then used to recognize Parkinson’s disease. The model improves the texture clarity of generated samples by increasing network depth and combining gradient penalty with spectral normalization. A classification network based on ConvNeXt (a family of pure convolutional neural networks) and transfer learning is constructed to extract and classify voiceprint features, which improves recognition accuracy. Validation experiments are carried out on the Parkinson’s disease speech dataset. Compared with the samples before enhancement, the samples generated by the proposed model show improved clarity and Fréchet inception distance (FID), and the network achieves an accuracy of 98.8%. The results show that the proposed algorithm can accurately distinguish healthy individuals from Parkinson’s disease patients, which helps address the shortage of voiceprint samples for early recognition of Parkinson’s disease. In summary, the method effectively improves classification accuracy on small-sample Parkinson’s disease speech datasets and provides an effective approach to early speech-based diagnosis of Parkinson’s disease.

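For reference, the Fréchet inception distance mentioned in this abstract is commonly defined as the Fréchet distance between two Gaussians fitted to feature embeddings of real and generated samples. The symbols below are the usual notation, not taken from the paper: $\mu_r, \Sigma_r$ are the mean and covariance of the real-sample features, and $\mu_g, \Sigma_g$ those of the generated samples. Lower values indicate that the generated distribution is closer to the real one.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```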
  • Skin lesion classification with multi-level fusion of Swin-T and ConvNeXt

    Skin cancer is a significant public health issue, and computer-aided diagnosis technology can effectively alleviate this burden. Accurate identification of skin lesion types is crucial when employing computer-aided diagnosis. This study proposes a multi-level attention cascaded fusion model based on Swin-T and ConvNeXt. Hierarchical Swin-T and ConvNeXt extract global and local features, respectively, and residual channel attention and spatial attention modules perform further feature extraction. Multi-level attention mechanisms process the multi-scale global and local features. To address the loss of shallow features caused by their distance from the classifier, a hierarchical inverted residual fusion module is proposed to dynamically adjust the extracted feature information. Balanced sampling strategies and focal loss are employed to handle the class imbalance among skin lesion categories. On the ISIC2018 dataset, the model achieved accuracy, precision, recall, and F1-score of 96.01%, 93.67%, 92.65%, and 93.11%, respectively; on ISIC2019, the corresponding values were 92.79%, 91.52%, 88.90%, and 90.15%. Compared with Swin-T, the proposed method improved accuracy by 3.60% and 1.66% on the two datasets, and compared with ConvNeXt, by 2.87% and 3.45%. The experiments demonstrate that the proposed method accurately classifies skin lesion images, providing a new solution for skin cancer diagnosis.

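The focal loss mentioned in this abstract down-weights well-classified samples so that training focuses on hard, often minority-class, examples. A minimal per-sample sketch follows; the default `gamma` and `alpha` values are common illustrative choices, not the settings used in the paper.

```python
import math

def cross_entropy(p_true):
    """Cross-entropy for one sample, given the predicted probability of its true class."""
    return -math.log(p_true)

def focal_loss(p_true, gamma=2.0, alpha=1.0):
    """Focal loss for one sample: -alpha * (1 - p)^gamma * log(p).

    With gamma = 0 and alpha = 1 this reduces to plain cross-entropy.
    """
    return -alpha * (1.0 - p_true) ** gamma * math.log(p_true)

# An easy sample (confidently correct) and a hard one (mostly wrong).
easy, hard = 0.9, 0.1
easy_ratio = focal_loss(easy) / cross_entropy(easy)  # equals (1 - 0.9) ** 2
hard_ratio = focal_loss(hard) / cross_entropy(hard)  # equals (1 - 0.1) ** 2
```

With `gamma=2`, the easy sample keeps about 1% of its cross-entropy loss while the hard sample keeps about 81%, which shifts the gradient toward the hard cases.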