U-Net has achieved remarkable results in medical image segmentation. In recent years, many researchers have studied the network and extended its structure, for example by improving the encoder and decoder or the skip connections. Focusing on the structural optimization of U-Net and its use in medical image segmentation, this paper is organized as follows. First, it describes the application of U-Net in medical image segmentation. Then, it summarizes seven improvement mechanisms for U-Net: the dense connection mechanism, residual connection mechanism, multi-scale mechanism, ensemble mechanism, dilated mechanism, attention mechanism, and transformer mechanism. Finally, it discusses ideas and methods for improving the U-Net structure, with the aim of providing a reference for later research and supporting the further development of U-Net.
High-resolution (HR) magnetic resonance imaging (MRI) or computed tomography (CT) images provide clearer anatomical details of the human body, which facilitates early diagnosis of disease. However, owing to the imaging system, the imaging environment, and human factors, clear HR images are difficult to obtain. In this paper, we propose a novel medical image super-resolution (SR) reconstruction method based on a multi-scale information distillation (MSID) network in the non-subsampled shearlet transform (NSST) domain, namely the NSST-MSID network. We first propose an MSID network that consists mainly of a series of stacked MSID blocks, which fully exploit image features and effectively restore low-resolution (LR) images to HR images. In addition, most previous methods predict HR images in the spatial domain, producing over-smoothed outputs and losing texture details. We therefore treat medical image SR as the prediction of NSST coefficients, which allows the MSID network to preserve richer structural details than prediction in the spatial domain. Finally, experimental results on our constructed medical image datasets demonstrate that the proposed method obtains better peak signal-to-noise ratio (PSNR), structural similarity (SSIM), and root mean square error (RMSE) values, and preserves global topological structure and local texture detail better than other leading methods, achieving good medical image reconstruction.
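As a rough illustration of two of the evaluation metrics named in the abstract above, the following minimal sketch computes RMSE and PSNR for a pair of small grayscale images. The function names, the nested-list image representation, and the 8-bit peak value of 255 are assumptions made for this example; the abstract does not describe the authors' evaluation code.

```python
import math


def rmse(reference, estimate):
    """Root mean square error between two equally sized 2D images (lists of rows)."""
    squared_diffs = [
        (r - e) ** 2
        for ref_row, est_row in zip(reference, estimate)
        for r, e in zip(ref_row, est_row)
    ]
    return math.sqrt(sum(squared_diffs) / len(squared_diffs))


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB.

    max_value is the assumed intensity peak (255 for 8-bit images).
    """
    error = rmse(reference, estimate)
    if error == 0:
        return math.inf  # identical images
    return 20.0 * math.log10(max_value / error)


# Hypothetical 2x2 "HR" reference and "SR" reconstruction, each pixel off by 2.
hr_image = [[0, 255], [128, 64]]
sr_image = [[2, 253], [130, 62]]

example_rmse = rmse(hr_image, sr_image)
example_psnr = psnr(hr_image, sr_image)
```

A lower RMSE and a higher PSNR both indicate a reconstruction closer to the reference. SSIM is omitted here because it needs local window statistics.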
Intelligent medical image segmentation methods have been developed and applied rapidly, but domain shift remains a significant challenge: segmentation performance degrades because of distribution differences between the source domain and the target domain. This paper proposes an unsupervised, end-to-end domain adaptation method for medical image segmentation based on the generative adversarial network (GAN). A network training and adjustment model was designed, consisting of a segmentation network and a discriminant network. In the segmentation network, the residual module was used as the basic building block to increase feature reuse and reduce the difficulty of model optimization. The segmentation network also learned cross-domain features at the image feature level with the help of the discriminant network and a combination of segmentation loss and adversarial loss. The discriminant network was a convolutional neural network that used the labels from the source domain to distinguish whether a segmentation result of the generator network came from the source domain or the target domain. The whole training process was unsupervised. The proposed method was tested on a public dataset of knee magnetic resonance (MR) images and on a clinical dataset from our cooperative hospital. Compared with classical feature-level and image-level domain adaptation methods, our method increased the mean Dice similarity coefficient (DSC) of the segmentation results by 2.52% and 6.10%, respectively. The proposed method effectively improves the domain adaptation ability of the segmentation method, significantly improves the segmentation accuracy of the tibia and femur, and better addresses the domain shift problem in MR image segmentation.
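The Dice similarity coefficient reported in the abstract above measures the overlap between a predicted mask and a reference mask. The sketch below is a minimal illustration on binary masks stored as nested lists. The function name and the convention of returning 1.0 for two empty masks are assumptions of this example, not details taken from the paper.

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient of two binary masks given as lists of rows."""
    pred_flat = [p for row in pred for p in row]
    target_flat = [t for row in target for t in row]
    # Number of pixels labeled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred_flat, target_flat) if p and t)
    total = sum(1 for p in pred_flat if p) + sum(1 for t in target_flat if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


# Hypothetical 2x3 masks: 3 foreground pixels each, 2 of them shared.
pred_mask = [[1, 1, 0], [0, 1, 0]]
true_mask = [[1, 0, 0], [0, 1, 1]]

example_dsc = dice_coefficient(pred_mask, true_mask)  # 2 * 2 / (3 + 3)
```

The value ranges from 0 (no overlap) to 1 (identical masks).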
In recent years, object detection and segmentation in medical images has been both a research hotspot and a difficult problem in image processing. Instance segmentation provides instance-level labels for different objects belonging to the same class, so it is widely used in medical image processing. This paper reviews medical image instance segmentation from the following aspects. First, the basic principle of instance segmentation is described, instance segmentation models are classified into three categories, the development of instance segmentation algorithms is displayed in two-dimensional space, and diagrams of six classic instance segmentation models are given. Second, for the three categories of models, namely two-stage instance segmentation, single-stage instance segmentation, and three-dimensional (3D) instance segmentation, we summarize the underlying ideas, discuss their advantages and disadvantages, and review the latest developments. Third, the application of instance segmentation to six types of medical images is summarized: colon tissue images, cervical images, bone images, pathological section images of gastric cancer, computed tomography (CT) images of lung nodules, and breast X-ray images. Fourth, the main challenges in medical image instance segmentation are discussed and future development directions are outlined. By systematically summarizing the principles, models, and characteristics of instance segmentation and its applications in medical image processing, this paper offers useful guidance for research on instance segmentation.
Computer-aided diagnosis (CAD) systems play a very important role in modern medical diagnosis and treatment, but their performance is limited by the available training samples. Imaging cost, labeling cost, and patient privacy concerns make data difficult to obtain and leave the training images insufficiently diverse. How to augment existing medical image datasets efficiently and cost-effectively has therefore become a research hotspot. This paper reviews research progress on medical image dataset augmentation methods, based on the relevant domestic and international literature. First, augmentation methods based on geometric transformation and on generative adversarial networks are compared and analyzed, with emphasis on improvements to the methods based on generative adversarial networks. Finally, some pressing open problems in medical image dataset augmentation are discussed and future development trends are outlined.
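To make the geometric-transformation family of methods mentioned in the abstract above concrete, the following minimal sketch derives three extra training samples from one image by flipping and rotating it. The function names and the choice of these three transforms are assumptions for illustration; the abstract does not list which transforms the reviewed methods use.

```python
def flip_horizontal(image):
    """Mirror each row left to right."""
    return [list(reversed(row)) for row in image]


def flip_vertical(image):
    """Reverse the order of the rows (top to bottom)."""
    return [list(row) for row in reversed(image)]


def rotate_90_clockwise(image):
    """Rotate by 90 degrees clockwise: the top row becomes the right column."""
    return [list(column) for column in zip(*reversed(image))]


def augment(image):
    """Return the transformed copies of one image (hypothetical helper)."""
    return [
        flip_horizontal(image),
        flip_vertical(image),
        rotate_90_clockwise(image),
    ]


# Hypothetical 2x2 image with distinct pixel values to make the moves visible.
original = [[1, 2], [3, 4]]
augmented = augment(original)
```

Transforms of this kind only rearrange existing pixels, which is one reason GAN-based augmentation is studied as a way to add genuinely new image content.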
To address the problems of missing important features, inconspicuous details, and unclear textures in multimodal medical image fusion, this paper proposes a method for fusing computed tomography (CT) and magnetic resonance imaging (MRI) images using a generative adversarial network (GAN) and a convolutional neural network (CNN) under image enhancement. The generator targeted the high-frequency feature images, and two discriminators targeted the fused images after inverse transform. The high-frequency feature images were then fused by the trained GAN model, and the low-frequency feature images were fused by a CNN model pre-trained through transfer learning. Experimental results showed that, compared with current advanced fusion algorithms, the proposed method produced richer texture details and clearer contour and edge information in subjective evaluation. In objective evaluation, QAB/F, information entropy (IE), spatial frequency (SF), structural similarity (SSIM), mutual information (MI), and visual information fidelity for fusion (VIFF) were 2.0%, 6.3%, 7.0%, 5.5%, 9.0%, and 3.3% higher, respectively, than the best results of the compared methods. The fused images can be applied effectively to medical diagnosis to further improve diagnostic efficiency.
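Two of the objective indicators listed in the abstract above, information entropy (IE) and spatial frequency (SF), have simple closed forms. The sketch below uses their common textbook definitions: Shannon entropy of the gray-level histogram, and the root of the mean squared horizontal and vertical pixel differences. Whether the paper uses exactly these normalizations is an assumption, as are the function names.

```python
import math
from collections import Counter


def information_entropy(image):
    """Shannon entropy (in bits) of the gray-level histogram of a 2D image."""
    pixels = [p for row in image for p in row]
    counts = Counter(pixels)
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def spatial_frequency(image):
    """Spatial frequency: sqrt(RF^2 + CF^2), with both terms normalized by M*N."""
    rows, cols = len(image), len(image[0])
    # Squared differences between horizontally adjacent pixels (row frequency).
    row_sq = sum(
        (image[i][j] - image[i][j - 1]) ** 2
        for i in range(rows)
        for j in range(1, cols)
    )
    # Squared differences between vertically adjacent pixels (column frequency).
    col_sq = sum(
        (image[i][j] - image[i - 1][j]) ** 2
        for i in range(1, rows)
        for j in range(cols)
    )
    return math.sqrt((row_sq + col_sq) / (rows * cols))


# Hypothetical 2x2 image: a dark top row and a bright bottom row.
fused = [[0, 0], [255, 255]]

example_ie = information_entropy(fused)  # two equally likely gray levels
example_sf = spatial_frequency(fused)
```

Higher entropy suggests that the fused image carries more information, and higher spatial frequency suggests richer edges and texture.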
Deep learning-based medical image segmentation has become a powerful tool in medical image processing. Because of the particular characteristics of medical images, deep learning-based segmentation algorithms face problems such as sample imbalance, blurred edges, false positives, and false negatives. To address these problems, researchers have mostly improved the network structure and have rarely pursued improvements outside the structure. The loss function is an important part of deep learning-based segmentation methods. Improving the loss function can improve the segmentation performance of a network at a fundamental level, and because the loss function is independent of the network structure, it can be used in a plug-and-play manner across network models and segmentation tasks. Starting from the difficulties in medical image segmentation, this paper first introduces loss functions and improvement strategies that address sample imbalance, blurred edges, false positives, and false negatives. It then analyzes the difficulties currently encountered in improving loss functions. Finally, future research directions are discussed. This paper provides a reference for the reasonable selection, improvement, or innovation of loss functions and offers direction for follow-up research on loss functions.
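The sample-imbalance problem described in the abstract above can be illustrated by comparing two widely used losses on a heavily imbalanced toy example. The soft Dice loss and binary cross-entropy below are standard formulations chosen for this sketch. The abstract does not name the specific losses the paper surveys, and the `smooth` and `eps` parameters are assumptions.

```python
import math


def binary_cross_entropy(probs, target, eps=1e-7):
    """Mean pixel-wise binary cross-entropy; eps clamps probabilities away from 0 and 1."""
    total = 0.0
    for p, t in zip(probs, target):
        p = min(max(p, eps), 1.0 - eps)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(probs)


def soft_dice_loss(probs, target, smooth=1.0):
    """1 - soft Dice overlap; smooth avoids division by zero on empty masks."""
    intersection = sum(p * t for p, t in zip(probs, target))
    denominator = sum(probs) + sum(target)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


# Hypothetical flattened mask: 1 foreground pixel among 100.
target = [1.0] + [0.0] * 99
# A degenerate prediction that calls almost everything background.
all_background = [0.01] * 100

bce_value = binary_cross_entropy(all_background, target)  # about 0.056
dice_value = soft_dice_loss(all_background, target)       # about 0.66
```

The cross-entropy stays small because the 99 background pixels dominate the average. The Dice loss stays large because it depends only on foreground overlap, which is why overlap-based losses are often preferred for small targets.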
Non-rigid registration plays an important role in medical image analysis. U-Net has become a hot research topic in medical image analysis and is widely used in medical image registration. However, existing registration models based on U-Net and its variants lack sufficient learning ability when dealing with complex deformations and do not fully utilize multi-scale contextual information, resulting in insufficient registration accuracy. To address this issue, a non-rigid registration algorithm for X-ray images based on deformable convolution and a multi-scale feature focusing module was proposed. First, residual deformable convolution replaced the standard convolution of the original U-Net to enhance the registration network's ability to represent geometric deformations in images. Then, strided convolution replaced the pooling operation in downsampling to alleviate the feature loss caused by successive pooling. In addition, a multi-scale feature focusing module was introduced into the bridging layer of the encoder-decoder structure to improve the network's ability to integrate global contextual information. Theoretical analysis and experimental results both showed that the proposed registration algorithm could focus on multi-scale contextual information, handle medical images with complex deformations, and improve registration accuracy. It is suitable for non-rigid registration of chest X-ray images.
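The replacement of pooling by strided convolution mentioned in the abstract above can be shown on a one-dimensional signal. This is only a toy sketch: the function names, the kernel values, and the 1D setting are assumptions, and the convolution is implemented as cross-correlation, as is usual in deep learning frameworks. The paper's network operates on 2D images and learns its kernels during training.

```python
def max_pool_1d(signal, size=2):
    """Non-overlapping max pooling: keeps only the largest value in each window."""
    return [
        max(signal[i:i + size])
        for i in range(0, len(signal) - size + 1, size)
    ]


def strided_conv_1d(signal, kernel, stride=2):
    """Strided cross-correlation: every input in the window contributes via a weight."""
    k = len(kernel)
    return [
        sum(w * x for w, x in zip(kernel, signal[i:i + k]))
        for i in range(0, len(signal) - k + 1, stride)
    ]


# Hypothetical feature row of length 6.
features = [1, 3, 2, 8, 5, 4]

pooled = max_pool_1d(features)                   # discards the smaller value per pair
convolved = strided_conv_1d(features, [0.5, 0.5])  # weights would be learned in practice
```

Both operations halve the resolution, but the strided convolution combines all inputs in each window through weights that can be trained, rather than discarding all but one.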
Retinopathy of prematurity (ROP) is a major cause of vision loss and blindness among premature infants. Timely screening, diagnosis, and intervention can effectively prevent the deterioration of ROP. However, ROP diagnosis faces several challenges worldwide, including high subjectivity, low screening efficiency, regional disparities in screening coverage, and a severe shortage of pediatric ophthalmologists. Applying artificial intelligence (AI), either as an assistive diagnostic tool or as an automated diagnostic method, can improve the efficiency and objectivity of ROP diagnosis, expand screening coverage, and enable automated screening and quantified diagnostic results. In a global environment that emphasizes the development and application of medical imaging AI, developing more accurate diagnostic networks, exploring more effective AI-assisted diagnosis methods, and enhancing the interpretability of AI-assisted diagnosis can accelerate the improvement of AI policies for ROP and the deployment of AI products, promoting the development of ROP diagnosis and treatment.
Medical image registration plays an important role in medical diagnosis and treatment planning. However, current deep learning-based registration methods still face challenges such as insufficient ability to extract global information, a large number of network parameters, and slow inference speed. This paper therefore proposes a new model, LCU-Net, which uses parallel lightweight convolution to improve the extraction of global information. The problems of the large number of network parameters and slow inference speed are addressed by multi-scale fusion. Experimental results showed that LCU-Net reached a Dice coefficient of 0.823 and a Hausdorff distance of 1.258, and that the number of network parameters was reduced by about one quarter compared with the model before multi-scale fusion. The proposed algorithm shows clear advantages in medical image registration tasks: it not only outperforms the compared algorithms but also has excellent generalization performance and broad application prospects.
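The Hausdorff distance reported in the abstract above measures the worst-case mismatch between two contours. The following minimal sketch computes the symmetric Hausdorff distance between two small point sets by brute force. The function names and the use of plain Euclidean distance on coordinate tuples are assumptions of this example; the abstract does not state the units or any percentile variant used in the paper.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from any point in a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff_distance(a, b):
    """Symmetric Hausdorff distance: the larger of the two directed distances."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Hypothetical boundary points of a warped and a fixed structure.
warped_contour = [(0.0, 0.0), (1.0, 0.0)]
fixed_contour = [(0.0, 0.0), (4.0, 0.0)]

example_hd = hausdorff_distance(warped_contour, fixed_contour)
```

Here the point (4, 0) lies 3 units from its nearest neighbor in the other set, so that single outlier determines the result. This sensitivity to the worst point is why the Hausdorff distance is usually reported alongside an overlap measure such as the Dice coefficient.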